Facebook released their search strategy today, the so-called “third pillar” of Facebook’s future.
Search is hard, very hard. It’s why I have always been fascinated by search, and it is also one of the reasons I have a massive amount of respect for Google. Set aside their annoying “don’t be evil” marketing, the 0-10 PageRank scale and Android (a half-baked mobile OS, IMHO): the fact is that they have engineering cojones.
Their UX is horrible, their products are scattered (Google+, Wave).
But their search is amazing.
And search as I mentioned before fascinates me.
“Index the entire web, then, for whatever term I type into the search engine, return to me the most relevant sources of information and make sure it is trusted, timely, and relevant. Infer what I mean when I type into that little box. Make it go.”
That is an exceedingly difficult problem, and by any measure they’ve done an amazing job delivering on it.
The World Wide Web is made up of unstructured data: blogs here, websites there, forums, reviews, images, comments, stuff stuff and more stuff. When data and information are not structured, they are difficult, very difficult, to filter, sort and rank. Again, all things in life being imperfect, Google has delivered on that claim and passed with flying colours.
That’s why you and I use Google every day. It’s important because it’s very very useful.
Now to circle back to my original thesis: Facebook will fail at search and here is why:
Facebook is avoiding the very real and very tough problem Google tackled head on from day one: unstructured data. Google is attempting to infer the meaning and create structure behind unstructured data.
Do I like something simply because I mention it? How does the content reflect my actual point of view? Am I an expert regarding the topic I am commenting upon?
Facebook’s solution to search is the “Like” and the Open Graph: their structured database, which stores, categorizes and makes accessible everything you do on Facebook and, by extension, everything you do via “Log in using Facebook” across a subset of the World Wide Web.
Facebook has structured data about our lives, all of our posts, images, comments etc in their Open Graph, a structured data set that makes claims to knowing who people *really* are, their real connections and their social lives.
These are the claims that Facebook has promised are their technological “secret sauce”, both pre-IPO and post-IPO. But there’s an issue, which gets us back to my points about Google earlier and the challenging issues they tackled head on from day one.
However, they cannot distinguish when someone “Likes” McDonald’s but doesn’t really like McDonald’s. My unstructured sentiment says otherwise: my comments about them do not indicate a positive sentiment, even though I hit the “Like” button.
That is the difference between using sentiment to express an outcome and using a structured data set element such as a “Like”. Google has done this from day one via Hilltop and the hundreds of iterations to their PageRank algorithm (not the 0-10 scale, the algorithmic PageRank that is Google’s IP). It’s how they rank and sort the unstructured web.
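To make the gap between a “Like” and actual sentiment concrete, here is a toy Python sketch. It is not how Facebook or Google do anything: the `Activity` record, the two tiny word lists and the scoring rule are all made up for illustration, and real sentiment analysis is far more involved than counting words. It only shows that the structured signal and the unstructured text can point in opposite directions.

```python
from dataclasses import dataclass, field
import re

# Hypothetical, tiny sentiment lexicon; real systems use far richer models.
POSITIVE = {"love", "great", "tasty", "amazing"}
NEGATIVE = {"hate", "awful", "greasy", "appalling", "gross"}


@dataclass
class Activity:
    """Made-up record of one user's activity around a brand."""
    brand: str
    liked: bool                                    # structured signal: the Like button
    comments: list = field(default_factory=list)  # unstructured text


def comment_sentiment(comments):
    """Naive score: +1 per positive word, -1 per negative word."""
    score = 0
    for text in comments:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word in POSITIVE:
                score += 1
            elif word in NEGATIVE:
                score -= 1
    return score


def signals_disagree(activity):
    """True when the Like says 'positive' but the comments score negative."""
    return activity.liked and comment_sentiment(activity.comments) < 0


me = Activity(
    brand="McDonald's",
    liked=True,
    comments=[
        "The fries were greasy and the burger was awful.",
        "Honestly I hate the coffee.",
    ],
)

print(comment_sentiment(me.comments))  # negative score from the comments
print(signals_disagree(me))            # the Like and the comments conflict
```

A system that only reads the `liked` flag sees a fan. A system that also reads the comments sees the opposite, and that second reading is the hard part.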
Anyhow, this blog post is already too poorly written and too long, but I find this conversation fascinating because it pits the claims of amazing technologies (Facebook) against the reality of execution (Google).
Facebook cannot, or will not, attempt to address the tough problem: finding meaning through unstructured data.
Rather they want to force a structured data set (read: Open Graph) onto our lives but will not get into the sentiment problem.
This spurred an interesting conversation about structured data and sentiment on Google+ with a long-time colleague of mine, Aaron Bradley, a search marketing expert who legitimately knows his shit. Here is the thread:
Interesting case Dan. In short, however much Open Graph’s “intelligent structured data” can be leveraged for advertising and other purposes, one cannot infer the presence of negative sentiment based solely on the absence of positive sentiment.
Put another way, this is where the absence of a “Dislike” button is something of an Achilles’ heel for Facebook (and, by extension, the absence of a “-1” button in Google).
Open Graph can’t speak to what you and your friends don’t like, because there’s no mechanism for this. Both built-in Open Graph actions and built-in Open Graph objects are, at best, neutral when it comes to sentiment. Facebook may be able to see that a friend “Liked” (action) Catcher in the Rye (object) – a positive sentiment – or just “Read” (action) Catcher in the Rye – a possibly neutral sentiment, but one I’ll bet is processed (like the built-in actions “Watch”, “Listen” and “Follow”) like a “Like” by Facebook’s algorithms. It’s perhaps (unintentionally) telling that the placeholders for built-in objects all contain content like this:
I don’t know that Google – even outside the Google+ environment and its lack of a -1 – is better suited to make sentiment decisions for advertising delivery based on structured data. The exception here is review data, which is really a sentiment scale. But in order to throttle the display of a McDonald’s ad based on structured data, Google would have to know that you disliked McDonald’s – regardless of the general sentiment surrounding the restaurant – because you gave it one out of five on a review. (Of course your friends’ reviews might count if Google knew as much about you and your relationships based on Google+ as Facebook does based on … well, Facebook. In reality? Ha.)
So is Facebook delivering McDonald’s ads to you a sign of failure? As much as I’m not particularly a FB fanboy I’d have to say no: Facebook’s algorithm can’t read your mind. It might even be reasonable targeting using structured data, based on the fact that a certain proportion of your Facebook friends “Like” McDonald’s Page – which would be the equivalent of me being targeted with a Tim Horton’s ad (I don’t despise them and their deceptive advertising – I just find their coffee appalling).
Of course one could also infer from positive sentiment things it’s likely I am neutral or negative toward. If I “Like” Hitchens’ God is not Great and Dawkins’ The God Delusion you’re probably not going to get far showing me an ad for Jesus Calling (evangelical bestseller – thanks Google). But that would take multiple levels of sentiment analysis and topical classification on top of other algorithmic gymnastics.
I recall a conversation you and I had on Facebook concerning why one should grind one’s beef, or (in my case) acquire it from cow-loving but non-vegetarian hippies. But we never expressed that in a formal way (clicked a “Like” button associated with the non-built-in object “Homemade Hamburgers”). So Facebook had the sentiment, but didn’t have structured data pertaining to it. And so you got asked about Mickey D’s.
And my thoughts:
Awesome points – however what Facebook needs to be able to do with their structured data goldmine is infer sentiment and semantics from the unstructured portions of their data set.
Indeed the convenient construct is an explicit dislike, however that is an intrusive model from a user perspective.
I would then have to (as a user) explicitly identify that I indeed do Like or Dislike something in order for Facebook’s algorithm to be able to understand my sentiment.
Sentiments are unstructured notions. How I “feel” about a given subject does not always have a structured data model which is convenient for the system to process.
So – is Facebook’s idea to enforce a structure and exclude a sentiment? It seems so. From a technological innovation perspective Google assumes lack of structure and provides benefits where possible. Facebook OTOH wants to impose structure and ignore the really difficult problem, inferring sentiment from unstructured data. That’s not fundamentally a problem except that Facebook makes claims to understanding our lives and how we interact. It’s a bit of a bait and switch of claims versus reality.
Lastly, some Facebook PR regarding their search technology with some translation from VentureBeat. I’m now summarizing my thoughts in sound bites, but:
“web search is designed to … return links that may have answers to the questions that you’re trying to ask. Graph Search is designed to return the answer, not links that might get you to the answer.”
Translation: We have structured data. That gives us the answer from our formal data set. Hilltop and Google suck, reference to link authority. Indexing the World Wide Web is hard. We want to make it easier by using our data not everyone else’s.
“We came up with something we think is a lot more natural,” he (Zuckerberg) said.
Translation: Natural to us is our definition of structured data. Figuring out what you mean online is hard work, we don’t want to do that. Natural means you Like something (or by extension in their Want, Listen notions etc in the open graph).
“It’s gonna take years and years to index everything,” Zuckerberg said. “There’s more content we haven’t gotten to than content we have.” Search for mobile, more languages, text posts, and Open Graph content will be coming soon. And, of course, an API is also on the roadmap, but perhaps a bit further down the line.”
Translation: Google has been indexing for years. What is open graph content? It’s your content on your site shoved into their database then made to conform so they can monetize easily while avoiding the work.
Am I wrong? Is everything I’ve written complete nonsense? Has the world gone crazy by not observing this or am I just totally insane?
Mark Zuckerberg explaining Facebook Search (PR Video): [youtube]http://www.youtube.com/watch?v=U94DTrjAvuA[/youtube]
“Yahoo, according to Ms. Bartz, simply feeds search results for people who have grown curious while reading one of its news stories or watching a video. It doesn’t generally pop into peoples’ minds as the first place to go look for answers during the course of their day-to-day activities.”
How is that possibly the case? On so many levels I would argue the validity of this claim:
“The biggest thing for Yahoo is increasing the number of pages people consume and slapping as many display ads as possible across those pages. “My fortunes are tied to my pages,” Ms. Bartz said.”
This is nonsensical. I have an ad network. We are interested in content pages to serve advertising. Yahoo is a software technology company…er…it WAS a software technology company. How sad for Yahoo to have such a short-sighted, myopic CEO.
“According to Ms. Bartz, the majority of Yahoo’s sites will go the way of Sports. In particular, Yahoo will throw investments behind its entertainment, finance and news operations. Ms. Bartz noted that there are plenty of unemployed journalists out there to pick up.”
Well Carol. Hey Carol. Umm Carol….those journalists are unemployed because the notion of traditional journalism and simply serving up that content and selling ads is not the same as it used to be. They are unemployed because many companies in this space are unprofitable.
Excuse me while I go bang my head against a brick wall.
“In addition, Ms. Bartz will remember that Terry Semel, a longtime Warner Brothers executive, was brought in before to turn Yahoo into more of a media company. Mr. Semel’s tenure was perhaps characterized more for losing to Google than anything else.”
Clearly Carol doesn’t believe the notion that if we are not aware of our history we are doomed to repeat the mistakes of the past. I think we can revise this, though: at this point Yahoo isn’t in the “losing” position any longer. It has lost. Full Stop.
“Ms. Bartz has decided to correct past mistakes by getting all of the employees on the same page and presenting a more consistent look across Yahoo’s sites. In addition, she’s trying to boost morale and get the energy of the company up again –- a task hurt by the hit Yahoo’s shares took after the Microsoft deal was announced.
“I felt bad for the employees because they think it’s a report card,” Ms. Bartz said.”
Honestly, this woman is a CEO? Of any company? Your share price is a report card of sorts, it’s the market responding to the strategic decisions being made. Clearly this is perceived as being a bad decision. Which it is. It’s a horrible decision.
In fact it’s a series of horrible decisions, capped off by myopic thinking and topped off with a healthy dose of delusion.
Way to go Carol!
Update – Just ran into a fantastic quote from the New York Times Bits Blog:
I’ve got to wonder how much running a sales force that peddles expensive software to engineers and designers has to do with running a free Web site that attracts users through branding and products and makes money through advertising.
Just thought I would let you all know that I’ve made this blog do follow. Now remember – real comments only please but they will pass link authority.
As I am going through this resurrection of my site I’ve had the opportunity to rethink a few things in terms of categories, tags, URL structure, subdomain structure and, in particular, making all comments DoFollow.
Having said that I installed the DoFollow plugin yesterday for a few reasons:
Having said that I am curious. Is your blog a DoFollow blog?
Over the past months I’ve had excellent success in building online reputation through Do Follow blogs. Here is a fantastic categorized list of Do Follow blogs from Squidoo.
I’ll be updating this page with more Do Follow resources.
Powered by Twitter Tools.
Due to the Craptastic nature of EMC Web Hosting I have been forced to relatively quickly move my domain to my dedicated server. Which in the end is a good thing but in the short term is an annoying thing.
Stay Tuned. I’ll be back shortly with everything updated and new features as well.