Tuesday, 8 July 2008

Powerset, Microsoft and the Semantic Web

Given that Microsoft has acquired the semantic search start-up Powerset for a rumoured US$100 million, it is probably a good time to look at why Microsoft thinks Powerset is important, i.e. "Powerset technology is more about indexing the content and understanding its meaning, than the query itself", and to weigh that against the following statement from ReadWriteWeb:

So far, none of the larger search engines have been able to capitalise on the promises of semantic search. Most of the innovations in the space so far have come from small start-ups and even those never made any real inroads in terms of market share when compared to the keyword driven search engines of Google, Ask, Yahoo, and Microsoft.

It will be interesting to see if the acquisition of Powerset helps Microsoft deliver on the promises of semantic searching. Certainly the Microsoft Start Up blog posting by Don Dodge states that there "are many lucrative markets for this technology...not just consumer web search." Mind you, not everyone is impressed with Powerset, so this could turn out to be a dud acquisition for Microsoft.

So how does this work? Well, as explained by Microsoft:

Powerset is using linguistics and (NLP) to better understand the meaning and context of search queries. But the real power of Powerset is applied to the search index, not the query. The index of billions of web pages is indexed in the traditional way. The big difference is in the post processing of the index. They analyze the indexed pages for "semantics", context, meaning, similar words, and categories. They add all of this contextual meta data to the search index so that search queries can find better results.
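
To make the "post processing of the index" idea a little more concrete, here is a minimal Python sketch. It is not Powerset's or Microsoft's actual method, which the quote does not spell out. It simply assumes a tiny hand-made lookup table (CONCEPTS, a hypothetical stand-in for real linguistic analysis) and shows how a traditional keyword index could be enriched with extra "concept" entries so that a query using a different word can still find the page. All function names here are made up for illustration.

```python
from collections import defaultdict

# Hypothetical lexicon mapping surface words to a shared concept.
# A real system would derive this from linguistic analysis, not a dict.
CONCEPTS = {
    "acquired": "acquire",
    "bought": "acquire",
    "purchased": "acquire",
}


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_keyword_index(docs):
    """Traditional inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def add_semantic_metadata(index):
    """Post-process the index: keep the keywords, add concept entries."""
    enriched = defaultdict(set)
    for token, doc_ids in index.items():
        enriched[token] |= doc_ids
        concept = CONCEPTS.get(token)
        if concept:
            enriched["concept:" + concept] |= doc_ids
    return enriched


def search(index, query):
    """Return ids of documents matching every query term (AND search)."""
    results = None
    for token in tokenize(query):
        concept = CONCEPTS.get(token)
        key = "concept:" + concept if concept else token
        hits = set(index.get(key, set()))
        results = hits if results is None else results & hits
    return results or set()


docs = {
    1: "Microsoft acquired Powerset",
    2: "Google bought a start-up",
    3: "Powerset indexes Wikipedia",
}
keyword_index = build_keyword_index(docs)
enriched = add_semantic_metadata(keyword_index)
matches = search(enriched, "microsoft purchased powerset")
```

The word "purchased" appears in none of the three documents, so a keyword-only lookup has nothing to match on. In the enriched index, document 1 is reachable through the shared "acquire" concept, which is roughly the effect the quote describes: the extra work is done on the index, not on the query.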

What will be really interesting is: (a) what Google will do, and (b) what all this means for libraries.

