Sean Gallagher, Ars Technica’s IT Editor, describes the architecture of Google’s Knowledge Graph and Microsoft’s Satori Graph, both of which are building a ‘new world’ of search: “a knowledge-driven approach to search”.
Google’s Knowledge Graph derives from Freebase, a proprietary graph database acquired by Google in 2010 when it bought Metaweb. Google’s Thakur, who is technical lead on Knowledge Graph, says that significant additional development has been done to get the database up to Google’s required capacity. Based on some of the architecture discussed by Google, Knowledge Graph may also rely on some batch processes powered by Google’s Pregel graph engine, the high-performance graph processing tool that Google developed to handle many of its Web indexing tasks, though Thakur declined to discuss those sorts of details.
Microsoft’s Satori (named after a Zen Buddhist term for enlightenment) is a graph-based repository that comes out of Microsoft Research’s Trinity graph database and computing platform. It uses the Resource Description Framework (RDF) and the SPARQL query language, and it was designed to handle billions of RDF “triples” (subject-predicate-object statements about entities). For a sense of scale, the 2010 US Census in RDF form has about one billion triples.
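To make the idea of RDF triples concrete, here is a minimal Python sketch of a triple store and a pattern match over it. It illustrates the general data model only and is not how Satori or Knowledge Graph is implemented; the `TRIPLES` sample data, the predicate names, and the `match` helper are all hypothetical.

```python
from typing import Iterator, List, Optional, Tuple

# A triple is one (subject, predicate, object) statement.
Triple = Tuple[str, str, str]

# Hypothetical sample data; the predicate names are invented for illustration.
TRIPLES: List[Triple] = [
    ("Google", "acquired", "Metaweb"),
    ("Metaweb", "developed", "Freebase"),
    ("Satori", "builtOn", "Trinity"),
    ("Trinity", "developedBy", "Microsoft Research"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple matching the pattern; None acts as a wildcard."""
    for triple in triples:
        subject, predicate, obj = triple
        if s is not None and subject != s:
            continue
        if p is not None and predicate != p:
            continue
        if o is not None and obj != o:
            continue
        yield triple


if __name__ == "__main__":
    # Roughly analogous to a one-pattern SPARQL query such as
    # SELECT ?s ?o WHERE { ?s :acquired ?o }
    for subject, _, obj in match(TRIPLES, p="acquired"):
        print(subject, "acquired", obj)
```

A linear scan like this is only workable for toy data. A store meant to hold billions of triples would need indexing and distribution across machines, which this sketch makes no attempt to show.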
Gallagher writes that the two search engines approach search with different objectives.
Microsoft and Google have exposed the knowledge stored in their entities in similar ways, though Microsoft has also added other interface items based on the processing of social graphs and other social media. On the surface, Google’s new engine appears to be more about getting answers to questions, while Microsoft’s new Bing front-end exposes entities in a way that is more suited to taking actions—or to making transactions.
There are still big challenges ahead for both: managing the ever-expanding size of semantic databases and maximizing search performance. Natural language queries over graph entity properties are also not yet possible. Finally, both Knowledge Graph and Satori have so far focused only on content in US English. Read the full article.