Real-Time Graph Search of Millions of Artworks Powered by Neo4j
The mission of the Netherlands-based Europeana Foundation is to “transform the world with culture.” To achieve this, the Foundation has encouraged over 3,000 European institutions to provide digital data for all of their artifacts. By gathering this information and making it available online, Europeana has been able to promote these institutions by making their collections known to enthusiasts and researchers worldwide.
It is art appreciation for the digital age. “We are living in a digitized society,” said Europeana Infrastructure Manager Matt Nader. “Going to a library and reading an old newspaper doesn’t happen very often any more. But a digital version, it makes it much closer to you.”
Europeana’s developers have built databases of structured information for every artifact, including the date it was created, by whom, its location and more. They also wanted to encourage in-depth exploration by creating connections between related items – such as all information about the Mona Lisa, or all works by the same painter or composer. At that point, the Foundation realized that housing such densely connected data in traditional databases was impractical.
“Our objective is to make as many connections between the cultural artifacts as possible,” said System Architect Yorgos Mamakis. “But we were missing a meaningful way to have a ‘relation’ and to go from one object to another via these relations because it was hidden somewhere in the data model. It’s so memory-intensive that, considering our number of records, it would result in billions or trillions of triples in a standard semantic repository. And traversing over that or retrieving that sort of information would be extremely slow with standard hardware.”
In 2014, Europeana found its answer in Neo4j. “The most obvious solution was Neo4j, a graph database supporting everything we wanted out-of-the-box,” said Mamakis. “Neo4j provided the relations traversal and the links we needed in a structured manner.”
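To make “relations traversal” concrete, here is a minimal sketch in plain Python rather than Neo4j itself. It assumes a toy in-memory graph in which artifacts, creators and institutions are nodes joined by typed relations. The class `ArtifactGraph`, the relation names `CREATED_BY` and `HELD_AT`, and the sample records are all hypothetical illustrations, not Europeana’s actual data model.

```python
from collections import defaultdict, deque


class ArtifactGraph:
    """Toy in-memory graph: nodes are names, edges carry a relation type."""

    def __init__(self):
        # node -> list of (relation, neighbour) pairs
        self._edges = defaultdict(list)

    def relate(self, source, relation, target):
        # Store the edge in both directions so it can be followed either way.
        self._edges[source].append((relation, target))
        self._edges[target].append((relation, source))

    def reachable(self, start, max_hops):
        """Return {node: hops} for every node within max_hops of start."""
        hops = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if hops[node] == max_hops:
                continue  # do not expand beyond the hop limit
            for _relation, neighbour in self._edges[node]:
                if neighbour not in hops:
                    hops[neighbour] = hops[node] + 1
                    queue.append(neighbour)
        del hops[start]  # the starting node is not its own neighbour
        return hops


# Hypothetical sample data, for illustration only.
graph = ArtifactGraph()
graph.relate("Mona Lisa", "CREATED_BY", "Leonardo da Vinci")
graph.relate("The Last Supper", "CREATED_BY", "Leonardo da Vinci")
graph.relate("Mona Lisa", "HELD_AT", "Louvre")
graph.relate("Venus de Milo", "HELD_AT", "Louvre")

nearby = graph.reachable("Mona Lisa", max_hops=2)
```

Because each relation is stored as a direct link, moving from one object to another is a lookup along an edge rather than a search through the whole record set, which is the property Mamakis describes.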
Neo4j, which was easy to implement – “We did not find it complicated at all to work with Neo4j,” Mamakis said – now holds over 6 million (12%) of Europeana’s 53 million cultural objects and records. Mamakis continued, “We expect that, as the number of artifacts we have increases, more and more will end up in Neo4j so that it becomes one of our core systems.”
Through Neo4j, Europeana offers visitors “Similar Items” to encourage them to move between related pieces of information and learn more about artifacts that match their own interests. For example, searches for the Mona Lisa now turn up tens or even hundreds of results. Europeana also offers an “Explore” button and hosts dozens of online “Exhibitions” to encourage further discovery.
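One way a “Similar Items” feature could work is to rank other artifacts by how many linked nodes they share with the item being viewed. The sketch below is a hedged illustration in plain Python, not Europeana’s implementation. The function `similar_items`, the `LINKS` table and its sample entries are assumptions made for the example.

```python
# Hypothetical records: each artifact maps to the set of nodes it links to
# (creator, holding institution, genre). Illustrative data only.
LINKS = {
    "Mona Lisa": {"Leonardo da Vinci", "Louvre", "portrait"},
    "The Last Supper": {"Leonardo da Vinci", "mural"},
    "Lady with an Ermine": {"Leonardo da Vinci", "portrait"},
    "Venus de Milo": {"Louvre", "sculpture"},
    "The Night Watch": {"Rembrandt", "Rijksmuseum"},
}


def similar_items(item, links, limit=10):
    """Rank other artifacts by the number of linked nodes shared with item."""
    shared = {}
    for other, other_links in links.items():
        if other == item:
            continue
        overlap = len(links[item] & other_links)
        if overlap:
            shared[other] = overlap
    # Most shared links first; ties are broken alphabetically for a stable order.
    ranked = sorted(shared.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _count in ranked[:limit]]


suggestions = similar_items("Mona Lisa", LINKS)
```

Here the artifact sharing both a creator and a genre with the Mona Lisa ranks first, while an unrelated work is left out entirely. In a graph database, the same idea is a two-hop traversal from the item to its linked nodes and back out to other items.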
Mamakis explained: “Neo4j adds a benefit in the quality of our records and the user experience. It’s the fact that the users get another way of browsing through the data. Now you don’t just retrieve the object, but also the family of objects that is closely related to it – so you have another entry point to access new objects and potentially find more information about what you are interested in.”