Polyglot Persistence Case Study: Wanderu + Neo4j + MongoDB

Solution Architectural Diagram of Polyglot Persistence for Wanderu between Neo4j and MongoDBEvery language and data storage solution has its strengths. After all, no single solution is most performant and cost-effective for every possible task in your application.

In order to tap into the varying strengths of different data storage solutions, your application needs to take advantage of polyglot persistence. That’s exactly what Wanderu did when building their meta-search travel website – and here’s how they did it:

The Technical Challenge

Wanderu provides the ability for consumers to search for bus and train tickets, with journeys combining legs from multiple different transportation companies. The route data is stored in JSON, making a document storage engine like MongoDB a great solution for their route leg data storage.

However, they also needed to be able to find the optimal path from origin to destination. This is perfect for a graph database like Neo4j, because Neo4j can understand the data relationships between different transit route legs.

Polyglot Persistence: Using MongoDB and Neo4j

Wanderu didn’t want to force MongoDB (a document-based data store) to handle graph-style relationships because the implementation would have been costly and inefficient. Instead, they used a polyglot persistence approach to capitalize on the strengths of each, deciding to use both MongoDB and Neo4j together.

Solution Architectural Diagram

Wanderu's polyglot persistence architecture between mongoDB and Neo4j

The Wanderu ticket search engine uses both MongoDB (for easy JSON document storage) and Neo4j (for efficient route calculations).

The Challenge of Sync

With the bus route legs stored in MongoDB, Wanderu had to decide whether to write application code to synchronize this information into Neo4j as a graph model or use a syncing technology to handle this automatically.

Eddy Wong, CTO and Co-Founder of Wanderu, discovered the GitHub project called “mongo-connector,” which enabled Mongo’s built-in replication service to replicate data to another database. Eddy only had to write a Doc Manager for Neo4j which handled callbacks on each MongoDB insert or update operation.

As new entries are added to the MongoDB OpLog, the Mongo Connector calls the Neo4j DocMgr. The Neo4j DocMgr code written by Wanderu then uses the py2neo open source Python library to create the corresponding nodes, properties and relationships in Neo4j. The API server then uses Node-Neo4j to send queries to the graph database.

The resulting solution takes advantage of Neo4j, MongoDB, JSON, Node.js, Express.js, Mongo Connector, Python and py2neo. Polyglot persistence ensures that each of these technologies are used according to their greatest strengths. And for Wanderu, it means a better search and routing experience for their users.

Read more about Wanderu’s use of Neo4j in their online case study.

O’Reilly’s Graph Databases compares NoSQL database solutions and shows you how to apply graph technologies to real-world problems. Click below to get your free copy of the definitive book on graph databases and your introduction to Neo4j.