Goals
Polyglot persistence is the practice of taking advantage of the strengths of multiple database technologies. For example, a product catalog application might use a document database (such as MongoDB) to power browsing and searching for products, along with a graph database (such as Neo4j) to provide real-time personalized product recommendations. To enable polyglot persistence, the application needs to store data in multiple databases, each with its own data model (graph vs. document). Being able to connect MongoDB to Neo4j and synchronize data automatically makes this process much simpler.
You should have an understanding of MongoDB and Neo4j, be familiar with both the document data model and the property graph data model, and have MongoDB and Neo4j installed.
Often it is useful to synchronize metadata or a subset of data from MongoDB to Neo4j to take advantage of relationships in the data, something that Neo4j is optimized for and that is more difficult to accomplish in MongoDB. The Neo4j Doc Manager project described below facilitates this process.
Neo4j Doc Manager
The developers at MongoDB have provided the mongo-connector project, which listens for all update operations in MongoDB and facilitates mirroring those updates to another system.
From the mongo-connector documentation:
mongo-connector creates a pipeline from a MongoDB cluster to one or more target systems, such as Solr, Elasticsearch, or another MongoDB cluster. It synchronizes data in MongoDB to the target then tails the MongoDB oplog, keeping up with operations in MongoDB in real-time. It has been tested with Python 2.6, 2.7, 3.3, and 3.4.
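The idea of tailing the oplog and replaying operations against a target can be sketched in a few lines. This is a simplified illustration, not mongo-connector's implementation: the in-memory `Target` dictionary and the helper names `apply_oplog_entry` and `replay` are hypothetical, and only inserts, `$set` updates, and deletes are handled. The entry fields (`op`, `o`, `o2`) are modeled loosely on MongoDB oplog entries.

```python
from typing import Any, Dict, Iterable

# Hypothetical in-memory stand-in for a target system, keyed by document _id.
Target = Dict[Any, Dict[str, Any]]


def apply_oplog_entry(target: Target, entry: Dict[str, Any]) -> None:
    """Apply one simplified oplog-style entry to the target store."""
    op = entry["op"]
    if op == "i":
        # Insert: the full document is carried in "o".
        doc = entry["o"]
        target[doc["_id"]] = dict(doc)
    elif op == "u":
        # Update: "o2" identifies the document, "o" carries the change.
        # Only "$set" modifications are handled in this sketch.
        doc_id = entry["o2"]["_id"]
        changes = entry["o"].get("$set", {})
        target.setdefault(doc_id, {"_id": doc_id}).update(changes)
    elif op == "d":
        # Delete: "o" identifies the document to remove.
        target.pop(entry["o"]["_id"], None)
    # Any other operation type is ignored here.


def replay(target: Target, entries: Iterable[Dict[str, Any]]) -> Target:
    """Replay entries in order, mirroring each change into the target."""
    for entry in entries:
        apply_oplog_entry(target, entry)
    return target
```

Replaying an insert, an update, and a delete in order leaves the target holding only the documents that still exist in the source, with their latest field values.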
To facilitate synchronizing data from MongoDB to a Neo4j instance, the community has implemented a Neo4j Doc Manager for mongo-connector. It is intended for live one-way synchronization from MongoDB to Neo4j, where you have both databases running to take advantage of each database's strengths in your application.
This project is currently beta quality and not officially supported by Neo Technology.
You must have Python installed to use Neo4j Doc Manager.
It is recommended to install using pip, the Python package manager.
For alternative installation workflows, see the project documentation.
Using Neo4j Doc Manager
Ensure that a Neo4j instance is running. If authentication is enabled for Neo4j (version 2.2+), set the NEO4J_AUTH environment variable so that it contains the username and password.
Ensure that MongoDB is running as a replica set. To initiate a replica set, start mongod with the --replSet option and a replica set name of your choice. Then open the mongo shell and run rs.initiate().
Refer to the Mongo Connector FAQ for more information.
Start the mongo-connector service with the mongo-connector command and the following options:
- -m provides the MongoDB endpoint
- -t specifies the Neo4j endpoint
- -d specifies Neo4j Doc Manager as the doc manager
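The options above can be assembled into a launch command. The following is a minimal sketch, assuming a local MongoDB on its default port (`localhost:27017`) and a Neo4j REST endpoint at `http://localhost:7474/db/data`. Both endpoints are assumptions to adjust for your setup, as is the `user:password` format used for `NEO4J_AUTH`. The helper names `build_connector_command` and `build_environment` are hypothetical.

```python
import os
from typing import Dict, List, Mapping, Optional


def build_connector_command(mongo_endpoint: str, neo4j_endpoint: str) -> List[str]:
    """Build the mongo-connector argument list using the -m, -t and -d options."""
    return [
        "mongo-connector",
        "-m", mongo_endpoint,       # MongoDB endpoint
        "-t", neo4j_endpoint,       # Neo4j endpoint
        "-d", "neo4j_doc_manager",  # use Neo4j Doc Manager as the doc manager
    ]


def build_environment(
    user: Optional[str] = None,
    password: Optional[str] = None,
    base: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Copy the environment and, if credentials are given, set NEO4J_AUTH.

    Assumption: NEO4J_AUTH is expected in the form "user:password".
    """
    env = dict(os.environ if base is None else base)
    if user is not None and password is not None:
        env["NEO4J_AUTH"] = f"{user}:{password}"
    return env


if __name__ == "__main__":
    command = build_connector_command(
        "localhost:27017", "http://localhost:7474/db/data"
    )
    print(" ".join(command))
    # To launch for real (requires mongo-connector to be installed):
    #   import subprocess
    #   subprocess.run(command, env=build_environment("neo4j", "secret"), check=True)
```

Keeping the command as a list of arguments rather than a single string avoids shell quoting issues when it is later passed to `subprocess.run`.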
To see all configuration options for Neo4j Doc Manager (including specifying which collections and fields to synchronize), see the project documentation.
With the neo4j_doc_manager service running, any document inserted into MongoDB is converted to a property graph structure and immediately inserted into Neo4j as well. Document keys will be turned into nodes.
Nested values on each key will become properties.
To see this in action, insert a document into MongoDB using the mongo shell: the document is converted to a property graph and inserted into Neo4j, following the mapping described above.
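As a rough illustration of this kind of mapping, the sketch below converts a nested dictionary into nodes and relationships. It is not the Doc Manager's actual code, and several details are assumptions made for the example: the root document becomes a node labeled with a caller-supplied name (for instance the collection name), scalar and list values become properties of the enclosing node, each key holding a nested document becomes its own node, and relationship types are named `<parent>_<key>`. The function name `document_to_graph` and the sample data are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Node = Dict[str, Any]
# (index of start node, relationship type, index of end node)
Relationship = Tuple[int, str, int]


def document_to_graph(
    label: str, document: Dict[str, Any]
) -> Tuple[List[Node], List[Relationship]]:
    """Flatten a nested document into a list of nodes and relationships."""
    nodes: List[Node] = []
    relationships: List[Relationship] = []

    def visit(node_label: str, doc: Dict[str, Any]) -> int:
        node: Node = {"label": node_label, "properties": {}}
        nodes.append(node)
        index = len(nodes) - 1
        for key, value in doc.items():
            if isinstance(value, dict):
                # A key holding a nested document becomes its own node,
                # linked from the enclosing node (naming is an assumption).
                child = visit(key, value)
                relationships.append((index, f"{node_label}_{key}", child))
            else:
                # Scalars and lists are stored as properties of this node.
                node["properties"][key] = value
        return index

    visit(label, document)
    return nodes, relationships


if __name__ == "__main__":
    sample = {"room": "A1", "speaker": {"name": "Ada"}}  # hypothetical document
    graph_nodes, graph_rels = document_to_graph("talks", sample)
    print(graph_nodes)
    print(graph_rels)
```

For the hypothetical sample, the sketch produces a `talks` node carrying the `room` property and a `speaker` node carrying the `name` property, connected by a single relationship.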
To see more examples and the full documentation, please refer to the Neo4j Doc Manager project on GitHub.