Neo4j and MongoDB

This approach is deprecated in favor of APOC procedures for working with MongoDB. This page is being maintained for reference, but is not current or supported. Please consult the APOC documentation for the latest information.
Goals
Taking advantage of the strengths of multiple database technologies is the concept of polyglot persistence. For example, a product catalog application might use a document database (such as MongoDB) to power browsing / searching for products along with a graph database (such as Neo4j) to provide real time personalized product recommendations. To enable polyglot persistence the application needs to store data in multiple databases, each with its own data model (graph vs. document). Being able to connect MongoDB to Neo4j and synchronize data automatically makes this process much simpler.
Prerequisites
You should have an understanding of MongoDB, Neo4j, be famililar with both the document data model and property graph data model, and have MongoDB and Neo4j installed.

Intermediate

General Observations

Often it is useful to synchronize meta data or a subset of data from MongoDB to Neo4j to take advantage of realationships in the data, something that Neo4j is optimized for and is more difficult to accomplish in MongoDB. The Neo4j Doc Manager project below facilitates this process.

Neo4j Doc Manager

The developers at MongoDB have provided the mongo-connector project which provides a mechanism for listening for all update operations in MongoDB and facilitates mirroring those updates to another system.

From the mongo-connector documentation:

mongo-connector creates a pipeline from a MongoDB cluster to one or more target systems, such as Solr, Elasticsearch, or another MongoDB cluster. It synchronizes data in MongoDB to the target then tails the MongoDB oplog, keeping up with operations in MongoDB in real-time. From the Github repo for Mongo-connector - mongo-connector supports Python 3.4+ and MongoDB versions 3.4 and 3.6.

To facilitate synchronizing data from MongoDB to a Neo4j instance, the community has implemented a Neo4j Doc Manager for mongo-connector. It is intended for live one-way synchronization from MongoDB to Neo4j, where you have both databases running to take advantage of each databases' strengths in your application.

This project is currently beta quality and not officially supported by Neo4j.

Installing

You must have Python installed to use Neo4j Doc Manager.

It is recommended to install using pip, the Python package manager.

pip install neo4j-doc-manager --pre

For alternative installation workflow, see the documentation here.

Using Neo4j Doc Manager

Ensure that a Neo4j instance is running. If authentication is enabled for Neo4j, set the NEO4J_AUTH environment variable, containing username and password:

export NEO4J_AUTH=user:password

Ensure that MongoDB is running a replica set. To initiate a replica set, start MongoDB with this command:

mongod --replSet myDevReplSet

Then open mongo-shell and run:

rs.initiate()

Refer to the Mongo Connector FAQ for more information.

Start the mongo-connector service with the following command:

mongo-connector -m localhost:27017 -t http://localhost:7474/db/data -d neo4j_doc_manager
  • -m provides the MongoDB endpoint

  • -t specifies the Neo4j endpoint

  • -d specifies Neo4j Doc Manager as the doc manager

To see all configuration options for Neo4j Doc Manager (including specifying which collections and fields to synchronize) see the documentation here.

Data synchronization

With the neo4j_doc_manager service running, any documents inserted into MongoDB will be converted to a property graph structure and immediately inserted into Neo4j as well. Document keys will be turned into nodes. Nested values on each key will become properties.

Document to graph

To see this in action, let’s consider the following document:

{
  "session": {
    "title": "12 Years of Spring: An Open Source Journey",
    "abstract": "Spring emerged as a core open source project in early 2003 and evolved to a broad portfolio of open source projects up until 2015."
  },
  "topics":  ["keynote", "spring"],
  "room": "Auditorium",
  "timeslot": "Wed 29th, 09:30-10:30",
  "speaker": {
    "name": "Juergen Hoeller",
    "bio": "Juergen Hoeller is co-founder of the Spring Framework open source project.",
    "twitter": "https://twitter.com/springjuergen",
    "picture": "http://www.springio.net/wp-content/uploads/2014/11/juergen_hoeller-220x220.jpeg"
  }
}

If we insert the document into MongoDB using the mongo-shell:

db.talks.insert(  { "session": { "title": "12 Years of Spring: An Open Source Journey", "abstract": "Spring emerged as a core open source project in early 2003 and evolved to a broad portfolio of open source projects up until 2015." }, "topics":  ["keynote", "spring"], "room": "Auditorium", "timeslot": "Wed 29th, 09:30-10:30", "speaker": { "name": "Juergen Hoeller", "bio": "Juergen Hoeller is co-founder of the Spring Framework open source project.", "twitter": "https://twitter.com/springjuergen", "picture": "http://www.springio.net/wp-content/uploads/2014/11/juergen_hoeller-220x220.jpeg" } } );

the document is converted to a property graph and inserted into Neo4j with this structure:

Property graph

To see more examples and the full documentation, please refer to the Neo4j Doc Manager project on GitHub.