
Goals
There are many example datasets for Neo4j. This guide outlines some of them.
Prerequisites
You should be comfortable installing and importing data into Neo4j.
Intermediate


Data Sets

We want to cover data sets from a range of domains. For each data set we aim to provide a description, the graph model, and some use-case queries.

This is a work in progress, so some data sets might not include that information yet.
Have fun, and send feedback to feedback@neo4j.com or raise an issue if something doesn’t work as expected.

Loading Data from Source

The most reliable way to get a dataset into Neo4j is to import it from the raw sources. That way you are independent of database and store versions, which you would otherwise have to upgrade. That’s why we provide raw data (CSV, JSON, XML) for many of the data sets, accompanied by import scripts.

You can import the data using a command-line client such as neo4j-shell or CyCli, a modern, colorful Neo4j client that understands the same file format.

$NEO4J_HOME/bin/neo4j-shell -file import-file.cypher

# or

cycli -f import-file.cypher
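
As an illustration of the kind of import script referred to above, here is a minimal sketch. The file movies.csv, the Movie label, and the id and title properties are hypothetical and do not come from any of the data sets listed in this guide. The script writes a tiny CSV file and a matching import-file.cypher, and only invokes neo4j-shell if it is found under $NEO4J_HOME.

```shell
#!/bin/sh
# Sketch only: movies.csv, the Movie label and its properties are hypothetical.

# A tiny CSV file standing in for a real raw data source.
cat > movies.csv <<'EOF'
id,title
1,The Matrix
2,Cloud Atlas
EOF

# The import script: Cypher statements, each terminated by a semicolon.
# Where file:/// URLs resolve depends on your Neo4j version and configuration.
cat > import-file.cypher <<'EOF'
LOAD CSV WITH HEADERS FROM "file:///movies.csv" AS row
MERGE (m:Movie {id: row.id})
SET m.title = row.title;
EOF

# Run the script only if neo4j-shell is available at the expected location.
if [ -x "${NEO4J_HOME:-}/bin/neo4j-shell" ]; then
  "$NEO4J_HOME/bin/neo4j-shell" -file import-file.cypher
else
  echo "neo4j-shell not found; wrote import-file.cypher only"
fi
```

MERGE is used rather than CREATE so that running the script a second time does not duplicate nodes. The import scripts shipped with a given data set may be structured differently.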

Using a copy of a Neo4j database

Other data sets are provided as a compressed copy (zip) of a Neo4j datastore. You can find the datastore files in $NEO4J_HOME/data/databases/[graph.db] or in the directory you selected at startup. Stop your Neo4j server, then uncompress the store files into the appropriate directory.
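
The stop, unpack, restart sequence could be scripted roughly as follows. This is a sketch under assumptions: install_store is a hypothetical helper, drwho.zip is used only as an example archive name, and this guide does not say whether an archive contains a top-level graph.db folder, so inspect it with unzip -l before unpacking.

```shell
#!/bin/sh
# Sketch only: install_store is a hypothetical helper, not part of Neo4j.

# install_store ZIPFILE TARGET_DIR
# Unpacks a zipped datastore into TARGET_DIR; fails if the archive is missing.
install_store() {
  zipfile=$1
  target=$2
  if [ ! -f "$zipfile" ]; then
    echo "archive not found: $zipfile" >&2
    return 1
  fi
  mkdir -p "$target" || return 1
  unzip -q "$zipfile" -d "$target"
}

# Example usage, guarded so it only runs when both Neo4j and the archive exist.
if [ -x "${NEO4J_HOME:-}/bin/neo4j" ] && [ -f drwho.zip ]; then
  "$NEO4J_HOME/bin/neo4j" stop
  install_store drwho.zip "$NEO4J_HOME/data/databases"
  "$NEO4J_HOME/bin/neo4j" start
fi
```

Checking for the archive before touching the target directory means a mistyped file name leaves the existing databases directory untouched.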

Some data sets were built with an older Neo4j version than the one you are running. In that case you might need to configure Neo4j to upgrade the database automatically by setting dbms.allow_format_migration=true in $NEO4J_HOME/conf/neo4j.conf
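
If you prefer to script that configuration change, something like the sketch below would work. set_property is a hypothetical helper written for this example, and the path assumes the configuration file is neo4j.conf under $NEO4J_HOME/conf, as in a standard installation. It replaces an existing assignment of the key or appends one if none is present.

```shell
#!/bin/sh
# Sketch only: set_property is a hypothetical helper, not part of Neo4j.

# set_property FILE KEY VALUE
# Removes any existing "KEY=..." line from FILE and appends KEY=VALUE.
set_property() {
  file=$1
  key=$2
  value=$3
  [ -f "$file" ] || { echo "config file not found: $file" >&2; return 1; }
  tmp="$file.tmp.$$"
  # Keep every line that does not start with "KEY=" (plain string match).
  awk -v k="$key" 'index($0, k "=") != 1 { print }' "$file" > "$tmp" || return 1
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Example usage, guarded so it only runs when the config file exists.
conf="${NEO4J_HOME:-}/conf/neo4j.conf"
if [ -f "$conf" ]; then
  set_property "$conf" dbms.allow_format_migration true
fi
```

Commented-out lines such as #dbms.allow_format_migration=... are left in place, since they have no effect on the running configuration.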

You can also run neo4j-shell on the extracted directory directly:

$NEO4J_HOME/bin/neo4j-shell -path /path/to/graph.db

The Data Sets

| Title | Description | Code | Download |
| --- | --- | --- | --- |
| Jim Webber’s Doctor Who Data Set | The Dr. Who universe of doctors, actors, enemies and props from the Neo4j Koans Tutorial. | GitHub | drwho.zip |
| Movie Database | 12k movies, 50k actors. Original Source: TheMovieDB | GitHub | cineasts_12k_movies_50k_actors.zip (14MB) |
| The Musicbrainz main entities | Most of the interesting entities (800,000 Artists, 12,000,000 Tracks, 1,200,000 Releases, 75,000 Record Labels) from the Musicbrainz dataset. | Blog Post | musicbrainz_21.zip (4.5GB) |

Neo4j Graph Gists

Neo4j Graph Gist examples are a great source of datasets for getting started, as they come not only with the example data setup but also with explanations and use-case queries.

Public Datasets with Instructions

These are not prebuilt data-stores but existing data sets (mostly CSV) to be imported.

The linked articles and repositories also provide instructions for the import.