Goals
There are many example datasets for Neo4j. This guide outlines some of them.
Prerequisites
You should be comfortable installing Neo4j and importing data into it.
We want to cover data sets from different domains. For each data set we want to provide a description, the graph model, and some use-case queries.
This is a work in progress, so some data sets might not be updated with that information yet.
Loading Data from Source
The most reliable way to get a dataset into Neo4j is to import it from the raw sources. You are then independent of database and store versions, which you would otherwise have to upgrade. That is why we provide raw data (CSV, JSON, XML) for many of the data sets, accompanied by import scripts.
You would import the data using a command-line client like neo4j-shell or CyCli, a modern, colorful Neo4j client that understands the same file format.
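As a rough illustration of what such an import can look like, the sketch below writes a small Cypher import script and builds the command line that would hand it to neo4j-shell. The file name `people.csv`, its columns `id` and `name`, the `Person` label, and both helper functions are hypothetical and not taken from any of the data sets. It also assumes that your neo4j-shell version accepts a script via its `-file` option.

```python
import tempfile
from pathlib import Path

# Hypothetical CSV file and graph model, used only for illustration.
IMPORT_SCRIPT = """\
// Create one Person node per CSV row (assumed columns: id, name).
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
MERGE (p:Person {id: row.id})
SET p.name = row.name;
"""


def write_import_script(directory: Path, name: str = "import.cypher") -> Path:
    """Write the Cypher import script into `directory` and return its path."""
    script = directory / name
    script.write_text(IMPORT_SCRIPT, encoding="utf-8")
    return script


def shell_command(script: Path) -> list[str]:
    """Build the neo4j-shell invocation that executes the script file."""
    # Assumption: neo4j-shell reads statements from a file given with -file.
    return ["neo4j-shell", "-file", str(script)]


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    script = write_import_script(workdir)
    # Only print the command; running it requires a Neo4j installation.
    print(" ".join(shell_command(script)))
```

The import scripts that ship with a data set take the place of `IMPORT_SCRIPT` here; the sketch only shows how the script file and the client fit together.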
Using a copy of a Neo4j database
Other data sets are provided as a compressed copy (zip) of a Neo4j datastore. You can find the datastore files in $NEO4J_HOME/data/databases/[graph.db] or in the directory you selected at startup. Please stop your Neo4j server and uncompress the store files into the appropriate directory.
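The unpacking step can be scripted. The following is a minimal sketch using only the Python standard library; the function name `install_store` and its parameters are made up for this example, and it assumes the archive contains the store files at its top level rather than inside a wrapping folder.

```python
import zipfile
from pathlib import Path


def install_store(archive: Path, neo4j_home: Path, db_name: str = "graph.db") -> Path:
    """Extract a zipped datastore into <neo4j_home>/data/databases/<db_name>.

    Stop the Neo4j server before calling this. The function refuses to
    extract into a non-empty target directory, so an existing database
    is never silently mixed with the downloaded one.
    """
    target = neo4j_home / "data" / "databases" / db_name
    if target.exists() and any(target.iterdir()):
        raise FileExistsError(f"{target} is not empty; move it away first")
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target
```

A call such as `install_store(Path("dataset.zip"), Path(os.environ["NEO4J_HOME"]))` (with `os` imported and a hypothetical archive name) would place the files where a default installation expects them. If the archive wraps its files in a top-level folder, extract into the parent directory instead.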
For some data sets, the Neo4j version they were built with might be older than your Neo4j version. In that case you might need to configure Neo4j to upgrade your database automatically, by setting the corresponding upgrade option in its configuration.
You can also run neo4j-shell directly on the extracted directory.
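A minimal sketch of such an invocation follows. It assumes that your neo4j-shell version supports `-path` to open a store directory and `-c` to run a single command and exit. The helper `shell_on_store` and the example path are hypothetical. The server should not be running against the same directory, since the store can only be opened by one process at a time.

```python
import shlex
from pathlib import Path


def shell_on_store(store_dir: Path, query: str) -> list[str]:
    """Build a neo4j-shell command that opens `store_dir` and runs one query."""
    # Assumption: -path selects the store directory, -c runs one command.
    return ["neo4j-shell", "-path", str(store_dir), "-c", query]


if __name__ == "__main__":
    cmd = shell_on_store(
        Path("data/databases/graph.db"),  # hypothetical extracted directory
        "MATCH (n) RETURN count(n);",
    )
    # Print a copy-and-paste ready command instead of executing it.
    print(shlex.join(cmd))
```

Counting the nodes is a quick sanity check that the extracted store opens and contains data.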
The Data Sets
Jim Webber’s Doctor Who Data Set
The Doctor Who universe of Doctors, actors, enemies and props from the Neo4j Koans Tutorial.
12k movies, 50k actors. Original Source: TheMovieDB
The MusicBrainz main entities
Most of the interesting entities (800,000 Artists, 12,000,000 Tracks, 1,200,000 Releases, 75,000 Record Labels) from the MusicBrainz dataset.
Neo4j Graph Gists
Neo4j Graph Gist examples are a great source of datasets to get started with, as they come not only with the example data setup but also with explanations and use-case queries.
Public Datasets with Instructions
These are not prebuilt data-stores but existing data sets (mostly CSV) to be imported.
The linked articles and repositories also provide instructions for the import.
- The Panama Papers
- Northwind Database Import
- Importing Stack Overflow into Neo4j
- The Cosmic Web of Galaxies
- Chicago Crime Dataset
- How I met your Mother Series
- Awesome Public Datasets
- Consumer Complaint Data
- Football (Soccer) World Cup, Data Model
- Flight & Airline, Music, Train Schedules
- Kaggle Publication Dataset
- GitHub Event Data