Here you can find a list of available example datasets for Neo4j and learn how to import and explore them.
When getting started with Neo4j, it helps to use example datasets relevant to your domain and use case. For each dataset, we aim to provide a description, the graph model, and some use-case queries.
Neo4j Browser comes with two built-in databases, which you can create and explore using interactive slideshows.
The "Movies" example is launched with the :guide movie-graph command and contains a small graph of movies and people related to those movies as actors, directors, producers, etc.
The "Northwind" example is run with the :guide northwind-graph command and contains a traditional retail system with products, orders, customers, suppliers and employees.
It walks you through the import of the data and increasingly complex queries using the available data.
Other example datasets that you can run within your own Neo4j Browser are:
Game of Thrones Interactions
UK company registration, property ownership, political donations
Stack Overflow users, tags and Q&A data
BBC Good Foods recipe data
Airbnb listings data
Football (Soccer) transfer data
When creating a new database in Neo4j AuraDB, besides the default empty database, you can also select one of the starter datasets:
Graph based Recommendations
Graphs for Cybersecurity
You can explore them by following the Browser guide instructions and test the data with the suggested Cypher queries.
In addition, you have a few options to load graph data into Aura from other Neo4j instances:
Load a dump from a Neo4j Sandbox backup.
Load a dump from a Neo4j Graph Example repository.
Load a dump from Neo4j Desktop.
For more information, you can read the blog post Week 10 — Getting Dumps and Example Projects into Aura Free and watch the corresponding video from the series Discover Neo4j Aura Free with Michael and Alex.
To explore a wide variety of datasets online without a local installation, you can use Neo4j Sandbox.
Each sandbox is available for at least three days after creation and can also be remotely accessed from applications using any Neo4j driver.
All sandboxes except the "blank" one come prepopulated with domain data and focus on use-case-specific queries.
All sandboxes provide access to Neo4j Browser, Neo4j Bloom, APOC, Graph Data Science, neosemantics (n10s) and a GraphQL integration.
The use cases range from
If you want to explore more graph databases, you can access the demo server at https://demo.neo4jlabs.com:7473.
This server hosts a number of datasets with read-only access for public use.
The username and password are the same as the database name.
For instance, for the recommendations database, the username is recommendations and the password is recommendations.
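As a minimal sketch of the naming convention above, the following shell function assembles a cypher-shell command line for one of the demo databases. The Bolt address used here (neo4j+s://demo.neo4jlabs.com:7687) is an assumption: this page only states the HTTPS address on port 7473, so verify the endpoint before relying on it. The function name demo_shell_cmd is hypothetical.

```shell
#!/bin/sh
# Sketch: print the cypher-shell command for a database on the demo server.
# ASSUMPTION: the Bolt endpoint and port below are not stated on this page.
DEMO_ADDRESS="${DEMO_ADDRESS:-neo4j+s://demo.neo4jlabs.com:7687}"

demo_shell_cmd() {
  db="$1"
  # Username, password and database name are identical on the demo server.
  printf './bin/cypher-shell -a %s -u %s -p %s -d %s\n' \
    "$DEMO_ADDRESS" "$db" "$db" "$db"
}

demo_shell_cmd recommendations
```

The function only prints the command, so you can inspect it before running it yourself.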
The most reliable way to get a dataset into Neo4j is to import it from the raw sources. That way you do not depend on the database version of a dump, which you might otherwise have to upgrade. That’s why we provide raw data (CSV, JSON, XML) for several of the datasets, accompanied by import scripts in Cypher.
You can run the Cypher script using a command-line client such as cypher-shell:
./bin/cypher-shell -u neo4j -p "password" -f import-file.cypher
You can also drag and drop or paste the script into Neo4j Browser (check that the multi-statement editor is enabled in the settings) and run it from there.
Other datasets are provided as a dump of a Neo4j datastore.
Stop your Neo4j server.
Import the file using the ./bin/neo4j-admin load --force true --from file.dump command.
Start the Neo4j server.
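The stop, load and start steps above can be sketched as a small shell script. This is an illustration, not an official procedure: the run helper, the load_dump function and the DRY_RUN switch are hypothetical names, and the script prints the commands by default instead of executing them. It assumes it is started from the Neo4j home directory.

```shell
#!/bin/sh
# Sketch: stop the server, load a dump file, start the server again.
# DRY_RUN (hypothetical switch) defaults to 1, which only prints the commands.
# Set DRY_RUN=0 to execute them from the Neo4j home directory.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"   # print the command words, not shell-quoted
  else
    "$@"
  fi
}

load_dump() {
  dump_file="$1"
  run ./bin/neo4j stop &&
  run ./bin/neo4j-admin load --force true --from "$dump_file" &&
  run ./bin/neo4j start
}

load_dump file.dump
```

Each step only runs if the previous one succeeded, so a failed load does not restart the server on a half-imported store.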
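If your Neo4j version supports multiple databases (4.0 and later), you can instead load the dump into a named database:
Import the file using the ./bin/neo4j-admin load --force true --from file.dump --database <dbname> command.
Make the new database known to the system database with CREATE DATABASE dbname, which will also automatically start it.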
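For the named-database variant, a sketch along the same lines might look as follows. The names run, load_named_dump and DRY_RUN are hypothetical, the database name stackoverflow is only an example, and the credentials are the placeholders used elsewhere on this page. The script prints the commands by default rather than executing them.

```shell
#!/bin/sh
# Sketch: load a dump into a named database, then register it with CREATE DATABASE.
# DRY_RUN (hypothetical switch) defaults to 1, which only prints the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"   # print the command words, not shell-quoted
  else
    "$@"
  fi
}

load_named_dump() {
  dump_file="$1"
  db_name="$2"
  run ./bin/neo4j-admin load --force true --from "$dump_file" --database "$db_name" &&
  # CREATE DATABASE is an administration command, so it runs against the system database.
  run ./bin/cypher-shell -u neo4j -p "password" -d system "CREATE DATABASE $db_name"
}

load_named_dump file.dump stackoverflow
```

With DRY_RUN=0 the Cypher statement is passed to cypher-shell as a single argument; the printed form drops the quotes, so copy the command from the script rather than from the dry-run output.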
The Neo4j version of some of the datasets might be older than your Neo4j version.
Then you might need to configure Neo4j to upgrade your database automatically, by setting
This is a graph import of the Stack Overflow archive with 16.4M questions, 52k tags and 8.9M users (Stack Overflow Dump (6.2GB)). The graph is fairly large: for global graph queries you need a page cache of 6G and a heap of 16G to work with it.
Here is an article explaining the data model and some exploratory analysis we ran on the data.
The database is available on the demo server as outlined above.
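If you run the Stack Overflow graph locally, the page cache and heap sizes mentioned above could be set roughly as in this sketch. The setting names are an assumption tied to Neo4j 4.x (Neo4j 5 renamed them to server.memory.*), and the write_memory_settings function is hypothetical. The example writes to a scratch file; for a real installation you would point it at conf/neo4j.conf.

```shell
#!/bin/sh
# Sketch: append memory settings for the Stack Overflow graph to a config file.
# ASSUMPTION: Neo4j 4.x setting names; Neo4j 5 uses server.memory.* instead.
write_memory_settings() {
  conf_file="$1"
  cat >> "$conf_file" <<'EOF'
dbms.memory.pagecache.size=6g
dbms.memory.heap.initial_size=16g
dbms.memory.heap.max_size=16g
EOF
}

# Write to a scratch file here and show the result.
scratch="$(mktemp)"
write_memory_settings "$scratch"
cat "$scratch"
rm -f "$scratch"
```

Setting the initial and maximum heap to the same value is a common choice to avoid heap resizing, not something this page prescribes.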