
Developer Guides


Guide: Example Datasets

Goals
This Guide introduces different example datasets for Neo4j and demonstrates how to import and explore them.
Prerequisites
You should be comfortable installing Neo4j (Desktop, Docker) or spinning up an instance in the cloud.
Intermediate

Datasets

When getting started with Neo4j, it helps to use example datasets relevant to your domain and use cases. For each dataset we aim to provide a description, the graph model, and some use-case queries.

Have fun, and send us feedback to devrel@neo4j.com or raise an issue for the site if something doesn’t work as expected.

Built-In Examples

Neo4j Browser comes with two built-in examples, which you can create and explore using interactive slideshows.

The “Movies” example is launched via the :play movie-graph command and contains a small graph of movies and the people related to them as actors, directors, producers, etc.
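
Once the Movies graph has been created, a first query might look like the sketch below. It assumes the labels Person and Movie, the ACTED_IN relationship and the name and title properties that the guide's dataset uses; adjust them if your copy differs.

```cypher
// Assumes the graph from :play movie-graph has already been created.
// List everyone who acted in a given movie, sorted by name.
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.title = "The Matrix"
RETURN p.name AS actor
ORDER BY actor;
```

Swapping ACTED_IN for another relationship type in the graph is an easy way to explore the other roles.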


The “Northwind” example is run via :play northwind-graph and contains a traditional retail system with products, orders, customers, suppliers and employees. It walks you through importing the data and then through increasingly complex queries on it.
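
As an illustration of the kind of query the Northwind guide builds up to, the following sketch lists the product categories each supplier covers. The labels Supplier, Product and Category and the properties companyName and categoryName are assumed to match what the guide's import creates.

```cypher
// Assumes the Northwind data has been imported via :play northwind-graph.
// For each supplier, collect the distinct categories of the products it supplies.
MATCH (s:Supplier)-->(:Product)-->(c:Category)
RETURN s.companyName AS company,
       collect(DISTINCT c.categoryName) AS categories
ORDER BY company;
```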


Neo4j Sandboxes

To explore a wide variety of datasets in an online setup without a local installation, you can use the Neo4j sandbox: https://neo4j.com/sandbox

Each sandbox is available for at least 3 days after creation and can also be remotely accessed from applications using any Neo4j driver.


Except for the “blank” sandbox, all sandboxes come prepopulated with domain data and focus on use-case-specific queries.

The use cases include:

  • movie recommendations,
  • network management,
  • investigative data from the ICIJ Panama Papers,
  • crime investigation, and
  • social networks, optionally using your own Twitter account.

Other Guide Examples

Other examples that you can quickly run within your own Neo4j Browser are:

  • :play got Game of Thrones Interactions
  • :play nasa NASA knowledge graph example
  • :play ukcompanies UK company registration, property ownership, political donations
  • :play stackoverflow Stack Overflow users, tags and Q&A data
  • ` ` BBC Good Foods recipe data
  • :play listings Airbnb listings data
  • :play football_transfers Football (Soccer) transfer data

Means of Data Import

Loading Data from Source Data

The most reliable way to get a dataset into Neo4j is to import it from the raw sources. That way you do not depend on the database version a prebuilt store was created with, which you might otherwise have to upgrade. That’s why we provide raw data (CSV, JSON, XML) for several of the datasets, accompanied by import scripts in Cypher.

Run the Cypher script with a command-line client such as cypher-shell.


Run Cypher Shell from the “Terminal” tab of your Graph in Neo4j Desktop
./bin/cypher-shell -u neo4j -p "password" < import-file.cypher

You can also drag and drop or paste the script into Neo4j Browser (check that the multi-statement editor is enabled in the settings) and run it from there.


CSV data can be imported using either the LOAD CSV clause in Cypher or, for initial bulk imports of large datasets, neo4j-admin import --mode csv.
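
A minimal LOAD CSV sketch follows. The file name people.csv, its id and name columns and the Person label are hypothetical and only serve as an illustration; a file:/// URL is resolved relative to the database's import directory.

```cypher
// Hypothetical input: people.csv in the import directory, with a header row "id,name".
// Each row becomes (or updates) one Person node, keyed by its id.
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
MERGE (p:Person {id: row.id})
SET p.name = row.name;
```

MERGE keeps the import idempotent, so re-running the script does not create duplicate nodes. Note that LOAD CSV reads every field as a string, so numeric columns need an explicit conversion such as toInteger(row.id).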

For JSON, XML, XLS, etc. you need to have the APOC utility library installed, which provides procedures for importing these formats as well as data from other databases.
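
For JSON, an import with APOC might look like the sketch below. The file people.json and its id and name fields are hypothetical, and the example assumes the file holds an array of objects. Reading local files through APOC typically also requires apoc.import.file.enabled=true in the configuration.

```cypher
// Hypothetical input: people.json containing objects such as {"id": "1", "name": "Alice"}.
// apoc.load.json yields one map per JSON object as "value".
CALL apoc.load.json('file:///people.json') YIELD value
MERGE (p:Person {id: value.id})
SET p.name = value.name;
```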

Using a copy or dump of a Neo4j database

Other datasets are provided as a dump of a Neo4j datastore.

  1. Please stop your Neo4j server.
  2. Then you can import the file using the ./bin/neo4j-admin load --force true --from file.dump command.
The Neo4j version of some of the datasets might be older than your Neo4j version. In that case you may need to configure Neo4j to upgrade the database automatically by setting dbms.allow_upgrade=true in your Neo4j settings, or directly in $NEO4J_HOME/conf/neo4j.conf.
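
After loading a dump and restarting the server, a quick sanity check is to count nodes per label combination. This is a generic sketch that makes no assumption about the dataset's model; on very large graphs it scans every node and can take a while.

```cypher
// Count nodes grouped by their label combination to confirm the dump loaded.
MATCH (n)
RETURN labels(n) AS labels, count(*) AS nodes
ORDER BY nodes DESC
LIMIT 25;
```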

Large Data Dumps

Stack Overflow

This is a graph import of the Stack Overflow archive with 16.4M questions, 52k tags and 8.9M users (Stack Overflow Dump (6.2GB)). This graph is pretty big; for best full-scale querying you’d need a page-cache and heap of

Here is an article explaining the data model and some exploratory analysis we ran on the data.


The database is also available as a Neo4j Online Database with username “stackoverflow” and password “stackoverflow”.