
Using the Neo4j ETL Tool for Import

About this module

You have learned how to import data into the graph using Cypher, APOC, the neo4j-admin tool, and also using a client application.

Another way that you can import data into the graph is with the Neo4j ETL Tool.

At the end of this module, you should be able to:

  • Create a connection to a live RDBMS.
  • Customize mappings from the RDBMS to the graph.
  • Import data from the RDBMS to the graph.

Why use the Neo4j ETL Tool for the import?

  • It imports directly from a live RDBMS into a running Neo4j DBMS, so both the source RDBMS and the target DBMS must be online.
  • It enables you to control how much of the data in an existing RDBMS will be imported into the graph.
  • It also enables you to customize how nodes and relationships will be created in the graph.

Steps for importing with the ETL Tool

  1. Install the Neo4j ETL Tool into your Neo4j Desktop project.
  2. Create and start the Neo4j database into which you will import the data.
  3. Use the Neo4j ETL Tool to import the data:
    1. Specify and test the RDBMS connection.
    2. Prepare for mapping.
    3. View the default mapping to be performed.
    4. Optionally, modify the default mapping.
    5. Perform the import.

Install the Neo4j ETL Tool

In Neo4j Desktop, open an existing project or create a new one. Then click the Add Application icon to add the Neo4j ETL Tool to the project.

AddETLTool

Here is what your project should look like with the Neo4j ETL Tool added:

ETLToolAdded

Note
Make sure that you have updated Neo4j Desktop so that it has the latest version of the ETL Tool. At the time of writing, the latest version is 1.5.0.

Create and start the Neo4j instance

To import using the Neo4j ETL Tool, the Neo4j instance into which you will import the data must be started. For a new project, click Add Graph to create a Neo4j instance, and then start it.

CreateGraphForImport

Note
You can skip this step if you plan to use an existing project that already has a Neo4j instance running.

Create the database you will import data into

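You will most likely be importing the data into a newly created database. In Neo4j 4.0, database administration commands such as the following are executed against the system database.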

CREATE DATABASE customers;
SHOW DATABASES;

DatabaseForImport

Next, you are ready to use the Neo4j ETL Tool for import.

Starting the Neo4j ETL Tool

Here is the initial page you see when you start the Neo4j ETL tool:

OpenETLTool

The first thing you should do is connect to the RDBMS.

Connecting to the RDBMS

Here is an example of the connection information for an existing RDBMS from which we will retrieve data for the import.

JDBCConnection
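The connection that the tool opens to the RDBMS is a JDBC connection, which is identified by a JDBC URL. As a minimal sketch of how a host, port, and database name combine into such a URL, the hypothetical helper below assumes a PostgreSQL source on its default port 5432. The function name, the defaults, and the example host are illustrative assumptions and not part of the Neo4j ETL Tool; the `jdbc:<vendor>://<host>:<port>/<database>` shape applies to drivers such as PostgreSQL and MySQL, while other vendors use different URL formats.

```python
from urllib.parse import quote


def build_jdbc_url(host, database, port=5432, vendor="postgresql"):
    """Assemble a JDBC URL of the form jdbc:<vendor>://<host>:<port>/<database>.

    The vendor and port defaults are assumptions (PostgreSQL); adjust them
    for the RDBMS you actually connect to.
    """
    if not host or not database:
        raise ValueError("host and database are required")
    # Percent-encode the database name so unusual characters stay valid in the URL.
    return f"jdbc:{vendor}://{host}:{port}/{quote(database, safe='')}"


if __name__ == "__main__":
    # Hypothetical host name, used only for illustration.
    print(build_jdbc_url("rdbms.example.com", "northwind"))
```

The user name and password are deliberately left out of the URL here, since they are entered separately when you define the connection.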

Connection tested and saved

You must test and save the RDBMS connection to ensure the Neo4j ETL Tool will be able to access the RDBMS.

ConnectionSaved

Prepare for mapping

After you have connected to the RDBMS, you must select the Neo4j Desktop project and the Neo4j instance to use for the mapping.

PrepareForMapping

You then click START MAPPING to begin the mapping.

Successful mapping

If the Neo4j ETL Tool can successfully derive a mapping from the RDBMS, you will see a message that the mapping was successful. You can clear the message and then click NEXT.

MappingSuccessful

Default node mapping

For the northwind RDBMS, here is the default mapping that could be used to import the nodes.

DefaultNodeMapping

Default relationship mapping

And here is the default relationship mapping.

DefaultRelationshipMapping
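To make the idea of a derived mapping more concrete, here is a simplified sketch of how relational metadata could be turned into nodes and relationships. The rules it applies (a regular table becomes a node label, a foreign key becomes a relationship, and a table consisting only of two foreign keys becomes a relationship) are a common convention for relational-to-graph mapping and are assumed here for illustration. The `Table` class, the `derive_mapping` function, and the way relationship types are named are hypothetical and do not describe the tool's actual implementation or output.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    columns: list
    # Maps a foreign-key column to the table it references.
    foreign_keys: dict = field(default_factory=dict)


def derive_mapping(tables):
    """Return (node_labels, relationships) under simplified, assumed rules."""
    node_labels = {}
    relationships = []
    for table in tables:
        fk_columns = set(table.foreign_keys)
        is_join_table = len(fk_columns) == 2 and fk_columns == set(table.columns)
        if is_join_table:
            # A table made only of two foreign keys becomes a relationship.
            start, end = (table.foreign_keys[c] for c in table.columns)
            relationships.append((start, table.name.upper(), end))
            continue
        # Any other table becomes a node label; non-key columns become properties.
        node_labels[table.name] = [c for c in table.columns if c not in fk_columns]
        for column, target in table.foreign_keys.items():
            # Each foreign key becomes a relationship to the referenced table.
            relationships.append((table.name, column.upper(), target))
    return node_labels, relationships


if __name__ == "__main__":
    # Hypothetical table definitions, loosely modeled on a northwind-style schema.
    tables = [
        Table("customers", ["customer_id", "company_name"]),
        Table("orders", ["order_id", "customer_id"], {"customer_id": "customers"}),
    ]
    print(derive_mapping(tables))
```

In the tool itself you do not write such rules; you review the mapping it proposes and adjust it where needed.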

Changing node labels and what data will be imported

In the node tab, you can rename node labels and mark data to be skipped during the import.

ChangeNodeLabels
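Conceptually, renaming labels and skipping data amounts to applying a set of overrides to the default mapping. The sketch below illustrates that idea with plain dictionaries; the function `customize_nodes`, its parameters, and the sample table names are hypothetical and are not an API of the Neo4j ETL Tool, where you make these changes in the user interface.

```python
def customize_nodes(default_labels, renames=None, skipped=()):
    """Apply label renames and skips to a {table: label} default mapping."""
    renames = renames or {}
    return {
        # Use the override if one exists, otherwise keep the default label.
        table: renames.get(table, label)
        for table, label in default_labels.items()
        # Skipped tables are left out of the import entirely.
        if table not in skipped
    }


if __name__ == "__main__":
    defaults = {"customers": "Customers", "orders": "Orders", "us_states": "UsStates"}
    print(customize_nodes(defaults, renames={"customers": "Customer"}, skipped={"us_states"}))
```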

Changing node properties

By selecting the edit icon for a node, you can also change which properties are imported, as well as their names and types.

NodeProperties

Changing relationship types

You can modify the names of relationships and whether they will be skipped during the import.

RelationshipMapping

Saving the mapping

After you customize the mapping, you should always save it.

SaveMapping

Then you click NEXT to continue to the import.

Select the database for import

Before you import, you must select the currently started database that you want to import the data into.

PrepareToImport

Import the data

You then click IMPORT DATA to import the data.

ImportSuccessful

Exercise 20: Importing data using Neo4j ETL Tool

  1. In an existing project, create a new database named northwind.
  2. Install Neo4j ETL Tool for the project.
  3. Restart the Neo4j instance.
  4. Configure a JDBC connection with these guidelines:
    1. database name: northwind
    2. host: db-examples.cmlvojdj5cci.us-east-1.rds.amazonaws.com
    3. user: n4examples
    4. password: 36gdOVABr3Ex
  5. Import the data into the northwind database.

You will only be able to perform the steps of this exercise if you use Neo4j Desktop. Estimated time to complete: 10 minutes.

Check your understanding

Question 1

What type of connection to the RDBMS is used for importing from an RDBMS with the Neo4j ETL Tool?

Select the correct answer.

  • Java
  • ODBC
  • JDBC
  • Bolt

Question 2

What are some of the things that you can modify for the mapping from the RDBMS?

Select the correct answers.

  • What nodes will be created.
  • What relationships will be created.
  • Node labels.
  • Relationship types.

Question 3

What property information can be modified in the mapping?

Select the correct answers.

  • Node property names
  • Node property types
  • Relationship property names
  • Relationship property types

Summary

You should now be able to:

  • Create a connection to a live RDBMS.
  • Customize mappings from the RDBMS to the graph.
  • Import data from the RDBMS to the graph.
