Using the Neo4j ETL Tool for Import

About this module

You have learned how to import data into the graph using Cypher, APOC, the neo4j-admin tool, and also using a client application.

Another way that you can import data into the graph is with the Neo4j ETL Tool.

At the end of this module, you will be able to:

  • Create a connection to a live RDBMS.

  • Customize mappings from the RDBMS to the graph.

  • Import data from the RDBMS to the graph.

Why use the Neo4j ETL Tool for the import?

  • The Neo4j ETL Tool imports from a live RDBMS, so both the source RDBMS and the target Neo4j DBMS must be online.

  • It enables you to control how much of the data in an existing RDBMS will be imported into the graph.

  • It also enables you to customize how nodes and relationships will be created in the graph.

The Neo4j ETL Tool runs only from Neo4j Desktop. The target can be a local DBMS, or a Neo4j Sandbox or Neo4j Aura database that you connect to from Neo4j Desktop, but in every case you start the tool from Neo4j Desktop.

Guided Exercise: Importing with the ETL Tool

Here are the steps you will follow:

  1. Ensure the Neo4j DBMS is started.

  2. Start the Neo4j ETL Tool.

  3. Specify the target Neo4j database.

  4. Specify and test the source RDBMS connection.

  5. Prepare for mapping.

  6. View the default mapping to be performed.

    1. Optionally, modify the default mapping.

  7. Save the mapping.

  8. Import the data.

Step 1: Ensure the Neo4j DBMS is started

  • If your database will be hosted in the local DBMS, make sure that it is created and started in your Neo4j Desktop project.

  • If your database will be hosted in a Neo4j Sandbox or Neo4j Aura, make sure that you add a connection to it from your project in Neo4j Desktop and connect to it.

Example: Create the database you will import data into

You will most likely import the data into a newly created database. Here is how you create a database in the DBMS:

CREATE DATABASE customers;
DatabaseForImport
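
To confirm that the new database exists and is online before you start the tool, a check along these lines can be used. It is a minimal sketch and assumes a DBMS version and edition that supports multiple databases, which the CREATE DATABASE command above already requires.

```cypher
// Administrative command; Neo4j Browser routes it to the system database.
SHOW DATABASE customers;
```

The currentStatus column in the result should read online. In Neo4j Browser, the :use customers command then switches the session to the new database.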

Next, you are ready to use the Neo4j ETL Tool for import.

Step 2: Start the Neo4j ETL Tool

For the started local DBMS or remote connection, open the ETL Tool by selecting it in the dropdown list for the Open button:

StartETLTool

Select Allow when asked if you want to allow background processes to execute.

Here is the initial page you see when you start the Neo4j ETL Tool:

OpenETLTool

Step 3: Specify target Neo4j database

Next, you can select the Neo4j Desktop project and the local or remote DBMS that you will be importing into.

SelectTargetDB

Step 4: Specify and test the source RDBMS connection

Next, connect to the RDBMS by clicking ADD CONNECTION.

Specify the connection information

Provide this information for the JDBC Connection:

  • Connection Name: Northwind

  • Host: db-examples.cmlvojdj5cci.us-east-1.rds.amazonaws.com

  • Type: postgresql

  • Database: northwind

  • Username: n4examples

  • Password: 36gdOVABr3Ex

This is what your configuration will look like:

JDBCConnection
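
The connection fields above correspond to a PostgreSQL JDBC URL of the form jdbc:postgresql://<host>/<database>. As an optional cross-check outside the ETL Tool, the sketch below runs a trivial query over the same connection using APOC. It assumes that the APOC library (APOC Extended on Neo4j 5) and the PostgreSQL JDBC driver are installed in the target DBMS, which this module does not otherwise require. The statement SELECT 1 AS ok is only a placeholder query.

```cypher
// Assumption: APOC and the PostgreSQL JDBC driver are installed in the target DBMS.
// The URL is assembled from the Host, Database, Username, and Password values above.
// The port is omitted, so the driver falls back to the PostgreSQL default.
CALL apoc.load.jdbc(
  'jdbc:postgresql://db-examples.cmlvojdj5cci.us-east-1.rds.amazonaws.com/northwind?user=n4examples&password=36gdOVABr3Ex',
  'SELECT 1 AS ok'
) YIELD row
RETURN row;
```

If the RDBMS is reachable, the call returns one row. A connection error usually points to a network or credential problem, which the TEST AND SAVE CONNECTION button in the tool would also report.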

Test the connection to the RDBMS

Next, you must test and save the RDBMS connection to ensure the Neo4j ETL Tool will be able to access the RDBMS. Click the TEST AND SAVE CONNECTION button.

You will see this message, which you can then close:

ConnectionSaved

Step 5: Prepare for mapping

In the Load Data Model window, ensure that both the source and target databases are selected as shown here.

You then click START MAPPING to begin the mapping.

ClickStartMapping

Successful mapping

If the Neo4j ETL Tool can successfully derive a mapping from the RDBMS, you will see a message that the mapping was successful.

MappingSuccessful

You can clear the message and then click NEXT.

Step 6: Review the default mapping

For the northwind RDBMS, here is the default mapping that could be used to import the nodes.

DefaultNodeMapping

And here is the default relationship mapping.

DefaultRelationshipMapping

Optional: Changing node labels and what data will be imported

In the node tab, you can rename node labels and choose which data to skip during the import.

ChangeNodeLabels

Optional: Changing node properties

By selecting the edit icon for a node, you can also change which properties are imported, along with their names and types.

NodeProperties

Optional: Changing relationship types

You can rename relationship types and choose whether each relationship is skipped during the import.

RelationshipMapping

Step 7: Save the mapping

After customizing the mapping, always save it by clicking the SAVE MAPPING button.

SaveMapping

Then you click NEXT to continue to the import.

Step 8: Import the data

Next, on the Import your data into Neo4j window, you can review the import information displayed.

ImportData

You then click IMPORT DATA to import the data.

ImportSuccessful

You can also confirm that the data is in the Neo4j database by viewing it in Neo4j Browser.

Imported
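
As a minimal sketch of that check, the following queries can be run one at a time in Neo4j Browser against the target database once the import has finished. They use only built-in Cypher and procedures, and they make no assumptions about the labels or relationship types chosen in the mapping. The column aliases are arbitrary.

```cypher
// Count nodes per label combination.
MATCH (n)
RETURN labels(n) AS nodeLabels, count(*) AS nodeCount
ORDER BY nodeCount DESC;

// Count relationships per type.
MATCH ()-[r]->()
RETURN type(r) AS relType, count(*) AS relCount
ORDER BY relCount DESC;

// Show the graph schema so it can be compared with the saved mapping.
CALL db.schema.visualization();
```

Labels or relationship types that you marked as skipped in the mapping should not appear in these results.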

Check your understanding

Question 1

What type of connection to the RDBMS is used for importing from an RDBMS with the Neo4j ETL Tool?

Select the correct answer.

  • Java

  • ODBC

  • JDBC

  • Bolt

Question 2

What are some of the things that you can modify for the mapping from the RDBMS?

Select the correct answers.

  • What nodes will be created.

  • What relationships will be created.

  • Node labels.

  • Relationship types.

Question 3

What property information can be modified in the mapping?

Select the correct answers.

  • Node property names

  • Node property types

  • Relationship property names

  • Relationship property types

Summary

You can now:

  • Create a connection to a live RDBMS.

  • Customize mappings from the RDBMS to the graph.

  • Import data from the RDBMS to the graph.