Graph Keys

When using the connector to write data, it’s necessary to indicate which elements of the dataframe correspond to the identifying properties / keys of the node that you’re writing.

In the Writing section, the following options were discussed, applying to the "Keys" strategy.

  • node.keys

  • relationship.source.node.keys

  • relationship.target.node.keys

The following sections describe how to use key mappings to express the connection between DataFrame format and desired graph schema.

Graph Key Format

Each of these fields is a comma-separated list of keys, such as field1,field2. In turn, each of the keys themselves can contain a mapping from a DataFrame attribute to a node property, such as EventID:id.

This mapping is always expressed in the order DataFrameID:NodeID, and allows for the data frame column name, and the Neo4j node property name to differ.

Simple Example

Probably the most common example will be to simply provide the name of a single attribute in the DataFrame; the node will receive a property of the same name.

my_person_dataframe.write
  .format("org.neo4j.spark.DataSource")
  .mode(SaveMode.Overwrite)
  .option("url", "bolt://localhost:7687")
  .option("labels", ":Person")
  .option("node.keys", "id")
  .save()

Complex Example

For example, let’s say that we wanted to write a dataframe of "Location" nodes. Imagine we had a dataframe that looked like this:

LocationName,LocationType
USA,Country
Richmond,City

Further, let’s assume that we need a compound key (both attributes must be used to uniquely identify a node) and that we want to use simpler names on node properties, so that we end up with Neo4j nodes like this:

(:Location { name: 'USA', type: 'Country' })
(:Location { name: 'Richmond', type: 'City' })

In order to do this, we would use the Graph Key expression of "LocationName:name,LocationType:type"

locations_dataframe.write
  .format("org.neo4j.spark.DataSource")
  .mode(SaveMode.Overwrite)
  .option("url", "bolt://localhost:7687")
  .option("labels", ":Location")
  .option("node.keys", "LocationName:name,LocationType:type")
  .save()