Write nodes
All the examples in this page assume that the SparkSession has been initialized with the appropriate connection options.
With the `labels` option, the connector writes a DataFrame to the Neo4j database as a set of nodes with the given labels.
The connector builds a `CREATE` or a `MERGE` Cypher® query (depending on the save mode) that uses the `UNWIND` clause to write a batch of rows (an `events` list whose size is defined by the `batch.size` option).
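The batching itself is easy to picture. The following is a minimal sketch in plain Python (no Spark required) of how rows are grouped into batches of at most `batch.size` elements; the function name is illustrative and not part of the connector's API.

```python
def to_batches(rows, batch_size):
    """Split rows into consecutive batches of at most batch_size elements."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]

rows = [
    {"name": "John", "surname": "Doe", "age": 42},
    {"name": "Jane", "surname": "Doe", "age": 40},
    {"name": "Jill", "surname": "Doe", "age": 38},
]

# With batch_size=2, the three rows produce two batches; each batch is
# sent as the $events parameter of a single UNWIND query.
batches = to_batches(rows, batch_size=2)
```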
The code from the example creates new nodes with the `:Person` label.
import org.apache.spark.sql.SaveMode
// Needed for the toDF() conversion on a Scala collection
import spark.implicits._

case class Person(name: String, surname: String, age: Int)

val peopleDF = List(
  Person("John", "Doe", 42),
  Person("Jane", "Doe", 40)
).toDF()

peopleDF.write
  .format("org.neo4j.spark.DataSource")
  .mode(SaveMode.Append)
  .option("labels", ":Person")
  .save()
# Create example DataFrame
peopleDF = spark.createDataFrame(
    [
        {"name": "John", "surname": "Doe", "age": 42},
        {"name": "Jane", "surname": "Doe", "age": 40},
    ]
)

(
    peopleDF.write.format("org.neo4j.spark.DataSource")
    .mode("Append")
    .option("labels", ":Person")
    .save()
)
Equivalent Cypher query
UNWIND $events AS event
CREATE (n:Person)
SET n += event.properties
You can write nodes with multiple labels using the colon as a separator. The colon before the first label is optional.
peopleDF.write
  .format("org.neo4j.spark.DataSource")
  .mode(SaveMode.Append)
  // ":Person:Employee" and "Person:Employee" are equivalent
  .option("labels", ":Person:Employee")
  .save()
(
    peopleDF.write.format("org.neo4j.spark.DataSource")
    .mode("Append")
    # ":Person:Employee" and "Person:Employee" are equivalent
    .option("labels", ":Person:Employee")
    .save()
)
Node keys
With the `Overwrite` mode, you must specify the DataFrame columns to use as keys to match the nodes.
The `node.keys` option takes a comma-separated list of `key:value` pairs, where the key is the DataFrame column name and the value is the node property name.
If the column name and the node property name are the same, you can omit the value; for example, `"name:name,surname:surname"` is equivalent to `"name,surname"`.
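The `node.keys` format is straightforward to parse. The following is an illustrative sketch in plain Python (not the connector's actual implementation) showing how each comma-separated entry resolves to a column-to-property mapping, with the column name reused as the property name when no value is given.

```python
def parse_node_keys(value):
    """Parse a node.keys string into a {column: property} mapping."""
    mapping = {}
    for entry in value.split(","):
        column, _, prop = entry.strip().partition(":")
        # Fall back to the column name when the property name is omitted.
        mapping[column] = prop or column
    return mapping

parse_node_keys("name,surname")
# -> {"name": "name", "surname": "surname"}
parse_node_keys("name:fullName,surname")
# -> {"name": "fullName", "surname": "surname"}
```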
The same code using the `Overwrite` save mode:
peopleDF.write
  .format("org.neo4j.spark.DataSource")
  .mode(SaveMode.Overwrite)
  .option("labels", ":Person")
  .option("node.keys", "name,surname")
  .save()
Equivalent Cypher query
UNWIND $events AS event
MERGE (n:Person {
  name: event.keys.name,
  surname: event.keys.surname
})
SET n += event.properties
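The query above references both `event.keys` (used for matching in the `MERGE` pattern) and `event.properties` (applied by the `SET` clause). The following is a minimal sketch in plain Python, with illustrative names, of how a single DataFrame row can be split into those two maps given the configured key columns.

```python
def to_event(row, key_columns):
    """Split a row into the keys/properties maps used by the MERGE query."""
    return {
        # Only the configured key columns are used to match existing nodes.
        "keys": {k: row[k] for k in key_columns},
        # All columns are applied as node properties by the SET clause.
        "properties": dict(row),
    }

event = to_event(
    {"name": "John", "surname": "Doe", "age": 42},
    key_columns=["name", "surname"],
)
# event["keys"]       -> {"name": "John", "surname": "Doe"}
# event["properties"] -> {"name": "John", "surname": "Doe", "age": 42}
```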
Due to the concurrency of Spark jobs, when using the `Overwrite` save mode you should use a property uniqueness constraint to guarantee the uniqueness of nodes.