Project overview
The Neo4j Connector for Apache Spark is intended to make integrating graphs with Spark easy.
There are effectively two ways of using the connector:
- As a data source: you can read any set of nodes or relationships as a DataFrame in Spark.
- As a sink: you can write any DataFrame to Neo4j as a collection of nodes or relationships, or use a Cypher statement to process records in a DataFrame into the graph pattern of your choice.
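The two modes can be sketched in PySpark as follows. This is a minimal sketch, not a definitive recipe: it assumes the data source name `org.neo4j.spark.DataSource` and the option keys `url`, `labels`, `node.keys`, and `authentication.basic.*` match your connector release, and the helper names (`connection_options`, `read_nodes`, `write_nodes`) are hypothetical. Check the option names against the documentation for the version you install.

```python
# Data source name registered by the connector (assumed; verify it for
# the connector release you are using).
NEO4J_FORMAT = "org.neo4j.spark.DataSource"


def connection_options(url, username, password):
    """Build the options shared by reads and writes (hypothetical helper)."""
    return {
        "url": url,
        "authentication.basic.username": username,
        "authentication.basic.password": password,
    }


def read_nodes(spark, conn, labels):
    """Source mode: read all nodes with the given label(s) as a DataFrame."""
    return (
        spark.read.format(NEO4J_FORMAT)
        .options(**conn)
        .option("labels", labels)
        .load()
    )


def write_nodes(df, conn, labels, key):
    """Sink mode: write each DataFrame row as a node, matched on `key`."""
    (
        df.write.format(NEO4J_FORMAT)
        .mode("overwrite")  # assumed to update nodes matched on node.keys
        .options(**conn)
        .option("labels", labels)
        .option("node.keys", key)
        .save()
    )
```

With an existing `SparkSession` named `spark` and the connector JAR on the classpath, a call might look like `read_nodes(spark, connection_options("bolt://localhost:7687", "neo4j", "secret"), "Person")`; the URL and credentials here are placeholders.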
Multi-language support
Because the connector is based on the new Spark DataSource API, it also works from other Spark language interpreters, such as Python and R.
The API remains the same; in most cases only slight syntax changes are needed to accommodate the differences between, for example, Python and Scala.
Compatibility
Neo4j compatibility
This connector works with Neo4j 3.5 and the entire 4.x series of Neo4j, whether run as a single instance, in Causal Cluster mode, or run as a managed service in Neo4j AuraDB. The connector does not rely on Enterprise Edition features and as such works with Neo4j Community Edition as well, with the appropriate version number.
Neo4j versions prior to 3.5 are not supported.
Spark and Scala compatibility
This connector currently supports:
- Spark 2.4.5+ with Scala 2.11 and Scala 2.12.
- Spark 3.0+ with Scala 2.12.
You need a different JAR depending on your combination of Spark and Scala versions.
JARs are named in the form:
neo4j-connector-apache-spark_${scala.version}_${connector.version}_for_${spark.version}
Ensure that you have the appropriate JAR file for your environment. Here’s a compatibility table to help you choose the correct JAR.
| | Spark 2.4 | Spark 3.0+ |
|---|---|---|
| Scala 2.11 | | (not available) |
| Scala 2.12 | | |
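The supported combinations and the naming pattern above can be expressed as a small helper, which may be useful in build or deployment scripts. This is a sketch under stated assumptions: the function names are hypothetical, "Spark 3.0+" is read as any version at or above 3.0, and the values are substituted into the pattern verbatim, so the exact strings used in published JAR names (and the connector version in the example) should be checked against the actual release artifacts.

```python
def parse_version(version):
    """Turn a dotted version string such as '2.4.5' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_supported(spark_version, scala_version):
    """Check a Spark/Scala pair against the combinations listed above."""
    spark = parse_version(spark_version)
    if (2, 4, 5) <= spark < (3, 0):
        # Spark 2.4.5+ is listed with Scala 2.11 and Scala 2.12.
        return scala_version in ("2.11", "2.12")
    if spark >= (3, 0):
        # Spark 3.0+ is listed with Scala 2.12 only.
        return scala_version == "2.12"
    return False


def jar_base_name(scala_version, connector_version, spark_version):
    """Fill in the documented naming pattern with the given values verbatim."""
    return "neo4j-connector-apache-spark_{}_{}_for_{}".format(
        scala_version, connector_version, spark_version
    )
```

For example, `is_supported("3.0.1", "2.11")` returns `False`, which matches the "(not available)" cell for Scala 2.11 on Spark 3.0+.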
Training
If you want an introduction to the Neo4j Connector for Apache Spark, take a look at the training that Andrea Santurbano presented at NODES2020.
Availability
This connector is provided under the terms of the Apache 2.0 license, which can be found in the GitHub repository.
Support
For Neo4j Enterprise and Neo4j AuraDB customers, official releases of this connector are supported under the terms of your existing Neo4j support agreement. This support extends only to regular releases and excludes alpha, beta, and pre-releases. If you have any questions about the support policy, get in touch with Neo4j.