Neo4j Connector for Apache Spark
The Neo4j Connector for Apache Spark provides integration between Neo4j and Apache Spark.
You can use the connector to process and transfer data between Neo4j and other platforms such as Databricks and several data warehouses. Based on the Spark DataSource API, the connector supports all the programming languages that Spark supports.
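As a rough illustration of what using the connector through the DataSource API can look like, the following PySpark sketch reads nodes by label. It assumes the connector JAR is already on the Spark classpath and uses the format name `org.neo4j.spark.DataSource` together with the `url`, `labels`, and `authentication.basic.*` options. The connection URL, credentials, and the helper names `node_read_options` and `read_nodes` are placeholders introduced here for illustration, not part of the connector.

```python
# Format name under which the connector is assumed to register with Spark.
NEO4J_FORMAT = "org.neo4j.spark.DataSource"


def node_read_options(url, username, password, labels):
    """Build the option map for reading nodes with the given label(s).

    All values are placeholders supplied by the caller.
    """
    return {
        "url": url,
        "authentication.basic.username": username,
        "authentication.basic.password": password,
        "labels": labels,
    }


def read_nodes(spark, options):
    """Load the matching nodes into a DataFrame.

    `spark` is an existing SparkSession created with the connector available.
    """
    return spark.read.format(NEO4J_FORMAT).options(**options).load()


# Hypothetical usage (requires a running Neo4j instance and a SparkSession):
# options = node_read_options("neo4j://localhost:7687", "neo4j", "password", ":Customer")
# customers = read_nodes(spark, options)
# customers.show()
```

The same format name and options should apply from Scala, Java, or R, since the options are passed through Spark's generic reader interface.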
Graphs and DataFrames
The connector uses schema inference to convert Neo4j graphs into tabular Spark DataFrames. For example, consider a graph whose schema consists of :Customer and :Product nodes connected by the BOUGHT relationship. The connector creates a DataFrame with the :Customer and :Product nodes connected by the BOUGHT relationship, along with any node or relationship properties.
The Schema inference section shows a more detailed example of this process, while the Data type mapping section shows how data types are mapped between Neo4j and Spark.
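The :Customer, BOUGHT, :Product example above could be read into a single DataFrame along the following lines. This is a minimal sketch, assuming the connector's `relationship`, `relationship.source.labels`, and `relationship.target.labels` read options and the `org.neo4j.spark.DataSource` format name. The URL and the helper names `relationship_read_options` and `read_relationship` are hypothetical, and authentication options are omitted for brevity.

```python
# Format name under which the connector is assumed to register with Spark.
NEO4J_FORMAT = "org.neo4j.spark.DataSource"


def relationship_read_options(url, relationship, source_labels, target_labels):
    """Build the option map for reading one relationship type with its end nodes."""
    return {
        "url": url,
        "relationship": relationship,
        "relationship.source.labels": source_labels,
        "relationship.target.labels": target_labels,
    }


def read_relationship(spark, options):
    """Load the relationship and its source/target nodes into one DataFrame.

    `spark` is an existing SparkSession created with the connector available.
    """
    reader = spark.read.format(NEO4J_FORMAT)
    for key, value in options.items():
        reader = reader.option(key, value)
    return reader.load()


# Hypothetical usage (requires a running Neo4j instance and a SparkSession):
# options = relationship_read_options(
#     "neo4j://localhost:7687", "BOUGHT", ":Customer", ":Product"
# )
# purchases = read_relationship(spark, options)
# purchases.printSchema()  # inspect the columns inferred from the graph
```

Calling `printSchema()` on the result is a convenient way to see which columns and Spark types schema inference produced for the nodes and the relationship.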
Compatibility
Neo4j compatibility
The connector supports Neo4j 5.x and 4.4, whether run as a managed service in Neo4j Aura, as a single instance, or as a cluster. It supports both the Community and the Enterprise Edition.
License
The source code is provided under the terms of the Apache 2.0 license. You are free to download, modify, and redistribute the connector; however, Neo4j support applies only to official builds provided by Neo4j.
Support
For Neo4j Enterprise and Neo4j AuraDB customers, official releases of this connector are supported under the terms of your existing Neo4j support agreement. This support extends only to regular releases and excludes alpha, beta, and pre-releases. If you have any questions about the support policy, get in touch with Neo4j.
© 2024
License: Creative Commons 4.0