Quick Java Example

To use the Neo4j Connector for Apache Spark in a Java application, add the Spark Packages repository and the connector dependency to your build. See this section for more information.

Code

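Suppose you have a Neo4j instance with the movie graph running on localhost. The following application reads every node labeled Person into a Spark Dataset and prints the first rows.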

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkApp {

    public static void main(String[] args) {
        // Start (or reuse) a Spark session that runs locally.
        SparkSession spark = SparkSession
                .builder()
                .appName("Spark SQL Example")
                .config("spark.master", "local")
                .getOrCreate();

        // Read all nodes labeled Person from Neo4j over Bolt.
        Dataset<Row> ds = spark.read().format("org.neo4j.spark.DataSource")
                .option("url", "bolt://localhost:7687")
                .option("authentication.basic.username", "neo4j")
                .option("authentication.basic.password", "password")
                .option("labels", "Person")
                .load();

        // Print the first 20 rows to the console.
        ds.show();
    }
}

With the movie graph loaded, this code produces output similar to the following (the node ids depend on your database):

+----+--------+------------------+----+
|<id>|<labels>|              name|born|
+----+--------+------------------+----+
|   1|[Person]|      Keanu Reeves|1964|
|   2|[Person]|  Carrie-Anne Moss|1967|
|   3|[Person]|Laurence Fishburne|1961|
|   4|[Person]|      Hugo Weaving|1960|
|   5|[Person]|    Andy Wachowski|1967|
|   6|[Person]|    Lana Wachowski|1965|
|   7|[Person]|       Joel Silver|1952|
|   8|[Person]|       Emil Eifrem|1978|
|  12|[Person]|   Charlize Theron|1975|
|  13|[Person]|         Al Pacino|1940|
|  14|[Person]|   Taylor Hackford|1944|
|  16|[Person]|        Tom Cruise|1962|
|  17|[Person]|    Jack Nicholson|1937|
|  18|[Person]|        Demi Moore|1962|
|  19|[Person]|       Kevin Bacon|1958|
|  20|[Person]| Kiefer Sutherland|1966|
|  21|[Person]|         Noah Wyle|1971|
|  22|[Person]|  Cuba Gooding Jr.|1968|
|  23|[Person]|      Kevin Pollak|1957|
|  24|[Person]|        J.T. Walsh|1943|
+----+--------+------------------+----+
only showing top 20 rows
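
If the same connection settings are reused for several reads, it can help to collect them in one place rather than repeating the chain of option calls. The sketch below is one possible way to do that using only the Java standard library. The class Neo4jReadOptions and its method forLabels are hypothetical names introduced for illustration and are not part of the connector; the option keys are the ones used in the example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Neo4j Connector for Apache Spark.
// It gathers the option keys from the example above into a single map.
final class Neo4jReadOptions {

    private Neo4jReadOptions() {
        // Utility class, not meant to be instantiated.
    }

    // Builds the options for reading nodes with the given label(s) using basic authentication.
    static Map<String, String> forLabels(String url, String username, String password, String labels) {
        // LinkedHashMap keeps insertion order, which makes the options easier to log or inspect.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("url", url);
        options.put("authentication.basic.username", username);
        options.put("authentication.basic.password", password);
        options.put("labels", labels);
        return options;
    }
}
```

The resulting map can then be passed to the reader in a single call, for example spark.read().format("org.neo4j.spark.DataSource").options(Neo4jReadOptions.forLabels("bolt://localhost:7687", "neo4j", "password", "Person")).load(), which should behave the same as the individual option calls shown earlier.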