Chapter 1. Quick Start

Get started fast for common scenarios, using neo4j-streams as a plugin.

1.1. Install the Plugin

Download the latest neo4j-streams release jar, copy it into the plugins directory of your Neo4j installation, and restart Neo4j so the plugin is loaded.

1.2. Configure Kafka Connection

If you are running Kafka locally or against a standalone broker, configure neo4j.conf to point to that server:

neo4j.conf. 

kafka.zookeeper.connect=localhost:2181
kafka.bootstrap.servers=localhost:9092
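
After restarting Neo4j, you can sanity-check the connection from the Neo4j Browser using the plugin's streams.publish procedure (enabled by default); my-test-topic here is just an illustrative topic name:

CALL streams.publish('my-test-topic', 'Hello, Kafka!')

If the call completes without error, the plugin can reach your brokers.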

If you are using Confluent Cloud (managed Kafka), you can connect as follows, filling in your own CONFLUENT_CLOUD_ENDPOINT, CONFLUENT_API_KEY, and CONFLUENT_API_SECRET. Any property prefixed with kafka. is passed through to the underlying Kafka client, which is how the SASL/SSL settings below reach the driver:

neo4j.conf. 

kafka.bootstrap.servers=<<CONFLUENT_CLOUD_ENDPOINT_HERE>>
kafka.sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<<CONFLUENT_API_KEY_HERE>>" password="<<CONFLUENT_API_SECRET_HERE>>";
kafka.ssl.endpoint.identification.algorithm=https
kafka.security.protocol=SASL_SSL
kafka.sasl.mechanism=PLAIN
kafka.request.timeout.ms=20000
kafka.retry.backoff.ms=500

1.3. Decide: Consumer, Producer, or Both

Follow one or both of the following subsections, depending on whether you want Neo4j to consume from Kafka, produce to Kafka, or both:

1.3.1. Consumer

Take data from Kafka and store it in Neo4j (Neo4j as a data sink) by adding configuration such as:

neo4j.conf. 

streams.sink.enabled=true
streams.sink.topic.cypher.my-ingest-topic=MERGE (n:Label {id: event.id}) ON CREATE SET n += event.properties

This will process every message that arrives on my-ingest-topic with the given Cypher statement. When that statement executes, the event variable is bound to the received message, so this sample creates a (:Label) node in the graph with the given id, copying all of the properties from the source message.
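
For example, suppose the following JSON message arrives on my-ingest-topic (a hypothetical payload shaped to match the statement above):

{"id": "42", "properties": {"name": "Alice", "age": 32}}

The sink executes the statement with event bound to that object, which for this message is equivalent to:

MERGE (n:Label {id: "42"}) ON CREATE SET n += {name: "Alice", age: 32}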

For full details on what you can do here, see the Consumer Section of the documentation.

1.3.2. Producer

Produce data from Neo4j and send it to a Kafka topic (Neo4j as a source) by adding configuration such as:

neo4j.conf. 

streams.source.topic.nodes.my-nodes-topic=Person{*}
streams.source.topic.relationships.my-rels-topic=KNOWS{*}
streams.source.enabled=true
streams.source.schema.polling.interval=10000

This will produce every node labeled (:Person) onto the topic my-nodes-topic and every relationship of type -[:KNOWS]-> onto the topic my-rels-topic. Schema changes are polled every 10,000 ms, which determines how quickly the plugin picks up new indexes and schema changes.
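
Each change is published as a JSON event. As an abridged, illustrative sketch (see the Producer Section for the authoritative format), a node creation on my-nodes-topic looks roughly like this:

{
  "meta": {"timestamp": 1532597182604, "username": "neo4j", "operation": "created"},
  "payload": {
    "id": "1004",
    "type": "node",
    "before": null,
    "after": {"labels": ["Person"], "properties": {"name": "Anne", "surname": "Kretchmar"}}
  },
  "schema": {"properties": {"name": "String", "surname": "String"}, "constraints": []}
}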

The expressions Person{*} and KNOWS{*} are patterns. You can find documentation on how to change these in the Patterns section.
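
As an illustrative sketch (the Patterns section has the authoritative syntax), {*} selects all properties, a comma-separated list selects only the named properties, and a leading minus excludes a property; the topic and property names below are hypothetical:

neo4j.conf. 

streams.source.topic.nodes.people-topic=Person{name, surname}
streams.source.topic.relationships.knows-topic=KNOWS{-internalNote}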

For full details on what you can do here, see the Producer Section of the documentation.