When to use Kafka Connect Neo4j Connector vs. Neo4j Streams as a Plugin

This section covers how to decide whether to run as a Kafka Connect worker, or as a Neo4j Plugin.

Kafka Connect

Pros

  • Processing is outside of Neo4j so that memory & CPU impact doesn’t impact Neo4j. You don’t need to size the database with Kafka utilization in mind.

  • Much easier for Kafka pros to manage; they benefit from the Confluent ecosystem, such as connecting the REST API to manipulate connectors, the control center to administer & monitor them.

  • By restarting the worker, you can restart your sink/source strategy without having downtime for Neo4j.

  • Upgrade Neo4j-Streams without restarting the cluster

  • Strictly an external bolt client, so better overall security management of plugin actions.

Cons

  • If you’re using Confluent Cloud, you can’t host the connector in the cloud (yet). So this requires a 3rd piece of architecture: Confluent Cloud, Neo4j, and the Connect Worker (usually a separate VM)

  • Possibly worse throughput due to bolt latency & overhead, and separate network hop.

Neo4j-Streams Plugin

Pros

  • Much easier for Neo4j pros to manage

  • You can produce records back out to Kafka

  • You can use Neo4j procedures so that "custom produce" (see later section) becomes an option.

  • Possibly better throughput, because you don’t have bolt latency / overhead

Cons

  • Memory & CPU consumption on your Neo4j Server

  • Need to track config to be identical across all members in the cluster

  • Lesser ability to manage the plugin because it is running inside of the database and not under a particular user account.