Discovery

Overview

To form a cluster or connect to a running one, a Core Server or a Read Replica needs to know the addresses of some of the Core Servers. It uses these addresses to bind to the Core Servers, run the discovery protocol, and obtain full information about the cluster. The best way to supply the addresses depends on the configuration in each specific case.

If the addresses of the other cluster members are known upfront, they can be listed explicitly. This is convenient, but has limitations:

  • If Core members are replaced and the new members have different addresses, the list will become outdated. An outdated list can be avoided by ensuring that the new members can be reached via the same address as the old members, but this is not always practical.

  • Under some circumstances the addresses are unknown when configuring the cluster. This can be the case, for example, when using container orchestration to deploy a Causal Cluster.

For cases where it is not practical or possible to list the addresses of the cluster members explicitly, additional DNS-based mechanisms are provided.

The discovery configuration is used only for initial discovery. Once running, a cluster continuously exchanges information about changes to the topology. The behavior of the initial discovery is determined by the parameters causal_clustering.discovery_type and causal_clustering.initial_discovery_members, and is described in the following sections.

Discovery using a list of server addresses

If the addresses of the other cluster members are known upfront, they can be listed explicitly. In this case, keep the default causal_clustering.discovery_type=LIST and hard-code the addresses in causal_clustering.initial_discovery_members in the configuration of each machine. This alternative is illustrated in Configure a Core-only cluster.
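As a minimal sketch, a three-member Core cluster that uses explicit addresses might carry lines like the following in the Neo4j configuration file on each machine. The host names and the port 5000 are hypothetical placeholders, not values from a real deployment.

```properties
# Explicit list discovery. LIST is the default, shown here only for clarity.
causal_clustering.discovery_type=LIST
# Hypothetical addresses: replace with the discovery address of each Core Server.
causal_clustering.initial_discovery_members=core01.example.com:5000,core02.example.com:5000,core03.example.com:5000
```

If a Core member is later replaced by one with a different address, this line would need to be updated on every machine, which is the limitation noted in the overview.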

Discovery using DNS with multiple records

When using initial discovery with DNS, a DNS record lookup is performed when an instance starts up. Once an instance has joined a cluster, further membership changes are communicated amongst Core members as part of the discovery service.

The following DNS-based mechanisms can be used to get the addresses of Core Cluster members for discovery:

causal_clustering.discovery_type=DNS

With this configuration, the initial discovery members will be resolved from DNS A records to find the IP addresses to contact. The value of causal_clustering.initial_discovery_members should be set to a single domain name and the port of the discovery service. For example: causal_clustering.initial_discovery_members=cluster01.example.com:5000. The domain name should return an A record for every Core member when a DNS lookup is performed. Each A record returned by DNS should contain the IP address of the Core Server. The configured Core Server will use all the IP addresses from the A records to join or form a cluster.

The discovery port must be the same on all Cores when using this configuration. If this is not possible, consider using the discovery type SRV instead.
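A sketch of the DNS variant, assuming three Core members that all use the same discovery port. The IP addresses mentioned in the comments are hypothetical and stand for whatever A records the DNS server actually returns.

```properties
# Assumed DNS state (hypothetical): cluster01.example.com returns one A record
# per Core member, for example 10.0.0.1, 10.0.0.2 and 10.0.0.3.
causal_clustering.discovery_type=DNS
# A single domain name plus the discovery port shared by all Cores.
causal_clustering.initial_discovery_members=cluster01.example.com:5000
```

Under these assumptions, the instance would try to contact 10.0.0.1:5000, 10.0.0.2:5000, and 10.0.0.3:5000 when it starts up.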

causal_clustering.discovery_type=SRV

With this configuration, the initial discovery members will be resolved from DNS SRV records to find the IP addresses/hostnames and discovery service ports to contact. The value of causal_clustering.initial_discovery_members should be set to a single domain name and the port set to 0. For example: causal_clustering.initial_discovery_members=cluster01.example.com:0. The domain name should return a single SRV record when a DNS lookup is performed. The SRV record returned by DNS should contain the IP address or hostname, and the discovery port, for the Core Servers to be discovered. The configured Core Server will use all the addresses from the SRV record to join or form a cluster.
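A sketch of the SRV variant, which may suit deployments where Core members cannot all share one discovery port. The domain name is the same illustrative one used above; the hosts and ports themselves would come from the SRV lookup, so none are written into the configuration.

```properties
# The SRV lookup supplies both the host and the discovery port,
# so the port in this setting is a placeholder and is set to 0.
causal_clustering.discovery_type=SRV
causal_clustering.initial_discovery_members=cluster01.example.com:0
```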

Discovery in Kubernetes

A special case is when a Causal Cluster is running in Kubernetes and each Core Server is running as a Kubernetes service. Then, the addresses of the Core Cluster members can be obtained using the List Service API, as described in the Kubernetes API documentation.

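The following settings are used to configure discovery for this scenario:

  • Set causal_clustering.discovery_type=K8S.

  • Set causal_clustering.kubernetes.label_selector to the label selector that matches the Causal Cluster services.

  • Set causal_clustering.kubernetes.service_port_name to the name of the service port used in the Kubernetes service definition for the Core's discovery port.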

With this configuration, causal_clustering.initial_discovery_members is not used and any value assigned to it will be ignored.

  • The pod running Neo4j must use a service account which has permission to list services. For further information, see the Kubernetes documentation on RBAC authorization or ABAC authorization.

  • The configured causal_clustering.discovery_advertised_address must exactly match the Kubernetes-internal DNS name, which will be of the form <service-name>.<namespace>.svc.cluster.local.

As with DNS-based methods, the Kubernetes record lookup is performed only at startup.
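A hedged sketch of what a Kubernetes-based configuration might look like. The setting names causal_clustering.kubernetes.label_selector and causal_clustering.kubernetes.service_port_name are given here as recalled from the Neo4j configuration reference and should be verified against the version in use. The label, port name, service name, namespace, and port values are all hypothetical.

```properties
# Discover Core members through the Kubernetes List Service API.
causal_clustering.discovery_type=K8S
# Assumed setting names; the label and port name values are hypothetical.
causal_clustering.kubernetes.label_selector=app=neo4j-core
causal_clustering.kubernetes.service_port_name=discovery
# Must exactly match the Kubernetes-internal DNS name of this Core's service:
# <service-name>.<namespace>.svc.cluster.local (hypothetical service, namespace, and port).
causal_clustering.discovery_advertised_address=neo4j-core-0.default.svc.cluster.local:5000
```

Note that causal_clustering.initial_discovery_members is deliberately absent from this sketch, since any value assigned to it is ignored with this discovery type.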