This section shows how to configure Neo4j servers so that they are topology/data center-aware. It describes the precise configuration needed to achieve a scalable multi-data center deployment.
This section describes the following:
Before doing anything else, we must enable the multi-data center functionality. This is described in Section C.2.1, “Licensing for multi-data center operations”.
|Licensing for multi-data center|
The multi-data center functionality is separately licensed and must be specifically enabled.
In order to optimize the use of our Causal Cluster servers according to our specific requirements, we sort them into Server Groups. Server Group membership can map to data centers, availability zones, or any other significant topological elements from the operator’s domain. Server Groups can also overlap.
Server Groups are defined as a key that maps onto a set of servers in a Causal Cluster.
Server Group membership is defined on each server using the
causal_clustering.server_groups parameter in neo4j.conf.
Each server in a Causal Cluster can belong to zero or more server groups.
The membership of a server group or groups can be set in neo4j.conf as in the following examples:
# Add the current instance to the groups `us` and `us-east` causal_clustering.server_groups=us,us-east
# Add the current instance into the group `london` causal_clustering.server_groups=london
# Add the current instance into the group `eu` causal_clustering.server_groups=eu
We must be aware that membership of each server group is explicit.
For example, a server in the
gb-london group is not automatically part of some
eu group unless that server is explicitly added to those groups.
That is, any (implied) relationship between groups is reified only when those groups are used as the basis for requesting
data from upstream systems.
Server Groups are not mandatory, but unless they are present, we cannot set up specific upstream transaction dependencies for servers. In the absence of any specified server groups, the cluster defaults to its most pessimistic fall-back behavior: each Read Replica will catch up from a random Core Server.
Strategy plugins are sets of rules that define how Read Replicas contact servers in the cluster in order to synchronize transaction logs. Neo4j comes with a set of pre-defined strategies, and also provides a Design Specific Language, DSL, to flexibly create user-defined strategies. Finally, Neo4j supports an API which advanced users may use to enhance upstream recommendations.
Once a strategy plugin resolves a satisfactory upstream server, it is used for pulling transactions to update the local Read Replica for a single synchronization. For subsequent updates, the procedure is repeated so that the most preferred available upstream server is always resolved.
Neo4j ships with the following pre-defined strategy plugins. These provide coarse-grained algorithms for choosing an upstream instance:
|Plugin name||Resulting behavior|
Connect to any Core Server selecting at random from those currently available.
Connect to any available Read Replica, but around 10% of the time connect to any random Core Server.
Connect at random to any available Read Replica in any of the server groups specified in the comma-separated list
Connect only to the current Raft leader of the Core Servers.
Connect at random to any available Read Replica in any of the server groups to which this server belongs.
Deprecated, please use
Pre-defined strategies are used by configuring the
Doing so allows us to specify an ordered preference of strategies to resolve an upstream provider of transaction data.
We provide a comma-separated list of strategy plugin names with preferred strategies earlier in that list.
The upstream strategy is chosen by asking each of the strategies in list-order whether they can provide an upstream server
from which transactions can be pulled.
Consider the following configuration example:
With this configuration the instance will first try to connect to any other instance in the group(s) specified in
Should we fail to find any live instances in those groups, then we will connect to a random Read Replica.
To ensure that downstream servers can still access live data in the event of upstream failures, the last resort of any instance
is always to contact a random Core Server.
This is equivalent to ending the
causal_clustering.upstream_selection_strategy configuration with
Neo4j Causal Clusters support a small DSL for the configuration of client-cluster load balancing. This is described in detail in the section called “Policy definitions” and the section called “Filters”. The same DSL is used to describe preferences for how an instance binds to another instance to request transaction updates.
The DSL is made available by selecting the
user-defined strategy as follows:
Once the user-defined strategy has been specified, we can add configuration to the
causal_clustering.user_defined_upstream_strategy setting based on the server groups that have been set for the cluster.
We will describe this functionality with two examples:
For illustrative purposes we propose four regions:
west and within each region we have a number of data centers such as
We configure our server groups so that each data center maps to its own server group.
Additionally we will assume that each data center fails independently from the others and that a region can act as a supergroup
of its constituent data centers.
So an instance in the
north region might have configuration like
causal_clustering.server_groups=north2,north which puts it in two groups that match to our physical topology as shown in the diagram below.
Once we have our server groups, our next task is to define some upstream selection rules based on them.
For our design purposes, let’s say that any instance in one of the
north region data centers prefers to catchup within the data center if it can, but will resort to any northern instance otherwise.
To configure that behavior we add:
causal_clustering.user_defined_upstream_strategy=groups(north2); groups(north); halt()
The configuration is in precedence order from left to right.
groups() operator yields a server group from which to catch up.
In this case only if there are no servers in the
north2 server group will we proceed to the
groups(north) rule which yields any server in the
north server group.
Finally, if we cannot resolve any servers in any of the previous groups, then we will stop the rule chain via
Note that the use of
halt() will end the rule chain explicitly.
If we don’t use
halt() at the end of the rule chain, then the
all() rule is implicitly added.
all() is expansive: it offers up all servers and so increases the likelihood of finding an available upstream server.
all() is indiscriminate and the servers it offers are not guaranteed to be topologically or geographically local, potentially increasing
the latency of synchronization.
The example above shows a simple hierarchy of preferences. But we can be more sophisticated if we so choose. For example we can place conditions on the server groups from which we catch up.
In this example we wish to roughly qualify cluster health before choosing from where to catch up.
For this we use the
min() filter as follows:
causal_clustering.user_defined_upstream_strategy=groups(north2)->min(3), groups(north)->min(3); all();
groups(north2)->min(3) states that we want to catch up from the
north2 server group if it has three available machines, which we here take as an indicator of good health.
north2 can’t meet that requirement (is not healthy enough) then we try to catch up from any server across the
north region provided there are at least three of them available as per
Finally, if we cannot catch up from a sufficiently healthy
north region, then we’ll (explicitly) fall back to the whole cluster with
min() filter is a simple but reasonable indicator of server group health.
Neo4j supports an API which advanced users may use to enhance upstream recommendations in arbitrary ways: load, subnet, machine
size, or anything else accessible from the JVM.
In such cases we are invited to build our own implementations of
org.neo4j.causalclustering.readreplica.UpstreamDatabaseSelectionStrategy to suit our own needs, and register them with the strategy selection pipeline just like the pre-packaged plugins.
We have to override the
org.neo4j.causalclustering.readreplica.UpstreamDatabaseSelectionStrategy#upstreamDatabase() method in our code.
Overriding that class gives us access to the following items:
This is a directory service which provides access to the addresses of all servers and server groups in the cluster.
This provides the configuration from neo4j.conf for the local instance. Configuration for our own plugin can reside here.
This provides the unique cluster
Once our code is written and tested, we have to prepare it for deployment.
UpstreamDatabaseSelectionStrategy plugins are loaded via the Java Service Loader.
This means when we package our code into a jar file, we’ll have to create a file META-INF.services/org.neo4j.causalclustering.readreplica.UpstreamDatabaseSelectionStrategy in which we write the fully qualified class name(s) of the plugins, e.g.
To deploy this jar into the Neo4j server we copy it into the plugins directory and restart the instance.
In Section 184.108.40.206, “Bias cluster leadership with follower-only instances” we saw how we can bias the leadership credentials of instances in a cluster. Generally speaking this is a rare occurence in a homogenous cluster inside a single data center.
In a multi-DC scenario, while it remains a rare occurence, it does allow operators to bias which data centers are used to
host Raft leaders (and thus where writes are directed).
causal_clustering.refuse_to_be_leader=true in those data centers where we do not want leaders to materialize.
In doing so we implicitly prefer the instances where we have not applied that setting.
This may be useful when planning for highly distributed multi-data center deployments. However this must be very carefully considered because in failure scenarios it limits the availability of the cluster. It is advisable to engage Neo4j professional services to help design a suitably resilient topology.