This section discusses considerations when designing the backup strategy for a Neo4j Causal Cluster.
Backups of a Neo4j Causal Cluster can be configured in a variety of ways with regard to physical configuration and SSL implementation. This section discusses some considerations to take into account before deciding which backup configuration to use.
The table below lists the configuration parameters relevant to backups:
|Parameter name||Default value||Description|
Enable support for running online backups.
Listening server for online backups.
The SSL policy used on the backup port.
Encrypted backups are available with Causal Clustering.
Both the server running the backup and the backup target must be configured with the same SSL policy. This can be the same policy as that used for encrypting the regular cluster traffic (see Section 5.5, “Intra-cluster encryption”), or a separate one. The policy to be used for encrypting backup traffic must be assigned on both servers.
For examples of how to configure encrypted backups, see Section 7.3.6, “Backup scenarios and examples”.
It is generally recommended to select Read Replicas to act as backup providers, since they are more numerous than Core Servers in typical cluster deployments. Furthermore, the possibility of performance issues on a Read Replica, caused by a large backup, will not affect the performance or redundancy of the Core Cluster.
However, since Read Replicas are asynchronously replicated from Core Servers, it is possible for them to fall behind the Core Cluster in applying transactions. It may even be possible for a Read Replica to become orphaned from a Core Server such that its contents are quite stale.
We can use transaction IDs to avoid taking a backup from a Read Replica that has lagged too far behind the Core Cluster. Since transaction IDs are strictly increasing integer values, we can check the last transaction ID processed on the Read Replica and verify that it is sufficiently close to the latest transaction ID processed by the Core Server. If so, we can safely proceed to back up from our Read Replica, confident that it is up-to-date with respect to the Core Servers.
The latest transaction ID can be found by exposing Neo4j metrics or via Neo4j Browser.
To view the latest processed transaction ID (and other metrics) in Neo4j Browser, type :sysinfo at the prompt.
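The transaction ID comparison described above can be sketched as a small check. This is a minimal illustration, not part of Neo4j: the function name `is_fresh_enough` and the `max_lag` threshold are hypothetical, and the two transaction IDs are assumed to have been read beforehand (for example from the metrics or from :sysinfo).

```python
def is_fresh_enough(replica_tx_id: int, core_tx_id: int, max_lag: int = 100) -> bool:
    """Return True if the Read Replica is at most max_lag transactions behind the Core.

    All names and the default threshold are illustrative assumptions;
    choose max_lag to suit your own workload.
    """
    if replica_tx_id < 0 or core_tx_id < 0 or max_lag < 0:
        raise ValueError("transaction IDs and max_lag must be non-negative")
    # The two IDs are sampled at slightly different times, so the replica may
    # briefly appear to be ahead of the Core; treat that case as zero lag.
    lag = max(core_tx_id - replica_tx_id, 0)
    return lag <= max_lag


if __name__ == "__main__":
    # Hypothetical values: the replica is 5 transactions behind the Core.
    if is_fresh_enough(replica_tx_id=995, core_tx_id=1000, max_lag=10):
        print("Read Replica is close enough to the Core; proceed with backup")
    else:
        print("Read Replica has lagged too far; pick another backup provider")
```

What counts as "sufficiently close" depends on the transaction rate of the cluster, so the threshold is best treated as a tunable value rather than a fixed constant.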
In a Core-only cluster, we do not have access to Read Replicas for scaling out workload. Instead, we pick one of the Core Servers to run backups based on factors such as its physical proximity, bandwidth, performance, and liveness.
The cluster will function as normal even while large backups are taking place. However, the additional I/O burden placed on the Core Server being used as a backup server may impact its performance.
A very conservative view would be to treat the backup server as an unavailable instance, assuming its performance will be lower than the other instances in the cluster. In such cases, it is recommended that there is sufficient redundancy in the cluster such that one slower server does not reduce the capacity to mask faults.
We can factor this conservative strategy into our cluster planning. The equation M = 2F + 1 demonstrates the relationship between M, the number of members in the cluster, and F, the number of faults the cluster is required to tolerate. To tolerate the possibility of one slower machine in the cluster during backup, we increase F. Thus, if we originally envisaged a cluster of three Core Servers to tolerate one fault, we could increase that to five to maintain a plainly safe level of redundancy.
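The sizing arithmetic can be worked through in a few lines. The sketch below only restates M = 2F + 1; the function names are hypothetical and chosen for illustration.

```python
def members_required(faults: int) -> int:
    """Cluster size M needed to tolerate `faults` failures, from M = 2F + 1."""
    if faults < 0:
        raise ValueError("faults must be non-negative")
    return 2 * faults + 1


def faults_tolerated(members: int) -> int:
    """Number of faults F that a cluster of `members` Core Servers can mask."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return (members - 1) // 2


def members_with_backup_headroom(faults: int) -> int:
    """Cluster size when the backup server is conservatively counted as one extra fault."""
    return members_required(faults + 1)


if __name__ == "__main__":
    print(members_required(1))              # 3 Core Servers tolerate one fault
    print(members_with_backup_headroom(1))  # 5 Core Servers once F is increased by one
```

Note that an even-sized cluster tolerates no more faults than the next smaller odd size, which is why the example moves from three members to five rather than four.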
As described in Section 7.1.4, “Network protocols for backups”, the catchup protocol is used both for keeping Read Replicas up-to-date within a Causal Cluster, and for backups.
It is therefore possible to run backups by defining a separate dbms.backup.address for backup traffic, or simply by "listening to" the same messages as Read Replicas do for keeping in sync with the Core Cluster.
To perform backups on a Causal Cluster, you will need to combine some settings and arguments.
The table below illustrates the available options when using the
|Backup target address on database server||Corresponding SSL policy setting on database server||Corresponding SSL policy setting on backup client||Default port|
Before performing a backup of a Causal Cluster, you need to consider which port you will back up from. If you plan to back up from the transaction port, the backup policy of your backup client must match the cluster policy of the server. If you plan to back up from the backup port, the backup policy of the backup client must match the backup policy of the server.
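The rule above can be expressed as a small lookup. This is a sketch only: the enum, the function names, and the policy names in the usage example are hypothetical, and "match" is assumed here to mean that both sides use the same named policy.

```python
from enum import Enum


class BackupPort(Enum):
    """Which server port the backup client connects to (illustrative names)."""
    TRANSACTION = "transaction"
    BACKUP = "backup"


def server_policy_to_match(port: BackupPort) -> str:
    """Return which server-side policy the client's backup policy must match."""
    if port is BackupPort.TRANSACTION:
        return "cluster"
    if port is BackupPort.BACKUP:
        return "backup"
    raise ValueError(f"unknown port: {port!r}")


def client_policy_matches(port: BackupPort, client_backup_policy: str,
                          server_policies: dict) -> bool:
    """Check the client's backup policy against the relevant server policy.

    `server_policies` maps "cluster" and "backup" to the policy names
    configured on the server (hypothetical structure for this sketch).
    """
    return server_policies[server_policy_to_match(port)] == client_backup_policy


if __name__ == "__main__":
    # Hypothetical policy names, for illustration only.
    server = {"cluster": "cluster_policy", "backup": "backup_policy"}
    print(client_policy_matches(BackupPort.TRANSACTION, "cluster_policy", server))
    print(client_policy_matches(BackupPort.BACKUP, "cluster_policy", server))
```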
The images below illustrate the settings and arguments to be used when setting up backups for your cluster using either