8.4. Monitoring a Causal Cluster

This section covers additional facilities available for monitoring a Neo4j Causal Cluster.

In addition to the metrics described in previous sections, a Neo4j Causal Cluster introduces infrastructure that operators will want to monitor, along with new ways of observing the state of the overall cluster. Procedures are available for inspecting the cluster's current condition and topology, and HTTP endpoints are available for checking health and status.

The features described in this section are available when running in Causal Clustering mode only.

8.4.1. Procedures for monitoring a Causal Cluster

This section covers procedures for monitoring a Neo4j Causal Cluster.

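This section describes the following:

  • Find out the role of a cluster member
  • Gain an overview of the instances in the cluster
  • Get routing recommendations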

8.4.1.1. Find out the role of a cluster member

The procedure dbms.cluster.role() can be called on every instance in a Causal Cluster to return the role of the instance.

Syntax:

CALL dbms.cluster.role()

Returns:

Name   Type     Description
role   String   The role of the current instance, which can be LEADER, FOLLOWER, or READ_REPLICA.

Considerations:

  • While this procedure is useful in and of itself, it also serves as a basis for more powerful monitoring procedures.
Example 8.6. Check the role of this instance

The following example shows how to find out the role of the current instance, which in this case is 'FOLLOWER'.

CALL dbms.cluster.role()
role
FOLLOWER

8.4.1.2. Gain an overview of the instances in the cluster

The procedure dbms.cluster.overview() provides an overview of cluster topology by returning details on all the instances in the cluster.

Syntax:

CALL dbms.cluster.overview()

Returns:

Name        Type           Description
id          String         The ID of the instance.
addresses   List<String>   A list of all the addresses for the instance.
role        String         The role of the instance, which can be LEADER, FOLLOWER, or READ_REPLICA.

Considerations:

  • This procedure can only be called from Core instances, since they are the only ones that have the full view of the cluster.
Example 8.7. Get an overview of the cluster

The following example shows how to explore the cluster topology.

CALL dbms.cluster.overview()
id                                     addresses                                                    role
08eb9305-53b9-4394-9237-0f0d63bb05d5   [bolt://neo20:7687, http://neo20:7474, https://neo20:7473]   LEADER
cb0c729d-233c-452f-8f06-f2553e08f149   [bolt://neo21:7687, http://neo21:7474, https://neo21:7473]   FOLLOWER
ded9eed2-dd3a-4574-bc08-6a569f91ec5c   [bolt://neo22:7687, http://neo22:7474, https://neo22:7473]   FOLLOWER
00000000-0000-0000-0000-000000000000   [bolt://neo34:7687, http://neo34:7474, https://neo34:7473]   READ_REPLICA
00000000-0000-0000-0000-000000000000   [bolt://neo28:7687, http://neo28:7474, https://neo28:7473]   READ_REPLICA
00000000-0000-0000-0000-000000000000   [bolt://neo31:7687, http://neo31:7474, https://neo31:7473]   READ_REPLICA
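
For monitoring scripts, the overview rows can be grouped by role and checked for basic sanity. The following is a minimal Python sketch, not part of Neo4j: the rows are copied by hand from the example output above, the function names are hypothetical, and the "exactly one LEADER" check is an assumption about what an operator may want to flag.

```python
from collections import defaultdict

# Rows copied from the example output above: (id, addresses, role).
OVERVIEW_ROWS = [
    ("08eb9305-53b9-4394-9237-0f0d63bb05d5",
     ["bolt://neo20:7687", "http://neo20:7474", "https://neo20:7473"], "LEADER"),
    ("cb0c729d-233c-452f-8f06-f2553e08f149",
     ["bolt://neo21:7687", "http://neo21:7474", "https://neo21:7473"], "FOLLOWER"),
    ("ded9eed2-dd3a-4574-bc08-6a569f91ec5c",
     ["bolt://neo22:7687", "http://neo22:7474", "https://neo22:7473"], "FOLLOWER"),
    ("00000000-0000-0000-0000-000000000000",
     ["bolt://neo34:7687", "http://neo34:7474", "https://neo34:7473"], "READ_REPLICA"),
    ("00000000-0000-0000-0000-000000000000",
     ["bolt://neo28:7687", "http://neo28:7474", "https://neo28:7473"], "READ_REPLICA"),
    ("00000000-0000-0000-0000-000000000000",
     ["bolt://neo31:7687", "http://neo31:7474", "https://neo31:7473"], "READ_REPLICA"),
]


def bolt_addresses_by_role(rows):
    """Group the bolt:// address of every instance by its reported role."""
    grouped = defaultdict(list)
    for _instance_id, addresses, role in rows:
        for address in addresses:
            if address.startswith("bolt://"):
                grouped[role].append(address)
    return dict(grouped)


def check_topology(rows):
    """Return a list of problems; an empty list means nothing was flagged."""
    leaders = bolt_addresses_by_role(rows).get("LEADER", [])
    problems = []
    # Assumption: the operator wants an alert unless exactly one LEADER is visible.
    if len(leaders) != 1:
        problems.append(f"expected exactly one LEADER, found {len(leaders)}")
    return problems
```

In practice the rows would come from running CALL dbms.cluster.overview() against a Core instance rather than from a hard-coded list.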

8.4.1.3. Get routing recommendations

From an application's point of view, the role that a member plays in the cluster is not of interest. Instead, the application needs to know which instances can provide the service it requires. The procedure dbms.cluster.routing.getServers() provides this information.

Syntax:

CALL dbms.cluster.routing.getServers()

Example 8.8. Get routing recommendations

The following example shows how to discover which instances in the cluster can provide which services.

CALL dbms.cluster.routing.getServers()

The procedure returns a mapping from each service (READ, WRITE, and ROUTE) to the addresses of the instances that provide that service. It also returns a Time To Live (TTL) for the information.

The result is not primarily intended for human consumption. Expanded, it looks like this:

ttl: 300,
server: [
{
    addresses: [neo20:7687],
    role: WRITE
}, {
    addresses: [neo21:7687, neo22:7687, neo34:7687, neo28:7687, neo31:7687],
    role: READ
}, {
    addresses: [neo20:7687, neo21:7687, neo22:7687],
    role: ROUTE
}
]
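
To illustrate how a client might consume this result, here is a minimal Python sketch that looks up the addresses for a service and rotates through them. It is an assumption-laden illustration, not Neo4j code: the dictionary mirrors the expanded output above (including its key names), and addresses_for and round_robin are hypothetical helpers.

```python
import itertools

# Hand-written copy of the expanded result shown above.
ROUTING_TABLE = {
    "ttl": 300,
    "server": [
        {"addresses": ["neo20:7687"], "role": "WRITE"},
        {"addresses": ["neo21:7687", "neo22:7687", "neo34:7687",
                       "neo28:7687", "neo31:7687"], "role": "READ"},
        {"addresses": ["neo20:7687", "neo21:7687", "neo22:7687"], "role": "ROUTE"},
    ],
}


def addresses_for(table, service):
    """Return the addresses listed for a service (READ, WRITE, or ROUTE)."""
    for entry in table["server"]:
        if entry["role"] == service:
            return list(entry["addresses"])
    return []


def round_robin(table, service):
    """Return an endless iterator that cycles through a service's addresses."""
    addresses = addresses_for(table, service)
    if not addresses:
        raise LookupError(f"no instance currently provides {service}")
    return itertools.cycle(addresses)
```

A caller would be expected to discard the table and call the procedure again once the TTL has elapsed, since the set of instances providing each service can change.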

8.4.2. Endpoints for status information

8.4.2.1. Introduction

A Causal Cluster exposes HTTP endpoints that can be used to monitor the health of the cluster. This section describes these endpoints and explains their semantics.

8.4.2.2. The endpoints

Core Servers provide three status endpoints:

  • /db/manage/server/core/writable
  • /db/manage/server/core/read-only
  • /db/manage/server/core/available

The /writable/ endpoint can be used to direct write traffic to specific instances. The /read-only/ endpoint can be used to direct read traffic to specific instances. The /available/ endpoint exists for the general case of directing arbitrary request types to instances that are available for processing read transactions.

Read Replicas provide one status endpoint:

  • /db/manage/server/read-replica/available

The /available/ endpoint exists for the general case of directing arbitrary request types to instances that are available for processing read transactions.

To use the endpoints, perform an HTTP GET request. The responses are as follows:

Table 8.12. Core HTTP endpoint responses
Endpoint                           Instance State   Returned Code   Body text
/db/manage/server/core/writable    Leader           200 OK          true
/db/manage/server/core/writable    Follower         404 Not Found   false
/db/manage/server/core/read-only   Leader           404 Not Found   false
/db/manage/server/core/read-only   Follower         200 OK          true
/db/manage/server/core/available   Leader           200 OK          true
/db/manage/server/core/available   Follower         200 OK          true

Table 8.13. Read Replica HTTP endpoint responses
Endpoint                                   Returned Code   Body text
/db/manage/server/read-replica/available   200 OK          true
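
The response tables above can be turned into a simple health probe. The following Python sketch uses only the standard library. The helper names (interpret, probe, core_role) are hypothetical, and treating an unreachable instance as "not available" is an assumption rather than documented behavior.

```python
import urllib.error
import urllib.request


def interpret(status_code, body):
    """Return True only for the '200 OK' / 'true' combination in the tables."""
    return status_code == 200 and body.strip() == "true"


def probe(base_url, path, timeout=2.0):
    """Perform an HTTP GET on base_url + path and interpret the response."""
    try:
        with urllib.request.urlopen(base_url + path, timeout=timeout) as response:
            return interpret(response.status, response.read().decode("utf-8"))
    except urllib.error.HTTPError as error:
        # urllib raises on 404 Not Found; the response is still readable.
        return interpret(error.code, error.read().decode("utf-8"))
    except (urllib.error.URLError, OSError):
        # Assumption: an instance that cannot be reached counts as unavailable.
        return False


def core_role(base_url):
    """Infer LEADER or FOLLOWER for a Core Server, or None if neither answers."""
    if probe(base_url, "/db/manage/server/core/writable"):
        return "LEADER"
    if probe(base_url, "/db/manage/server/core/read-only"):
        return "FOLLOWER"
    return None
```

For example, core_role("http://localhost:7474") would be expected to return "LEADER" on the instance used in the curl example in this section, assuming the endpoints do not require authentication.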

8.4.2.3. Examples

From the command line, a common way to query these endpoints is to use curl. With no arguments, curl performs an HTTP GET on the given URI and prints the body text, if any. To also see the response code, add the -v flag for verbose output. Here are some examples:

  • Requesting the writable endpoint on a Core Server that is currently the elected leader, with verbose output:
#> curl -v localhost:7474/db/manage/server/core/writable
* About to connect() to localhost port 7474 (#0)
*   Trying ::1...
* connected
* Connected to localhost (::1) port 7474 (#0)
> GET /db/manage/server/core/writable HTTP/1.1
> User-Agent: curl/7.24.0 (x86_64-apple-darwin12.0) libcurl/7.24.0 OpenSSL/0.9.8r zlib/1.2.5
> Host: localhost:7474
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/plain
< Access-Control-Allow-Origin: *
< Transfer-Encoding: chunked
< Server: Jetty(6.1.25)
<
* Connection #0 to host localhost left intact
true* Closing connection #0

If the Neo4j server has Basic Security enabled, the Causal Clustering status endpoints will also require authentication credentials. For some load balancers and proxy servers, providing this with the request is not an option. For those situations, consider disabling authentication of the CC status endpoints by setting dbms.security.causal_clustering_status_auth_enabled=false in the neo4j.conf configuration file.
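
Where the client is able to send credentials, the request can carry them instead. The sketch below builds such a request in Python under the assumption that the status endpoints accept HTTP Basic authentication. The function name and the credentials in the usage note are placeholders, not values from this manual.

```python
import base64
import urllib.request


def authenticated_request(url, username, password):
    """Build a GET request carrying an HTTP Basic Authorization header."""
    # Assumption: the endpoint accepts HTTP Basic authentication.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request
```

The returned object can be passed to urllib.request.urlopen, for example with the URL http://localhost:7474/db/manage/server/core/available and an operator-supplied user name and password.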