Monitoring system

This section describes features for monitoring a system’s capacity and analytics workload using the Neo4j Graph Data Science library.

GDS supports multiple users concurrently working on the same system. Typically, GDS procedures are resource heavy in the sense that they may use a lot of memory and/or many CPU cores to do their computation. To know whether it is a reasonable time for a user to run a GDS procedure it is useful to know the current capacity of the system hosting Neo4j and GDS, as well as the current GDS workload on the system. Graphs and models are not shared between non-admin users by default, however GDS users on the same system will share its capacity.

1. System monitor procedure

To be able to get an overview of the system’s current capacity and its analytics workload one can use the procedure gds.alpha.systemMonitor. It will give you information on the capacity of the DBMS’s JVM instance in terms of memory and CPU cores, and an overview of the resources consumed by the GDS procedures currently being run on the system.

1.1. Syntax

Monitor the system capacity and analytics workload:
CALL gds.alpha.systemMonitor()
YIELD
  freeHeap,
  totalHeap,
  maxHeap,
  jvmAvailableCpuCores,
  availableCpuCoresNotRequested,
  jvmHeapStatus,
  ongoingGdsProcedures
Table 1. Results
Name Type Description

freeHeap

Integer

The amount of currently free memory in bytes in the Java Virtual Machine hosting the Neo4j instance.

totalHeap

Integer

The total amount of memory in bytes in the Java virtual machine hosting the Neo4j instance. This value may vary over time, depending on the host environment.

maxHeap

Integer

The maximum amount of memory in bytes that the Java virtual machine hosting the Neo4j instance will attempt to use.

jvmAvailableCpuCores

Integer

The number of logical CPU cores currently available to the Java virtual machine. This value may change vary over the lifetime of the DBMS.

availableCpuCoresNotRequested

Integer

The number of logical CPU cores currently available to the Java virtual machine that are not requested for use by currently running GDS procedures. Note that this number may be negative in case there are fewer available cores to the JVM than there are cores being requested by ongoing GDS procedures.

jvmHeapStatus

Map

The above-mentioned heap metrics in human-readable form.

ongoingGdsProcedures

List of Map

A list of maps containing resource usage and progress information for all GDS procedures (of all users) currently running on the Neo4j instance. Each map contains the name of the procedure, how far it has progressed, its estimated memory usage as well as how many CPU cores it will try to use at most.

freeHeap is influenced by ongoing GDS procedures, graphs stored the Graph catalog and the underlying Neo4j DBMS. Stored graphs can take up a significant amount of heap memory. To inspect the graphs in the graph catalog you can use the Graph list procedure.

1.2. Example

First let us assume that we just started gds.beta.node2vec.stream procedure with some arbitrary parameters.

We can have a look at the status of the JVM heap.

Monitor JVM heap status:
CALL gds.alpha.systemMonitor()
YIELD
  freeHeap,
  totalHeap,
  maxHeap
Table 2. Results
freeHeap totalHeap maxHeap

1234567

2345678

3456789

We can see that there currently is around 1.23 MB free heap memory in the JVM instance running our Neo4j DBMS. This may increase independently of any procedures finishing their execution as totalHeap is currently smaller than maxHeap. We can also inspect CPU core usage as well as the status of currently running GDS procedures on the system.

Monitor CPU core usage and ongoing GDS procedures:
CALL gds.alpha.systemMonitor()
YIELD
  availableCpuCoresNotRequested,
  jvmAvailableCpuCores,
  ongoingGdsProcedures
Table 3. Results
jvmAvailableCpuCores availableCpuCoresNotRequested ongoingGdsProcedures

100

84

[{ procedure: "Node2Vec", progress: "33.33%", estimatedMemoryRange: "[123 kB …​ 234 kB]", requestedNumberOfCpuCores: "16" }]

Here we can note that there is only one GDS procedure currently running, namely the Node2Vec procedure we just started. It has finished around 33.33% of its execution already. We also see that it may use up to an estimated 234 kB of memory. Note that it may not currently be using that much memory and so it may require more memory later in its execution, thus possible lowering our current freeHeap. Apparently it wants to use up to 16 CPU cores, leaving us with a total of 84 currently available cores in the system not requested by any GDS procedures.