Monitoring system
GDS supports multiple users working concurrently on the same system. GDS procedures are typically resource-heavy: they may use a lot of memory and/or many CPU cores for their computations. To judge whether it is a reasonable time to run a GDS procedure, it is useful to know the current capacity of the system hosting Neo4j and GDS, as well as the current GDS workload on that system. Graphs and models are not shared between non-admin users by default; however, GDS users on the same system share its capacity.
System monitor procedure
To get an overview of the system’s current capacity and its analytics workload, use the procedure gds.systemMonitor.
It will give you information on the capacity of the DBMS’s JVM instance in terms of memory and CPU cores, and an overview of the resources consumed by the GDS procedures currently being run on the system.
Syntax
CALL gds.systemMonitor()
YIELD
freeHeap,
totalHeap,
maxHeap,
jvmAvailableCpuCores,
availableCpuCoresNotRequested,
jvmHeapStatus,
ongoingGdsProcedures
Name | Type | Description |
---|---|---|
freeHeap | Integer | The amount of currently free memory in bytes in the Java virtual machine hosting the Neo4j instance. |
totalHeap | Integer | The total amount of memory in bytes in the Java virtual machine hosting the Neo4j instance. This value may vary over time, depending on the host environment. |
maxHeap | Integer | The maximum amount of memory in bytes that the Java virtual machine hosting the Neo4j instance will attempt to use. |
jvmAvailableCpuCores | Integer | The number of logical CPU cores currently available to the Java virtual machine. This value may vary over the lifetime of the DBMS. |
availableCpuCoresNotRequested | Integer | The number of logical CPU cores currently available to the Java virtual machine that are not requested for use by currently running GDS procedures. Note that this number may be negative if fewer cores are available to the JVM than are being requested by ongoing GDS procedures. |
jvmHeapStatus | Map | The above-mentioned heap metrics in human-readable form. |
ongoingGdsProcedures | List of Map | A list of maps containing resource usage and progress information for all GDS procedures (of all users) currently running on the Neo4j instance. Each map contains the name of the procedure, how far it has progressed, its estimated memory usage, and the maximum number of CPU cores it will try to use. |
Example
First, let us assume that we just started the gds.node2vec.stream procedure with some arbitrary parameters.
We can have a look at the status of the JVM heap.
CALL gds.systemMonitor()
YIELD
freeHeap,
totalHeap,
maxHeap
freeHeap | totalHeap | maxHeap |
---|---|---|
1234567 | 2345678 | 3456789 |
We can see that there is currently around 1.23 MB of free heap memory in the JVM instance running our Neo4j DBMS.
This may increase independently of any procedures finishing their execution, as totalHeap is currently smaller than maxHeap.
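The relationship between the three heap values can also be computed directly in the query. The following is a minimal sketch that uses only the yield fields documented above; the aliases heapGrowthRoom and freeHeapUpperBound are names chosen here for illustration and are not part of the procedure's output.

```cypher
CALL gds.systemMonitor()
YIELD freeHeap, totalHeap, maxHeap
RETURN
  freeHeap,
  // Memory the JVM may still claim before reaching its configured maximum
  maxHeap - totalHeap AS heapGrowthRoom,
  // Free memory if the heap grew to its maximum and nothing else changed
  freeHeap + (maxHeap - totalHeap) AS freeHeapUpperBound
```

With the example values above, heapGrowthRoom would be 1111111 bytes and freeHeapUpperBound would be 2345678 bytes. This is a rough upper bound only, since other workloads on the DBMS also allocate heap memory.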
We can also inspect CPU core usage as well as the status of currently running GDS procedures on the system.
CALL gds.systemMonitor()
YIELD
jvmAvailableCpuCores,
availableCpuCoresNotRequested,
ongoingGdsProcedures
jvmAvailableCpuCores | availableCpuCoresNotRequested | ongoingGdsProcedures |
---|---|---|
100 | 84 | [{ username: "bob", jobId: "42", procedure: "Node2Vec", progress: "33.33%", estimatedMemoryRange: "[123 kB … 234 kB]", requestedNumberOfCpuCores: "16" }] |
Here we can note that there is only one GDS procedure currently running, namely the Node2Vec procedure we just started. It has already finished around 33.33% of its execution.
We also see that it may use up to an estimated 234 kB of memory.
Note that it may not currently be using that much memory, so it may require more memory later in its execution, thus possibly lowering our current freeHeap.
It has requested up to 16 CPU cores, leaving 84 of the 100 cores available to the JVM not requested by any GDS procedure.
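When several procedures are running, the list of maps can be easier to read as one row per procedure. The sketch below assumes that each map carries the keys shown in the example output above (username, jobId, procedure, progress, estimatedMemoryRange, requestedNumberOfCpuCores); the variable proc and the alias procedureName are illustrative.

```cypher
CALL gds.systemMonitor()
YIELD ongoingGdsProcedures
// Turn the list of maps into one row per running procedure
UNWIND ongoingGdsProcedures AS proc
RETURN
  proc.username AS username,
  proc.jobId AS jobId,
  proc.procedure AS procedureName,
  proc.progress AS progress,
  proc.estimatedMemoryRange AS estimatedMemoryRange,
  proc.requestedNumberOfCpuCores AS requestedNumberOfCpuCores
```

Note that UNWIND produces no rows for an empty list, so this query returns nothing when no GDS procedure is running.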
Memory reporting procedures
To facilitate memory management, the Neo4j GDS Library offers two memory reporting procedures, alongside the aforementioned monitoring capabilities.
Detailed memory listing
Through the gds.memory.list procedure, a user can obtain a combined list of their projected graphs and running tasks, along with their memory footprints.
For graphs, the size of the in-memory graph is returned, whereas for algorithms the memory estimation is returned, i.e., the maximum amount of memory the task is expected to require over the course of its execution.
This procedure can help users identify which of their operations are memory-demanding and plan accordingly (e.g., drop a graph or terminate a transaction).
Users can access only their own graphs and tasks, but an admin can obtain information for all users.
Syntax
CALL gds.memory.list()
YIELD
user: String,
name: String,
entity: String,
memoryInBytes: Integer
Name | Type | Description |
---|---|---|
user | String | The username. |
name | String | The name of the graph or running task. |
entity | String | If the reporting entity is a task, this corresponds to its job id. Otherwise, it is set to "graph". |
memoryInBytes | Integer | The memory occupied by a graph, or the estimated memory requirement of a task. |
Example
CALL gds.memory.list()
YIELD
user, name, entity, memoryInBytes
user | name | entity | memoryInBytes |
---|---|---|---|
"Bob" | "my-graph" | "graph" | 20 |
"Bob" | "Node2Vec" | "111-222-333" | 234 |
As we can see, Bob has projected a graph named 'my-graph' occupying 20 bytes of memory, and is currently running Node2Vec, which has an estimated memory footprint of 234 bytes.
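To find the most memory-demanding entries quickly, the result can be sorted. This is a minimal sketch using only the yield fields documented above and standard Cypher ordering.

```cypher
CALL gds.memory.list()
YIELD user, name, entity, memoryInBytes
RETURN user, name, entity, memoryInBytes
// Largest graphs and task estimations first
ORDER BY memoryInBytes DESC
```

With the example data above, the Node2Vec task would be listed before my-graph.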
Memory summary
The gds.memory.summary procedure provides an overview of the total graph memory and the total memory requirements of running tasks for a given user.
Users can access only their own graphs and tasks, but an admin can obtain information for all users.
Syntax
CALL gds.memory.summary()
YIELD
user: String,
totalGraphsMemory: Integer,
totalTasksMemory: Integer
Name | Type | Description |
---|---|---|
user | String | The username. |
totalGraphsMemory | Integer | The total memory requirement of that user’s graphs. |
totalTasksMemory | Integer | The total memory requirement of that user’s tasks. |
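As a usage sketch, the two totals can be combined to see which users account for the most memory overall. The query uses only the yield fields documented above; the alias totalMemory is an illustrative name and is not part of the procedure's output.

```cypher
CALL gds.memory.summary()
YIELD user, totalGraphsMemory, totalTasksMemory
RETURN
  user,
  totalGraphsMemory,
  totalTasksMemory,
  // Combined footprint of the user's graphs and running tasks
  totalGraphsMemory + totalTasksMemory AS totalMemory
ORDER BY totalMemory DESC
```

A non-admin user would see only their own row, whereas an admin would see one row per user.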