Metrics reference
Enterprise Edition

Use caution when interpreting unfamiliar metrics. Reading the Performance section first is recommended to better understand them.

Types of metrics

Neo4j has the following types of metrics:

Global — covers the whole Neo4j DBMS.
Per database — covers an individual database.

Each metric falls into one of the following categories:

Gauge — shows an instantaneous reading of a particular value.
Counter — shows an accumulated value.
Histogram — shows the distribution of values.
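Because a counter only accumulates, it is usually read as a rate of change between two samples, whereas a gauge is read directly. The following is a minimal sketch of that conversion, assuming samples arrive as (timestamp in seconds, value) pairs. The function name `counter_rate` and the way a reset is handled are illustrative assumptions, not part of Neo4j.

```python
def counter_rate(prev, curr):
    """Return the per-second increase between two (timestamp_s, value) counter samples."""
    (t0, v0), (t1, v1) = prev, curr
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    if v1 < v0:
        # Assumption: a drop means the counter restarted from zero
        # (for example, after a DBMS restart), so only v1 is counted.
        return v1 / (t1 - t0)
    return (v1 - v0) / (t1 - t0)
```

For example, `counter_rate((0, 100), (10, 150))` reads a counter that grew by 50 over 10 seconds as 5 events per second.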

Global metrics

Global metrics cover the whole database management system and represent the status of the system as a whole.

Global metrics have the following name format: <user-configured-prefix>.<metric-name> if metrics.namespaces.enabled is false, or <user-configured-prefix>.dbms.<metric-name> if the setting is true. The <user-configured-prefix> can be configured with the metrics.prefix configuration setting.

Metrics of this type are reported as soon as the database management system is available. For example, all JVM related metrics are global. In particular, the neo4j.vm.thread.count metric has a default user-configured-prefix neo4j and the metric name is vm.thread.count.
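The naming rule for global metrics can be sketched as a small helper. This is only an illustration of the format described above: the function name `global_metric_name` and its parameters are hypothetical, and the defaults assume the prefix `neo4j` with namespaces disabled.

```python
def global_metric_name(metric_name, prefix="neo4j", namespaces_enabled=False):
    """Build the full name of a global metric.

    prefix mirrors the metrics.prefix setting and namespaces_enabled
    mirrors metrics.namespaces.enabled (both passed in by the caller here).
    """
    if namespaces_enabled:
        parts = [prefix, "dbms", metric_name]
    else:
        parts = [prefix, metric_name]
    return ".".join(parts)


print(global_metric_name("vm.thread.count"))
print(global_metric_name("vm.thread.count", namespaces_enabled=True))
```

The first call yields `neo4j.vm.thread.count`, and the second yields `neo4j.dbms.vm.thread.count`.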

By default, global metrics include:

Bolt metrics
Page cache metrics
GC metrics
Thread metrics
Database operation metrics
Web Server metrics
JVM metrics

Database metrics

Each database metric is reported for a particular database only. Database metrics are only available during the lifetime of the database. When a database becomes unavailable, all of its metrics become unavailable as well.

Database metrics have the following name format: <user-configured-prefix>.<database-name>.<metric-name> if metrics.namespaces.enabled is false, or <user-configured-prefix>.database.<database-name>.<metric-name> if the setting is true. The <user-configured-prefix> can be configured with the metrics.prefix configuration setting.

For example, any transaction metric is a database metric. In particular, the neo4j.mydb.transaction.started metric has a default user-configured-prefix neo4j and is a metric for the mydb database.
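The naming rule for database metrics can be sketched in the same way. The helper `database_metric_name` and its parameters are hypothetical names used for illustration, and the defaults assume the prefix `neo4j` with namespaces disabled.

```python
def database_metric_name(database, metric_name, prefix="neo4j", namespaces_enabled=False):
    """Build the full name of a per-database metric.

    prefix mirrors the metrics.prefix setting and namespaces_enabled
    mirrors metrics.namespaces.enabled (both passed in by the caller here).
    """
    if namespaces_enabled:
        parts = [prefix, "database", database, metric_name]
    else:
        parts = [prefix, database, metric_name]
    return ".".join(parts)


print(database_metric_name("mydb", "transaction.started"))
print(database_metric_name("mydb", "transaction.started", namespaces_enabled=True))
```

The first call yields `neo4j.mydb.transaction.started`, and the second yields `neo4j.database.mydb.transaction.started`.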

By default, database metrics include:

Transaction metrics
Checkpoint metrics
Log rotation metrics
Database data metrics
Cypher metrics
Causal clustering metrics

General-purpose metrics

Table 1. Bolt metrics
Name	Description
`<prefix>.bolt.sessions_started`	Deprecated; use `<prefix>.bolt.connections_opened` instead. The total number of Bolt sessions created by users since startup. This includes both succeeded and failed sessions. Useful for monitoring load via the Bolt drivers in combination with other metrics. (counter)
`<prefix>.bolt.connections_opened`	The total number of Bolt connections opened since startup. This includes both succeeded and failed connections. Useful for monitoring load via the Bolt drivers in combination with other metrics. (counter)
`<prefix>.bolt.connections_closed`	The total number of Bolt connections closed since startup. This includes both properly and abnormally ended connections. Useful for monitoring load via Bolt drivers in combination with other metrics. (counter)
`<prefix>.bolt.connections_running`	The total number of Bolt connections that are currently executing Cypher and returning results. Useful to track the overall load on Bolt connections. This is limited to the number of Bolt worker threads that have been configured via `dbms.connector.bolt.thread_pool_max_size`. Reaching this maximum indicates that the server is running at capacity. (gauge)
`<prefix>.bolt.connections_idle`	The total number of Bolt connections that are not currently executing Cypher or returning results. (gauge)
`<prefix>.bolt.messages_received`	The total number of messages received via Bolt since startup. Useful to track general message activity in combination with other metrics. (counter)
`<prefix>.bolt.messages_started`	The total number of messages that have started processing since being received. A received message may not begin processing until a Bolt worker thread becomes available. A large gap between `bolt.messages_received` and `bolt.messages_started` could indicate that the server is running at capacity. (counter)
`<prefix>.bolt.messages_done`	The total number of Bolt messages that have completed processing whether successfully or unsuccessfully. Useful for tracking overall load. (counter)
`<prefix>.bolt.messages_failed`	The total number of messages that have failed while processing. A high number of failures may indicate an issue with the server, and further investigation of the logs is recommended. (counter)
`<prefix>.bolt.accumulated_queue_time`	(unsupported feature) When `internal.server.bolt.thread_pool_queue_size` is enabled, the total time in milliseconds that a Bolt message waits in the processing queue before a Bolt worker thread becomes available to process it. Sharp increases in this value indicate that the server is running at capacity. If `internal.server.bolt.thread_pool_queue_size` is disabled, the value should be `0`, meaning that messages are directly handed off to worker threads. (counter)
`<prefix>.bolt.accumulated_processing_time`	The total amount of time in milliseconds that worker threads have been processing messages. Useful for monitoring load via Bolt drivers in combination with other metrics. (counter)
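The gap between `bolt.messages_received` and `bolt.messages_started` mentioned in the table can be tracked as a backlog. The sketch below assumes you have already collected paired readings of the two counters; the function names and the "strictly growing" rule are illustrative choices, not a Neo4j-defined alert.

```python
def bolt_backlog(messages_received, messages_started):
    """Messages received but not yet picked up by a Bolt worker thread."""
    return max(messages_received - messages_started, 0)


def is_backlog_growing(samples):
    """Return True if the backlog increases across every consecutive pair of samples.

    samples is a time-ordered list of (messages_received, messages_started) readings.
    """
    backlogs = [bolt_backlog(received, started) for received, started in samples]
    return len(backlogs) >= 2 and all(b > a for a, b in zip(backlogs, backlogs[1:]))
```

A steadily growing backlog is a stronger signal than a single large reading, because both metrics are counters that only increase over time.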

Table 2. Database checkpointing metrics
Name	Description
`<prefix>.check_point.events`	The total number of check point events executed so far. (counter)
`<prefix>.check_point.total_time`	The total time, in milliseconds, spent in check pointing so far. (counter)
`<prefix>.check_point.duration`	The duration, in milliseconds, of the last check point event. Checkpoints should generally take several seconds to several minutes. Long checkpoints can be an issue, as these are invoked when the database stops, when a hot backup is taken, and periodically as well. Values over `30` minutes or so should be cause for some investigation. (gauge)
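The checkpoint metrics can be combined into an average duration and compared against the 30-minute guideline above. This is a sketch only; the function names and the default threshold constant are assumptions made for illustration.

```python
THIRTY_MINUTES_MS = 30 * 60 * 1000  # the "30 minutes or so" guideline, in milliseconds


def average_checkpoint_ms(total_time_ms, events):
    """Average checkpoint duration from check_point.total_time and check_point.events."""
    return total_time_ms / events if events else 0.0


def checkpoint_is_slow(duration_ms, threshold_ms=THIRTY_MINUTES_MS):
    """Return True if a check_point.duration reading exceeds the threshold."""
    return duration_ms > threshold_ms
```

For instance, `average_checkpoint_ms(12000, 4)` gives an average of 3000 milliseconds per checkpoint.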

Table 3. Cypher metrics
Name	Description
`<prefix>.cypher.replan_events`	The total number of times Cypher has decided to re-plan a query. Neo4j caches 1000 plans by default. Seeing sustained replanning events or large spikes could indicate an issue that needs to be investigated. (counter)
`<prefix>.cypher.replan_wait_time`	The total number of seconds waited between query replans. (counter)

Table 4. Database data count metrics
Name	Description
`<prefix>.neo4j.count.relationship`	The total number of relationships in the database. (gauge)
`<prefix>.neo4j.count.node`	The total number of nodes in the database. A rough measure of how big your graph is. If you are running a bulk insert operation, you can see this value tick up. (gauge)

Table 5. Database neo4j pools metrics
Name	Description
`<prefix>.pool.<pool>.<database>.used_heap`	Used or reserved heap memory in bytes. (gauge)
`<prefix>.pool.<pool>.<database>.used_native`	Used or reserved native memory in bytes. (gauge)
`<prefix>.pool.<pool>.<database>.total_used`	Sum total used heap and native memory in bytes. (gauge)
`<prefix>.pool.<pool>.<database>.total_size`	Sum total size of capacity of the heap and/or native memory pool. (gauge)
`<prefix>.pool.<pool>.<database>.free`	Available unused memory in the pool, in bytes. (gauge)

Table 6. Database operation count metrics
Name	Description
`<prefix>.db.operation.count.create`	Count of successful database create operations. (counter)
`<prefix>.db.operation.count.start`	Count of successful database start operations. (counter)
`<prefix>.db.operation.count.stop`	Count of successful database stop operations. (counter)
`<prefix>.db.operation.count.drop`	Count of successful database drop operations. (counter)
`<prefix>.db.operation.count.failed`	Count of failed database operations. (counter)
`<prefix>.db.operation.count.recovered`	Count of database operations which failed previously but have recovered. (counter)

Table 7. Database data metrics
Name	Description
`<prefix>.ids_in_use.relationship_type`	The total number of different relationship types stored in the database. Informational, not an indication of any issue. Spikes or large increases indicate large data loads, which could correspond with some behavior you are investigating. (gauge)
`<prefix>.ids_in_use.property`	The total number of different property names used in the database. Informational, not an indication of any issue. Spikes or large increases indicate large data loads, which could correspond with some behavior you are investigating. (gauge)
`<prefix>.ids_in_use.relationship`	The total number of relationships stored in the database. Informational, not an indication of any issue. Spikes or large increases indicate large data loads, which could correspond with some behavior you are investigating. (gauge)
`<prefix>.ids_in_use.node`	The total number of nodes stored in the database. Informational, not an indication of any issue. Spikes or large increases indicate large data loads, which could correspond with some behavior you are investigating. (gauge)

Table 8. Global neo4j pools metrics
Name	Description
`<prefix>.dbms.pool.<pool>.used_heap`	Used or reserved heap memory in bytes. (gauge)
`<prefix>.dbms.pool.<pool>.used_native`	Used or reserved native memory in bytes. (gauge)
`<prefix>.dbms.pool.<pool>.total_used`	Sum total used heap and native memory in bytes. (gauge)
`<prefix>.dbms.pool.<pool>.total_size`	Sum total size of capacity of the heap and/or native memory pool. (gauge)
`<prefix>.dbms.pool.<pool>.free`	Available unused memory in the pool, in bytes. (gauge)

Table 9. Database page cache metrics
Name	Description
`<prefix>.page_cache.eviction_exceptions`	The total number of exceptions seen during the eviction process in the page cache. (counter)
`<prefix>.page_cache.flushes`	The total number of page flushes executed by the page cache. (counter)
`<prefix>.page_cache.merges`	The total number of page merges executed by the page cache. (counter)
`<prefix>.page_cache.unpins`	The total number of page unpins executed by the page cache. (counter)
`<prefix>.page_cache.pins`	The total number of page pins executed by the page cache. (counter)
`<prefix>.page_cache.evictions`	The total number of page evictions executed by the page cache. (counter)
`<prefix>.page_cache.evictions.cooperative`	The total number of cooperative page evictions executed by the page cache due to low available pages. (counter)
`<prefix>.page_cache.page_faults`	The total number of page faults in the page cache. If this count keeps increasing over time, it may indicate that more page cache is required. However, note that when Neo4j Enterprise starts up, all page cache warmup activities result in page faults. Therefore, it is normal to observe a significant page fault count immediately after startup. (counter)
`<prefix>.page_cache.hits`	The total number of page hits that have happened in the page cache. (counter)
`<prefix>.page_cache.hit_ratio`	The ratio of hits to the total number of lookups in the page cache. Performance relies on efficiently using the page cache, so this metric should be in the 98-100% range consistently. If it is much lower than that, then the database is going to disk too often. (gauge)
`<prefix>.page_cache.usage_ratio`	The ratio of number of used pages to total number of available pages. This metric shows what percentage of the allocated page cache is actually being used. If it is 100%, then it is likely that the hit ratio will start dropping, and you should consider allocating more RAM to page cache. (gauge)
`<prefix>.page_cache.bytes_read`	The total number of bytes read by the page cache. (counter)
`<prefix>.page_cache.bytes_written`	The total number of bytes written by the page cache. (counter)
`<prefix>.page_cache.iops`	The total number of IO operations performed by the page cache.
`<prefix>.page_cache.throttled.times`	The total number of times the page cache flush IO limiter was throttled during ongoing IO operations.
`<prefix>.page_cache.throttled.millis`	The total number of milliseconds the page cache flush IO limiter was throttled during ongoing IO operations.
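The guidance on `page_cache.hit_ratio` and `page_cache.usage_ratio` can be turned into a simple check. The sketch below assumes both gauges are read as fractions between 0 and 1 (so 98% is `0.98`); the function name and the returned messages are illustrative.

```python
def page_cache_findings(hit_ratio, usage_ratio, min_hit_ratio=0.98):
    """Return a list of findings based on the two page cache ratio gauges.

    Assumption: both ratios are fractions in the range 0.0 to 1.0.
    """
    findings = []
    if hit_ratio < min_hit_ratio:
        findings.append("hit ratio below target: the database may be going to disk too often")
    if usage_ratio >= 1.0:
        findings.append("page cache fully used: consider allocating more RAM to the page cache")
    return findings
```

An empty list means neither condition described in the table was met for that reading. Readings taken immediately after startup are better ignored, since page cache warmup produces page faults.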

Table 10. Query execution metrics
Name	Description
`<prefix>.db.query.execution.success`	Count of successful queries executed. (counter)
`<prefix>.db.query.execution.failure`	Count of failed queries executed. (counter)
`<prefix>.db.query.execution.latency.millis`	Execution time in milliseconds of queries executed successfully. (histogram)

Table 11. Database store size metrics
Name	Description
`<prefix>.store.size.total`	The total size of the database and transaction logs, in bytes. The total size of the database helps determine how much page cache is required. It also helps compare the total disk space used by the data store and how much is available. (gauge)
`<prefix>.store.size.database`	The size of the database, in bytes. The total size of the database helps determine how much page cache is required. It also helps compare the total disk space used by the data store and how much is available. (gauge)

Table 12. Database transaction log metrics
Name	Description
`<prefix>.log.rotation_events`	The total number of transaction log rotations executed so far. (counter)
`<prefix>.log.rotation_total_time`	The total time, in milliseconds, spent in rotating transaction logs so far. (counter)
`<prefix>.log.rotation_duration`	The duration, in milliseconds, of the last log rotation event. (gauge)
`<prefix>.log.appended_bytes`	The total number of bytes appended to transaction log. (counter)
`<prefix>.log.flushes`	The total number of transaction log flushes. (counter)
`<prefix>.log.append_batch_size`	The size of the last transaction append batch. (gauge)

Table 13. Database transaction metrics
Name	Description
`<prefix>.transaction.started`	The total number of started transactions. (counter)
`<prefix>.transaction.peak_concurrent`	The highest peak of concurrent transactions. This value can help you design for the highest load scenarios and decide whether the Bolt thread settings should be altered. (counter)
`<prefix>.transaction.active`	The number of currently active transactions. Informational, not an indication of any issue. Spikes or large increases could indicate large data loads, or just high read load. (gauge)
`<prefix>.transaction.active_read`	The number of currently active read transactions. (gauge)
`<prefix>.transaction.active_write`	The number of currently active write transactions. (gauge)
`<prefix>.transaction.committed`	The total number of committed transactions. Informational, not an indication of any issue. Spikes or large increases indicate large data loads, or just high read load. (counter)
`<prefix>.transaction.committed_read`	The total number of committed read transactions. Informational, not an indication of any issue. Spikes or large increases indicate high read load. (counter)
`<prefix>.transaction.committed_write`	The total number of committed write transactions. Informational, not an indication of any issue. Spikes or large increases indicate large data loads, which could correspond with some behavior you are investigating. (counter)
`<prefix>.transaction.rollbacks`	The total number of rolled back transactions. (counter)
`<prefix>.transaction.rollbacks_read`	The total number of rolled back read transactions. (counter)
`<prefix>.transaction.rollbacks_write`	The total number of rolled back write transactions. Seeing a lot of writes rolled back may indicate various issues with locking, transaction timeouts, etc. (counter)
`<prefix>.transaction.terminated`	The total number of terminated transactions. (counter)
`<prefix>.transaction.terminated_read`	The total number of terminated read transactions. (counter)
`<prefix>.transaction.terminated_write`	The total number of terminated write transactions. (counter)
`<prefix>.transaction.last_committed_tx_id`	The ID of the last committed transaction. Track this for each instance. In a cluster, track it for each Core cluster member and each Read Replica, possibly on separate charts. Each member should show one ever-increasing line. If one of the lines levels off or falls behind, that member is no longer replicating data and action is needed to rectify the situation. (counter)
`<prefix>.transaction.last_closed_tx_id`	The ID of the last closed transaction. (counter)
`<prefix>.transaction.tx_size_heap`	The transactions' size on heap in bytes. (histogram)
`<prefix>.transaction.tx_size_native`	The transactions' size in native memory in bytes. (histogram)
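The advice to compare `transaction.last_committed_tx_id` across cluster members can be sketched as a lag check. The member names, the `max_lag` threshold, and the function name below are all hypothetical; choose a threshold that suits your write volume.

```python
def members_falling_behind(last_committed_tx_ids, max_lag):
    """Return the members whose last committed transaction ID trails the highest one.

    last_committed_tx_ids maps a member name to its last_committed_tx_id reading.
    max_lag is the largest acceptable difference, in transactions (an assumed threshold).
    """
    highest = max(last_committed_tx_ids.values())
    return sorted(
        member
        for member, tx_id in last_committed_tx_ids.items()
        if highest - tx_id > max_lag
    )
```

With readings such as `{"core1": 1000, "core2": 998, "replica1": 400}` and `max_lag=100`, only `replica1` is reported.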

Table 14. Server metrics
Name	Description
`<prefix>.server.threads.jetty.idle`	The total number of idle threads in the jetty pool. (gauge)
`<prefix>.server.threads.jetty.all`	The total number of threads (both idle and busy) in the jetty pool. (gauge)

Metrics specific to Causal Clustering

Table 15. CatchUp Metrics
Name	Description
`<prefix>.causal_clustering.catchup.tx_pull_requests_received`	TX pull requests received from read replicas. (counter)

Table 16. Discovery core metrics
Name	Description
`<prefix>.causal_clustering.core.discovery.replicated_data`	Size of replicated data structures. (gauge)
`<prefix>.causal_clustering.core.discovery.cluster.members`	Discovery cluster member size. (gauge)
`<prefix>.causal_clustering.core.discovery.cluster.unreachable`	Discovery cluster unreachable size. (gauge)
`<prefix>.causal_clustering.core.discovery.cluster.converged`	Discovery cluster convergence. (gauge)

Table 17. Raft core metrics
Name	Description
`<prefix>.causal_clustering.core.append_index`	The append index of the Raft log. Each index represents a write transaction (possibly internal) proposed for commitment. The values mostly increase, but sometimes they can decrease as a consequence of leader changes. The append index should always be greater than or equal to the commit index. (gauge)
`<prefix>.causal_clustering.core.commit_index`	The commit index of the Raft log. Represents the commitment of previously appended entries. Its value increases monotonically if you do not unbind the cluster state. The commit index should always be less than or equal to the append index and greater than or equal to the applied index. (gauge)
`<prefix>.causal_clustering.core.applied_index`	The applied index of the Raft log. Represents the application of the committed Raft log entries to the database and internal state. The applied index should always be less than or equal to the commit index. The difference between this and the commit index can be used to monitor how up-to-date the follower database is. (gauge)
`<prefix>.causal_clustering.core.term`	The Raft Term of this server. It increases monotonically if you do not unbind the cluster state. (gauge)
`<prefix>.causal_clustering.core.tx_retries`	Transaction retries. (counter)
`<prefix>.causal_clustering.core.is_leader`	Is this server the leader? Track this for each Core cluster member. It will report `0` if it is not the leader and `1` if it is the leader. The sum of all of these should always be `1`. However, there will be transient periods in which the sum can be more than `1` because more than one member thinks it is the leader. Action may be needed if the metric shows `0` for more than 30 seconds. (gauge)
`<prefix>.causal_clustering.core.in_flight_cache.total_bytes`	In-flight cache total bytes. (gauge)
`<prefix>.causal_clustering.core.in_flight_cache.max_bytes`	In-flight cache max bytes. (gauge)
`<prefix>.causal_clustering.core.in_flight_cache.element_count`	In-flight cache element count. (gauge)
`<prefix>.causal_clustering.core.in_flight_cache.max_elements`	In-flight cache maximum elements. (gauge)
`<prefix>.causal_clustering.core.in_flight_cache.hits`	In-flight cache hits. (counter)
`<prefix>.causal_clustering.core.in_flight_cache.misses`	In-flight cache misses. (counter)
`<prefix>.causal_clustering.core.raft_log_entry_prefetch_buffer.lag`	Raft Log Entry Prefetch Lag. (gauge)
`<prefix>.causal_clustering.core.raft_log_entry_prefetch_buffer.bytes`	Raft Log Entry Prefetch total bytes. (gauge)
`<prefix>.causal_clustering.core.raft_log_entry_prefetch_buffer.size`	Raft Log Entry Prefetch buffer size. (gauge)
`<prefix>.causal_clustering.core.raft_log_entry_prefetch_buffer.async_put`	Raft Log Entry Prefetch buffer async puts. (gauge)
`<prefix>.causal_clustering.core.raft_log_entry_prefetch_buffer.sync_put`	Raft Log Entry Prefetch buffer sync puts. (gauge)
`<prefix>.causal_clustering.core.message_processing_delay`	Delay between Raft message receive and process. (gauge)
`<prefix>.causal_clustering.core.message_processing_timer`	Timer for Raft message processing. (counter, histogram)
`<prefix>.causal_clustering.core.replication_new`	The total number of Raft replication requests. It increases with write transactions (possibly internal) activity. (counter)
`<prefix>.causal_clustering.core.replication_attempt`	The total number of Raft replication request attempts. It is greater than or equal to the number of replication requests. (counter)
`<prefix>.causal_clustering.core.replication_fail`	The total number of Raft replication attempts that have failed. (counter)
`<prefix>.causal_clustering.core.replication_maybe`	Raft Replication maybe count. (counter)
`<prefix>.causal_clustering.core.replication_success`	The total number of Raft replication requests that have succeeded. (counter)
`<prefix>.causal_clustering.core.last_leader_message`	The time elapsed since the last message from a leader in milliseconds. Should reset periodically. (gauge)
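The rule that the `is_leader` values of all Core members should sum to `1` can be checked as follows. This is a sketch under the assumption that you have one `is_leader` reading per Core member; the member names, function name, and status strings are illustrative.

```python
def leader_status(is_leader_by_member):
    """Classify a cluster from per-member is_leader readings (0 or 1)."""
    leaders = sum(is_leader_by_member.values())
    if leaders == 1:
        return "ok"
    if leaders == 0:
        # May need action if this persists (the table suggests more than 30 seconds).
        return "no leader"
    # More than one member believes it is the leader; expected to be transient.
    return "multiple leaders"
```

A monitoring job would typically evaluate this on every scrape and only raise an alert when a status other than `"ok"` persists across several consecutive readings.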

Table 18. Read Replica Metrics
Name	Description
`<prefix>.causal_clustering.read_replica.pull_updates`	The total number of pull requests made by this instance. (counter)
`<prefix>.causal_clustering.read_replica.pull_update_highest_tx_id_requested`	The highest transaction id requested in a pull update by this instance. (counter)
`<prefix>.causal_clustering.read_replica.pull_update_highest_tx_id_received`	The highest transaction id that has been pulled in the last pull updates by this instance. (counter)

Java Virtual Machine Metrics

The JVM metrics show information about garbage collections (for example, the number of events and time spent collecting), memory pools and buffers, and the number of active threads running. They are environment dependent and may therefore vary on different hardware and with different JVM configurations. The metrics about the JVM's memory usage expose values that are provided by the MemoryPoolMXBeans and BufferPoolMXBeans. The memory pools are memory managed by the JVM, for example, neo4j.vm.memory.pool.g1_survivor_space, so you can tune them using the JVM settings if necessary. The buffer pools are space outside of the memory managed by the garbage collector. Neo4j allocates buffers in those pools as it needs them. You can limit this memory using JVM settings, but there is never any good reason to do so.

Table 19. JVM file descriptor metrics.
Name	Description
`<prefix>.vm.file.descriptors.count`	The current number of open file descriptors. (gauge)
`<prefix>.vm.file.descriptors.maximum`	(OS setting) The maximum number of open file descriptors. It is recommended to be set to 40K file handles, because of the native and Lucene indexing Neo4j uses. If this metric gets close to the limit, you should consider raising it. (gauge)
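Whether `vm.file.descriptors.count` is getting close to `vm.file.descriptors.maximum` can be expressed as a utilization ratio. The 80% warning level and the function names below are assumptions chosen for illustration.

```python
def fd_utilization(count, maximum):
    """Fraction of the file descriptor limit currently in use."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return count / maximum


def fd_near_limit(count, maximum, warn_at=0.8):
    """Return True when usage reaches the (assumed) warning fraction of the limit."""
    return fd_utilization(count, maximum) >= warn_at
```

For example, 10,000 open descriptors against a limit of 40,000 is a utilization of 0.25, well below the assumed warning level.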

Table 20. GC metrics.
Name	Description
`<prefix>.vm.gc.time.<gc>`	Accumulated garbage collection time in milliseconds. Long GCs can be an indication of performance issues, or potentially instability. If this approaches the heartbeat timeout in a cluster, it may cause unwanted leader switches. (counter)
`<prefix>.vm.gc.count.<gc>`	Total number of garbage collections. (counter)

Table 21. JVM Heap metrics.
Name	Description
`<prefix>.vm.heap.committed`	Amount of memory (in bytes) guaranteed to be available for use by the JVM. (gauge)
`<prefix>.vm.heap.used`	Amount of memory (in bytes) currently used. This is the amount of heap space currently used at a given point in time. Monitor this to identify if you are maxing out consistently, in which case, you should increase initial and max heap size, or if you are underutilizing, you should decrease the initial and max heap sizes. (gauge)
`<prefix>.vm.heap.max`	Maximum amount of heap memory (in bytes) that can be used. Compare `<prefix>.vm.heap.used` against this value to identify whether you are maxing out consistently, in which case you should increase the initial and max heap size, or underutilizing, in which case you should decrease them. (gauge)
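The heap sizing guidance can be sketched by comparing a series of `vm.heap.used` readings with `vm.heap.max`. The thresholds of 90% and 30%, and the interpretation of "consistently" as "in every sample", are assumptions made for this example.

```python
def heap_sizing_hint(used_samples, heap_max, high=0.9, low=0.3):
    """Suggest a heap size change from vm.heap.used samples and vm.heap.max (bytes).

    Returns "increase" if every sample is at or above the high fraction,
    "decrease" if every sample is at or below the low fraction, else "keep".
    """
    if not used_samples or heap_max <= 0:
        raise ValueError("need at least one sample and a positive heap_max")
    ratios = [used / heap_max for used in used_samples]
    if min(ratios) >= high:
        return "increase"
    if max(ratios) <= low:
        return "decrease"
    return "keep"
```

Heap usage naturally rises and falls between garbage collections, so a long sampling window gives a more reliable hint than a handful of readings.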

Table 22. JVM memory buffers metrics.
Name	Description
`<prefix>.vm.memory.buffer.<bufferpool>.count`	Estimated number of buffers in the pool. (gauge)
`<prefix>.vm.memory.buffer.<bufferpool>.used`	Estimated amount of memory used by the pool. (gauge)
`<prefix>.vm.memory.buffer.<bufferpool>.capacity`	Estimated total capacity of buffers in the pool. (gauge)

Table 23. JVM memory pools metrics.
Name	Description
`<prefix>.vm.memory.pool.<pool>`	Estimated amount of memory in bytes used by the pool. (gauge)

Table 24. JVM pause time metrics.
Name	Description
`<prefix>.vm.pause_time`	Accumulated detected VM pause time. (gauge)

Table 25. JVM threads metrics.
Name	Description
`<prefix>.vm.thread.count`	Estimated number of active threads in the current thread group. (gauge)
`<prefix>.vm.thread.total`	The total number of live threads including daemon and non-daemon threads. (gauge)
