
Understanding memory consumption

So you have configured Neo4j to use 4GB of heap and 6GB of page cache and sat back, relaxed, thinking the Java process would not go above 10GB on your 12GB machine, only to realise that Neo4j hit an out-of-memory (OOM) error and crashed. What’s happening under the hood? Why is Neo4j consuming more memory than you allocated? Is this a memory leak or normal behaviour? So many questions! Let’s try to answer some of them so that you’re not caught off guard when it comes to memory.

While memory leaks can happen, more often than not, higher memory consumption is normal behaviour for the JVM. In order to function correctly, the JVM needs to allocate memory in several categories besides the heap. The most significant categories of JVM memory are:

  • Heap - The heap is where your Class instantiations or “Objects” are stored.

  • Thread stacks - Each thread has its own call stack. The stack stores primitive local variables and object references along with the call stack (list of method invocations) itself. The stack is cleaned up as stack frames move out of context so there is no GC performed here.

  • Metaspace (PermGen in older Java versions) - Metaspace stores the Class definitions of your Objects, and some other metadata.

  • Code cache - The JIT compiler stores native code it generates in the code cache to improve performance by reusing it.

  • Garbage collection - In order for the GC to know which objects are eligible for collection, it needs to keep track of the object graph, so part of the memory is lost to this internal bookkeeping.

  • Buffer pools - Many libraries and frameworks allocate buffers outside of the heap to improve performance. These buffer pools can be used to share memory between Java code and native code, or map regions of a file into memory.

You can end up using memory for reasons other than those listed above as well, but I just wanted to make you aware that a significant amount of memory is eaten up by JVM internals.
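If you’re curious, the JVM exposes some of these categories at runtime through its standard management beans. Below is a minimal sketch (the class name is mine, purely for illustration):

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class JvmMemoryPeek {
    public static void main(String[] args) {
        // Heap and non-heap (metaspace, code cache, ...) usage as seen by the JVM itself
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println("Heap:     " + memory.getHeapMemoryUsage());
        System.out.println("Non-heap: " + memory.getNonHeapMemoryUsage());

        // Buffer pools, including the "direct" pool used for off-heap byte buffers
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("Buffer pool %s: used=%d bytes, capacity=%d bytes%n",
                    pool.getName(), pool.getMemoryUsed(), pool.getTotalCapacity());
        }
    }
}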

Do I need to worry about all this?

Let’s put this into perspective!

When configuring Neo4j’s memory, you may start encountering terms such as on-heap, off-heap, page cache, direct memory, OS memory... What does this all mean, and what should you be aware of when configuring your memory? First, let’s start by understanding some of these terms:

  • Heap: The JVM has a heap that is the runtime data area from which memory for all class instances and arrays is allocated. Heap storage for objects is reclaimed by an automatic storage management system known as the garbage collector (GC).

  • Off-Heap: Sometimes heap memory is not enough, especially when we need to cache a lot of data without increasing GC pauses, share cached data between JVMs, or add an in-memory persistence layer that survives JVM crashes. In all of these cases, off-heap memory is one possible solution. As the off-heap store is still kept in memory, it is slightly slower than the on-heap store, but still faster than the disk store (and not subject to GC).

  • Page cache: The page cache lives off-heap and is used to cache the Neo4j data (and native indexes). The caching of graph data and indexes into memory will help avoid costly disk access and result in optimal performance.

While heap and off-heap are general Java terms, page cache refers to Neo4j’s native caching.

Below is a picture of how this all falls into place:

Figure 1: Memory consumption in Neo4j

As you can see above, we can divide Neo4j’s memory consumption into two main areas: on-heap and off-heap.

On-heap is where the runtime data lives, and it’s also where query execution, graph management and transaction state[1] live.

Setting the heap to an optimal value is a tricky task in itself, and this article doesn’t aim to cover that, but rather to help you understand Neo4j’s memory consumption as a whole.

Off-heap can itself be divided into three categories. Not only do we have Neo4j’s page cache (responsible for caching the graph data in memory), but also all the other memory the JVM needs in order to work (JVM internals). The remaining block you see there is direct memory; we’ll get to that in a minute.

In Neo4j there are three memory settings you can configure: initial heap size (-Xms), maximum heap size (-Xmx) and page cache. In reality, the first two affect the same memory space, so Neo4j only lets you impose limits on two of the memory categories mentioned above. Setting the max heap to 4GB and the page cache to another 4GB guarantees that those specific components will not grow beyond that. So how can your process consume more than the values you set? A common misconception is that by setting both heap and page cache, Neo4j’s process memory consumption will not grow beyond their sum, but in practice the memory footprint of Neo4j is larger.
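For example, that scenario corresponds to something like the following in neo4j.conf (the values are illustrative only; the right sizing depends entirely on your workload and available RAM):

dbms.memory.heap.initial_size=4g
dbms.memory.heap.max_size=4g
dbms.memory.pagecache.size=4g

The two heap settings translate into -Xms and -Xmx respectively; keeping them equal avoids heap resizing at runtime.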

Well, as we’ve seen above, the JVM requires some extra memory to function correctly. For example, in a highly concurrent environment, the memory occupied by thread stacks will be roughly equal to the number of concurrent threads the JVM is running times the thread stack size (-Xss).
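As a rough, back-of-the-envelope illustration (the numbers are assumptions, not recommendations): with a typical default stack size of 1MB on a 64-bit JVM, 400 concurrent threads account for roughly 400MB of memory that lives outside the heap and is not covered by -Xmx or the page cache setting.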

There are harder-to-find sources of non-heap memory use, such as buffer pools. Enter direct memory. Direct byte buffers are important for improving performance because they allow native code and Java code to share data without copying it. However, allocating them is expensive, which means byte buffers are usually reused once they’re created. As a result, some frameworks keep them around for the life of the process. One example is Netty. Neo4j uses Netty (an asynchronous event-driven network application framework for rapid development of maintainable high performance protocol servers & clients, https://netty.io/), and that’s by far the primary user of direct memory. It is used predominantly for buffering and IO, but several components in Neo4j make use of it.
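To make direct memory a bit more concrete, here is a minimal sketch of the JDK API involved (plain Java, not Neo4j or Netty code):

import java.nio.ByteBuffer;

public class DirectBufferExample {
    public static void main(String[] args) {
        // Heap buffer: backed by a byte[] on the Java heap, counted against -Xmx and subject to GC
        ByteBuffer heapBuffer = ByteBuffer.allocate(16 * 1024 * 1024);

        // Direct buffer: allocated outside the heap and counted against -XX:MaxDirectMemorySize.
        // Allocation is relatively expensive, which is why frameworks such as Netty pool and reuse them.
        ByteBuffer directBuffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);

        System.out.println("heap buffer direct?   " + heapBuffer.isDirect());
        System.out.println("direct buffer direct? " + directBuffer.isDirect());
    }
}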

In almost all cases, your direct memory usage will not grow to problematic levels, but in very extreme and demanding use cases (i.e. loads of concurrent accesses and updates), you may start seeing Neo4j’s process consume far more memory than what was configured, or you may even get out-of-memory errors regarding direct memory:

2018-11-14 09:32:49.292+0000 ERROR [o.n.b.t.SocketTransportHandler] Fatal error occurred when handling a client connection: failed to allocate 16777216 byte(s) of direct memory (used: 6442450944, max: 6442450944) failed to allocate 16777216 byte(s) of direct memory (used: 6442450944, max: 6442450944)
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 6442450944, max: 6442450944)
at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:624)
at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:578)
at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:718)
at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:707)
...

These symptoms are related to direct memory growth. While we do not manage the memory Netty uses, there is a way to limit the direct memory Neo4j (and any Java process) can use via a JVM setting: -XX:MaxDirectMemorySize. This works in conjunction with dbms.jvm.additional=-Dio.netty.maxDirectMemory=0 in the neo4j.conf file, which forces Netty to respect the JVM’s direct memory settings and thus effectively limits how much direct memory can grow.
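For illustration, the two settings would look like this in neo4j.conf (the 2g limit is an arbitrary placeholder, not a recommendation):

dbms.jvm.additional=-XX:MaxDirectMemorySize=2g
dbms.jvm.additional=-Dio.netty.maxDirectMemory=0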

THESE ARE SENSITIVE SETTINGS THAT WILL AFFECT THE NORMAL FUNCTIONALITY OF NEO4J. Please do not change these settings without consulting Neo4j’s professionals. You can simply log a support ticket if you’re running into issues with direct memory and we’ll advise you as best we can.

Indexes

Depending on whether you are using Lucene or native indexes, the memory they take will live in different places. If you are using Lucene indexes, they will live off-heap, and we have no control over how much memory they use. In the image above, they would live alongside the page cache, but in an unmanaged block.

If you are using native indexes, the memory they take will live inside the page cache, meaning we have some control over how much memory they can use. You should account for this when setting the page cache size.
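If you’re not sure which kind of indexes you have, recent 3.x versions let you check with CALL db.indexes() in Cypher; depending on your version, the output includes the index provider, which tells you whether an index is native or Lucene-backed.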

Monitoring

By now you must have realised that memory configuration is not that trivial. What do you have to make your life easier? You can use Native Memory Tracking, a JVM feature that tracks internal memory usage. To enable it you need to add the following to your neo4j.conf file:

dbms.jvm.additional=-XX:NativeMemoryTracking=detail

Then grab the PID of Neo4j, and use jcmd to print out native memory use for the process using jcmd <PID> VM.native_memory summary. You will get the detailed allocation information for each category in memory, as shown below:

$ jcmd <PID> VM.native_memory summary
Native Memory Tracking:

Total: reserved=3554519KB, committed=542799KB
-                 Java Heap (reserved=2097152KB, committed=372736KB)
                            (mmap: reserved=2097152KB, committed=372736KB)

-                     Class (reserved=1083039KB, committed=38047KB)
                            (classes #5879)
                            (malloc=5791KB #6512)
                            (mmap: reserved=1077248KB, committed=32256KB)

-                    Thread (reserved=22654KB, committed=22654KB)
                            (thread #23)
                            (stack: reserved=22528KB, committed=22528KB)
                            (malloc=68KB #116)
                            (arena=58KB #44)

-                      Code (reserved=251925KB, committed=15585KB)
                            (malloc=2325KB #3622)
                            (mmap: reserved=249600KB, committed=13260KB)

-                        GC (reserved=82398KB, committed=76426KB)
                            (malloc=5774KB #182)
                            (mmap: reserved=76624KB, committed=70652KB)

-                  Compiler (reserved=139KB, committed=139KB)
                            (malloc=9KB #128)
                            (arena=131KB #3)

-                  Internal (reserved=6127KB, committed=6127KB)
                            (malloc=6095KB #7439)
                            (mmap: reserved=32KB, committed=32KB)

-                    Symbol (reserved=9513KB, committed=9513KB)
                            (malloc=6724KB #60789)
                            (arena=2789KB #1)

-    Native Memory Tracking (reserved=1385KB, committed=1385KB)
                            (malloc=121KB #1921)
                            (tracking overhead=1263KB)

-               Arena Chunk (reserved=186KB, committed=186KB)
                            (malloc=186KB)

Usually, the jcmd dump is only moderately useful by itself. It’s more common to take multiple snapshots and compare them by running jcmd <PID> VM.native_memory summary.diff.
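The diff is computed against a baseline that you record first, so a typical workflow looks like this:

$ jcmd <PID> VM.native_memory baseline
# ... let the workload run for a while, then:
$ jcmd <PID> VM.native_memory summary.diff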

This is a great tool for debugging memory problems.

[1] Starting from Neo4j 3.5, transaction state can also be configured to be allocated separately from the heap.