22.11. Memory mapped IO settings

Introduction

Each file in the Neo4j store can use memory mapped I/O for reading/writing. Best performance is achieved if the full file can be memory mapped but if there isn’t enough memory for that Neo4j will try and make the best use of the memory it gets (regions of the file that get accessed often will more likely be memory mapped).

[Important]Important

Neo4j makes heavy use of the java.nio package. Native I/O will result in memory being allocated outside the normal Java heap so that memory usage needs to be taken into consideration. Other processes running on the OS will impact the availability of such memory. Neo4j will require all of the heap memory of the JVM plus the memory to be used for memory mapping to be available as physical memory. Other processes may thus not use more than what is available after the configured memory allocation is made for Neo4j.

A well configured OS with large disk caches will help a lot once we get cache misses in the node and relationship caches. Therefore it is not a good idea to use all available memory as Java heap.

If you look into the directory of your Neo4j database, you will find its store files, all prefixed by neostore:

  • nodestore stores information about nodes
  • relationshipstore holds all the relationships
  • propertystore stores information of properties and all simple properties such as primitive types (both for relationships and nodes)
  • propertystore strings stores all string properties
  • propertystore arrays stores all array properties

There are other files there as well, but they are normally not interesting in this context.

This is how the default memory mapping configuration looks:

neostore.nodestore.db.mapped_memory=25M
neostore.relationshipstore.db.mapped_memory=50M
neostore.relationshipgroupstore.db.mapped_memory=10M
neostore.propertystore.db.mapped_memory=90M
neostore.propertystore.db.strings.mapped_memory=130M
neostore.propertystore.db.arrays.mapped_memory=130M

Optimizing for traversal speed example

To tune the memory mapping settings start by investigating the size of the different store files found in the directory of your Neo4j database. Here is an example of some of the files and sizes in a Neo4j database:

14M neostore.nodestore.db
510M neostore.propertystore.db
1.2G neostore.propertystore.db.strings
304M neostore.relationshipstore.db

In this example the application is running on a machine with 4GB of RAM. We’ve reserved about 2GB for the OS and other programs. The Java heap is set to 1.5GB, that leaves about 500MB of RAM that can be used for memory mapping.

[Tip]Tip

If traversal speed is the highest priority it is good to memory map as much as possible of the node- and relationship stores.

An example configuration on the example machine focusing on traversal speed would then look something like:

neostore.nodestore.db.mapped_memory=15M
neostore.relationshipstore.db.mapped_memory=285M
neostore.relationshipgroupstore.db.mapped_memory=10M
neostore.propertystore.db.mapped_memory=100M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=0M

Batch insert example

Read general information on batch insertion in Chapter 35, Batch Insertion.

The configuration should suit the data set you are about to inject using BatchInsert. Lets say we have a random-like graph with 10M nodes and 100M relationships. Each node (and maybe some relationships) have different properties of string and Java primitive types (but no arrays). The important thing with a random graph will be to give lots of memory to the relationship and node store:

neostore.nodestore.db.mapped_memory=90M
neostore.relationshipstore.db.mapped_memory=3G
neostore.relationshipgroupstore.db.mapped_memory=10M
neostore.propertystore.db.mapped_memory=50M
neostore.propertystore.db.strings.mapped_memory=100M
neostore.propertystore.db.arrays.mapped_memory=0M

The configuration above will fit the entire graph (with exception to properties) in memory.

A rough formula to calculate the memory needed for the nodes:

number_of_nodes * 9 bytes

and for relationships:

number_of_relationships * 33 bytes

Properties will typically only be injected once and never read so a few megabytes for the property store and string store is usually enough. If you have very large strings or arrays you may want to increase the amount of memory assigned to the string and array store files.

An important thing to remember is that the above configuration will need a Java heap of 3.3G+ since in batch inserter mode normal Java buffers that gets allocated on the heap will be used instead of memory mapped ones.