22.11. Memory mapped IO settings
Each file in the Neo4j store can use memory mapped I/O for reading/writing. Best performance is achieved if the full file can be memory mapped but if there isn’t enough memory for that Neo4j will try and make the best use of the memory it gets (regions of the file that get accessed often will more likely be memory mapped).
Neo4j makes heavy use of the
A well configured OS with large disk caches will help a lot once we get cache misses in the node and relationship caches. Therefore it is not a good idea to use all available memory as Java heap.
If you look into the directory of your Neo4j database, you will find its store files, all prefixed by
nodestorestores information about nodes
relationshipstoreholds all the relationships
propertystorestores information of properties and all simple properties such as primitive types (both for relationships and nodes)
propertystore stringsstores all string properties
propertystore arraysstores all array properties
There are other files there as well, but they are normally not interesting in this context.
This is how the default memory mapping configuration looks:
neostore.nodestore.db.mapped_memory=25M neostore.relationshipstore.db.mapped_memory=50M neostore.relationshipgroupstore.db.mapped_memory=10M neostore.propertystore.db.mapped_memory=90M neostore.propertystore.db.strings.mapped_memory=130M neostore.propertystore.db.arrays.mapped_memory=130M
Optimizing for traversal speed example
To tune the memory mapping settings start by investigating the size of the different store files found in the directory of your Neo4j database. Here is an example of some of the files and sizes in a Neo4j database:
14M neostore.nodestore.db 510M neostore.propertystore.db 1.2G neostore.propertystore.db.strings 304M neostore.relationshipstore.db
In this example the application is running on a machine with 4GB of RAM. We’ve reserved about 2GB for the OS and other programs. The Java heap is set to 1.5GB, that leaves about 500MB of RAM that can be used for memory mapping.
If traversal speed is the highest priority it is good to memory map as much as possible of the node- and relationship stores.
An example configuration on the example machine focusing on traversal speed would then look something like:
neostore.nodestore.db.mapped_memory=15M neostore.relationshipstore.db.mapped_memory=285M neostore.relationshipgroupstore.db.mapped_memory=10M neostore.propertystore.db.mapped_memory=100M neostore.propertystore.db.strings.mapped_memory=100M neostore.propertystore.db.arrays.mapped_memory=0M
Batch insert example
Read general information on batch insertion in Chapter 35, Batch Insertion.
The configuration should suit the data set you are about to inject using BatchInsert. Lets say we have a random-like graph with 10M nodes and 100M relationships. Each node (and maybe some relationships) have different properties of string and Java primitive types (but no arrays). The important thing with a random graph will be to give lots of memory to the relationship and node store:
neostore.nodestore.db.mapped_memory=90M neostore.relationshipstore.db.mapped_memory=3G neostore.relationshipgroupstore.db.mapped_memory=10M neostore.propertystore.db.mapped_memory=50M neostore.propertystore.db.strings.mapped_memory=100M neostore.propertystore.db.arrays.mapped_memory=0M
The configuration above will fit the entire graph (with exception to properties) in memory.
A rough formula to calculate the memory needed for the nodes:
number_of_nodes * 9 bytes
and for relationships:
number_of_relationships * 33 bytes
Properties will typically only be injected once and never read so a few megabytes for the property store and string store is usually enough. If you have very large strings or arrays you may want to increase the amount of memory assigned to the string and array store files.
An important thing to remember is that the above configuration will need a Java heap of 3.3G+ since in batch inserter mode normal Java buffers that gets allocated on the heap will be used instead of memory mapped ones.