Vector index memory configuration

Vector indexes are based on Lucene. Their index files are cached by the operating system’s filesystem cache rather than by the Neo4j page cache, as described in the Memory configuration section. For vector indexes, you must ensure that there is sufficient memory for Neo4j (JVM heap and Neo4j page cache) and that enough RAM remains available for the operating system’s filesystem cache. If insufficient RAM is left for that cache, the OS reads data from disk more often and vector search performance degrades. Under broader memory pressure, the OS may also start swapping. Tools such as iotop, or equivalent Linux I/O monitoring tools, can help you understand disk I/O usage.

Optimal memory configuration for Neo4j with vector indexes

To estimate the operating system’s filesystem cache requirements of a vector index, two quantities are meaningful:

  • Logical vector data size — The bytes of vector data being indexed, independent of storage and indexing overhead.

  • Physical vector index size — The size on disk of the populated vector index.

Either can be used as the starting point for estimating the memory requirements for vector search.

The logical vector data size is:

dimension precision x dimension count x vector count

For Lucene-backed vector indexes, the dimension precision is 4 bytes. For example, 1,000,000 vectors with 768 dimensions have a logical vector data size of 3.072 GB.
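As a quick sanity check, the logical vector data size can be computed with a few lines of Python (the function name is illustrative; the 4-byte precision corresponds to FLOAT32 vectors):

```python
def logical_vector_data_size(dimension_count: int, vector_count: int,
                             dimension_precision: int = 4) -> int:
    """Bytes of raw vector data: precision x dimensions x vectors."""
    return dimension_precision * dimension_count * vector_count

# 1,000,000 vectors with 768 FLOAT32 dimensions:
size_bytes = logical_vector_data_size(768, 1_000_000)
print(size_bytes / 1e9)  # 3.072 (decimal GB)
```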

The physical vector index size can either be measured after the index is built by checking its size on disk, or estimated in advance. The estimate depends on whether quantization is enabled (default true, see vector.quantization.enabled).

The following estimates are per vector, where HNSW_M is the configured vector.hnsw.m value (default 16):

  • With quantization — Approximately 1.1 x (1.25 x dimension precision x dimension count + 8 x HNSW_M) bytes

  • Without quantization — Approximately 1.1 x (dimension precision x dimension count + 8 x HNSW_M) bytes

To estimate the full physical vector index size, multiply the per-vector estimate by the vector count.
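The per-vector estimates above can be sketched as a small Python function (the function and parameter names are illustrative):

```python
def physical_index_size_bytes(dimension_count: int, vector_count: int,
                              hnsw_m: int = 16, quantized: bool = True,
                              dimension_precision: int = 4) -> float:
    """Approximate on-disk size of a populated vector index, per the estimates above."""
    vector_bytes = dimension_precision * dimension_count
    if quantized:
        vector_bytes *= 1.25  # quantized estimate carries the extra 1.25 factor
    per_vector = 1.1 * (vector_bytes + 8 * hnsw_m)
    return per_vector * vector_count

# 10,000,000 vectors with 768 dimensions, quantization enabled, HNSW_M = 16:
print(physical_index_size_bytes(768, 10_000_000) / 1e9)  # ~43.648 GB
```

For the same index without quantization, the same formula gives about 35.2 GB.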

As a starting point, for a quantized index allow about 40% of the physical vector index size for the operating system’s filesystem cache; for an unquantized index, allow about 100% of the physical vector index size.

The overall memory configuration needs to account for Neo4j heap, Neo4j page cache, OS filesystem cache for vector indexes, and other OS memory.
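A simple budget illustrates how these components compete for the same RAM. All figures below are assumptions for illustration, not recommendations:

```python
# Illustrative budget for a 64 GB server (all figures are assumptions).
total_ram_gb = 64
heap_gb = 8            # Neo4j JVM heap
page_cache_gb = 16     # Neo4j page cache
os_reserved_gb = 2     # OS and other processes

# Whatever remains is available to the OS filesystem cache,
# which is what serves the Lucene-backed vector index files.
fs_cache_gb = total_ram_gb - heap_gb - page_cache_gb - os_reserved_gb
print(fs_cache_gb)  # 38
```

If the remaining filesystem cache is smaller than the estimate for the vector index, expect more disk reads during vector search.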

Neo4j page cache depends on vector access pattern

  • Vectors are indexed and searched, but not returned or reused: Neo4j page cache can be sized mainly for the rest of the graph and any other accessed properties. The vector property values do not need to stay hot in the Neo4j page cache if the query does not read them back from the graph store.

  • Vectors are returned, re-ranked, or reused after the index lookup: Neo4j page cache must also account for reading the vector property values from the graph store. Size the page cache allocation accordingly.

Example calculations

The following example shows how to estimate the vector-specific storage and memory requirements for this scenario:

  • 10 million vectors with 768 dimensions of type FLOAT32

  • default vector index settings, with quantization enabled and vector.hnsw.m = 16

  • vectors are indexed and searched, but are not returned or reused after the index lookup

This example calculates the vector-specific disk footprint and the OS filesystem cache for the index. Neo4j heap, Neo4j page cache, and other OS memory still depend on the rest of the graph and the query workload.

Table 1. Disk storage requirements

  • Logical vector data size: 10M vectors x 4 bytes per dimension x 768 dimensions / 1,000,000,000 = 30.72 GB

  • Physical vector index size (single index): (10M vectors x 1.1 x (1.25 x 4 bytes per dimension x 768 dimensions + 8 bytes x 16)) / 1,000,000,000 = 43.648 GB

  • Total vector-related footprint: logical vector data size + physical vector index size = 74.368 GB
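The table values can be reproduced with a short calculation (a sketch; GB here means decimal gigabytes, 10^9 bytes):

```python
GB = 1_000_000_000  # decimal gigabyte

logical = 10_000_000 * 4 * 768                            # 30.72 GB
physical = 10_000_000 * 1.1 * (1.25 * 4 * 768 + 8 * 16)   # 43.648 GB
total = logical + physical                                # 74.368 GB
fs_cache = 0.4 * physical                                 # ~17.46 GB (quantized starting point)

print(round(logical / GB, 3), round(physical / GB, 3),
      round(total / GB, 3), round(fs_cache / GB, 2))
```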

Table 2. Memory allocation components

  • Neo4j heap: workload dependent. Use neo4j-admin server memory-recommendation for an initial recommendation, then adjust based on query concurrency and application behavior.

  • Neo4j page cache: workload dependent. Use page cache capacity planning and neo4j-admin server memory-recommendation as starting points, then account for the rest of the graph and any non-vector properties that the workload reads.

  • OS filesystem cache for the index: 40% of the physical vector index size = 17.46 GB.

  • Other OS memory: workload dependent. Reserve memory for the operating system and other processes.

AuraDB instance sizing

For AuraDB-specific vector sizing and vector-optimized instance guidance, see Aura → Vector optimization.

Warming up the vector index

The vector index is loaded into the operating system’s filesystem cache as it is queried. The first queries may need to read more of the Lucene index from disk. As more of the index becomes resident in the filesystem cache, later queries avoid more disk reads and performance becomes more consistent.

Warm-up can therefore be done by issuing random queries against the index before serving production traffic. The number of queries required depends on the size of the index, the amount of RAM available, and how representative the warm-up queries are. As a starting point:

  • for a smaller index (up to 1M entries), around five random queries have worked well in testing

  • for a larger index, start with around 100 random queries and adjust based on observed disk I/O and latency on the target system
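The warm-up loop above can be scripted. The sketch below generates random unit vectors of the index’s dimensionality and issues them as searches via Neo4j’s db.index.vector.queryNodes procedure; run_query is a placeholder for however your client executes Cypher (for example, a driver session’s run method), and the index name and k value are illustrative:

```python
import random

def random_unit_vector(dim: int, rng: random.Random) -> list[float]:
    """Random vector on the unit sphere, a simple stand-in for warm-up queries."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v]

def warm_up(run_query, index_name: str, dim: int, n_queries: int, k: int = 10) -> None:
    """Issue n_queries random vector searches against the index."""
    rng = random.Random(0)  # fixed seed so warm-up runs are repeatable
    for _ in range(n_queries):
        run_query(
            "CALL db.index.vector.queryNodes($index, $k, $vector)",
            index=index_name, k=k, vector=random_unit_vector(dim, rng),
        )

# Example with the Neo4j Python driver (names are illustrative):
# warm_up(session.run, "my-vector-index", 768, 100)
```

Representative production queries, where available, warm the cache more accurately than purely random vectors.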

Use representative queries and monitor disk read activity during warm-up. Tools such as iotop, or equivalent Linux I/O monitoring tools, can show whether the operating system is still reading heavily from disk.

Because Lucene-backed vector index files rely on the operating system’s filesystem cache, a server restart clears that cache. After a restart, rolling restart, patch, or upgrade, the index may need to be warmed up again.