Planning and sizing
Enterprise Edition | Not available on Aura | Introduced in 2025.10
Planning the topology of a sharded property database
The sharded property database is deployed into a Neo4j cluster with the graph shard being a regular Raft group. This means that you should deploy the graph shard cluster with a topology consisting of at least three servers hosting databases in primary mode (read and write, RW) for high availability. Additional primaries may be added to support a higher fault tolerance. If high availability is not required, you can create a database with a single primary host for minimum write latency and cost efficiency.
Databases in secondary mode can be added to the graph shard to scale out read workloads. Secondaries act like caches for the graph data and are fully capable of executing read-only (RO) queries and procedures.
Replicas contain the property data.
The property data is replicated from the databases in primary mode (RW) via transaction log shipping.
Replicas periodically poll an upstream server for new transactions and have these shipped over.
They are not in a Raft group and do not have the same high availability features as the graph shards.
To achieve high availability of the replicas containing the property shards, it is recommended to have multiple replicas per property shard.
The fault tolerance for a property shard replica is calculated with the formula M = F + 1, where M is the number of replicas required to tolerate F faults.
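As a rough illustration of the formula above, the following Python sketch computes how many replicas a property shard needs for a given number of tolerated faults. The function names are hypothetical and are not part of any Neo4j tooling. The second helper applies the general Raft majority rule (2F + 1 members to tolerate F faults), which this page does not state explicitly but which is consistent with the minimum of three primaries recommended for the graph shard.

```python
def replicas_for_faults(faults: int) -> int:
    """Replicas per property shard needed to tolerate `faults` failures (M = F + 1)."""
    if faults < 0:
        raise ValueError("faults must be non-negative")
    return faults + 1


def primaries_for_faults(faults: int) -> int:
    """Graph shard primaries needed to tolerate `faults` failures.

    Assumption: standard Raft majority rule, M = 2F + 1.
    """
    if faults < 0:
        raise ValueError("faults must be non-negative")
    return 2 * faults + 1


if __name__ == "__main__":
    # Tolerating one failure: two replicas per property shard, three primaries.
    print(replicas_for_faults(1), primaries_for_faults(1))
```

With one tolerated fault, this gives two replicas per property shard and three graph shard primaries.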
The following diagram illustrates a sample architecture of a high availability property sharding deployment, which comprises a graph shard, graph shard secondaries, and 4 property shards with 2 replicas for each property shard:
Planning the sizing of a sharded property database
Property sharding relies on the capabilities provided by Neo4j clustering for managing and sizing the infrastructure. More specifically:
- Some servers can be associated with the graph shard databases. These servers can be further separated and restricted to hosting either the primary or the secondary members of the cluster.
- Other servers can be associated with the property shard databases. It is important to consider the number of available servers, along with the number of shards and replicas (i.e., multiple copies of the same shard for high availability and read scalability).
- Data in sharded property databases is evenly distributed. It is recommended to keep each database at a size that suits the available hardware and allows administrative operations to run relatively smoothly. For example, on commodity virtual or physical hardware, aim to keep each database below 1 to 3 TB.
- If a sharded property database starts relatively small and is expected to grow over time, it is recommended to create more property shards than initially needed. These shards may initially be co-located on the same server.
- As the database grows in size, additional servers may be added to allow hardware resharding of the database. This administrative change can be performed during normal online operations.
- Database resharding, i.e., changing the number of property shards, can be executed offline using the neo4j-admin database copy command. See Splitting an existing database into shards.
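The sizing guidance in the list above can be turned into a first-pass estimate. The following Python sketch assumes that property data is evenly distributed across shards (as stated above) and that the per-database size limit and the growth factor are values you choose for your own hardware. The function names, parameters, and default values are hypothetical and serve only as an illustration.

```python
import math


def property_shards_needed(total_property_data_tb: float,
                           max_shard_tb: float = 2.0,
                           growth_factor: float = 1.0) -> int:
    """Estimate the number of property shards so that each stays under max_shard_tb.

    Assumptions: data is spread evenly across shards; growth_factor (>= 1)
    is headroom for expected growth; max_shard_tb is a limit you pick within
    the 1 to 3 TB range suggested for commodity hardware.
    """
    if total_property_data_tb < 0:
        raise ValueError("total_property_data_tb must be non-negative")
    if max_shard_tb <= 0:
        raise ValueError("max_shard_tb must be positive")
    if growth_factor < 1:
        raise ValueError("growth_factor must be at least 1")
    projected_tb = total_property_data_tb * growth_factor
    return max(1, math.ceil(projected_tb / max_shard_tb))


def property_shard_databases(shards: int, replicas_per_shard: int) -> int:
    """Total property shard database copies that the servers must host."""
    if shards < 1 or replicas_per_shard < 1:
        raise ValueError("shards and replicas_per_shard must be at least 1")
    return shards * replicas_per_shard


if __name__ == "__main__":
    shards = property_shards_needed(10, max_shard_tb=3, growth_factor=1.5)
    print(shards, property_shard_databases(shards, replicas_per_shard=2))
```

For example, 10 TB of property data with 50% growth headroom and a 3 TB limit per database yields five shards, or ten database copies at two replicas per shard. Treat the result as a starting point for planning only.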
The block format (see Store formats) is required for both the graph shard and the property shard. For accurate sizing estimation, contact your Neo4j representative for assistance.