When you issue a Cypher query, it is compiled to an execution plan (see Section 3.7, “Execution plans”) that can run and answer your question. To produce an efficient plan for your query, Neo4j needs information about your database, such as the schema: which indexes and constraints exist? Neo4j also uses statistical information it keeps about your database to optimize the execution plan. With this information, Neo4j can decide which access pattern leads to the best-performing plan.
The statistical information that Neo4j keeps is:
Neo4j keeps the statistics up to date in two different ways. Label counts, for example, are updated whenever you set or remove a label on a node. For indexes, Neo4j needs to scan the full index to produce the selectivity number. Since this is potentially a very time-consuming operation, these numbers are collected in the background once enough of the data in the index has changed.
Execution plans are cached and will not be replanned until the statistical information used to produce the plan has changed. The following configuration options allow you to control how sensitive replanning should be to updates of the database.
Controls whether indexes will automatically be re-sampled when they have been updated enough. The Cypher query planner depends on accurate statistics to create efficient plans, so it is important that they are kept up to date as the database evolves.
If background sampling is turned off, make sure to trigger manual sampling when data has been updated.
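Index resampling can be triggered using two built-in procedures: db.resampleIndex(), which resamples a single index, and db.resampleOutdatedIndexes(), which resamples all outdated indexes.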
Here is an example of using cypher-shell to trigger resampling.
> cypher-shell 'CALL db.resampleIndex(":Person(name)");'
> cypher-shell 'CALL db.resampleOutdatedIndexes();'
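If background sampling is turned off and several indexes need resampling after a large update, the calls above could be wrapped in a small script. The following is a minimal sketch, not part of Neo4j: the helper resample_query, the INDEXES list, the DRY_RUN switch, and the :Person(email) index are hypothetical names introduced for illustration. It assumes cypher-shell is on the PATH and can connect without extra options, as in the example above.

```shell
#!/bin/sh
# Sketch: resample a list of indexes after a bulk update.
# resample_query, INDEXES and DRY_RUN are hypothetical names, not Neo4j features.

# Build the Cypher statement that resamples one index, e.g. ":Person(name)".
resample_query() {
  printf 'CALL db.resampleIndex("%s");' "$1"
}

# Space-separated index patterns; ":Person(email)" is an assumed example index.
INDEXES=':Person(name) :Person(email)'

for index in $INDEXES; do
  query=$(resample_query "$index")
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only print the command that would be run.
    echo "cypher-shell '$query'"
  else
    # Run the resampling call against the database.
    cypher-shell "$query"
  fi
done
```

By default the script only prints the commands it would run. Setting DRY_RUN=0 executes them. To refresh every index that Neo4j considers outdated, a single call to db.resampleOutdatedIndexes() is simpler than listing indexes by hand.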