The task of executing a query is decomposed into operators, each of which implements a specific piece of work. The operators are combined into a tree-like structure called an execution plan. Each operator in the execution plan is represented as a node in the tree. Each operator takes as input zero or more rows, and produces as output zero or more rows. This means that the output from one operator becomes the input for the next operator. Operators that join two branches in the tree combine input from two incoming streams and produce a single output.
Evaluation model. Evaluation of the execution plan begins at the leaf nodes of the tree. Leaf nodes have no input rows and generally comprise operators such as scans and seeks. These operators obtain the data directly from the storage engine, thus incurring database hits. Any rows produced by leaf nodes are then piped into their parent nodes, which in turn pipe their output rows to their parent nodes and so on, all the way up to the root node. The root node produces the final results of the query.
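The operator tree and its leaf-to-root flow of rows can be pictured with a small sketch. This is an illustration only, not how Neo4j implements its runtime: the class names `Operator`, `Scan`, `Filter` and `Projection`, the in-memory `storage` list, and the way `db_hits` is counted are all assumptions made for the example.

```python
class Operator:
    """A node in a toy execution plan (hypothetical, for illustration)."""

    def __init__(self, name, *children):
        self.name = name
        self.children = children
        self.rows = 0      # rows this operator has produced
        self.db_hits = 0   # storage accesses made by this operator

    def produce(self):
        raise NotImplementedError

    def run(self):
        # Count every row as it is handed to the parent operator.
        for row in self.produce():
            self.rows += 1
            yield row


class Scan(Operator):
    """Leaf operator: no input rows, reads directly from storage."""

    def __init__(self, storage):
        super().__init__("Scan")
        self.storage = storage

    def produce(self):
        for record in self.storage:
            self.db_hits += 1  # each record read counts as one hit here
            yield dict(record)


class Filter(Operator):
    def __init__(self, child, predicate):
        super().__init__("Filter", child)
        self.predicate = predicate

    def produce(self):
        for row in self.children[0].run():
            if self.predicate(row):
                yield row


class Projection(Operator):
    """Root operator in this example: produces the final result rows."""

    def __init__(self, child, columns):
        super().__init__("Projection", child)
        self.columns = columns

    def produce(self):
        for row in self.children[0].run():
            yield {c: row[c] for c in self.columns}


storage = [
    {"name": "Ann", "age": 41},
    {"name": "Bo", "age": 29},
    {"name": "Cy", "age": 35},
]
scan = Scan(storage)
plan = Projection(Filter(scan, lambda r: r["age"] > 30), ["name"])
result = list(plan.run())
```

In this sketch only the leaf `Scan` touches storage, so it is the only operator whose `db_hits` counter grows; the rows it emits travel through `Filter` to the root `Projection`.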
Eager and lazy evaluation. In general, query evaluation is lazy: most operators pipe their output rows to their parent operators as soon as they are produced. This means that a child operator may not be fully exhausted before the parent operator starts consuming the input rows produced by the child.
However, some operators, such as those used for aggregation and sorting, need to consume all of their input rows before they can produce any output. Such operators must complete execution in their entirety before any rows are sent to their parents as input. These operators are called eager operators, and are denoted as such in Section 7.2, “Execution plan operators at a glance”. Eagerness can cause high memory usage and may therefore be the cause of query performance issues.
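The difference between lazy and eager operators can be sketched with Python generators. The functions `scan`, `lazy_filter` and `eager_sort` and the shared `log` list are hypothetical names used only for this illustration; they are not Neo4j operators.

```python
log = []  # records the order in which work happens


def scan(values):
    # Leaf operator: emits one row at a time.
    for v in values:
        log.append(f"scan {v}")
        yield v


def lazy_filter(rows, predicate):
    # Lazy: each matching row is passed on as soon as it is seen.
    for row in rows:
        if predicate(row):
            log.append(f"filter emits {row}")
            yield row


def eager_sort(rows):
    # Eager: the whole input must be buffered before the first row
    # can be emitted, which is where the memory cost comes from.
    buffered = sorted(rows)
    log.append(f"sort buffered {len(buffered)} rows")
    yield from buffered


# Pull a single row through a lazy pipeline.
log.clear()
first_lazy = next(lazy_filter(scan([3, 1, 2]), lambda v: v != 1))
lazy_log = list(log)

# Pull a single row through a pipeline containing an eager operator.
log.clear()
first_eager = next(eager_sort(scan([3, 1, 2])))
eager_log = list(log)
```

After the lazy pipeline yields its first row, the scan has been asked for only one row. With the eager sort in the pipeline, the scan is fully exhausted before the first row reaches the parent.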
Statistics. Each operator is annotated with statistics.
Page Cache Hits, Page Cache Misses, Page Cache Hit Ratio: A high number of hits and a low number of misses will typically make the query run faster.
Time: Time is only shown for some operators when using the pipelined runtime. The number shown is the time in milliseconds it took to execute the given operator.
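As a small worked example of how the page cache figures relate to each other, a hit ratio can be computed as hits divided by total page cache accesses. The function name `page_cache_hit_ratio` and the choice to return 0.0 when there were no accesses are assumptions for this sketch, not a statement of how Neo4j computes the value it displays.

```python
def page_cache_hit_ratio(hits: int, misses: int) -> float:
    """Fraction of page cache accesses served from memory (assumed definition)."""
    total = hits + misses
    if total == 0:
        # No accesses at all: report 0.0 rather than dividing by zero
        # (an assumption made for this example).
        return 0.0
    return hits / total


ratio = page_cache_hit_ratio(hits=90, misses=10)
```

With 90 hits and 10 misses, nine out of ten page accesses avoided going to disk.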
To produce an efficient plan for a query, the Cypher query planner requires information about the Neo4j database. This information includes which indexes and constraints are available, as well as various statistics maintained by the database. The Cypher query planner uses this information to determine which access patterns will produce the best execution plan.
The statistical information maintained by Neo4j is:
Information about how the statistics are kept up to date, as well as configuration options for managing query replanning and caching, can be found in the Operations Manual → Statistics and execution plans.
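To give a rough feel for how statistics can drive the choice of access pattern, the following toy sketch compares estimated row counts for two candidate leaf operators. The function `choose_access`, its parameters, and the cost model (estimated rows only) are assumptions invented for this example; the real planner's cost model is considerably more involved.

```python
from typing import Optional


def choose_access(label_count: int, index_selectivity: Optional[float] = None) -> str:
    """Pick the candidate leaf operator with the fewest estimated rows.

    label_count: assumed statistic, number of nodes with the label.
    index_selectivity: assumed statistic, fraction of those nodes an
        index lookup is expected to match; None if no index exists.
    """
    # A label scan has to visit every node with the label.
    candidates = {"NodeByLabelScan": float(label_count)}
    if index_selectivity is not None:
        # An index seek is estimated to touch only the matching fraction.
        candidates["NodeIndexSeek"] = label_count * index_selectivity
    return min(candidates, key=candidates.get)


without_index = choose_access(10_000)
with_index = choose_access(10_000, index_selectivity=0.01)
```

Because the decision depends on the numbers passed in, the same query shape can end up with a different leaf operator when the statistics change.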
Chapter 6, Query tuning describes how to tune Cypher queries. In particular, see Section 6.2, “Profiling a query” for how to view the execution plan for a query and Section 6.6, “Planner hints and the USING keyword” for how to use hints to influence the decisions of the planner when building an execution plan for a query.
For a deeper understanding of how each operator works, refer to Section 7.2, “Execution plan operators at a glance” and the linked sections per operator. Please remember that the statistics of the particular database where the queries run will decide the plan used. There is no guarantee that a specific query will always be solved with the same plan.