Execution plans

Introduction

The task of executing a query is decomposed into operators, each of which implements a specific piece of work. The operators are combined into a tree-like structure called an execution plan. Each operator in the execution plan is represented as a node in the tree. Each operator takes as input zero or more rows, and produces as output zero or more rows. This means that the output from one operator becomes the input for the next operator. Operators that join two branches in the tree combine input from two incoming streams and produce a single output.

Evaluation model

Evaluation of the execution plan begins at the leaf nodes of the tree. Leaf nodes have no input rows and generally comprise operators such as scans and seeks. These operators obtain the data directly from the storage engine, thus incurring database hits. Any rows produced by leaf nodes are then piped into their parent nodes, which in turn pipe their output rows to their parent nodes and so on, all the way up to the root node. The root node produces the final results of the query.

Eager and lazy evaluation

In general, query evaluation is lazy: most operators pipe their output rows to their parent operators as soon as they are produced. This means that a child operator may not be fully exhausted before the parent operator starts consuming the input rows produced by the child.

However, some operators, such as those used for aggregation and sorting, need to aggregate all their rows before they can produce output. Such operators need to complete execution in its entirety before any rows are sent to their parents as input. These operators are called eager operators, and are denoted as such in Execution plan operators at a glance. Eagerness can cause high memory usage and may therefore be the cause of query performance issues.

Statistics

Each operator is annotated with statistics.

Rows

The number of rows that the operator produced. This is only available if the query was profiled.

EstimatedRows

This is the estimated number of rows that is expected to be produced by the operator. The estimate is an approximate number based on the available statistical information. The compiler uses this estimate to choose a suitable execution plan.

DbHits

Each operator will ask the Neo4j storage engine to do work such as retrieving or updating data. A database hit is an abstract unit of this storage engine work. The actions triggering a database hit are listed in Database hits (DbHits).

Page Cache Hits, Page Cache Misses, Page Cache Hit Ratio

These metrics are only shown for some queries when using Neo4j Enterprise Edition. The page cache is used to cache data and avoid accessing disk, so having a high number of hits and a low number of misses will typically make the query run faster.

Time

Time is only shown for some operators when using the pipelined runtime. The number shown is the time in milliseconds it took to execute the given operator.

To produce an efficient plan for a query, the Cypher® query planner requires information about the Neo4j database. This information includes which indexes and constraints are available, as well as various statistics maintained by the database. The Cypher query planner uses this information to determine which access patterns will produce the best execution plan.

The statistical information maintained by Neo4j is:

  1. The number of nodes having a certain label.

  2. The number of relationships by type.

  3. Selectivity per index.

  4. The number of relationships by type, ending with or starting from a node with a specific label.

Information about how the statistics are kept up to date, as well as configuration options for managing query replanning and caching, can be found in the Operations Manual → Statistics and execution plans.

Query tuning describes how to tune Cypher queries. In particular, see Query tuning for how to view the execution plan for a query and Planner hints and the USING keyword for how to use hints to influence the decisions of the planner when building an execution plan for a query.

For a deeper understanding of how each operator works, refer to Execution plan operators at a glance and the linked sections per operator. Please remember that the statistics of the particular database where the queries run will decide the plan used. There is no guarantee that a specific query will always be solved with the same plan.

Execution plan operators at a glance

This table comprises all the execution plan operators ordered lexicographically.

  • Leaf operators, in most cases, locate the starting nodes and relationships required in order to execute the query.

  • Updating operators are used in queries that update the graph.

  • Eager operators accumulate all their rows before piping them to the next operator.

Name Description Leaf? Updating? Considerations

AllNodesScan

Reads all nodes from the node store.

Y

AntiConditionalApply

Performs a nested loop. If a variable is null, the right-hand side will be executed.

AntiSemiApply

Performs a nested loop. Tests for the absence of a pattern predicate.

Apply

Performs a nested loop. Yields rows from both the left-hand and right-hand side operators.

Argument

Indicates the variable to be used as an argument to the right-hand side of an Apply operator.

Y

AssertSameNode

Used to ensure that no unique constraints are violated.

CacheProperties

Reads node or relationship properties and caches them.

CartesianProduct

Produces a cartesian product of the inputs from the left-hand and right-hand operators.

ConditionalApply

Performs a nested loop. If a variable is not null, the right-hand side will be executed.

CreateIndex

Creates an index on a property for all nodes having a certain label.

Y

Y

CreateNodeKeyConstraint

Creates a node key constraint on a set of properties for all nodes having a certain label.

Y

Y

Create

Creates nodes and relationships.

Y

CreateNodePropertyExistenceConstraint

Creates an existence constraint on a property for all nodes having a certain label.

Y

Y

CreateRelationshipPropertyExistenceConstraint

Creates an existence constraint on a property for all relationships of a certain type.

Y

Y

CreateUniqueConstraint

Creates a unique constraint on a property for all nodes having a certain label.

Y

Y

Delete

Deletes a node or relationship.

Y

DetachDelete

Deletes a node and its relationships.

Y

DirectedRelationshipByIdSeek

Reads one or more relationships by id from the relationship store.

Y

Distinct

Drops duplicate rows from the incoming stream of rows.

Eager

DropIndex (deprecated syntax)

Drops an index from a property for all nodes having a certain label.

Y

Y

DropIndex (new syntax)

Drops an index using its name.

Y

Y

DropConstraint

Drops a constraint using its name.

Y

Y

DropNodeKeyConstraint

Drops a node key constraint from a set of properties for all nodes having a certain label.

Y

Y

DropNodePropertyExistenceConstraint

Drops an existence constraint from a property for all nodes having a certain label.

Y

Y

DropRelationshipPropertyExistenceConstraint

Drops an existence constraint from a property for all relationships of a certain type.

Y

Y

DropResult

Produces zero rows when an expression is guaranteed to produce an empty result.

DropUniqueConstraint

Drops a unique constraint from a property for all nodes having a certain label.

Y

Y

Eager

For isolation purposes, Eager ensures that operations affecting subsequent operations are executed fully for the whole dataset before continuing execution.

Eager

EagerAggregation

Evaluates a grouping expression.

Eager

EmptyResult

Eagerly loads all incoming data and discards it.

EmptyRow

Returns a single row with no columns.

Y

Expand(All)

Traverses incoming or outgoing relationships from a given node.

Expand(Into)

Finds all relationships between two nodes.

Filter

Filters each row coming from the child operator, only passing through rows that evaluate the predicates to true.

Foreach

Performs a nested loop. Yields rows from the left-hand operator and discards rows from the right-hand operator.

LetAntiSemiApply

Performs a nested loop. Tests for the absence of a pattern predicate in queries containing multiple pattern predicates.

LetSelectOrSemiApply

Performs a nested loop. Tests for the presence of a pattern predicate that is combined with other predicates.

LetSelectOrAntiSemiApply

Performs a nested loop. Tests for the absence of a pattern predicate that is combined with other predicates.

LetSemiApply

Performs a nested loop. Tests for the presence of a pattern predicate in queries containing multiple pattern predicates.

Limit

Returns the first 'n' rows from the incoming input.

LoadCSV

Loads data from a CSV source into the query.

Y

LockNodes

Locks the start and end node when creating a relationship.

MergeCreateNode

Creates the node when failing to find the node.

Y

Y

MergeCreateRelationship

Creates the relationship when failing to find the relationship.

Y

NodeByIdSeek

Reads one or more nodes by id from the node store.

Y

NodeByLabelScan

Fetches all nodes with a specific label from the node label index.

Y

NodeCountFromCountStore

Uses the count store to answer questions about node counts.

Y

NodeHashJoin

Executes a hash join on node ids.

Eager

NodeIndexContainsScan

Examines all values stored in an index, searching for entries containing a specific string.

Y

NodeIndexEndsWithScan

Examines all values stored in an index, searching for entries ending in a specific string.

Y

NodeIndexScan

Examines all values stored in an index, returning all nodes with a particular label having a specified property.

Y

NodeIndexSeek

Finds nodes using an index seek.

Y

NodeIndexSeekByRange

Finds nodes using an index seek where the value of the property matches the given prefix string.

Y

NodeLeftOuterHashJoin

Executes a left outer hash join.

Eager

NodeRightOuterHashJoin

Executes a right outer hash join.

Eager

NodeUniqueIndexSeek

Finds nodes using an index seek within a unique index.

Y

NodeUniqueIndexSeekByRange

Finds nodes using an index seek within a unique index where the value of the property matches the given prefix string.

Y

OrderedAggregation

Like EagerAggregation but relies on the ordering of incoming rows. Is not eager.

OrderedDistinct

Like Distinct but relies on the ordering of incoming rows.

Optional

Yields a single row with all columns set to null if no data is returned by its source.

OptionalExpand(All)

Traverses relationships from a given node, producing a single row with the relationship and end node set to null if the predicates are not fulfilled.

OptionalExpand(Into)

Traverses all relationships between two nodes, producing a single row with the relationship and end node set to null if no matching relationships are found (the start node will be the node with the smallest degree).

PartialSort

Sorts a row by multiple columns if there is already an ordering.

PartialTop

Returns the first 'n' rows sorted by multiple columns if there is already an ordering.

ProcedureCall

Calls a procedure.

ProduceResults

Prepares the result so that it is consumable by the user.

ProjectEndpoints

Projects the start and end node of a relationship.

Projection

Evaluates a set of expressions, producing a row with the results thereof.

Y

RelationshipCountFromCountStore

Uses the count store to answer questions about relationship counts.

Y

RemoveLabels

Deletes labels from a node.

Y

RollUpApply

Performs a nested loop. Executes a pattern expression or pattern comprehension.

SelectOrAntiSemiApply

Performs a nested loop. Tests for the absence of a pattern predicate if an expression predicate evaluates to false.

SelectOrSemiApply

Performs a nested loop. Tests for the presence of a pattern predicate if an expression predicate evaluates to false.

SemiApply

Performs a nested loop. Tests for the presence of a pattern predicate.

SetLabels

Sets labels on a node.

Y

SetNodePropertiesFromMap

Sets properties from a map on a node.

Y

SetProperty

Sets a property on a node or relationship.

Y

SetRelationshipPropertiesFromMap

Sets properties from a map on a relationship.

Y

Skip

Skips 'n' rows from the incoming rows.

Sort

Sorts rows by a provided key.

Eager

Top

Returns the first 'n' rows sorted by a provided key.

Eager

TriadicSelection

Solves triangular queries, such as the very common 'find my friend-of-friends that are not already my friend'.

UndirectedRelationshipByIdSeek

Reads one or more relationships by id from the relationship store.

Y

Union

Concatenates the results from the right-hand operator with the results from the left-hand operator.

Unwind

Returns one row per item in a list.

ValueHashJoin

Executes a hash join on arbitrary values.

Eager

VarLengthExpand(All)

Traverses variable-length relationships from a given node.

VarLengthExpand(Into)

Finds all variable-length relationships between two nodes.

VarLengthExpand(Pruning)

Traverses variable-length relationships from a given node and only returns unique end nodes.

Database hits (DbHits)

Each operator will send a request to the storage engine to do work such as retrieving or updating data. A database hit is an abstract unit of this storage engine work.

We list below all the actions that trigger one or more database hits:

  • Create actions

    • Create a node

    • Create a relationship

    • Create a new node label

    • Create a new relationship type

    • Create a new ID for property keys with the same name

  • Delete actions

    • Delete a node

    • Delete a relationship

  • Update actions

    • Set one or more labels on a node

    • Remove one or more labels from a node

  • Node-specific actions

    • Get a node by its ID

    • Get the degree of a node

    • Determine whether a node is dense

    • Determine whether a label is set on a node

    • Get the labels of a node

    • Get a property of a node

    • Get an existing node label

    • Get the name of a label by its ID, or its ID by its name

  • Relationship-specific actions

    • Get a relationship by its ID

    • Get a property of a relationship

    • Get an existing relationship type

    • Get a relationship type name by its ID, or its ID by its name

  • General actions

    • Get the name of a property key by its ID, or its ID by the key name

    • Find a node or relationship through an index seek or index scan

    • Find a path in a variable-length expand

    • Find a shortest path

    • Ask the count store for a value

  • Schema actions

    • Add an index

    • Drop an index

    • Get the reference of an index

    • Create a constraint

    • Drop a constraint

  • Call a procedure

  • Call a user-defined function

The presented value can vary slightly depending on the Cypher runtime that was used to execute the query. In the pipelined runtime the number of database hits will typically be higher since it uses a more accurate way of measuring.