Aggregating functions

An aggregating function performs a calculation over a set of values, returning a single value. Aggregation can be computed over all the matching paths, or it can be further divided by introducing grouping keys.

To learn more about how Cypher® handles aggregations performed on zero rows, visit Neo4j Knowledge Base → Understanding aggregations on zero rows.

Example graph

The following graph is used for the examples below:

To recreate the graph, run the following query against an empty Neo4j database:

CREATE
  (keanu:Person {name: 'Keanu Reeves', age: 58}),
  (liam:Person {name: 'Liam Neeson', age: 70}),
  (carrie:Person {name: 'Carrie Anne Moss', age: 55}),
  (guy:Person {name: 'Guy Pearce', age: 55}),
  (kathryn:Person {name: 'Kathryn Bigelow', age: 71}),
  (speed:Movie {title: 'Speed'}),
  (keanu)-[:ACTED_IN]->(speed),
  (keanu)-[:KNOWS]->(carrie),
  (keanu)-[:KNOWS]->(liam),
  (keanu)-[:KNOWS]->(kathryn),
  (carrie)-[:KNOWS]->(guy),
  (liam)-[:KNOWS]->(guy)

avg()

Details
Syntax	`avg(input)`
Description	Returns the average of a set of `INTEGER`, `FLOAT` or `DURATION` values.
Arguments	Name	Type	Description
	`input`	`INTEGER \| FLOAT \| DURATION`	A value aggregated to form an average.
Returns	`INTEGER \| FLOAT \| DURATION`

Considerations
Any `null` values are excluded from the calculation.
`avg(null)` returns `null`.

Example 1. avg() - numerical values

Query

MATCH (p:Person)
RETURN avg(p.age)

The average of all the values in the property age is returned:

Result
avg(p.age)
`61.8`
Rows: 1

Example 2. avg() - duration values

Query

UNWIND [duration('P2DT3H'), duration('PT1H45S')] AS dur
RETURN avg(dur)

The average of the two supplied DURATION values is returned:

Result
avg(dur)
`P1DT2H22.5S`
Rows: 1

collect()

Details
Syntax	`collect(input)`
Description	Returns a list containing the values returned by an expression.
Arguments	Name	Type	Description
	`input`	`ANY`	A value aggregated into a list.
Returns	`LIST<ANY>`

Considerations
Any `null` values are ignored and will not be added to the list.
`collect(null)` returns an empty list.

Example 3. collect()

Query

MATCH (p:Person)
RETURN collect(p.age)

All the values are collected and returned in a single list:

Result
collect(p.age)
`[58, 70, 55, 55, 71]`
Rows: 1

collect_list()
Cypher 25 onlyIntroduced in Neo4j 2026.02

Details
Syntax	`collect_list(input)`
Description	Returns a list containing the values returned by an expression.
Arguments	Name	Type	Description
	`input`	`ANY`	A value aggregated into a list.
Returns	`LIST<ANY>`

This function is an alias to the collect() function, and it was introduced as part of Cypher®'s GQL conformance.

Considerations
Any `null` values are ignored and will not be added to the list.
`collect_list(null)` returns an empty list.

Example 4. collect_list()

Query

MATCH (p:Person)
RETURN collect_list(p.age)

All the values are collected and returned in a single list:

Result
collect_list(p.age)
`[58, 70, 55, 55, 71]`
Rows: 1

count()

Details
Syntax	`count(input)`
Description	Returns the number of values or rows.
Arguments	Name	Type	Description
	`input`	`ANY`	A value to be aggregated.
Returns	`INTEGER`

Considerations
`count(*)` includes rows returning `null`.
`count(input)` ignores `null` values.
`count(null)` returns `0`.

Neo4j maintains a transactional count store for holding count metadata, which can significantly increase the speed of queries using the count() function. For more information about the count store, refer to Neo4j Knowledge Base → Fast counts using the count store.

Using `count(*)` to return the number of nodes

The function count(*) can be used to return the number of nodes; for example, the number of nodes connected to a node n.

Example 5. count()

Query

MATCH (p:Person {name: 'Keanu Reeves'})-->(x)
RETURN labels(p), p.age, count(*)

The labels and age property of the start node Keanu Reeves and the number of nodes related to it are returned:

Result
labels(p)	p.age	count(*)
`["Person"]`	`58`	`4`
Rows: 1

Using `count(*)` to group and count relationship types

The function count(*) can be used to group the type of matched relationships and return the number of types.

Example 6. count()

Query

MATCH (p:Person {name: 'Keanu Reeves'})-[r]->()
RETURN type(r), count(*)

The type of matched relationships are grouped and the group count of relationship types is returned:

Result
type(r)	count(*)
`"ACTED_IN"`	`1`
`"KNOWS"`	`3`
Rows: 2

Counting non-`null` values

Instead of simply returning the number of rows with count(*), the function count(expression) can be used to return the number of non-null values returned by the expression.

Example 7. count()

Query

MATCH (p:Person)
RETURN count(p.age)

The number of nodes with the label Person and a property age is returned: (To calculate the sum, use sum(n.age))

Result
count(p.age)
`5`
Rows: 1

Counting with and without duplicates

The default behavior of the count function is to count all matching results, including duplicates. To avoid counting duplicates, use the DISTINCT keyword.

It is also possible to use the ALL keyword with aggregating functions. This will count all results, including duplicates, and is functionally the same as not using the DISTINCT keyword. The ALL keyword was introduced as part of Cypher®'s GQL conformance.

This example tries to find all friends of friends of Keanu Reeves and count them. It shows the behavior of using both the ALL and the DISTINCT keywords:

Example 8. count()

Query

MATCH (p:Person)-->(friend:Person)-->(friendOfFriend:Person)
WHERE p.name = 'Keanu Reeves'
RETURN friendOfFriend.name, count(friendOfFriend), count(ALL friendOfFriend), count(DISTINCT friendOfFriend)

The nodes Carrie Anne Moss and Liam Neeson both have an outgoing KNOWS relationship to Guy Pearce. The Guy Pearce node will, therefore, get counted twice when not using DISTINCT.

Result
friendOfFriend.name	count(friendOfFriend)	count(ALL friendOfFriend)	count(DISTINCT friendOfFriend)
`"Guy Pearce"`	`2`	`2`	`1`
Rows: 1

max()

Details
Syntax	`max(input)`
Description	Returns the maximum value in a set of values.
Arguments	Name	Type	Description
	`input`	`ANY`	A value to be aggregated.
Returns	`ANY`

Considerations
Any `null` values are excluded from the calculation.
In a mixed set, any numeric value is always considered to be higher than any `STRING` value, and any `STRING` value is always considered to be higher than any LIST<ANY>`.
Lists are compared in dictionary order, i.e. list elements are compared pairwise in ascending order from the start of the list to the end.
`max(null)` returns `null`.

Example 9. max()

Query

UNWIND [1, 'a', null, 0.2, 'b', '1', '99'] AS val
RETURN max(val)

The highest of all the values in the mixed set — in this case, the numeric value 1 — is returned:

Result
max(val)
`1`
Rows: 1

The value '99' (a STRING), is considered to be a lower value than 1 (an INTEGER), because '99' is a STRING.

Example 10. max()

Query

UNWIND [[1, 'a', 89], [1, 2]] AS val
RETURN max(val)

The highest of all the lists in the set — in this case, the list [1, 2] — is returned, as the number 2 is considered to be a higher value than the STRING 'a', even though the list [1, 'a', 89] contains more elements.

Result
max(val)
`[1,2]`
Rows: 1

Example 11. max()

Query

MATCH (p:Person)
RETURN max(p.age)

The highest of all the values in the property age is returned:

Result
max(p.age)
`71`
Rows: 1

min()

Details
Syntax	`min(input)`
Description	Returns the minimum value in a set of values.
Arguments	Name	Type	Description
	`input`	`ANY`	A value to be aggregated.
Returns	`ANY`

Considerations
Any `null` values are excluded from the calculation.
In a mixed set, any `STRING` value is always considered to be lower than any numeric value, and any `LIST<ANY>` is always considered to be lower than any `STRING`.
Lists are compared in dictionary order, i.e. list elements are compared pairwise in ascending order from the start of the list to the end.
`min(null)` returns `null`.

Example 12. min()

Query

UNWIND [1, 'a', null, 0.2, 'b', '1', '99'] AS val
RETURN min(val)

The lowest of all the values in the mixed set — in this case, the STRING value "1" — is returned. Note that the (numeric) value 0.2, which may appear at first glance to be the lowest value in the list, is considered to be a higher value than "1" as the latter is a STRING.

Result
min(val)
`"1"`
Rows: 1

Example 13. min()

Query

UNWIND ['d', [1, 2], ['a', 'c', 23]] AS val
RETURN min(val)

The lowest of all the values in the set — in this case, the list ['a', 'c', 23] — is returned, as (i) the two lists are considered to be lower values than the STRING "d", and (ii) the STRING "a" is considered to be a lower value than the numerical value 1.

Result
min(val)
`["a","c",23]`
Rows: 1

Example 14. min()

Query

MATCH (p:Person)
RETURN min(p.age)

The lowest of all the values in the property age is returned:

Result
min(p.age)
`55`
Rows: 1

percentileCont()

Details
Syntax	`percentileCont(input, percentile)`
Description	Returns the percentile of a value over a group using linear interpolation.
Arguments	Name	Type	Description
	`input`	`FLOAT`	A value to be aggregated.
	`percentile`	`FLOAT`	A percentile between 0.0 and 1.0.
Returns	`FLOAT`

Considerations
Any `null` values are excluded from the calculation.
`percentileCont(null, percentile)` returns `null`.

Example 15. percentileCont()

Query

MATCH (p:Person)
RETURN percentileCont(p.age, 0.4)

The 40th percentile of the values in the property age is returned, calculated with a weighted average:

Result
percentileCont(p.age, 0.4)
`56.8`
Rows: 1

percentile_cont()
Cypher 25 onlyIntroduced in Neo4j 2026.02

Details
Syntax	`percentile_cont(input, percentile)`
Description	Returns the percentile of a value over a group using linear interpolation.
Arguments	Name	Type	Description
	`input`	`FLOAT`	A value to be aggregated.
	`percentile`	`FLOAT`	A percentile between 0.0 and 1.0.
Returns	`FLOAT`

This function is an alias to the percentileCont() function, and it was introduced as part of Cypher®'s GQL conformance.

Considerations
Any `null` values are excluded from the calculation.
`percentile_cont(null, percentile)` returns `null`.

Example 16. percentile_cont()

Query

MATCH (p:Person)
RETURN percentile_cont(p.age, 0.4)

The 40th percentile of the values in the property age is returned, calculated with a weighted average:

Result
percentile_cont(p.age, 0.4)
`56.8`
Rows: 1

percentileDisc()

Details
Syntax	`percentileDisc(input, percentile)`
Description	Returns the nearest `INTEGER` or `FLOAT` value to the given percentile over a group using a rounding method.
Arguments	Name	Type	Description
	`input`	`INTEGER \| FLOAT`	A value to be aggregated.
	`percentile`	`FLOAT`	A percentile between 0.0 and 1.0.
Returns	`INTEGER \| FLOAT`

Considerations
Any `null` values are excluded from the calculation.
`percentileDisc(null, percentile)` returns `null`.

Example 17. percentileDisc()

Query

MATCH (p:Person)
RETURN percentileDisc(p.age, 0.5)

The 50th percentile of the values in the property age is returned:

Result
percentileDisc(p.age, 0.5)
`58`
Rows: 1

percentile_disc()
Cypher 25 onlyIntroduced in Neo4j 2026.02

Details
Syntax	`percentile_disc(input, percentile)`
Description	Returns the nearest `INTEGER` or `FLOAT` value to the given percentile over a group using a rounding method.
Arguments	Name	Type	Description
	`input`	`INTEGER \| FLOAT`	A value to be aggregated.
	`percentile`	`FLOAT`	A percentile between 0.0 and 1.0.
Returns	`INTEGER \| FLOAT`

This function is an alias to the percentileDisc() function, and it was introduced as part of Cypher®'s GQL conformance.

Considerations
Any `null` values are excluded from the calculation.
`percentile_disc(null, percentile)` returns `null`.

Example 18. percentile_disc()

Query

MATCH (p:Person)
RETURN percentile_disc(p.age, 0.5)

The 50th percentile of the values in the property age is returned:

Result
percentile_disc(p.age, 0.5)
`58`
Rows: 1

stDev()

Details
Syntax	`stDev(input)`
Description	Returns the standard deviation for the given value over a group for a sample of a population.
Arguments	Name	Type	Description
	`input`	`FLOAT`	The value to calculate the standard deviation of.
Returns	`FLOAT`

Considerations
Any `null` values are excluded from the calculation.
`stDev(null)` returns `0`.

Example 19. stDev()

Query

MATCH (p:Person)
WHERE p.name IN ['Keanu Reeves', 'Liam Neeson', 'Carrie Anne Moss']
RETURN stDev(p.age)

The standard deviation of the values in the property age is returned:

Result
stDev(p.age)
`7.937253933193772`
Rows: 1

stdev_samp()
Cypher 25 onlyIntroduced in Neo4j 2026.02

Details
Syntax	`stdev_samp(input)`
Description	Returns the standard deviation for the given value over a group for a sample of a population.
Arguments	Name	Type	Description
	`input`	`FLOAT`	The value to calculate the standard deviation of.
Returns	`FLOAT`

This function is an alias to the stDev() function, and it was introduced as part of Cypher®'s GQL conformance.

Considerations
Any `null` values are excluded from the calculation.
`stdev_samp(null)` returns `0`.

Example 20. stdev_samp()

Query

MATCH (p:Person)
WHERE p.name IN ['Keanu Reeves', 'Liam Neeson', 'Carrie Anne Moss']
RETURN stdev_samp(p.age)

The standard deviation of the values in the property age is returned:

Result
stdev_samp(p.age)
`7.937253933193772`
Rows: 1

stDevP()

Details
Syntax	`stDevP(input)`
Description	Returns the standard deviation for the given value over a group for an entire population.
Arguments	Name	Type	Description
	`input`	`FLOAT`	The value to calculate the population standard deviation of.
Returns	`FLOAT`

Considerations
Any `null` values are excluded from the calculation.
`stDevP(null)` returns `0`.

Example 21. stDevP()

Query

MATCH (p:Person)
WHERE p.name IN ['Keanu Reeves', 'Liam Neeson', 'Carrie Anne Moss']
RETURN stDevP(p.age)

The population standard deviation of the values in the property age is returned:

Result
stDevP(p.age)
`6.48074069840786`
Rows: 1

stdev_pop()
Cypher 25 onlyIntroduced in Neo4j 2026.02

Details
Syntax	`stdev_pop(input)`
Description	Returns the standard deviation for the given value over a group for an entire population.
Arguments	Name	Type	Description
	`input`	`FLOAT`	The value to calculate the population standard deviation of.
Returns	`FLOAT`

This function is an alias to the stDevP() function, and it was introduced as part of Cypher®'s GQL conformance.

Considerations
Any `null` values are excluded from the calculation.
`stdev_pop(null)` returns `0`.

Example 22. stdev_pop()

Query

MATCH (p:Person)
WHERE p.name IN ['Keanu Reeves', 'Liam Neeson', 'Carrie Anne Moss']
RETURN stdev_pop(p.age)

The population standard deviation of the values in the property age is returned:

Result
stdev_pop(p.age)
`6.48074069840786`
Rows: 1

sum()

Details
Syntax	`sum(input)`
Description	Returns the sum of a set of `INTEGER`, `FLOAT` or `DURATION` values
Arguments	Name	Type	Description
	`input`	`INTEGER \| FLOAT \| DURATION`	A value to be aggregated.
Returns	`INTEGER \| FLOAT \| DURATION`

Considerations
Any `null` values are excluded from the calculation.
`sum(null)` returns `0`.

Example 23. sum() - numeric values

Query

MATCH (p:Person)
RETURN sum(p.age)

The sum of all the values in the property age is returned:

Result
sum(p.age)
`309`
Rows: 1

Example 24. sum() - duration values

Query

UNWIND [duration('P2DT3H'), duration('PT1H45S')] AS dur
RETURN sum(dur)

The sum of the two supplied durations is returned:

Result
sum(dur)
`P2DT4H45S`
Rows: 1

Aggregating expressions and grouping keys

Aggregating expressions are expressions which contain one or more aggregating functions. A simple aggregating expression consists of a single aggregating function. For instance, sum(x.a) is an aggregating expression that only consists of the aggregating function sum( ) with x.a as its argument. Aggregating expressions are also allowed to be more complex, where the result of one or more aggregating functions are input arguments to other expressions. For instance, 0.1 * (sum(x.a) / count(x.b)) is an aggregating expression that contains two aggregating functions, sum( ) with x.a as its argument and count( ) with x.b as its argument. Both are input arguments to the division expression.

Grouping keys are non-aggregating expressions that are used to group the values going into the aggregating functions. For example, given the following query containing two return expressions, n and count(*):

RETURN n, count(*)

The first, n is not an aggregating function, so it will be the grouping key. The latter, count(*) is an aggregating function. The matching paths will be divided into different buckets, depending on the grouping key. The aggregating function will then be run on these buckets, calculating an aggregate value per bucket.

The input expression of an aggregating function can contain any expression, including expressions that are not grouping keys. However, not all expressions can be composed with aggregating functions. The example below will throw an error since n.x, which is not a grouping key, is combined with the aggregating function count(*).

RETURN n.x + count(*)

To sort the result set using aggregating functions, the aggregation must be included in the ORDER BY sub-clause following the RETURN clause.

Examples

Example 25. Simple aggregation without any grouping keys

Query

MATCH (p:Person)
RETURN max(p.age)

Result
max(p.age)
`71`
Rows: 1

Example 26. Addition of an aggregation and a constant, without any grouping keys

Query

MATCH (p:Person)
RETURN max(p.age) + 1

Result
max(p.age) + 1
`72`
Rows: 1

Example 27. Subtraction of a property access and an aggregation

Note that p is a grouping key:

Query

MATCH (p:Person{name:'Keanu Reeves'})-[:KNOWS]-(f:Person)
RETURN p, p.age - max(f.age)

Result
p	p.age - max(f.age)
`{{"name":"Keanu Reeves","age":58}}`	`-13`
Rows: 1

Example 28. Subtraction of a property access and an aggregation.

Note that p.age is a grouping key:

Query

MATCH (p:Person {name:'Keanu Reeves'})-[:KNOWS]-(f:Person)
RETURN p.age, p.age - max(f.age)

Result
p.age	p.age - max(f.age)
`58`	`-13`
Rows: 1

Grouping keys themselves can be complex expressions. For better query readability, Cypher® only recognizes a sub-expression in aggregating expressions as a grouping key if the grouping key is either:

A variable - e.g. the p in RETURN p, p.age - max(f.age).
A property access - e.g. the p.age in RETURN p.age, p.age - max(f.age).
A map access - e.g. the p.age in WITH {name:'Keanu Reeves', age:58} AS p RETURN p.age, p.age - max(p.age).

If more complex grouping keys are needed as operands in aggregating expression, it is always possible to project them in advance using WITH.

Using the property p.age will throw an exception, since p.age is not a grouping key. Therefore, it cannot be used in the expressions which contain the aggregating function. The below two queries would consequently return the same error message:

Query

MATCH (p:Person {name:'Keanu Reeves'})-[:KNOWS]-(f:Person)
RETURN p.age - max(f.age)

Query

MATCH (p:Person {name:'Keanu Reeves'})-[:KNOWS]-(f:Person)
RETURN p.age + p.age, p.age + p.age - max(f.age)

GQLSTATUS error chain
42I18: error: syntax error or access rule violation - invalid reference to implicitly grouped expressions. The aggregation column contains implicit grouping expressions referenced by the variables `p`. Implicit grouping expressions are variables not explicitly declared as grouping keys. 42001: error: syntax error or access rule violation - invalid syntax

However, the latter query would work if rewritten to:

Query

MATCH (p:Person {name:'Keanu Reeves'})-[:KNOWS]-(f:Person)
WITH p.age + p.age AS groupingKey, f
RETURN groupingKey, groupingKey - max(f.age)

Result
groupingKey	groupingKey - max(f.age)
`116`	`45`
Rows: 1

Rules for aggregating expressions

For aggregating expressions to be correctly computable for the buckets formed by the grouping key(s), they have to fulfill some requirements. Specifically, each sub-expression in an aggregating expression has to be either:

an aggregating function, e.g. sum(x.a).
a constant, e.g. 0.1.
a parameter, e.g. $param.
a grouping key, e.g. the a in RETURN a, count(*).
a local variable, e.g. the x in count(*) + size([ x IN range(1, 10) | x ]).
a sub-expression, all operands of which have to be allowed in an aggregating expression.

Aggregating functions

Example graph

avg()

collect()

collect_list()Cypher 25 onlyIntroduced in Neo4j 2026.02

count()

Using count(*) to return the number of nodes

Using count(*) to group and count relationship types

Counting non-null values

Counting with and without duplicates

max()

min()

percentileCont()

percentile_cont()Cypher 25 onlyIntroduced in Neo4j 2026.02

percentileDisc()

percentile_disc()Cypher 25 onlyIntroduced in Neo4j 2026.02

stDev()

stdev_samp()Cypher 25 onlyIntroduced in Neo4j 2026.02

stDevP()

stdev_pop()Cypher 25 onlyIntroduced in Neo4j 2026.02

sum()

Aggregating expressions and grouping keys

Examples

Rules for aggregating expressions

collect_list()
Cypher 25 onlyIntroduced in Neo4j 2026.02

Using `count(*)` to return the number of nodes

Using `count(*)` to group and count relationship types

Counting non-`null` values

percentile_cont()
Cypher 25 onlyIntroduced in Neo4j 2026.02

percentile_disc()
Cypher 25 onlyIntroduced in Neo4j 2026.02

stdev_samp()
Cypher 25 onlyIntroduced in Neo4j 2026.02

stdev_pop()
Cypher 25 onlyIntroduced in Neo4j 2026.02