Duplicate Check in Cypher
I answered a question on StackOverflow that required a duplicate check for a collection.
This would be easy with an isUnique(coll) function in Cypher, or with a to_set(coll) / uniq(coll) function that allows an expression like size(to_set(coll)) = size(coll).
But neither exists, so we need a tiny algorithm to solve it.
One solution is to iterate over the collection and check whether the current element is contained in the rest of the collection.
In Cypher we can do this with reduce and CASE expressions.
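The accumulator a starts as the whole collection, so at each step x is the current element and tail(a) is the rest of the collection. In the duplicate case, when x is found in tail(a), we return NULL; the a IS NULL check then keeps the accumulator NULL for all remaining elements, which shortcuts the rest of the work. Otherwise, when the IN check does not succeed, we return tail(a) to be the new accumulator.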
// No duplicates: is_unique is true
WITH [1,2,3] AS coll
RETURN reduce(a=coll, x IN coll |
  CASE WHEN a IS NULL OR x IN tail(a) THEN NULL ELSE tail(a) END ) IS NOT NULL AS is_unique

// 1 occurs twice: is_unique is false
WITH [1,2,3,1] AS coll
RETURN reduce(a=coll, x IN coll |
  CASE WHEN a IS NULL OR x IN tail(a) THEN NULL ELSE tail(a) END ) IS NOT NULL AS is_unique
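To try the check on several collections at once, the same expression can be wrapped in an UNWIND. This is only a sketch: the three sample lists are made up for illustration, and the empty list is included to show that reduce returns its initial accumulator when there is nothing to iterate over.

```cypher
// Each inner list becomes one row bound to coll.
UNWIND [[1,2,3], [1,2,3,1], []] AS coll
RETURN coll,
       reduce(a = coll, x IN coll |
         // NULL marks "duplicate found" and is carried through to the end
         CASE WHEN a IS NULL OR x IN tail(a) THEN NULL ELSE tail(a) END
       ) IS NOT NULL AS is_unique
```

This should report true for [1,2,3], false for [1,2,3,1], and true for the empty list.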
Chris Leishman posted a nice solution for simulating the unique function:
WITH [1,2,3,1] AS coll
RETURN reduce(a=[], x IN coll | CASE WHEN x IN a THEN a ELSE a + x END) as unique
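This reduce can also stand in for the missing to_set(coll) in the size comparison mentioned at the start. A minimal sketch, where deduped is just an illustrative alias and not part of the original answer:

```cypher
WITH [1,2,3,1] AS coll
// Build the duplicate-free list first, keeping coll in scope
WITH coll,
     reduce(a = [], x IN coll | CASE WHEN x IN a THEN a ELSE a + x END) AS deduped
// Equal sizes mean nothing was dropped, i.e. there were no duplicates
RETURN size(deduped) = size(coll) AS is_unique
```

For [1,2,3,1] the deduplicated list is [1,2,3], so the sizes differ and is_unique is false. Unlike the first approach, this variant always builds the full duplicate-free list before comparing.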