Merge nodes

The APOC library contains a procedure that can be used to merge nodes.

This procedure allows for merging a list of nodes onto the first node in the list (all relationships are merged onto that node as well). The merge behaviour can be specified for properties globally and/or individually.

Procedure for merging nodes

Qualified Name Type

apoc.refactor.mergeNodes(nodes LIST<NODE>, config MAP<STRING, ANY>) - merges the given LIST<NODE> onto the first NODE in the LIST<NODE>. All RELATIONSHIP values are merged onto that NODE as well.

Procedure

MATCH (p:Person)
WITH p ORDER BY p.created DESC // newest one first
WITH p.email AS email, collect(p) as nodes
CALL apoc.refactor.mergeNodes(nodes, {properties: {
    name:'discard',
    age:'overwrite',
    kids:'combine',
    `addr.*`: 'overwrite',
    `.*`: 'discard'
}})
YIELD node
RETURN node

Config options

Below are the config options for this procedure: These config option also works for apoc.refactor.mergeRelationships([rels],{config}).

type operations

discard

the property from the first node will remain if already set, otherwise the first property in list will be written

overwrite / override

last property in list wins

combine

if there is only one property in list, it will be set / kept as single property otherwise create an array, tries to coerce values

In addition, mergeNodes supports the following config properties:

type operations

mergeRels

true/false: give the possibility to merge relationships with same type and direction.

produceSelfRel

true/false: if true (default), any eventual new self-relationship that would be created from the nodes merged into target node are inserted, otherwise not. Please note that this param is independent from mergeRels config and doesn’t affect on pre-existing self-relationships (in this case it is used the preserveExistingSelfRels config).

preserveExistingSelfRels

true/false: is valid only with mergeRels:true. If true (default), the pre-existing self-relationships in the final node will remain, otherwise will be deleted.

singleElementAsArray

false/true: if is false (default) and type is combine in case the merge of two arrays is a new array with size 1 it extracts the single value.

Properties with a name not included in the config map are overridden.

Relationships properties are managed with the same method, except when the config map is null, in which case the entity properties are combined.

Examples

The below examples will further explain this procedure.

Same start and end nodes

The following creates a graph containings relationships that have the same start and end nodes
CREATE (n1:Person {name:'Tom'}),
(n2:Person {name:'John'}),
(n3:Company {name:'Company1'}),
(n5:Car {brand:'Ferrari'}),
(n6:Animal:Cat {name:'Derby'}),
(n7:City {name:'London'}),

(n1)-[:WORKS_FOR {since:2015}]->(n3),
(n2)-[:WORKS_FOR {since:2018}]->(n3),
(n3)-[:HAS_HQ {since:2004}]->(n7),
(n1)-[:DRIVE {since:2017}]->(n5),
(n2)-[:HAS {since:2013}]->(n6);
return *;
apoc.refactor.mergeNodes.createDataSetFirstExample
The following merges John and Tom into a single node:
MATCH (a1:Person{name:'John'}), (a2:Person {name:'Tom'})
WITH head(collect([a1,a2])) as nodes
CALL apoc.refactor.mergeNodes(nodes,{properties:"combine", mergeRels:true})
YIELD node
RETURN count(*)

If the above query is run, it will result in the following graph:

apoc.refactor.mergeNodes.resultFirstExample

Since the relationships have the same start and end nodes, the relationships are merged and properties are combined.

Different start and end nodes

The following creates a graph containings relationships that have different start or end nodes:
Create (n1:Person {name:'Tom'}),
(n2:Person {name:'John'}),
(n3:Company {name:'Company1'}),
(n4:Company {name:'Company2'}),
(n5:Car {brand:'Ferrari'}),
(n6:Animal:Cat {name:'Derby'}),
(n7:City {name:'London'}),
(n8:City {name:'Liverpool'}),
(n1)-[:WORKS_FOR{since:2015}]->(n3),
(n2)-[:WORKS_FOR{since:2018}]->(n4),
(n3)-[:HAS_HQ{since:2004}]->(n7),
(n4)-[:HAS_HQ{since:2007}]->(n8),
(n1)-[:DRIVE{since:2017}]->(n5),
(n2)-[:HAS{since:2013}]->(n6)
return *;
apoc.refactor.mergeNodes.createDataSetSecondExample
The following merges John and Tom into a single node:
MATCH (a1:Person{name:'John'}), (a2:Person {name:'Tom'})
WITH head(collect([a1,a2])) as nodes
CALL apoc.refactor.mergeNodes(nodes,{
    properties:"combine",
    mergeRels:true
})
YIELD node
RETURN count(*)

If the above query is run, it will result in the following graph:

apoc.refactor.mergeNodes.resultSecondExample
apoc.refactor.mergeNodes.resultSecondExampleData

Since the relationships have different end nodes, all relationships and properties are maintained.