Categorize

The APOC library contains a procedure that replaces a string property on nodes with a relationship to a unique category node for each distinct value of that property.

Procedure to create category nodes

Qualified Name: apoc.refactor.categorize(sourceKey STRING, type STRING, outgoing BOOLEAN, label STRING, targetKey STRING, copiedKeys LIST<STRING>, batchSize INTEGER) - creates new category NODE values from NODE values in the graph with the specified sourceKey as one of its property keys. The new category NODE values are then connected to the original NODE values with a RELATIONSHIP of the given type.

Type: Procedure
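
To make the description concrete, the effect of the procedure can be approximated in plain Cypher. This is only a hand-written sketch, not APOC's implementation: it borrows the property, label, and relationship names from the example in this section, assumes outgoing is true, and assumes the source property is removed once the relationship exists (as "replaces" in the introduction suggests). It ignores copiedKeys and batchSize entirely.

```cypher
// Illustrative approximation only; not how APOC implements the procedure.
// Names (favoriteColor, Color, color, FAVORITE_COLOR) come from the example in this section.
MATCH (n)
WHERE n.favoriteColor IS NOT NULL
// MERGE reuses an existing category node, so each distinct value gets one node
MERGE (c:Color {color: n.favoriteColor})
// Assumed direction for outgoing = true: original node -> category node
CREATE (n)-[:FAVORITE_COLOR]->(c)
// Assumption: the relationship replaces the property, so the property is dropped
REMOVE n.favoriteColor
```

Unlike this single statement, the procedure takes a batchSize argument, which matters when many nodes carry the source property.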

Example

The following example shows the procedure in more detail.

The following creates nodes with a favoriteColor property:
CREATE (:Person {name: "Mark", favoriteColor: "Red"})
CREATE (:Person {name: "Jennifer", favoriteColor: "Blue"})
CREATE (:Person {name: "David", favoriteColor: "Red"})

To run this procedure, a uniqueness constraint must exist on the label and property key of the new category nodes. In this case:

CREATE CONSTRAINT ON (n:Color) ASSERT n.color IS UNIQUE
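
The ON ... ASSERT form is the older constraint syntax. On Neo4j 5, where that form is no longer accepted, the same constraint would be written roughly as follows. The constraint name color_unique is a hypothetical choice for illustration, and the constrained property is assumed to be color, matching the targetKey passed to the procedure in this example.

```cypher
// Newer constraint syntax (Neo4j 4.4 and later).
// color_unique is a hypothetical name; IF NOT EXISTS makes the statement safe to re-run.
CREATE CONSTRAINT color_unique IF NOT EXISTS
FOR (n:Color)
REQUIRE n.color IS UNIQUE
```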

The following turns all favoriteColor properties into FAVORITE_COLOR relationships to Color nodes with a matching color property.
CALL apoc.refactor.categorize('favoriteColor', 'FAVORITE_COLOR', true, 'Color', 'color', [], 100)

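After the procedure runs, each Person node is connected by a FAVORITE_COLOR relationship to the Color node that holds its former favoriteColor value:

[Figure: apoc.categorize]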
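
One way to check the result without the graph view is to query the new relationships directly. The sketch below assumes the three Person nodes and the procedure call from this example. The aliases person, color, and leftoverProperty are illustrative; the last column should be null for every row if the procedure removed the original property.

```cypher
// List each person with the category node they were linked to.
MATCH (p:Person)-[:FAVORITE_COLOR]->(c:Color)
RETURN p.name AS person,
       c.color AS color,
       p.favoriteColor AS leftoverProperty  // expected to be null after categorizing
ORDER BY person
```

Because category nodes are unique per value, Mark and David would be expected to point at the same Color node, which a count of Color nodes can confirm.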