Fingerprinting

The following functions calculate a hashsum over nodes, relationships, or the entire graph. They take into account all properties, node labels, and relationship types.

The algorithm used for hashing may change between APOC versions. Hashing results are therefore only comparable between entities or graphs in the same graph, or between different graphs that use the same APOC version.

To compute the hashsum of a graph, the function first calculates the hashsum of each node. The resulting list of hashsums is ordered, and for each node the hashsums of all its relationships and their end nodes are added. Internal ids are not included in the hashsum.

It is also possible to supply a list of property keys that should be ignored on all nodes. This is useful when nodes store properties, such as `created=timestamp()`, that should not affect the hashsum.
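
As a minimal sketch of excluding a property, the query below compares two nodes while ignoring a `created` value. The `Person` label, the `name` and `created` properties, and the sample values are assumptions made for illustration only.

```cypher
// Hypothetical sample data: two nodes that differ only in `created`.
CREATE (a:Person {name: 'Alice', created: 1})
CREATE (b:Person {name: 'Alice', created: 2})
// Ignore `created` on both nodes when computing the fingerprints.
RETURN apoc.hashing.fingerprint(a, ['created']) =
       apoc.hashing.fingerprint(b, ['created']) AS sameFingerprint;
```

Because internal ids and the excluded key are left out of the hashsum, the two fingerprints would be expected to match here.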

Function name	Description
`apoc.hashing.fingerprint(object ANY, excludedPropertyKeys LIST<STRING>)`	Calculates an MD5 checksum over a `NODE` or `RELATIONSHIP` (identical entities share the same checksum). Unsuitable for cryptographic use cases.
`apoc.hashing.fingerprinting(object ANY, config MAP<STRING, ANY>)`	Calculates an MD5 checksum over a `NODE` or `RELATIONSHIP` (identical entities share the same checksum). Unlike `apoc.hashing.fingerprint()`, this function supports a number of config parameters. Unsuitable for cryptographic use cases.
`apoc.hashing.fingerprintGraph(propertyExcludes LIST<STRING>)`	Calculates an MD5 checksum over the full graph. This function uses in-memory data structures. Unsuitable for cryptographic use cases.
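
A minimal sketch of fingerprinting an entire graph is shown below. The excluded `created` property key is an assumption for illustration; pass an empty list to include every property.

```cypher
// Fingerprint the whole graph, ignoring the (assumed) `created` property.
RETURN apoc.hashing.fingerprintGraph(['created']) AS graphFingerprint;
```

To compare two databases, one option is to run the same query against each, using the same APOC version on both, and compare the returned values.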

Configuration parameters

Table 1. `apoc.hashing.fingerprinting` configuration params
name	type	default	description
digestAlgorithm	STRING	"MD5"	The algorithm used to compute the fingerprint. Supported values are: `MD5`, `SHA-1`, `SHA-256`
strategy	STRING	"LAZY"	Defines the filtering behaviour of nodes/relationships. Supported values are: `LAZY` - does not include properties. `EAGER` - includes all non-filtered properties.
nodeAllowMap	MAP<STRING, LIST<STRING>>	{}	Node label name mapped to a list of allowed properties for that label.
nodeDisallowMap	MAP<STRING, LIST<STRING>>	{}	Node label name mapped to a list of properties to ignore for that label.
relAllowMap	MAP<STRING, LIST<STRING>>	{}	Relationship type name mapped to a list of allowed properties for that type.
relDisallowMap	MAP<STRING, LIST<STRING>>	{}	Relationship type name mapped to a list of properties to ignore for that type.
mapAllowList	LIST<STRING>	[]	A list of allowed keys when the object being hashed is a map.
mapDisallowList	LIST<STRING>	[]	A list of keys to ignore when the object being hashed is a map.
allNodesAllowList	LIST<STRING>	[]	A list of globally allowed node properties.
allNodesDisallowList	LIST<STRING>	[]	A list of globally ignored node properties.
allRelsAllowList	LIST<STRING>	[]	A list of globally allowed relationship properties.
allRelsDisallowList	LIST<STRING>	[]	A list of globally ignored relationship properties.

It is not possible to define both an allow list and a disallow list for the same entity type. For nodes, relationships, and maps, the fingerprinting parameters must therefore specify either allowed or disallowed properties, not both.
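
The following sketch shows one way to combine these parameters. The `Person` label and the `name` and `email` property keys are hypothetical; only the function name and the config keys come from Table 1.

```cypher
// Hypothetical data model: `Person` nodes with `name` and `email` properties.
MATCH (p:Person {name: 'Alice'})
RETURN apoc.hashing.fingerprinting(p, {
  digestAlgorithm: 'SHA-256',                 // one of the supported values in Table 1
  nodeAllowMap: {Person: ['name', 'email']}   // hash only these properties on Person nodes
}) AS fingerprint;
```

Only an allow map is given for nodes here, so the configuration does not mix allow and disallow settings for the same entity type.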

Fingerprinting strategy

If the properties defined in the configuration are not present on the node or relationship, the `strategy` parameter defines how the fingerprinting function proceeds:

EAGER: includes all properties in the hashing if no allow/disallow lists are supplied.
LAZY: excludes all properties in the hashing if no allow/disallow lists are supplied.
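
As a rough illustration of the two strategies, the query below fingerprints the same node twice without any allow or disallow lists. The `Person` label and `name` property are assumptions for illustration.

```cypher
// Hypothetical node; no allow/disallow lists are supplied in either call.
MATCH (p:Person {name: 'Alice'})
RETURN
  apoc.hashing.fingerprinting(p, {strategy: 'EAGER'}) AS eagerFingerprint,
  apoc.hashing.fingerprinting(p, {strategy: 'LAZY'}) AS lazyFingerprint;
```

Going by the descriptions above, the `EAGER` value would cover all of the node's properties and the `LAZY` value none of them, so the two results would be expected to differ for a node that has properties.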