Fingerprinting

The following functions calculate hashsums over nodes, relationships, or the entire graph. They take into account all properties, node labels, and relationship types.

The algorithm used for hashing may change between APOC versions. Hashing results are therefore only comparable when they were produced by the same APOC version, whether the entities/graphs come from the same graph or from different graphs.

To compute the hashsum of a graph, a hashsum is first calculated for each node. The resulting list of hashsums is sorted, and for each node the hashsums of all its relationships and their end nodes are added. This makes the result independent of internal ids.
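The steps above can be illustrated with a minimal Python sketch. It is not APOC's implementation: the in-memory graph layout and the helper names `md5_hex`, `node_hash`, and `graph_hash` are assumptions made for illustration, and the digests it produces will not match APOC's. It only shows why hashing node content and sorting the hashes removes any dependence on internal ids.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 hex digest of a string (not for cryptographic use)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def node_hash(node: dict) -> str:
    """Hash a node from its sorted labels and sorted properties, never its id."""
    labels = sorted(node["labels"])
    props = sorted(node["props"].items())
    return md5_hex(repr((labels, props)))


def graph_hash(nodes: dict, rels: list) -> str:
    """Hash a whole graph.

    nodes: {id: {"labels": [...], "props": {...}}}
    rels:  [(start_id, rel_type, rel_props, end_id), ...]
    """
    hashes = {node_id: node_hash(n) for node_id, n in nodes.items()}
    entries = []
    for node_id, h in hashes.items():
        # Hash each outgoing relationship together with its end node's hash.
        outgoing = sorted(
            md5_hex(repr((rel_type, sorted(rel_props.items()), hashes[end])))
            for start, rel_type, rel_props, end in rels
            if start == node_id
        )
        entries.append((h, outgoing))
    # Sort by node hash so the result does not depend on ids or insertion order.
    entries.sort()
    return md5_hex(repr(entries))
```

Two graphs with the same labels, properties, and relationships but different ids yield the same digest under this sketch, while changing any property value changes it.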

It is also possible to supply a list of property keys that should be ignored on all nodes. This is useful for properties such as created=timestamp(), which should not influence the hashsum.
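The effect of ignoring a property key can be sketched in Python as follows. The `fingerprint` helper here is a hypothetical stand-in, not the APOC function, and its digests are not compatible with APOC's. It only demonstrates that two property maps differing in an ignored key hash to the same value.

```python
import hashlib


def fingerprint(props: dict, ignore: tuple = ()) -> str:
    """MD5 over the properties, skipping any key listed in `ignore`."""
    # Sorting the (key, value) pairs makes the result independent of key order.
    kept = sorted((k, v) for k, v in props.items() if k not in ignore)
    return hashlib.md5(repr(kept).encode("utf-8")).hexdigest()


# Two otherwise identical property maps with different creation timestamps.
a = {"name": "Ann", "created": 1700000000000}
b = {"name": "Ann", "created": 1700000999999}

same_when_ignored = fingerprint(a, ("created",)) == fingerprint(b, ("created",))
same_by_default = fingerprint(a) == fingerprint(b)
```

With `created` ignored the two fingerprints are equal. Without the ignore list they differ, because the timestamp is part of the hashed content.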

| Function name | Description |
| --- | --- |
| apoc.hashing.fingerprint(object, <list_of_props_to_ignore>) | Calculates an MD5 hashsum over the object. Handles ordering (in the case of maps), scalars, and arrays. Unsuitable for cryptographic use cases. |
| apoc.hashing.fingerprinting(object, {conf}) | Calculates an MD5 hashsum over the object. Handles ordering (in the case of maps), scalars, and arrays (see the Fingerprinting configuration params table for more details). Unsuitable for cryptographic use cases. |
| apoc.hashing.graph(<list_of_props_to_ignore>) | Calculates an MD5 hashsum over the full graph. Unsuitable for cryptographic use cases. |

Configuration parameters

Table 1. Fingerprinting configuration params
| Property name | Type | Default | Description |
| --- | --- | --- | --- |
| digestAlgorithm | Enum | MD5 | The digest algorithm used to create the fingerprint. Currently only MD5 is supported. |
| nodeAllowMap | Map<K,V> | empty | A map where each key is a node label and each value is a list of the allowed properties. |
| relAllowMap | Map<K,V> | empty | A map where each key is a relationship type and each value is a list of the allowed properties. |
| nodeDisallowMap | Map<K,V> | empty | A map where each key is a node label and each value is a list of the disallowed properties. |
| relDisallowMap | Map<K,V> | empty | A map where each key is a relationship type and each value is a list of the disallowed properties. |
| mapAllowList | List | empty | A list of allowed keys, used when the input to the function is a map. |
| mapDisallowList | List | empty | A list of disallowed keys, used when the input to the function is a map. |
| allNodesAllowList | List | empty | A list of properties common to all nodes that must be included in the fingerprint. |
| allRelsAllowList | List | empty | A list of properties common to all relationships that must be included in the fingerprint. |
| allNodesDisallowList | List | empty | A list of properties common to all nodes that must be excluded from the fingerprint. |
| allRelsDisallowList | List | empty | A list of properties common to all relationships that must be excluded from the fingerprint. |
| strategy | Enum[EAGER, LAZY] | LAZY | Defines the behaviour when the configured properties are not present on a specific node/relationship (see the Fingerprinting strategy paragraph for more details). |

It is not possible to set both an allow list and a disallow list for the same entity type. For each of nodes, relationships, and maps, the fingerprinting parameters must therefore specify either allowed properties or disallowed properties, not both.
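One reading of this rule is that each allow parameter excludes its disallow counterpart. The Python sketch below checks a configuration map under that assumption. The parameter names come from the configuration table, but the pairing and the `conflicting_pairs` helper are hypothetical, and APOC's own validation may differ.

```python
# Allow/disallow parameter pairs, assumed to be mutually exclusive.
EXCLUSIVE_PAIRS = [
    ("nodeAllowMap", "nodeDisallowMap"),
    ("relAllowMap", "relDisallowMap"),
    ("mapAllowList", "mapDisallowList"),
    ("allNodesAllowList", "allNodesDisallowList"),
    ("allRelsAllowList", "allRelsDisallowList"),
]


def conflicting_pairs(conf: dict) -> list:
    """Return every pair where both the allow and the disallow side are non-empty."""
    return [(a, d) for a, d in EXCLUSIVE_PAIRS if conf.get(a) and conf.get(d)]
```

A configuration that sets only `nodeAllowMap` passes this check, while one that sets both `nodeAllowMap` and `nodeDisallowMap` is reported as conflicting.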

Fingerprinting strategy

When the properties defined in the configuration are not present on a node and/or relationship, the strategy parameter defines how fingerprinting proceeds:

  • EAGER: evaluates all properties of the node/relationship to create its fingerprint

  • LAZY: evaluates only the nodes/relationships provided in the configuration