apoc.meta.data

This procedure is not considered safe to run from multiple threads. It is therefore not supported by the parallel runtime. For more information, see the Cypher Manual → Parallel runtime.

Details
Syntax	`apoc.meta.data([ config ]) :: (label, property, count, unique, index, existence, type, array, sample, left, right, other, otherLabels, elementType)`
Description	Examines the full graph and returns a table of metadata.
Input arguments	Name	Type	Description
Input arguments	`config`	`MAP`	Number of nodes to sample, setting sample to `-1` will remove sampling; `{ sample = 1000 :: INTEGER }` The default is: `{}`.
Return arguments	Name	Type	Description
	`label`	`STRING`	The label or type name.
	`property`	`STRING`	The property name.
	`count`	`INTEGER`	The count of seen values.
	`unique`	`BOOLEAN`	If all seen values are unique.
	`index`	`BOOLEAN`	If an index exists for this property.
	`existence`	`BOOLEAN`	If an existence constraint exists for this property.
	`type`	`STRING`	The type represented by this row.
	`array`	`BOOLEAN`	Indicates whether the property is an array. If the type column is "RELATIONSHIP," this will be true if there is at least one node with two outgoing relationships of the type specified by the label or property column.
	`sample`	`LIST<ANY>`	This is always null.
	`left`	`INTEGER`	The ratio (rounded down) of the count of outgoing relationships for a specific label and relationship type relative to the total count of those patterns.
	`right`	`INTEGER`	The ratio (rounded down) of the count of incoming relationships for a specific label and relationship type relative to the total count of those patterns.
	`other`	`LIST<STRING>`	The labels of connect nodes.
	`otherLabels`	`LIST<STRING>`	For uniqueness constraints, this field shows other labels present on nodes that also contain the uniqueness constraint.
	`elementType`	`STRING`	Whether this refers to a node or a relationship.

Config parameters

This procedure supports the following config parameters:

Config parameters
Name	Type	Default	Description
`sample`	`INTEGER`	1000	Number of nodes to sample. Setting `sample` to `-1` will remove sampling.

Sampling

Specify the sample parameter (1000 by default) to analyze a subset of the data.

The sample, along with the count of nodes for each label, is used to calculate a skip value. Since this value is generated using a random number generator, results obtained through the sampling method may vary between subsequent runs.

Example 1. Calculating skip count for data sampling

If a database contains 500 nodes with the label Foo label, the skip count for that label is calculated as follows:

The skip count per node label is determined by generating a random number between (totalNodesForLabel / sample) ± 0.1.

Sample 10: skipCount = 500 / 10 = 50
The resulting skip count will be between 45 and 55.

Sample 50: skipCount = 500 / 50 = 10
The resulting skip count will be between 9 and 11.

Sample 100: skipCount = 500 / 100 = 5
The resulting skip count will be 5.

The skip count represents the number of nodes skipped before one is examined. For instance, with a skip count of 5, every 5th node is examined. Consequently, a higher sample number results in more nodes being sampled.

To stop sampling set sample: -1.

Usage Examples

The examples in this section are based on the following sample graph:

CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
CREATE (TomH:Person {name:'Tom Hanks', born:1956})
CREATE (Carrie:Person {name:'Carrie-Anne Moss', born:1967})
CREATE (Laurence:Person {name:'Laurence Fishburne', born:1961})
CREATE (Hugo:Person {name:'Hugo Weaving', born:1960})
CREATE (LillyW:Person {name:'Lilly Wachowski', born:1967})
CREATE (LanaW:Person {name:'Lana Wachowski', born:1965})

CREATE (TheMatrix:Movie {title:'The Matrix', released:1999, tagline:'Welcome to the Real World'})
CREATE (TheMatrixReloaded:Movie {title:'The Matrix Reloaded', released:2003, tagline:'Free your mind'})
CREATE (TheMatrixRevolutions:Movie {title:'The Matrix Revolutions', released:2003, tagline:'Everything that has a beginning has an end'})
CREATE (SomethingsGottaGive:Movie {title:"Something's Gotta Give", released:2003})
CREATE (TheDevilsAdvocate:Movie {title:"The Devil's Advocate", released:1997, tagline:'Evil has its winning ways'})
CREATE (Something:Something:Else {foo: 'bar'})

CREATE (YouveGotMail:Movie {title:"You've Got Mail", released:1998, tagline:'At odds in life... in love on-line.'})
CREATE (SleeplessInSeattle:Movie {title:'Sleepless in Seattle', released:1993, tagline:'What if someone you never met, someone you never saw, someone you never knew was the only someone for you?'})
CREATE (ThatThingYouDo:Movie {title:'That Thing You Do', released:1996, tagline:'In every life there comes a time when that thing you dream becomes that thing you do'})
CREATE (CloudAtlas:Movie {title:'Cloud Atlas', released:2012, tagline:'Everything is connected'})

CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrix)
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrixReloaded)
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrixRevolutions)
CREATE (Keanu)-[:ACTED_IN {roles:['Julian Mercer']}]->(SomethingsGottaGive)
CREATE (Keanu)-[:ACTED_IN {roles:['Kevin Lomax']}]->(TheDevilsAdvocate)

CREATE (TomH)-[:ACTED_IN {roles:['Joe Fox']}]->(YouveGotMail)
CREATE (TomH)-[:ACTED_IN {roles:['Sam Baldwin']}]->(SleeplessInSeattle)
CREATE (TomH)-[:ACTED_IN {roles:['Mr. White']}]->(ThatThingYouDo)
CREATE (TomH)-[:ACTED_IN {roles:['Zachry', 'Dr. Henry Goose', 'Isaac Sachs', 'Dermot Hoggins']}]->(CloudAtlas)

CREATE (Keanu)-[:LIKES {rate:10}]->(Carrie)
CREATE (Keanu)-[:LIKES {rate:6}]->(TomH)
CREATE (Keanu)-[:LIKES {rate:4}]->(Laurence)
CREATE (Keanu)-[:LIKES {rate:8}]->(Hugo)
CREATE (Keanu)-[:LIKES {rate:9}]->(LillyW)
CREATE (Keanu)-[:LIKES {rate:6}]->(LanaW)
CREATE (Keanu)-[:LIKES {rate:100}]->(TheMatrix)
CREATE (Carrie)-[:LIKES {rate:7}]->(Keanu)

CREATE (Something)-[:RELATED_TO {rate:99}]->(TheMatrix)
CREATE (Keanu)-[:RELATED_TO {rate:12}]->(TheMatrix)
CREATE (Keanu)-[:RELATED_TO {rate:12}]->(TheMatrixReloaded)
CREATE (Keanu)-[:RELATED_TO {rate:23}]->(TheMatrixRevolutions)
CREATE (Carrie)-[:RELATED_TO {rate:34}]->(TheMatrix)
CREATE (TheMatrix)-[:RELATED_TO {rate:34}]->(Laurence)
CREATE (TheMatrixReloaded)-[:RELATED_TO {rate:345}]->(LanaW)

CALL apoc.meta.data();

Results
label	property	count	unique	index	existence	type	array	sample	left	right	other	otherLabels	elementType
"ACTED_IN"	"Person"	2	false	false	false	"RELATIONSHIP"	true	null	4	0	["Movie"]	[]	"relationship"
"ACTED_IN"	"roles"	0	false	false	false	"LIST"	true	null	0	0	[]	[]	"relationship"
"LIKES"	"Person"	2	false	false	false	"RELATIONSHIP"	true	null	4	1	["Movie", "Person"]	[]	"relationship"
"LIKES"	"rate"	0	false	false	false	"INTEGER"	false	null	0	0	[]	[]	"relationship"
"RELATED_TO"	"Person"	2	false	false	false	"RELATIONSHIP"	true	null	2	0	["Movie"]	[]	"relationship"
"RELATED_TO"	"rate"	0	false	false	false	"INTEGER"	false	null	0	0	[]	[]	"relationship"
"RELATED_TO"	"Movie"	2	false	false	false	"RELATIONSHIP"	false	null	1	2	["Person"]	[]	"relationship"
"RELATED_TO"	"Something"	1	false	false	false	"RELATIONSHIP"	false	null	1	0	["Movie"]	[]	"relationship"
"RELATED_TO"	"Else"	1	false	false	false	"RELATIONSHIP"	false	null	1	0	["Movie"]	[]	"relationship"
"Person"	"RELATED_TO"	2	false	false	false	"RELATIONSHIP"	true	null	2	0	["Movie"]	[]	"node"
"Person"	"ACTED_IN"	2	false	false	false	"RELATIONSHIP"	true	null	4	0	["Movie"]	[]	"node"
"Person"	"LIKES"	2	false	false	false	"RELATIONSHIP"	true	null	4	1	["Movie", "Person"]	[]	"node"
"Person"	"born"	0	false	false	false	"INTEGER"	false	null	0	0	[]	[]	"node"
"Person"	"name"	0	false	false	false	"STRING"	false	null	0	0	[]	[]	"node"
"Movie"	"RELATED_TO"	2	false	false	false	"RELATIONSHIP"	false	null	1	2	["Person"]	[]	"node"
"Movie"	"title"	0	false	false	false	"STRING"	false	null	0	0	[]	[]	"node"
"Movie"	"tagline"	0	false	false	false	"STRING"	false	null	0	0	[]	[]	"node"
"Movie"	"released"	0	false	false	false	"INTEGER"	false	null	0	0	[]	[]	"node"
"Something"	"RELATED_TO"	1	false	false	false	"RELATIONSHIP"	false	null	1	0	["Movie"]	[]	"node"
"Something"	"foo"	0	false	false	false	"STRING"	false	null	0	0	[]	[]	"node"
"Else"	"RELATED_TO"	1	false	false	false	"RELATIONSHIP"	false	null	1	0	["Movie"]	[]	"node"
"Else"	"foo"	0	false	false	false	"STRING"	false	null	0	0	[]	[]	"node"

The unique column shows if there is an unique constraint in that specific label and property. Similarly, the index column show if there is an index or not, while the existence looks for an existence constraint.

The array column check if the row is of type array and, in case type columns is "RELATIONSHIP", the result will be true if there is at least one node with 2 outgoing relationships with the type of relation given by label or property column, so ACTED_IN in the example above.

The left, right and count columns regard only rows with column type equals to "RELATIONSHIP" (otherwise they are equal to 0). Please note that, because we examine a sample, these counts are just estimates to give an overview and proximity to actual values depends on the dataset and the sample set (default 100).

In particular the count column indicates the number of nodes with an outgoing relationship (e.g. the row with label = ACTED_IN and property = Person has count 2 because there are 2 nodes (node:Person)-[:ACTED_IN]→(), i.e. (Keanu) and (TomH)).

The left value represents the ratio (rounded down) of the count of the outgoing patterns for a certain label and a specific type of relationship to count. In cypher, it corresponds to:

MATCH p=(start:`<LABEL>`)-[:`<TYPE>`]->()
WITH count(distinct start) as nodes, count(p) as counts
RETURN CASE when nodes = 0 then 0 else counts / nodes end

For example, regarding the row with label = RELATED_TO and property = Movie, there are 2 relationships (:Movie)-[rel:RELATED_TO]→(), i.e (TheMatrix)-[:RELATED_TO {rate:34}]→(Laurence) and (TheMatrixReloaded)-[:RELATED_TO {rate:345}]→(LanaW). So the left value is 1 (2 relationships divided by 2 nodes found).

Instead, regarding the row with label = LIKES and property = Person, there are 8 relationships (:Person)-[rel:LIKES]→(), 7 starting from (Keanu) node, and 1 from (Carrie). Then the left value is 4 (8 relationships divided by 2 nodes found).

The right value is the ratio (rounded down) of the count of the incoming patterns for a certain label and a specific type of relationship to count, where the patterns included in the count are those in which there is an equivalent outgoing relationship. In cypher it corresponds to:

MATCH p=(start:`<LABEL>`)<-[:`<TYPE>`]-()
WHERE exists((start:`<LABEL>`)-[:`<TYPE>`]->())
WITH count(distinct start) as nodes, count(p) as counts
RETURN CASE when nodes = 0 then 0 else counts / nodes end

For example, regarding the row with label = RELATED_TO and property = Movie, the (TheMatrix) node, which has an outgoing RELATED_TO relationship, has 3 incoming relationships as well, while the TheMatrixReloaded node has 1 incoming relationship. So the right value is 2, that is 4 divided by 2 node founds.

Therefore, via the right and left values, we provide a dataset estimate of the possible degree averages.