apoc.meta.schema
Procedure APOC Core
apoc.meta.schema({config}) - examines a subset of the graph to provide a map-like meta information
Usage Examples
The examples in this section are based on the following sample graph:
CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
CREATE (TomH:Person {name:'Tom Hanks', born:1956})
CREATE (TheMatrix:Movie {title:'The Matrix', released:1999, tagline:'Welcome to the Real World'})
CREATE (TheMatrixReloaded:Movie {title:'The Matrix Reloaded', released:2003, tagline:'Free your mind'})
CREATE (TheMatrixRevolutions:Movie {title:'The Matrix Revolutions', released:2003, tagline:'Everything that has a beginning has an end'})
CREATE (SomethingsGottaGive:Movie {title:"Something's Gotta Give", released:2003})
CREATE (TheDevilsAdvocate:Movie {title:"The Devil's Advocate", released:1997, tagline:'Evil has its winning ways'})
CREATE (YouveGotMail:Movie {title:"You've Got Mail", released:1998, tagline:'At odds in life... in love on-line.'})
CREATE (SleeplessInSeattle:Movie {title:'Sleepless in Seattle', released:1993, tagline:'What if someone you never met, someone you never saw, someone you never knew was the only someone for you?'})
CREATE (ThatThingYouDo:Movie {title:'That Thing You Do', released:1996, tagline:'In every life there comes a time when that thing you dream becomes that thing you do'})
CREATE (CloudAtlas:Movie {title:'Cloud Atlas', released:2012, tagline:'Everything is connected'})
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrix)
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrixReloaded)
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrixRevolutions)
CREATE (Keanu)-[:ACTED_IN {roles:['Julian Mercer']}]->(SomethingsGottaGive)
CREATE (Keanu)-[:ACTED_IN {roles:['Kevin Lomax']}]->(TheDevilsAdvocate)
CREATE (TomH)-[:ACTED_IN {roles:['Joe Fox']}]->(YouveGotMail)
CREATE (TomH)-[:ACTED_IN {roles:['Sam Baldwin']}]->(SleeplessInSeattle)
CREATE (TomH)-[:ACTED_IN {roles:['Mr. White']}]->(ThatThingYouDo)
CREATE (TomH)-[:ACTED_IN {roles:['Zachry', 'Dr. Henry Goose', 'Isaac Sachs', 'Dermot Hoggins']}]->(CloudAtlas)
CREATE (s0:sameName{id:1}) -[r0:sameName {alfa: 'beta'}] -> (t0:sameName{id:2});
CALL apoc.meta.schema()
YIELD value
UNWIND keys(value) AS key
RETURN key, value[key] AS value;
Note that, in case of relationship type and node label with the same name, the relationships will be distinguished by the suffix " (RELATIONSHIP)"
key | value |
---|---|
"Movie" |
{count: 9, relationships: {ACTED_IN: {count: 41, properties: {roles: {existence: FALSE, type: "LIST", array: TRUE}}, direction: "in", labels: ["Person"]}}, type: "node", properties: {tagline: {existence: FALSE, type: "STRING", indexed: FALSE, unique: FALSE}, title: {existence: FALSE, type: "STRING", indexed: FALSE, unique: FALSE}, released: {existence: FALSE, type: "INTEGER", indexed: FALSE, unique: FALSE}}, labels: []} |
"ACTED_IN" |
{count: 9, type: "relationship", properties: {roles: {existence: FALSE, type: "LIST", array: TRUE}}} |
"Person" |
{count: 2, relationships: {ACTED_IN: {count: 9, properties: {roles: {existence: FALSE, type: "LIST", array: TRUE}}, direction: "out", labels: ["Movie"]}}, type: "node", properties: {name: {existence: FALSE, type: "STRING", indexed: FALSE, unique: FALSE}, born: {existence: FALSE, type: "INTEGER", indexed: FALSE, unique: FALSE}}, labels: []} |
"sameName (RELATIONSHIP)" |
{"count":1,"type":"relationship","properties":{"alfa":{"existence":false,"type":"STRING","array":false}}} |
"sameName" |
{count: 2, relationships: {"sameName": {count: 1, properties: {alfa: {existence: false, type: "STRING", array: false}}, direction: "out", labels: ["sameName"]}}, type:"node", properties: {id: {existence: false,type: "INTEGER", indexed: false,unique: false}},labels: []} |
Because the count stores return an incomplete picture of the data, we have to cross check the results with the actual data to filter out false positives.
We use a subset of the data to analyze by specifying the sample
parameter (1000 by default).
Through this parameter, for each label we split data for each node-label into batches of (total / sample) ± rand
where total
is the total number of nodes with that label and rand
is a number between 0
and total / sample / 10
So, we pick a percentage of nodes with that label of roughly sample / total * 100
% to check against.
We pick the first node of each batch and we analyze the properties and the relationships.
For example, given the following graph:
CREATE (:Foo), (:Other)-[:REL_0]->(:Other), (:Other)-[:REL_1]->(:Other)<-[:REL_2 {baz: 'baa'}]-(:Other), (:Other {alpha: 'beta'}), (:Other {foo:'bar'})-[:REL_3]->(:Other)
Without sample
parameter we receive:
CALL apoc.meta.schema()
YIELD value RETURN value["Other"] as value;
value |
---|
|
Otherwise, with sample: 2
we obtain (the result can change):
CALL apoc.meta.schema({sample: 2})
YIELD value RETURN value["Other"] as value
value |
---|
|