apoc.meta.schema

Procedure APOC Core

apoc.meta.schema({config}) - examines a subset of the graph to provide a map-like meta information

Signature

apoc.meta.schema(config = {} :: MAP?) :: (value :: MAP?)

Input parameters

Name Type Default

config

MAP?

{}

Output parameters

Name Type

value

MAP?

Usage Examples

The examples in this section are based on the following sample graph:

CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
CREATE (TomH:Person {name:'Tom Hanks', born:1956})

CREATE (TheMatrix:Movie {title:'The Matrix', released:1999, tagline:'Welcome to the Real World'})
CREATE (TheMatrixReloaded:Movie {title:'The Matrix Reloaded', released:2003, tagline:'Free your mind'})
CREATE (TheMatrixRevolutions:Movie {title:'The Matrix Revolutions', released:2003, tagline:'Everything that has a beginning has an end'})
CREATE (SomethingsGottaGive:Movie {title:"Something's Gotta Give", released:2003})
CREATE (TheDevilsAdvocate:Movie {title:"The Devil's Advocate", released:1997, tagline:'Evil has its winning ways'})

CREATE (YouveGotMail:Movie {title:"You've Got Mail", released:1998, tagline:'At odds in life... in love on-line.'})
CREATE (SleeplessInSeattle:Movie {title:'Sleepless in Seattle', released:1993, tagline:'What if someone you never met, someone you never saw, someone you never knew was the only someone for you?'})
CREATE (ThatThingYouDo:Movie {title:'That Thing You Do', released:1996, tagline:'In every life there comes a time when that thing you dream becomes that thing you do'})
CREATE (CloudAtlas:Movie {title:'Cloud Atlas', released:2012, tagline:'Everything is connected'})

CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrix)
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrixReloaded)
CREATE (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrixRevolutions)
CREATE (Keanu)-[:ACTED_IN {roles:['Julian Mercer']}]->(SomethingsGottaGive)
CREATE (Keanu)-[:ACTED_IN {roles:['Kevin Lomax']}]->(TheDevilsAdvocate)

CREATE (TomH)-[:ACTED_IN {roles:['Joe Fox']}]->(YouveGotMail)
CREATE (TomH)-[:ACTED_IN {roles:['Sam Baldwin']}]->(SleeplessInSeattle)
CREATE (TomH)-[:ACTED_IN {roles:['Mr. White']}]->(ThatThingYouDo)
CREATE (TomH)-[:ACTED_IN {roles:['Zachry', 'Dr. Henry Goose', 'Isaac Sachs', 'Dermot Hoggins']}]->(CloudAtlas)

CREATE (s0:sameName{id:1}) -[r0:sameName {alfa: 'beta'}] -> (t0:sameName{id:2});
CALL apoc.meta.schema()
YIELD value
UNWIND keys(value) AS key
RETURN key, value[key] AS value;

Note that, in case of relationship type and node label with the same name, the relationships will be distinguished by the suffix " (RELATIONSHIP)"

Table 1. Results
key value

"Movie"

{count: 9, relationships: {ACTED_IN: {count: 41, properties: {roles: {existence: FALSE, type: "LIST", array: TRUE}}, direction: "in", labels: ["Person"]}}, type: "node", properties: {tagline: {existence: FALSE, type: "STRING", indexed: FALSE, unique: FALSE}, title: {existence: FALSE, type: "STRING", indexed: FALSE, unique: FALSE}, released: {existence: FALSE, type: "INTEGER", indexed: FALSE, unique: FALSE}}, labels: []}

"ACTED_IN"

{count: 9, type: "relationship", properties: {roles: {existence: FALSE, type: "LIST", array: TRUE}}}

"Person"

{count: 2, relationships: {ACTED_IN: {count: 9, properties: {roles: {existence: FALSE, type: "LIST", array: TRUE}}, direction: "out", labels: ["Movie"]}}, type: "node", properties: {name: {existence: FALSE, type: "STRING", indexed: FALSE, unique: FALSE}, born: {existence: FALSE, type: "INTEGER", indexed: FALSE, unique: FALSE}}, labels: []}

"sameName (RELATIONSHIP)"

{"count":1,"type":"relationship","properties":{"alfa":{"existence":false,"type":"STRING","array":false}}}

"sameName"

{count: 2, relationships: {"sameName": {count: 1, properties: {alfa: {existence: false, type: "STRING", array: false}}, direction: "out", labels: ["sameName"]}}, type:"node", properties: {id: {existence: false,type: "INTEGER", indexed: false,unique: false}},labels: []}

Because the count stores return an incomplete picture of the data, we have to cross check the results with the actual data to filter out false positives.

We use a subset of the data to analyze by specifying the sample parameter (1000 by default).

Through this parameter, for each label we split data for each node-label into batches of (total / sample) ± rand where total is the total number of nodes with that label and rand is a number between 0 and total / sample / 10

So, we pick a percentage of nodes with that label of roughly sample / total * 100% to check against.

We pick the first node of each batch and we analyze the properties and the relationships.

For example, given the following graph:

CREATE (:Foo), (:Other)-[:REL_0]->(:Other), (:Other)-[:REL_1]->(:Other)<-[:REL_2 {baz: 'baa'}]-(:Other), (:Other {alpha: 'beta'}), (:Other {foo:'bar'})-[:REL_3]->(:Other)

Without sample parameter we receive:

CALL apoc.meta.schema()
YIELD value RETURN value["Other"] as value;
Table 2. Results
value
{
    "count": 8,
    "relationships": {
        "REL_2": {
            "count": 1,
            "properties": {
                "baz": {
                    "existence": false,
                    "type": "STRING",
                    "array": false
                }
            },
            "direction": "out",
            "labels": [
                "Other",
                "Other"
            ]
        },
        "REL_3": {
            "count": 1,
            "properties": {

            },
            "direction": "out",
            "labels": [
                "Other",
                "Other"
            ]
        },
        "REL_0": {
            "count": 1,
            "properties": {

            },
            "direction": "out",
            "labels": [
                "Other",
                "Other"
            ]
        },
        "REL_1": {
            "count": 1,
            "properties": {

            },
            "direction": "out",
            "labels": [
                "Other",
                "Other"
            ]
        }
    },
    "type": "node",
    "properties": {
        "alpha": {
            "existence": false,
            "type": "STRING",
            "indexed": false,
            "unique": false
        },
        "foo": {
            "existence": false,
            "type": "STRING",
            "indexed": false,
            "unique": false
        }
    },
    "labels": []
}

Otherwise, with sample: 2 we obtain (the result can change):

CALL apoc.meta.schema({sample: 2})
YIELD value RETURN value["Other"] as value
Table 3. Results
value
{
  "count": 8,
  "relationships": {
    "REL_1": {
      "count": 1,
      "properties": {},
      "direction": "out",
      "labels": [
        "Other",
        "Other"
      ]
    }
  },
  "type": "node",
  "properties": {
    "alpha": {
      "existence": false,
      "type": "STRING",
      "indexed": false,
      "unique": false
    }
  },
  "labels": []
}