apoc.import.graphml
Procedure APOC Core
apoc.import.graphml(urlOrBinaryFile,config) - imports graphml file
Signature
apoc.import.graphml(urlOrBinaryFile :: ANY?, config :: MAP?) :: (file :: STRING?, source :: STRING?, format :: STRING?, nodes :: INTEGER?, relationships :: INTEGER?, properties :: INTEGER?, time :: INTEGER?, rows :: INTEGER?, batchSize :: INTEGER?, batches :: INTEGER?, done :: BOOLEAN?, data :: STRING?)
Config parameters
The procedure supports the following config parameters:
name | type | default | description |
---|---|---|---|
readLabels | Boolean | false | Creates node labels based on the value in the labels property of node elements |
defaultRelationshipType | String | RELATED | The default relationship type to use if none is specified in the GraphML file |
storeNodeIds | Boolean | false | Stores the id property of node elements |
batchSize | Integer | 20000 | The number of elements to process per transaction |
compression | | | Allows taking binary data, either not compressed (value: NONE) or compressed (other values). See the Binary file example |
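As a minimal sketch of how these options combine, the call below imports the movies.graphml file used in the usage examples on this page, creating native labels, keeping the GraphML node ids, and committing in smaller batches. The batch size of 5000 and the fallback type CONNECTED_TO are illustrative values only, not recommendations.

```cypher
// Sketch: pass several config options in one call.
// Assumes movies.graphml is in the import directory and file import is enabled.
CALL apoc.import.graphml("movies.graphml", {
  readLabels: true,                         // create native labels from the labels data
  storeNodeIds: true,                       // keep the GraphML node ids on the imported nodes
  defaultRelationshipType: "CONNECTED_TO",  // illustrative fallback for edges without a type
  batchSize: 5000                           // illustrative number of elements per transaction
})
```

Any option left out of the map keeps the default listed in the table.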
Output parameters
Name | Type |
---|---|
file | STRING? |
source | STRING? |
format | STRING? |
nodes | INTEGER? |
relationships | INTEGER? |
properties | INTEGER? |
time | INTEGER? |
rows | INTEGER? |
batchSize | INTEGER? |
batches | INTEGER? |
done | BOOLEAN? |
data | STRING? |
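Because the procedure returns a single row with these columns, YIELD can be used to keep only the ones of interest. A short sketch, assuming the movies.graphml file from the usage examples is available in the import directory:

```cypher
// Sketch: return only a subset of the output columns.
CALL apoc.import.graphml("movies.graphml", {})
YIELD file, nodes, relationships, properties, time, done
RETURN file, nodes, relationships, properties, time, done
```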
Reading from a file
By default, importing from the file system is disabled.
We can enable it by setting the following property in apoc.conf:
apoc.import.file.enabled=true
If we try to use any of the import procedures without having first set this property, we’ll get the following error message:
Failed to invoke procedure: Caused by: java.lang.RuntimeException: Import from files not enabled, please set apoc.import.file.enabled=true in your apoc.conf
Import files are read from the import directory, which is defined by the dbms.directories.import property.
This means that any file path that we provide is relative to this directory.
If we try to read from an absolute path, such as /tmp/filename, we’ll get an error message similar to the following one:
Failed to invoke procedure: Caused by: java.lang.RuntimeException: Can’t read url or key file:/path/to/neo4j/import/tmp/filename as json: /path/to/neo4j//import/tmp/filename (No such file or directory)
We can enable reading files from anywhere on the file system by setting the following property in apoc.conf:
apoc.import.file.use_neo4j_config=false
Neo4j will now be able to read from anywhere on the file system, so be sure that this is your intention before setting this property.
Usage Examples
Import simple GraphML file
The simple.graphml file contains a graph representation from the GraphML primer.
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns
http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<graph id="G" edgedefault="undirected">
<node id="n0"/>
<node id="n1"/>
<node id="n2"/>
<node id="n3"/>
<node id="n4"/>
<node id="n5"/>
<node id="n6"/>
<node id="n7"/>
<node id="n8"/>
<node id="n9"/>
<node id="n10"/>
<edge source="n0" target="n2"/>
<edge source="n1" target="n2"/>
<edge source="n2" target="n3"/>
<edge source="n3" target="n5"/>
<edge source="n3" target="n4"/>
<edge source="n4" target="n6"/>
<edge source="n6" target="n5"/>
<edge source="n5" target="n7"/>
<edge source="n6" target="n8"/>
<edge source="n8" target="n7"/>
<edge source="n8" target="n9"/>
<edge source="n8" target="n10"/>
</graph>
</graphml>
The following imports a graph based on simple.graphml:
CALL apoc.import.graphml("http://graphml.graphdrawing.org/primer/simple.graphml", {})
If we run this query, we’ll see the following output:
file | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data |
---|---|---|---|---|---|---|---|---|---|---|---|
"http://graphml.graphdrawing.org/primer/simple.graphml" | "file" | "graphml" | 11 | 12 | 0 | 618 | 0 | -1 | 0 | TRUE | NULL |
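As a quick sanity check, the reported counts can be compared with what is actually stored. This sketch assumes the database was empty before the import; otherwise pre-existing data is counted as well.

```cypher
// Sketch: count all nodes and relationships currently in the database.
MATCH (n)
WITH count(n) AS nodes
OPTIONAL MATCH ()-[r]->()
RETURN nodes, count(r) AS relationships
```

On a previously empty database, the two values should match the nodes and relationships columns reported by the import.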
We could also copy simple.graphml into Neo4j’s import directory, and import the file from there.
We can then run the import procedure in the following way:
CALL apoc.import.graphml("file://simple.graphml", {})
The Neo4j Browser visualization below shows the imported graph:
Import GraphML file created by Export GraphML procedures
movies.graphml contains a subset of Neo4j’s movies graph, and was generated by the Export GraphML procedure.
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<key id="born" for="node" attr.name="born"/>
<key id="name" for="node" attr.name="name"/>
<key id="tagline" for="node" attr.name="tagline"/>
<key id="label" for="node" attr.name="label"/>
<key id="title" for="node" attr.name="title"/>
<key id="released" for="node" attr.name="released"/>
<key id="roles" for="edge" attr.name="roles"/>
<key id="label" for="edge" attr.name="label"/>
<graph id="G" edgedefault="directed">
<node id="n188" labels=":Movie"><data key="labels">:Movie</data><data key="title">The Matrix</data><data key="tagline">Welcome to the Real World</data><data key="released">1999</data></node>
<node id="n189" labels=":Person"><data key="labels">:Person</data><data key="born">1964</data><data key="name">Keanu Reeves</data></node>
<node id="n190" labels=":Person"><data key="labels">:Person</data><data key="born">1967</data><data key="name">Carrie-Anne Moss</data></node>
<node id="n191" labels=":Person"><data key="labels">:Person</data><data key="born">1961</data><data key="name">Laurence Fishburne</data></node>
<node id="n192" labels=":Person"><data key="labels">:Person</data><data key="born">1960</data><data key="name">Hugo Weaving</data></node>
<node id="n193" labels=":Person"><data key="labels">:Person</data><data key="born">1967</data><data key="name">Lilly Wachowski</data></node>
<node id="n194" labels=":Person"><data key="labels">:Person</data><data key="born">1965</data><data key="name">Lana Wachowski</data></node>
<node id="n195" labels=":Person"><data key="labels">:Person</data><data key="born">1952</data><data key="name">Joel Silver</data></node>
<edge id="e267" source="n189" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Neo"]</data></edge>
<edge id="e268" source="n190" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Trinity"]</data></edge>
<edge id="e269" source="n191" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Morpheus"]</data></edge>
<edge id="e270" source="n192" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Agent Smith"]</data></edge>
<edge id="e271" source="n193" target="n188" label="DIRECTED"><data key="label">DIRECTED</data></edge>
<edge id="e272" source="n194" target="n188" label="DIRECTED"><data key="label">DIRECTED</data></edge>
<edge id="e273" source="n195" target="n188" label="PRODUCED"><data key="label">PRODUCED</data></edge>
</graph>
</graphml>
The following imports a graph based on movies.graphml:
CALL apoc.import.graphml("movies.graphml", {})
If we run this query, we’ll see the following output:
file | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data |
---|---|---|---|---|---|---|---|---|---|---|---|
"movies.graphml" | "file" | "graphml" | 8 | 7 | 36 | 23 | 0 | -1 | 0 | TRUE | NULL |
We can run the following query to see the imported graph:
MATCH p=()-->()
RETURN p
p |
---|
({name: "Laurence Fishburne", born: "1961", labels: ":Person"})-[:ACTED_IN {roles: "[\"Morpheus\"]", label: "ACTED_IN"}]→({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"}) |
({name: "Carrie-Anne Moss", born: "1967", labels: ":Person"})-[:ACTED_IN {roles: "[\"Trinity\"]", label: "ACTED_IN"}]→({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"}) |
({name: "Lana Wachowski", born: "1965", labels: ":Person"})-[:DIRECTED {label: "DIRECTED"}]→({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"}) |
({name: "Joel Silver", born: "1952", labels: ":Person"})-[:PRODUCED {label: "PRODUCED"}]→({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"}) |
({name: "Lilly Wachowski", born: "1967", labels: ":Person"})-[:DIRECTED {label: "DIRECTED"}]→({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"}) |
({name: "Keanu Reeves", born: "1964", labels: ":Person"})-[:ACTED_IN {roles: "[\"Neo\"]", label: "ACTED_IN"}]→({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"}) |
({name: "Hugo Weaving", born: "1960", labels: ":Person"})-[:ACTED_IN {roles: "[\"Agent Smith\"]", label: "ACTED_IN"}]→({tagline: "Welcome to the Real World", title: "The Matrix", released: "1999", labels: ":Movie"}) |
The labels defined in the GraphML file have been added to the labels property on each node, rather than being added as node labels.
We can set the config property readLabels: true to import native labels.
The following imports a graph based on movies.graphml and stores node labels:
CALL apoc.import.graphml("movies.graphml", {readLabels: true})
file | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data |
---|---|---|---|---|---|---|---|---|---|---|---|
"movies.graphml" | "file" | "graphml" | 8 | 7 | 21 | 23 | 0 | -1 | 0 | TRUE | NULL |
And now let’s re-run the query to see the imported graph:
MATCH p=()-->()
RETURN p;
p |
---|
(:Person {name: "Lilly Wachowski", born: "1967"})-[:DIRECTED]→(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"}) |
(:Person {name: "Carrie-Anne Moss", born: "1967"})-[:ACTED_IN {roles: "[\"Trinity\"]"}]→(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"}) |
(:Person {name: "Hugo Weaving", born: "1960"})-[:ACTED_IN {roles: "[\"Agent Smith\"]"}]→(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"}) |
(:Person {name: "Laurence Fishburne", born: "1961"})-[:ACTED_IN {roles: "[\"Morpheus\"]"}]→(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"}) |
(:Person {name: "Keanu Reeves", born: "1964"})-[:ACTED_IN {roles: "[\"Neo\"]"}]→(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"}) |
(:Person {name: "Joel Silver", born: "1952"})-[:PRODUCED]→(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"}) |
(:Person {name: "Lana Wachowski", born: "1965"})-[:DIRECTED]→(:Movie {tagline: "Welcome to the Real World", title: "The Matrix", released: "1999"}) |
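In the output above, born and released are stored as strings such as "1964" and "1999". If numeric values are needed, one option is to convert them after the import. A minimal sketch using Cypher's built-in toInteger function, assuming the import with readLabels: true shown above has already run:

```cypher
// Sketch: convert string properties to integers after the import.
// Run these as two separate statements.
MATCH (p:Person)
SET p.born = toInteger(p.born);

MATCH (m:Movie)
SET m.released = toInteger(m.released);
```

toInteger returns null for values it cannot parse, so it is worth inspecting the data first if the file may contain non-numeric entries.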
Binary file
You can also import a file from a binary byte[] (not compressed) or a compressed file (allowed compression algorithms are: GZIP, BZIP2, DEFLATE, BLOCK_LZ4, FRAMED_SNAPPY).
CALL apoc.import.graphml(`binaryGzipByteArray`, {compression: 'GZIP'})
or:
CALL apoc.import.graphml(`binaryFileNotCompressed`, {compression: 'NONE'})
For example, this works well with the apoc.util.compress function:
WITH apoc.util.compress('<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<graph id="G" edgedefault="directed">
<node id="n0"> <data key="labels">:FOO</data><data key="name">foo</data> </node>
<node id="n1"> <data key="labels">:BAR</data><data key="name">bar</data> <data key="kids">[a,b,c]</data> </node>
<edge id="e0" source="n0" target="n1"> <data key="label">:EDGE_LABEL</data> <data key="name">foo</data> </edge>
</graph>
</graphml>', {compression: 'DEFLATE'}) as xmlCompressed
CALL apoc.import.graphml(xmlCompressed, {compression: 'DEFLATE'})
YIELD source, format, nodes, relationships, properties
RETURN source, format, nodes, relationships, properties
source | format | nodes | relationships | properties |
---|---|---|---|---|
"binary" | "graphml" | 2 | 1 | 7 |