Load GEXF (Graph Exchange XML Format)
Many existing applications and data integrations use GEXF to describes a graph with nodes and edges. For further information, you should visit the official documentation.
It is possible to load or import nodes and relationship from a GEXF file with the procedures
apoc.load.gexf
and apoc.import.gexf
. You need to:
-
provide a path to a GEXF file
-
provide configuration (optional)
The apoc.import.gexf
read as the apoc.load.gexf
but also create nodes and relationships in Neo4j.
For reading from files you’ll have to enable the config option:
apoc.import.file.enabled=true
By default file paths are global, for paths relative to the import
directory set:
apoc.import.file.use_neo4j_config=true
Examples for apoc.load.gexf
<?xml version="1.0" encoding="UTF-8"?> <gexf version="1.2"> <graph defaultedgetype="directed"> <nodes> <node foo="bar"> <attvalues> <attvalue for="0" value="http://gephi.org"/> </attvalues> </node> </nodes> </graph> </gexf>
CALL apoc.load.gexf('load.gexf')
value |
---|
{_type: gexf, _children: [{_type: graph, defaultedgetype: directed, _children: [{_type: nodes, _children: [{_type: node, _children: [{_type: attvalues, _children: [{_type: attvalue, for: 0, value: http://gephi.org}]}], foo: bar}]}]}], version: 1.2} |
Examples for apoc.import.gexf
Besides the file you can pass in a config map:
name | type | default | description |
---|---|---|---|
readLabels |
Boolean |
false |
Creates node labels based on the value in the |
defaultRelationshipType |
String |
RELATED |
The default relationship type to use if none is specified in the GraphML file |
storeNodeIds |
Boolean |
false |
store the |
batchSize |
Integer |
20000 |
The number of elements to process per transaction |
compression |
|
|
Allow taking binary data, either not compressed (value: |
source |
Map<String,String> |
Empty map |
See |
target |
Map<String,String> |
Empty map |
See |
With the following file will be created:
-
1 node with label Gephi
-
2 nodes with label Webatlas
-
1 node with label RTGI
-
1 node with label BarabasiLab
-
6 relationships of kind KNOWS
-
1 relationship of kind HAS_TICKET
-
1 relationship of kind BAZ
<?xml version="1.0" encoding="UTF-8"?> <gexf xmlns="http://gexf.net/1.3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://gexf.net/1.3 http://gexf.net/1.3/gexf.xsd" version="1.2"> <meta lastmodifieddate="2009-03-20"> <creator>Gephi.org</creator> <description>A Web network</description> </meta> <graph defaultedgetype="directed"> <attributes class="node"> <attribute id="0" title="url" type="string"/> <attribute id="room" title="room" type="integer"/> <attribute id="projects" title="projects" type="long"/> <attribute id="price" title="price" type="double"/> <attribute id="1" title="indegree" type="float"/> <attribute id="members" title="members" type="liststring"/> <attribute id="pins" title="pins" type="listboolean"/> <attribute id="2" title="frog" type="boolean"> <default>true</default> </attribute> </attributes> <attributes class="edge"> <attribute id="score" title="score" type="float"/> </attributes> <nodes> <node id="0" label="Gephi"> <attvalues> <attvalue for="0" value="http://gephi.org"/> <attvalue for="1" value="1"/> <attvalue for="room" value="10"/> <attvalue for="price" value="10.02"/> <attvalue for="projects" value="300"/> <attvalue for="members" value="[Altomare, Sterpeto, Lino]"/> <attvalue for="pins" value="[true, false, true, false]"/> </attvalues> </node> <node id="5" label="Gephi"> <attvalues> <attvalue for="0" value="http://test.gephi.org"/> <attvalue for="1" value="2"/> </attvalues> </node> <node id="1" label="Webatlas"> <attvalues> <attvalue for="0" value="http://webatlas.fr"/> <attvalue for="1" value="2"/> </attvalues> </node> <node id="2" label="RTGI"> <attvalues> <attvalue for="0" value="http://rtgi.fr"/> <attvalue for="1" value="1"/> </attvalues> </node> <node id="3" label=":BarabasiLab:Webatlas"> <attvalues> <attvalue for="0" value="http://barabasilab.com"/> <attvalue for="1" value="1"/> <attvalue for="2" value="false"/> </attvalues> </node> </nodes> <edges> <edge source="0" target="1" kind="KNOWS"> <attvalues> <attvalue for="score" value="1.5"/> </attvalues> </edge> <edge source="0" target="0" kind="BAZ"> <attvalues> <attvalue for="foo" value="bar"/> <attvalue for="score" value="2"/> </attvalues> </edge> <edge source="0" target="2" kind="HAS_TICKET"> <attvalues> <attvalue for="ajeje" value="brazorf"/> <attvalue for="score" value="3"/> </attvalues> </edge> <edge source="0" target="2" kind="KNOWS" /> <edge source="1" target="0" kind="KNOWS" /> <edge source="2" target="1" kind="KNOWS" /> <edge source="0" target="3" kind="KNOWS" /> <edge source="5" target="3" kind="KNOWS" /> </edges> </graph> </gexf>
CALL apoc.import.gexf('data.gexf', {readLabels:true})
value |
---|
{ "relationships" : 8, "batches" : 0, "file" : "file:/../data.gexf", "nodes" : 5, "format" : "gexf", "source" : "file", "time" : 9736, "rows" : 0, "batchSize" : -1, "done" : true, "properties" : 21 } |
We can also store the node IDs by executing:
CALL apoc.import.gexf('data.gexf', {readLabels:true, storeNodeIds: true})
source / target config
Allows the import of relations in case the source and / or target nodes are not present in the file, searching for nodes via a custom label and property.
To do this, we can insert into the config map source: {label: '<MY_SOURCE_LABEL>', id: ’<MY_SOURCE_ID>'
}` and/or source: {label: '<MY_TARGET_LABEL>', id: ’<MY_TARGET_ID>'
}`
In this way, we can search start and end nodes via the source and end attribute of edge
tag.
For example, with a config map {source: {id: 'myId', label: 'Foo'}, target: {id: 'other', label: 'Bar'}}
with a edge row like <edge id="e0" source="n0" target="n1" label="KNOWS"><data key="label">KNOWS</data></edge>
we search a source node (:Foo {myId: 'n0'})
and an end node (:Bar {other: 'n1'})
.
The id key is optional (the default is 'id'
).