Load GEXF (Graph Exchange XML Format)

Many existing applications and data integrations use GEXF to describes a graph with nodes and edges. For further information, you should visit the official documentation.

It is possible to load or import nodes and relationship from a GEXF file with the procedures apoc.load.gexf and apoc.import.gexf. You need to:

  • provide a path to a GEXF file

  • provide configuration (optional)

The apoc.import.gexf read as the apoc.load.gexf but also create nodes and relationships in Neo4j.

For reading from files you’ll have to enable the config option:

apoc.import.file.enabled=true

By default file paths are global, for paths relative to the import directory set:

apoc.import.file.use_neo4j_config=true

Examples for apoc.load.gexf

load.gexf
<?xml version="1.0" encoding="UTF-8"?>
<gexf version="1.2">
  <graph defaultedgetype="directed">
    <nodes>
      <node foo="bar">
        <attvalues>
          <attvalue for="0" value="http://gephi.org"/>
        </attvalues>
      </node>
    </nodes>
  </graph>
</gexf>
CALL apoc.load.gexf('load.gexf')
Table 1. Results
value

{_type: gexf, _children: [{_type: graph, defaultedgetype: directed, _children: [{_type: nodes, _children: [{_type: node, _children: [{_type: attvalues, _children: [{_type: attvalue, for: 0, value: http://gephi.org}]}], foo: bar}]}]}], version: 1.2}

Examples for apoc.import.gexf

Besides the file you can pass in a config map:

Table 2. Config parameters
name type default description

readLabels

Boolean

false

Creates node labels based on the value in the labels property of node elements

defaultRelationshipType

String

RELATED

The default relationship type to use if none is specified in the GraphML file

storeNodeIds

Boolean

false

store the id property of node elements

batchSize

Integer

20000

The number of elements to process per transaction

compression

Enum[NONE, BYTES, GZIP, BZIP2, DEFLATE, BLOCK_LZ4, FRAMED_SNAPPY]

null

Allow taking binary data, either not compressed (value: NONE) or compressed (other values)

source

Map<String,String>

Empty map

See source / target config parameter below

target

Map<String,String>

Empty map

See source / target config parameter below See the Binary file example

With the following file will be created:

  • 1 node with label Gephi

  • 2 nodes with label Webatlas

  • 1 node with label RTGI

  • 1 node with label BarabasiLab

  • 6 relationships of kind KNOWS

  • 1 relationship of kind HAS_TICKET

  • 1 relationship of kind BAZ

data.gexf
<?xml version="1.0" encoding="UTF-8"?>
<gexf xmlns="http://gexf.net/1.3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://gexf.net/1.3 http://gexf.net/1.3/gexf.xsd" version="1.2">
  <meta lastmodifieddate="2009-03-20">
    <creator>Gephi.org</creator>
    <description>A Web network</description>
  </meta>
  <graph defaultedgetype="directed">
    <attributes class="node">
      <attribute id="0" title="url" type="string"/>
      <attribute id="room" title="room" type="integer"/>
      <attribute id="projects" title="projects" type="long"/>
      <attribute id="price" title="price" type="double"/>
      <attribute id="1" title="indegree" type="float"/>
      <attribute id="members" title="members" type="liststring"/>
      <attribute id="pins" title="pins" type="listboolean"/>
      <attribute id="2" title="frog" type="boolean">
        <default>true</default>
      </attribute>
    </attributes>
    <attributes class="edge">
      <attribute id="score" title="score" type="float"/>
    </attributes>
    <nodes>
      <node id="0" label="Gephi">
        <attvalues>
          <attvalue for="0" value="http://gephi.org"/>
          <attvalue for="1" value="1"/>
          <attvalue for="room" value="10"/>
          <attvalue for="price" value="10.02"/>
          <attvalue for="projects" value="300"/>
          <attvalue for="members" value="[Altomare, Sterpeto, Lino]"/>
          <attvalue for="pins" value="[true, false, true, false]"/>
        </attvalues>
      </node>
       <node id="5" label="Gephi">
          <attvalues>
            <attvalue for="0" value="http://test.gephi.org"/>
            <attvalue for="1" value="2"/>
          </attvalues>
        </node>
      <node id="1" label="Webatlas">
        <attvalues>
          <attvalue for="0" value="http://webatlas.fr"/>
          <attvalue for="1" value="2"/>
        </attvalues>
      </node>
      <node id="2" label="RTGI">
        <attvalues>
          <attvalue for="0" value="http://rtgi.fr"/>
          <attvalue for="1" value="1"/>
        </attvalues>
      </node>
      <node id="3" label=":BarabasiLab:Webatlas">
        <attvalues>
          <attvalue for="0" value="http://barabasilab.com"/>
          <attvalue for="1" value="1"/>
          <attvalue for="2" value="false"/>
        </attvalues>
      </node>
    </nodes>
    <edges>
      <edge source="0" target="1" kind="KNOWS">
          <attvalues>
            <attvalue for="score" value="1.5"/>
          </attvalues>
      </edge>
      <edge source="0" target="0" kind="BAZ">
          <attvalues>
            <attvalue for="foo" value="bar"/>
            <attvalue for="score" value="2"/>
          </attvalues>
      </edge>
      <edge source="0" target="2" kind="HAS_TICKET">
          <attvalues>
            <attvalue for="ajeje" value="brazorf"/>
            <attvalue for="score" value="3"/>
          </attvalues>
      </edge>
      <edge source="0" target="2" kind="KNOWS" />
      <edge source="1" target="0" kind="KNOWS" />
      <edge source="2" target="1" kind="KNOWS" />
      <edge source="0" target="3" kind="KNOWS" />
      <edge source="5" target="3" kind="KNOWS" />
    </edges>
  </graph>
</gexf>
CALL apoc.import.gexf('data.gexf', {readLabels:true})
Table 3. Results
value

{ "relationships" : 8, "batches" : 0, "file" : "file:/../data.gexf", "nodes" : 5, "format" : "gexf", "source" : "file", "time" : 9736, "rows" : 0, "batchSize" : -1, "done" : true, "properties" : 21 }

We can also store the node IDs by executing:

CALL apoc.import.gexf('data.gexf', {readLabels:true, storeNodeIds: true})

source / target config

Allows the import of relations in case the source and / or target nodes are not present in the file, searching for nodes via a custom label and property. To do this, we can insert into the config map source: {label: '<MY_SOURCE_LABEL>', id: ’<MY_SOURCE_ID>'}` and/or source: {label: '<MY_TARGET_LABEL>', id: ’<MY_TARGET_ID>'}` In this way, we can search start and end nodes via the source and end attribute of edge tag.

For example, with a config map {source: {id: 'myId', label: 'Foo'}, target: {id: 'other', label: 'Bar'}} with a edge row like <edge id="e0" source="n0" target="n1" label="KNOWS"><data key="label">KNOWS</data></edge> we search a source node (:Foo {myId: 'n0'}) and an end node (:Bar {other: 'n1'}). The id key is optional (the default is 'id').