apoc.import.xml
Procedure APOC Core
apoc.import.xml(file,config) - imports graph from provided file
Reading from a file
By default importing from the file system is disabled.
We can enable it by setting the following property in apoc.conf:
apoc.import.file.enabled=true
If we try to use any of the import procedures without having first set this property, we’ll get the following error message:
Failed to invoke procedure: Caused by: java.lang.RuntimeException: Import from files not enabled, please set apoc.import.file.enabled=true in your apoc.conf
Import files are read from the import directory, which is defined by the dbms.directories.import property.
This means that any file path that we provide is relative to this directory.
If we try to read from an absolute path, such as /tmp/filename, we’ll get an error message similar to the following one:
Failed to invoke procedure: Caused by: java.lang.RuntimeException: Can’t read url or key file:/path/to/neo4j/import/tmp/filename as json: /path/to/neo4j//import/tmp/filename (No such file or directory)
We can enable reading files from anywhere on the file system by setting the following property in apoc.conf:
apoc.import.file.use_neo4j_config=false
Neo4j will now be able to read from anywhere on the file system, so be sure that this is your intention before setting this property.
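The path handling described above can be modelled with a short Python sketch. It is a simplified illustration of the behaviour implied by the error message, not APOC's actual implementation; the function name `resolve_import_path` and its parameters are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_import_path(import_dir: str, requested: str,
                        use_neo4j_config: bool = True) -> str:
    """Hypothetical model of how a requested file path is resolved."""
    if not use_neo4j_config:
        # Assumption: with apoc.import.file.use_neo4j_config=false
        # the path is used as given.
        return str(PurePosixPath(requested))
    # Default setting: even an absolute path is re-rooted under the
    # import directory, which is why /tmp/filename is not found.
    return str(PurePosixPath(import_dir) / requested.lstrip("/"))


import_dir = "/path/to/neo4j/import"
print(resolve_import_path(import_dir, "books.xml"))
# /path/to/neo4j/import/books.xml
print(resolve_import_path(import_dir, "/tmp/filename"))
# /path/to/neo4j/import/tmp/filename
print(resolve_import_path(import_dir, "/tmp/filename", use_neo4j_config=False))
# /tmp/filename
```

The second call reproduces the path that appears in the error message: the absolute path is treated as relative to the import directory.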
Usage Examples
The examples in this section are based on the Microsoft books.xml file.
<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications
      with XML.</description>
   </book>
   <book id="bk102">
      <author>Ralls, Kim</author>
      <title>Midnight Rain</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies,
...
This file can be downloaded from GitHub.
We can write the following query to create a graph structure of the Microsoft books XML file.
CALL apoc.import.xml(
  "https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.1/src/test/resources/xml/books.xml",
  {relType:'NEXT_WORD', label:'XmlWord'}
)
YIELD node
RETURN node;
node |
---|
(:XmlDocument {_xmlVersion: "1.0", _xmlEncoding: "UTF-8", url: "https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.1/src/test/resources/xml/books.xml"}) |
The Neo4j Browser visualization below shows the imported graph:
Binary file
You can also import a file from a binary byte[] (not compressed) or a compressed file (allowed compression algorithms are: GZIP, BZIP2, DEFLATE, BLOCK_LZ4, FRAMED_SNAPPY).
CALL apoc.import.xml(`binaryGzipByteArray`, {compression: 'GZIP'})
or:
CALL apoc.import.xml(`binaryFileNotCompressed`, {compression: 'NONE'})
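A byte array such as the GZIP one above has to be produced somewhere before it reaches the procedure. As a minimal sketch, assuming the XML is available as a string in client code, the Python standard library can build such a payload; the function name `gzip_xml` and the variable `payload` are hypothetical.

```python
import gzip


def gzip_xml(xml_text: str) -> bytes:
    """Encode XML text as UTF-8 and GZIP-compress it into a byte string."""
    return gzip.compress(xml_text.encode("utf-8"))


xml_text = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<parent name="databases">'
    '<child name="Neo4j">Neo4j is a graph database</child>'
    '</parent>'
)
payload = gzip_xml(xml_text)

# Round-trip check: decompressing must give back the original document.
assert gzip.decompress(payload).decode("utf-8") == xml_text
print(len(payload), "compressed bytes")
```

With a Neo4j driver, `payload` would typically be sent as a query parameter (for example a hypothetical `$data`) to a call of the form apoc.import.xml($data, {compression: 'GZIP'}). Whether a given driver maps `bytes` to a byte array is an assumption to check against that driver's documentation.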
For example, the procedure works well with the apoc.util.compress function:
WITH apoc.util.compress('<?xml version="1.0" encoding="UTF-8"?>
<parent name="databases">
<child name="Neo4j">
Neo4j is a graph database
</child>
<child name="relational">
<grandchild name="MySQL"><![CDATA[
MySQL is a database & relational
]]>
</grandchild>
<grandchild name="Postgres">
Postgres is a relational database
</grandchild>
</child>
</parent>', {compression: 'DEFLATE'}) as xmlCompressed
CALL apoc.import.xml(xmlCompressed, {compression: 'DEFLATE'})
YIELD node
RETURN node
node |
---|
{ "identity": 11, "labels": [ "XmlDocument" ], "properties": { "_xmlEncoding": "UTF-8", "_xmlVersion": "1.0" } } |