Load XML
Many existing enterprise applications, endpoints, and files use XML as a data exchange format. The Load XML procedures allow us to process these files.
Procedure and Function Overview
The table below describes the available procedures and functions:
Qualified Name | Type | Release |
---|---|---|
apoc.load.xml - loads XML from a URL (e.g. a web API) and imports it as a single nested map with attributes and _type, _text, and _children fields | Procedure | |
apoc.xml.parse | Function | |
apoc.import.xml - imports a graph from the provided file | Procedure | |
apoc.load.xml
This procedure takes a file or HTTP URL and parses the XML into a map data structure.
signature |
---|
|
The map is created using the following rules:
- In simple mode, each type of child element has its own entry in the parent map.
- The element type used as the key is prefixed with _ to prevent collisions with attributes.
- If there is a single element, the entry has that element as its value, not a collection.
- If there is more than one element, the entry has a list of values.
- Each child still has its _type field to tell them apart.
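One way to see these rules in action is to load the same element twice, once in the default mode and once in simple mode, and compare the two maps side by side. The query below is a minimal sketch that reuses the GitHub-hosted books.xml URL and the four-argument call form that appear elsewhere on this page; combining an XPath expression with simple mode in one call is an assumption made here for brevity.

```cypher
// Same source document and XPath for both calls; only the 4th argument differs.
WITH "https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml" AS uri
CALL apoc.load.xml(uri, '/catalog/book[@id="bk101"]', {}, false)
YIELD value AS defaultMode   // nested elements are listed under _children
CALL apoc.load.xml(uri, '/catalog/book[@id="bk101"]', {}, true)
YIELD value AS simpleMode    // nested elements are listed under a key prefixed with _
RETURN defaultMode, simpleMode;
```

Comparing the keys of the two returned maps shows which accessors a follow-up query has to use in each mode.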
This procedure supports the following config parameters:
name | type | default | description |
---|---|---|---|
failOnError | boolean | true | fail if error encountered while parsing XML |
headers | Map | {} | HTTP headers to be used when querying XML document |
compression | | | Allow taking binary data, either not compressed (value: |
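The config map is passed as the third argument of the procedure. The following is a minimal sketch of a call that sets both headers and failOnError; the endpoint URL and the token value are placeholders invented for illustration, not part of APOC or of any real service.

```cypher
// Hypothetical endpoint and token, shown only to illustrate the shape of the config map.
WITH "https://example.com/api/books.xml" AS uri
CALL apoc.load.xml(uri, '', {
  headers: {Authorization: "Bearer <token>"},  // HTTP headers sent with the request
  failOnError: false                           // do not fail the query if parsing hits an error
})
YIELD value
RETURN value;
```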
apoc.xml.parse
If our dataset contains nodes with XML as property values, they can be parsed into maps with the apoc.xml.parse function.
signature |
---|
|
This function supports the following config parameter:
name | type | default | description |
---|---|---|---|
failOnError | boolean | true | fail if error encountered while parsing XML |
WITH '<?xml version="1.0"?><table><tr><td><img src="pix/logo-tl.gif"></img></td></tr></table>' AS xmlString
RETURN apoc.xml.parse(xmlString) AS value
value |
---|
{_type: "table", _children: [{_type: "tr", _children: [{_type: "td", _children: [{_type: "img", src: "pix/logo-tl.gif"}]}]}]} |
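The same function can be applied to XML that is stored on nodes. The query below is a sketch under an assumed data model: the Document label and the id and xml properties are hypothetical names chosen for this example.

```cypher
// Assumed model: (:Document) nodes that keep raw XML in a string property called xml.
MATCH (d:Document)
WHERE d.xml IS NOT NULL
WITH d, apoc.xml.parse(d.xml) AS parsed
RETURN d.id AS documentId,
       parsed._type AS rootElement,           // name of the root XML element
       size(parsed._children) AS childCount;  // null when the root has no child elements
```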
apoc.import.xml
If we don’t want to do any transformation of the XML before creating a graph structure, we can create a 1:1 mapping of XML into the graph using the apoc.import.xml procedure.
signature |
---|
|
This procedure returns a node representing the XML document; the nodes and relationships beneath it map to the XML structure.
The following mapping rules are applied:
xml | label | properties |
---|---|---|
document | XmlDocument | _xmlVersion, _xmlEncoding |
processing instruction | XmlProcessingInstruction | _piData, _piTarget |
Element/Tag | XmlTag | _name |
Attribute | n/a | property in the XmlTag node |
Text | XmlWord | for each word a separate node is created |
The nodes for the XML document are connected:
relationship type | description |
---|---|
:IS_CHILD_OF | pointing to a nested xml element |
:FIRST_CHILD_OF | pointing to the first child |
:NEXT_SIBLING | pointing to the next xml element on the same nesting level |
:NEXT | produces a linear chain through the full document |
:NEXT_WORD | only produced if config map has |
This procedure supports the following config parameters:
config option | default value | description |
---|---|---|
connectCharacters | false | if |
filterLeadingWhitespace | false | if |
delimiter | | if given, split text elements with the delimiter into separate nodes |
label | XmlCharacter | label to use for text element representation |
relType | | relationship type to be used for connecting the text elements into one linked list |
charactersForTag | {} | map of tagname → string. For the given tag names an additional text element is added containing the value as |
Importing from a file
By default importing from the file system is disabled.
We can enable it by setting the following property in apoc.conf:
apoc.import.file.enabled=true
If we try to use any of the import procedures without having first set this property, we’ll get the following error message:
Failed to invoke procedure: Caused by: java.lang.RuntimeException: Import from files not enabled, please set apoc.import.file.enabled=true in your apoc.conf
Import files are read from the import directory, which is defined by the dbms.directories.import property.
This means that any file path that we provide is relative to this directory.
If we try to read from an absolute path, such as /tmp/filename, we’ll get an error message similar to the following one:
Failed to invoke procedure: Caused by: java.lang.RuntimeException: Can’t read url or key file:/path/to/neo4j/import/tmp/filename as json: /path/to/neo4j//import/tmp/filename (No such file or directory)
We can enable reading files from anywhere on the file system by setting the following property in apoc.conf:
apoc.import.file.use_neo4j_config=false
Neo4j will now be able to read from anywhere on the file system, so be sure that this is your intention before setting this property.
Examples
The examples in this section are based on the Microsoft books.xml file.
<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect battles corporate zombies,
...
This file can be downloaded from GitHub.
Import from local file
The books.xml file described below contains the first two books from the Microsoft Books XML file.
We’ll use the smaller file in this section to simplify our examples.
<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<author>Arciniegas, Fabio</author>
<title>XML Developer's Guide</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at creating applications
with XML.</description>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect battles corporate zombies,
an evil sorceress, and her own childhood to become queen
of the world.</description>
</book>
</catalog>
We’ll place this file into the import directory of our Neo4j instance.
Let’s now write a query using the apoc.load.xml procedure to explore this file.
The following query processes books.xml and returns the content as Cypher data structures:
CALL apoc.load.xml("file:///books.xml")
YIELD value
RETURN value
value |
---|
{_type: "catalog", _children: [{_type: "book", _children: [{_type: "author", _text: "Gambardella, Matthew"}, {_type: "author", _text: "Arciniegas, Fabio"}, {_type: "title", _text: "XML Developer’s Guide"}, {_type: "genre", _text: "Computer"}, {_type: "price", _text: "44.95"}, {_type: "publish_date", _text: "2000-10-01"}, {_type: "description", _text: "An in-depth look at creating applications with XML."}], id: "bk101"}, {_type: "book", _children: [{_type: "author", _text: "Ralls, Kim"}, {_type: "title", _text: "Midnight Rain"}, {_type: "genre", _text: "Fantasy"}, {_type: "price", _text: "5.95"}, {_type: "publish_date", _text: "2000-12-16"}, {_type: "description", _text: "A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world."}], id: "bk102"}]} |
We get back a map representing the XML structure.
Every time an XML element is nested inside another one, it is accessible via the _children property.
We can write the following query to get a better understanding of what our file contains.
The following query processes books.xml and parses the results to pull out the title, description, genre, and authors:
CALL apoc.load.xml("file:///books.xml")
YIELD value
UNWIND value._children AS book
RETURN book.id AS bookId,
[item in book._children WHERE item._type = "title"][0] AS title,
[item in book._children WHERE item._type = "description"][0] AS description,
[item in book._children WHERE item._type = "author"] AS authors,
[item in book._children WHERE item._type = "genre"][0] AS genre;
bookId | title | description | authors | genre |
---|---|---|---|---|
"bk101" | {_type: "title", _text: "XML Developer’s Guide"} | {_type: "description", _text: "An in-depth look at creating applications with XML."} | [{_type: "author", _text: "Gambardella, Matthew"}, {_type: "author", _text: "Arciniegas, Fabio"}] | {_type: "genre", _text: "Computer"} |
"bk102" | {_type: "title", _text: "Midnight Rain"} | {_type: "description", _text: "A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world."} | [{_type: "author", _text: "Ralls, Kim"}] | {_type: "genre", _text: "Fantasy"} |
Let’s now create a graph of books and their metadata, authors, and genres.
The following query processes books.xml, parses the results to pull out the title, description, genre, and authors, and merges them into the graph:
CALL apoc.load.xml("file:///books.xml")
YIELD value
UNWIND value._children AS book
WITH book.id AS bookId,
[item in book._children WHERE item._type = "title"][0] AS title,
[item in book._children WHERE item._type = "description"][0] AS description,
[item in book._children WHERE item._type = "author"] AS authors,
[item in book._children WHERE item._type = "genre"][0] AS genre
MERGE (b:Book {id: bookId})
SET b.title = title._text, b.description = description._text
MERGE (g:Genre {name: genre._text})
MERGE (b)-[:HAS_GENRE]->(g)
WITH b, authors
UNWIND authors AS author
MERGE (a:Author {name:author._text})
MERGE (a)-[:WROTE]->(b);
The Neo4j Browser visualization below shows the imported graph:
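Every _text value returned by apoc.load.xml in the results above is a string, so numeric and date fields need an explicit conversion before they are stored. The following sketch extends the import with the price and publish_date elements; the property names price and publishDate are choices made for this example, not something the procedure prescribes.

```cypher
CALL apoc.load.xml("file:///books.xml")
YIELD value
UNWIND value._children AS book
WITH book.id AS bookId,
     [item IN book._children WHERE item._type = "price"][0] AS price,
     [item IN book._children WHERE item._type = "publish_date"][0] AS publishDate
MERGE (b:Book {id: bookId})
SET b.price = toFloat(price._text),          // string "44.95" becomes a float
    b.publishDate = date(publishDate._text); // string "2000-10-01" becomes a Date value
```

Storing typed values means later queries can compare prices numerically and filter by date without converting on every read.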
You can use the failOnError configuration to handle the result when the URL or the XML is incorrect.
For example, with the help of the apoc.do.when procedure, you can return nothingToDo as the result when the URL is incorrect:
CALL apoc.load.xml("MY_XML_URL", '', {failOnError:false})
YIELD value
WITH value as valueXml
call apoc.do.when(valueXml["_type"] is null, "return 'nothingToDo' as result", "return valueXml as result", {valueXml: valueXml})
YIELD value
UNWIND value["result"] as result
RETURN result
Import from GitHub
We can also process XML files from HTTP or HTTPS URIs.
Let’s start by processing the books.xml file hosted on GitHub.
This time we’ll pass in true as the 4th argument of the procedure.
This means that the XML will be parsed in simple mode.
WITH "https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml" AS uri
CALL apoc.load.xml(uri, '', {}, true)
YIELD value
RETURN value;
value |
---|
{_type: "catalog", _catalog: [{_type: "book", _book: [{_type: "author", _text: "Gambardella, Matthew"}, {_type: "author", _text: "Arciniegas, Fabio"}, {_type: "title", _text: "XML Developer’s Guide"}, {_type: "genre", _text: "Computer"}, {_type: "price", _text: "44.95"}, {_type: "publish_date", _text: "2000-10-01"}, {_type: "description", _text: "An in-depth look at creating applications with XML."}], id: "bk101"}, {_type: "book", _book: [{_type: "author", _text: "Ralls, Kim"}, {_type: "title", _text: "Midnight Rain"}, {_type: "genre", _text: "Fantasy"}, {_type: "price", _text: "5.95"}, {_type: "publish_date", _text: "2000-12-16"}, {_type: "description", _text: "A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world."}], id: "bk102"}, {_type: "book", _book: [{_type: "author", _text: "Corets, Eva"}, {_type: "title", _text: "Maeve Ascendant"}, {_type: "genre", _text: "Fantasy"}, {_type: "price", _text: "5.95"}, {_type: "publish_date", _text: "2000-11-17"}, {_type: "description", _text: "After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society."}], id: "bk103"}, {_type: "book", _book: [{_type: "author", _text: "Corets, Eva"}, {_type: "title", _text: "Oberon’s Legacy"}, {_type: "genre", _text: "Fantasy"}, {_type: "price", _text: "5.95"}, {_type: "publish_date", _text: "2001-03-10"}, {_type: "description", _text: "In post-apocalypse England, the mysterious agent known only as Oberon helps to create a new life for the inhabitants of London. Sequel to Maeve Ascendant."}], id: "bk104"}, {_type: "book", _book: [{_type: "author", _text: "Corets, Eva"}, {_type: "title", _text: "The Sundered Grail"}, {_type: "genre", _text: "Fantasy"}, {_type: "price", _text: "5.95"}, {_type: "publish_date", _text: "2001-09-10"}, {_type: "description", _text: "The two daughters of Maeve, half-sisters, battle one another for control of England. 
Sequel to Oberon’s Legacy."}], id: "bk105"}, {_type: "book", _book: [{_type: "author", _text: "Randall, Cynthia"}, {_type: "title", _text: "Lover Birds"}, {_type: "genre", _text: "Romance"}, {_type: "price", _text: "4.95"}, {_type: "publish_date", _text: "2000-09-02"}, {_type: "description", _text: "When Carla meets Paul at an ornithology conference, tempers fly as feathers get ruffled."}], id: "bk106"}, {_type: "book", _book: [{_type: "author", _text: "Thurman, Paula"}, {_type: "title", _text: "Splish Splash"}, {_type: "genre", _text: "Romance"}, {_type: "price", _text: "4.95"}, {_type: "publish_date", _text: "2000-11-02"}, {_type: "description", _text: "A deep sea diver finds true love twenty thousand leagues beneath the sea."}], id: "bk107"}, {_type: "book", _book: [{_type: "author", _text: "Knorr, Stefan"}, {_type: "title", _text: "Creepy Crawlies"}, {_type: "genre", _text: "Horror"}, {_type: "price", _text: "4.95"}, {_type: "publish_date", _text: "2000-12-06"}, {_type: "description", _text: "An anthology of horror stories about roaches, centipedes, scorpions and other insects."}], id: "bk108"}, {_type: "book", _book: [{_type: "author", _text: "Kress, Peter"}, {_type: "title", _text: "Paradox Lost"}, {_type: "genre", _text: "Science Fiction"}, {_type: "price", _text: "6.95"}, {_type: "publish_date", _text: "2000-11-02"}, {_type: "description", _text: "After an inadvertant trip through a Heisenberg Uncertainty Device, James Salway discovers the problems of being quantum."}], id: "bk109"}, {_type: "book", _book: [{_type: "author", _text: "O’Brien, Tim"}, {_type: "title", _text: "Microsoft .NET: The Programming Bible"}, {_type: "genre", _text: "Computer"}, {_type: "price", _text: "36.95"}, {_type: "publish_date", _text: "2000-12-09"}, {_type: "description", _text: "Microsoft’s .NET initiative is explored in detail in this deep programmer’s reference."}], id: "bk110"}, {_type: "book", _book: [{_type: "author", _text: "O’Brien, Tim"}, {_type: "title", _text: 
"MSXML3: A Comprehensive Guide"}, {_type: "genre", _text: "Computer"}, {_type: "price", _text: "36.95"}, {_type: "publish_date", _text: "2000-12-01"}, {_type: "description", _text: "The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more."}], id: "bk111"}, {_type: "book", _book: [{_type: "author", _text: "Galos, Mike"}, {_type: "title", _text: "Visual Studio 7: A Comprehensive Guide"}, {_type: "genre", _text: "Computer"}, {_type: "price", _text: "49.95"}, {_type: "publish_date", _text: "2001-04-16"}, {_type: "description", _text: "Microsoft Visual Studio 7 is explored in depth, looking at how Visual Basic, Visual C+, C#, and ASP are integrated into a comprehensive development environment."}], id: "bk112"}]} |
We again get back a map representing the XML structure, but the structure is different from the one we get when we don’t use simple mode.
This time nested XML elements are accessible via a property whose name is the element name prefixed with an _.
We can write the following query to get a better understanding of what our file contains.
The following query processes books.xml and parses the results to pull out the title, description, genre, and authors:
WITH "https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml" AS uri
CALL apoc.load.xml(uri, '', {}, true)
YIELD value
UNWIND value._catalog AS catalog
RETURN catalog.id AS bookId,
[item in catalog._book WHERE item._type = "title"][0] AS title,
[item in catalog._book WHERE item._type = "description"][0] AS description,
[item in catalog._book WHERE item._type = "author"] AS authors,
[item in catalog._book WHERE item._type = "genre"][0] AS genre;
bookId | title | description | authors | genre |
---|---|---|---|---|
"bk101" | {_type: "title", _text: "XML Developer’s Guide"} | {_type: "description", _text: "An in-depth look at creating applications with XML."} | [{_type: "author", _text: "Gambardella, Matthew"}, {_type: "author", _text: "Arciniegas, Fabio"}] | {_type: "genre", _text: "Computer"} |
"bk102" | {_type: "title", _text: "Midnight Rain"} | {_type: "description", _text: "A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world."} | [{_type: "author", _text: "Ralls, Kim"}] | {_type: "genre", _text: "Fantasy"} |
"bk103" | {_type: "title", _text: "Maeve Ascendant"} | {_type: "description", _text: "After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society."} | [{_type: "author", _text: "Corets, Eva"}] | {_type: "genre", _text: "Fantasy"} |
"bk104" | {_type: "title", _text: "Oberon’s Legacy"} | {_type: "description", _text: "In post-apocalypse England, the mysterious agent known only as Oberon helps to create a new life for the inhabitants of London. Sequel to Maeve Ascendant."} | [{_type: "author", _text: "Corets, Eva"}] | {_type: "genre", _text: "Fantasy"} |
"bk105" | {_type: "title", _text: "The Sundered Grail"} | {_type: "description", _text: "The two daughters of Maeve, half-sisters, battle one another for control of England. Sequel to Oberon’s Legacy."} | [{_type: "author", _text: "Corets, Eva"}] | {_type: "genre", _text: "Fantasy"} |
"bk106" | {_type: "title", _text: "Lover Birds"} | {_type: "description", _text: "When Carla meets Paul at an ornithology conference, tempers fly as feathers get ruffled."} | [{_type: "author", _text: "Randall, Cynthia"}] | {_type: "genre", _text: "Romance"} |
"bk107" | {_type: "title", _text: "Splish Splash"} | {_type: "description", _text: "A deep sea diver finds true love twenty thousand leagues beneath the sea."} | [{_type: "author", _text: "Thurman, Paula"}] | {_type: "genre", _text: "Romance"} |
"bk108" | {_type: "title", _text: "Creepy Crawlies"} | {_type: "description", _text: "An anthology of horror stories about roaches, centipedes, scorpions and other insects."} | [{_type: "author", _text: "Knorr, Stefan"}] | {_type: "genre", _text: "Horror"} |
"bk109" | {_type: "title", _text: "Paradox Lost"} | {_type: "description", _text: "After an inadvertant trip through a Heisenberg Uncertainty Device, James Salway discovers the problems of being quantum."} | [{_type: "author", _text: "Kress, Peter"}] | {_type: "genre", _text: "Science Fiction"} |
"bk110" | {_type: "title", _text: "Microsoft .NET: The Programming Bible"} | {_type: "description", _text: "Microsoft’s .NET initiative is explored in detail in this deep programmer’s reference."} | [{_type: "author", _text: "O’Brien, Tim"}] | {_type: "genre", _text: "Computer"} |
"bk111" | {_type: "title", _text: "MSXML3: A Comprehensive Guide"} | {_type: "description", _text: "The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more."} | [{_type: "author", _text: "O’Brien, Tim"}] | {_type: "genre", _text: "Computer"} |
"bk112" | {_type: "title", _text: "Visual Studio 7: A Comprehensive Guide"} | {_type: "description", _text: "Microsoft Visual Studio 7 is explored in depth, looking at how Visual Basic, Visual C+, C#, and ASP are integrated into a comprehensive development environment."} | [{_type: "author", _text: "Galos, Mike"}] | {_type: "genre", _text: "Computer"} |
Rather than just returning that data, we can create a graph of books and their metadata, authors, and genres.
The following query processes books.xml, parses the results to pull out the title, description, genre, and authors, and merges them into the graph:
WITH "https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml" AS uri
CALL apoc.load.xml(uri, '', {}, true)
YIELD value
UNWIND value._catalog AS catalog
WITH catalog.id AS bookId,
[item in catalog._book WHERE item._type = "title"][0] AS title,
[item in catalog._book WHERE item._type = "description"][0] AS description,
[item in catalog._book WHERE item._type = "author"] AS authors,
[item in catalog._book WHERE item._type = "genre"][0] AS genre
MERGE (b:Book {id: bookId})
SET b.title = title._text, b.description = description._text
MERGE (g:Genre {name: genre._text})
MERGE (b)-[:HAS_GENRE]->(g)
WITH b, authors
UNWIND authors AS author
MERGE (a:Author {name:author._text})
MERGE (a)-[:WROTE]->(b);
The Neo4j Browser visualization below shows the imported graph:
XPath expressions
We can also provide an XPath expression to select nodes from an XML document.
If we only want to return books that have the Computer genre, we could write the following query:
CALL apoc.load.xml(
"https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml",
'/catalog/book[genre=\"Computer\"]'
)
YIELD value as book
WITH book.id as id, [attr IN book._children WHERE attr._type IN ['title','price'] | attr._text] as pairs
RETURN id, pairs[0] as title, pairs[1] as price;
id | title | price |
---|---|---|
"bk101" | "XML Developer’s Guide" | "44.95" |
"bk110" | "Microsoft .NET: The Programming Bible" | "36.95" |
"bk111" | "MSXML3: A Comprehensive Guide" | "36.95" |
"bk112" | "Visual Studio 7: A Comprehensive Guide" | "49.95" |
In this case we return only id, title, and price, but we can return any other elements.
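As a sketch of returning other elements, the query below keeps the same XPath filter but looks each element up by its _type instead of by its position in a list, so the result does not depend on the order of the elements inside each book. The column names authors and publishDate are chosen for this example.

```cypher
CALL apoc.load.xml(
  "https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml",
  '/catalog/book[genre="Computer"]'
)
YIELD value AS book
RETURN book.id AS id,
       // a book can have several <author> elements, so keep the whole list
       [attr IN book._children WHERE attr._type = 'author' | attr._text] AS authors,
       // one <publish_date> per book, so take the first match
       [attr IN book._children WHERE attr._type = 'publish_date' | attr._text][0] AS publishDate;
```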
We can also return just a single specific element.
For example, the following query returns the author of the book with id = bk102:
CALL apoc.load.xml(
'https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml',
'/catalog/book[@id="bk102"]/author'
)
YIELD value as result
WITH result._text as author
RETURN author;
author |
---|
"Ralls, Kim" |
Extracting data structures
We can turn values into a map using the apoc.map.fromPairs function.
call apoc.load.xml("https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml")
yield value as catalog
UNWIND catalog._children as book
WITH book.id as id, [attr IN book._children WHERE attr._type IN ['author','title'] | [attr._type, attr._text]] as pairs
WITH id, apoc.map.fromPairs(pairs) AS value
RETURN id, value
id | value |
---|---|
"bk101" | {title: "XML Developer’s Guide", author: "Arciniegas, Fabio"} |
"bk102" | {title: "Midnight Rain", author: "Ralls, Kim"} |
"bk103" | {title: "Maeve Ascendant", author: "Corets, Eva"} |
"bk104" | {title: "Oberon’s Legacy", author: "Corets, Eva"} |
"bk105" | {title: "The Sundered Grail", author: "Corets, Eva"} |
"bk106" | {title: "Lover Birds", author: "Randall, Cynthia"} |
"bk107" | {title: "Splish Splash", author: "Thurman, Paula"} |
"bk108" | {title: "Creepy Crawlies", author: "Knorr, Stefan"} |
"bk109" | {title: "Paradox Lost", author: "Kress, Peter"} |
"bk110" | {title: "Microsoft .NET: The Programming Bible", author: "O’Brien, Tim"} |
"bk111" | {title: "MSXML3: A Comprehensive Guide", author: "O’Brien, Tim"} |
"bk112" | {title: "Visual Studio 7: A Comprehensive Guide", author: "Galos, Mike"} |
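A map holds one value per key, which is why bk101 shows a single author in the result above even though the file lists two author elements for that book. When repeated elements matter, one option is to skip the map and collect them into a list instead. The query below is a minimal sketch of that alternative, not a replacement for apoc.map.fromPairs.

```cypher
CALL apoc.load.xml("https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml")
YIELD value AS catalog
UNWIND catalog._children AS book
RETURN book.id AS id,
       // a single title per book, so take the first match
       [attr IN book._children WHERE attr._type = 'title' | attr._text][0] AS title,
       // keep every author instead of collapsing them into one map key
       [attr IN book._children WHERE attr._type = 'author' | attr._text] AS authors;
```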
And now we can cleanly access the attributes from the map.
call apoc.load.xml("https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml")
yield value as catalog
UNWIND catalog._children as book
WITH book.id as id, [attr IN book._children WHERE attr._type IN ['author','title'] | [attr._type, attr._text]] as pairs
WITH id, apoc.map.fromPairs(pairs) AS value
RETURN id, value.title, value.author
id | value.title | value.author |
---|---|---|
"bk101" | "XML Developer’s Guide" | "Arciniegas, Fabio" |
"bk102" | "Midnight Rain" | "Ralls, Kim" |
"bk103" | "Maeve Ascendant" | "Corets, Eva" |
"bk104" | "Oberon’s Legacy" | "Corets, Eva" |
"bk105" | "The Sundered Grail" | "Corets, Eva" |
"bk106" | "Lover Birds" | "Randall, Cynthia" |
"bk107" | "Splish Splash" | "Thurman, Paula" |
"bk108" | "Creepy Crawlies" | "Knorr, Stefan" |
"bk109" | "Paradox Lost" | "Kress, Peter" |
"bk110" | "Microsoft .NET: The Programming Bible" | "O’Brien, Tim" |
"bk111" | "MSXML3: A Comprehensive Guide" | "O’Brien, Tim" |
"bk112" | "Visual Studio 7: A Comprehensive Guide" | "Galos, Mike" |
Import XML directly
We can write the following query to create a graph structure of the Microsoft books.xml file:
CALL apoc.import.xml(
"https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml",
{relType:'NEXT_WORD', label:'XmlWord'}
)
YIELD node
RETURN node;
node |
---|
(:XmlDocument {_xmlVersion: "1.0", _xmlEncoding: "UTF-8", url: "https://raw.githubusercontent.com/neo4j-contrib/neo4j-apoc-procedures/4.3/core/src/test/resources/xml/books.xml"}) |
The Neo4j Browser visualization below shows the imported graph:
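Once the import has run, the result can also be inspected with a plain Cypher query. The sketch below relies only on the XmlTag label and the _name property from the mapping rules listed earlier, and counts how often each XML element name occurs in the imported document.

```cypher
// One XmlTag node exists per XML element; _name holds the element name.
MATCH (t:XmlTag)
RETURN t._name AS tag, count(*) AS occurrences
ORDER BY occurrences DESC;
```

A count per tag is a quick sanity check that the whole file was imported, for example that the number of book tags matches the number of books in the catalog.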