Load and Import Arrow
The following procedures allow you to read an Apache Arrow file exported via the apoc.export.arrow.* procedures. They may also be able to read Apache Arrow files that were not created by the export procedures.
Procedure and Function Overview
The table below describes the available procedures and functions:
Qualified Name | Type |
---|---|
apoc.load.arrow | Procedure |
apoc.load.arrow.stream | Procedure |
apoc.import.arrow | Procedure |
apoc.load.arrow
This procedure takes a file path or HTTP URL and parses the Apache Arrow data into a map data structure.
Currently, this procedure does not support any config parameters.
By default, importing from the file system is disabled.
We can enable it by setting the following property in apoc.conf:
apoc.import.file.enabled=true
If we try to use any of the import procedures without having first set this property, we’ll get the following error message:
Failed to invoke procedure: Caused by: java.lang.RuntimeException: Import from files not enabled, please set apoc.import.file.enabled=true in your apoc.conf
Import files are read from the import directory, which is defined by the server.directories.import property.
This means that any file path that we provide is relative to this directory.
If we try to read from an absolute path, such as /tmp/filename, we’ll get an error message similar to the following one:
Failed to invoke procedure: Caused by: java.lang.RuntimeException: Can’t read url or key file:/path/to/neo4j/import/tmp/filename as json: /path/to/neo4j//import/tmp/filename (No such file or directory)
We can enable reading files from anywhere on the file system by setting the following property in apoc.conf:
apoc.import.file.use_neo4j_config=false
Neo4j will now be able to read from anywhere on the file system, so be sure that this is your intention before setting this property.
Examples
The following section contains examples showing how to import data from various Apache Arrow sources.
Import from local file
Taking the output of this case, the following query processes a test.arrow file and returns its content as Cypher data structures:
CALL apoc.load.arrow('test.arrow')
YIELD value
RETURN value
value |
---|
{arrayArray → ["[0]"], dateData → 2015-05-18T19:32:24Z, boolArray → [true,false,true], intArray → [1,2,3], mapData → "{"foo":"bar"}", boolData → true, intData → 1, mixedArray → ["1","2","true",<null>], doubleArray → [1.1,2.2,3.3], doubleData → 1.1, stringData → "a"} |
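Because each row is yielded as a map, individual fields can be projected into their own columns. The following is a minimal sketch that assumes the same test.arrow file and the field names shown in the result above; the column aliases are chosen here for illustration only.

```cypher
// Read test.arrow and return selected entries of the yielded map as columns.
// The keys (intData, stringData, boolArray) are taken from the sample row above.
CALL apoc.load.arrow('test.arrow')
YIELD value
RETURN value.intData AS intData,
       value.stringData AS stringData,
       value.boolArray AS boolArray
```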
Import from binary source
Taking the output of this case, the following query processes the binary content of a test.arrow file and returns it as Cypher data structures:
CALL apoc.load.arrow.stream('<binary arrow file>')
YIELD value
RETURN value
value |
---|
{arrayArray → ["[0]"], dateData → 2015-05-18T19:32:24Z, boolArray → [true,false,true], intArray → [1,2,3], mapData → "{"foo":"bar"}", boolData → true, intData → 1, mixedArray → ["1","2","true",<null>], doubleArray → [1.1,2.2,3.3], doubleData → 1.1, stringData → "a"} |
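One way to obtain a binary source is to chain a streaming export directly into the loader. The sketch below assumes that apoc.export.arrow.stream.all is available in your installation, that it accepts a config map and yields a byte array in a column named value, and that apoc.load.arrow.stream accepts that byte array as its argument. Check the signatures for your APOC version before relying on it. The rows returned describe the exported graph, so they will not match the test.arrow sample shown above.

```cypher
// Assumption: apoc.export.arrow.stream.all yields one byte array per batch as `value`.
CALL apoc.export.arrow.stream.all({})
YIELD value AS bytes
// Feed each exported byte array straight into the stream loader.
CALL apoc.load.arrow.stream(bytes)
YIELD value
RETURN value
```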
Import Arrow file created by Export Arrow procedures
The apoc.import.arrow procedure can be used to import Apache Arrow files created by the apoc.export.arrow.* procedures.
This procedure should not be confused with the apoc.load.arrow* procedures, which only load the values of the Arrow file and do not create entities in the database.
See this page for more info.
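As a rough illustration of the difference, a call to apoc.import.arrow might look like the sketch below. The file name all.arrow is hypothetical and stands for a file previously written by one of the apoc.export.arrow.* procedures. The empty config map and the yielded column names (nodes, relationships, properties) are assumptions based on the usual APOC import result shape, so verify them against the procedure signature in your installation.

```cypher
// Hypothetical file name: an Arrow file produced earlier by an apoc.export.arrow.* procedure.
// Unlike apoc.load.arrow, this call creates entities in the database.
CALL apoc.import.arrow('all.arrow', {})
YIELD nodes, relationships, properties
RETURN nodes, relationships, properties
```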