Load and Import Arrow

The following procedures allow you to read an Apache Arrow file exported via the apoc.export.arrow.* procedures. They may also be able to read Apache Arrow files that were not created via the export procedures.

Procedure and Function Overview

The table below describes the available procedures and functions:

Qualified Name	Type
apoc.load.arrow `apoc.load.arrow(file STRING, config MAP<STRING, ANY>)` - loads values from the provided Arrow file.	`Procedure`
apoc.load.arrow.stream `apoc.load.arrow.stream(source LIST<INTEGER>, config MAP<STRING, ANY>)` - loads values from the provided Arrow byte array.	`Procedure`
apoc.import.arrow `apoc.import.arrow(urlOrBinaryFile ANY, config MAP<STRING, ANY>)` - imports entities from the provided Arrow file or byte array.	`Procedure`

`apoc.load.arrow`

This procedure takes a file path or HTTP URL and parses the Apache Arrow data into a map data structure.

signature

`apoc.load.arrow(file :: STRING, config = {} :: MAP) :: (value :: MAP)`

Currently, this procedure does not support any config parameters.

By default, importing from the file system is disabled. We can enable it by setting the following property in apoc.conf:

apoc.conf

apoc.import.file.enabled=true

If we try to use any of the import procedures without having first set this property, we’ll get the following error message:

Failed to invoke procedure: Caused by: java.lang.RuntimeException: Import from files not enabled, please set apoc.import.file.enabled=true in your apoc.conf

Import files are read from the import directory, which is defined by the server.directories.import property. This means that any file path that we provide is relative to this directory. If we try to read from an absolute path, such as /tmp/filename, we’ll get an error message similar to the following one:

Failed to invoke procedure: Caused by: java.lang.RuntimeException: Can’t read url or key file:/path/to/neo4j/import/tmp/filename as json: /path/to/neo4j//import/tmp/filename (No such file or directory)

We can enable reading files from anywhere on the file system by setting the following property in apoc.conf:

apoc.conf

apoc.import.file.use_neo4j_config=false

Neo4j will now be able to read from anywhere on the file system, so be sure that this is your intention before setting this property.

`apoc.load.arrow.stream`

This procedure takes a byte[] source and parses the Apache Arrow data into a map data structure.

signature

`apoc.load.arrow.stream(source :: LIST<INTEGER>, config = {} :: MAP) :: (value :: MAP)`

Currently, this procedure does not support any config parameters.

Examples

The following section contains examples showing how to load data from various Apache Arrow sources.

Import from local file

Taking the output of this case:

The following query processes a test.arrow file and returns the content as Cypher data structures:

CALL apoc.load.arrow('test.arrow')
YIELD value
RETURN value

Table 1. Results
value
{arrayArray → ["[0]"], dateData → 2015-05-18T19:32:24Z, boolArray → [true,false,true], intArray → [1,2,3], mapData → "{"foo":"bar"}", boolData → true, intData → 1, mixedArray → ["1","2","true",<null>], doubleArray → [1.1,2.2,3.3], doubleData → 1.1, stringData → "a"}
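
Because each row is yielded as a map, individual fields can be projected by key. A minimal sketch, assuming the same test.arrow file and the field names shown in the result above:

```cypher
// Load each Arrow row as a map, then pick out single fields by key
CALL apoc.load.arrow('test.arrow')
YIELD value
RETURN value.stringData AS stringData,
       value.intData AS intData,
       value.intArray AS intArray
```

Projecting only the needed keys keeps the result set small when the Arrow file contains many columns.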

Import from binary source

Taking the output of this case:

The following query processes an Arrow byte array and returns the content as Cypher data structures:

CALL apoc.load.arrow.stream('<binary arrow file>')
YIELD value
RETURN value

Table 2. Results
value
{arrayArray → ["[0]"], dateData → 2015-05-18T19:32:24Z, boolArray → [true,false,true], intArray → [1,2,3], mapData → "{"foo":"bar"}", boolData → true, intData → 1, mixedArray → ["1","2","true",<null>], doubleArray → [1.1,2.2,3.3], doubleData → 1.1, stringData → "a"}
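
As a rough sketch of where such a byte array might come from, the output of a streaming export procedure can be passed straight to apoc.load.arrow.stream. This assumes that apoc.export.arrow.stream.all is available in your APOC version and yields the byte array in a column named value; the alias arrowBytes is only illustrative:

```cypher
// Assumption: apoc.export.arrow.stream.all yields the byte array as `value`
CALL apoc.export.arrow.stream.all()
YIELD value AS arrowBytes
// Parse the byte array back into one map per row
CALL apoc.load.arrow.stream(arrowBytes)
YIELD value
RETURN value
```

No file is written in this round trip, so the apoc.import.file.enabled setting should not be involved.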

Import Arrow file created by Export Arrow procedures

The apoc.import.arrow procedure can be used to import Apache Arrow files created by the apoc.export.arrow.* procedures.

This procedure should not be confused with the apoc.load.arrow* procedures, which only load the values of the Arrow file and do not create entities in the database.
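
A minimal sketch of such an import, assuming a hypothetical file export.arrow that was previously written to the import directory by one of the apoc.export.arrow.* procedures, and passing an empty config map:

```cypher
// Hypothetical file name; unlike apoc.load.arrow, this call creates entities in the database
CALL apoc.import.arrow('export.arrow', {})
```

Since this reads from the file system, the apoc.import.file.enabled property described earlier needs to be set.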

See this page for more info.