Export to Apache Arrow

These procedures export data to Apache Arrow, a columnar format that is used by many Apache and non-Apache tools.

Available Procedures

The table below describes the available procedures:

Qualified Name Type

apoc.export.arrow.all
apoc.export.arrow.all(file STRING, config MAP<STRING, ANY>) - exports the full database as an arrow file.

Procedure

apoc.export.arrow.graph
apoc.export.arrow.graph(file STRING, graph ANY, config MAP<STRING, ANY>) - exports the given graph as an arrow file.

Procedure

apoc.export.arrow.query
apoc.export.arrow.query(file STRING, query ANY, config MAP<STRING, ANY>) - exports the results of a Cypher query as an arrow file.

Procedure

apoc.export.arrow.stream.all
apoc.export.arrow.stream.all(config MAP<STRING, ANY>) - exports the full database as an arrow byte array.

Procedure

apoc.export.arrow.stream.graph
apoc.export.arrow.stream.graph(graph ANY, config MAP<STRING, ANY>) - exports the given graph as an arrow byte array.

Procedure

apoc.export.arrow.stream.query
apoc.export.arrow.stream.query(query ANY, config MAP<STRING, ANY>) - exports the given Cypher query as an arrow byte array.

Procedure

Exporting to a file

By default, exporting to the file system is disabled. We can enable it by setting the following property in apoc.conf:

apoc.conf
apoc.export.file.enabled=true

If we try to use any of the export procedures without having first set this property, we’ll get the following error message:

Failed to invoke procedure: Caused by: java.lang.RuntimeException: Export to files not enabled, please set apoc.export.file.enabled=true in your apoc.conf. Otherwise, if you are running in a cloud environment without filesystem access, use the {stream:true} config and null as a 'file' parameter to stream the export back to your client. Note that the stream mode cannot be used with the apoc.export.xls.* procedures.

Export files are written to the import directory, which is defined by the server.directories.import property. This means that any file path that we provide is relative to this directory. If we try to write to an absolute path, such as /tmp/filename, we’ll get an error message similar to the following one:

Failed to invoke procedure: Caused by: java.io.FileNotFoundException: /path/to/neo4j/import/tmp/fileName (No such file or directory)
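The relative-path behaviour described above can be sketched with a call that passes a bare file name. This is a minimal sketch, not a definitive recipe: it assumes that apoc.export.file.enabled=true is already set, that all_test.arrow is a hypothetical file name, and that apoc.export.arrow.all yields the same result columns as the query export shown in the Examples section.

```cypher
// Assumption: apoc.export.file.enabled=true is set in apoc.conf.
// 'all_test.arrow' is a hypothetical file name; because it is a relative path,
// it is resolved against the directory configured by server.directories.import.
CALL apoc.export.arrow.all('all_test.arrow', {})
// Assumption: the result columns match those listed for apoc.export.arrow.query.
YIELD file, nodes, relationships, properties, done
RETURN file, nodes, relationships, properties, done
```

With this call the file should end up at <import directory>/all_test.arrow rather than in the directory from which the query was issued.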

We can enable writing to anywhere on the file system by setting the following property in apoc.conf:

apoc.conf
apoc.import.file.use_neo4j_config=false

Neo4j will now be able to write anywhere on the file system, so be sure that this is your intention before setting this property.

Examples

Export results of Cypher query to Apache Arrow file

CALL apoc.export.arrow.query('query_test.arrow',
    "RETURN 1 AS intData, 'a' AS stringData,
        true AS boolData,
        [1, 2, 3] AS intArray,
        [1.1, 2.2, 3.3] AS doubleArray,
        [true, false, true] AS boolArray,
        [1, '2', true, null] AS mixedArray,
        {foo: 'bar'} AS mapData,
        localdatetime('2015-05-18T19:32:24') AS dateData,
        [[0]] AS arrayArray,
        1.1 AS doubleData"
) YIELD file
Table 1. Results
file source format nodes relationships properties time rows batchSize batches done data
"query_test.arrow" "statement: cols(11)" "arrow" 0 0 11 468 11 2000 1 true <null>
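Export a graph to Apache Arrow file

The procedure table lists apoc.export.arrow.graph, which takes a graph value rather than a query. The following is a minimal sketch of one way to obtain such a value. It assumes that apoc.graph.fromCypher is available with the argument order (statement, params, name, props) and yields a graph column. The Person label, the KNOWS relationship type, the graph name and the file name are hypothetical.

```cypher
// Build a virtual graph from a Cypher statement.
// Assumption: apoc.graph.fromCypher(statement, params, name, props) YIELD graph.
// The Person/KNOWS pattern and the name 'people' are hypothetical.
CALL apoc.graph.fromCypher(
    "MATCH (p:Person)-[r:KNOWS]->(q:Person) RETURN p, r, q",
    {}, 'people', {}
) YIELD graph
// Note the argument order from the procedure table: file first, then graph.
CALL apoc.export.arrow.graph('graph_test.arrow', graph, {})
YIELD file, nodes, relationships, properties
RETURN file, nodes, relationships, properties
```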

Export results of Cypher query to Apache Arrow binary output

CALL apoc.export.arrow.stream.query(
    "RETURN 1 AS intData, 'a' AS stringData,
        true AS boolData,
        [1, 2, 3] AS intArray,
        [1.1, 2.2, 3.3] AS doubleArray,
        [true, false, true] AS boolArray,
        [1, '2', true, null] AS mixedArray,
        {foo: 'bar'} AS mapData,
        localdatetime('2015-05-18T19:32:24') AS dateData,
        [[0]] AS arrayArray,
        1.1 AS doubleData"
) YIELD value
Table 2. Results
value

<binary Apache Arrow output>
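Export the full database to Apache Arrow binary output

The same streaming approach should apply to the whole database through apoc.export.arrow.stream.all, which takes only a config map. This is a minimal sketch that assumes the procedure yields a value column, as apoc.export.arrow.stream.query does in the example above.

```cypher
// No file argument: the stream procedures return byte arrays to the client.
// Assumption: the output column is named value, as for stream.query.
CALL apoc.export.arrow.stream.all({})
YIELD value
RETURN value
```

Each returned value is an Arrow byte array, which the client application has to decode with an Arrow library.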