Export to Apache Arrow

These procedures export data to Apache Arrow, a format that’s used by many Apache and non-Apache tools.

Available Procedures

The table below describes the available procedures:

Qualified Name	Type
apoc.export.arrow.all `apoc.export.arrow.all(file STRING, config MAP<STRING, ANY>)` - exports the full database as an arrow file.	`Procedure`
apoc.export.arrow.graph `apoc.export.arrow.graph(file STRING, graph ANY, config MAP<STRING, ANY>)` - exports the given graph as an arrow file.	`Procedure`
apoc.export.arrow.query `apoc.export.arrow.query(file STRING, query STRING, config MAP<STRING, ANY>)` - exports the results from the given Cypher query as an arrow file.	`Procedure`
apoc.export.arrow.stream.all `apoc.export.arrow.stream.all(config MAP<STRING, ANY>)` - exports the full database as an arrow byte array.	`Procedure`
apoc.export.arrow.stream.graph `apoc.export.arrow.stream.graph(graph ANY, config MAP<STRING, ANY>)` - exports the given graph as an arrow byte array.	`Procedure`
apoc.export.arrow.stream.query `apoc.export.arrow.stream.query(query ANY, config MAP<STRING, ANY>)` - exports the given Cypher query as an arrow byte array.	`Procedure`

Exporting to a file

By default, exporting to the file system is disabled. We can enable it by setting the following property in apoc.conf:

apoc.conf

apoc.export.file.enabled=true

If we try to use any of the export procedures without having first set this property, we’ll get the following error message:

Failed to invoke procedure: Caused by: java.lang.RuntimeException: Export to files not enabled, please set apoc.export.file.enabled=true in your apoc.conf. Otherwise, if you are running in a cloud environment without filesystem access, use the {stream:true} config and null as a 'file' parameter to stream the export back to your client. Note that the stream mode cannot be used with the apoc.export.xls.* procedures.

Export files are written to the import directory, which is defined by the server.directories.import property. This means that any file path that we provide is relative to this directory. If we try to write to an absolute path, such as /tmp/filename, we’ll get an error message similar to the following one:

Failed to invoke procedure: Caused by: java.io.FileNotFoundException: /path/to/neo4j/import/tmp/fileName (No such file or directory)

We can enable writing to anywhere on the file system by setting the following property in apoc.conf:

apoc.conf

apoc.import.file.use_neo4j_config=false

Neo4j will now be able to write anywhere on the file system, so be sure that this is your intention before setting this property.

Examples

Export results of Cypher query to Apache Arrow file

CALL apoc.export.arrow.query('query_test.arrow',
    "RETURN 1 AS intData, 'a' AS stringData,
        true AS boolData,
        [1, 2, 3] AS intArray,
        [1.1, 2.2, 3.3] AS doubleArray,
        [true, false, true] AS boolArray,
        [1, '2', true, null] AS mixedArray,
        {foo: 'bar'} AS mapData,
        localdatetime('2015-05-18T19:32:24') as dateData,
        [[0]] AS arrayArray,
        1.1 AS doubleData"
) YIELD file

Table 1. Results
file	source	format	nodes	relationships	properties	time	rows	batchSize	batches	done	data
"query_test.arrow"	"statement: cols(11)"	"arrow"	0	0	11	468	11	2000	1	true	<null>
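
The same pattern should apply to the other file-based procedures. As a minimal sketch, assuming apoc.export.file.enabled=true has been set, the call below exports the full database using the apoc.export.arrow.all(file, config) signature listed in the procedure table. The file name all_test.arrow is a hypothetical placeholder, and the empty config map assumes the default settings are acceptable:

```cypher
// Hypothetical file name; the path is resolved relative to the import directory.
// An empty config map is passed on the assumption that the defaults are acceptable.
CALL apoc.export.arrow.all('all_test.arrow', {})
```

Because no YIELD clause is given, this standalone call returns every column the procedure produces, so the sketch does not depend on any particular column name.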

Export results of Cypher query to Apache Arrow binary output

CALL apoc.export.arrow.stream.query(
    "RETURN 1 AS intData, 'a' AS stringData,
        true AS boolData,
        [1, 2, 3] AS intArray,
        [1.1, 2.2, 3.3] AS doubleArray,
        [true, false, true] AS boolArray,
        [1, '2', true, null] AS mixedArray,
        {foo: 'bar'} AS mapData,
        localdatetime('2015-05-18T19:32:24') as dateData,
        [[0]] AS arrayArray,
        1.1 AS doubleData"
) YIELD value

Table 2. Results
value
<binary Apache Arrow output>
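
The full database can be streamed in a similar way. The following is a minimal sketch based on the apoc.export.arrow.stream.all(config) signature from the procedure table. It assumes that the result column is named value, as in the stream.query example, and that an empty config map is sufficient:

```cypher
// Streams the full database as Arrow byte arrays instead of writing a file.
// Assumption: the result column is named `value`, as it is for apoc.export.arrow.stream.query.
CALL apoc.export.arrow.stream.all({}) YIELD value
RETURN value
```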