Export to Apache Arrow
These procedures export data into a format that’s used by many Apache and non-Apache tools.
Available Procedures
The table below describes the available procedures:
Qualified Name | Type |
---|---|
apoc.export.arrow.all | Procedure |
apoc.export.arrow.graph | Procedure |
apoc.export.arrow.query | Procedure |
apoc.export.arrow.stream.all | Procedure |
apoc.export.arrow.stream.graph | Procedure |
apoc.export.arrow.stream.query | Procedure |
Exporting to a file
By default exporting to the file system is disabled.
We can enable it by setting the following property in apoc.conf:
apoc.export.file.enabled=true
If we try to use any of the export procedures without having first set this property, we’ll get the following error message:
Failed to invoke procedure: Caused by: java.lang.RuntimeException: Export to files not enabled, please set apoc.export.file.enabled=true in your apoc.conf.
Otherwise, if you are running in a cloud environment without filesystem access, use the |
Export files are written to the import directory, which is defined by the server.directories.import property.
This means that any file path that we provide is relative to this directory.
If we try to write to an absolute path, such as /tmp/fileName, we’ll get an error message similar to the following one:
Failed to invoke procedure: Caused by: java.io.FileNotFoundException: /path/to/neo4j/import/tmp/fileName (No such file or directory)
We can enable writing to anywhere on the file system by setting the following property in apoc.conf:
apoc.import.file.use_neo4j_config=false
Neo4j will now be able to write anywhere on the file system, so be sure that this is your intention before setting this property.
Examples
Export results of Cypher query to Apache Arrow file
CALL apoc.export.arrow.query('query_test.arrow',
"RETURN 1 AS intData, 'a' AS stringData,
true AS boolData,
[1, 2, 3] AS intArray,
[1.1, 2.2, 3.3] AS doubleArray,
[true, false, true] AS boolArray,
[1, '2', true, null] AS mixedArray,
{foo: 'bar'} AS mapData,
localdatetime('2015-05-18T19:32:24') AS dateData,
[[0]] AS arrayArray,
1.1 AS doubleData"
) YIELD file
file | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data |
---|---|---|---|---|---|---|---|---|---|---|---|
"query_test.arrow" | "statement: cols(11)" | "arrow" | 0 | 0 | 11 | 468 | 11 | 2000 | 1 | true | <null> |
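The table of available procedures also lists apoc.export.arrow.all, which exports the whole database rather than the result of a query. The following is a minimal sketch, not a verified example: it assumes the procedure takes a file name followed by a config map and yields the same columns as the query export above. The file name all_test.arrow and the batchSize value are illustrative only.

```cypher
// Assumption: the signature is (file, config), mirroring apoc.export.arrow.query.
// Assumption: batchSize is an accepted config key (the result table above reports a batchSize of 2000).
CALL apoc.export.arrow.all('all_test.arrow', {batchSize: 1000})
YIELD file, nodes, relationships, properties, done
RETURN file, nodes, relationships, properties, done
```

As with the query export, the file name is resolved relative to the import directory unless apoc.import.file.use_neo4j_config has been set to false.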
Export results of Cypher query to Apache Arrow binary output
CALL apoc.export.arrow.stream.query(
"RETURN 1 AS intData, 'a' AS stringData,
true AS boolData,
[1, 2, 3] AS intArray,
[1.1, 2.2, 3.3] AS doubleArray,
[true, false, true] AS boolArray,
[1, '2', true, null] AS mixedArray,
{foo: 'bar'} AS mapData,
localdatetime('2015-05-18T19:32:24') AS dateData,
[[0]] AS arrayArray,
1.1 AS doubleData"
) YIELD value
value |
---|
<binary Apache Arrow output> |
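The same streaming approach should apply to the whole database through apoc.export.arrow.stream.all, which is listed among the available procedures but has no example here. A minimal sketch, assuming the procedure takes only a config map and yields a single value column of binary Apache Arrow output, as the stream query example does:

```cypher
// Assumption: the only argument is a config map; an empty map requests the defaults.
// Nothing is written to disk: the Arrow bytes are returned to the client in the value column.
CALL apoc.export.arrow.stream.all({})
YIELD value
RETURN value
```

Because no file is written, this form is a natural fit for environments without file system access; the client is responsible for decoding the returned bytes with an Arrow library.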