apoc.load.xls
Procedure APOC Full
apoc.load.xls('url','selector',{config}) YIELD lineNo, list, map - load XLS fom URL as stream of row values, config contains any of: {skip:1,limit:5,header:false,ignore:['tmp'],arraySep:';',mapping:{years:{type:'int',arraySep:'-',array:false,name:'age',ignore:false, dateFormat:'iso_date', dateParse:['dd-MM-yyyy']}}
Signature
apoc.load.xls(url :: STRING?, selector :: STRING?, config = {} :: MAP?) :: (lineNo :: INTEGER?, list :: LIST? OF ANY?, map :: MAP?)
Config parameters
The procedure support the following config parameters:
name | type | default | description |
---|---|---|---|
skip |
boolean |
none |
skip result rows |
limit |
Long |
none |
limit result rows |
header |
booelan |
true |
indicates if file has a header |
sep |
String |
',' |
separator character or 'TAB' |
quoteChar |
String |
'"' |
the char to use for quoted elements |
arraySep |
String |
';' |
array separator |
ignore |
List<String> |
[] |
which columns to ignore |
nullValues |
List<String> |
[] |
which values to treat as null, e.g. |
mapping |
Map |
{} |
per field mapping, entry key is field name, .e.g |
mapping
supports the following values:
-
<sheet>
- name of the sheet -
<type>
- type of the conversion requested (STRING
,INTEGER
,FLOAT
,BOOLEAN
,NULL
,LIST
,DATE
,DATE_TIME
,LOCAL_DATE
,LOCAL_DATE_TIME
,LOCAL_TIME
,TIME
) -
dateFormat: <format>
- convert the Date into String (only String is allowed) -
dateParse: [<formats>]
- convert the String into Date (Array of strings are allowed)
Install Dependencies
For loading XLS we’re using the Apache POI library, which works well with old and new Excel formats, but is quite large. That’s why we decided not to include it into the apoc jar, but make it an optional dependency.
These dependencies are included in apoc-xls-dependencies-4.4.0.34.jar, which can be downloaded from the releases page.
Once that file is downloaded, it should be placed in the plugins
directory and the Neo4j Server restarted.
Alternatively, you can download these jars from Maven Repository (putting them into plugins
directory as well):
Usage Examples
CALL apoc.load.xls("https://github.com/neo4j-contrib/neo4j-apoc-procedures/raw/4.4/full/src/test/resources/load_test.xls",
'Full',{ mapping: {
Integer:{type:'int'},
Array:{type:'int',array:true,arraySep:';'}
}});
lineNo | list | map |
---|---|---|
0 |
["Test", TRUE, 2, 1.5, [1, 2, 3]] |
{Integer: 2, Array: [1, 2, 3], Float: 1.5, String: "Test", Boolean: TRUE} |
CALL apoc.load.xls("https://github.com/neo4j-contrib/neo4j-apoc-procedures/raw/4.4/full/src/test/resources/load_test.xls",
'Kids'
);
lineNo | list | map |
---|---|---|
0 |
["Selma", 8] |
{name: "Selma", age: 8} |
1 |
["Rana", 11] |
{name: "Rana", age: 11} |
2 |
["Selina", 18] |
{name: "Selina", age: 18} |
Some examples with type/dateFormat and dateParse:
CALL apoc.load.xls('https://github.com/neo4j-contrib/neo4j-apoc-procedures/raw/4.4/full/src/test/resources/test_date.xlsx',
'sheet',{ mapping:{
Date:{type:'String'}
}});
lineNo | list | map |
---|---|---|
0 |
["2018/05/10", "2018/10/05", "Alan"] |
{Data: "2018/10/05", Date: "2018/05/10", Name: "Alan"} |
1 |
["2018-09-10T00:00:00", 2018-10-10T00:00, "Jack"] |
{Data: 2018-10-10T00:00, Date: "2018-09-10T00:00:00", Name: "Jack"} |
2 |
["2018/05/10 12:10:10", 2018-10-10T00:00, 2018-10-10T00:00] |
{Data: 2018-10-10T00:00, Date: "2018/05/10 12:10:10", Name: 2018-10-10T00:00} |
3 |
[NULL, 2018-10-10T00:00, 1899-12-31T12:01:10] |
{Data: 2018-10-10T00:00, Date: NULL, Name: 1899-12-31T12:01:10} |
4 |
["2011-01-01T12:00:00.05381+01:00", NULL, NULL] |
{Data: NULL, Date: "2011-01-01T12:00:00.05381+01:00", Name: NULL} |
CALL apoc.load.xls('https://github.com/neo4j-contrib/neo4j-apoc-procedures/raw/4.4/full/src/test/resources/test_date.xlsx',
'sheet', { mapping: {
Date:{type:'String',dateFormat:'iso_date'}
}});
lineNo | list | map |
---|---|---|
0 |
["2018/05/10", "2018/10/05", "Alan"] |
{Data: "2018/10/05", Date: "2018/05/10", Name: "Alan"} |
1 |
["2018-09-10", 2018-10-10T00:00, "Jack"] |
{Data: 2018-10-10T00:00, Date: "2018-09-10", Name: "Jack"} |
2 |
["2018/05/10 12:10:10", 2018-10-10T00:00, 2018-10-10T00:00] |
{Data: 2018-10-10T00:00, Date: "2018/05/10 12:10:10", Name: 2018-10-10T00:00} |
3 |
[NULL, 2018-10-10T00:00, 1899-12-31T12:01:10] |
{Data: 2018-10-10T00:00, Date: NULL, Name: 1899-12-31T12:01:10} |
4 |
["2011-01-01T12:00:00.05381+01:00", NULL, NULL] |
{Data: NULL, Date: "2011-01-01T12:00:00.05381+01:00", Name: NULL} |
CALL apoc.load.xls('https://github.com/neo4j-contrib/neo4j-apoc-procedures/raw/4.4/full/src/test/resources/test_date.xlsx',
'sheet',{ mapping:{
Date:{type:'String',dateParse:["wrongPath", "dd-MM-yyyy", "dd/MM/yyyy", "yyyy/MM/dd", "yyyy/dd/MM", "yyyy-dd-MM'T'hh:mm:ss"]}
}});
lineNo | list | map |
---|---|---|
0 |
["2018/05/10", "2018/10/05", "Alan"] |
{Data: "2018/10/05", Date: "2018/05/10", Name: "Alan"} |
1 |
["2018-09-10T00:00:00", 2018-10-10T00:00, "Jack"] |
{Data: 2018-10-10T00:00, Date: "2018-09-10T00:00:00", Name: "Jack"} |
2 |
["2018/05/10 12:10:10", 2018-10-10T00:00, 2018-10-10T00:00] |
{Data: 2018-10-10T00:00, Date: "2018/05/10 12:10:10", Name: 2018-10-10T00:00} |
3 |
[NULL, 2018-10-10T00:00, 1899-12-31T12:01:10] |
{Data: 2018-10-10T00:00, Date: NULL, Name: 1899-12-31T12:01:10} |
4 |
["2011-01-01T12:00:00.05381+01:00", NULL, NULL] |
{Data: NULL, Date: "2011-01-01T12:00:00.05381+01:00", Name: NULL} |