apoc.load.xls

Procedure APOC Full

apoc.load.xls('url','selector',{config}) YIELD lineNo, list, map - load XLS fom URL as stream of row values, config contains any of: {skip:1,limit:5,header:false,ignore:['tmp'],arraySep:';',mapping:{years:{type:'int',arraySep:'-',array:false,name:'age',ignore:false, dateFormat:'iso_date', dateParse:['dd-MM-yyyy']}}

Signature

apoc.load.xls(url :: STRING?, selector :: STRING?, config = {} :: MAP?) :: (lineNo :: INTEGER?, list :: LIST? OF ANY?, map :: MAP?)

Input parameters

Name Type Default

url

STRING?

null

selector

STRING?

null

config

MAP?

{}

Config parameters

The procedure support the following config parameters:

Table 1. Config parameters
name type default description

skip

boolean

none

skip result rows

limit

Long

none

limit result rows

header

booelan

true

indicates if file has a header

sep

String

','

separator character or 'TAB'

quoteChar

String

'"'

the char to use for quoted elements

arraySep

String

';'

array separator

ignore

List<String>

[]

which columns to ignore

nullValues

List<String>

[]

which values to treat as null, e.g. ['na',false]

mapping

Map

{}

per field mapping, entry key is field name, .e.g {mapping:{'<sheet>':{type:'<type>', dateFormat: '<format>', dateParse: [<formats>]}}}

mapping supports the following values:

  • <sheet> - name of the sheet

  • <type> - type of the conversion requested (STRING, INTEGER, FLOAT, BOOLEAN, NULL, LIST, DATE, DATE_TIME, LOCAL_DATE, LOCAL_DATE_TIME, LOCAL_TIME, TIME)

  • dateFormat: <format> - convert the Date into String (only String is allowed)

  • dateParse: [<formats>] - convert the String into Date (Array of strings are allowed)

Output parameters

Name Type

lineNo

INTEGER?

list

LIST? OF ANY?

map

MAP?

Install Dependencies

For loading XLS we’re using the Apache POI library, which works well with old and new Excel formats, but is quite large. That’s why we decided not to include it into the apoc jar, but make it an optional dependency.

These dependencies are included in apoc-xls-dependencies-4.4.0.40.jar, which can be downloaded from the releases page. Once that file is downloaded, it should be placed in the plugins directory and the Neo4j Server restarted.

Alternatively, you can download these jars from Maven Repository (putting them into plugins directory as well):

For XLS files:

Usage Examples

CALL apoc.load.xls("https://github.com/neo4j-contrib/neo4j-apoc-procedures/raw/4.4/full/src/test/resources/load_test.xls",
  'Full',{ mapping: {
  Integer:{type:'int'},
  Array:{type:'int',array:true,arraySep:';'}
}});
Table 2. Results
lineNo list map

0

["Test", TRUE, 2, 1.5, [1, 2, 3]]

{Integer: 2, Array: [1, 2, 3], Float: 1.5, String: "Test", Boolean: TRUE}

CALL apoc.load.xls("https://github.com/neo4j-contrib/neo4j-apoc-procedures/raw/4.4/full/src/test/resources/load_test.xls",
  'Kids'
);
Table 3. Results
lineNo list map

0

["Selma", 8]

{name: "Selma", age: 8}

1

["Rana", 11]

{name: "Rana", age: 11}

2

["Selina", 18]

{name: "Selina", age: 18}

Some examples with type/dateFormat and dateParse:

CALL apoc.load.xls('https://github.com/neo4j-contrib/neo4j-apoc-procedures/raw/4.4/full/src/test/resources/test_date.xlsx',
  'sheet',{ mapping:{
  Date:{type:'String'}
}});
Table 4. Results
lineNo list map

0

["2018/05/10", "2018/10/05", "Alan"]

{Data: "2018/10/05", Date: "2018/05/10", Name: "Alan"}

1

["2018-09-10T00:00:00", 2018-10-10T00:00, "Jack"]

{Data: 2018-10-10T00:00, Date: "2018-09-10T00:00:00", Name: "Jack"}

2

["2018/05/10 12:10:10", 2018-10-10T00:00, 2018-10-10T00:00]

{Data: 2018-10-10T00:00, Date: "2018/05/10 12:10:10", Name: 2018-10-10T00:00}

3

[NULL, 2018-10-10T00:00, 1899-12-31T12:01:10]

{Data: 2018-10-10T00:00, Date: NULL, Name: 1899-12-31T12:01:10}

4

["2011-01-01T12:00:00.05381+01:00", NULL, NULL]

{Data: NULL, Date: "2011-01-01T12:00:00.05381+01:00", Name: NULL}

CALL apoc.load.xls('https://github.com/neo4j-contrib/neo4j-apoc-procedures/raw/4.4/full/src/test/resources/test_date.xlsx',
  'sheet', { mapping: {
    Date:{type:'String',dateFormat:'iso_date'}
}});
Table 5. Results
lineNo list map

0

["2018/05/10", "2018/10/05", "Alan"]

{Data: "2018/10/05", Date: "2018/05/10", Name: "Alan"}

1

["2018-09-10", 2018-10-10T00:00, "Jack"]

{Data: 2018-10-10T00:00, Date: "2018-09-10", Name: "Jack"}

2

["2018/05/10 12:10:10", 2018-10-10T00:00, 2018-10-10T00:00]

{Data: 2018-10-10T00:00, Date: "2018/05/10 12:10:10", Name: 2018-10-10T00:00}

3

[NULL, 2018-10-10T00:00, 1899-12-31T12:01:10]

{Data: 2018-10-10T00:00, Date: NULL, Name: 1899-12-31T12:01:10}

4

["2011-01-01T12:00:00.05381+01:00", NULL, NULL]

{Data: NULL, Date: "2011-01-01T12:00:00.05381+01:00", Name: NULL}

CALL apoc.load.xls('https://github.com/neo4j-contrib/neo4j-apoc-procedures/raw/4.4/full/src/test/resources/test_date.xlsx',
  'sheet',{ mapping:{
  Date:{type:'String',dateParse:["wrongPath", "dd-MM-yyyy", "dd/MM/yyyy", "yyyy/MM/dd", "yyyy/dd/MM", "yyyy-dd-MM'T'hh:mm:ss"]}
}});
Table 6. Results
lineNo list map

0

["2018/05/10", "2018/10/05", "Alan"]

{Data: "2018/10/05", Date: "2018/05/10", Name: "Alan"}

1

["2018-09-10T00:00:00", 2018-10-10T00:00, "Jack"]

{Data: 2018-10-10T00:00, Date: "2018-09-10T00:00:00", Name: "Jack"}

2

["2018/05/10 12:10:10", 2018-10-10T00:00, 2018-10-10T00:00]

{Data: 2018-10-10T00:00, Date: "2018/05/10 12:10:10", Name: 2018-10-10T00:00}

3

[NULL, 2018-10-10T00:00, 1899-12-31T12:01:10]

{Data: 2018-10-10T00:00, Date: NULL, Name: 1899-12-31T12:01:10}

4

["2011-01-01T12:00:00.05381+01:00", NULL, NULL]

{Data: NULL, Date: "2011-01-01T12:00:00.05381+01:00", Name: NULL}