In this article we discuss index providers, their limitations, and the available workarounds.
There are two index types in Neo4j: b-tree and full-text. This article targets b-tree indexes, which were called schema indexes up until 4.0. This is the normal index you get when you create an index or an index-backed constraint through Cypher.
CREATE INDEX `My index` FOR (p:Person) ON (p.name)
All indexes are backed by an index provider. Full-text indexes are backed by fulltext-1.0, and b-tree indexes are backed by native-btree-1.0 (default) or lucene+native-3.0.
The table below lists the available btree index providers and their support for native indexing:
| Index provider | Value types supported for native indexing |
| --- | --- |
| native-btree-1.0 | Native for all types |
| lucene+native-3.0 | Lucene for single-property strings, native for the rest |
Key size limit
The native-btree-1.0 index provider has a key size limit of 8167 bytes.
This limit manifests itself in different ways depending on whether the key holds a single string, a single array, or multiple values (i.e. is a key in a composite index).
If a transaction reaches the key size limit for one or more of its changes, that transaction will fail before committing any changes. If the limit is reached during index population, the resulting index will be in a failed state and therefore not usable for any queries.
See below for details on how to calculate key sizes for native indexes.
If keys do not fit within this limit, a full-text index is most likely a better fit for the use case. If it is not, the lucene+native-3.0 provider can be used instead, which has a key size limit of 32766 bytes.
Contains and ends with queries
The native-btree-1.0 index provider has limited support for ENDS WITH and CONTAINS queries.
These queries cannot use an optimized search the way queries that use STARTS WITH, = and <> can.
Instead, the index result will be a stream of an index scan with filtering.
For single-property strings, lucene+native-3.0 can be used instead, which has full support for both ENDS WITH and CONTAINS.
To create an index with a provider other than the default, the easiest way is to use the db.createIndex, db.createUniquePropertyConstraint or db.createNodeKey procedures, to which you can provide the index provider name.
Another option is to configure the default index provider using dbms.index.default_schema_provider.
Note that a restart is necessary for this config option to take effect.
Key size calculation
This part describes in detail how to calculate key sizes for native indexes.
As described in the section about the key size limit, there is a limit on how large a key can be when using the native-btree-1.0 index provider.
Element size calculations
It is useful to know how to calculate the size of a single value when calculating the total size of the resulting key. In some cases the element size differs depending on whether the element is in an array or not.
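| Type | elementSize_{ifSingle} * | elementSize_{ifInArray} ** |
| --- | --- | --- |
| Byte | 3 | 1 |
| Short | 4 | 2 |
| Int | 6 | 4 |
| Long | 10 | 8 |
| Float | 6 | 4 |
| Double | 10 | 8 |
| Boolean | 2 | 1 |
| Date | 9 | 8 |
| Time | 13 | 12 |
| LocalTime | 9 | 8 |
| DateTime | 17 | 16 |
| LocalDateTime | 13 | 12 |
| Duration | 29 | 28 |
| Period | 29 | 28 |
| Point (Cartesian) | 28 | 24 |
| Point (Cartesian 3D) | 36 | 32 |
| Point (WGS-84) | 28 | 24 |
| Point (WGS-84 3D) | 36 | 32 |
| String | 2 + utf8StringSize *** | 2 + utf8StringSize *** |
| Array | † | Nested arrays are not supported |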
* elementSize_{ifSingle} denotes the size of an element if it is a single entry.
** elementSize_{ifInArray} denotes the size of an element if it is part of an array.
*** utf8StringSize is the size of the String in bytes when encoded with UTF-8.
† elementSize_{Array} is the size of an array element, and is calculated using the following formulas:

If the data type of the array is a numeric data type:
elementSize_{Array} = 4 + ( arrayLength * elementSize_{ifInArray} )

If the data type of the array is a geometry data type:
elementSize_{Array} = 6 + ( arrayLength * elementSize_{ifInArray} )

If the data type of the array is non-numeric:
elementSize_{Array} = 3 + ( arrayLength * elementSize_{ifInArray} )
Note

String encoding with UTF-8
It is worth noting that common characters, such as letters, digits and some symbols, translate into one byte per character. Non-Latin characters may occupy more than one byte per character. Therefore, for example, a string that contains 100 characters or less may be longer than 100 bytes if it contains multibyte characters. More specifically, the relevant length of a string is its length in bytes when encoded with UTF-8.
Consider the string "His name was Måns Lööv".
This string has 19 characters that each occupy 1 byte.
Additionally, there are 3 characters that each occupy 2 bytes, which adds 6 to the total.
Therefore, the size of the String in bytes when encoded with UTF-8, utf8StringSize, is 25 bytes.
If this string is part of a native index, we get:
elementSize = 2 + utf8StringSize = 2 + 25 = 27 bytes
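The byte count in this example can be checked with a few lines of Python. This is only a minimal sketch: the function name string_element_size is an illustrative assumption, and the 2-byte overhead is the one used in the formula above. The non-ASCII letters are written as escape sequences so that the result does not depend on how the source file is encoded.

```python
def string_element_size(value: str) -> int:
    """Return 2 + utf8StringSize, the element size formula used in this article."""
    utf8_string_size = len(value.encode("utf-8"))
    return 2 + utf8_string_size


# "His name was Måns Lööv" with å and ö given as Unicode escapes.
text = "His name was M\u00e5ns L\u00f6\u00f6v"

print(len(text))                  # 22 characters
print(len(text.encode("utf-8")))  # 25 bytes (utf8StringSize)
print(string_element_size(text))  # 27 bytes (elementSize)
```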
Consider the array [19, 84, 20, 11, 54, 9, 59, 76, 82, 27, 9, 35, 56, 80, 65, 95, 16, 91, 61, 11].
This array has 20 elements of the type Int.
Since they are in an array, we need to use elementSize_{ifInArray}, which is 4 for Int.
Applying the formula for arrays of numeric data types, we get:
elementSize_{Array} = 4 + ( arrayLength * elementSize_{ifInArray} ) = 4 + ( 20 * 4 ) = 84 bytes
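The three array formulas can be written as one small helper. The sketch below is illustrative only: the names ARRAY_OVERHEAD and array_element_size are assumptions made for this example, while the overhead values (4, 6 and 3 bytes) and the per-element size are taken from the formulas in this article.

```python
# Fixed overhead in bytes per array, by kind of element type
# (values from the elementSize_{Array} formulas in this article).
ARRAY_OVERHEAD = {"numeric": 4, "geometry": 6, "non-numeric": 3}


def array_element_size(array_length: int, size_in_array: int, kind: str) -> int:
    """Return overhead + arrayLength * elementSize_{ifInArray}."""
    return ARRAY_OVERHEAD[kind] + array_length * size_in_array


values = [19, 84, 20, 11, 54, 9, 59, 76, 82, 27,
          9, 35, 56, 80, 65, 95, 16, 91, 61, 11]

# An Int occupies 4 bytes when stored inside an array.
print(array_element_size(len(values), 4, "numeric"))  # 84
```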
Non-composite indexes
The only way that a non-composite index can violate the size limit is if the value is a long string or a large array.
Strings
Strings in non-composite indexes have a key size limit of 8164 bytes.
Arrays
The following formula is used for arrays in non-composite indexes:
1 + elementSize_{Array} <= 8167
Here elementSize_{Array} is the size calculated in the Element size calculations section.
If we count backwards, we can get the exact array length limit for each data type:

maxArrayLength = FLOOR( ( 8167 - 4 ) / elementSize_{ifInArray} ) for numeric types.
maxArrayLength = FLOOR( ( 8167 - 3 ) / elementSize_{ifInArray} ) for non-numeric types.
These calculations result in the table below:
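| Data type | maxArrayLength |
| --- | --- |
| Byte | 8163 |
| Short | 4081 |
| Int | 2040 |
| Long | 1020 |
| Float | 2040 |
| Double | 1020 |
| Boolean | 8164 |
| String | Depends on string size; see the string table below |
| Date | 1020 |
| Time | 680 |
| LocalTime | 1020 |
| DateTime | 510 |
| LocalDateTime | 680 |
| Duration | 291 |
| Period | 291 |
| Point (Cartesian) | 340 |
| Point (Cartesian 3D) | 255 |
| Point (WGS-84) | 340 |
| Point (WGS-84 3D) | 255 |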
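The values in this table can be reproduced with integer division. The sketch below rests on an assumption: an array overhead of 4 bytes is subtracted for numeric types and 3 bytes for non-numeric types, matching the elementSize_{Array} formulas, because that choice reproduces the numbers listed here. The function name max_array_length is made up for this example.

```python
KEY_SIZE_LIMIT = 8167  # key size limit in bytes for native-btree-1.0


def max_array_length(size_in_array: int, array_overhead: int) -> int:
    """Return FLOOR((limit - overhead) / elementSize_{ifInArray})."""
    return (KEY_SIZE_LIMIT - array_overhead) // size_in_array


# Numeric types (overhead 4), keyed by per-element size in bytes.
for size in (1, 2, 4, 8):
    print(size, max_array_length(size, 4))

# Non-numeric types (overhead 3), keyed by per-element size in bytes.
for size in (1, 8, 12, 16, 28):
    print(size, max_array_length(size, 3))
```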
Note that in most cases, Cypher will use Long or Double when working with numbers.
Properties of type String are a special case because they are dynamically sized.
The table below shows the maximum number of elements in an array, based on certain string sizes:
| String size (bytes) | maxArrayLength |
| --- | --- |
| 1 | 2721 |
| 10 | 680 |
| 100 | 80 |
| 1000 | 8 |
The table can be used as a reference point. For example: if we know that all the strings in an array occupy 100 bytes or less, then arrays of length 80 or lower will definitely fit.
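For string sizes not listed in the table, the same arithmetic can be applied directly. This sketch assumes that every string in the array has the same UTF-8 size, that each string element occupies 2 + utf8StringSize bytes, and that the array overhead for non-numeric types is 3 bytes. The function name is hypothetical.

```python
KEY_SIZE_LIMIT = 8167  # key size limit in bytes for native-btree-1.0
NON_NUMERIC_ARRAY_OVERHEAD = 3


def max_string_array_length(utf8_string_size: int) -> int:
    """Maximum number of equally sized strings that fit in one array key."""
    element_size = 2 + utf8_string_size
    return (KEY_SIZE_LIMIT - NON_NUMERIC_ARRAY_OVERHEAD) // element_size


for size in (1, 10, 100, 1000):
    print(size, max_string_array_length(size))
```

Because shorter strings only make the key smaller, the result for the largest string size in an array gives a safe lower bound on the array length.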
Composite indexes
This limitation only applies if one or more of the following criteria are met:
 Composite index contains strings
 Composite index contains arrays
 Composite index targets many different properties (>50)
We call each targeted property of a composite index a slot.
For example, an index on :Person(firstName, surName, age) has three properties and thus three slots.
In the index, each slot is filled by an element.
In order to calculate the size of the index key, we must have the size of each element in the index, i.e. the elementSize, as calculated in previous sections.
The following equation can be used to verify that a specific composite index entry is within bounds:
sum( elementSize ) <= 8167
Here, sum( elementSize ) is the sum of the sizes of all the elements of the composite key, as defined by elementSize_{ifSingle} in the Element size calculations section.
Consider a composite index of five strings that each can occupy the maximum of 500 bytes.
Using the equation above we get:
sum( elementSize ) = 5 * ( 3 + 500 ) = 2515 < 8167
We are well within bounds for our composite index.
Consider a composite index of 10 arrays of type Float that each have a length of 250.
First we calculate the size of each array element:
elementSize_{Array} = 4 + ( arrayLength * elementSize_{ifInArray} ) = 4 + ( 250 * 4 ) = 1004
Then we calculate the size of the composite index:
sum( elementSize_{Array} ) = 10 * 1004 = 10040 > 8167
This index key will exceed the key size limit for native indexes.
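Both composite examples can be checked with a short helper that sums the element sizes and compares the total against the limit. This is a sketch under the article's own numbers: the per-element sizes are copied from the two examples above, and the function name composite_key_fits is an illustrative assumption.

```python
KEY_SIZE_LIMIT = 8167  # key size limit in bytes for native-btree-1.0


def composite_key_fits(element_sizes):
    """Return (total size, whether the total is within the key size limit)."""
    total = sum(element_sizes)
    return total, total <= KEY_SIZE_LIMIT


# Five strings of up to 500 bytes each, sized as in the first example.
print(composite_key_fits([3 + 500] * 5))       # (2515, True)

# Ten Float arrays of length 250, sized as in the second example.
print(composite_key_fits([4 + 250 * 4] * 10))  # (10040, False)
```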
 Last Modified: 2020-09-15 13:07:09 UTC by Anton Persson.
 Relevant for Neo4j Versions: 4.0.
 Relevant keywords indexing.