This appendix describes how to calculate key sizes for native indexes.
As described in the section about Schema indexes, there are limitations to how large the key size can be when using native indexes. This appendix describes in detail how the sizes can be calculated.
To calculate the total size of a key, you first need the size of each single value in it. For some types, the size of an entry differs depending on whether or not the entry is part of an array.
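Table D.1. Element sizes

| Type | elementSize_{ifSingle} * | elementSize_{ifInArray} ** |
|---|---|---|
| Byte | 2 | 1 |
| Short | 3 | 2 |
| Int | 5 | 4 |
| Long | 9 | 8 |
| Float | 5 | 4 |
| Double | 9 | 8 |
| Boolean | 1 | 1 |
| Date | 8 | 8 |
| Time | 12 | 12 |
| LocalTime | 8 | 8 |
| DateTime | 16 | 16 |
| LocalDateTime | 12 | 12 |
| Duration | 28 | 28 |
| Period | 28 | 28 |
| Point (Cartesian) | 28 | 24 |
| Point (Cartesian 3D) | 36 | 32 |
| Point (WGS-84) | 28 | 24 |
| Point (WGS-84 3D) | 36 | 32 |
| String | 2 + utf8StringSize *** | 2 + utf8StringSize |
| Array | † | Nested arrays are not supported |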
* elementSize_{ifSingle} denotes the size of an element if it is a single entry.
** elementSize_{ifInArray} denotes the size of an element if it is part of an array.
*** utf8StringSize is the size of the String in bytes when encoded with UTF-8.
† elementSize_{Array} is the size of an array element, and is calculated using the following formulas:
If the data type of the array is a numeric data type:
elementSize_{Array} = 3 + ( arrayLength * elementSize_{ifInArray} )
If the data type of the array is a geometry data type:
elementSize_{Array} = 5 + ( arrayLength * elementSize_{ifInArray} )
If the data type of the array is non-numeric:
elementSize_{Array} = 2 + ( arrayLength * elementSize_{ifInArray} )
String encoding with UTF-8

Common characters, such as letters, digits and some symbols, translate into one byte per character. Non-Latin characters may occupy more than one byte per character. Therefore, a string that contains 100 characters or less may still be longer than 100 bytes if it contains multi-byte characters. The relevant length of a string is always its length in bytes when encoded with UTF-8.
Consider the string "His name was Måns Lööv".
This string has 19 characters that each occupy one byte.
Additionally, there are three characters (å, ö and ö) that each occupy two bytes, which adds six bytes to the total.
Therefore, utf8StringSize, the size of the String in bytes when encoded with UTF-8, is 25 bytes.
If this string is part of a native index, we get:
elementSize = 2 + utf8StringSize = 2 + 25 = 27 bytes
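The byte count above can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `utf8_string_size` and `string_element_size` are hypothetical and are not part of any Neo4j API.

```python
def utf8_string_size(value: str) -> int:
    """Size of the string in bytes when encoded with UTF-8."""
    return len(value.encode("utf-8"))


def string_element_size(value: str) -> int:
    """Element size of a single string: 2 + utf8StringSize."""
    return 2 + utf8_string_size(value)


# "His name was Måns Lööv"; å and ö are written as escapes so that each
# is guaranteed to be a single precomposed code point.
name = "His name was M\u00e5ns L\u00f6\u00f6v"

print(len(name))                  # 22 characters
print(utf8_string_size(name))     # 25 bytes
print(string_element_size(name))  # 27 bytes
```

The difference between the character count and the byte count is what matters here: the key size limit applies to the encoded bytes, not to the number of characters.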
Consider the array [19, 84, 20, 11, 54, 9, 59, 76, 82, 27, 9, 35, 56, 80, 65, 95, 16, 91, 61, 11].
This array has 20 elements of the type Int.
Since they are in an array, we need to use elementSize_{ifInArray}, which is 4 for Int.
Applying the formula for arrays of numeric data types, we get:
elementSize_{Array} = 3 + ( arrayLength * elementSize_{ifInArray} ) = 3 + ( 20 * 4 ) = 83 bytes
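The same calculation can be written as a small Python sketch. The function name `array_element_size` and the category labels are hypothetical; the overhead values of 3, 5 and 2 bytes are taken from the three array formulas given earlier.

```python
# Fixed per-array overhead in bytes, by category of the array's data type.
ARRAY_OVERHEAD = {"numeric": 3, "geometry": 5, "non-numeric": 2}


def array_element_size(array_length: int, size_in_array: int, kind: str) -> int:
    """elementSize_{Array} = overhead + ( arrayLength * elementSize_{ifInArray} )."""
    return ARRAY_OVERHEAD[kind] + array_length * size_in_array


values = [19, 84, 20, 11, 54, 9, 59, 76, 82, 27,
          9, 35, 56, 80, 65, 95, 16, 91, 61, 11]

# An Int occupies 4 bytes when it is part of an array.
print(array_element_size(len(values), 4, "numeric"))  # 83
```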
The only way that a non-composite index can violate the size limit is if the value is a long string or a large array.
Strings in non-composite indexes have a key size limit of 4036 bytes.
The following formula is used for arrays in non-composite indexes:
1 + elementSize_{Array} <= 4039
Here, elementSize_{Array} is the number calculated from Table D.1, “Element sizes”.
If we count backwards, we can get the exact array length limit for each data type:
maxArrayLength = FLOOR( ( 4039 - 3 - 1 ) / elementSize_{ifInArray} ) for numeric types.
maxArrayLength = FLOOR( ( 4039 - 2 - 1 ) / elementSize_{ifInArray} ) for non-numeric types.
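As a rough illustration, the two formulas can be expressed as one Python function that takes the per-array overhead (3 for numeric types, 2 for non-numeric types) as a parameter. The names `KEY_SIZE_LIMIT` and `max_array_length` are hypothetical and chosen only for this sketch.

```python
KEY_SIZE_LIMIT = 4039  # upper bound from: 1 + elementSize_{Array} <= 4039


def max_array_length(size_in_array: int, array_overhead: int) -> int:
    """Largest array length whose key still fits within the limit.

    size_in_array is elementSize_{ifInArray}; array_overhead is 3 for
    numeric types and 2 for non-numeric types. The extra 1 is the single
    slot of a non-composite index.
    """
    return (KEY_SIZE_LIMIT - array_overhead - 1) // size_in_array


print(max_array_length(8, 3))   # 8-byte numeric element: 504
print(max_array_length(4, 3))   # 4-byte numeric element: 1008
print(max_array_length(16, 2))  # 16-byte non-numeric element: 252
```

Integer division (`//`) plays the role of FLOOR here, since all quantities are non-negative.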
These calculations result in the table below:
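| Data type | maxArrayLength |
|---|---|
| Byte | 4035 |
| Short | 2017 |
| Int | 1008 |
| Long | 504 |
| Float | 1008 |
| Double | 504 |
| Boolean | 4036 |
| String | depends on the string size (see below) |
| Date | 504 |
| Time | 336 |
| LocalTime | 504 |
| DateTime | 252 |
| LocalDateTime | 336 |
| Duration | 144 |
| Period | 144 |
| Point (Cartesian) | 168 |
| Point (Cartesian 3D) | 126 |
| Point (WGS-84) | 168 |
| Point (WGS-84 3D) | 126 |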
Note that in most cases, Cypher will use Long or Double when working with numbers.
Properties of type String are a special case because they are dynamically sized.
The table below shows the maximum number of elements in an array, based on certain string sizes:
| String size (bytes) | maxArrayLength |
|---|---|
| 1 | 1345 |
| 10 | 336 |
| 100 | 39 |
| 1000 | 4 |
The table can be used as a reference point. For example: if we know that all the strings in an array occupy 100 bytes or less, then arrays of length 39 or lower will definitely fit.
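For string sizes not listed in the table, the limit can be estimated with a short Python sketch. It assumes that a string inside an array occupies 2 + utf8StringSize bytes, the same as a single string, and that every string in the array has the same size; under those assumptions it reproduces the table values. The function name is hypothetical.

```python
def max_string_array_length(utf8_string_size: int) -> int:
    """Maximum array length when every string occupies utf8_string_size bytes.

    Uses the non-numeric array formula: ( 4039 - 2 - 1 ) divided by the
    per-element size, assumed here to be 2 + utf8StringSize.
    """
    return (4039 - 2 - 1) // (2 + utf8_string_size)


for size in (1, 10, 100, 1000):
    print(size, max_string_array_length(size))
# 1 1345
# 10 336
# 100 39
# 1000 4
```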
This limitation only applies if one or more of the following criteria is met:
We call each targeted property of a composite index a slot, and denote the number of slots numberOfSlots.
For example, an index on :Person(firstName, surName, age) has three properties and thus numberOfSlots = 3.
In the index, each slot is filled by an element.
In order to calculate the size of the index, we must have the size of each element in the index, i.e. the elementSize, as calculated in previous sections.
The following equation can be used to verify that a specific composite index entry is within bounds:
numberOfSlots + sum( elementSize ) <= 4039
Here, sum( elementSize ) is the sum of the sizes of all the elements of the composite key as defined in the section called “Element size calculations”, and numberOfSlots is the number of targeted properties for the index.
Consider a composite index of five strings that can each occupy up to 500 bytes.
Using the equation above we get:
numberOfSlots + sum( elementSize ) = 5 + ( 5 * ( 2 + 500 ) ) = 2515 < 4039
We are well within bounds for our composite index.
Consider a composite index of five arrays of type Float that each have a length of 250.
First we calculate the size of each array element:
elementSize_{Array} = 3 + ( arrayLength * elementSize_{ifInArray} ) = 3 + ( 250 * 4 ) = 1003
Then we calculate the size of the composite index:
numberOfSlots + sum( elementSize_{Array} ) = 5 + ( 5 * 1003 ) = 5020 > 4039
This index key will exceed the key size limit for native indexes.
To work around this, it is possible to create the index using the lucene+native-2.0 index provider, as described in the section called “Workarounds to address limitations”, but please note that this index provider has been deprecated.
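Both composite index examples above can be verified with a small Python sketch of the bounds check. The function names `composite_key_size` and `fits_in_native_index` are hypothetical; the element sizes are computed by hand from the formulas in this appendix.

```python
KEY_SIZE_LIMIT = 4039


def composite_key_size(element_sizes) -> int:
    """numberOfSlots + sum( elementSize ) for one composite index entry."""
    sizes = list(element_sizes)
    return len(sizes) + sum(sizes)


def fits_in_native_index(element_sizes) -> bool:
    """True if the entry is within the key size limit."""
    return composite_key_size(element_sizes) <= KEY_SIZE_LIMIT


# Five strings of 500 bytes each: elementSize = 2 + 500.
five_strings = [2 + 500] * 5
# Five Float arrays of length 250: elementSize_{Array} = 3 + ( 250 * 4 ).
five_float_arrays = [3 + 250 * 4] * 5

print(composite_key_size(five_strings), fits_in_native_index(five_strings))
# 2515 True
print(composite_key_size(five_float_arrays), fits_in_native_index(five_float_arrays))
# 5020 False
```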