Appendix D. Index key calculations

This appendix describes how to calculate key sizes for native indexes.

As described in the section about Schema indexes, there are limitations to how large the key size can be when using native indexes. This appendix describes in detail how the sizes can be calculated.

Element size calculations

To calculate the total size of a key, you first need the size of each single value in it. For some types, the size of an entry differs depending on whether or not the entry is part of an array.

Table D.1. Element sizes
Type                    elementSizeifSingle *     elementSizeifInArray **
Byte                    2                         1
Short                   3                         2
Int                     5                         4
Long                    9                         8
Float                   5                         4
Double                  9                         8
Boolean                 1                         1
Date                    8                         8
Time                    12                        12
LocalTime               8                         8
DateTime                16                        16
LocalDateTime           12                        12
Duration                28                        28
Period                  28                        28
Point (Cartesian)       28                        24
Point (Cartesian 3D)    36                        32
Point (WGS-84)          28                        24
Point (WGS-84 3D)       36                        32
String                  2 + utf8StringSize ***    2 + utf8StringSize ***
Array                   Nested arrays are not supported

* elementSizeifSingle denotes the size of an element if it is a single entry.

** elementSizeifInArray denotes the size of an element if it is part of an array.

*** utf8StringSize is the size of the String in bytes when encoded with UTF8.

elementSizeArray is the size of an array element. For arrays of numeric data types, it is calculated using the following formula:

elementSizeArray = 3 + ( arrayLength * elementSizeifInArray )

String encoding with UTF8

Common characters, such as unaccented Latin letters, digits and some symbols, occupy one byte per character when encoded with UTF8. Characters outside the ASCII range, such as accented or non-Latin characters, occupy more than one byte each. For example, a string of 100 characters or fewer may be longer than 100 bytes if it contains multi-byte characters.

The size that matters for an index key is therefore the length of the string in bytes when encoded with UTF8, not its length in characters.

Example D.1. Calculate the size of a string when used in an index

Consider the string "His name was Måns Lööv".

This string has 19 characters that each occupy one byte. Additionally, there are three characters (å, ö and ö) that each occupy two bytes, which adds six bytes to the total. Therefore, the size of the String in bytes when encoded with UTF8, utf8StringSize, is 25 bytes.

If this string is part of a native index, we get:

elementSize = 2 + utf8StringSize = 2 + 25 = 27 bytes
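The calculation in Example D.1 can be reproduced with a short Python sketch. The function names below are hypothetical helpers chosen for illustration and are not part of any Neo4j API; the only rule taken from this appendix is elementSize = 2 + utf8StringSize from Table D.1.

```python
def utf8_string_size(value: str) -> int:
    """Return the number of bytes the string occupies when encoded with UTF-8."""
    return len(value.encode("utf-8"))


def string_element_size(value: str) -> int:
    """Return elementSize for a String as given in Table D.1: 2 + utf8StringSize."""
    return 2 + utf8_string_size(value)


example = "His name was Måns Lööv"
print(utf8_string_size(example))     # 25
print(string_element_size(example))  # 27
```

Measuring the encoded bytes rather than the character count is what makes the three two-byte characters show up in the result.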

Example D.2. Calculate the size of an array when used in an index

Consider the array [19, 84, 20, 11, 54, 9, 59, 76, 82, 27, 9, 35, 56, 80, 65, 95, 16, 91, 61, 11].

This array has 20 elements of the type Int. Since they are in an array, we need to use elementSizeifInArray, which is 4 for Int.

Applying the formula for arrays of numeric data types, we get:

elementSizeArray = 3 + ( arrayLength * elementSizeifInArray ) = 3 + ( 20 * 4 ) = 83 bytes
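As a minimal sketch, the formula used in Example D.2 can be written as a small Python function. The dictionary holds the in-array sizes of the numeric types from Table D.1; the names `ELEMENT_SIZE_IF_IN_ARRAY` and `element_size_array` are hypothetical and only serve this illustration.

```python
# In-array sizes in bytes for the numeric types, copied from Table D.1.
ELEMENT_SIZE_IF_IN_ARRAY = {
    "Byte": 1,
    "Short": 2,
    "Int": 4,
    "Long": 8,
    "Float": 4,
    "Double": 8,
}

# Constant term of the formula for arrays of numeric data types.
ARRAY_OVERHEAD = 3


def element_size_array(data_type: str, array_length: int) -> int:
    """elementSizeArray = 3 + ( arrayLength * elementSizeifInArray )."""
    return ARRAY_OVERHEAD + array_length * ELEMENT_SIZE_IF_IN_ARRAY[data_type]


values = [19, 84, 20, 11, 54, 9, 59, 76, 82, 27,
          9, 35, 56, 80, 65, 95, 16, 91, 61, 11]
print(element_size_array("Int", len(values)))  # 83
```

Only the array length and the data type matter for the result; the actual values stored in the array do not affect the size.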

Non-composite indexes

The only way that a non-composite index can violate the size limit is if the value is a long string or a large array.

Strings

Strings in non-composite indexes have a key size limit of 4036 bytes.

Arrays

The following formula is used for arrays in non-composite indexes:

1 + elementSizeArray <= 4039

Here elementSizeArray is the number calculated from Table D.1, “Element sizes”.

Counting backwards from this limit gives the exact maximum array length for each data type, as shown in the table below:

Table D.2. Maximum array length, per data type
Data type               maxArrayLength
Byte                    4035
Short                   2017
Int                     1008
Long                    504
Float                   1008
Double                  504
Boolean                 4036
String                  See Table D.3, “Maximum array length, examples for strings”
Date                    504
Time                    336
LocalTime               504
DateTime                252
LocalDateTime           336
Duration                144
Period                  144
Point (Cartesian)       168
Point (Cartesian 3D)    126
Point (WGS-84)          168
Point (WGS-84 3D)       126

Note that in most cases, Cypher will use Long or Double when working with numbers.
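The values in Table D.2 can be derived by solving 1 + elementSizeArray <= 4039 for the array length. The Python sketch below does this under the assumption that the formula for arrays of numeric data types (3 + arrayLength * elementSizeifInArray) also applies to the temporal and spatial types; this assumption is not stated in the text, but it reproduces the listed values. Boolean is omitted because Table D.2 lists 4036 for it, which this sketch does not reproduce. All names are hypothetical.

```python
KEY_SIZE_LIMIT = 4039  # right-hand side of: 1 + elementSizeArray <= 4039
SLOT_OVERHEAD = 1      # the leading 1 in the same inequality
ARRAY_OVERHEAD = 3     # constant term of elementSizeArray (assumed for all types below)

# In-array sizes in bytes, copied from Table D.1.
ELEMENT_SIZE_IF_IN_ARRAY = {
    "Byte": 1, "Short": 2, "Int": 4, "Long": 8, "Float": 4, "Double": 8,
    "Date": 8, "Time": 12, "LocalTime": 8, "DateTime": 16,
    "LocalDateTime": 12, "Duration": 28, "Period": 28,
    "Point (Cartesian)": 24, "Point (Cartesian 3D)": 32,
    "Point (WGS-84)": 24, "Point (WGS-84 3D)": 32,
}


def max_array_length(element_size_if_in_array: int) -> int:
    """Largest arrayLength for which 1 + 3 + arrayLength * size <= 4039."""
    budget = KEY_SIZE_LIMIT - SLOT_OVERHEAD - ARRAY_OVERHEAD
    return budget // element_size_if_in_array


for data_type, size in ELEMENT_SIZE_IF_IN_ARRAY.items():
    print(f"{data_type}: {max_array_length(size)}")
```

Integer division rounds down, which is the right direction here: an array one element longer would push the key past the limit.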

Properties with the type of String are a special case because they are dynamically sized. The table below shows the maximum number of array elements in an array, based on certain string sizes:

Table D.3. Maximum array length, examples for strings
String size (bytes)     maxArrayLength
1                       1345
10                      336
100                     39
1000                    4

The table can be used as a reference point. For example: if we know that all the strings in an array occupy 100 bytes or less, then arrays of length 39 or lower will definitely fit.
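For string sizes not listed in Table D.3, a rough upper bound can be estimated with the sketch below. It assumes that an array of strings has the same constant overhead of 3 bytes as a numeric array and that each string contributes 2 + utf8StringSize bytes, as given in Table D.1. This formula is an inference that happens to reproduce the values in Table D.3; it is not stated explicitly in the text, and the function name is hypothetical.

```python
KEY_SIZE_LIMIT = 4039  # right-hand side of: 1 + elementSizeArray <= 4039
SLOT_OVERHEAD = 1      # the leading 1 in the same inequality
ARRAY_OVERHEAD = 3     # assumed to be the same as for numeric arrays


def max_string_array_length(max_utf8_string_size: int) -> int:
    """Largest array length that fits if every string is at most the given size.

    Assumes each string element occupies 2 + utf8StringSize bytes.
    """
    per_element = 2 + max_utf8_string_size
    budget = KEY_SIZE_LIMIT - SLOT_OVERHEAD - ARRAY_OVERHEAD
    return budget // per_element


for size in (1, 10, 100, 1000):
    print(size, max_string_array_length(size))
```

Passing the size of the longest string in the array gives a conservative answer, since shorter strings only leave more room.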

Composite indexes

This limitation only applies if one or more of the following criteria is met:

We call each targeted property of a composite index a slot, and denote the number of slots numberOfSlots. For example, an index on :Person(firstName, surName, age) has three properties and thus numberOfSlots = 3.

In the index, each slot is filled by an element. In order to calculate the size of the index, we must have the size of each element in the index, i.e. the elementSize, as calculated in previous sections.

The following equation can be used to verify that a specific composite index entry is within bounds:

numberOfSlots + sum( elementSize ) <= 4039

Here, sum( elementSize ) is the sum of the sizes of all the elements of the composite key as defined in the section called “Element size calculations”, and numberOfSlots is the number of targeted properties for the index.

Example D.3. The size of a composite index containing strings

Consider a composite index of five strings that can each occupy a maximum of 500 bytes.

Using the equation above we get:

numberOfSlots + sum( elementSize ) = 5 + ( 5 * ( 2 + 500 ) ) = 2515 < 4039

We are well within bounds for our composite index.

Example D.4. The size of an index containing arrays

Consider a composite index of five arrays of type Float that each have a length of 250.

First we calculate the size of each array element:

elementSizeArray = 3 + ( arrayLength * elementSizeifInArray ) = 3 + ( 250 * 4 ) = 1003

Then we calculate the size of the composite index:

numberOfSlots + sum( elementSizeArray ) = 5 + ( 5 * 1003 ) = 5020 > 4039

This index key will exceed the key size limit for native indexes.

To work around this, it is possible to create the index using the lucene+native-2.0 index provider, as described in the section called “Workarounds to address limitations”, but please note that this index provider has been deprecated.
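The bounds check used in Example D.3 and Example D.4 can be sketched in Python as follows. The function names are hypothetical; the element sizes are passed in already calculated, using 2 + utf8StringSize for strings and 3 + ( arrayLength * elementSizeifInArray ) for arrays of Float, as in the examples.

```python
KEY_SIZE_LIMIT = 4039  # right-hand side of: numberOfSlots + sum( elementSize ) <= 4039


def composite_key_size(element_sizes: list[int]) -> int:
    """numberOfSlots + sum( elementSize ), with one element per slot."""
    return len(element_sizes) + sum(element_sizes)


def fits_in_native_index(element_sizes: list[int]) -> bool:
    """True if the composite key is within the key size limit."""
    return composite_key_size(element_sizes) <= KEY_SIZE_LIMIT


# Example D.3: five strings of at most 500 bytes each.
five_strings = [2 + 500] * 5
print(composite_key_size(five_strings), fits_in_native_index(five_strings))  # 2515 True

# Example D.4: five Float arrays of length 250 each.
five_float_arrays = [3 + 250 * 4] * 5
print(composite_key_size(five_float_arrays), fits_in_native_index(five_float_arrays))  # 5020 False
```

Because the check only sums per-slot sizes, a composite key that mixes strings, arrays and single values can be verified the same way by listing the size of each slot.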