Structure Semantics

While PackStream defines what a Structure looks like, it does not define what it means. The semantics of Structures are bound to the Bolt Protocol version.

The table below lists the PackStream specified structures and their code and tag byte across all currently existing Bolt Protocol versions.

Structures

Structure name Code tag byte

Node

N

4E

Relationship

R

52

UnboundRelationship

r

72

Path

P

50

Date

D

44

Time

T

54

LocalTime

t

74

DateTime

I

49

DateTimeZoneId

i

69

LocalDateTime

d

64

Duration

E

45

Point2D

X

58

Point3D

Y

59

Legacy Structures

Legacy DateTime

F

46

Legacy DateTimeZoneId

f

66

Node

A snapshot of a node within a graph database.

The element_id field was added with version 5.0 and does not exist in earlier versions.

tag byte: 4E

Number of fields: 4 (3 before version 5.0)

Node::Structure(
    id::Integer,
    labels::List<String>,
    properties::Dictionary,
    element_id::String,
)
Example of a node structure
Node(
  id = 3,
  labels = ["Example", "Node"],
  properties = {"name": "example"},
  element_id = "abc123",
)
B4 4E
...

Relationship

A snapshot of a relationship within a graph database.

The fields element_id, start_node_element_id, and end_node_element_id were added with version 5.0 and do not exist in earlier versions.

tag byte: 52

Number of fields: 8 (5 before version 5.0)

Relationship::Structure(
    id::Integer,
    startNodeId::Integer,
    endNodeId::Integer,
    type::String,
    properties::Dictionary,
    element_id::String,
    start_node_element_id::String,
    end_node_element_id::String,
)
Example of a relationship structure
Relationship(
    id = 11,
    startNodeId = 2,
    endNodeId = 3,
    type = "KNOWS",
    properties = {"name": "example"},
    element_id = "abc123",
    start_node_element_id = "def456",
    end_node_element_id = "ghi789",
)
B8 52
...

UnboundRelationship

A relationship without start or end node ID. It is used internally for Path serialization.

The element_id field was added with version 5.0 and does not exist in earlier versions.

tag byte: 72

Number of fields: 4 (3 before version 5.0)

UnboundRelationship::Structure(
    id::Integer,
    type::String,
    properties::Dictionary,
    element_id::String,
)
Example of unbound relationship structure
UnboundRelationship(
    id = 17,
    type = "KNOWS",
    properties = {"name": "example"},
    element_id = "foo"
)
B4 72
...

Path

An alternating sequence of nodes and relationships.

tag byte: 50

Number of fields: 3

Path::Structure(
    nodes::List<Node>,
    rels::List<UnboundRelationship>,
    indices::List<Integer>,
)

Where the nodes field contains a list of nodes and the rels field is a list of unbound relationships. The indices are a list of integers describing how to construct the path from nodes and rels. The first node in nodes is always the first node in the path and is not referenced in indices. indices always has an even number of entries. The 1st, 3rd, …​ entry in indices refers to an entry in rels (1-indexed), for example, a 3 would refer to the 3rd element of rels. The number can also be negative which should be treated like the positive equivalent, except for denoting the relationship in the inverse direction. The number is never 0. The 2nd, 4th, …​ entry in indices refers to an entry in nodes (0-indexed), for example, a 3 would refer the 2nd element of nodes. The number is always ≥ 0.

Example (simplified notation for <Node> and <UnboundRelationship>)
Path::Structure(
    nodes: [Node::Structure(42, ...), Node::Structure(69, ...), Node::Structure(1, ...)],
    rels: [UnboundRelationship::Structure(1000, ...), UnboundRelationship::Structure(1001, ...)],
    indices: [1, 1, 1, 0, -2, 2],

This represents the path (42)-[1000]→(69)-[1000]→(42)←[1001]-(1), where (n) denotes a node with id n and [n] a relationship with id n ( or denotes the direction of each relationship).

Date

A date without a time-zone in the ISO-8601 calendar system, e.g. 2007-12-03.

tag byte: 44

Number of fields: 1

Date::Structure(
    days::Integer,
)

Where the days are days since Unix epoch. 0 for example represents 1970-01-01 while 1 represents 1970-01-02.

Time

An instant capturing the time of day, and the timezone, but not the date.

tag byte: 54

Number of fields: 2

Time::Structure(
    nanoseconds::Integer,
    tz_offset_seconds::Integer,
)

Where the nanoseconds are nanoseconds since midnight (this time is not UTC) and the tz_offset_seconds are an offset in seconds from UTC.

LocalTime

An instant capturing the time of day, but neither the date nor the time zone.

tag byte: 74

Number of fields: 1

LocalTime::Structure(
    nanoseconds::Integer,
)

Where the nanoseconds are nanoseconds since midnight.

DateTime

An instant capturing the date, the time, and the time zone. The time zone information is specified with a zone offset.

This structure is new in version 5.0. It replaces Legacy DateTime and fixes a bug in certain edge-cases. Version 4.4 also allows for usage of the fixed structure, if server and client negotiate its usage (see HELLO message).

tag byte: 49

Number of fields: 3

DateTime::Structure(
    seconds::Integer,
    nanoseconds::Integer,
    tz_offset_seconds::Integer,
)
  • The seconds and nanoseconds are the time since Unix epoch, often referred as a Unix timestamp.

  • The amount of nanoseconds ranges from 0 to 999_999_999 (_ separator added here and later for clarity).

  • The tz_offset_seconds specifies the offset in seconds from UTC.

For instance, the serialization of the point in time denoted as 1970-01-01T02:15:00.000000042+01:00 can be implemented as follows:

  • compute the UTC time, i.e. 1970-01-01T01:15:00.000000042Z (Z denotes UTC).

  • compute the difference between that UTC time and the Unix epoch, which is 1h15, i.e. 4_500 seconds.

The resulting DateTime instance is therefore as follows:

{
  seconds: 4500
  nanoseconds: 42,
  tz_offset_seconds: 3600
}

The deserialization of such a DateTime structure expectedly happens in reverse:

  • instantiate the idiomatic equivalent of DateTime based on that Unix timestamp (4500 seconds and 42 nanoseconds), giving 1970-01-01T01:15:00.000000042Z

  • localize the resulting UTC DateTime to the timezone of the specified offset, giving 1970-01-01T02:15:00.000000042+0100

DateTimeZoneId

An instant capturing the date, the time, and the time zone. The time zone information is specified with a zone identifier.

This structure is new in version 5.0. It replaces Legacy DateTimeZoneId and fixes a bug in certain edge-cases. Version 4.4 also allows for usage of the fixed structure, if server and client negotiate its usage (see HELLO message).

tag byte: 69

Number of fields: 3

DateTimeZoneId::Structure(
    seconds::Integer,
    nanoseconds::Integer,
    tz_id::String,
)
  • The seconds and nanoseconds are the time since Unix epoch, often referred as a Unix timestamp.

  • The amount of nanoseconds ranges from 0 to 999_999_999 (_ separator added here and later for clarity).

  • The tz_id specifies the timezone name as understood by the timezone database.

For instance, the serialization of the point in time denoted as 1970-01-01T02:15:00.000000042+0100[Europe/Paris] can be implemented as follows:

  • retrieve the offset of the named timezone for that point in time, here +1 hour, i.e. 3_600 seconds.

  • compute the UTC time, i.e. 1970-01-01T01:15:00.000000042Z (Z denotes UTC).

  • compute the difference between that UTC time and the Unix epoch, which is 1h15, i.e. 4_500 seconds.

The resulting DateTime instance is therefore as follows:

{
  seconds: 4500
  nanoseconds: 42,
  tz_id: "Europe/Paris"
}

The deserialization of such a DateTime structure happens in reverse:

  • instantiate the idiomatic equivalent of DateTime based on that Unix timestamp (4500 seconds and 42 nanoseconds), giving 1970-01-01T01:15:00.000000042Z

  • localize the resulting UTC DateTime to the timezone specified by tz_id, giving 1970-01-01T02:15:00.000000042+0100[Europe/Paris]

Known Limitations

Accuracy

The resolution of offsets for a given time zone name and point in time is bound to the accuracy of the underlying timezone database. In particular, time zones before 1970 are not as well specified. Moreover, the offset resolution likely occurs both on the Bolt client side and Bolt server side. They each rely on a different timezone database. If these copies are not in sync, it could lead to unwanted discrepancies. In such a situation, either server or client could:

  • reject a timezone name deemed valid by the other party.

  • resolve different offsets for the same time zone and DateTimeZoneId.

LocalDateTime

An instant capturing the date and the time but not the time zone.

tag byte: 64

Number of fields: 2

LocalDateTime::Structure(
    seconds::Integer,
    nanoseconds::Integer,
)

Where the seconds are seconds since the Unix epoch.

Duration

A temporal amount. This captures the difference in time between two instants. It only captures the amount of time between two instants, it does not capture a start time and end time. A unit capturing the start time and end time would be a Time Interval and is out of scope for this proposal.

A duration can be negative.

tag byte: 45

Number of fields: 4

Duration::Structure(
    months::Integer,
    days::Integer,
    seconds::Integer,
    nanoseconds::Integer,
)

Point2D

A representation of a single location in 2-dimensional space.

tag byte: 58

Number of fields: 3

Point2D::Structure(
    srid::Integer,
    x::Float,
    y::Float,
)

Where the srid is a Spatial Reference System Identifier.

Point3D

A representation of a single location in 3-dimensional space.

tag byte: 59

Number of fields: 4

Point3D::Structure(
    srid::Integer,
    x::Float,
    y::Float,
    z::Float,
)

Where the srid is a Spatial Reference System Identifier.

Legacy Structures

Legacy DateTime

An instant capturing the date, the time, and the time zone. The time zone information is specified with a zone offset.

This structure got removed in version 5.0 in favor of DateTime.

tag byte: 46

Number of fields: 3

DateTime::Structure(
    seconds::Integer,
    nanoseconds::Integer,
    tz_offset_seconds::Integer,
)
  • The tz_offset_seconds specifies the offset in seconds from UTC.

  • The seconds elapsed since the Unix epoch, often referred as a Unix timestamp, added to the above offset.

  • The nanoseconds are what remains after the last second of the DateTime. The amount of nanoseconds ranges from 0 to 999_999_999 (_ separator added here for clarity).

For instance, the serialization of the point in time denoted as 1970-01-01T02:15:00+01:00 (and 42 nanoseconds) can be implemented as follows:

  • compute the UTC time, i.e. 1970-01-01T01:15:00Z (Z denotes UTC).

  • compute the difference between that UTC time and the Unix epoch, which is 1h15, i.e. 4500 seconds.

  • add the offset of +1 hour, i.e. 3600 seconds, to the above difference, which yields 8100 (4500 + 3600).

The resulting DateTime instance is therefore as follows:

{
  seconds: 8100
  nanoseconds: 42,
  tz_offset_seconds: 3600
}

The deserialization of such a DateTime structure expectedly happens in reverse:

  • remove the offset from the seconds field, which gives here 8100

  • instantiate the idiomatic equivalent of DateTime based on that Unix timestamp, giving 1970-01-01T01:15:00Z

  • localize the resulting UTC DateTime to the timezone of the specified offset, giving 1970-01-01T02:15:00+0100

Legacy DateTimeZoneId

An instant capturing the date, the time, and the time zone. The time zone information is specified with a zone identifier.

This structure got removed in version 5.0 in favor of DateTimeZoneId.

tag byte: 66

Number of fields: 3

DateTimeZoneId::Structure(
    seconds::Integer,
    nanoseconds::Integer,
    tz_id::String,
)
  • The tz_id specifies the timezone name as understood by the timezone database.

  • The seconds elapsed since the Unix epoch, often referred as a Unix timestamp, added to the offset derived from the named timezone and specified the point in time.

  • The nanoseconds are what remains after the last second of the DateTime. The amount of nanoseconds ranges from 0 to 999_999_999 (_ separator added here and later for clarity).

For instance, the serialization of the point in time denoted as 1970-01-01T02:15:00+0100[Europe/Paris] (and 42 nanoseconds) can be implemented as follows:

  • retrieve the offset of the named timezone for that point in time, here +1 hour, i.e. 3600 seconds.

  • compute the UTC time, i.e. 1970-01-01T01:15:00Z (Z denotes UTC).

  • compute the difference between that UTC time and the Unix epoch, which is 1h15, i.e. 4500 seconds.

  • add the resolved offset of +1 hour, i.e. 3600 seconds, to the above difference, which yields 8100 (4500 + 3600).

The resulting DateTime instance is therefore as follows:

{
  seconds: 8100
  nanoseconds: 42,
  tz_id: "Europe/Paris"
}

The deserialization of such a DateTime structure happens as follows:

  • instantiate the idiomatic equivalent of DateTime assuming the seconds denote a Unix timestamp, giving 1970-01-01T02:15:00Z.

  • set the timezone of the resulting instance, without changing the date/time components, giving 1970-01-01T02:15:00+0100[Europe/Paris] (this may lead to ambiguities, refer to the Known Limitations section below for more information).

Known Limitations

Accuracy

The resolution of offsets for a given time zone name and point in time is bound to the accuracy of the underlying timezone database. In particular, time zones before 1970 are not as well specified. Moreover, the offset resolution likely occurs both on the Bolt client side and Bolt server side. They each rely on a different timezone database. If these copies are not in sync, it could lead to unwanted discrepancies. In such a situation, either server or client could:

  • reject a timezone name deemed valid by the other party.

  • resolve different offsets for the same time zone and DateTimeZoneId.

Time Shifts

Note: these issues have been resolved with the introduction of DateTimeZoneId in version 5.0.

Not all instances of DateTimeZoneId map to a single valid point in time.

  1. During time shifts like going from 2AM to 3AM in a given day and timezone, 2:30AM e.g. does not happen.

  2. Similarly, when going from 3AM to 2AM in a given day and timezone, 2:30AM happens twice.

In the first case, a DateTimeZoneId specifying a time between 2AM and 3AM does not correspond to any actual points in time for that timezone and is invalid.

In the second case, all points in the time between 2AM and 3AM exist twice, but with a different offset. Therefore, the timezone name is not sufficient to resolve the ambiguity, the timezone offset is also needed. Since DateTimeZoneId does not include the timezone offset, the resolution of these particular datetimes is undefined behavior.

Summary of changes per version

The sections below list the changes of structure semantics in versions where they changed. Please also check for changes in Bolt Messages.

Version 5.0