Parsing
This page gives a general overview of how Cypher® parses an input STRING.
The Cypher parser accepts an arbitrary input STRING, and the sections below detail the general rules on which characters are considered valid input.
Using Unicode in Cypher
Unicode characters can generally be escaped as \uxxxx, where xxxx is the four-digit hexadecimal code point.
For example, the query below uses the Unicode escape \u00B0 to search for any recipe descriptions containing the degree symbol, °:
STRING matching
MATCH (r:Recipe)
WHERE r.description CONTAINS "\u00B0"
RETURN r
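To make the escape in the query above concrete, the following Python sketch expands each \u escape followed by four hexadecimal digits into the character it names. The helper name decode_unicode_escapes is hypothetical, and the code is only an illustration of the escape format, not of how the Cypher parser is implemented. It does not handle surrogate pairs.

```python
import re

# Matches a backslash, the letter u, and exactly four hexadecimal digits.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def decode_unicode_escapes(text: str) -> str:
    """Replace each \\uXXXX sequence with the character at that code point.

    Hypothetical helper for illustration only; not Cypher's parser.
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# A raw string keeps the six literal characters \, u, 0, 0, B, 0.
literal = r"\u00B0"
decoded = decode_unicode_escapes(literal)

print(len(literal), len(decoded))   # six characters in, one character out
print(f"U+{ord(decoded):04X}")      # code point of the decoded character
```

The decoded character is U+00B0 (degree sign), which is distinct from the similar-looking U+00BA (masculine ordinal indicator).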
Additional documentation on escaping rules for STRING literals, names, and regular expressions is available in the sections of the Cypher documentation that cover those topics.
The Unicode version used by Cypher depends on the running JVM version.
Neo4j version | JVM compliancy | Unicode version |
---|---|---|
3.x | Java SE 8 Platform Specification | Unicode 6.2 |
4.x | Java SE 11 Platform Specification | Unicode 10.0 |
5.x | Java SE 17 Platform Specification | Unicode 13.0 |
5.14 | Java SE 17 and Java SE 21 Platform Specification | Unicode 13.0 and Unicode 15.0 |
Supported whitespace
Whitespace can be used as a separator between keywords and has no semantic meaning. The following Unicode characters are considered whitespace:
Description | List of included Unicode characters |
---|---|
Unicode general category Zp | \u2029 |
Unicode general category Zs | |
Unicode general category Zl | \u2028 |
Horizontal tabulation | \u0009 |
Line feed | \u000A |
Vertical tabulation | \u000B |
Form feed | \u000C |
Carriage return | \u000D |
File separator | \u001C |
Group separator | \u001D |
Record separator | \u001E |
Unit separator | \u001F |
Multiple consecutive whitespace characters are allowed and have the same effect as a single whitespace character.
The following example query uses vertical tabulation (\u000B) as whitespace between the RETURN keyword and the variable m:
MATCH (m) RETURN\u000Bm;
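The whitespace rules above can be sketched in Python as a membership test plus a function that collapses each run of whitespace into one space. This is a rough illustration, not the Cypher parser: the names is_whitespace and collapse_whitespace are hypothetical, the Unicode categories come from Python's unicodedata module (whose Unicode version may differ from that of the JVM running Neo4j), and the collapse does not treat STRING literals specially.

```python
import unicodedata

# Control characters named in the whitespace table above.
CONTROL_WHITESPACE = {
    "\u0009",  # horizontal tabulation
    "\u000A",  # line feed
    "\u000B",  # vertical tabulation
    "\u000C",  # form feed
    "\u000D",  # carriage return
    "\u001C",  # file separator
    "\u001D",  # group separator
    "\u001E",  # record separator
    "\u001F",  # unit separator
}

# Unicode general categories named in the whitespace table above.
SEPARATOR_CATEGORIES = {"Zs", "Zl", "Zp"}


def is_whitespace(ch: str) -> bool:
    """Hypothetical check that mirrors the table; not the real parser."""
    return (
        ch in CONTROL_WHITESPACE
        or unicodedata.category(ch) in SEPARATOR_CATEGORIES
    )


def collapse_whitespace(query: str) -> str:
    """Replace every run of whitespace characters with a single space."""
    out = []
    in_run = False
    for ch in query:
        if is_whitespace(ch):
            if not in_run:
                out.append(" ")
            in_run = True
        else:
            out.append(ch)
            in_run = False
    return "".join(out)


# In this Python literal, \u000B is the actual vertical tabulation character.
query = "MATCH (m) RETURN\u000Bm;"
print(collapse_whitespace(query))  # MATCH (m) RETURN m;
```

Under these assumptions, the example query and the same query written with a plain space between RETURN and m collapse to identical text.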
Supported newline characters
A newline character identifies a new line in the query and is also considered whitespace. The supported newline characters in Cypher are:
Description | List of included Unicode characters |
---|---|
Line feed | \u000A |
Carriage return | \u000D |
Carriage return + line feed | \u000D\u000A |
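As an illustration of the three newline forms, the Python sketch below splits query text into lines, for example to map a position back to a line number. The function name split_lines is hypothetical and the code is not taken from the Cypher parser. The two-character sequence is listed first in the pattern so that carriage return + line feed counts as one newline rather than two.

```python
import re

# CRLF must come first; otherwise "\r\n" would match as "\r" and then "\n".
NEWLINE = re.compile(r"\r\n|\r|\n")


def split_lines(query: str) -> list[str]:
    """Split query text on line feed, carriage return, or CR + LF."""
    return NEWLINE.split(query)


query = "MATCH (n)\r\nWHERE n.name = 'a'\rRETURN n"
for number, line in enumerate(split_lines(query), start=1):
    print(number, repr(line))
```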