Core concepts
A Neo4j graph database consists of three core entities: nodes, relationships, and paths. Cypher® queries either match or create these entities in a graph, so a basic understanding of nodes, relationships, and paths is essential for writing Cypher queries.
Nodes
The data entities in a Neo4j graph database are called nodes.
Nodes are referred to in Cypher using parentheses ().
MATCH (n:Person {name:'Anna'})
RETURN n.born AS birthYear
In the above example, the node includes the following:
- A Person label. Labels are like tags, and are used to query the database for specific nodes. A node may have multiple labels, for example Person and Actor.
- A name property set to Anna. Properties are defined within curly braces, {}, and are used to provide nodes with specific information, which can also be queried for and further improve the ability to pinpoint data.
- A variable, n. Variables allow referencing specified nodes in subsequent clauses.
In this example, the MATCH clause finds all Person nodes in the graph with the name property set to Anna, and binds them to the variable n.
The variable n is then passed along to the subsequent RETURN clause, which returns the value of a different property (born) belonging to the same node.
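To make the node example concrete, the following is a minimal sketch of data that the query could match. The Actor label comes from the list above, but the birth year 1985 is an assumed value chosen only for illustration.

```cypher
// Hypothetical sample data: the birth year is an assumption for illustration.
// The node carries two labels (Person and Actor) and two properties.
CREATE (:Person:Actor {name: 'Anna', born: 1985})
```

A pattern can also require several labels at once. The sketch below matches only nodes that have both labels:

```cypher
// Matches nodes labeled with both Person and Actor.
MATCH (n:Person:Actor)
RETURN n.name AS name
```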
Relationships
Nodes in a graph can be connected with relationships.
A relationship must have a start node, an end node, and exactly one type.
Relationships are represented in Cypher with arrows (e.g. -->) indicating the direction of a relationship.
MATCH (:Person {name: 'Anna'})-[r:KNOWS WHERE r.since < 2020]->(friend:Person)
RETURN count(r) AS numberOfFriends
Unlike node patterns, which use parentheses, the details of a relationship pattern (such as its variable, type, and predicates) are enclosed in square brackets.
The query example above matches relationships of type KNOWS whose since property is less than 2020.
The query also requires the relationships to go from a Person node named Anna to any other Person nodes, referred to as friend.
The count() function is used in the RETURN clause to count all the relationships bound by the r variable in the preceding MATCH clause (i.e. how many friends Anna has known since before 2020).
Note that while nodes can have several labels, relationships can only have one type.
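As a sketch of data that the relationship query could run against, the statement below creates two KNOWS relationships starting at Anna. The names Ben and Cara and the since values are assumptions made up for this example.

```cypher
// Hypothetical sample data: names and years are assumptions for illustration.
// Each relationship has a start node, an end node, and exactly one type (KNOWS).
CREATE (anna:Person {name: 'Anna'})-[:KNOWS {since: 2015}]->(:Person {name: 'Ben'}),
       (anna)-[:KNOWS {since: 2022}]->(:Person {name: 'Cara'})
```

The predicate on the relationship can also be written in a separate WHERE clause instead of inside the square brackets. The following form should be equivalent to the query above:

```cypher
// Same filter, expressed as a WHERE clause after the pattern.
MATCH (:Person {name: 'Anna'})-[r:KNOWS]->(friend:Person)
WHERE r.since < 2020
RETURN count(r) AS numberOfFriends
```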
Paths
Paths in a graph consist of connected nodes and relationships. Exploring these paths sits at the very core of Cypher.
MATCH (n:Person {name: 'Anna'})-[:KNOWS]-{1,5}(friend:Person WHERE n.born < friend.born)
RETURN DISTINCT friend.name AS youngerConnections
This example uses a quantified relationship to find all paths up to 5 hops away, traversing only relationships of type KNOWS from the start node Anna to other younger Person nodes (as defined by the WHERE clause).
The DISTINCT operator ensures that each name is returned only once, even when the same Person node can be reached through several paths.
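The quantifier controls how many hops are traversed. As a sketch, a single number in the curly braces asks for an exact number of hops instead of a range. The variable name friendOfFriend and the alias twoHopsAway are hypothetical.

```cypher
// Exactly two KNOWS hops from Anna, in either direction.
MATCH (:Person {name: 'Anna'})-[:KNOWS]-{2}(friendOfFriend:Person)
RETURN DISTINCT friendOfFriend.name AS twoHopsAway
```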
Paths can also be assigned variables.
For example, the query below binds a whole path pattern to the variable p. The pattern matches the SHORTEST path from Anna to another Person node in the graph with a nationality property set to Canadian.
In this case, the RETURN clause returns the full path between the two nodes.
MATCH p = SHORTEST 1 (:Person {name: 'Anna'})-[:KNOWS]-+(:Person {nationality: 'Canadian'})
RETURN p
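Once a path is bound to a variable, functions such as length() and nodes() can inspect it. The sketch below reuses the same pattern. The aliases hops and namesAlongPath are hypothetical, and it assumes every node on the path has a name property.

```cypher
// Return the number of relationships in the path and the names of its nodes, in order.
MATCH p = SHORTEST 1 (:Person {name: 'Anna'})-[:KNOWS]-+(:Person {nationality: 'Canadian'})
RETURN length(p) AS hops,
       [person IN nodes(p) | person.name] AS namesAlongPath
```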
For more information about graph pattern matching, see Patterns.