Core concepts

Fundamentally, a Neo4j graph database consists of three core entities: nodes, relationships, and paths. Cypher® queries are constructed to either match or create these entities in a graph. A basic understanding of nodes, relationships, and paths is therefore essential for writing Cypher queries.

The examples below use the MATCH and RETURN clauses to find and return the desired graph patterns. To learn more about these and other Cypher clauses, see the section on Clauses.

Nodes

The data entities in a Neo4j graph database are called nodes. Nodes are referred to in Cypher using parentheses, ().

MATCH (n:Person {name: 'Anna'})
RETURN n.born AS birthYear

In the above example, the node includes the following:

  • A Person label. Labels are like tags, and are used to query the database for specific nodes. A node may have multiple labels, for example Person and Actor.

  • A name property set to Anna. Properties are defined within curly braces, {}, and store specific information on a node. Properties can also be used in queries, which makes it easier to pinpoint the data of interest.

  • A variable, n. Variables allow referencing specified nodes in subsequent clauses.

In this example, the MATCH clause finds all Person nodes in the graph with the name property set to Anna, and binds them to the variable n. The variable n is then passed along to the subsequent RETURN clause, which returns the value of a different property (born) belonging to the same node.
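
The query above only returns rows if matching nodes exist. The following is a minimal sketch of how such nodes could be created and then matched by more than one label. The names, birth years, and the use of the Actor label on Anna are assumptions made for illustration and do not come from any particular dataset.

```cypher
// Hypothetical sample data: the names, years, and labels are illustrative assumptions.
CREATE (:Person:Actor {name: 'Anna', born: 1985}),
       (:Person {name: 'Ben', born: 1979});

// Match only nodes that carry both the Person and the Actor label.
MATCH (n:Person:Actor)
RETURN n.name AS name, labels(n) AS nodeLabels;
```

Listing several labels in a node pattern narrows the match to nodes that have all of them, while labels(n) returns every label on the matched node.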

Relationships

Nodes in a graph can be connected with relationships. A relationship must have a start node, an end node, and exactly one type. Relationships are represented in Cypher with arrows (e.g. -->) indicating the direction of a relationship.

MATCH (:Person {name: 'Anna'})-[r:KNOWS WHERE r.since < 2020]->(friend:Person)
RETURN count(r) AS numberOfFriends

Unlike nodes, information within a relationship pattern must be enclosed by square brackets. The query example above matches relationships of type KNOWS whose since property is less than 2020. The query also requires the relationships to go from a Person node named Anna to any other Person node, referred to as friend. The count() function is used in the RETURN clause to count all the relationships bound to the r variable in the preceding MATCH clause (i.e. how many friends Anna has known since before 2020).

Note that while nodes can have several labels, relationships can only have one type.
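
As a rough illustration of these rules, the sketch below creates a single KNOWS relationship between two existing nodes. It assumes that Person nodes named Anna and Ben are already present in the graph; Ben and the since value of 2015 are hypothetical.

```cypher
// Assumes Person nodes named 'Anna' and 'Ben' already exist (hypothetical data).
MATCH (a:Person {name: 'Anna'}), (b:Person {name: 'Ben'})
// A relationship is created with a direction, exactly one type, and optional properties.
CREATE (a)-[r:KNOWS {since: 2015}]->(b)
RETURN type(r) AS relType, r.since AS since;
```

A relationship always has a direction when it is created, but a MATCH pattern may leave the arrowhead off, as in (a)-[:KNOWS]-(b), to match the relationship in either direction.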

Paths

Paths in a graph consist of connected nodes and relationships. Exploring these paths sits at the very core of Cypher.

MATCH (n:Person {name: 'Anna'})-[:KNOWS]-{1,5}(friend:Person WHERE friend.born < n.born)
RETURN DISTINCT friend.name AS olderConnections

This example uses a quantified relationship to find all paths between 1 and 5 hops long, traversing only relationships of type KNOWS, from the start node Anna to other, older Person nodes (as defined by the WHERE clause). The DISTINCT operator ensures that the RETURN clause returns each name only once, even when several paths lead to the same person.
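
Because the same person can be reached by several paths of different lengths, it can be useful to see how far away each connection is. The sketch below is one possible way to do this, not the only one: it binds each matched path to a variable and keeps the smallest hop count per name. The aliases connection and fewestHops are made up for this example, and the query assumes a Neo4j version that supports quantified relationships.

```cypher
// Bind every matching path (1 to 5 KNOWS hops from Anna) to the variable p.
MATCH p = (:Person {name: 'Anna'})-[:KNOWS]-{1,5}(friend:Person)
// Group by name and keep the length of the shortest path found for each one.
RETURN friend.name AS connection, min(length(p)) AS fewestHops
ORDER BY fewestHops;
```

Here length(p) is the number of relationships in a path, so a direct friend has a value of 1.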

Paths can also be assigned to variables. For example, the query below binds a whole path pattern to the variable p. The pattern matches the shortest path, up to 10 hops long, from Anna to another Person node whose nationality property is set to Canadian. In this case, the RETURN clause returns the full path between the two nodes.

MATCH p=shortestPath((:Person {name: 'Anna'})-[:KNOWS*1..10]-(:Person {nationality: 'Canadian'}))
RETURN p
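
A path variable can also be unpacked rather than returned whole. As a small sketch using the same pattern, the query below returns the number of hops and the names of the nodes along the path. The aliases hops and namesAlongPath are illustrative, and the query assumes every node on the path has a name property.

```cypher
MATCH p = shortestPath((:Person {name: 'Anna'})-[:KNOWS*1..10]-(:Person {nationality: 'Canadian'}))
// length(p) counts relationships; nodes(p) lists the nodes in path order.
RETURN length(p) AS hops, [x IN nodes(p) | x.name] AS namesAlongPath;
```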

For more information about graph patterns, see the section on Patterns.