This chapter presents an introduction to graph database concepts.
We will use the example graph below to introduce the basic concepts of the labeled property graph:
Nodes are often used to represent entities. The simplest possible graph is a single node.
Consider the graph below, consisting of a single node.
Labels are used to shape the domain by grouping nodes into sets, where all nodes that have a certain label belong to the same set.
For example, all nodes representing users could be labeled with the label :User. With that in place, you can ask Neo4j to perform operations only on your user nodes, such as finding all users with a given name.
Since labels can be added and removed during runtime, they can also be used to mark temporary states for nodes.
A :Suspended label could be used to denote bank accounts that are suspended, and a :Seasonal label can denote vegetables that are currently in season.
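The way labels group nodes into sets, and can be added or removed to mark a temporary state, can be sketched in a few lines of Python. This is only an illustrative in-memory model, not a Neo4j API: the Node class, the nodes_with_label helper, and the sample users are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with a set of labels and a dictionary of properties."""
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


def nodes_with_label(nodes, label):
    """Return only the nodes that carry the given label."""
    return [n for n in nodes if label in n.labels]


# Hypothetical data: two users and one node without any label.
alice = Node({"User"}, {"name": "Alice"})
bob = Node({"User"}, {"name": "Bob"})
other = Node()  # a node can have zero labels

nodes = [alice, bob, other]
users = nodes_with_label(nodes, "User")

# Labels can be added and removed at runtime to mark a temporary state.
bob.labels.add("Suspended")
suspended_now = nodes_with_label(nodes, "Suspended")
bob.labels.discard("Suspended")
suspended_later = nodes_with_label(nodes, "Suspended")
```

Because a label is just set membership in this sketch, a node can carry several labels at once, and restricting an operation to "user nodes" is a filter on that set.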
A node can have zero to many labels.
In the example above, the nodes have the labels Person and Movie, which is one possible way of describing the data.
But assume that we want to express different dimensions of the data.
One way of doing that is to add more labels.
Below is an example showing the use of multiple labels:
A relationship connects two nodes. Relationships organize nodes into structures, allowing a graph to resemble a list, a tree, a map, or a compound entity — any of which may be combined into yet more complex, richly inter-connected structures.
Our example graph will make a lot more sense once we add relationships to it:
A relationship must have exactly one relationship type.
Our example uses ACTED_IN and DIRECTED as relationship types. The roles property on the ACTED_IN relationship has an array value with a single item in it.
Below is an ACTED_IN relationship, with the Tom Hanks node as the source node and Forrest Gump as the target node. We observe that the Tom Hanks node has an outgoing relationship, while the Forrest Gump node has an incoming relationship.
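As a rough illustration of source node, target node, and direction, the following Python sketch models the ACTED_IN relationship described above. It is not how Neo4j stores data; the Node and Relationship classes, the outgoing and incoming helpers, the property keys name and title, and the placeholder role value are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare nodes by identity, not by content
class Node:
    labels: set
    properties: dict


@dataclass(eq=False)
class Relationship:
    source: Node   # the node the relationship starts from
    type: str      # exactly one relationship type
    target: Node   # the node the relationship points to
    properties: dict = field(default_factory=dict)


def outgoing(node, relationships):
    """Relationships that start at the given node."""
    return [r for r in relationships if r.source is node]


def incoming(node, relationships):
    """Relationships that end at the given node."""
    return [r for r in relationships if r.target is node]


tom = Node({"Person"}, {"name": "Tom Hanks"})
forrest = Node({"Movie"}, {"title": "Forrest Gump"})

# The roles value is a placeholder list with a single item.
acted_in = Relationship(tom, "ACTED_IN", forrest, {"roles": ["Forrest"]})
relationships = [acted_in]

tom_outgoing = outgoing(tom, relationships)
forrest_incoming = incoming(forrest, relationships)
```

The same relationship shows up as outgoing for one node and incoming for the other, which is why a second relationship in the opposite direction is usually unnecessary.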
Relationships always have a direction. However, you only have to pay attention to the direction where it is useful. This means that there is no need to add duplicate relationships in the opposite direction unless it is needed in order to properly describe your use case.
Note that a node can have relationships to itself.
If we want to express that Tom Hanks KNOWS himself, that would be expressed as:
Properties are name-value pairs that are used to add qualities to nodes and relationships.
In our example graphs, we have used properties on the Person and Movie nodes, and the property roles on the ACTED_IN relationship.
The value part of the property can hold different data types, such as number, string, and boolean.
For a thorough description of the available data types, refer to the Cypher manual.
A traversal is how you query a graph in order to find answers to questions, for example: "What music do my friends like that I don’t yet own?", or "What web services are affected if this power supply goes down?".
Traversing a graph means visiting nodes by following relationships according to some rules. In most cases only a subset of the graph is visited.
If we want to find out which movies Tom Hanks acted in according to our tiny example database, the traversal would start from the Tom Hanks node, follow any :ACTED_IN relationships connected to the node, and end up with Forrest Gump as the result (see the dashed lines):
The traversal result could be returned as a path of length one:
The path above has length one.
The shortest possible path has length zero. It contains a single node and no relationships. For example:
This path has length one:
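The traversal and the path lengths described in this section can be sketched in Python as follows. This is a minimal in-memory illustration, not a Neo4j API: nodes are identified by plain strings, relationships are (source, type, target) triples, the traverse and path_length helpers are hypothetical, and the DIRECTED entry is sample data added so that the traversal has something to skip.

```python
# Relationships as (source, type, target) triples; node keys are plain strings.
relationships = [
    ("Tom Hanks", "ACTED_IN", "Forrest Gump"),
    ("Robert Zemeckis", "DIRECTED", "Forrest Gump"),  # assumed sample data
]


def traverse(start, rel_type, relationships):
    """Follow outgoing relationships of one type from a start node.

    Returns one path per match; a path alternates nodes and relationship types.
    """
    return [
        [source, rtype, target]
        for source, rtype, target in relationships
        if source == start and rtype == rel_type
    ]


def path_length(path):
    """The length of a path is the number of relationships in it."""
    return len(path) // 2


paths = traverse("Tom Hanks", "ACTED_IN", relationships)
movies = [path[-1] for path in paths]

single_node_path = ["Tom Hanks"]  # a single node and no relationships
```

Only the relationships that match the rule (here, type ACTED_IN starting at Tom Hanks) are followed, so the DIRECTED relationship is never visited.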
A schema in Neo4j refers to indexes and constraints.
Neo4j is often described as schema optional, meaning that it is not necessary to create indexes and constraints. You can create data — nodes, relationships and properties — without defining a schema up front. Indexes and constraints can be introduced when desired, in order to gain performance or modeling benefits.
Indexes are used to increase performance.
For working with indexes in Cypher, see the Cypher manual → Indexes.
Constraints are used to make sure that the data adheres to the rules of the domain.
For example: "The value of the property name must be unique among nodes that have the label `Person`".
For working with constraints in Cypher, see the Cypher manual → Constraints.
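To make the roles of indexes and constraints concrete, here is a small Python sketch of a store that keeps a lookup table by name and rejects duplicate names. It only mimics the idea: in Neo4j, indexes and constraints are declared through Cypher and enforced by the database, and the PersonStore class with its add and find methods is a hypothetical stand-in.

```python
class PersonStore:
    """Toy store for Person nodes with a name index and a uniqueness rule."""

    def __init__(self):
        self.nodes = []
        self.by_name = {}  # index: name -> node, avoids scanning all nodes

    def add(self, properties):
        name = properties["name"]
        # Constraint: name must be unique among Person nodes.
        if name in self.by_name:
            raise ValueError(f"a Person named {name!r} already exists")
        node = {"labels": {"Person"}, "properties": properties}
        self.nodes.append(node)
        self.by_name[name] = node
        return node

    def find(self, name):
        """Look up a Person by name through the index."""
        return self.by_name.get(name)


store = PersonStore()
store.add({"name": "Tom Hanks"})

try:
    store.add({"name": "Tom Hanks"})  # violates the uniqueness rule
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

The dictionary plays the part of the index (faster lookups), while the check in add plays the part of the constraint (data that breaks the domain rule is refused).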
Node labels, relationship types and properties are case sensitive, meaning for example that the property name means something different than the property Name.
It is recommended to follow the naming conventions described in the following table:
|Graph entity||Recommended style||Example|
|Node label||Camel case, beginning with an upper-case character||:VehicleOwner|
|Relationship type||Upper case, using underscore to separate words||:OWNS_VEHICLE|
|Property||Lower camel case, beginning with a lower-case character||firstName|
For the precise naming rules, refer to the Cypher manual → Naming rules and recommendations.
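The recommended styles can be checked mechanically. The Python sketch below uses regular expressions that approximate the three conventions; the patterns and the follows_convention helper are assumptions made for illustration and do not encode the precise Cypher naming rules.

```python
import re

# Approximations of the recommended styles, not Cypher's full naming rules.
RECOMMENDED = {
    "label": re.compile(r"[A-Z][A-Za-z0-9]*"),
    "relationship_type": re.compile(r"[A-Z][A-Z0-9]*(_[A-Z0-9]+)*"),
    "property": re.compile(r"[a-z][A-Za-z0-9]*"),
}


def follows_convention(kind, name):
    """Return True if the name matches the recommended style for its kind."""
    return RECOMMENDED[kind].fullmatch(name) is not None


checks = {
    "Person": follows_convention("label", "Person"),
    "ACTED_IN": follows_convention("relationship_type", "ACTED_IN"),
    "roles": follows_convention("property", "roles"),
    "acted_in": follows_convention("relationship_type", "acted_in"),
}
```

A check like this could run as part of a review step for a data model, flagging names such as acted_in that work but depart from the recommended style.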