Graph database concepts
This chapter presents an introduction to graph database concepts.
This chapter includes the following sections:
1. Example graph
We will use the example graph below to introduce the basic concepts of the property graph:
2. Nodes
Nodes are often used to represent entities. The simplest possible graph is a single node.
Consider the graph below, consisting of a single node.
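As a rough illustration, a single-node graph could be modeled in plain Python as follows. The `Node` class and its field names are assumptions made for this sketch, not part of Neo4j.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with a set of labels and a dictionary of properties (hypothetical model)."""
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


# The simplest possible graph: one node and no relationships.
nodes = [Node(labels={"Person"}, properties={"name": "Tom Hanks"})]
relationships = []
```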
3. Labels
Labels are used to shape the domain by grouping nodes into sets, where all nodes that have a certain label belong to the same set.
For example, all nodes representing users could be labeled with the label :User.
With that in place, you can ask Neo4j to perform operations only on your user nodes, such as finding all users with a given name.
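The idea of restricting an operation to nodes with a given label can be sketched in plain Python. This is only a conceptual model: the node layout, the `find_by_label` helper, and the sample names are assumptions for illustration and do not reflect how Neo4j stores or queries data.

```python
# Each node is a dict with a set of labels and a dict of properties.
nodes = [
    {"labels": {"User"}, "properties": {"name": "Alice"}},
    {"labels": {"User"}, "properties": {"name": "Bob"}},
    {"labels": {"Product"}, "properties": {"name": "Alice"}},
]


def find_by_label(nodes, label, **props):
    """Return the nodes that carry `label` and match every given property value."""
    return [
        n for n in nodes
        if label in n["labels"]
        and all(n["properties"].get(k) == v for k, v in props.items())
    ]


# Only nodes labeled User are considered, so the Product named Alice is skipped.
users_named_alice = find_by_label(nodes, "User", name="Alice")
```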
Since labels can be added and removed during runtime, they can also be used to mark temporary states for nodes.
A :Suspended label could be used to denote bank accounts that are suspended, and a :Seasonal label can denote vegetables that are currently in season.
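Marking a temporary state amounts to adding a label and removing it again later. A minimal sketch, assuming a node is modeled as a dict with a set of labels (the `Account` label and `owner` property are hypothetical):

```python
account = {"labels": {"Account"}, "properties": {"owner": "Alice"}}

# Mark the account as suspended at runtime.
account["labels"].add("Suspended")
is_suspended = "Suspended" in account["labels"]

# Lift the suspension again; the node itself is otherwise unchanged.
account["labels"].discard("Suspended")
```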
A node can have zero to many labels.
In the example above, the nodes have the labels Person and Movie, which is one possible way of describing the data.
But assume that we want to express different dimensions of the data.
One way of doing that is to add more labels.
Below is an example showing the use of multiple labels:
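A minimal sketch of a node carrying more than one label, again using plain dicts. The `Actor` label is an assumption chosen for illustration; any additional dimension of the data could be expressed the same way.

```python
from collections import defaultdict

nodes = [
    # "Actor" is a hypothetical second label added on top of "Person".
    {"labels": {"Person", "Actor"}, "properties": {"name": "Tom Hanks"}},
    {"labels": {"Movie"}, "properties": {"title": "Forrest Gump"}},
]

# Group nodes by label; a node with two labels belongs to two sets.
by_label = defaultdict(list)
for node in nodes:
    for label in node["labels"]:
        by_label[label].append(node)

people = by_label["Person"]
actors = by_label["Actor"]
```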
4. Relationships
A relationship connects two nodes. Relationships organize nodes into structures, allowing a graph to resemble a list, a tree, a map, or a compound entity — any of which may be combined into yet more complex, richly inter-connected structures.
Our example graph will make a lot more sense once we add relationships to it:
5. Relationship types
A relationship must have exactly one relationship type.
Our example uses ACTED_IN and DIRECTED as relationship types.
The roles property on the ACTED_IN relationship has an array value with a single item in it.
Below is an ACTED_IN relationship, with the Tom Hanks node as the source node and Forrest Gump as the target node.
We observe that the Tom Hanks node has an outgoing relationship, while the Forrest Gump node has an incoming relationship.
Relationships always have a direction. However, you only have to pay attention to the direction where it is useful. This means that there is no need to add duplicate relationships in the opposite direction unless it is needed in order to properly describe your use case.
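The notions of source, target, and direction can be sketched as follows. The `Relationship` class and the three helper functions are assumptions for this illustration; nodes are identified by name only to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Relationship:
    source: str    # node the relationship leaves
    rel_type: str  # exactly one type per relationship
    target: str    # node the relationship enters


rels = [Relationship("Tom Hanks", "ACTED_IN", "Forrest Gump")]


def outgoing(node, rels):
    """Relationships for which `node` is the source."""
    return [r for r in rels if r.source == node]


def incoming(node, rels):
    """Relationships for which `node` is the target."""
    return [r for r in rels if r.target == node]


def connected(node, rels):
    """Nodes at the other end of any relationship, ignoring direction."""
    others = []
    for r in rels:
        if r.source == node:
            others.append(r.target)
        elif r.target == node:
            others.append(r.source)
    return others
```

Because `connected` looks at both ends, a single stored ACTED_IN relationship is enough to get from Forrest Gump back to Tom Hanks; no duplicate relationship in the opposite direction is needed.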
Note that a node can have relationships to itself.
If we want to express that Tom Hanks KNOWS himself, that would be expressed as:
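In a simple model where a relationship is a (source, type, target) triple, a relationship from a node to itself is one whose source and target are the same node. The triple layout and the helper name are assumptions for this sketch.

```python
# (source, relationship type, target)
rels = [("Tom Hanks", "KNOWS", "Tom Hanks")]


def is_self_relationship(rel):
    """True when the relationship starts and ends at the same node."""
    source, _rel_type, target = rel
    return source == target


self_loops = [rel for rel in rels if is_self_relationship(rel)]
```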
6. Properties
Properties are name-value pairs that are used to add qualities to nodes and relationships.
In our example graphs, we have used the properties name and born on Person nodes, title and released on Movie nodes, and the property roles on the :ACTED_IN relationship.
The value part of the property can hold different data types such as number, string and boolean.
For a thorough description of the available data types, refer to the Cypher manual.
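As a rough sketch, properties can be thought of as a dictionary from names to values. The check below covers only the types named in this section (number, string, boolean) plus lists of them, since roles is described as an array; the authoritative list of types is in the Cypher manual. The helper name and the sample values are assumptions for illustration.

```python
ALLOWED_SCALARS = (bool, int, float, str)


def is_valid_property_value(value):
    """Accept a number, string, boolean, or a list made up of those."""
    if isinstance(value, ALLOWED_SCALARS):
        return True
    if isinstance(value, list):
        return all(isinstance(item, ALLOWED_SCALARS) for item in value)
    return False


# Illustrative property maps for a Movie node and an ACTED_IN relationship.
movie_properties = {"title": "Forrest Gump", "released": 1994}
acted_in_properties = {"roles": ["Forrest"]}

all_valid = all(
    is_valid_property_value(v)
    for props in (movie_properties, acted_in_properties)
    for v in props.values()
)
```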
7. Traversals and paths
A traversal is how you query a graph in order to find answers to questions, for example: "What music do my friends like that I don’t yet own?", or "What web services are affected if this power supply goes down?".
Traversing a graph means visiting nodes by following relationships according to some rules. In most cases only a subset of the graph is visited.
If we want to find out which movies Tom Hanks acted in according to our tiny example database, the traversal would start from the Tom Hanks node, follow any :ACTED_IN relationships connected to the node, and end up with Forrest Gump as the result (see the dashed lines):
The traversal result could be returned as a path of length one:
The path above has length one.
The shortest possible path has length zero. It contains a single node and no relationships. For example:
This path has length one:
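The traversal and the path lengths discussed in this section can be sketched in a few lines of Python. A path is modeled here as a list that alternates node, relationship type, node, so its length is the number of relationships it contains. The triple layout, the function names, and the Robert Zemeckis node are assumptions for this illustration.

```python
# (source, relationship type, target) triples.
# "Robert Zemeckis" is not named in the text; it only gives DIRECTED an endpoint.
rels = [
    ("Tom Hanks", "ACTED_IN", "Forrest Gump"),
    ("Robert Zemeckis", "DIRECTED", "Forrest Gump"),
    ("Tom Hanks", "KNOWS", "Tom Hanks"),
]


def traverse(start, rel_type, rels):
    """Follow outgoing `rel_type` relationships from `start`; one path per match."""
    return [[s, t, e] for s, t, e in rels if s == start and t == rel_type]


def path_length(path):
    """Count the relationships in an alternating node/relationship list."""
    return len(path) // 2


acted_in_paths = traverse("Tom Hanks", "ACTED_IN", rels)
movies = [path[-1] for path in acted_in_paths]

single_node_path = ["Tom Hanks"]                          # no relationships
self_loop_path = traverse("Tom Hanks", "KNOWS", rels)[0]  # one relationship
```

Only the relationships leaving the start node are followed, which mirrors the point that a traversal usually visits a subset of the graph.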
8. Schema
A schema in Neo4j refers to indexes and constraints.
Neo4j is often described as schema optional, meaning that it is not necessary to create indexes and constraints. You can create data — nodes, relationships and properties — without defining a schema up front. Indexes and constraints can be introduced when desired, in order to gain performance or modeling benefits.
8.1. Indexes
Indexes are used to increase performance. To see examples of how to work with indexes, see Using indexes. For detailed descriptions of how to work with indexes in Cypher, see Cypher Manual → Indexes.
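Conceptually, an index trades some extra bookkeeping for faster lookups. The sketch below is a plain-Python analogy, not a description of how Neo4j implements indexes: it builds a dictionary from a property value to the nodes holding it, so a lookup no longer scans every node. All names and sample data are assumptions for illustration.

```python
from collections import defaultdict

people = [
    {"name": "Tom Hanks"},
    {"name": "Robert Zemeckis"},
]


def build_index(nodes, key):
    """Map each value of property `key` to the list of nodes that have it."""
    index = defaultdict(list)
    for node in nodes:
        if key in node:
            index[node[key]].append(node)
    return index


name_index = build_index(people, "name")

# A dictionary lookup instead of a scan over all nodes.
match = name_index.get("Tom Hanks", [])
```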
8.2. Constraints
Constraints are used to make sure that the data adheres to the rules of the domain. To see examples of how to work with constraints, see Using constraints. For detailed descriptions of how to work with constraints in Cypher, see the Cypher manual → Constraints.
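One common kind of domain rule is that a property value must be unique. The following is a conceptual sketch of such a rule in plain Python; the `UniqueConstraint` class is hypothetical and is not Neo4j's API. In this sketch, nodes that lack the property are simply not checked.

```python
class UniqueConstraint:
    """Reject a second node with the same value for a given property (hypothetical)."""

    def __init__(self, key):
        self.key = key
        self._seen = set()

    def check_and_add(self, properties):
        value = properties.get(self.key)
        if value is None:
            return  # assumption: nodes without the property are not constrained
        if value in self._seen:
            raise ValueError(f"duplicate value for {self.key!r}: {value!r}")
        self._seen.add(value)


constraint = UniqueConstraint("title")
constraint.check_and_add({"title": "Forrest Gump"})

try:
    constraint.check_and_add({"title": "Forrest Gump"})
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```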
9. Naming rules and recommendations
Node labels, relationship types and properties are case sensitive, meaning for example that the property name means something different than the property Name.
It is recommended to follow the naming conventions described in the following table:
Graph entity | Recommended style | Example |
---|---|---|
Node label | Camel case, beginning with an upper-case character | |
Relationship type | Upper case, using underscore to separate words | |
Property | Lower camel case, beginning with a lower-case character | |
For the precise naming rules, refer to the Cypher manual → Naming rules and recommendations.
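The recommended styles can be approximated with regular expressions, for example in a lint step. This is a loose sketch of the recommendations only, not of the precise naming rules in the Cypher manual; the pattern table and function name are assumptions for illustration.

```python
import re

# Approximate patterns for the recommended styles (not the precise Cypher rules).
STYLE_PATTERNS = {
    "label": re.compile(r"[A-Z][A-Za-z0-9]*"),                      # camel case, upper-case start
    "relationship_type": re.compile(r"[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*"),  # upper case with underscores
    "property": re.compile(r"[a-z][A-Za-z0-9]*"),                   # camel case, lower-case start
}


def follows_convention(kind, name):
    """Return True if `name` matches the recommended style for `kind`."""
    return STYLE_PATTERNS[kind].fullmatch(name) is not None
```

With these patterns, `Person`, `ACTED_IN`, and `name` from the example graph pass for their respective kinds, while a property called `Name` does not.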