Goals This guide explains the basic concepts of Cypher, Neo4j’s graph query language. You should be able to read and understand Cypher queries after finishing this guide. Prerequisites You should be familiar with graph database concepts and the property graph… Read more →

Goals
This guide explains the basic concepts of Cypher, Neo4j’s graph query language. You should be able to read and understand Cypher queries after finishing this guide.
Prerequisites
You should be familiar with graph database concepts and the property graph model.
Beginner


Why Cypher?

We already know that Neo4j’s property graph model is composed of nodes and relationships, which may also have properties associated with them. However, nodes and relationships are the simple components that build the most valuable and powerful piece of the property graph model – the pattern. Patterns are comprised of node and relationship elements and can express simple or complex traversals and paths.

Pattern recognition is fundamental to the way that the brain works. Because of this, humans are very good at working with patterns (think of visual diagrams or even memory-matching games). Cypher is also heavily based on patterns and is designed to recognize various versions of these patterns in data, making it a simple and logical language for users to learn.

Cypher Syntax

The video below walks through some background on Cypher, basic syntax, and some intermediate examples. The concepts in the video are discussed in the paragraphs below, as well as in upcoming guides.

Since Cypher is designed to be human-readable, it’s construct is based on English prose and iconography to make syntax visual and easily understood. For example, take a look at the simple graph data in the image below. How would you represent this data in English? NOTE: the answer is in the paragraph below

Jennifer likes Graphs. Jennifer is friends with Michael. Jennifer works for Neo4j.

Cypher syntax will build upon this English-language structure we just created. In the next section, we will see exactly how to write this example in Cypher.

Cypher Comments

As you work through this guide, you will see comments in the Cypher code to help explain the syntax or what a query is doing. Comments in Cypher are the same as in many programming languages. You can add comments by starting a line with // and putting text after the slashes. Just like in other languages, starting the line with two forward slashes will mean that anything on that line will become a comment.

This is especially helpful to use in Neo4j Browser when saving queries. If you add a comment before the query, the comment automatically becomes the name of the saved query!

Representing Nodes in Cypher

Since Cypher uses ASCII-Art for patterns, we need a visual way to represent each component of our pattern above. We know that the main components of the property graph model are nodes and relationships. Remember that nodes are the data entities in your graph and that you can often identify nodes by finding the nouns or objects in your data model. In our example, Jennifer, Michael, Graphs, and Neo4j are our nodes.

To depict nodes in Cypher, we surround the node with parentheses, e.g. (node). Notice how the parentheses look similar to the circles that the visual representation uses for nodes in our data model.

Node Variables

If we later want to refer to the node, we can give it a variable like (p) for person or (t) for thing. In real-world queries, we might use longer, more expressive variable names like (person) or (thing). Just like in programming language variables, you can name your variables what you want and reference them by that same name later in a query.

If the node is not relevant to your return results, you can specify an anonymous node using empty parentheses (). This means that you will not be able to return this node later in the query.

Node Labels

If you remember from the property graph data model, you can also group similar nodes together by assigning a node label. Labels are kind of like tags and allow you to specify certain types of entities to look for or create. In our example, Person, Technology, and Company are the labels.

You can kind of think of this like telling SQL which table to look for the particular row. Just like to tell SQL to query a person’s information from a Person or Employee or Customer table, you can also tell Cypher to only check those labels for that information. This helps Cypher distinguish between entities and optimize execution for your queries. It is always better to use node labels in your queries, where possible.

If you do not specify a label for Cypher to filter out non-matching node categories, the query will check all of the nodes in the database! As you can imagine, this would be cumbersome if you had a very large graph.

Example: Nodes in Cypher

Using our graph example above, let’s see how we could specify our nodes.

()                  //anonymous node (no label or variable) can refer to any node in the database
(p:Person)          //using variable p and label Person
(:Technology)       //no variable, label Technology
(work:Company)      //using variable work and label Company

Representing Relationships in Cypher

To fully utilize the power of a graph database, we also need to express the relationships between our nodes. Relationships are represented in Cypher using an arrow --> or <-- between two nodes. Notice how the syntax looks like the arrows and lines connecting our nodes in the visual representation. Additional information, such as how nodes are connected (relationship type) and any properties pertaining to the relationship, can be placed in square brackets inside of the arrow.

In our example, the lines with LIKES, IS_FRIENDS_WITH, and WORKS_FOR between nodes are our relationships.

Undirected relationships are represented with no arrow and just two dashes --. This means that the relationship can be traversed in either direction. While a direction must be inserted to the database, it can be matched with an undirected relationship where Cypher ignores any particular direction and retrieves the relationship and connected nodes, no matter what the physical direction is. This allows the queries to be flexible and not force the user to know the physical direction of the relationship stored in the database.

If data is stored with one relationship direction, and a query specifies the wrong direction, Cypher will not return any results. In these cases where you may not be sure of direction, it is better to use an undirected relationship and retrieve some results.

//data stored with this direction
CREATE (p:Person)-[:LIKES]->(t:Technology)

//query relationship backwards will not return results
MATCH (p:Person)<-[:LIKES]-(t:Technology)

//better to query with undirected relationship unless sure of direction
MATCH (p:Person)-[:LIKES]-(t:Technology)

Relationship Types

Relationship types categorize and add meaning to a relationship, similar to how labels group nodes. In our property graph data model, relationships show how nodes are connected and related to each other. You can usually identify relationships in your data model by looking for actions or verbs.

You can specify any type of relationship you want between nodes, but we recommend good naming conventions using verbs and actions. Poor relationship type names make it more difficult to both read and write Cypher (remember, it should sound like English!).

For example, let us look at the relationship types from our example graph.

  • [:LIKES] – makes sense when we put nodes on either side of the relationship (Jennifer LIKES Graphs)
  • [:IS_FRIENDS_WITH] – makes sense when we put nodes with it (Jennifer IS_FRIENDS_WITH Michael)
  • [:WORKS_FOR] – makes sense with nodes (Jennifer WORKS_FOR Neo4j)

Relationship Variables

Just as we did with nodes, if we want to refer to a relationship later in a query, we can give it a variable like [r] or [rel]. We can also use longer, more expressive variable names like [likes] or [knows]. If you do not need to reference the relationship later, you can specify an anonymous relationship using two dashes --, -->, <--.

As an example, you could use either -[rel]-> or -[rel:LIKES]-> and call the rel variable later in your query to reference the relationship and its details.

If you forget the colon in front of a relationship type like this -[LIKES]->, it represents a variable (not a relationship type). Since no relationship type declared, Cypher will search all types of relationships.

Node or Relationship Properties

We have talked about how to write Cypher for nodes, relationships, and labels. The last piece of our property graph data model is for properties. Remember that properties are name-value pairs that provide additional details to our nodes and relationships.

To represent these in Cypher, we can use curly braces within the parentheses of a node or the brackets of a relationship. The name and value of the property then go inside the curly braces. Our example graph has both a node property (name) and a relationship property (since).

  • Node property: (p:Person {name: 'Jennifer'})
  • Relationship property: -[rel:IS_FRIENDS_WITH {since: 2018}]->

Properties can have values with a variety of data types. To see the full list that Cypher offers, see the Neo4j Developer Manual section on values and types.

Patterns in Cypher

Nodes and relationships make up the building blocks for graph patterns. These building blocks can come together to express simple or complex patterns. Patterns are the most powerful capability of graphs. In Cypher, they can be written as a continuous path or separated into smaller patterns and tied together with commas.

To show a pattern in Cypher, we need to combine the node and relationship syntaxes we have learned so far. Let us use our example of Jennifer likes Graphs.

In Cypher, this pattern would look like the code below.

(p:Person {name: "Jennifer"})-[rel:LIKES]->(g:Technology {type: "Graphs"})

This bit of Cypher tells the pattern we want, but it does not tell whether we want to find that existing pattern or insert it as a new pattern. To tell Cypher what we want it to do with the pattern, we need to add some keywords.

Cypher Keywords

Just like with most programming languages, there are a few words in Cypher reserved for specific actions in parts of a query. We need to be able to create, read, update, or delete data in Neo4j, and keywords help us accomplish that functionality. Let us look more in detail at two common keywords (more will be covered in upcoming guides).

MATCH

The MATCH keyword in Cypher is what searches for an existing node, relationship, label, property, or pattern in the database. If you are familiar with SQL, MATCH works pretty much like SELECT in SQL.

You can find all node labels in the database, search for a particular node, find all the nodes with a particular relationship, look for patterns of nodes and relationships, and much more using MATCH.

RETURN

The RETURN keyword in Cypher specifies what values or results you might want to return from a Cypher query. You can tell Cypher to return nodes, relationships, node and relationship properties, or patterns in your query results. RETURN is not required when doing write procedures, but is needed for reads.

The node and relationship variables we discussed earlier become important when using RETURN. In order to bring back nodes, relationships, properties, or patterns, you need to have variables specified in your MATCH clause for the data you want to return.

Cypher Examples

Let us look at some examples of the syntax we have learned so far using MATCH and RETURN keywords. Each example will start with an explanation of what we are trying to achieve and have an image below of the results of the query run in Neo4j Browser.

  • Example 1: Find the labeled Person nodes in the graph. Note that we must use a variable like p for the Person node if we want retrieve the node in the RETURN clause.
MATCH (p:Person)
RETURN p
  • Example 2: Find Person nodes in the graph that have a name of ‘Jennifer’. Remember that we can name our variable anything we want, as long as we reference that same name later.
MATCH (jenn:Person {name: 'Jennifer'})
RETURN jenn
  • Example 3: Find which Company Jennifer works for.

Explanation: we know we need to find Jennifer’s Person node, and we need to find the Company node she is connected to. To do that, we need to follow the WORKS_FOR relationship from Jennifer’s Person node to the Company node. We have also specified a label of Company so that the query will only look at nodes with that label. Since we only care about returning the company in this query, we need to give that node a variable (company) but do not need to give variables for the Person node or WORKS_FOR relationship.

MATCH (:Person {name: 'Jennifer'})-[:WORKS_FOR]->(company:Company)
RETURN company
  • Example 4: Find which Company Jennifer works for, but this time, return only the name of the company.

Explanation: this query is very similar to Example 3. Example 3 returned the entire Company node with all its properties. For this example, we still need to find Jennifer’s company, but now we only care about its name. We will need to access the node’s name property using the syntax variable.property to return the name value.

MATCH (:Person {name: 'Jennifer'})-[:WORKS_FOR]->(company:Company)
RETURN company.name

Aliasing Return Values

Not all properties are simple like our company.name example above. Some properties have poor names due to property length, multi-word descriptions, developer jargon, and other shortcuts. These naming conventions can be difficult to read, especially if they end up on reports and other user-facing interfaces.

Just like with SQL, you can rename return results by using the AS keyword and aliasing the property with a cleaner name. We can look at a mocked-up example to list a customer’s orders and the number of items in the order.

//poorly-named property
MATCH (kristen:Customer {name:'Kristen'})-[rel:PURCHASED]-(order:Order)
RETURN order.orderId, order.orderDate, kristen.customerIdNo, order.orderTotalNoOfItems

//cleaner printed results with aliasing
MATCH (kristen:Customer {name:'Kristen'})-[rel:PURCHASED]-(order:Order)
RETURN order.orderId AS OrderID, order.orderDate AS `Purchase Date`,
       kristen.customerIdNo AS CustomerID, order.orderNumOfLineItems AS `Number Of Items`
Results Without Aliases:

cypher without aliases

Results With Aliases:

cypher with aliases

You can specify return aliases that have spaces by using the backtick character before and after the alias (order.orderDate AS Purchase Date). If you do not have an alias that contains spaces, then you do not need to use backticks.

Next Steps

Now that you know how to write nodes, relationships, properties, and patterns in Cypher for reading existing data, you can begin exploring data that exists in a Neo4j database. We will look at more MATCH capabilities in an upcoming guide, as well as how to write Cypher for create, update, and delete operations with your data.