Cypher: The Neo4j Query Language Decoded for Beginners

Ashok Vishwakarma

Founder and CTO, Impulsive Web

Everyone hates SQL JOINs, and everyone has their own reasons why. After almost two decades in tech, I still wonder why the way we look at and use our data has changed completely while the way we store it has stayed the same. That mismatch leaves me and other engineers doing a lot of extra work, because the database isn't doing what we need it to do.

When I was introduced to graph databases, I felt they matched how we actually read and use data, especially where the relationships within the data matter most.

If you’re planning to get started and a little afraid that you have to start over, as you did with SQL, here’s a guide and toolkit to help ease the process. This guide is your friendly introduction to Cypher, Neo4j’s powerful and expressive query language. Forget cryptic commands — learning Cypher is like sketching relationships and patterns, making graph database queries feel natural and even fun.

We’ll navigate this Cypher tutorial with the indispensable Neo4j Cypher Cheat Sheet (v5) as our guide, helping you understand Cypher basics and beyond.

Getting Started With Cypher

Before we dive into writing complex Cypher queries, let’s get comfortable with the fundamental elements — the very essence of how Neo4j represents data. Think of these as the essential vocabulary for your graph database conversations or, as we call them, queries.

Node ()

Nodes are the stars of your Neo4j graph: they represent the entities or objects in your graph database. They're the nouns: people, products, companies, anything you want to model. To identify nodes in a Cypher query, look for parentheses:

  • (): An anonymous node, for when the specifics aren't needed in your Cypher query.
  • (n): A node assigned to the variable n. This allows you to reference this specific node later in your Cypher query, making your graph queries more powerful.
  • (:Person): A node with the Person label. Labels are crucial in Neo4j for categorizing nodes, similar to tables in relational databases, but more flexible: a node can have multiple labels, as in (:User:Admin). Combined with a variable, (movie:Movie) represents a Movie node you can refer to as movie.
  • (p:Product {name: "Graph Visualizer", version: "2.0"}): A node labeled Product with properties. Properties in Cypher are key-value pairs storing data on the node, giving your entities rich attributes, for example (artist:Artist {name: "The Graphers", genre: "Data Rock"}).

Relationships []

Relationships are first-class citizens in Neo4j. They define how nodes are connected and are the verbs in your graph story. They always have a type and direction. They help connect the dots in the Neo4j data. To identify a relationship in a Cypher query, look for square brackets.

  • -[r]->: A relationship, assigned to the variable r, pointing from the left node to the right node. Naming relationships is useful for querying their properties.
  • -[:ACTED_IN]->: A relationship specifically of type ACTED_IN. The relationship type is vital for understanding the context of the connection in your graph database. For example, (user)-[:PURCHASED]->(product:Item) shows that a user purchased an item.
  • -[rel:FRIENDS_WITH {since: 2018, source: "Conference"}]->: A FRIENDS_WITH relationship with its own properties. In Neo4j, relationships can store data, too, adding depth to your connections, for example (city1)-[rt:ROUTE {distance: "250km"}]->(city2).
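
As a small sketch of why naming a relationship helps, the query below binds a FRIENDS_WITH relationship to rel and then filters on and returns its since property. The Person label and the name property are assumptions made for illustration.

// Find friendships that started before 2020 and return when they began
MATCH (a:Person)-[rel:FRIENDS_WITH]->(b:Person)
WHERE rel.since < 2020
RETURN a.name, b.name, rel.since;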

Direction -, ->, <-

Directionality is key to understanding the flow or nature of a relationship in Cypher:

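  • -> indicates a left-to-right relationship, e.g., (A)-[:LOVES]->(B).
  • <- indicates a right-to-left relationship, e.g., (A)<-[:LOVES]-(B).
  • - (no arrowhead) is used when the direction is not important for a specific part of your Cypher query or when the relationship is conceptually bidirectional, for example (a:Person)-[:FRIENDS_WITH]-(b:Person). Neo4j always stores a relationship with a direction, so the arrowless form is mainly for matching in either direction. By contrast, (director:Person)-[:DIRECTED]->(movie:Movie) is directed: a Person node (the director) DIRECTED a Movie node.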
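
To illustrate the arrowless form, here is a minimal sketch that matches a FRIENDS_WITH relationship regardless of which way it was stored. The Person label and the name value are hypothetical.

// Matches whether the relationship is stored as Alice -> friend or friend -> Alice
MATCH (alice:Person {name: "Alice"})-[:FRIENDS_WITH]-(friend:Person)
RETURN friend.name;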

Properties {}

Properties are key-value pairs that provide attributes for your Neo4j nodes and relationships. To identify them in a Cypher query, look for curly braces:

// A node with properties
(c:Customer {id: "CUST123", loyaltyLevel: "Gold"})

// A relationship with properties
(employee)-[m:MANAGES {department: "Sales", year_appointed: 2021}]->(team)

Putting it together, if we want to represent a “Java” developer working on a specific “GraphDB Interface” project, the Cypher pattern looks like this:

(dev:Developer {language: "Java"})-[:WORKS_ON]->(project:Project {name: "GraphDB Interface"})

This visual and descriptive nature is a core strength of Cypher for Neo4j.

Cypher Keywords

Now that you understand the basic syntax for Neo4j nodes and relationships, let’s explore the main Cypher keywords that allow you to interact with your graph database.

MATCH

MATCH is your primary tool for specifying patterns of nodes and relationships you want to find in your Neo4j graph database.

// Find all nodes labeled "Movie" in your graph database
MATCH (m:Movie)
RETURN m.title;

// Find all people who reviewed a specific movie
MATCH (p:Person)-[:REVIEWED]->(m:Movie {title: "The Graphfather"})
RETURN p.name, m.title;

Using MATCH effectively is key to writing powerful Cypher queries.

WHERE

The WHERE clause is used to add constraints to your MATCH patterns, filtering the results from your Neo4j graph database.

// Find actors who starred in movies released after 2000
MATCH (actor:Actor)-[:STARRED_IN]->(movie:Movie)
WHERE movie.releaseYear > 2000
RETURN actor.name, movie.title;

// Find users who are active AND located in 'USA'
MATCH (u:User)
WHERE u.status = 'active' AND u.country = 'USA'
RETURN u.email;

WHERE helps refine your graph queries to get precisely the data you need.

RETURN

The RETURN clause defines what data your Cypher query should output from the Neo4j graph database.

// Return the names and birth years of people
MATCH (p:Person)
RETURN p.name AS personName, p.born AS birthYear;

// Return the count of direct friends for a user
MATCH (u:User {name: "Alice"})-[:IS_FRIENDS_WITH]->(friend:User)
RETURN u.name, count(friend) AS numberOfFriends;

RETURN is crucial for presenting the results of your Neo4j exploration.

CREATE

CREATE is used to add new nodes and relationships to your Neo4j graph database.

// Create a new 'Software' node
CREATE (s:Software {name: "GraphQueryTool", version: "1.0", license: "MIT"});

// Create two nodes and a 'WROTE_ARTICLE_FOR' relationship between them
CREATE (author:TechWriter {name: "Dr. Graph"})
CREATE (topic:Subject {name: "Cypher Query Language"})
CREATE (author)-[:WROTE_ARTICLE_FOR {publishedDate: date()}]->(topic);
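
One thing worth noting: CREATE always creates everything in its pattern, so running the example above twice would produce duplicate nodes. A common way around this, sketched here with a hypothetical REVIEWED relationship type and reviewedDate property, is to MATCH the existing nodes first and then CREATE only the relationship between them.

// Connect two nodes that already exist instead of creating new copies
MATCH (author:TechWriter {name: "Dr. Graph"}), (topic:Subject {name: "Cypher Query Language"})
CREATE (author)-[:REVIEWED {reviewedDate: date()}]->(topic);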

MERGE

MERGE either finds a pattern in your Neo4j graph or creates it. If the whole pattern already exists, MERGE uses it. If not, MERGE creates the whole pattern. This is excellent for data import and avoiding duplicates.

// Ensure a 'City' node for 'London' exists, 
// set creation timestamp if new

MERGE (c:City {name: "London"})
ON CREATE SET c.created_at = timestamp()
RETURN c;

// If user 'jane' doesn't have a 'LOGGED_IN'
// relationship to today's date node, create it

MATCH (u:User {email: "jane@example.com"}), (d:Date {date: date()})
MERGE (u)-[r:LOGGED_IN]->(d)
ON CREATE SET r.login_time = time()
RETURN u.email, r, d.date;
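
MERGE also accepts an ON MATCH SET clause for the case where the pattern already exists. The sketch below extends the City example from above; the last_seen property is an assumption made for illustration.

// Stamp the node differently depending on whether it was created or found
MERGE (c:City {name: "London"})
ON CREATE SET c.created_at = timestamp()
ON MATCH SET c.last_seen = timestamp()
RETURN c;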

SET and REMOVE

SET is used to add or update properties and labels on Neo4j nodes and relationships. REMOVE deletes properties and labels.

// Update a user's status and add a new label
MATCH (u:User {id: "user007"})
SET u.status = "premium_member", u.last_updated = datetime()
SET u:Premium // Add 'Premium' label
RETURN u;

// Remove a property and a label from a product
MATCH (p:Product {sku: "XYZ123"})
REMOVE p.old_price, p:Discontinued
RETURN p;

DELETE and DETACH DELETE

DELETE removes nodes and relationships. Plain DELETE fails on a node that still has relationships, so you must delete those relationships first or in the same statement. DETACH DELETE removes a node and all its connected relationships in one go.

// Delete an 'EXPIRED_OFFER' relationship
MATCH (product)-[r:EXPIRED_OFFER]->(user)
DELETE r;

// Delete an inactive account and all its relationships
MATCH (acc:Account {status: "inactive", closed_date: date("2020-01-01")})
DETACH DELETE acc;
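
For comparison, this is roughly what a similar cleanup looks like with plain DELETE: the relationships have to be deleted in the same statement as the node. The id property and its value are hypothetical.

// Delete the node together with any relationships it has.
// OPTIONAL MATCH keeps the account in the results even if it has no relationships.
MATCH (acc:Account {id: "acc-42"})
OPTIONAL MATCH (acc)-[r]-()
DELETE r, acc;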

ORDER BY, SKIP, LIMIT

These clauses control the presentation of your Neo4j query results: ORDER BY sorts, SKIP skips a number of rows (useful for pagination), and LIMIT restricts the number of results.

// Get the 10 most recently active users, 
// ordered by last login

MATCH (u:User)
WHERE u.last_login IS NOT NULL
RETURN u.username, u.last_login
ORDER BY u.last_login DESC
LIMIT 10;
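
The example above uses ORDER BY and LIMIT but not SKIP. As a sketch, adding SKIP to the same query returns the second page of users (rows 11 to 20); the page size of 10 is simply carried over from that example.

// Second page: skip the first 10 rows, then return the next 10
MATCH (u:User)
WHERE u.last_login IS NOT NULL
RETURN u.username, u.last_login
ORDER BY u.last_login DESC
SKIP 10
LIMIT 10;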

WITH

WITH allows you to pass results from one part of your Neo4j query to another, enabling aggregations and manipulations mid-query.

// Find authors, count their books, 
// then filter for those with more than 5 books

MATCH (author:Author)-[:WROTE]->(book:Book)
WITH author, count(book) AS bookCount
WHERE bookCount > 5
RETURN author.name, bookCount
ORDER BY bookCount DESC;

WITH is essential for structuring advanced Cypher queries in your Neo4j graph database.

That's all you need to write a Cypher query. As you may have noticed, it resembles the SQL you already write, but its syntax maps more directly onto how connected data is modeled and used.

How to Read a Cypher Query

When you encounter a Cypher query, especially a longer one, it might seem daunting. But remember, Cypher is designed to be readable, mirroring the visual patterns in your Neo4j graph.

If you look at some specific keywords and patterns and break them down, it will start making sense to you. Here’s how I approach a Cypher query when I meet one.

Start With MATCH (or MERGE/CREATE)

This clause describes the graph pattern the query is targeting:

  • Look for () for nodes. Pay attention to variables (n), labels (:Person), and properties ({name: 'Alice'}).
  • Look for [] for relationships. Note variables (r), types (:ACTED_IN), and properties.
  • Arrows -, ->, <- show relationship direction or indicate the direction is being ignored for matching.

Follow the Clauses Sequentially

A Cypher query reads top to bottom, with each clause feeding its results to the next:

  • MATCH finds graph patterns.
  • WHERE filters these patterns.
  • WITH processes and passes intermediate results.
  • RETURN specifies the final output from your Neo4j graph database.
  • ORDER BY, SKIP, and LIMIT format this output.

Let’s understand further with an example:

MATCH (director:Person {name: "Director X"})-[:DIRECTED]->(movie:Movie)
WHERE movie.released > 2010
WITH director, movie
MATCH (actor:Person)-[:ACTED_IN]->(movie)
RETURN movie.title AS filmTitle, collect(actor.name) AS cast
ORDER BY filmTitle;

The query above finds movies released after 2010 that were directed by “Director X” and returns each movie's title along with the names of the actors who starred in it.

Let’s break down the query for each component and see how it all comes together:

// Find a Person node named "Director X" who DIRECTED a Movie node.
MATCH (director:Person {name: "Director X"})-[:DIRECTED]->(movie:Movie)

// Filter these to movies released after 2010.
WHERE movie.released > 2010

// Pass the found director and movie variables to the next query part.
WITH director, movie

// For each movie passed from the WITH clause,
// find Person nodes (actors) who ACTED_IN that specific movie.
MATCH (actor:Person)-[:ACTED_IN]->(movie)

// Output the movie's title and
// a collected list of actor names for its cast.
RETURN movie.title AS filmTitle, collect(actor.name) AS cast

// Sort the results by movie title
ORDER BY filmTitle

This structured, step-by-step approach makes even complex Cypher queries for your Neo4j graph database easy to read and understand. It also helps identify how the graph is structured and what the output might be.

Summary and Next Steps

And there you have it, a quick tour through the expressive and intuitive world of Cypher, the premier query language for Neo4j graph databases.

We’ve seen how Cypher’s visual syntax, built around nodes () and relationships [], allows you to elegantly describe complex patterns and unlock the rich insights hidden within your connected data.

From fundamental Cypher keywords like MATCH, WHERE, and RETURN to powerful data manipulation clauses like CREATE, MERGE, and SET, you’re now equipped with the core knowledge to start crafting your own Neo4j queries.

Learning Cypher isn’t just about mastering a new syntax. It’s about adopting a new way of thinking about data — one that mirrors how relationships exist in the real world. The journey from simple lookups to sophisticated graph traversals and analytics is an exciting one, and Cypher is your trusty companion every step of the way.

Don’t let this be the end of your exploration. The best way to truly solidify your understanding of Cypher is to practice. Fire up your Neo4j instance, experiment with the Cypher examples we’ve discussed, and start applying these concepts to your own datasets.

Keep the Neo4j Cypher Cheat Sheet bookmarked and don’t hesitate to explore the vibrant Neo4j community resources for further learning and support. You can also explore Neo4j’s GraphAcademy to learn more about graph databases and Cypher.

The power to navigate, understand, and leverage your graph data is at your fingertips. So go forth, learn Cypher, and build amazing things with Neo4j. Happy querying! 🙂


Cypher, the Neo4j Query Language Decoded for Beginners was originally published in Neo4j Developer Blog on Medium.