Comparing Cypher with SQL

While there are key differences between Cypher® and SQL, it is still possible to compare both languages and write equivalent SQL statements using Cypher. The Northwind dataset is used here to better illustrate the comparison.

For a more in-depth explanation on the differences and similarities between graph and relational databases, see Transition from relational to graph database.

Indexing

Indexes are available both in SQL and Cypher. They make searching for a specific node label and attribute combination more efficient.

Indexes in Cypher are only used for finding the starting points of a query; all subsequent pattern matching is done through the graph structure. Cypher supports range, text, point, lookup, full-text, and vector indexes.

In the Northwind dataset, adding indexes on productName and unitPrice makes searching for a product and its price quicker:

SQL Cypher
CREATE INDEX Product_productName ON products (product_name);
CREATE INDEX Product_unitPrice ON products (unit_price);
CREATE INDEX Product_productName IF NOT EXISTS FOR (p:Product) ON p.productName;
CREATE INDEX Product_unitPrice IF NOT EXISTS FOR (p:Product) ON p.unitPrice;

Query examples

Select and return records

SQL Cypher

To select and return records in SQL, select everything from the products table:

SELECT p.*
FROM products as p;

In Cypher, you MATCH a simple pattern: all nodes with the label :Product, and RETURN them:

MATCH (p:Product)
RETURN p;

Field access, ordering, and paging

Rather than returning all attributes, you can filter out the ones you are interested in — ProductName and UnitPrice, for example.

SQL Cypher

In SQL, this is how you order items by price and return the 10 most expensive items:

SELECT p.ProductName, p.UnitPrice
FROM products as p
ORDER BY p.UnitPrice DESC
LIMIT 10;

The statement is similar in Cypher, except for the pattern matching part:

MATCH (p:Product)
RETURN p.productName, p.unitPrice
ORDER BY p.unitPrice DESC
LIMIT 10;
Result
p.productName p.unitPrice

"Côte de Blaye"

263.5

"Thüringer Rostbratwurst"

123.79

"Mishi Kobe Niku"

97.0

"Sir Rodney’s Marmalade"

81.0

"Carnarvon Tigers"

62.5

"Raclette Courdavault"

55.0

"Manjimup Dried Apples"

53.0

"Tarte au sucre"

49.3

"Ipoh Coffee"

46.0

"Rössle Sauerkraut"

45.6

Remember that labels, relationship types, and property names are case sensitive in Neo4j. For more details on naming rules, see the Cypher Manual → Naming rules and recommendations.

Find a single product by name

There are different ways to query the database and retrieve a single item, for example, a product named Chocolade. You can do that, for example, by filtering by equality:

SQL Cypher

In SQL, you can filter data using the WHERE clause:

SELECT p.ProductName, p.UnitPrice
FROM products AS p
WHERE p.ProductName = 'Chocolade';

In Cypher, the WHERE clause belongs to the MATCH statement:

MATCH (p:Product)
WHERE p.productName = 'Chocolade'
RETURN p.productName, p.unitPrice;

A shorter option is to use the label productName to specify the product in the MATCH statement:

MATCH (p:Product {productName:'Chocolade'})
RETURN p.productName, p.unitPrice;
Result
p.productName p.unitPrice

"Chocolade"

12.75

Filter products

Filter by list/range

SQL Cypher

In SQL, you can use the operator IN:

SELECT p.ProductName, p.UnitPrice
FROM products as p
WHERE p.ProductName IN ('Chocolade','Chai');

Cypher has full collection support, including IN and other collection functions, predicates, and transformations:

MATCH (p:Product)
WHERE p.productName IN ['Chocolade','Chai']
RETURN p.productName, p.unitPrice;
Result
p.productName p.unitPrice

"Chocolade"

12.75

"Chai"

18.0

Filter by multiple numeric and textual predicates

SQL Cypher

This query retrieves products with a name starting with "C" and a price larger than 100:

SELECT p.ProductName, p.UnitPrice
FROM products AS p
WHERE p.ProductName LIKE 'C%' AND p.UnitPrice > 100;

In Cypher, the LIKE operator is replaced by the STARTS WITH, CONTAINS, and ENDS WITH operators:

MATCH (p:Product)
WHERE p.productName STARTS WITH 'C' AND p.unitPrice > 100
RETURN p.productName, p.unitPrice;

You can also use a regular expression to get all products with name starting with "C" and their prices:

MATCH (p:Product)
WHERE p.productName =~ '^C.*'
RETURN p.productName, p.unitPrice
Result
p.productName p.unitPrice

"Côte de Blaye"

263.5

Joining products with customers

SQL Cypher

In SQL, if you want to see who bought Chocolade, you can join the four tables together:

SELECT DISTINCT c.CompanyName
FROM customers AS c
JOIN orders AS o ON (c.CustomerID = o.CustomerID)
JOIN order_details AS od ON (o.OrderID = od.OrderID)
JOIN products AS p ON (od.ProductID = p.ProductID)
WHERE p.ProductName = 'Chocolade';

In Cypher, there is no need to JOIN tables. You can express connections as graph patterns instead:

MATCH (p:Product {productName:'Chocolade'})<-[:ORDERS]-(:Order)<-[:PURCHASED]-(c:Customer)
RETURN DISTINCT c.companyName;
Result
c.companyName

"Victuailles en stock"

"Ernst Handel"

"Antonio Moreno Taquería"

"Furia Bacalhau e Frutos do Mar"

"Around the Horn"

"Queen Cozinha"

Total spent in each product

By summing up product prices and ordered quantities, an aggregated view per product for the customer is provided. You can use aggregation functions like sum, count, avg, and max in both SQL and Cypher.

SQL Cypher

If you want to see what a company (e.g. Drachenblut Delikatessen) paid in total per product, including where they had no orders for products, you have to use OUTER JOINS to make sure that results are returned even if there are no matching rows in other tables:

SELECT p.Product_Name, sum(od.Unit_Price * od.Quantity) AS TotalPrice
FROM customers AS c
LEFT OUTER JOIN orders AS o ON (c.Customer_ID = o.Customer_ID)
LEFT OUTER JOIN order_details AS od ON (o.Order_ID = od.Order_ID)
LEFT OUTER JOIN products AS p ON (od.Product_ID = p.Product_ID)
WHERE c.Company_Name = 'Drachenblut Delikatessen'
GROUP BY p.Product_Name;

In Cypher, you need to turn the unitPrice property of the ORDERS relationship into an interger in order to do the calculation between quantity ordered and amount spent:

MATCH (p:Product)<-[o:ORDERS]-(order:Order)
SET o.unitPrice = toInteger(o.unitPrice)
RETURN o

Then you MATCH the company you want to gather information from, and use OPTIONAL MATCH to find their purchases, and products acquired, and RETURN the sum:

MATCH (c:Customer {companyName:'Drachenblut Delikatessen'})
OPTIONAL MATCH (c)-[:PURCHASED]->(:Order)-[o:ORDERS]->(p:Product)
RETURN p.productName, toInteger(sum(o.unitPrice * o.quantity)) AS totalPrice
Result
p.productName totalPrice

"Gumbär Gummibärchen"

372

"Perth Pasties"

640

"Konbu"

114

"Jack’s New England Clam Chowder"

81

"Queso Cabrales"

420

"Raclette Courdavault"

1650

"Lakkalikööri"

168

"Rhönbräu Klosterbier"

72

"Gorgonzola Telino"

200

Amount of products supplied

The previous example mentioned aggregation and used the SUM function to find out how much a company has spent when purchasing specific products. You can use the COUNT function in Cypher to also count how many products are offered by a supplier, for example.

SQL Cypher

In SQL, aggregation is explicit, so you have to provide all grouping keys again in the GROUP BY clause.

SELECT s.CompanyName AS Supplier, COUNT(p.ProductID) AS NumberOfProducts
FROM Suppliers s
JOIN Products p ON s.SupplierID = p.SupplierID
GROUP BY s.CompanyName
ORDER BY NumberOfProducts DESC
LIMIT 5;

In Cypher, grouping for aggregation is implicit. As soon as you use the first aggregation function, all non-aggregated columns automatically become grouping keys:

MATCH (s:Supplier)<-[:SUPPLIED_BY]-(p:Product)
RETURN s.companyName AS Supplier, COUNT(p) AS NumberOfProducts
ORDER BY NumberOfProducts DESC
LIMIT 5

Additional aggregation functions like collect, percentileCont, stdDev are also available.

Result
Supplier NumberOfProducts

"Pavlova"

5

"Plutzer Lebensmittelgroßmärkte AG"

5

"Specialty Biscuits"

4

"New Orleans Cajun Delights"

4

"Grandma Kelly’s Homestead"

3

List of products supplied

In Cypher you can use the COLLECT function to gather all nodes connected to others, but SQL doesn’t have a direct equivalent to it.

SQL Cypher

In SQL, if you want a lit of what products the suppliers offer, you use STRING_AGG:

SELECT s.CompanyName AS Supplier, STRING_AGG(p.ProductName, ', ' ORDER BY p.ProductName) AS ProductsSupplied
FROM Suppliers s
JOIN Products p ON s.SupplierID = p.SupplierID
GROUP BY s.CompanyName
ORDER BY s.CompanyName
LIMIT 5;

In Cypher, you can either return the structure like in SQL or use the collect() aggregation function, which aggregates values into a collection (list, array). This way, only one row per parent, containing an inlined collection of child values, is returned:

MATCH (s:Supplier)-[:SUPPLIES]->(p:Product)
RETURN s.companyName AS Supplier, COLLECT(p.productName) AS ProductsSupplied
ORDER BY Supplier
LIMIT 5

This also works for nested values.

Result
Supplier ProductsSupplied

"Aux joyeux ecclésiastiques"

["Côte de Blaye", "Chartreuse verte"]

"Bigfoot Breweries"

["Sasquatch Ale", "Laughing Lumberjack Lager", "Steeleye Stout"]

"Cooperativa de Quesos 'Las Cabras'"

["Queso Manchego La Pastora", "Queso Cabrales"]

"Escargots Nouveaux"

["Escargots de Bourgogne"]

"Exotic Liquids"

["Aniseed Syrup", "Chang", "Chai"]