How to Create Conditional and Dynamic Queries in Neo4j Bloom


Enabling BI tool-like functionality in Neo4j Bloom — “I want to easily search based on this and/or that logic”

Neo4j Bloom is a wonderful tool for navigating and visualizing a Neo4j graph without having to know the Cypher query language. This functionality is unique and a necessity for graph database end-users. Why?

The variety and depth of connections between nodes and relationships that form paths in a graph often cannot be determined ahead of time. The Cypher query language can express those paths, but learning how to write a Cypher query, similar to learning SQL, can be limiting for some. Besides, it’s fun to be able to visually traverse the graph, similar to taking an unknown path through a forest. You never know what you might find. Take a quick look at the “Bloom Features: Near Natural Language Search” video if you’re not familiar with the Bloom search and graph navigation capabilities.

Compare using Bloom to freely explore a graph with classic business intelligence tools. BI tools work with known data structures and predetermined links that are instantiated at runtime (think joins). Bloom users traverse graph paths without having to know the structure of the graph ahead of time (think traversing nodes and relationships regardless of types). The user stories for BI tools, such as Tableau, and Neo4j Bloom address two different scenarios.

BI tools provide functionality that Bloom was never designed to provide (think subtotals, bar charts, etc.). A Neo4j graph database can appear as an RDBMS structure and be queried with SQL through the Neo4j BI Connector, but that’s a different topic.

Using the BI Connector to Query Neo4j with SQL

Querying and traversing a graph in Bloom is done with idiomatic, near natural language search patterns. A wonderfully graph-y thing, but there are times when pattern traversal of a graph is not what is needed. The developers of Neo4j Bloom addressed this by enabling custom Cypher queries executed in Bloom via “Search phrases.” Search phrases are parameterized Cypher queries that use a combination of text, parameters, and Cypher to guide users through building a customized query in the Bloom search box, all without knowing Cypher. See the “Search Phrases in Bloom” video below if you’re not familiar with them.

Search phrases are great for many use cases, except when there’s a need to add declarative constructs to query patterns similar to BI tool functionality. No worries! Below we’ll discuss a generic approach for providing a level of declarative functionality.

Example Time! Dynamic Search Based on If X and/or Y Kinda Stuff

Things most don’t realize you can do with Bloom

Many users want to be able to dynamically search the graph based on and/or logic. For example, “Find all :Person labeled nodes where the property born is between 1965 and 1992.” This declarative logic is something BI users are used to, but does not naturally integrate into the graph path traversal functionality of Bloom. On the other hand, visual unbounded traversing of a graph is not something BI users typically have.

Bloom search phrases can be used to address the need for conditional and declarative query constructs. The building block for this is one of my favorite Neo4j APOC procedures, apoc.cypher.run(). It’s a handy little procedure that will execute a Cypher query passed in as a string. Guess what? Bloom search phrases can dynamically build Cypher strings. Below are some foundational examples to illustrate.

👉 Run a Cypher Statement in Bloom 👈

The most basic example of using apoc.cypher.run() in Bloom would be to create a search phrase that inputs a Cypher query string and executes it:

Bloom Search phrase definition pane

A user would do the following in the search box to execute this search phrase:

  1. Start typing “Q1 -” in the search box and tab to complete the search phrase.
  2. Once you have Q1 - Cypher Input: showing in the search box, type the free-form Cypher query, e.g. MATCH path=(:Person)-->().
  3. Then press enter to execute the query, which will show something similar to:

The Cypher query text entered in the search box as part of the search phrase is passed to the main query in the parameter $cypherText, which is then executed by apoc.cypher.run(). I’ve used this search phrase when I need a visualization that is too much for the Neo4j Browser to render. Read the “Where’s My Neo4j Cypher Query Results” post for more details on when this might occur.
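The search phrase definition itself is only shown as an image, so here is a minimal sketch of what the Cypher behind it might look like. It assumes the user types only a MATCH pattern with named variables (as in the example above), so the RETURN and LIMIT clauses are appended by the search phrase; the limit of 1000 is a placeholder, not the value used in the original definition.

```cypher
// Sketch of a "Q1 - Cypher Input: $cypherText" search phrase query.
// Assumption: $cypherText holds a MATCH pattern with at least one named variable,
// e.g. MATCH path=(:Person)-->()
// Assumption: 1000 is a placeholder for the hardcoded limit.
WITH $cypherText + " RETURN * LIMIT 1000" AS qryStr
CALL apoc.cypher.run(qryStr, {}) YIELD value
RETURN value
```

apoc.cypher.run() yields one map per result row in value, keyed by the variable names returned by the inner query.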

Where’s My Neo4j Cypher Query Results? 😠 ⚡️ ⁉️

Note the hardcoded LIMIT clause in the search phrase query. It is prudent to limit the number of nodes returned by the submitted dynamic Cypher to a value less than or equal to the “Node query limit” setting in Bloom, or to some other sensible number. Bloom will stop retrieving data when the node query limit setting is reached, so why have the Neo4j database work on retrieving more data than Bloom will display? The techniques outlined below can be used to have users specify the node limit in a search phrase to match the changeable Bloom node query limit setting.
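As a rough illustration of a user-specified limit, the sketch below assumes a second, hypothetical search phrase parameter named $limit (String data type, free-form input) alongside $cypherText. The limit is handed to the inner query as a parameter rather than concatenated into the string, and falls back to a placeholder of 1000 if the input is not a number.

```cypher
// Sketch only: $limit is a hypothetical extra search phrase parameter.
// toInteger() returns null for non-numeric text, so coalesce() supplies a fallback.
CALL apoc.cypher.run(
  $cypherText + " RETURN * LIMIT $lim",
  {lim: coalesce(toInteger($limit), 1000)}
) YIELD value
RETURN value
```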

Neo4j Bloom version 1.6+ uses the reactive JavaScript driver. The “Fetching Large Amount of Data Using the Neo4j Reactive Driver: The Bloom Case” Medium post is an interesting read on how this is implemented, with some good background on the interrelationship between displaying nodes and uniqueness.

Fetching Large Amount of Data Using the Neo4j Reactive Driver: The Bloom Case

👉 Search Phrase That Allows for Conditionals on Properties 👈

Forming a query pattern in Bloom allows for flexible searching with exact matching values, but can be problematic with other types of operators. A path-based Bloom query currently cannot search for all :Person nodes with the born property > 1965, or on properties of relationships. No worries! A generic search phrase can be used to guide a user through this functionality. The animation below shows this search phrase functionality using the ubiquitous Neo4j “hello world” movies database.

Finding all :Person nodes with property born > 1965

The above example was done with a search phrase that chains search phrase parameters, which is where parameter values are used in other parameter definitions. Below is the Bloom search phrase definition for the above shown in the Bloom search phrase editing panel:

The logic behind this search phrase is:

1️⃣ Create Search Phrase. Enter search phrase text with the conditional parameters $label, $property, $condition, and $value, shown as a, b, c, d in the above image.

2️⃣ Add a user friendly description.

3️⃣ Main Cypher Query. The main Cypher query is built by concatenating the parameter values created in subsequent steps into a string to be run by apoc.cypher.run(). The parameters are the ones named in the search phrase text defined in step 1️⃣:

// Main Bloom Cypher query
WITH "MATCH (n:" + $label + ") " +
"WHERE n." + $property +
" " + $condition +
" " + $value +
" RETURN n" AS qryStr
CALL apoc.cypher.run(qryStr, {}) YIELD value AS nodes
RETURN nodes
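
To see what string the main query hands to apoc.cypher.run(), the concatenation can be run on its own in the Neo4j Browser with literal values standing in for the Bloom parameters. The values below are just the ones from the born > 1965 example.

```cypher
// Stand-in literals for the $label, $property, $condition and $value parameters
WITH "Person" AS label, "born" AS property, ">" AS condition, "1965" AS value
RETURN "MATCH (n:" + label + ") " +
       "WHERE n." + property +
       " " + condition +
       " " + value +
       " RETURN n" AS qryStr
// qryStr: MATCH (n:Person) WHERE n.born > 1965 RETURN n
```

Because the values are pasted straight into the query text, anything the user types becomes part of the Cypher that is executed.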

4️⃣ Pick a Node Label. Get the $label parameter value (‘a’ in the image). Present the users with all the labels in the database using the database metadata call db.labels() that returns what labels exist in the graph, except for the _Bloom_Perspective_ label because it does not hold any user data:

// Parameter: $label; Data type: String; values: Cypher Query
CALL db.labels() YIELD label
WHERE label <> '_Bloom_Perspective_'
RETURN label

5️⃣ Present Only Numeric Properties As a Query Option. Get the $property parameter value (‘b’ in the image). Use the metadata database call db.schema.nodeTypeProperties() to return any property for the nodes that have the label defined in the $label parameter and are numeric values.

// Parameter: $property; Data type: String; values: Cypher Query
CALL db.schema.nodeTypeProperties() YIELD nodeType, propertyName, propertyTypes
WHERE nodeType = ":`" + $label + "`"
AND propertyTypes IN [['Long'], ['Double']]
RETURN DISTINCT propertyName

This example restricts the search to numeric values to reduce the code shown in this blog. The “dynamic search…” example below and the corresponding code in the blog’s GitHub repo have logic to handle conditionals for different data types. For example, this search phrase parameter Cypher does not deal with string query conditionals, such as STARTS WITH conditions.

6️⃣ Present Conditional Operator. Pick a $condition parameter (‘c’ in the image). Only the valid conditionals for numerics are returned here.

// Parameter: $condition; Data type: String; values: Cypher Query
UNWIND ["=", "<>", "<", ">", "<=", ">=", "IS NULL", "IS NOT NULL"] AS opX
RETURN opX

IS NOT NULL and IS NULL can be used to test a property’s existence; it could be switched to use [NOT] exists(<node.property>) with a different search phrase query pattern.

7️⃣ Enter A Value To Query On. Free-format $value to search for (‘d’ in the image). No Cypher is needed and the return value is a String. The node properties that can be included in the main query are limited to numeric values, as picked up by the call to db.schema.nodeTypeProperties() in step 5️⃣.

➡️ Last step is to save the query and run it using the Bloom search box.

Note: There is a new scene (displayed data) filter capability in Bloom version 1.6 that allows you to choose individual property values to filter the nodes in the scene. The difference is that this search phrase filters data in the query that retrieves it, whereas the scene filter works on data that has already been queried and is displayed by Bloom.

As you think about creating your own dynamic search phrases, keep in mind that properties in a labeled property graph can be unique to each node or relationship. This means that two different nodes can have a property with the same name but different data types. It can be valid to have a node with the property born = 1997 and another with born = ‘St. Mare’, with or without the same label(s). Your use case may therefore call for specifying the data type being queried.
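
One way to check whether a property carries more than one data type is the same db.schema.nodeTypeProperties() call used in step 5️⃣. This is a small sketch; the property name born is just the example from this post.

```cypher
// List the data types recorded for the 'born' property on each node type.
// A propertyTypes list with more than one entry means the property is stored
// with mixed types on nodes of that type.
CALL db.schema.nodeTypeProperties() YIELD nodeType, propertyName, propertyTypes
WHERE propertyName = 'born'
RETURN nodeType, propertyName, propertyTypes
```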

👉 Dynamic Labels and Search on Multiple Properties with AND/OR Condition 👈

The next logical step is to expand on the above conditional logic example to include choosing a property, setting a conditional, then choosing AND/OR condition for a second property selection. This search phrase also presents the user with the proper conditional tests based on the data type of the property being tested.

Person born property IS NULL and name CONTAINS ‘J’

A few things to note on how this search phrase was constructed when you look at the code in the file srchPhrs_3_labelMultiProp.cypher in the GitHub repo:

  • Spaces are a delimiter for the Bloom search box, which makes completing parameter terms with spaces, such as IS NOT NULL, tricky for the user to navigate. The approach used here for parameters with spaces is to have them replaced with an underbar (e.g. IS_NOT_NULL). Personally I think it’s easier to read the terms in the search box when clauses with spaces have underbars, such as IS_NOT_NULL and ENDS_WITH. The underbars for these phrases are replaced with spaces when the final string is passed to apoc.cypher.run().
  • Search phrase completion requires that each parameter have a value, and the conditional terms IS_NULL and IS_NOT_NULL do not have a target value. This is addressed by breaking the IS_ conditions into two parts. The first parameter can be IS_, which allows testing in the second parameter to present the user with NULL and NOT_NULL.
  • The main query is a regular Cypher query, so string data types used as test values must be enclosed in quotes (the 'J' in the animation above). This is different from the typeahead functionality in the Bloom search box.
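
The underbar handling and the two-part IS_ condition can be sketched roughly as follows. This is an illustration of the idea, not the code from srchPhrs_3_labelMultiProp.cypher; the literals stand in for search phrase parameters and n.born is just an example property.

```cypher
// Stand-in literals for a condition parameter and its value parameter
WITH "IS_" AS condition, "NOT_NULL" AS value
WITH CASE
       // IS_ + NULL / NOT_NULL: join the two parts, then swap underbars for spaces
       WHEN condition = "IS_" THEN replace(condition + value, "_", " ")
       // other conditions (e.g. ENDS_WITH): swap underbars only in the operator
       ELSE replace(condition, "_", " ") + " " + value
     END AS predicate
RETURN "WHERE n.born " + predicate AS whereClause
// whereClause: WHERE n.born IS NOT NULL
```

With condition set to "ENDS_WITH" and value set to "'J'", the same expression yields a predicate of ENDS WITH 'J', with the quotes supplied by the user as described in the last bullet.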

👉 Finding Multi-Label Nodes 👈

A scenario that is occasionally asked for is being able to search for nodes with multi-label combinations. To show this, the movies database has been enhanced with additional labels on :Person and :Movie nodes, as described in the GitHub repo readme, with the code in the 0_2_enhanceMovie.cypher file:

// New labels to make things interesting
// A :Person labeled node can be:
// Rich
// Famous
// Rich and famous
// Neither rich nor famous
//
// A :Movie labeled node can be:
// Famous
// Or not

The query that drives this search phrase does not return a single label option because Bloom already does this. It also does not allow for invalid label combinations; a node with a :Person and :Movie label combination is not valid in the movie database and is excluded from the options presented to the user via the Cypher WHERE NOT clause.

// Parameter $label; type String; output Cypher Query
// Note: logic in the cypher query removes invalid label combinations for the enhanced movies graph
CALL db.labels() YIELD label as label
WHERE label <> '_Bloom_Perspective_'
WITH collect(label) as labels
WITH labels, size(labels) as nbrLabels
UNWIND apoc.coll.combinations(labels, 2, nbrLabels) AS labelCombos
WITH labelCombos // exclude invalid label combinations
WHERE NOT ('Movie' IN labelCombos AND 'Person' IN labelCombos)
AND NOT ('Movie' IN labelCombos AND 'Rich' IN labelCombos)
WITH labelCombos
WITH reduce(x = '', lab in labelCombos | x + ':' + lab + ' AND ' ) AS whereClause
RETURN left(whereClause, size(whereClause) - 4) as whereClause
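
The main query that consumes this parameter is not shown here, so the following is only a guess at how a value such as ":Person AND :Rich" could be turned into a runnable query. It assumes the parameter is named $label, as in the comment above, and prefixes each label with the node variable n.

```cypher
// Sketch only: turns ":Person AND :Rich" into "n:Person AND n:Rich"
// and wraps it in a MATCH ... WHERE ... RETURN string for apoc.cypher.run().
WITH "MATCH (n) WHERE " + replace($label, ":", "n:") + " RETURN n" AS qryStr
CALL apoc.cypher.run(qryStr, {}) YIELD value AS nodes
RETURN nodes
```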

👉 One Last Thing. Show the Graph Database Schema 👈

It’s always useful to see what types of labels, relationships, and properties are in a graph, which can easily be done with a db.schema.visualization() query in a search phrase:

CALL db.schema.visualization() YIELD nodes, relationships
UNWIND nodes AS node
WITH node, relationships
WHERE apoc.node.labels(node) <> ["_Bloom_Perspective_"]
RETURN collect(node) AS nodes, relationships

Result of :Schema search phrase

What Else Might Be Done?

  • Allow search phrases that work with alternative data structures, such as lists.
  • Use aggregate functions to enable searching for minimum, maximum, averages, etc. of properties.
  • Use predicate functions such as any() or all() for extended AND/OR query patterns.
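
As a starting point for the last idea, here is a small, untested-in-Bloom sketch of an OR across several properties using any(). The label and property names come from the movies database; in a search phrase the list of property names would presumably be built from parameters.

```cypher
// Find :Person nodes where at least one of the listed properties is missing.
// Swapping any() for all() turns the OR across properties into an AND.
MATCH (n:Person)
WHERE any(prop IN ['name', 'born'] WHERE n[prop] IS NULL)
RETURN n
```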

Hopefully more examples will be added or suggested in the GitHub repo for this post if anyone has other ideas.

GitHub / Setup If You Want to Play Along in Your Own Bloom Environment

The Cypher statement files referenced in this blog can be found in the GitHub repo:

dfgitn4j/bloom-dynamic-cond-search

A few things need to be available and done to run the examples in this post in your own Bloom environment:

  1. Have access to Neo4j Bloom, either through Neo4j Desktop or a server install.
  2. Create the movies database by running the :play movies guide in the Neo4j Browser, or by running the Cypher in the file 0_1_creatMovieDB.cypher.
  3. Add new node label combinations by executing the Cypher in the file 0_2_enhanceMovie.cypher if you want to run the multi-label combination search phrase example.
  4. Import the perspective Bloom conditional and dynamic queries.json into Bloom by clicking on “Import Perspective” from Bloom’s Perspective Gallery.

Tips and Techniques

A few tips about working with Bloom and the constructs presented in this blog:

  1. The Bloom search box is very aggressive looking for relevant search phrases and graph patterns as users type. Having unique wording in a search phrase definition and description reduces the number of search phrases being presented to the user.
  2. Consider adding a LIMIT clause to dynamic queries generated using the technique described in this post that matches Bloom’s “Node query limit” parameter.
  3. Metadata commands such as the db.labels() used in search phrases will return Bloom Categories (labels in Neo4j graph database speak) that have not been included in the current Bloom Categories definition. Unwanted categories can be explicitly excluded in the search phrase Cypher.
  4. Case insensitivity does not work the same as with the Bloom case insensitive search option. Case insensitivity must be coded in the search phrase Cypher statements.
  5. Removing a query parameter while editing the search phrase definition also removes the corresponding parameter logic. Save any code you wish to keep before doing so.
  6. There is no error checking. A runtime error will be shown if an incorrect Cypher statement is built.
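
For tip 4, a minimal sketch of coding case insensitivity into a search phrase query is shown below. $value is assumed to be a free-form String search phrase parameter, and the label and property are just the ones from the movies database.

```cypher
// Lower-case both sides so that 'j', 'J', 'jo' and 'JO' all match the same names
MATCH (n:Person)
WHERE toLower(n.name) CONTAINS toLower($value)
RETURN n
```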

Thank you for your time, and please email the author or add code and suggestions to the github repo for this blog if you have other useful search phrase patterns. I will be adding some in the future that are bouncing around in my “when I have spare time and have functional consciousness” spot in my b̶r̶i̶a̶n brain.


How to Create Conditional and Dynamic Queries in Neo4j Bloom was originally published in Neo4j Developer Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.