An approach to parsing the query.log

When query.log has been enabled through the Neo4j Enterprise parameter dbms.logs.query.enabled, the included bash shell script can be used to quickly parse the log and identify the top 10 most expensive queries based upon total execution time, and if one has… Read more →

Tuning Cypher queries by understanding cardinality

Cardinality issues are the most frequent culprit in slow or incorrect Cypher queries. Because of this, understanding cardinality, and using this understanding to manage cardinality issues, is a critical component in Cypher query tuning, and query correctness in general. A… Read more →

Creating and working with linked lists in Cypher

At some point when working with a graph, you may want to create a linked list out of some nodes. If each of the nodes to be linked has its own variable, this is easy: you just do a CREATE… Read more →

Conditional Cypher Execution

At some point you’re going to write a Cypher query requiring some conditional logic, where you want different Cypher statements executed depending on the case. At this point in time Cypher does not include native conditional functionality to address this… Read more →

Alternatives to UNION queries

While UNIONs can be useful for certain cases, they can often be avoided completely with small changes to the query. In this article we’ll present various example cases where a UNION isn’t necessary, and a simple Cypher query will do.… Read more →

Using AWS CLI to upload/download files to Amazon S3 bucket

If one has installed the AWS CLI, to download a file from an S3 bucket anonymously, run aws s3 cp s3://<AWS instance name>/<bucket_Name>/<file> <file> --no-sign-request, and/or to upload to a Neo4j S3 bucket anonymously, run aws s3 cp <file> s3://<AWS… Read more →

Linux Out of Memory killer

The Out Of Memory Killer, or OOM Killer, is a process that the Linux kernel employs when the system is critically low on memory. This situation occurs because the Linux kernel has overallocated memory to its processes. When a… Read more →

Understanding transaction and lock timeouts

One way to handle runaway queries is to impose a time limit that will terminate a query when exceeded. There are some subtleties here that need to be understood to ensure proper behavior and avoid confusion. Defining a transaction timeout… Read more →

Understanding memory configurations for neo4j-admin backup

When using bin\neo4j-admin backup to back up a Neo4j database, Neo4j Support recommends explicitly defining the JVM heap size and pagecache memory to be used by the backup JVM process. If these are not defined then when neo4j-admin backup is executed,… Read more →

How to import a file with LOAD CSV that has a space in file name?

When you try to import data from a file using LOAD CSV and the filename contains spaces, you get an error such as the following: Statement: load csv from “file:///test copy.csv” as row return row Error: Illegal character in path… Read more →

Using apoc.load.jsonParams to load data from Zendesk into Neo4j to learn about article subscribers

The following document describes how to utilize the Zendesk API to load data from Zendesk into Neo4j, specifically data about users who have chosen to subscribe/follow Knowledge Base section(s). This document attempts to solve the issue described by the following… Read more →

Enabling GC Logging

What is garbage collection and why enable GC logging? A garbage collection event is a complete pause of the Java application (i.e., neo4j-server). It can be identified in the debug.log as a stop-the-world event. For example: If you notice issues with… Read more →

How Does Neo4j Browser interact with Neo4j Server?

Starting with Neo4j 3.2, the Neo4j Browser only supports Bolt connectivity to the Neo4j Server. This requires that the network allows for socket communication between the browser and Bolt Port specified on the Neo4j Server. To see if your network… Read more →

Viewing schema data with APOC Procedures

APOC Procedures offers meta procedures to view information about your database schema and the data it stores. The procedure apoc.meta.schema() uses a sampling of the graph data to produce a map of metadata on the graph labels, relationships, properties, and… Read more →

How to avoid using excessive memory on deletes involving dense nodes

In situations where you know you need to delete a bunch of nodes (and by rule their relationships as well), it can be tempting to simply use DETACH DELETE and be done with it. However, this can become problematic if… Read more →

Checkpointing and Log Pruning interactions

Overview: Checkpointing is the process of flushing all pending page updates from the page cache to the store files. This is necessary for ensuring that the number of transactions that are in need of being replayed during recovery is kept… Read more →

Explanation of error “Record id 65536 is out of range [0, 65535]”

When running a Cypher statement that creates a new relationship type, for example MERGE (n1:Person {id:1})-[r:knows]->(n2:Person {id:2}), one may encounter an error which is logged in the $NEO4J_HOME/logs/debug.log as 2017-10-30 17:08:29.741+0000 ERROR [o.n.b.v.r.ErrorReporter] Client triggered an unexpected error [UnknownError]: Could… Read more →

Fast counts using the count store

Neo4j maintains a transactional count store for holding count metadata for a number of things. The count store is used to inform the query planner so it can make educated choices on how to plan the query. Obtaining counts from… Read more →

Neo4j’s commit process explained

This article will try to guide you through Neo4j’s commit and replication processes both for single instances and causal clusters. Single Instance: When you call tx.commit(), the transaction will go through the Storage Engine which will transform that transaction… Read more →

Understanding causal cluster size scaling

The ability to safely scale down the size of a causal cluster affords us more robustness for instance failures, provided we maintain quorum as the failures take place. Prior to 3.4, we used a single config property to define both… Read more →

How to Setup Neo4j to Startup on Linux Server Reboot

If you want to emulate the Neo4j RPM service with a tar installation on Linux systems, do the following steps: As root: Copy the $NEO4J_HOME/bin/neo4j script file to /etc/init.d Edit the /etc/init.d/neo4j script file to uncomment the NEO4J_HOME variable and… Read more →

Using apt to download a specific Neo4j debian package

By default, using apt-get to install Neo4j allows you to grab the current and previous stable releases. However, if you would like to install an older version, you can specify that. For reference, the Debian repo is located here:… Read more →

Understanding memory consumption

So you have configured Neo4j to use 4GB of heap and 6GB of page cache and sat back relaxed, thinking the Java process would not go above 10GB on your 12GB machine, only to realise that Neo4j had an OOM… Read more →

Achieving longestPath Using Cypher

While Cypher is optimized for finding the shortest path between two nodes, with such functionality as shortestPath(), it does not have the same sort of function for longest path. In some cases, you may want this, and not the shortest… Read more →

Neo4j specific http request user agent strings

For those APOC commands that retrieve data using HTTP/HTTPS, and/or when running Cypher LOAD CSV, the request will be sent with Neo4j-specific user-agent/browser identifiers. Below is an example log from an Apache web server's access log at /var/log/apache2/access.log and includes… Read more →