Chapter 1. Introduction

This chapter introduces Neo4j capabilities, editions, and architecture.

1.1. Neo4j editions

There are two editions of Neo4j to choose from: Community Edition and Enterprise Edition. The nature of the required solution will help decide which edition to select.

Community Edition is a fully functional edition of Neo4j, suitable for single instance deployments. It has full support for key Neo4j features, such as ACID compliance, Cypher, and programming APIs. It is ideal for smaller workgroup or do-it-yourself projects similar to:

  • learning Neo4j and just getting started
  • building a solution for an internal team that can tolerate downtime for support
  • building a solution available to external users, but without guarantees on uptime or availability
  • building a solution which does not have high demands for scalability or concurrent access

Enterprise Edition extends the functionality of Community Edition to include key features for performance and scalability, such as a clustering architecture for high availability and online backup functionality. It is the choice for production systems with availability requirements or needs for scaling up, for example:

  • the ability to scale up your solution with the clustering architecture
  • 24x7 availability capabilities
  • ability to support disaster recovery
  • provisioning for early stage load testing
  • access to professional support from Neo Technology

Which is the right Neo4j edition for a particular deployment?

As a rule of thumb:

  1. Both editions offer the same core graph database capabilities.
  2. Enterprise Edition is the choice for a commercial solution, a critical or highly depended-on internal solution, and any deployment where you anticipate needing scalability, redundancy, or high availability.

Table 1.1. Features

Edition                              Enterprise   Community
Property Graph Model                 X            X
Native Graph Processing & Storage    X            X
ACID                                 X            X
Cypher - Graph Query Language        X            X
Language Drivers                     X            X
Extensible REST API                  X            X
High-Performance Native API          X            X
HTTPS                                X            X

Table 1.2. Performance & Scalability

Edition                              Enterprise   Community
Enterprise Lock Manager              X            -
High-Performance Cache               X            -
Clustering                           X            -
Hot Backups                          X            -
Advanced Monitoring                  X            -

1.2. Neo4j for the enterprise

This section covers the major features of Neo4j Enterprise Edition.

1.2.1. Architecture

Figure 1.1. Neo4j cluster

A Neo4j cluster consists of a single master instance and zero or more slave instances. All instances in the cluster have full copies of your data in their local database files. Each database instance contains the logic needed to coordinate with the other members of the cluster for data replication and election management.

When a write transaction is performed on a slave, each write operation is synchronized with the master. Locks are acquired on both the master and the slave. When the transaction commits, it is first committed on the master and then, if successful, on the slave. To ensure consistency, a slave has to be up to date with the master before performing a write operation. This is built into the communication protocol between the slave and the master, so updates are applied automatically to a slave whenever it communicates with its master.
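
The ordering described above can be illustrated with a small toy model. This is only a sketch: the `Instance` class and the `write_via_slave` function are hypothetical names invented for illustration, they are not Neo4j APIs, and lock acquisition is left out entirely.

```python
# Toy model only: these names are hypothetical and are not Neo4j APIs.
class Instance:
    def __init__(self, name):
        self.name = name
        self.log = []  # committed transactions, in commit order

    def last_tx_id(self):
        return len(self.log)


def write_via_slave(master, slave, tx):
    """Commit tx through a slave, following the order described above."""
    # 1. The slave must be up to date with the master before it may write.
    slave.log.extend(master.log[slave.last_tx_id():])
    # 2. The transaction is committed on the master first ...
    master.log.append(tx)
    # 3. ... and only then, if that succeeded, on the slave.
    slave.log.append(tx)
    return master.last_tx_id()


master, slave = Instance("master"), Instance("slave")
master.log.extend(["tx1", "tx2"])  # the slave has not seen these yet
tx_id = write_via_slave(master, slave, "tx3")
print(tx_id, slave.log)  # 3 ['tx1', 'tx2', 'tx3']
```

In this model the slave ends up with the same log as the master because it catches up before the new transaction is applied.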

Write transactions performed directly through the master will execute in the same way as running in normal non-cluster mode. On success the transaction will be pushed out to a configurable number of slaves. This is done optimistically, meaning that if the push fails, the transaction will still be successful.
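
A minimal sketch of the optimistic push, under the assumption that a failed push simply skips that slave. The `Slave` class, the `commit_on_master` function, and the `push_factor` parameter are names chosen for this illustration and do not correspond to Neo4j classes or configuration settings.

```python
# Toy model only: names are hypothetical, not Neo4j APIs or settings.
class Slave:
    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.log = []

    def apply(self, tx):
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        self.log.append(tx)


def commit_on_master(master_log, slaves, tx, push_factor):
    """Commit locally, then push to at most push_factor slaves, best effort."""
    master_log.append(tx)  # the commit itself succeeds here
    pushed = []
    for slave in slaves[:push_factor]:
        try:
            slave.apply(tx)
            pushed.append(slave.name)
        except ConnectionError:
            pass  # optimistic: a failed push does not fail the transaction
    return pushed


master_log = []
slaves = [Slave("s1"), Slave("s2", reachable=False), Slave("s3")]
pushed = commit_on_master(master_log, slaves, "tx1", push_factor=2)
print(master_log, pushed)  # ['tx1'] ['s1']
```

Here the push to `s2` fails, yet the transaction remains committed on the master. Slaves that did not receive the push would catch up later through the normal synchronization described above.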

Whenever a Neo4j database becomes unavailable, for example because of a hardware failure or network outage, the other database instances in the cluster will detect that and mark it as temporarily failed. A database instance that becomes available after being unavailable will automatically catch up with the cluster. If the master goes down, another member will be elected and have its role switched from slave to master after a quorum has been reached within the cluster. When the new master has performed its role switch, it will broadcast its availability to all the other members of the cluster. Normally a new master is elected and started within just a few seconds, and during this time no writes can take place.

A special case of a slave instance is the arbiter instance. The arbiter instance does not operate any database, but it can be regarded as a cluster participant in that its role is to take part in master elections, with the single purpose of breaking ties in the election process. This makes possible a scenario where you have a cluster of two Neo4j database instances plus an arbiter instance, and still tolerate a single failure of any one of the three instances.
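
The value of the arbiter can be seen with a little arithmetic. The sketch below assumes that "quorum" means a strict majority of all cluster members; the text above does not define the term precisely, so treat this as an illustration rather than a description of the actual election protocol. The function name `has_quorum` is hypothetical.

```python
def has_quorum(total_members, available_members):
    """Assumption: a quorum is a strict majority of all cluster members."""
    return available_members > total_members // 2


# Two database instances plus one arbiter: one failure still leaves a majority.
print(has_quorum(total_members=3, available_members=2))  # True

# Two database instances without an arbiter: one failure leaves a tie.
print(has_quorum(total_members=2, available_members=1))  # False
```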

All this can be summarized as:

  • Write transactions can be performed on any database instance in a cluster.
  • A Neo4j cluster is fault tolerant and can continue to operate from any number of machines down to a single machine.
  • Slaves will be automatically synchronized with the master on write operations.
  • If the master fails, a new master will be elected automatically.
  • The cluster automatically handles instances becoming unavailable (for example due to network issues), and also makes sure to accept them as members in the cluster when they are available again.
  • Transactions are atomic, consistent, and durable, but are propagated to the other slaves eventually.
  • Updates to slaves are eventually consistent by nature but can be configured to be pushed optimistically from master during commit.
  • If the master goes down, any running write transaction will be rolled back and new transactions will block or fail until a new master has become available.
  • Reads are highly available and the ability to handle read load scales with more database instances in the cluster.

1.2.2. Design considerations

When designing your solution, some of your first considerations will concern your functional requirements and the technology choices you make to meet them. Those functional requirements will likely include the need to scale to many concurrent users, to maintain consistent uptime, or to recover from a system failure while maintaining availability. These are important production-related questions that help drive your technical decisions and can ultimately guide you to choose to cluster Neo4j.

This section covers four major advantages of using Neo4j clustering:

  1. Read Scalability
  2. High Availability
  3. Disaster Recovery
  4. Analytics

1.2.2.1. Read scalability

Clustering Neo4j allows you to distribute read workload across a number of Neo4j instances. You can take two approaches to scaling your reads with Neo4j:

1.2.2.1.1. Distribute load-balanced reads to any slave instance in the cluster

Neo4j’s clustering architecture replicates the entire database to each instance in your cluster. Therefore you are able to direct any read from your application to any slave instance without much concern for data locality.
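
Because any slave can serve any read, the routing logic can be very simple. The following is a minimal round-robin sketch; the `ReadBalancer` class and the host names are hypothetical and only stand in for whatever load balancer and addresses a real deployment uses.

```python
from itertools import cycle


class ReadBalancer:
    """Hands out slave addresses in round-robin order (illustrative only)."""

    def __init__(self, slave_urls):
        if not slave_urls:
            raise ValueError("at least one slave address is required")
        self._next_url = cycle(slave_urls)

    def pick(self):
        # Every slave holds a full copy of the data, so any of them will do.
        return next(self._next_url)


# Hypothetical addresses, used only for this example.
balancer = ReadBalancer(["http://slave1:7474", "http://slave2:7474"])
picks = [balancer.pick() for _ in range(3)]
print(picks)
# ['http://slave1:7474', 'http://slave2:7474', 'http://slave1:7474']
```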

Figure 1.2. Distribute load-balanced reads to any slave instance in the cluster
1.2.2.1.1.1. When would you choose this method?
  1. You need to scale up the number of concurrent read requests
  2. Your data has no natural or obvious way of partitioning reads
  3. A significant portion of the data that needs to be read can reasonably be expected to already be in memory on any instance in the cluster.
1.2.2.1.2. Direct reads to specific instances in the cluster

This is sometimes referred to as "cache-based partitioning". The strategy simply allows you to take advantage of natural partitions in your data to direct reads to particular instances where the system will already have those datasets in memory. This approach is significantly beneficial when your total active dataset is much larger than can fit in memory in any particular instance.
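
One way to picture this strategy is a router that always sends reads for the same partition to the same instance, so that instance keeps the relevant data warm in memory. The sketch below assumes a string partition key (for example a region or customer identifier) and uses a stable hash to choose an instance. The function name, the key, and the instance names are all hypothetical, and a real deployment might prefer an explicit mapping instead of a hash.

```python
import hashlib


def instance_for(partition_key, instances):
    """Map a partition key to one instance, the same one on every call."""
    # hashlib gives a hash that is stable across processes, unlike hash().
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


# Hypothetical instance names, used only for this example.
instances = ["neo4j-a", "neo4j-b", "neo4j-c"]
first = instance_for("customer-42", instances)
second = instance_for("customer-42", instances)
print(first == second)  # True: reads for one partition stay on one instance
```

Note that adding or removing an instance changes the mapping for many keys in this simple scheme, which would cool the caches until they warm up again.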

Figure 1.3. Cache-based partitioning
1.2.2.1.2.1. When would you choose this method?
  1. Your total active data set is larger than can reasonably be expected to fit in memory in any single instance in your cluster.
  2. A natural or obvious partition can be identified in your dataset
  3. You have the application and operations ability to direct which instances are read from.

1.2.2.2. High availability

Figure 1.4. High availability cluster

A significant and fundamental functional requirement for any service or application is its overall availability. Very often this requirement is determined by the demands of the users, the times they interact with the solution, the impact downtime would have on the ability of the business or the users to do their work, or the financial impact of a system failure. These are not always customer-facing solutions and can be critical internal systems.

Availability can often be addressed with various strategies for recovery or mirroring. However, Neo4j’s clustering architecture is an automated solution for ensuring Neo4j is consistently available to your application and end-users.

How do you know if you need Neo4j’s clustering for high availability reasons?
  1. Neo4j is serving data for a critical business or consumer-facing solution that would impact the ability for the company to conduct business if the component were down.
  2. Global end-users with random access behavior are depending on the data stored in Neo4j.
  3. Business continuity must be ensured by availability of disaster recovery features.

1.2.2.3. Disaster recovery

Disaster recovery, in general terms, defines your ability to recover from major outages of your services. The most common example is whole-datacenter outages where many services are disrupted. In these cases a disaster recovery strategy can define a failover datacenter along with a strategy for bringing services back online.

Neo4j clustering can accommodate disaster recovery strategies that require very short windows of downtime or have low tolerance for data loss in disaster scenarios. By deploying a cluster instance to an alternate location, you have an active copy of your database up and available in your designated disaster recovery location that continuously keeps up with the transactions against your database.

Why would you choose Clustering in support of Disaster Recovery?
  1. Minimize downtime: Your application availability demands are very high and you cannot sustain significant periods of downtime.
  2. Require real-time: You already employ a disaster recovery strategy for other application or service components that are near real-time.
  3. Minimize data loss: You have a significantly large database that changes frequently and have low tolerance for data loss in a disaster scenario.

1.2.2.4. Analytics

Your application needs to access data for its own purposes. It reads data, writes data, and generally keeps your application service or end-users happy. Then comes the analytics team that wants to collect and aggregate data for their reports. Next thing you know, you have a set of long-running compute queries running against your production databases and disrupting your service or your end-users' happiness.

You can’t avoid servicing the needs of the analytics requests, but you can box in the impact their queries have on your service. Neo4j clustering can be used to include separate instances entirely in support of query analytics, either from end users or from BI tools. Using clustering means the data is always up to date for analytics queries as well.

When would you decide to use clustering to support analytics needs?
  1. You have regular BI users that consistently need to run analytics against the most recent versions of the data
  2. Your analytics includes queries that aggregate over large or entire sets of data
  3. Your analytics processes include complex compute algorithms for predictive or modeling purposes