7.1. Introduction

This section introduces Neo4j Fabric.

This section describes the following:

  • Overview
  • Fabric concepts
  • Deployment examples

7.1.1. Overview

Fabric, introduced in Neo4j 4.0, is a way to store and retrieve data in multiple databases, whether they are on the same Neo4j DBMS or in multiple DBMSs, using a single Cypher query. Fabric achieves a number of desirable objectives:

  • a unified view of local and distributed data, accessible via a single client connection and user session
  • increased scalability for read/write operations, data volume and concurrency
  • predictable response time for queries executed during normal operations, a failover, or other infrastructure changes
  • high availability and no single point of failure for large data volumes

In practical terms, Fabric provides the infrastructure and tooling for:

  • Data Federation: the ability to access data available in distributed sources in the form of disjoint graphs.
  • Data Sharding: the ability to access data available in distributed sources in the form of a common graph partitioned across multiple databases.

With Fabric, a Cypher query can store and retrieve data in multiple federated and sharded graphs.
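As a minimal sketch of such a query, assume a Fabric database named example whose graphs each hold Movie nodes with a title property (the database name, label, and property are hypothetical and chosen only for illustration). The built-in functions graphIds() and graph() let a single query visit every graph configured in the Fabric database:

```cypher
// Assumption: the Fabric database is named "example" and every graph
// contains nodes labelled Movie with a title property.
UNWIND example.graphIds() AS graphId
CALL {
  USE example.graph(graphId)   // route this subquery to one graph
  MATCH (movie:Movie)
  RETURN movie.title AS title
}
RETURN title
```

The subquery runs once per graph ID, and the outer RETURN collects the rows from all graphs into one result stream.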

7.1.2. Fabric concepts

7.1.2.1. The Fabric database

A Fabric setup includes a Fabric database, which acts as the entry point to a federated or sharded graph infrastructure. In Fabric, this database is also referred to as the virtual database. Drivers and client applications access and use the Fabric database like any other Neo4j database, with one exception: it cannot store any data, and queries against it don't return any data. The Fabric database can be configured only on a standalone Neo4j DBMS, i.e. a Neo4j DBMS where the configuration setting dbms.mode is set to SINGLE.

7.1.2.2. Fabric graphs

In a Fabric database, data is organized in the form of graphs. Client applications see graphs as local logical structures, while the data is physically stored in one or more databases. Databases configured as Fabric graphs may be local, i.e. in the same Neo4j DBMS, or they may be located in external Neo4j DBMSs. Client applications can also access these databases through regular local connections to their respective Neo4j DBMSs.
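For illustration, a client connected to the Fabric database selects a graph with a USE clause. The sketch below assumes a Fabric database named example with two graphs named graphA and graphB; these names, along with the Person label and name property, are hypothetical.

```cypher
// Assumption: "example" is the Fabric database; graphA and graphB are
// Fabric graphs that both contain Person nodes with a name property.
USE example.graphA
MATCH (p:Person)
RETURN p.name AS name
  UNION
USE example.graphB
MATCH (p:Person)
RETURN p.name AS name
```

UNION removes duplicate rows; UNION ALL would keep them. A graph can also be addressed by its numeric ID, for example USE example.graph(0).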

7.1.3. Deployment examples

Fabric is a versatile environment that provides scalability and availability with no single point of failure in various topologies. Applications can run against a standalone DBMS as well as against a complex, widely distributed infrastructure without any change to the queries that access the Fabric graphs.

7.1.3.1. Development deployment

In its simplest deployment, Fabric can be used on a single instance, where Fabric graphs are associated with local databases. This approach is commonly used by software developers to create applications that will be deployed on multiple Neo4j DBMSs, or by power users who intend to execute Cypher queries against local disjoint graphs.

Figure 7.1. Fabric deployment in a single instance

7.1.3.2. Cluster deployment with no single point of failure

In this deployment, Fabric guarantees highly available access to disjoint graphs with no single point of failure. Availability is achieved by creating redundant entry points for the Fabric database (i.e. two standalone Neo4j DBMSs with the same Fabric configuration) and a Causal Cluster of at least three members for data storage and retrieval. This approach is suitable for production environments, and it can be used by power users who intend to execute Cypher queries against disjoint graphs.

Figure 7.2. Fabric deployment with no single point of failure

7.1.3.3. Multi-cluster deployment

In this deployment, Fabric provides high scalability and availability with no single point of failure. Disjoint clusters can be sized according to the expected workload, and databases may be colocated in the same cluster or hosted in their own clusters to provide higher throughput. This approach is suitable for production environments where databases can be sharded, federated, or a combination of the two.

Figure 7.3. Fabric deployment for scalability with no single point of failure