The Internet-Scale, Native Graph Database

Neo4j equally exploits both data relationships and data elements, empowering the next generation of breakthrough applications.

Connections Are the Future

Yesterday's breakthrough applications were driven by big data – tomorrow's breakthrough applications will be driven by connected data. No longer powered merely by data transactions, these applications draw together every system across the entire enterprise.

These networks of related data are known as graphs.

[Figure: Neo4j network graphs diagram]

As a native graph database, Neo4j is specifically optimized to store and traverse these graphs of connected data. By intuitively mapping data points and the connections between them, Neo4j powers intelligent, real-time applications that tackle today's toughest enterprise challenges.

The Impact of Native Graph Technology

In order to truly harness the power of connected data, a graph database must be engineered from top to bottom to handle data relationships. Only native graph technology handles the scalability, reliability and performance required by an always-on, mission-critical application.

From its inception, Neo4j defined the benchmark for native graph databases, and has continued to do so as the technology evolves.

The Native Graph Advantage

Understand "The Native Graph Advantage" with Jim Webber, Neo4j Chief Scientist, who discusses native graph technology and why it matters.

What Makes a Graph Database Native? 7 Essentials

Property graph model

The labeled property graph model – pioneered by the Neo4j team – intuitively maps the data model between whiteboard and keyboard.

Graph-specific query language

Cypher is a human-readable query language designed specifically for handling connected graph data. It is becoming a vendor-neutral graph query language via the openCypher project.

Graph-specific visualization and tooling

The Neo4j Browser allows you to visualize your connected data, simplifies your Cypher commands and offers query development tools beyond the command line.

Graph-specific storage engine

Neo4j is designed from top to bottom to store connected data without an over-reliance on indexes (index-free adjacency), and to protect data integrity with ACID-compliant transactions.
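
To make index-free adjacency more concrete, here is a minimal Python sketch. It is not Neo4j's actual storage format. It assumes a toy in-memory layout in which every node record holds direct references to its outgoing relationship records, so a traversal follows references instead of consulting a global index. The names `NodeRecord`, `RelRecord` and `neighbors` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RelRecord:
    """A directed relationship that points straight at its end node."""
    rel_type: str
    end: "NodeRecord"


@dataclass
class NodeRecord:
    """A node that stores its own outgoing relationships (no global index)."""
    name: str
    outgoing: list = field(default_factory=list)

    def connect(self, rel_type, other):
        # Attach the relationship directly to this node record.
        self.outgoing.append(RelRecord(rel_type, other))


def neighbors(node, rel_type):
    """Follow direct references; the work depends only on this node's degree."""
    return [rel.end.name for rel in node.outgoing if rel.rel_type == rel_type]


# Toy data (assumed for illustration).
alice = NodeRecord("Alice")
bob = NodeRecord("Bob")
carol = NodeRecord("Carol")
alice.connect("KNOWS", bob)
alice.connect("KNOWS", carol)
bob.connect("KNOWS", carol)

print(neighbors(alice, "KNOWS"))  # ['Bob', 'Carol']
```

In this sketch, finding Alice's neighbors touches only Alice's own relationship list, regardless of how many other nodes exist in the graph.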

Graph-specific processing engine

For enterprise-grade performance, a native graph database must offer compiled graph queries, graph query planning, graph-specific APIs, native application drivers, graph-specific cost-based optimizers and high performance caching.

Graph-specific scalability

Neo4j includes off-heap memory management, Causal Clustering that optimizes for both read-only access and read/write access, high availability (HA), disaster recovery and multi-data center support.

Graph-specific enterprise user security

Native graph technology must offer a fundamentally robust database architecture that includes user- and role-based security, LDAP and Active Directory integration, Kerberos authentication, cloud and on-premises deployment options, and both open source as well as commercial licensing.

Neo4j Basics

The Labeled Property Graph Model

  • Nodes

    • Nodes are the main data elements
    • Nodes are connected to other nodes via relationships
    • Nodes can have one or more properties (i.e., attributes stored as key/value pairs)
    • Nodes can have one or more labels that describe their roles in the graph
    • Example: Person nodes vs Car nodes
  • Relationships

    • Relationships connect two nodes
    • Relationships are directional
    • Nodes can have multiple, even recursive relationships
    • Relationships can have one or more properties (i.e., attributes stored as key/value pairs)
  • Properties

    • Properties are named values where the name (or key) is a string
    • Properties can be indexed and constrained
    • Composite indexes can be created from multiple properties
  • Labels

    • Labels are used to group nodes into sets
    • A node may have multiple labels
    • Labels are indexed to accelerate finding nodes in the graph
    • Native label indexes are optimized for speed
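
The building blocks listed above can be sketched in a few lines of Python. This is an illustrative in-memory model only, not how Neo4j stores or exposes data; the class names `Node` and `Relationship`, the helper `nodes_with_label`, and the sample people and car are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                      # e.g. {"Person"}
    properties: dict = field(default_factory=dict)   # key/value pairs


@dataclass
class Relationship:
    start: Node                                      # relationships are directional:
    end: Node                                        # they run from start to end
    rel_type: str
    properties: dict = field(default_factory=dict)   # relationships carry properties too


def nodes_with_label(nodes, label):
    """Group nodes into a set by label, as a label lookup would."""
    return [n for n in nodes if label in n.labels]


# Person nodes vs Car nodes; a node may carry several labels.
ann = Node({"Person", "Employee"}, {"name": "Ann"})
dan = Node({"Person"}, {"name": "Dan"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

relationships = [
    Relationship(dan, car, "DRIVES", {"since": 2015}),
    Relationship(ann, car, "OWNS"),
    Relationship(ann, dan, "KNOWS"),
]

people = nodes_with_label([ann, dan, car], "Person")
print([p.properties["name"] for p in people])  # ['Ann', 'Dan']
```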

The Cypher Graph Query Language

  • Cypher is a declarative graph query language that is intuitive and human-readable
  • It is inspired by SQL, with pattern matching borrowed from SPARQL
  • Cypher describes nodes, relationships and properties as ASCII art directly in the language, making queries easy to both read and recognize as part of the graph
  • Since it is highly legible, Cypher is easy to maintain, simplifying application maintenance as a result
  • Through the openCypher project, Cypher is rapidly becoming the standard and vendor-neutral language for graph technology
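
As a rough illustration of the ASCII-art style, the Python sketch below holds a small Cypher query as a string (it is not executed against a database) and then evaluates the same pattern by hand over a toy in-memory graph. The labels `Person` and `Car`, the relationship type `DRIVES`, the sample data and the helper `match_drivers` are assumptions made up for this example.

```python
# Illustrative Cypher text only; it is not sent to a database here.
# Parentheses denote nodes, and -[...]-> denotes a directed relationship.
CYPHER_QUERY = """
MATCH (p:Person)-[:DRIVES]->(c:Car)
WHERE c.brand = 'Volvo'
RETURN p.name
"""

# Toy graph: node id -> (labels, properties)
nodes = {
    1: ({"Person"}, {"name": "Dan"}),
    2: ({"Person"}, {"name": "Ann"}),
    3: ({"Car"}, {"brand": "Volvo"}),
    4: ({"Car"}, {"brand": "Saab"}),
}
# Directed relationships: (start id, type, end id)
relationships = [(1, "DRIVES", 3), (2, "DRIVES", 4), (2, "OWNS", 3)]


def match_drivers(brand):
    """Hand-written equivalent of the pattern (p:Person)-[:DRIVES]->(c:Car)."""
    names = []
    for start, rel_type, end in relationships:
        p_labels, p_props = nodes[start]
        c_labels, c_props = nodes[end]
        if (rel_type == "DRIVES" and "Person" in p_labels
                and "Car" in c_labels and c_props.get("brand") == brand):
            names.append(p_props["name"])
    return names


print(match_drivers("Volvo"))  # ['Dan']
```

The three-line query states what to find, while the Python function spells out how to find it; that gap is what a declarative pattern language removes.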

How Neo4j Compares

Neo4j vs. RDBMS and other NoSQL

There's no denying it: other data stores have their appropriate use cases. But whenever your enterprise wants to leverage the connections between data points, you need to tap into the power of Neo4j.

Here's how the world's leading graph database stacks up against traditional relational databases (RDBMS) and other competing NoSQL data stores:

Neo4j vs. Relational Databases (RDBMS)

| Category | Relational Database | Neo4j, Native Graph Database |
|---|---|---|
| Data Storage | Storage in fixed, pre-defined tables with rows and columns with connected data often disjointed between tables, crippling query efficiency. | Graph storage structure with index-free adjacency results in faster transactions and processing for data relationships. |
| Data Modeling | Database model must be developed with modelers and translated from a logical model to a physical one. Since data types and sources must be known ahead of time, any changes require weeks of downtime for implementation. | Flexible, "whiteboard-friendly" data model with no mismatch between logical and physical model. Data types and sources can be added or changed at any time, leading to dramatically shorter development times and true agile iteration. |
| Query Performance | Data processing performance suffers with the number and depth of JOINs (or relationships queried). | Graph processing ensures zero latency and real-time performance, regardless of the number or depth of relationships. |
| Query Language | SQL: A query language that increases in complexity with the number of JOINs needed for connected data queries. | Cypher: A native graph query language that provides the most efficient and expressive way to describe relationship queries. |
| Transaction Support | ACID transaction support required by enterprise applications for consistent and reliable data. | Retains ACID transactions for fully consistent and reliable data around the clock – perfect for always-on global enterprise applications. |
| Processing at Scale | Scales out through replication and scale up architecture is possible but costly. Complex data relationships are not harvested at scale. | Graph model inherently scales for pattern-based queries. Scale out architecture maintains data integrity via replication. Massive scale up possibilities with IBM POWER8 and CAPI Flash systems. |
| Data Center Efficiency | Server consolidation is possible but costly for scale up architecture. Scale out architecture is expensive in terms of purchase, energy use and management time. | Data and relationships are stored natively together with performance improving as complexity and scale grow. This leads to server consolidation and incredibly efficient use of hardware. |

Neo4j vs. Other NoSQL Databases

| Category | Other NoSQL Databases | Neo4j, Native Graph Database |
|---|---|---|
| Data Storage | No support for connected data at the database level. Performance and data trustability degrade with scale and complexity of connections. | Native graph storage structure with index-free adjacency results in faster transactions and processing for data relationships. |
| Data Modeling | Data model not suitable for enterprise architectures as wide columns and document stores do not offer control at the design level. Puts undue pressure on the application level to catch and solve problems. | Flexible, "whiteboard-friendly" data model allows for fine-grained control of data architecture. Intuitive data model eases communication between developers, architects and DBAs. |
| Query Performance | No graph processing capability for data relationships, thus all relationships have to be created at the application level. | Native graph processing ensures zero latency and real-time performance, regardless of the number or depth of relationships. |
| Query Language | Query language varies, but no query constructs exist to express data relationships. | Cypher: A native graph query language that provides the most efficient and expressive way to describe relationship queries. |
| Transaction Support | BASE transactions lead to data corruption because basic availability and eventual consistency are unreliable for data relationships. | ACID transactions ensure data is fully consistent and reliable around the clock – perfect for always-on global enterprise applications. |
| Processing at Scale | Optimized for ingesting data but not reading data at scale. Scalability depends on scale out architecture that does not protect the integrity of graph-like data, so data is not trustworthy. | Native graph model inherently scales for pattern-based queries. Scale out architecture maintains data integrity via replication. Massive scale up possibilities with IBM POWER8 and CAPI Flash systems. |
| Data Center Efficiency | Scale out architecture assumes ongoing access to more and more commodity hardware without accounting for energy costs, network vulnerabilities and other risks. | Data and relationships are stored natively together with performance improving as complexity and scale grow. This leads to server consolidation and incredibly efficient use of hardware. |

Neo4j Graph Database Features

* Enterprise Edition

  • ACID for Data Integrity

    Neo4j is a fully ACID transactional database which ensures the integrity of your data at all times. Unlike in other NoSQL databases, data reliability is a primary design consideration for Neo4j.
  • Flexible Schema

    The labeled property graph model captures data as it naturally occurs, eliminating the need to translate a whiteboard model into tables, columns, documents or triples – and eradicating future schema migrations. Instead, developers enjoy the flexibility to add or remove properties as business requirements change, with optional schema constraints for enterprise governance or rules enforcement.
  • High-Performance Query Execution

    Querying connected data presents new opportunities to query relationship information in real-time applications. As a native graph database, Neo4j offers index-free adjacency, which lets it traverse millions of data connections per second (per core). As a result, traversal cost depends on the portion of the graph a query touches rather than on the total volume of your dataset.
  • Cypher Query Language

    Cypher is a declarative graph query language that naturally describes graph patterns. It is intuitive to both read and learn, and requires 10-100x less code than SQL. Its natural pattern-matching ability means you no longer need to debug nested JOINs. Through the openCypher project, Cypher will become the de facto language for graph technology across the industry.
  • Scale and Performance

    Neo4j lets you scale across every key dimension: volume, reads, writes and locations – all while providing blazing-fast queries, consistent response times and rock-solid data integrity. Neo4j also offers support for replication with master re-election and failover to keep your data safe and reliable.
  • Advanced Causal Clustering*

    Neo4j supports scalability across global data centers through its proprietary Causal Clustering architecture. This Raft-based architecture lets you scale read/write Core Servers independently of Read Replica servers, allowing your internet-scale application to perform well for a global audience.
  • Built-in Tooling & Visualization

    The Neo4j Browser allows developers to query and visualize connected data. Visualization is key to discovering patterns in your graph data, and those patterns can then be saved and reused as Cypher queries, all within the Browser experience. In addition, query profiling and planning tools allow you to fine-tune queries before deploying to production.
  • Drivers for Popular Languages & Frameworks

    Neo4j offers officially supported drivers for Java, C#, Python and JavaScript, as well as community drivers for Ruby, PHP, R, Go and others. The Neo4j community also supports popular frameworks like Spring Data, Django ORM, Laravel, JDBC and more. There are also integrations for other databases and analytics tools, like MongoDB, Cassandra, Elasticsearch and Spark/GraphX.
  • Seamless Data Import

  • Cloud-Ready Deployment

    Neo4j has always been available for on-premises deployment, but many now use Neo4j in cloud environments like Amazon Web Services (AWS), Microsoft Azure and Google Cloud Platform. No matter your preferred platform, fully hosted offerings are available through the Neo4j partner ecosystem. In addition, our official Docker image simplifies automation and deployment, making it easy to get up and running with a single instance or a full HA cluster.
  • Elastic Scalability*

    Neo4j clustering provides scale-out capabilities for reads, letting you spread out your graph in memory, while ensuring each instance is able to get to any node or relationship using its own local copy. This allows for blazing speed even as your graph dataset grows, all while providing high availability via a replication protocol. Massive scale up architecture is also possible with Neo4j on IBM POWER8 with CAPI Flash.
  • In-Memory Page Cache*

    Neo4j Enterprise Edition includes an in-memory page cache that is separate from traditional JVM-based caching strategies. Caching can also be location or data center specific.
  • Hot Backups*

    Neo4j Enterprise Edition allows you to take hot, point-in-time backups while your graph database is still running. Your application can keep running 24/7 without compromising the availability or quality of your backups.

Ready to Get Started?

Your enterprise is driven by connections – now it's time for your database to do the same. Download Neo4j and dive in for yourself, or download the white paper to learn how today's leading enterprises are using Neo4j to achieve sustainable competitive advantage.