The Internet-Scale, Native Graph Database
Neo4j equally exploits both data relationships and data elements, empowering the next generation of breakthrough applications.
Yesterday's breakthrough applications were driven by big data – tomorrow's breakthrough applications will be driven by connected data. No longer powered merely by data transactions, these applications draw together every system across the entire enterprise.
These networks of related data are known as graphs.
As a native graph database, Neo4j is specifically optimized to store and traverse these graphs of connected data. By intuitively mapping data points and the connections between them, Neo4j powers intelligent, real-time applications that tackle today's toughest enterprise challenges.
The Impact of Native Graph Technology
To truly harness the power of connected data, a graph database must be engineered from top to bottom to handle data relationships. Only native graph technology delivers the scalability, reliability and performance required by an always-on, mission-critical application.
From its inception, Neo4j defined the benchmark for native graph databases, and has continued to do so as the technology evolves.
What Makes a Graph Database Native? 7 Essentials
Property graph data model
The labeled property graph model – pioneered by the Neo4j team – intuitively maps the data model between whiteboard and keyboard.
Graph-specific visualization and tooling
The Neo4j Browser allows you to visualize your connected data, simplifies your Cypher commands and offers query development tools beyond the command line.
Graph-specific processing engine
For enterprise-grade performance, a native graph database must offer compiled graph queries, graph query planning, graph-specific APIs, native application drivers, graph-specific cost-based optimizers and high performance caching.
Graph-specific scalability features
Neo4j includes off-heap memory management, Causal Clustering that optimizes for both read-only access and read/write access, high availability (HA), disaster recovery and multi-data center support.
The Labeled Property Graph Model
Nodes
- Nodes are the main data elements
- Nodes are connected to other nodes via relationships
- Nodes can have one or more properties (i.e., attributes stored as key/value pairs)
- Nodes can have one or more labels that describe their role in the graph
- Example: Person nodes vs. Car nodes
Relationships
- Relationships connect two nodes
- Relationships are directional
- Nodes can have multiple, even recursive, relationships
- Relationships can have one or more properties (i.e., attributes stored as key/value pairs)
Properties
- Properties are named values where the name (or key) is a string
- Properties can be indexed and constrained
- Composite indexes can be created from multiple properties
Labels
- Labels are used to group nodes into sets
- A node may have multiple labels
- Labels are indexed to accelerate finding nodes in the graph
- Native label indexes are optimized for speed
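The node, relationship, property and label rules listed above can be sketched as a small in-memory structure. This is only an illustration of the data model, not Neo4j's API or storage format: the `Graph` class, its `add_node`, `relate` and `nodes_with_label` methods, and the sample data (Ann, Dan, a Volvo, the `DRIVES` and `KNOWS` relationship types) are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Node:
    labels: Set[str]                                   # a node may carry several labels
    properties: Dict[str, Any] = field(default_factory=dict)  # key/value pairs, string keys


@dataclass
class Relationship:
    start: Node                                        # relationships are directional:
    end: Node                                          # they point from start to end
    type: str
    properties: Dict[str, Any] = field(default_factory=dict)


class Graph:
    """Hypothetical in-memory property graph (illustration only)."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []
        self.relationships: List[Relationship] = []
        self._by_label: Dict[str, List[Node]] = {}     # simple label index

    def add_node(self, *labels: str, **properties: Any) -> Node:
        node = Node(set(labels), dict(properties))
        self.nodes.append(node)
        for label in node.labels:                      # index the node under each label
            self._by_label.setdefault(label, []).append(node)
        return node

    def relate(self, start: Node, rel_type: str, end: Node, **properties: Any) -> Relationship:
        rel = Relationship(start, end, rel_type, dict(properties))
        self.relationships.append(rel)
        return rel

    def nodes_with_label(self, label: str) -> List[Node]:
        # Label lookup goes through the index instead of scanning every node.
        return list(self._by_label.get(label, []))


graph = Graph()
ann = graph.add_node("Person", "Employee", name="Ann")   # two labels on one node
dan = graph.add_node("Person", name="Dan")
car = graph.add_node("Car", brand="Volvo")
graph.relate(ann, "DRIVES", car, since=2015)             # relationship with a property
graph.relate(ann, "KNOWS", dan)

print([n.properties["name"] for n in graph.nodes_with_label("Person")])  # ['Ann', 'Dan']
```

Keeping labels in their own index mirrors the point made in the list: labels group nodes into sets, so finding all Person nodes does not require touching Car nodes.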
The Cypher Graph Query Language
- Cypher is a declarative graph query language that is intuitive and human-readable
- It is inspired by SQL and borrows its pattern matching from SPARQL
- Cypher describes nodes, relationships and properties as ASCII art directly in the language, making queries easy to both read and recognize as part of the graph
- Because Cypher is highly legible, queries are easy to maintain, which in turn simplifies application maintenance
- Through the openCypher project, Cypher is rapidly becoming the standard and vendor-neutral language for graph technology
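To make the "ASCII art" point concrete, here is a minimal sketch of a Cypher query held as a string in application code. The Person and Car labels come from the example earlier on this page; the `DRIVES` relationship type, the `name` and `brand` properties, and the `cars_driven_by` helper are assumptions made for illustration. The `session` argument is assumed to behave like a session from the official Neo4j Python driver, whose `run` method accepts the query text plus keyword parameters.

```python
# Nodes are drawn in parentheses, relationships as arrows with square brackets:
# (p:Person)-[:DRIVES]->(c:Car) reads "a Person node that DRIVES a Car node".
FIND_CARS_DRIVEN_BY = """
MATCH (p:Person {name: $name})-[:DRIVES]->(c:Car)
RETURN c.brand AS brand
"""


def cars_driven_by(session, name):
    """Return the brands of all cars driven by the named person.

    `session` is any object with a driver-style run(query, **parameters)
    method that yields records indexable by column name (an assumption).
    """
    result = session.run(FIND_CARS_DRIVEN_BY, name=name)
    return [record["brand"] for record in result]
```

Passing the name as the `$name` parameter rather than formatting it into the string keeps the query text constant and avoids injection problems.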
There's no denying that other data stores have their appropriate use cases. But whenever your enterprise wants to leverage the connections between data points, you need to tap into the power of Neo4j.
Here's how the world's leading graph database stacks up against traditional relational databases (RDBMS) and other competing NoSQL data stores:
|Category||Relational Database||Neo4j, Native Graph Database|
|Data Storage||Storage in fixed, pre-defined tables with rows and columns with connected data often disjointed between tables, crippling query efficiency.||Graph storage structure with index-free adjacency results in faster transactions and processing for data relationships.|
|Data Modeling||Database model must be developed with modelers and translated from a logical model to a physical one. Since data types and sources must be known ahead of time, any changes require weeks of downtime for implementation.||Flexible, "whiteboard-friendly" data model with no mismatch between logical and physical model. Data types and sources can be added or changed at any time, leading to dramatically shorter development times and true agile iteration.|
|Query Performance||Data processing performance suffers with the number and depth of JOINs (or relationships queried).||Graph processing delivers low-latency, real-time performance, regardless of the number or depth of relationships.|
|Query Language||SQL: A query language that increases in complexity with the number of JOINs needed for connected data queries.||Cypher: A native graph query language that provides the most efficient and expressive way to describe relationship queries.|
|Transaction Support||ACID transaction support required by enterprise applications for consistent and reliable data.||Retains ACID transactions for fully consistent and reliable data around the clock – perfect for always-on global enterprise applications.|
|Processing at Scale||Scales out through replication; scale up architecture is possible but costly. Complex data relationships are not harvested at scale.||Graph model inherently scales for pattern-based queries. Scale out architecture maintains data integrity via replication. Massive scale up possibilities with IBM POWER8 and CAPI Flash systems.|
|Data Center Efficiency||Server consolidation is possible but costly for scale up architecture. Scale out architecture is expensive in terms of purchase, energy use and management time.||Data and relationships are stored natively together with performance improving as complexity and scale grow. This leads to server consolidation and incredibly efficient use of hardware.|
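The contrast between JOIN-based lookups and index-free adjacency in the table above can be sketched as follows. This is a toy illustration under stated assumptions, not a benchmark and not Neo4j's storage engine: the `FRIENDSHIPS` rows, the `Node` class and the `reachable` function are hypothetical. In the join-table version every hop rescans the relationship rows; in the adjacency version each node holds direct references to its neighbours, so a hop only touches that node's own relationships.

```python
from collections import deque

# Relational style: relationships live in a separate join table, and every
# hop filters that whole table (or probes an index built over it).
FRIENDSHIPS = [("ann", "bob"), ("bob", "cid"), ("cid", "dee")]


def friends_via_join(person):
    return [b for a, b in FRIENDSHIPS if a == person]


# Graph style: each node stores direct references to its adjacent nodes.
class Node:
    def __init__(self, name):
        self.name = name
        self.out = []          # outgoing neighbours, followed without any lookup table


def reachable(start, max_depth):
    """Breadth-first traversal that only follows node-to-node references."""
    seen = {start.name}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:  # do not expand beyond the requested depth
            continue
        for neighbour in node.out:
            if neighbour.name not in seen:
                seen.add(neighbour.name)
                found.append(neighbour.name)
                queue.append((neighbour, depth + 1))
    return found


nodes = {name: Node(name) for name in ("ann", "bob", "cid", "dee")}
for a, b in FRIENDSHIPS:
    nodes[a].out.append(nodes[b])

print(friends_via_join("ann"))       # ['bob']
print(reachable(nodes["ann"], 2))    # ['bob', 'cid']
```

The work done by `reachable` depends on how many neighbours the visited nodes have, not on the total number of relationships stored, which is the property the table attributes to native graph storage.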
|Category||Other NoSQL Databases||Neo4j, Native Graph Database|
|Data Storage||No support for connected data at the database level. Performance and data trustability degrade with scale and complexity of connections.||Native graph storage structure with index-free adjacency results in faster transactions and processing for data relationships.|
|Data Modeling||Data model not suitable for enterprise architectures as wide columns and document stores do not offer control at the design level. Puts undue pressure on the application level to catch and solve problems.||Flexible, "whiteboard-friendly" data model allows for fine-grained control of data architecture. Intuitive data model eases communication between developers, architects and DBAs.|
|Query Performance||No graph processing capability for data relationships, thus all relationships have to be created at the application level.||Native graph processing delivers low-latency, real-time performance, regardless of the number or depth of relationships.|
|Query Language||Query language varies, but no query constructs exist to express data relationships.||Cypher: A native graph query language that provides the most efficient and expressive way to describe relationship queries.|
|Transaction Support||BASE transactions lead to data corruption because basic availability and eventual consistency are unreliable for data relationships.||ACID transactions ensure data is fully consistent and reliable around the clock – perfect for always-on global enterprise applications.|
|Processing at Scale||Optimized for ingesting data but not reading data at scale. Scalability depends on scale out architecture that does not protect the integrity of graph-like data, so data is not trustworthy.||Native graph model inherently scales for pattern-based queries. Scale out architecture maintains data integrity via replication. Massive scale up possibilities with IBM POWER8 and CAPI Flash systems.|
|Data Center Efficiency||Scale out architecture assumes ongoing access to more and more commodity hardware without accounting for energy costs, network vulnerabilities and other risks.||Data and relationships are stored natively together with performance improving as complexity and scale grow. This leads to server consolidation and incredibly efficient use of hardware.|