So you’ve heard about graph database technology and you want to know what all the buzz is about.
It’s easy to take the perspective of a cynic: They’re just another passing trend – here today, gone tomorrow – right? Isn’t that the way of all tech buzzwords?
Feel free to be suspicious – skeptical even – but leave your cynicism at home. Instead, I’m inviting you on an adventure: a new way of seeing the world.
The graph paradigm goes well beyond databases and application development; it’s a reimagining of what’s possible around the idea of connections. And just like any new problem-solving framework, approaching a challenge from a different dimension often produces an orders-of-magnitude change in possible solutions.
All that to say: Graph technology is a rising tide your development team – and your business – can’t afford to pass up. Graph databases are the future, and even if you’re just a beginner, it’s never too late to get started. Let’s dive in.
In this Graph Databases for Beginners blog series, I’ll take you through the basics of graph technology assuming you have little (or no) background in the space. This week, I’ll walk you through an introduction to graph databases, with basic definitions and why those distinctions matter.
Why You Should Care about Graph Database Technology
When you’re on your own, new tech might be fun to play around with or to use on a personal side project, but when you’re at work, it’s a whole different story.
Professionally, you have to operate in a world of budgets, timelines, corporate standards and competitors. And in that world, the only test for new tech is that it better work damn well (and way better than anything else you already have on hand).
Graph databases fit that bill, and here’s why:
Your data volume will definitely increase in the future, but what’s going to increase at an even faster clip are the connections (or relationships) between your data. Big data will get bigger, and the relationships within it will grow faster still.
With traditional databases, relationship queries come to a grinding halt as the number and depth of relationships increase. In contrast, graph databases scale with your data and your business needs in real-world situations, minimizing cost and hardware while maximizing performance across connected datasets.
With graph databases, your IT and data architecture teams move at the speed of business because the structure and schema of a graph data model flex as your solutions and industry change. Your team doesn’t have to exhaustively model your domain ahead of time (and then exhaustively remodel and migrate the DB after some exec asks for a change); instead, you can add to the existing structure without endangering current functionality.
With the graph database model, you are the one dictating changes and taking charge, whereas the RDBMS data model dictates its requirements to you, forcing you to adapt to its tabular way of seeing the world.
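To make the "add to the existing structure" idea concrete, here’s a minimal Python sketch. It is not how any particular graph database is implemented; the in-memory `nodes` and `relationships` structures, the `neighbors` helper, and the "Acme" company are all hypothetical, made up purely for illustration. The point is only that a new kind of node and a new kind of relationship can be added without touching what’s already there.

```python
# A tiny in-memory stand-in for a graph (an assumption for illustration,
# not a real graph database): nodes carry a label and free-form properties.
nodes = {
    "peter": {"label": "User", "name": "Peter"},
    "emil": {"label": "User", "name": "Emil"},
}
# Each relationship is (start node id, relationship type, end node id).
relationships = [("peter", "FOLLOWS", "emil")]


def neighbors(node_id, rel_type):
    """Return ids of nodes reached from node_id via relationships of rel_type."""
    return [end for start, kind, end in relationships
            if start == node_id and kind == rel_type]


before = neighbors("peter", "FOLLOWS")

# New requirement arrives: record where a user works.
# We add a new node label and a new relationship type; nothing is migrated.
nodes["acme"] = {"label": "Company", "name": "Acme"}
relationships.append(("peter", "WORKS_AT", "acme"))

# The existing query still returns exactly what it did before the change.
assert neighbors("peter", "FOLLOWS") == before
print(neighbors("peter", "WORKS_AT"))  # ['acme']
```

In a relational schema, the same change would typically mean a new table plus a foreign key or join table, and possibly a migration, before the first row could be stored.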
Deploying graph databases, on-premise or in the cloud, aligns perfectly with today’s agile, test-driven development practices, allowing your graph-database-backed application to evolve with your changing business requirements.
Your agile team now has a database that keeps up with your daily demands.
What Is a Graph Database? (a Non-Technical Definition)
You don’t need to understand the arcane mathematical wizardry of graph theory in order to understand graph database technology. On the contrary, graph databases are more intuitive to understand than relational database management systems (RDBMS).
A graph is composed of two kinds of elements: nodes and relationships.
Each node represents an entity (a person, place, thing, category or other piece of data), and each relationship represents how two nodes are associated. For example, the two nodes “cake” and “dessert” would have the relationship “is a type of” pointing from “cake” to “dessert.”
Consider another example: Twitter is, at its core, a graph connecting hundreds of millions of monthly active users.
In the illustration below, we have a small slice of Twitter users represented in a graph database. Each node (labeled User) represents a single person, and the relationships between nodes describe who follows whom. As we see below, Peter and Emil follow each other, as do Emil and Johan, but although Johan follows Peter, Peter hasn’t (yet) reciprocated.
Twitter users represented in a graph database model.
If this example makes sense to you, then you’ve already grasped the basics of what makes up a graph database.
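If you prefer reading code to reading diagrams, here’s the same slice of Twitter as a rough Python sketch. The `Relationship` class, the `FOLLOWS` type name and the `follows_back` helper are my own assumptions for illustration, not part of any graph database API. The nodes are the three users, and each relationship points from the follower to the person being followed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    start: str  # node the relationship points from
    kind: str   # what the relationship means, e.g. "FOLLOWS"
    end: str    # node the relationship points to


# Nodes: the three users from the illustration.
users = {"Peter", "Emil", "Johan"}

# Relationships: one per "follows" arrow in the illustration.
follows = {
    Relationship("Peter", "FOLLOWS", "Emil"),
    Relationship("Emil", "FOLLOWS", "Peter"),
    Relationship("Emil", "FOLLOWS", "Johan"),
    Relationship("Johan", "FOLLOWS", "Emil"),
    Relationship("Johan", "FOLLOWS", "Peter"),
}


def follows_back(a: str, b: str) -> bool:
    """True when a follows b and b also follows a."""
    return (Relationship(a, "FOLLOWS", b) in follows
            and Relationship(b, "FOLLOWS", a) in follows)


print(follows_back("Peter", "Emil"))   # True
print(follows_back("Peter", "Johan"))  # False: Johan follows Peter, not vice versa
```

Notice that direction matters: Johan following Peter and Peter following Johan are two separate relationships, and only one of them exists here.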
How Graph Databases Work (Explained in a Way You Actually Understand)
Unlike in other database management systems (DBMS), relationships take first priority in graph databases. In the graph world, the connections between data are as important as, if not more important than, the individual data points.
This connections-first approach to data means relationships and connections are persisted (and not just temporarily calculated) through every part of the data lifecycle: from idea, to design in a logical model, to implementation in a physical model, to operation using a query language and to persistence within a scalable, reliable database system.
Unlike with other database systems, this approach means your application doesn’t have to infer data connections using things like foreign keys or out-of-band processing such as MapReduce.
The result: Your data models are simpler and yet more expressive than the ones you’d produce with relational databases or NoSQL (Not only SQL) stores.
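Here’s a small Python sketch of that difference, using the standard-library `sqlite3` module for the relational side. The table names, column names and the `two_hops` helper are assumptions I made up for the example, and the plain dictionary on the graph side is only a stand-in for a real graph database. Both halves answer the same question: who do the people Peter follows follow?

```python
import sqlite3

FOLLOWS = [("Peter", "Emil"), ("Emil", "Peter"), ("Emil", "Johan"),
           ("Johan", "Emil"), ("Johan", "Peter")]
names = ["Peter", "Emil", "Johan"]
ids = {name: i for i, name in enumerate(names, start=1)}

# Relational style: connections live in a join table of foreign keys,
# and every extra hop is another join worked out at query time.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE follows (
        follower_id INTEGER REFERENCES users(id),
        followee_id INTEGER REFERENCES users(id)
    );
""")
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                 [(ids[n], n) for n in names])
conn.executemany("INSERT INTO follows VALUES (?, ?)",
                 [(ids[a], ids[b]) for a, b in FOLLOWS])
rows = conn.execute("""
    SELECT DISTINCT target.name
    FROM users AS me
    JOIN follows AS hop1 ON hop1.follower_id = me.id
    JOIN follows AS hop2 ON hop2.follower_id = hop1.followee_id
    JOIN users AS target ON target.id = hop2.followee_id
    WHERE me.name = ? AND target.id <> me.id
    ORDER BY target.name
""", ("Peter",)).fetchall()
sql_result = [row[0] for row in rows]

# Graph style: each node already holds its outgoing relationships,
# so the query just walks from node to node.
graph = {}
for follower, followee in FOLLOWS:
    graph.setdefault(follower, set()).add(followee)


def two_hops(start):
    """People followed by the people that `start` follows (excluding start)."""
    found = set()
    for first in graph.get(start, ()):
        for second in graph.get(first, ()):
            if second != start:
                found.add(second)
    return sorted(found)


print(sql_result)         # ['Johan']
print(two_hops("Peter"))  # ['Johan']
```

The answers match, but look at where the relationship lives. In the SQL version it has to be reassembled from key columns for every hop, while in the graph version it’s simply stored and followed. This sketch says nothing about performance at scale; it only shows the modeling difference.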
What Makes Graph Databases Unique
A lot of databases have similar characteristics, but graph databases have a few things that make them unique. Here are the two most important properties of graph database technologies that you need to understand:
- Graph storage
Some graph databases use native graph storage that is specifically designed to store and manage graphs – from bare metal on up. Other graph technologies use relational, columnar or object-oriented databases as their storage layer. Non-native storage is often slower than a native approach because all of the graph connections have to be translated into a different data model.
- Graph processing
Native graph processing (a.k.a. index-free adjacency) is the most efficient means of processing data in a graph because connected nodes physically point to each other in the database. Non-native graph processing engines use other means to process Create, Read, Update or Delete (CRUD) operations that aren’t optimized for handling connected data.
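To picture what "connected nodes physically point to each other" means, here’s a toy Python sketch. It’s an analogy built on in-memory object references, not a description of how any real graph engine lays out data on disk, and the `Node` class and `reachable` function are hypothetical names invented for the example. Each node holds direct references to its neighbors, so a traversal never consults a separate lookup structure to find the next node.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.out = []  # direct references to the Node objects this node points to

    def connect(self, other):
        """Add an outgoing relationship from this node to `other`."""
        self.out.append(other)


def reachable(start, depth):
    """Names of nodes reachable from `start` within `depth` hops."""
    seen = {start.name}
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            # The next nodes are found by following references held by
            # the current node, with no global index in between.
            for neighbor in node.out:
                if neighbor.name not in seen:
                    seen.add(neighbor.name)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    seen.discard(start.name)
    return sorted(seen)


peter, emil, johan = Node("Peter"), Node("Emil"), Node("Johan")
peter.connect(emil)
emil.connect(peter)
emil.connect(johan)
johan.connect(emil)
johan.connect(peter)

print(reachable(peter, 1))  # ['Emil']
print(reachable(peter, 2))  # ['Emil', 'Johan']
```

A non-native engine would instead store neighbors as keys or ids and resolve each one through an index or a join on every hop, which is the extra work the paragraph above is talking about.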
Conclusion: Graphs Are in More Places than You Think (They’re Everywhere)
The real world is richly interconnected, and graph databases aim to mimic those sometimes-consistent, sometimes-erratic relationships in an intuitive way. That’s what makes the graph paradigm different from other database models: It corresponds more closely to how the human brain maps and processes the world around it.
And once you start seeing graphs of interconnected data in one place (your recommendation engine, for example), you start seeing them in other places too (like your fraud detection efforts or your master data management). Pretty soon, you’ll have the epiphany: graphs are everywhere.
It comes as no surprise then that graph technology is on the rise (but you don’t have to take my word for it).
There’s a good chance your competitors are at least evaluating or exploring the deployment of a graph database, so this is your opportunity to step up your game and join the leading companies that already have.
Tap into graph database technology today and your business retains the competitive advantage well past tomorrow.
Ready to dive deeper into the world of graph databases? Learn how to apply graph technologies to real-world problems with O’Reilly’s Graph Databases book.
Catch up with the rest of the Graph Databases for Beginners series:
- Wait, What Do You Mean By “Graph”?
- Why Connected Data Matters
- The Basics of Data Modeling
- Data Modeling Pitfalls to Avoid
- Why a Database Query Language Matters
- Imperative vs. Declarative Query Languages: What’s the Difference?
- Graph Theory & Predictive Modeling
- Graph Search Algorithm Basics
- Why We Need NoSQL Databases
- ACID vs. BASE Explained
- A Tour of Aggregate Stores
- Other Graph Data Technologies
- Native vs. Non-Native Graph Technology