By Bryce Merkl Sasaki, Editor-in-Chief, Neo4j | July 24, 2015
So you’ve heard about graph databases and you want to know what all the buzz is about. Are they just a passing trend – here today and gone tomorrow – or are they a rising tide your business and your development team can’t afford to pass up?
In short: Graph databases are the future, and even if you’re just a beginner, it’s never too late to get started.
In this “Graph Databases for Beginners” blog series, I’ll take you through the basics of graph technology assuming you have little (or no) background in the space. This week, we’ll tackle the basic introductions below.
Why You Should Care about Graph Databases
New tech is great, but you operate in a world of budgets, timelines, corporate standards and competitors. You don’t merely replace your existing database infrastructure just because something new comes along – you only take action with an orders-of-magnitude improvement. Graph databases fit that bill, and here’s why:
- Performance: Your data volume will definitely increase in the future, but what’s going to increase at an even faster clip is the number of connections (or relationships) between your data. With traditional databases, relationship queries slow to a crawl as the number and depth of relationships increase. In contrast, graph database performance for relationship queries tends to stay relatively constant even as your data grows year over year, because a query touches only the part of the graph it traverses rather than the whole dataset.
- Flexibility: With graph databases, your IT and data architect teams move at the speed of business because the structure and schema of a graph model flex as your solutions and industry change. Your team doesn’t have to exhaustively model your domain ahead of time; instead, they can add to the existing structure without endangering current functionality.
- Agility: Developing with graph databases aligns well with today’s agile, test-driven development practices, allowing your graph-database-backed application to evolve with your changing business requirements.
What Is a Graph Database? (a Non-Technical Definition)
You don’t need to understand the arcane mathematical wizardry of graph theory in order to understand graph databases. On the contrary, they’re more intuitive to understand than relational databases (RDBMS).
A graph is composed of two elements: a node and a relationship. Each node represents an entity (a person, place, thing, category or other piece of data), and each relationship represents how two nodes are associated. For example, the two nodes “cake” and “dessert” would have the relationship “is a type of” pointing from “cake” to “dessert.”
Consider another example: Twitter is a perfect example of a graph, connecting 302 million monthly active users. In the illustration below, we have a small slice of Twitter users represented in a graph database. Each node (labeled “User”) belongs to a single person, and the relationships between nodes describe who follows whom. As we see below, Billy and Harry follow each other, as do Harry and Ruth, but although Ruth follows Billy, Billy hasn’t (yet) reciprocated.
Twitter users represented in a graph database.
If this example makes sense to you, then you’ve already grasped the basics of what makes up a graph database.
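To make the Twitter example concrete, here is a minimal sketch in Python of the same three users and their follow relationships. The `Node` class, its `relate` and `follows` methods, and the `FOLLOWS` relationship name are all assumptions made up for this illustration; they are not the API of Neo4j or of any other graph database.

```python
class Node:
    """An entity in the graph (hypothetical class for illustration)."""

    def __init__(self, label, name):
        self.label = label      # e.g. "User"
        self.name = name        # e.g. "Billy"
        self.outgoing = {}      # relationship type -> list of target nodes

    def relate(self, rel_type, other):
        """Add a directed relationship from this node to `other`."""
        self.outgoing.setdefault(rel_type, []).append(other)

    def follows(self, other):
        """True if this node has a FOLLOWS relationship pointing at `other`."""
        return any(n is other for n in self.outgoing.get("FOLLOWS", []))


billy = Node("User", "Billy")
harry = Node("User", "Harry")
ruth = Node("User", "Ruth")

# Relationships are directed: "follows" is not automatically mutual.
billy.relate("FOLLOWS", harry)
harry.relate("FOLLOWS", billy)
harry.relate("FOLLOWS", ruth)
ruth.relate("FOLLOWS", harry)
ruth.relate("FOLLOWS", billy)

print(ruth.follows(billy))   # True
print(billy.follows(ruth))   # False: Billy hasn't (yet) reciprocated
```

Note that each relationship has a direction, which is why Ruth following Billy says nothing about whether Billy follows Ruth.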
How Graph Databases Work (Explained in a Way You’ll Actually Understand)
Unlike in other database management systems, relationships take first priority in graph databases. This means your application doesn’t have to infer data connections using things like foreign keys or out-of-band processing, like MapReduce.
The result of using graph databases instead? Your data models are simpler and more expressive than the ones you’d produce with relational databases or other NoSQL (Not only SQL) stores.
There are two important properties of graph database technologies you need to understand:
- Graph storage: Some graph databases use “native” graph storage that is specifically designed to store and manage graphs, while others use relational or object-oriented databases instead. Non-native storage is often slower than a native approach.
- Graph processing engine: Native graph processing (a.k.a. “index-free adjacency”) is the most efficient means of processing data in a graph because connected nodes physically “point” to each other in the database. However, non-native graph processing engines use other means to process Create, Read, Update or Delete (CRUD) operations.
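The difference between the two processing styles can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular product stores data: the `follows_rows` table, the `User` class and the helper functions are all hypothetical names invented for the example. In the lookup style, every hop searches a separate table of relationships; in the index-free adjacency style, each node holds direct references to its neighbors, so a hop just follows a pointer.

```python
# Lookup style: relationships live in a separate table of (source, target)
# rows, so every hop has to search that table (here, with a full scan).
follows_rows = [
    ("Billy", "Harry"),
    ("Harry", "Billy"),
    ("Harry", "Ruth"),
    ("Ruth", "Harry"),
    ("Ruth", "Billy"),
]


def followed_by_lookup(name):
    """Find who `name` follows by scanning the relationship table."""
    return sorted(dst for src, dst in follows_rows if src == name)


# Index-free adjacency style: each node keeps direct references to the
# nodes it is connected to.
class User:
    def __init__(self, name):
        self.name = name
        self.follows = []   # direct references to other User objects


# The dictionary is only used to build the graph, not to traverse it.
users = {name: User(name) for name in ("Billy", "Harry", "Ruth")}
for src, dst in follows_rows:
    users[src].follows.append(users[dst])


def followed_by_pointer(user):
    """Find who `user` follows by walking its own references."""
    return sorted(n.name for n in user.follows)


def two_hops(user):
    """Names reachable in exactly two FOLLOWS hops, excluding the start."""
    return sorted({fof.name
                   for friend in user.follows
                   for fof in friend.follows
                   if fof is not user})


print(followed_by_lookup("Ruth"))           # ['Billy', 'Harry']
print(followed_by_pointer(users["Ruth"]))   # ['Billy', 'Harry']
print(two_hops(users["Billy"]))             # ['Ruth']
```

Both styles return the same answers. The difference is in the work per hop: the lookup version scans or indexes the whole relationship table each time, while the pointer version only touches the nodes actually being traversed.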
Conclusion: Graphs Are in More Places than You Think
The real world is richly interconnected, and graph databases aim to mimic those sometimes-consistent, sometimes-erratic relationships in an intuitive way. Graph databases are extremely useful in understanding big data sets in scenarios as diverse as logistics route optimization, retail suggestion engines, fraud detection and social network monitoring.
Graph databases are on the rise, and big data is getting bigger. Your competitors most likely aren’t yet using graph technology to power their applications or analyze their big data, so this is your opportunity to step up your game and join leading companies like Walmart, eBay and Pitney Bowes. That said, it’s a narrow window. Learn to leverage graph databases today and your business retains the competitive advantage well past tomorrow.
Ready to dive deeper into the world of graph databases? Learn how to apply graph technologies to real-world problems with O’Reilly’s Graph Databases. Get your free copy of the definitive book on graph databases and your introduction to Neo4j.
About the Author
Bryce Merkl Sasaki is the Editor-in-Chief at Neo4j. He studied professional and creative writing for undergrad and has been freelancing for 7 years. Recently, he worked at an inbound marketing agency in Philadelphia as a copywriter before moving to California. When not working, he likes to spend his time working on his novel, looking for pickup soccer games and reading voraciously.