The Problem with Relational Databases

Relational databases were designed for tabular data, with a consistent structure and a fixed schema. They work best for problems that are well defined at the outset.

However, attempting to answer questions about data relationships (e.g., a product recommendations engine, a social graph or a fraud detection solution) with a relational database involves numerous and expensive JOINs between database tables. Despite their name, relational databases do not store relationships between data elements, making them unfit for today’s highly connected data.

Relational databases have a fixed schema, so they don’t adapt well to change. Even as database administrators (DBAs) and developers face a steady stream of requests to meet changing business requirements, the schema changes those requests demand are problematic and take a great deal of time.

Many relational database applications are working fine within their limits. Some, however, may be showing significant signs of strain induced by the database, especially when an RDBMS is being used to handle highly connected data.

In a world where the only constant is flux and business data is connected more than ever, here are the five surest signs it’s time to abandon your SQL database:
1. A Large Number of JOINs

When you run queries that join many different tables, there’s an explosion of complexity and computing resource consumption. This results in a corresponding increase in query response times.
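To make the JOIN count concrete, here is a minimal sketch of a "customers who bought this also bought" query, run against an in-memory SQLite database from Python. The tables (customer, product, purchase), the sample rows and the recommend function are all hypothetical, invented for illustration. Even this single-hop recommendation needs four JOINs, and each further hop would add more.

```python
import sqlite3

# Hypothetical schema and data, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product  (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE purchase (customer_id INTEGER, product_id INTEGER);
    INSERT INTO customer VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO product  VALUES (10, 'tent'), (11, 'stove'), (12, 'lantern');
    INSERT INTO purchase VALUES
        (1, 10), (2, 10), (2, 11), (3, 10), (3, 11), (3, 12);
""")


def recommend(conn, customer_name):
    """Return (title, score) pairs for products bought by customers who
    share at least one purchase with the given customer."""
    return conn.execute("""
        SELECT rec.title, COUNT(*) AS score
        FROM customer me
        JOIN purchase mine   ON mine.customer_id = me.id
        JOIN purchase theirs ON theirs.product_id = mine.product_id
                            AND theirs.customer_id <> me.id
        JOIN purchase other  ON other.customer_id = theirs.customer_id
        JOIN product rec     ON rec.id = other.product_id
        WHERE me.name = ?
          -- skip products the customer already owns
          AND rec.id NOT IN (
              SELECT product_id FROM purchase WHERE customer_id = me.id
          )
        GROUP BY rec.id, rec.title
        ORDER BY score DESC, rec.title
    """, (customer_name,)).fetchall()


recommendations = recommend(conn, "alice")
print(recommendations)
```

The relationship "alice bought a tent" exists only as a row in the purchase table, so every step from customer to product and back has to be rebuilt with another JOIN at query time.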
2. Numerous Self-JOINs (or Recursive JOINs)

Self-JOIN statements are common for hierarchy and tree representations of data, but traversing relationships by repeatedly joining tables to themselves is inefficient. In fact, some of the longest SQL queries in the world involve recursive JOINs.
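As an illustration of the pattern, the sketch below walks a small reporting hierarchy in SQLite from Python, first with one self-JOIN per level and then with a recursive common table expression. The employee table and its rows are hypothetical, chosen only to show the shape of such queries.

```python
import sqlite3

# Hypothetical reporting hierarchy: each row points at its manager.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        manager_id INTEGER REFERENCES employee(id)
    );
    INSERT INTO employee (id, name, manager_id) VALUES
        (1, 'Ada', NULL),
        (2, 'Ben', 1),
        (3, 'Cy', 2),
        (4, 'Dee', 3);
""")

# Fixed depth: the table is joined to itself once per level, so a deeper
# hierarchy means a longer query.
fixed_depth = conn.execute("""
    SELECT e1.name, e2.name, e3.name, e4.name
    FROM employee e1
    JOIN employee e2 ON e2.manager_id = e1.id
    JOIN employee e3 ON e3.manager_id = e2.id
    JOIN employee e4 ON e4.manager_id = e3.id
    WHERE e1.manager_id IS NULL
""").fetchall()

# Arbitrary depth: a recursive CTE repeats the same self-JOIN until no
# further reports are found.
any_depth = conn.execute("""
    WITH RECURSIVE reports(id, name, depth) AS (
        SELECT id, name, 0 FROM employee WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, r.depth + 1
        FROM employee e
        JOIN reports r ON e.manager_id = r.id
    )
    SELECT name, depth FROM reports ORDER BY depth
""").fetchall()

print(fixed_depth)
print(any_depth)
```

The recursive form keeps the query short, but the database still performs a self-JOIN for every level it descends.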
3. Frequent Schema Changes

At a time when business agility is at a premium, requests for changes are more often than not put off by DBAs because the schema of a relational database isn’t designed for frequent changes and pivots. Frequent schema changes indicate that the data or requirements are rapidly evolving, calling for a more flexible model.
4. Slow-Running Queries (Despite Extensive Tuning)

Your DBA might use every trick in the book to speed up query times, but many SQL queries still aren’t fast enough to support your application’s needs. In addition, denormalizing data models for performance can negatively impact data quality and update behavior.
5. Pre-Computing Your Results

Because queries run so slowly, many applications pre-compute their results using past data. However, this is effectively using yesterday’s data for queries that should be handled in real time today. Furthermore, your system usually must pre-compute 100% of your data, even if only 1-2% of it will be accessed at any given time.

If you or your development team frequently suffer from any of these symptoms of SQL strain, then you’re probably trying to use a relational database to solve a graph problem. A graph database is purpose-built to store highly connected data, to flex as schemas change and to capture real-time insights from data relationships.

Look for next week’s post on SQL strain to learn more about graph databases and connected data.

Are your data-driven insights being hindered by the limited capabilities of a relational database? Download a free copy of the white paper, Overcoming SQL Strain and SQL Pain, and discover how to harness connected data like never before.