Data² Builds Leading GenAI Analytics Platform with Neo4j

50%

estimated analyst workload freed up by reView

$2 billion

in budgets managed by the Data² team historically

$13 billion

market for AI in the energy sector alone in 2024

Data² is on a mission to change how defense, intelligence, and energy organizations extract insights from structured and unstructured data.

Founded by a team of military veterans and energy industry experts, Data² brings together domain expertise and generative AI to enable analysts to make breakthroughs, from uncovering hidden terror cells to untangling and optimizing complex well networks.

Behind every deployment of the company’s flagship analytics and artificial reasoning platform, reView, is a knowledge graph built on Neo4j AuraDB.

“We knew we needed to take a different approach when we set out to build reView,” says Jeff Dalgliesh, Chief Technology Officer at Data². “I saw firsthand in the oil field that the most valuable information lives in unstructured sources, like drilling reports, facilities records, and maintenance logs.”

Dalgliesh found that relational databases struggled to analyze the complex relationships within these large datasets. Data² needed a more scalable technology to power its emerging analytics platform, one that could also quickly provide the groundwork for generative AI development. In 2023, Indigo Advisory estimated that the market for AI was worth up to $13 billion in the energy sector alone, with GenAI accounting for 28.1% of overall AI spending.

Why Relational Databases and Triple Stores Fell Short

Data² explored a number of technologies in its research and development phase, but integrating components like Spark, triple stores, Hadoop clusters, and indexes was a daunting task for the growing team. These technologies required specialized expertise and time-consuming configuration that slowed development.

“We discovered that with Neo4j, we didn’t need to worry about all that mess behind the scenes,” Dalgliesh explains. “Graph technology helped us focus on building our AI capabilities without getting bogged down in database administration and scaling.”

Dalgliesh was also interested in integrating Neo4j’s GraphRAG capabilities. GraphRAG combines knowledge graphs, data science, large language models (LLMs), and retrieval-augmented generation (RAG) to provide more accurate responses to user queries.
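As a rough illustration of the retrieval step in a GraphRAG pipeline, the sketch below pulls the facts connected to entities named in a question and places them in a prompt as context. Everything in it is hypothetical: the toy graph, the `retrieve_facts` and `build_prompt` helpers, and the simple name matching are stand-ins rather than reView's implementation or Neo4j's GraphRAG API, and the LLM call itself is omitted.

```python
# Toy knowledge graph as (subject, relation, object) edges. All data is invented.
EDGES = [
    ("Well A-12", "CONNECTED_TO", "Pipeline P-3"),
    ("Pipeline P-3", "FEEDS", "Disposal Well D-1"),
    ("Well A-12", "HAS_REPORT", "Drilling Report 2021-04"),
]


def retrieve_facts(question, edges):
    """Return edges whose subject or object is named in the question."""
    q = question.lower()
    return [e for e in edges if e[0].lower() in q or e[2].lower() in q]


def build_prompt(question, facts):
    """Format retrieved facts as grounding context ahead of the question."""
    context = "\n".join(f"- {s} {r} {o}" for s, r, o in facts)
    return f"Answer using only these facts:\n{context}\n\nQuestion: {question}"


question = "What does Pipeline P-3 feed?"
facts = retrieve_facts(question, EDGES)
prompt = build_prompt(question, facts)  # this string would be sent to an LLM
```

Because the prompt carries only facts drawn from the graph, each answer can in principle be traced back to the edges that supported it.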

“Oil and gas is a connected network of processes, people, and infrastructure,” Dalgliesh says. “We are working with a client now modeling saltwater disposal wells. If you turn a valve that decreases the flow in a pipeline, it has a downstream effect on the disposal well. Knowledge graphs built on Neo4j are the perfect abstraction layer to model relationships across this kind of complex network and were a much better fit for reView than triple stores.”
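The downstream effect Dalgliesh describes can be pictured as a graph traversal. The sketch below is a minimal illustration, assuming a made-up flow network held in a plain Python dictionary; the asset names and the `downstream` helper are hypothetical, and a production system would run an equivalent query against the graph database rather than an in-memory structure.

```python
from collections import deque

# Hypothetical flow network: each key flows into the listed downstream assets.
FLOWS_TO = {
    "Valve V-7": ["Pipeline P-3"],
    "Pipeline P-3": ["Pump Station S-2"],
    "Pump Station S-2": ["Disposal Well D-1"],
    "Pipeline P-9": ["Disposal Well D-4"],
}


def downstream(start, flows_to):
    """Breadth-first search for every asset reachable downstream of start."""
    seen, order = {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in flows_to.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# Assets that could be affected by turning Valve V-7.
affected = downstream("Valve V-7", FLOWS_TO)
```

Here the traversal reaches Disposal Well D-1 through the pipeline and pump station, while the unrelated Pipeline P-9 branch is left out.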

Dalgliesh ruled out triple stores early in reView’s development process. Triple stores are databases designed to store triples, each made up of a subject, a predicate, and an object, such as “Bob is 35” or “Bob knows Fred.” Every property of a subject must be modeled as its own triple. As the number of triples grows, this approach can lead to performance issues, making it difficult to query and analyze large datasets efficiently.
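To make the modeling difference concrete, the sketch below stores the same facts both ways using plain Python structures. It is a simplified illustration only: the data beyond the Bob examples is invented, the helper names are hypothetical, and real triple stores use indexes, so the linear scan here says nothing about actual query performance.

```python
# Triple-store style: every property is its own (subject, predicate, object) row.
triples = [
    ("Bob", "age", 35),
    ("Bob", "knows", "Fred"),
    ("Fred", "age", 41),  # invented value for illustration
]

# Property-graph style: properties live on the node; only relationships are edges.
nodes = {
    "Bob": {"age": 35},
    "Fred": {"age": 41},
}
relationships = [("Bob", "KNOWS", "Fred")]


def age_from_triples(name):
    """Find the age by scanning for the matching property row."""
    return next(o for s, p, o in triples if s == name and p == "age")


def age_from_nodes(name):
    """Read the age directly from the node's property map."""
    return nodes[name]["age"]
```

Both lookups return the same value, but in the triple form each added property becomes another row, whereas in the property-graph form it is one more key on an existing node.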

“Triple stores tend to be more focused on ontological purity than solving real-world problems,” Dalgliesh says. “They’re academic, with a rigid insistence on precision. But when you’re trying to rapidly make sense of a messy domain, you need a flexible model that helps you solve business problems quickly.”

GraphRAG Gives Analysts a Paper Trail to Follow

Today, reView presents customer data as a knowledge graph stored in Neo4j AuraDB. Dalgliesh’s team has developed several components that improve reView’s accuracy and enhance the user experience for analysts:

  • Arctic Loader: Data² built this component to load tabular data and documents into its knowledge graph and map tables to graph structures.
  • Question Answering System: Users can ask questions and receive accurate responses (stored back in the graph), powered by Neo4j GraphRAG and LLMs from Anthropic or other providers.
  • Evidence and Question Graphs: Data² maintains separate graphs for evidence and questions, preventing its language models from ingesting incorrect data.
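Mapping tables to graph structures, as the loader component above is described as doing, can be sketched in a few lines. The example below is a guess at the general idea, not Data²'s implementation: the table, its column names, the node labels, the relationship types, and the `table_to_graph` helper are all hypothetical.

```python
import csv
import io

# Hypothetical well table; the column names and values are invented.
TABLE = """well_id,operator,field
W-001,Acme Energy,North Field
W-002,Acme Energy,South Field
"""


def table_to_graph(text):
    """Map each row to a Well node plus OPERATED_BY and LOCATED_IN edges."""
    nodes, edges = {}, set()
    for row in csv.DictReader(io.StringIO(text)):
        well = ("Well", row["well_id"])
        operator = ("Operator", row["operator"])
        field = ("Field", row["field"])
        # Keying nodes by (label, name) merges repeated values into one node.
        for label, name in (well, operator, field):
            nodes[(label, name)] = {"label": label, "name": name}
        edges.add((well, "OPERATED_BY", operator))
        edges.add((well, "LOCATED_IN", field))
    return nodes, edges


nodes, edges = table_to_graph(TABLE)
```

Note that the operator appearing in both rows becomes a single shared node, which is what turns a flat table into a connected graph.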

Analysts can now interrogate datasets with tens of thousands of nodes and receive answers in plain English from the GenAI Question Answering System. Users can drill into the evidence that supports each answer, then either accept the result as training data or flag it for human correction to improve the model’s accuracy.
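One way to picture this review loop is as a record that ties each answer to its supporting evidence and tracks the analyst's decision. The sketch below is a minimal illustration under that assumption; the `Answer` class, its field names, the status values, and the sample data are hypothetical and not taken from reView.

```python
from dataclasses import dataclass, field


@dataclass
class Answer:
    question: str
    text: str
    evidence: list = field(default_factory=list)  # ids of supporting graph nodes
    status: str = "pending"  # "pending", "accepted", or "flagged"
    note: str = ""  # analyst's correction note, set when flagged

    def accept(self):
        """Mark the answer as usable training data."""
        self.status = "accepted"

    def flag(self, note):
        """Mark the answer for human correction and record why."""
        self.status = "flagged"
        self.note = note


answer = Answer(
    question="Which wells report pump failures?",
    text="Wells W-001 and W-002.",
    evidence=["report-17", "log-203"],
)
answer.flag("W-002 failure was a sensor fault, not a pump fault.")
```

Keeping the evidence ids on the answer itself is what lets an analyst drill from a plain-English response back to the source records.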

“Analysts need to be able to dissect exactly how the AI reached a particular conclusion or recommendation,” says Chief Business Officer Eric Costantini. “Neo4j enables us to enforce robust information security by applying access controls at the subgraph level.”

As user questions in reView become more specific and targeted, the answers remain accurate and transparent.

“The better the questions you ask, the smarter you’re going to become,” Dalgliesh explains. “We think of the knowledge graph as a dynamic, evolving brain that captures the full scope of an organization’s operating knowledge. Our generative AI agent learns and reasons over top of this brain to deliver context-relevant insights and recommendations grounded in data.”

Discovering Better Strategic Portfolio Fits for O&G Lease Acquisitions and Hidden Security Threats with an ‘Evolving Brain’

Incorporating knowledge graphs and GraphRAG into reView has allowed Data² to uncover hidden patterns in its customers’ data. Intelligence agencies can quickly zero in on high-value investigation targets, analyze relevant pattern-of-life data, and identify hidden threat networks. This innovative integration offers a comprehensive understanding of dynamic environments, empowering decision-makers with unparalleled insights to enhance strategic operations and safeguard national security.

Oil and gas executives can use reView to make better decisions when assessing new portfolio acquisitions, from oil fields to drilling locations. reView’s AI offering simplifies the task of evaluating leases, consolidating key geological, production, and economic data. This allows leaders to make smarter choices with greater confidence.

Operational data is difficult to obtain. The diagram above visualizes equipment problems extracted from construction and maintenance records for oil wells. This evidence is often buried in reports that no one has time to find or connect. Neo4j, combined with reView’s LLM-powered fact extractor, makes it possible for teams with limited time and resources to build highly specific domain knowledge graphs.

Data²’s Vision for Redefining Defense with Generative AI

In August 2023, the Department of Defense announced the establishment of a generative artificial intelligence task force to assess and employ GenAI capabilities across the DoD. For the Data² team and its customers, knowledge graphs are just the beginning of this AI-driven future.

“Neo4j enables our users to easily interact with and explore knowledge graphs and to achieve mission outcomes that just weren’t possible with prior approaches,” Dalgliesh explains. “The distance between two nodes may contain an answer that helps make the world a safer place.”

Use Cases

  • GenAI

Industry

  • Energy
  • Americas
