Neo4j Brings GraphRAG Capabilities for GenAI to Google Cloud

April 9, 2024


We’re thrilled to announce new native integrations with Google Cloud and Vertex AI that solve a critical challenge in GenAI development: accessing contextually rich external data to deliver accurate, explainable results. The integrations streamline the implementation of GraphRAG, a responsive, real-time approach to that challenge.

GraphRAG combines two powerful technologies: retrieval-augmented generation (RAG) and knowledge graphs. RAG allows GenAI applications to access and query external datasets, while knowledge graphs make the data smarter by enriching the contextual information with entities and capturing the complex relationships between them. This enriched context enables LLMs to reason, infer, and accurately answer questions and execute tasks, anchoring their responses and actions in factual information.

The importance of knowledge graphs in GenAI development cannot be overstated. Gartner considers knowledge graphs essential to the development of GenAI and has urged data leaders to “leverage the power of LLMs with the robustness of knowledge graphs to build fault-tolerant AI applications.”

Available now, the GraphRAG integrations allow organizations to rapidly build GenAI applications that can integrate contextually rich external data in real time while being secure and compliant. This capability dramatically reduces hallucinations while enabling LLMs to uncover and use complex relationships and patterns within large datasets—so GenAI apps can deliver the accuracy, relevance, and explainability required by enterprise use cases.

The integrations enable developers to implement GraphRAG seamlessly:

1. Quickly create knowledge graphs for accurate, explainable results. Developers can easily create knowledge graphs from unstructured data like PDFs, web pages, and documents, loaded either directly or from Google Cloud Storage buckets, using Gemini models, the Google Cloud Vertex AI platform, LangChain, and Neo4j. The simplified process uses llm-graph-transformer, which Neo4j contributed to LangChain. It provides LLMs with the intended graph schema and uses Gemini’s function-calling capabilities to extract entities and their relationships in a structured way. These entities and relationships are then added to the Neo4j knowledge graphs for use in GenAI or other applications. With GraphRAG, LLMs can access data modeled around entities, their attributes, and the relationships between entities, improving GenAI accuracy and explainability.

1. Ingest, process, and analyze real-time data in seconds. Developers can use Flex templates in Dataflow to create repeatable, secure data pipelines that ingest, process, and analyze data across Google BigQuery, Google Cloud Storage, and Neo4j, supplying knowledge graphs with real-time information and enabling GenAI applications to provide relevant, timely insights.

1. Build and deploy graph-powered GenAI apps with Gemini for Google Workspace and Reasoning Engine. Deploying GenAI applications to production has been challenging, but with Reasoning Engine from the Vertex AI platform, developers now have the tools to easily deploy, monitor, and scale GenAI apps and APIs onto Google Cloud Run. Neo4j’s GenAI capabilities, such as Vector Search, GraphRAG, and conversational memory, integrate seamlessly through LangChain and Neo4j AuraDB on Google Cloud. The Gemini models have been trained on Neo4j-specific assistant content, streamlining development by automatically generating code snippets for Neo4j tools and APIs. They also translate natural language into Neo4j’s Cypher query language, making it easier to build applications with Neo4j. With just a few lines of Python code and the Google Vertex AI Python SDK, you can deploy GenAI and graph-powered APIs to Reasoning Engine, making powerful RAG capabilities available for both developers and operations teams. This empowers organizations to bring graph-powered GenAI applications to production with ease and confidence.

Let’s dig a little more deeply into the new integrations and how they help organizations realize the immense potential of GenAI.

Constructing Knowledge Graphs From Unstructured Data With Gemini Models and LangChain

Gemini’s advanced language capabilities and new function-calling features enable it to identify entities, types, and relationships from unstructured text and extract them in a structured way. Using the llm-graph-transformer that Neo4j contributed to LangChain, developers can turn any set of LangChain documents (PDFs, web pages, Google Docs, and more) into a knowledge graph.

By passing a prompt that contains the extraction instructions, an optional graph schema, and the unstructured text, developers can guide the LLM to extract information and populate structured output via pre-defined objects for nodes and relationships. Because Gemini now offers initial support for function calls, developers can use the model in the extraction pipeline.
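The extraction step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the llm-graph-transformer implementation: `Node`, `Relationship`, `build_prompt`, and `parse_extraction` are hypothetical names, and `SAMPLE_PAYLOAD` is a hand-written stand-in for the arguments a model's function call might return.

```python
import json
from dataclasses import dataclass


# Hypothetical pre-defined objects for the structured output.
@dataclass(frozen=True)
class Node:
    id: str
    type: str


@dataclass(frozen=True)
class Relationship:
    source: str
    target: str
    type: str


def build_prompt(text, allowed_nodes, allowed_relationships):
    """Combine the instruction, the optional schema, and the unstructured text."""
    lines = ["Extract entities and relationships from the text as JSON."]
    if allowed_nodes:
        lines.append("Allowed node types: " + ", ".join(allowed_nodes))
    if allowed_relationships:
        lines.append("Allowed relationship types: " + ", ".join(allowed_relationships))
    lines.append("Text: " + text)
    return "\n".join(lines)


def parse_extraction(payload, allowed_nodes, allowed_relationships):
    """Turn a function-call style JSON payload into nodes and relationships.

    Anything outside the schema, or any relationship that points at a
    node that was not kept, is dropped.
    """
    data = json.loads(payload)
    nodes = [
        Node(n["id"], n["type"])
        for n in data.get("nodes", [])
        if not allowed_nodes or n["type"] in allowed_nodes
    ]
    known = {n.id for n in nodes}
    rels = [
        Relationship(r["source"], r["target"], r["type"])
        for r in data.get("relationships", [])
        if (not allowed_relationships or r["type"] in allowed_relationships)
        and r["source"] in known
        and r["target"] in known
    ]
    return nodes, rels


# Assumed example payload written by hand; not real model output.
SAMPLE_PAYLOAD = json.dumps({
    "nodes": [
        {"id": "Neo4j", "type": "Organization"},
        {"id": "LangChain", "type": "Framework"},
        {"id": "April 2024", "type": "Date"},
    ],
    "relationships": [
        {"source": "Neo4j", "target": "LangChain", "type": "CONTRIBUTED_TO"},
        {"source": "Neo4j", "target": "April 2024", "type": "ANNOUNCED_IN"},
    ],
})

nodes, rels = parse_extraction(
    SAMPLE_PAYLOAD, ["Organization", "Framework"], ["CONTRIBUTED_TO"]
)
print(nodes)
print(rels)
```

With the schema restricted to `Organization`, `Framework`, and `CONTRIBUTED_TO`, only the two matching nodes and the single relationship between them survive. Constraining the output this way is what keeps the resulting graph aligned with the intended model rather than with whatever labels the LLM happens to invent.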

Powering Real-Time GraphRAG Applications With Dataflow Flex Templates

Our new integrations allow developers to build GraphRAG applications using real-time data. They can use Dataflow Flex templates to set up secure, efficient pipelines for real-time data movement from BigQuery and Google Cloud Storage into their Neo4j Graph Database on Google Cloud. Because the knowledge graphs powering GraphRAG applications are continuously updated, those applications can generate more accurate, timely, and contextually relevant responses.

Architecture diagram showing Dataflow from Google Cloud to Neo4j

Below, we use Dataflow Flex templates to run a job that reads from a Google Cloud Storage bucket and transforms the data into a Neo4j knowledge graph. As the data is updated in Google Cloud, we can create a job that checks for changes and updates the knowledge graph in real time. The graph within Neo4j can then be used with a RAG architecture for greater LLM accuracy and explainability.

A sample job run in Dataflow to get data from Google Cloud to Neo4j to create a knowledge graph

Organizations can combine Dataflow Flex templates with Neo4j’s graph database to build sophisticated GraphRAG applications that adapt to rapidly changing data landscapes. With real-time data, these applications deliver more accurate, timely, and contextually rich insights, enhancing decision-making and user experience across domains and use cases.
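To make the row-to-graph step concrete, here is a minimal sketch of how tabular rows from a source such as BigQuery or Cloud Storage might be turned into batched Cypher upserts. It is not the Dataflow Flex template itself: `chunked` and `node_merge_statements` are hypothetical helpers, and the `Customer` label and `id` key are example values.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def node_merge_statements(rows, label, key, batch_size=1000):
    """Build (cypher, parameters) pairs that upsert rows as nodes.

    MERGE on the key property keeps re-runs idempotent: an updated source
    row overwrites properties on the existing node instead of creating a
    duplicate.
    """
    # Labels and property names cannot be passed as Cypher parameters,
    # so they are interpolated here and restricted to plain identifiers.
    if not (label.isidentifier() and key.isidentifier()):
        raise ValueError("label and key must be plain identifiers")
    cypher = (
        f"UNWIND $rows AS row "
        f"MERGE (n:{label} {{{key}: row.{key}}}) "
        f"SET n += row"
    )
    return [(cypher, {"rows": batch}) for batch in chunked(rows, batch_size)]


# Example rows, invented for illustration.
rows = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
    {"id": 3, "name": "Alan"},
]
for cypher, params in node_merge_statements(rows, "Customer", "id", batch_size=2):
    print(cypher, len(params["rows"]))
```

Each pair could then be handed to a Neo4j session, for example with the Python driver's `session.run(cypher, params)`. Batching through `UNWIND` keeps the number of round trips low, which matters when a pipeline re-applies changes frequently.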

Accelerating GenAI With Gemini for Google Workspace and Neo4j

We’ve worked closely with Google to improve the graph application development capabilities of Gemini for Google Workspace. Neo4j has provided extensive training data to Google, enabling Gemini to understand the Cypher query language and provide more comprehensive guidance to graph application developers. (The training data included Neo4j’s documentation, online courses, and knowledge base, as well as data from our text2cypher development efforts and crowdsourced, LLM-generated question-answer pairs.)

The Gemini for Google Workspace developer assistant can now help developers create knowledge graphs in Neo4j by translating natural language into Cypher. Applications can combine vector search with knowledge graphs and use GraphRAG to ground LLMs for more explainable and accurate results. Once knowledge graphs are created within Neo4j, developers can explore the graph data to uncover hidden patterns and insights.

A code snippet within Google Workspace, showcasing the generation of a Cypher query to load data into Neo4j

Gemini for Google Workspace has also been trained on integrating Neo4j with popular orchestration frameworks like LangChain, LlamaIndex, and Haystack, so it can offer framework-specific guidance to developers, further streamlining the development process.

Gemini for Google Workspace is available for end users in Google Workspace and developers in the Google Cloud Platform (GCP) console, as well as in popular development environments like Visual Studio Code and JetBrains. This allows developers to tap into AI-assisted coding capabilities as they build the next generation of GenAI-enabled applications.

Deploying Graph-Powered GenAI Applications With Google’s Reasoning Engine Runtime

Many developers are new to deploying GenAI applications in production environments. Google’s Reasoning Engine Runtime addresses this challenge by simplifying the process of securely deploying, scaling, monitoring, and operating GenAI applications with Vertex AI and Gemini models. It is a new product that goes beyond the capabilities of Vertex ML by providing a framework for integrating knowledge graphs with GenAI applications. It helps developers decide when to use a knowledge graph based on the specific requirements of their application, such as the need to model complex relationships between entities or perform advanced reasoning tasks.

Our new integrations with Google Cloud, combined with our extensive LangChain integrations, allow users to seamlessly incorporate Neo4j knowledge graphs into their GenAI stack. Developers can use LangChain to run direct or advanced RAG architectures, including GraphRAG, within Reasoning Engine Runtime.

Combining Neo4j’s knowledge graph capabilities with Google’s Reasoning Engine Runtime is a powerful approach to building contextually advanced GenAI applications. It reduces the complexities of productionization while delivering more accurate and explainable GenAI results.
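The retrieval flow behind a GraphRAG application can be sketched as a small, self-contained class. Everything here is illustrative: `GraphRagApp` is a hypothetical name, the in-memory embeddings and edges stand in for a Neo4j vector index and graph, and the `set_up`/`query` method shape is an assumption about the interface a deployed application exposes, so check the Vertex AI SDK documentation before relying on it.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class GraphRagApp:
    """Hypothetical app: vector search for entry points, then graph expansion."""

    def __init__(self, top_k=1):
        self.top_k = top_k
        self.embeddings = {}
        self.edges = []

    def set_up(self):
        # Toy data standing in for a Neo4j vector index and knowledge graph.
        self.embeddings = {
            "GraphRAG": [1.0, 0.0, 0.0],
            "Dataflow": [0.0, 1.0, 0.0],
            "Cypher": [0.0, 0.0, 1.0],
        }
        self.edges = [
            ("GraphRAG", "COMBINES", "Knowledge graph"),
            ("GraphRAG", "COMBINES", "RAG"),
            ("Dataflow", "LOADS_INTO", "Neo4j"),
            ("Cypher", "QUERIES", "Neo4j"),
        ]

    def query(self, question_embedding):
        # Step 1: vector search picks the entities closest to the question.
        ranked = sorted(
            self.embeddings,
            key=lambda name: cosine(question_embedding, self.embeddings[name]),
            reverse=True,
        )
        seeds = ranked[: self.top_k]
        # Step 2: follow relationships to collect facts connected to the seeds.
        facts = [f"{s} {r} {t}" for s, r, t in self.edges if s in seeds or t in seeds]
        # Step 3: return the context an LLM prompt could be grounded in.
        return {"seeds": seeds, "context": facts}


app = GraphRagApp()
app.set_up()
print(app.query([0.9, 0.1, 0.0]))
```

The returned facts are explicit triples, so the answer an LLM produces from them can be traced back to specific nodes and relationships. That traceability is the explainability benefit the graph adds over vector search alone.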

GraphRAG: Unlocking the Potential of GenAI

As GenAI has evolved, it’s become increasingly clear that GraphRAG is a powerful tool for overcoming the limitations of LLMs. Combining knowledge graphs with retrieval-augmented generation resolves the critical issues of accuracy, explainability, and transparency—and unlocks the full potential of GenAI.

The new integrations between Neo4j and Google Cloud make GraphRAG more accessible and simpler to use than ever before. Now, instead of struggling to address hallucinations or lack of transparency, developers can focus on creating a new generation of reliable, context-aware GenAI applications.

To get started with Neo4j GraphRAG on Google Cloud, explore our GenAI resources and run on Neo4j AuraDB, available on Google Cloud Marketplace today.
