Neo4j Brings GraphRAG Capabilities for GenAI to Google Cloud

- Quickly create knowledge graphs for accurate, explainable results. Developers can easily create knowledge graphs from unstructured data like PDFs, webpages, and documents (either directly or loaded from Google Cloud Storage buckets) using Gemini models, the Google Cloud Vertex AI platform, LangChain, and Neo4j. The simplified process uses llm-graph-transformer, which Neo4j contributed to LangChain. It provides LLMs with the intended graph schema and uses Gemini’s function-calling capabilities to extract entities and their relationships in a structured way. These entities and relationships are then added to the Neo4j knowledge graph for use in GenAI or other applications. With GraphRAG, LLMs can access data modeled around entities, their attributes, and the relationships between entities, improving GenAI accuracy and explainability.
- Ingest, process, and analyze real-time data in seconds. Developers can use Flex templates in Dataflow to create repeatable, secure data pipelines that ingest, process, and analyze data across Google BigQuery, Google Cloud Storage, and Neo4j—supplying knowledge graphs with real-time information and enabling GenAI applications to provide relevant, timely insights.
- Build and deploy graph-powered GenAI apps with Gemini for Google Workspace and Reasoning Engine. Deploying GenAI applications to production has been challenging, but with Reasoning Engine from the Vertex AI platform, developers now have the tools to easily deploy, monitor, and scale GenAI apps and APIs onto Google Cloud Run. Neo4j’s GenAI capabilities, such as Vector Search, GraphRAG, and conversational memory, integrate seamlessly through LangChain and Neo4j AuraDB on Google Cloud. The Gemini models have been trained on Neo4j-specific assistant content, streamlining development by automatically generating code snippets for Neo4j tools and APIs. Gemini also translates natural language into Neo4j’s Cypher query language, making it easier to build applications with Neo4j. With just a few lines of Python code and the Google Vertex AI Python SDK, you can deploy GenAI and graph-powered APIs to Reasoning Engine, making powerful RAG capabilities available for both developers and operations teams. This empowers organizations to bring graph-powered GenAI applications to production with ease and confidence.
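To make the first capability in the list above more concrete, here is a minimal sketch of schema-guided extraction output being turned into pre-defined node and relationship objects. It is not the llm-graph-transformer implementation: the `Node`, `Relationship`, `GraphSchema`, and `parse_extraction` names, the payload shape, and the sample data are all hypothetical and chosen only for illustration. The payload stands in for the structured arguments an LLM might return through function calling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    id: str
    type: str


@dataclass(frozen=True)
class Relationship:
    source: Node
    target: Node
    type: str


@dataclass
class GraphSchema:
    # Hypothetical schema object: the node labels and relationship
    # types the extraction step is allowed to produce.
    allowed_nodes: set
    allowed_relationships: set


def parse_extraction(payload, schema):
    """Convert a function-call style payload into typed objects,
    dropping any node or relationship that is not in the schema."""
    nodes = {}
    for n in payload.get("nodes", []):
        if n["type"] in schema.allowed_nodes:
            nodes[n["id"]] = Node(id=n["id"], type=n["type"])

    rels = []
    for r in payload.get("relationships", []):
        src = nodes.get(r["source"])
        tgt = nodes.get(r["target"])
        # Keep a relationship only if both endpoints survived filtering.
        if src and tgt and r["type"] in schema.allowed_relationships:
            rels.append(Relationship(source=src, target=tgt, type=r["type"]))
    return list(nodes.values()), rels


# Assumed example data, standing in for an LLM's structured output.
schema = GraphSchema({"Person", "Company"}, {"WORKS_AT"})
payload = {
    "nodes": [
        {"id": "Ada", "type": "Person"},
        {"id": "Acme", "type": "Company"},
        {"id": "Paris", "type": "City"},
    ],
    "relationships": [
        {"source": "Ada", "target": "Acme", "type": "WORKS_AT"},
        {"source": "Ada", "target": "Paris", "type": "LIVES_IN"},
    ],
}
nodes, rels = parse_extraction(payload, schema)
```

In this sketch the off-schema `City` node and `LIVES_IN` relationship are discarded, which is one simple way a schema can keep the resulting knowledge graph consistent before anything is written to Neo4j.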
Constructing Knowledge Graphs From Unstructured Data With Gemini Models and LangChain
Gemini’s advanced language capabilities and new function-calling features enable it to identify entities, types, and relationships from unstructured text and extract them in a structured way. Using the llm-graph-transformer that Neo4j contributed to LangChain, developers can turn any set of LangChain documents (PDFs, web pages, Google Docs, and more) into a knowledge graph. By providing a specific prompt with the instruction and an optional graph schema with the unstructured text, developers can guide the LLM to extract information and populate structured output via pre-defined objects for nodes and relationships. With Gemini’s initial support for function calls, developers can use the model in the extraction pipeline.
Powering Real-Time GraphRAG Applications With Dataflow Flex Templates
Our new integrations allow developers to build GraphRAG applications using real-time data. They can use Dataflow Flex templates to set up secure, efficient pipelines for real-time data movement from BigQuery and Google Cloud Storage into their Neo4j Graph Database on Google Cloud. The knowledge graphs powering GraphRAG applications are continuously updated, enabling them to generate more accurate, timely, and contextually relevant responses.
Architecture diagram showing Dataflow from Google Cloud to Neo4j
Below, we use Dataflow Flex templates to run a job that reads from a Google Cloud Storage bucket and transforms the data into a Neo4j knowledge graph. As the data is updated in Google Cloud, we can create a job that checks for changes and updates the knowledge graph in real time. The graph within Neo4j can then be used with a RAG architecture for greater LLM accuracy and explainability.
A sample job run in Dataflow to get data from Google Cloud to Neo4j to create a knowledge graph
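The row-to-graph transformation that such a job performs can be pictured with a small sketch. This is not the Flex template itself or its job specification format; it only illustrates the general idea of batching tabular rows and writing them with an idempotent Cypher statement. The CSV columns, the `Customer` and `Product` labels, the `PURCHASED` relationship, and the `rows_to_batches` helper are all assumptions made for the example.

```python
import csv
import io

# Hypothetical CSV export, standing in for a file in a Cloud Storage bucket.
CSV_TEXT = """customer_id,customer_name,product_id,product_name
c1,Ada,p1,Laptop
c2,Grace,p1,Laptop
"""

# MERGE matches an existing node or relationship before creating one, so
# re-running the same rows after an update does not create duplicates.
UPSERT_QUERY = (
    "UNWIND $rows AS row "
    "MERGE (c:Customer {id: row.customer_id}) SET c.name = row.customer_name "
    "MERGE (p:Product {id: row.product_id}) SET p.name = row.product_name "
    "MERGE (c)-[:PURCHASED]->(p)"
)


def rows_to_batches(csv_text, batch_size=1000):
    """Yield lists of row dicts, each list sized for one write transaction."""
    reader = csv.DictReader(io.StringIO(csv_text))
    batch = []
    for row in reader:
        batch.append(dict(row))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


# A batch size of 1 is used here only to show the batching behavior.
batches = list(rows_to_batches(CSV_TEXT, batch_size=1))
```

With the official Neo4j Python driver, each batch could be passed as the `rows` query parameter, for example `session.run(UPSERT_QUERY, rows=batch)`. Passing data as parameters rather than building query strings keeps the statement reusable across batches.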
Organizations can combine Dataflow Flex templates with Neo4j’s graph database to build sophisticated GraphRAG applications that adapt to rapidly changing data landscapes. With real-time data, these applications deliver more accurate, timely, and contextually rich insights, enhancing decision-making and user experience across domains and use cases.
Accelerating GenAI With Gemini for Google Workspace and Neo4j
We’ve worked closely with Google to improve the graph application development capabilities of Gemini for Google Workspace. Neo4j has provided extensive training data to Google, enabling Gemini to understand the Cypher query language and provide more comprehensive guidance to graph application developers. (The training data included Neo4j’s documentation, online courses, and knowledge base, as well as data from our text2cypher development efforts and crowdsourced, LLM-generated question-answer pairs.) The Gemini for Google Workspace developer assistant can now help developers create knowledge graphs in Neo4j by translating natural language into Cypher. Applications can combine vector search with knowledge graphs and use GraphRAG to ground LLMs for more explainable and accurate results. Once knowledge graphs are created within Neo4j, developers can explore the graph data to uncover hidden patterns and insights.
A code snippet in Google Workspace showing a generated Cypher query that loads data into Neo4j
Gemini for Google Workspace is also trained in integrating Neo4j with popular orchestration frameworks like LangChain, LlamaIndex, and Haystack, so it can offer framework-specific guidance to developers, further streamlining the development process. Gemini for Google Workspace is available for end users in Google Workspace and developers in the Google Cloud Platform (GCP) console, as well as in popular development environments like Visual Studio Code and JetBrains. This allows developers to tap into AI-assisted coding capabilities as they build the next generation of GenAI-enabled applications.