Python, OpenAI & GraphRAG
This guide shows you how to set up and run GraphRAG (Graph Retrieval-Augmented Generation) using Neo4j’s official GraphRAG package with OpenAI integration. The example uses a public demo database with the Goodreads dataset containing Book nodes and their related entities.
What is GraphRAG?
GraphRAG combines a knowledge graph with a large language model to answer questions by:
- Embedding your questions using OpenAI’s models
- Finding relevant information in the Neo4j graph database using vector similarity
- Generating comprehensive answers using OpenAI’s language models
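To make the vector similarity step concrete, here is a minimal sketch of ranking stored vectors against a question vector by cosine similarity. It is a brute-force illustration only: the three-dimensional vectors, the book titles, and the function names are invented for this example, and the real search in this guide runs inside the Neo4j vector index rather than in Python.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_k_similar(query_vector, documents, k=2):
    """Return the k (title, score) pairs most similar to the query vector."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in documents.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings, invented for illustration (real embeddings have many more dimensions)
toy_documents = {
    "Book A": [0.9, 0.1, 0.0],
    "Book B": [0.0, 1.0, 0.0],
    "Book C": [0.7, 0.3, 0.1],
}
toy_query = [1.0, 0.0, 0.0]

results = top_k_similar(toy_query, toy_documents, k=2)
for title, score in results:
    print(f"{title}: {score:.3f}")
```

The k most similar items are what a retriever would hand to the language model as context.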
Create a Project
Create a file named app.py.
Next, create a virtual environment, activate it, and install the Neo4j packages:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install neo4j "neo4j-graphrag[openai]"
Configure Credentials and Embedding Details
import os

from neo4j import GraphDatabase
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.retrievers import VectorRetriever

NEO4J_URI = "neo4j+s://demo.neo4jlabs.com"
NEO4J_USERNAME = "goodreads"
NEO4J_PASSWORD = "goodreads"
NEO4J_DATABASE = "goodreads"
VECTOR_INDEX_NAME = "book-descriptions"
LLM_MODEL = "gpt-4o"
EMBEDDING_MODEL = "text-embedding-3-small"
OPENAI_API_KEY = "your-openai-api-key-here"  # Replace with your key
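If you would rather not paste a key into source code, one option is to read it from the environment. The sketch below is an assumption about how you might do that, not part of the original sample: the helper name load_openai_api_key is hypothetical, and it treats the placeholder string from this guide as "not set".

```python
import os

PLACEHOLDER_KEY = "your-openai-api-key-here"  # the placeholder used in this guide


def load_openai_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper: the name and the error message are illustrative.
    """
    key = os.environ.get(env_var, "").strip()
    if not key or key == PLACEHOLDER_KEY:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your OpenAI API key."
        )
    return key


# Example usage (commented out so the module can be imported without a key):
# OPENAI_API_KEY = load_openai_api_key()
```

With this in place, the OPENAI_API_KEY constant can be assigned from the helper instead of a string literal, and the script fails early with a clear message when the key is missing.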
Create a Pipeline
The pipeline connects Neo4j to OpenAI. In this sample, we define a function that initializes the complete GraphRAG pipeline: an embedder, a language model, a vector retriever, and the GraphRAG object that ties them together.
def create_graphrag_pipeline():
    """Initialize the GraphRAG components"""
    # Set OpenAI API key
    os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

    # Connect to Neo4j
    driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))

    # Create embeddings and LLM
    embedder = OpenAIEmbeddings(model=EMBEDDING_MODEL)
    llm = OpenAILLM(model_name=LLM_MODEL, model_params={"temperature": 0.0})

    # Create retriever
    retriever = VectorRetriever(
        driver=driver,
        index_name=VECTOR_INDEX_NAME,
        embedder=embedder,
        neo4j_database=NEO4J_DATABASE
    )

    # Create GraphRAG pipeline
    rag = GraphRAG(retriever=retriever, llm=llm)
    return rag, driver
This function creates four key components:
- embedder: converts questions to vectors for similarity search
- llm: generates answers from the retrieved context using the configured OpenAI model
- retriever: finds relevant information in the Neo4j database
- rag: combines all components into a complete question-answering system
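The way these components fit together can be sketched with plain functions. The code below is a conceptual stand-in, not the library's internals: toy_rag_search, build_prompt, fake_retrieve, and fake_generate are all hypothetical names, the prompt layout is an assumption, and the book titles are placeholders.

```python
def build_prompt(question, passages):
    """Combine retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def toy_rag_search(question, retrieve, generate, top_k=5):
    """Retrieve context, build a prompt, then ask the generator for an answer."""
    passages = retrieve(question, top_k)
    prompt = build_prompt(question, passages)
    return generate(prompt)


def fake_retrieve(question, top_k):
    """Stand-in for a retriever: returns the first top_k placeholder passages."""
    corpus = [
        "Placeholder description of book one.",
        "Placeholder description of book two.",
        "Placeholder description of book three.",
    ]
    return corpus[:top_k]


def fake_generate(prompt):
    """Stand-in for an LLM: reports how many context passages it received."""
    passage_count = sum(1 for line in prompt.splitlines() if line.startswith("- "))
    return f"stub answer using {passage_count} passages"


print(toy_rag_search("Which books are about friendship?", fake_retrieve, fake_generate, top_k=2))
```

In the real pipeline, the retriever queries the Neo4j vector index and the generator calls the OpenAI model, but the flow from retrieval to prompt to answer follows the same order.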
Run a Query
The GraphRAG search API handles vector similarity search, context retrieval, and answer generation in a single call. In this sample, we define a function, ask_question, that accepts a natural language question and returns an AI-generated answer.
def ask_question(question, top_k=5):
    """Ask a question and get an AI answer from the book database"""
    rag, driver = create_graphrag_pipeline()
    try:
        print(f"🔍 Question: {question}")

        # Search and get answer
        response = rag.search(
            query_text=question,
            retriever_config={"top_k": top_k}
        )

        print(f"🤖 Answer: {response.answer}")
        return response.answer
    except Exception as e:
        print(f"❌ Error: {e}")
        return None
    finally:
        driver.close()
The function takes two parameters:
- question: your natural language question about books
- top_k: number of relevant books to consider for the answer (default: 5)
The function returns an AI-generated answer based on the most relevant books found in the database.
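Because ask_question builds a new pipeline and closes the driver on every call, asking several questions repeats the setup each time. A possible variation, sketched here under assumptions, creates the pipeline once and reuses it. The names ask_many and pipeline_factory are hypothetical, and the stub classes exist only so the sketch runs without Neo4j or OpenAI. In the real script you would pass create_graphrag_pipeline as the factory.

```python
from types import SimpleNamespace


def ask_many(questions, pipeline_factory, top_k=5):
    """Answer several questions with one pipeline, closing the driver at the end."""
    rag, driver = pipeline_factory()
    answers = {}
    try:
        for question in questions:
            response = rag.search(
                query_text=question,
                retriever_config={"top_k": top_k},
            )
            answers[question] = response.answer
    finally:
        driver.close()  # always release the connection, even if a search fails
    return answers


class _StubRag:
    """Stand-in for the GraphRAG object; returns a canned answer."""

    def search(self, query_text, retriever_config):
        return SimpleNamespace(answer=f"stub answer for: {query_text}")


class _StubDriver:
    """Stand-in for the Neo4j driver; records whether close() was called."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


stub_driver = _StubDriver()
answers = ask_many(["q1", "q2"], lambda: (_StubRag(), stub_driver), top_k=3)
print(answers)
```

Unlike ask_question, this sketch lets exceptions from rag.search propagate to the caller. Whether to catch them per question is a design choice left open here.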
Example Queries
The following examples show different types of questions you can ask the Goodreads book database.
if __name__ == "__main__":
    # Make sure to set your OpenAI API key above!

    # Ask questions about books
    questions = [
        "Which books are about motherhood and friendship?",
        "What are the best science fiction novels?",
        "Tell me about books that deal with mental health"
    ]

    for question in questions:
        print("=" * 60)
        ask_question(question, top_k=3)
        print()
Run the Script
You can run the script from the command line with:
python app.py
Make sure you have set your OpenAI API key in the code before running.
Sample Response
The first question should return a response similar to:
🔍 Question: Which books are about motherhood and friendship?
🤖 Answer: The books about motherhood and friendship are "Friendship Crisis" and "Good Harbor." "Friendship Crisis" explores the keys to forming emotionally supportive connections, which often include themes of motherhood and friendship. "Good Harbor" examines the balance of marriage, career, motherhood, and friendship, highlighting the redemptive power of friendship.
Resources
- Complete sample code: https://gist.github.com/jalakoo/968e7bd46b30f3bf64f771bdebdb7153
- Neo4j GraphRAG Documentation: https://neo4j.com/docs/neo4j-graphrag-python/current/index.html
- Neo4j Python Driver: https://github.com/neo4j/neo4j-python-driver
- Neo4j GraphAcademy GraphRAG courses: https://graphacademy.neo4j.com/knowledge-graph-rag/
- OpenAI API Documentation: https://platform.openai.com/docs


