
Python, OpenAI & GraphRAG

This guide shows you how to set up and run GraphRAG (Graph Retrieval-Augmented Generation) using Neo4j’s official GraphRAG package with OpenAI integration. The example uses a public demo database with the Goodreads dataset containing Book nodes and their related entities.

What is GraphRAG?

GraphRAG combines a knowledge graph with a language model to answer questions in three steps:

  1. Embedding your questions using OpenAI’s models
  2. Finding relevant information in the Neo4j graph database using vector similarity
  3. Generating comprehensive answers using OpenAI’s language models
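The three steps above can be sketched without any external services. This is a toy illustration only: embed is a hypothetical bag-of-words stand-in for an OpenAI embedding model, the in-memory BOOKS list (with invented entries) stands in for the Neo4j vector index, and generate_answer is a placeholder for the language model call.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory stand-in for Book nodes; the entries are made up
BOOKS = [
    {"title": "Good Harbor", "description": "marriage career motherhood and friendship"},
    {"title": "Star Voyage", "description": "science fiction space travel adventure"},
    {"title": "Quiet Mind", "description": "mental health anxiety and recovery"},
]

def embed(text):
    """Toy embedding: word counts. A real embedder returns a dense float vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(question, top_k=2):
    """Steps 1 and 2: embed the question, then rank books by similarity."""
    query_vector = embed(question)
    ranked = sorted(
        BOOKS,
        key=lambda book: cosine_similarity(query_vector, embed(book["description"])),
        reverse=True,
    )
    return ranked[:top_k]

def generate_answer(question, context):
    """Step 3 placeholder: a real pipeline sends the question and context to an LLM."""
    titles = ", ".join(book["title"] for book in context)
    return f"Question: {question} | Context passed to the model: {titles}"

if __name__ == "__main__":
    question = "Which books are about motherhood and friendship?"
    print(generate_answer(question, retrieve(question, top_k=1)))
```

In the real pipeline, the embedding and ranking happen inside the retriever and the Neo4j vector index, and the final step is a call to the OpenAI model.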

Create a Project

Create a file named app.py. Then create a virtual environment, activate it, and install the Neo4j driver and the GraphRAG package with its OpenAI extra:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install neo4j "neo4j-graphrag[openai]"

Configure Credentials and Embedding Details

import os
from neo4j import GraphDatabase
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.retrievers import VectorRetriever

NEO4J_URI = "neo4j+s://demo.neo4jlabs.com"
NEO4J_USERNAME = "goodreads"
NEO4J_PASSWORD = "goodreads"
NEO4J_DATABASE = "goodreads"
VECTOR_INDEX_NAME = "book-descriptions"
LLM_MODEL = "gpt-4o"
EMBEDDING_MODEL = "text-embedding-3-small"
OPENAI_API_KEY = "your-openai-api-key-here"  # Replace with your key
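
A key hardcoded in source is easy to commit by accident. One alternative, sketched here, is to read the key from an environment variable instead. The helper name load_openai_api_key is hypothetical and is not part of the neo4j or neo4j-graphrag packages.

```python
import os

PLACEHOLDER_KEY = "your-openai-api-key-here"

def load_openai_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key or key == PLACEHOLDER_KEY:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running app.py"
        )
    return key
```

With this helper, the assignment in the configuration becomes OPENAI_API_KEY = load_openai_api_key(), and the key is exported in the shell before the script runs.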

Create a Pipeline

The following function initializes the complete GraphRAG pipeline: it connects to Neo4j, creates the embedder and the language model, and wires them together through a vector retriever.

def create_graphrag_pipeline():
    """Initialize the GraphRAG components"""
    # Set OpenAI API key
    os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

    # Connect to Neo4j
    driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))

    # Create embeddings and LLM
    embedder = OpenAIEmbeddings(model=EMBEDDING_MODEL)
    llm = OpenAILLM(model_name=LLM_MODEL, model_params={"temperature": 0.0})

    # Create retriever
    retriever = VectorRetriever(
        driver=driver,
        index_name=VECTOR_INDEX_NAME,
        embedder=embedder,
        neo4j_database=NEO4J_DATABASE
    )

    # Create GraphRAG pipeline
    rag = GraphRAG(retriever=retriever, llm=llm)

    return rag, driver

This function creates four key components:

  • embedder: converts questions to vectors for similarity search
  • llm: generates answers from the retrieved context using the configured OpenAI model
  • retriever: finds relevant information in the Neo4j database
  • rag: combines all components into a complete question-answering system
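
To make the retriever step more concrete, the sketch below runs a vector index lookup directly through the driver using Neo4j's db.index.vector.queryNodes procedure. It approximates what a vector retriever does and is not the package's actual query. The function name vector_search and the title property on Book nodes are assumptions, and query_vector must come from the same embedding model that populated the index.

```python
# Cypher for a vector index lookup; node.title is an assumed property name
VECTOR_QUERY = """
CALL db.index.vector.queryNodes($index_name, $top_k, $query_vector)
YIELD node, score
RETURN node.title AS title, score
"""

def vector_search(driver, query_vector, index_name="book-descriptions",
                  top_k=5, database="goodreads"):
    """Return (title, score) pairs for the nodes nearest to query_vector."""
    records, _summary, _keys = driver.execute_query(
        VECTOR_QUERY,
        parameters_={
            "index_name": index_name,
            "top_k": top_k,
            "query_vector": query_vector,
        },
        database_=database,
    )
    return [(record["title"], record["score"]) for record in records]
```

The score returned by the index is a similarity value, so higher scores indicate closer matches.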

Run a Query

The GraphRAG search method handles vector similarity search, context retrieval, and answer generation in a single call. In this sample, the ask_question function accepts a natural language question, builds the pipeline, and returns the AI-generated answer.

def ask_question(question, top_k=5):
    """Ask a question and get an AI answer from the book database"""
    rag, driver = create_graphrag_pipeline()

    try:
        print(f"🔍 Question: {question}")

        # Search and get answer
        response = rag.search(
            query_text=question,
            retriever_config={"top_k": top_k}
        )

        print(f"🤖 Answer: {response.answer}")
        return response.answer

    except Exception as e:
        print(f"❌ Error: {e}")
        return None
    finally:
        driver.close()

The function takes two parameters:

  • question: your natural language question about books
  • top_k: number of relevant books to consider for the answer (default: 5)

The function returns an AI-generated answer based on the most relevant books found in the database.
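
Note that ask_question builds a new pipeline and opens a new driver on every call. When asking several questions in a row, one option is to build the pipeline once and close the driver at the end. This is a minimal sketch, assuming a factory that returns a (rag, driver) pair the way create_graphrag_pipeline does. The helper name ask_many is hypothetical.

```python
def ask_many(questions, pipeline_factory, top_k=5):
    """Answer several questions with one shared pipeline and driver."""
    rag, driver = pipeline_factory()
    answers = {}
    try:
        for question in questions:
            response = rag.search(
                query_text=question,
                retriever_config={"top_k": top_k},
            )
            answers[question] = response.answer
    finally:
        # Close the driver once, even if a search raises
        driver.close()
    return answers
```

A call such as ask_many(questions, create_graphrag_pipeline, top_k=3) returns a dictionary that maps each question to its answer.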

Example Queries

The following examples show different types of questions you can ask the Goodreads book database.

if __name__ == "__main__":
    # Make sure to set your OpenAI API key above!

    # Ask questions about books
    questions = [
        "Which books are about motherhood and friendship?",
        "What are the best science fiction novels?",
        "Tell me about books that deal with mental health"
    ]

    for question in questions:
        print("=" * 60)
        ask_question(question, top_k=3)
        print()

Running the Script

You can run the script from the command line with:

python app.py

Make sure you have set your OpenAI API key in the code before running.

Sample Response

The first question should return a response similar to:

🔍 Question: Which books are about motherhood and friendship?
🤖 Answer: The books about motherhood and friendship are "Friendship Crisis" and "Good Harbor." "Friendship Crisis" explores the keys to forming emotionally supportive connections, which often include themes of motherhood and friendship. "Good Harbor" examines the balance of marriage, career, motherhood, and friendship, highlighting the redemptive power of friendship.
