Neo4j Live: Personal Knowledge Vault with Neo4j GraphRAG
Managing personal and organisational knowledge is becoming more complex every day. Traditional systems force users into rigid structures and disconnected apps, leaving knowledge scattered and hard to retrieve.
In a recent Neo4j Live session, Mike Morley, Director of Machine Learning and AI at Arcurve, demonstrated a new approach: building a Personal Knowledge Vault using Neo4j’s GraphRAG pattern combined with FastAPI, Celery and RabbitMQ.
You can watch the full session above.
In this post, we’ll explore Mike’s approach to creating a scalable, flexible knowledge management system and how you can build your own.
The Challenge: Traditional Knowledge Management is Broken
Mike started with a clear problem statement: today’s knowledge systems rely on outdated metaphors like files, folders and disconnected applications.
- Users have to remember where and how they stored information.
- Information is trapped inside silos (SharePoint, OneDrive, Dropbox, etc.).
- Retrieval is error-prone, slow, and frustrating.
Even worse, cognitive load increases when users have to map their knowledge into strict taxonomies or application-specific formats. Mike’s goal? Build a system that works the way humans think, not the way machines demand.
Enter the Personal Knowledge Vault
Mike’s Personal Knowledge Vault is a modern, AI-powered solution to these problems. It’s built around three core ideas:
1. Capture Data Flexibly
Capture knowledge from any source – handwritten notes, books, web articles, voice recordings – without forcing users into rigid templates.
2. Structure with Graphs
Use Neo4j to store captured knowledge as richly linked graphs, naturally representing relationships between ideas, events, and concepts (see the sketch after this list).
3. Enable Smart Retrieval
Integrate LLMs with the graph to allow natural language queries, dynamic search, and powerful context-aware retrieval.
Together, these three ideas enable a fluid, human-centric way to capture and retrieve knowledge.
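To make the second idea a little more concrete, here is a minimal sketch of how a captured note could be written to Neo4j with the official Python driver. The labels, relationship types, and connection details are illustrative assumptions, not the actual schema from Mike's demo.

```python
# A minimal sketch, not Mike's actual schema: write a captured note, its source,
# and the concepts it mentions into Neo4j as a small subgraph.
from neo4j import GraphDatabase

URI = "neo4j+s://<your-aura-instance>.databases.neo4j.io"  # placeholder Aura URI
AUTH = ("neo4j", "<password>")

CAPTURE_NOTE = """
MERGE (n:Note {id: $note_id})
SET n.text = $text, n.captured_at = datetime()
MERGE (s:Source {name: $source})
MERGE (n)-[:CAPTURED_FROM]->(s)
WITH n
UNWIND $concepts AS concept
MERGE (c:Concept {name: concept})
MERGE (n)-[:MENTIONS]->(c)
"""

def capture_note(driver, note_id, text, source, concepts):
    # execute_query manages sessions and retries for us (driver 5.x)
    driver.execute_query(
        CAPTURE_NOTE,
        note_id=note_id, text=text, source=source, concepts=concepts,
        database_="neo4j",
    )

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    capture_note(
        driver,
        note_id="note-001",
        text="Graphs mirror the way ideas connect.",
        source="handwritten notebook",
        concepts=["knowledge graphs", "note taking"],
    )
```

Because every note, source, and concept becomes a node, later retrieval can follow relationships instead of guessing at folder paths.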
System Architecture Overview
Mike’s Personal Knowledge Vault integrates several technologies:
- Neo4j Aura — Graph database backend, storing structured document graphs.
- FastAPI — Python-based REST API layer, handling data ingestion and queries.
- Celery + RabbitMQ — Asynchronous processing engine for document decomposition and indexing.
- OpenAI API — For handwriting recognition, semantic chunking, and question-answering capabilities.
- LangChain + LangGraph — For autonomous agents managing research tasks and document curation.
The system is modular and scalable, making it ideal for both personal use cases and enterprise-grade deployments.
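As a rough illustration of how these pieces fit together, the snippet below shows a hypothetical FastAPI endpoint that accepts an upload and hands the heavy processing to a Celery task over RabbitMQ. The route, task name, and broker URL are assumptions for the sketch.

```python
# Hypothetical ingestion endpoint: FastAPI accepts the upload and hands the slow
# work to a Celery task over RabbitMQ so the request returns immediately.
from celery import Celery
from fastapi import FastAPI, UploadFile

app = FastAPI()
celery_app = Celery("vault", broker="amqp://guest:guest@localhost:5672//")  # RabbitMQ

@app.post("/documents")
async def ingest_document(file: UploadFile):
    text = (await file.read()).decode("utf-8", errors="ignore")
    # Enqueue by task name so the API process doesn't need to import worker code
    task = celery_app.send_task("vault.process_document", args=[file.filename, text])
    return {"task_id": task.id, "status": "queued"}
```

Enqueuing by task name keeps the API layer thin: it only needs the broker connection, while the workers own the ingestion logic.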
Real-World Use Cases
Mike’s Personal Knowledge Vault is more than a tech demo. It’s solving real problems:
- Helping users with memory challenges: Providing easy, conversational access to personal memories and notes.
- Enterprise knowledge management: Offering a scalable blueprint for companies to integrate and unify fragmented data sources.
- Research assistance: Allowing users to dynamically retrieve and curate information across personal and public knowledge bases.
Whether it’s personal productivity or business operations, the system shows how graph databases and AI can unlock new possibilities.
Key Features That Make It Stand Out
Seamless Integration of LLMs and Graphs
Instead of relying solely on a black-box LLM, Mike anchors all knowledge in a traceable graph structure. Every answer has a known source, which is essential for building trust in AI applications.
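One way to keep answers traceable is to store each text chunk with a link back to its source document and return that provenance alongside the retrieved context. The query below is a sketch under an assumed schema (Chunk, Document, HAS_CHUNK) and an assumed vector index name, not necessarily what the demo uses.

```python
# Sketch of source-traceable retrieval under an assumed schema: every Chunk keeps
# a HAS_CHUNK link back to its Document, so retrieved context carries provenance.
GROUNDED_CONTEXT = """
CALL db.index.vector.queryNodes('chunk_embeddings', $k, $query_embedding)
YIELD node AS chunk, score
MATCH (chunk)<-[:HAS_CHUNK]-(doc:Document)
RETURN chunk.text AS text, score, doc.title AS source, doc.uri AS uri
ORDER BY score DESC
"""

def retrieve_with_sources(driver, query_embedding, k=5):
    records, _, _ = driver.execute_query(
        GROUNDED_CONTEXT, k=k, query_embedding=query_embedding, database_="neo4j"
    )
    # Each snippet carries the document it came from, so the final LLM answer
    # can cite concrete sources instead of being a black box.
    return [r.data() for r in records]
```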
Asynchronous, Scalable Processing
Thanks to Celery and RabbitMQ, ingestion is non-blocking. The system easily handles long-running operations, large documents, and high-throughput scenarios.
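On the worker side, a Celery task consuming from RabbitMQ might look roughly like this. The task name, chunking strategy, and retry policy are placeholders; the real pipeline would do semantic chunking, embedding, and graph writes at this point.

```python
# Sketch of the worker side: a Celery task consuming from RabbitMQ processes
# documents off the request path. Chunking here is a naive stand-in for the
# semantic chunking, embedding, and graph writes the real pipeline would do.
from celery import Celery

celery_app = Celery("vault", broker="amqp://guest:guest@localhost:5672//")

@celery_app.task(name="vault.process_document", bind=True, max_retries=3)
def process_document(self, filename: str, text: str):
    try:
        chunks = [text[i:i + 1000] for i in range(0, len(text), 1000)]
        # ...embed each chunk and MERGE it into the Neo4j document graph here...
        return {"file": filename, "chunks": len(chunks)}
    except Exception as exc:
        # Back off and retry if a downstream service (LLM, database) hiccups
        raise self.retry(exc=exc, countdown=2 ** self.request.retries)
```

Running `celery -A <your-module> worker` alongside the API lets these tasks drain from the queue at whatever rate the workers can sustain.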
Flexible Knowledge Capture
Instead of forcing knowledge through rigid forms and taxonomies, the system lets users input messy, unstructured data, even handwritten notes, and does the heavy lifting to organise it.
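For handwritten input, one plausible capture step is to send a photo of the note to a vision-capable model for transcription before it enters the vault. The model name and prompt below are assumptions, not a prescription from the session.

```python
# Hedged sketch of handwritten capture: transcribe a photo of a note with a
# vision-capable OpenAI model before storing it. Model and prompt are assumptions.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def transcribe_handwritten_note(image_path: str) -> str:
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Transcribe this handwritten note verbatim."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content
```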
Multi-Agent Systems
Using LangGraph, Mike deployed autonomous AI agents for tasks like search, curation, summarisation, and writing – paving the way for dynamic workflows.
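A stripped-down LangGraph workflow, with a search node feeding a writer node, could look something like the sketch below. The state fields and node logic are placeholders standing in for the retrieval and LLM calls an actual agent would make.

```python
# Minimal LangGraph sketch, not the agents from the demo: a search node feeds a
# writer node, with placeholder logic standing in for retrieval and LLM calls.
from typing import TypedDict
from langgraph.graph import StateGraph, END

class VaultState(TypedDict):
    question: str
    context: str
    answer: str

def search_agent(state: VaultState) -> dict:
    # Placeholder: the real node would run hybrid retrieval against Neo4j
    return {"context": f"(retrieved context for: {state['question']})"}

def writer_agent(state: VaultState) -> dict:
    # Placeholder: the real node would prompt an LLM with the retrieved context
    return {"answer": f"Draft answer grounded in {state['context']}"}

graph = StateGraph(VaultState)
graph.add_node("search", search_agent)
graph.add_node("write", writer_agent)
graph.set_entry_point("search")
graph.add_edge("search", "write")
graph.add_edge("write", END)
workflow = graph.compile()

result = workflow.invoke({"question": "What did I note about graph databases?"})
print(result["answer"])
```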
Key Takeaways for Developers
- Store knowledge in a graph to make it structured, flexible and traceable.
- Process documents asynchronously with tools like Celery for better scalability.
- Chunk documents semantically to improve retrieval relevance and context.
- Use hybrid retrieval combining vector search and graph traversal (sketched below).
- Prioritise source traceability to build trust in AI-assisted answers.
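To illustrate the hybrid retrieval takeaway, here is a sketch that combines a vector index lookup with a short graph traversal to pull in related chunks that mention the same concepts. The index name, labels, and relationship types are assumptions.

```python
# Sketch of hybrid retrieval under an assumed schema: a vector index finds
# semantically similar chunks, then a short traversal pulls in related chunks
# that mention the same concepts, widening the context handed to the LLM.
HYBRID_RETRIEVAL = """
CALL db.index.vector.queryNodes('chunk_embeddings', $k, $query_embedding)
YIELD node AS chunk, score
MATCH (chunk)<-[:HAS_CHUNK]-(doc:Document)
OPTIONAL MATCH (chunk)-[:MENTIONS]->(:Concept)<-[:MENTIONS]-(related:Chunk)
RETURN chunk.text AS text,
       score,
       doc.title AS source,
       collect(DISTINCT related.text)[..3] AS related_chunks
ORDER BY score DESC
"""

def hybrid_retrieve(driver, query_embedding, k=5):
    records, _, _ = driver.execute_query(
        HYBRID_RETRIEVAL, k=k, query_embedding=query_embedding, database_="neo4j"
    )
    return [r.data() for r in records]
```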