LLM Knowledge Graph Builder — First Release of 2025
Head of Product Innovation & Developer Strategy, Neo4j
New features include community summaries, parallel retrievers, and expanded model support for better knowledge graph construction from text
Background
Many developers try to build retrieval-augmented generation (RAG) experiences over unstructured data using only vector search, and struggle to get the results they want. Looking at text fragments without context only gets you so far. As usual in data engineering, there are more advanced patterns for preprocessing the data and extracting knowledge, one of which is GraphRAG. By the time you get around to using the data, you’ve surfaced the underlying concepts and can use them to connect the pieces and provide relevant context for a user’s questions.
We built, open-sourced, and hosted the LLM Knowledge Graph Builder to let you try out better ways of treating your unstructured data. We preprocess documents, transcripts, web articles, and other sources into chunks, compute text embeddings, and connect them (lexical graph).
But we don’t stop there. We also extract entities and their relationships, which is especially relevant if you ingest multiple documents because you can relate the pieces spread out over multiple sources (entity graph).
This combined knowledge graph then enables a set of different retrievers to fetch data (see below).
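To make the idea of relating pieces across sources concrete, here is a minimal in-memory sketch. It is not the tool's implementation: the `Chunk` class and both helper functions are hypothetical names invented for illustration, and the entity sets are assumed to have been extracted already (in the real tool, by an LLM).

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a chunk node in the lexical graph."""
    chunk_id: str
    document: str
    text: str
    # Entity names assumed to be already extracted (by an LLM in the real tool).
    entities: set[str] = field(default_factory=set)


def index_by_entity(chunks: list[Chunk]) -> dict[str, list[Chunk]]:
    """Map each entity name to every chunk that mentions it."""
    index: dict[str, list[Chunk]] = defaultdict(list)
    for chunk in chunks:
        for entity in chunk.entities:
            index[entity].append(chunk)
    return index


def related_chunks(chunk: Chunk, index: dict[str, list[Chunk]]) -> list[Chunk]:
    """Return chunks from OTHER documents that share at least one entity."""
    found: dict[str, Chunk] = {}
    for entity in chunk.entities:
        for other in index.get(entity, []):
            if other.document != chunk.document:
                found[other.chunk_id] = other
    # Sort by id so the result order is deterministic.
    return sorted(found.values(), key=lambda c: c.chunk_id)
```

With only a vector index, two chunks from different documents are connected only if their text happens to be similar. Here, a shared entity is enough to connect them, which is the kind of cross-source context the entity graph is meant to provide.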
Since we launched the LLM Knowledge Graph Builder in June 2024, we’ve had an impressive amount of usage and great feedback from users. It’s now the fourth most popular source of user interaction on AuraDB Free, which makes us really happy.
We did a release in fall 2024, but too many AI events took up my time for me to write a blog post about it. Over the past few months, the team has worked on some really nice features, some of which we want to introduce today in the first release of 2025.
What Does the LLM Knowledge Graph Builder Do?
For those of you who don’t know what the tool does, here’s a quick introduction.
If you have a number of text documents, web articles, Wikipedia pages, or similar unstructured information, wouldn’t it be great to surface all the knowledge hidden inside those in a structured way and then use those entities and their relationships to better chat with your data?
The LLM Knowledge Graph Builder:
- Imports your documents
- Splits them into chunks and links them up
- Generates text embeddings for vector search and connects the most similar ones
- Uses a variety of large language models (LLMs) to extract entities and their relationships, optionally using a graph schema you provide
- Stores the nodes and relationships in Neo4j
- When running against a Graph Data Science-enabled Neo4j instance, also performs topic clustering and summarization
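The chunking and similarity steps in the list above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the tool's code: the bag-of-words `toy_embedding` stands in for a real embedding model, the function names, word-based chunk size, and threshold are hypothetical, and the `NEXT_CHUNK` and `SIMILAR` edge labels are illustrative.

```python
import math
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 8) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def toy_embedding(chunk: str) -> Counter:
    """Assumption: a bag-of-words count vector instead of a real embedding model."""
    return Counter(word.lower().strip(".,") for word in chunk.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_lexical_graph(text: str, chunk_size: int = 8, threshold: float = 0.5):
    """Return (chunks, edges); edges are (source_index, label, target_index) tuples."""
    chunks = split_into_chunks(text, chunk_size)
    vectors = [toy_embedding(c) for c in chunks]
    # Link each chunk to the one that follows it in the document.
    edges = [(i, "NEXT_CHUNK", i + 1) for i in range(len(chunks) - 1)]
    # Connect every pair of chunks whose similarity reaches the threshold.
    for i in range(len(chunks)):
        for j in range(i + 1, len(chunks)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                edges.append((i, "SIMILAR", j))
    return chunks, edges
```

The all-pairs comparison is quadratic in the number of chunks, which is fine for a sketch. At real scale, a vector index would typically be used to find each chunk's nearest neighbors instead.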
Get a quick overview of the process and try it out at https://llm-graph-builder.neo4jlabs.com.
The only prerequisite is a publicly accessible Neo4j instance to store your data, which you can create on AuraDB Free (or Aura Pro Trial with Graph Data Science).