Build a Knowledge Graph from Documents (TypeScript)
A step-by-step tutorial to extract entities and relationships from a corpus of text and persist them as a queryable knowledge graph using the TypeScript SDK and the hosted NAMS service.
In this tutorial we will turn a small set of company / product / person descriptions into a knowledge graph: an LLM extracts structured entities and relationships from each document, we persist them via @neo4j-labs/agent-memory, and then we query the graph by semantic search and by relationship traversal. By the end you’ll be able to point any TS agent at the same backend and have it use that graph.
What You’ll Learn
- How to use an LLM with structured output to extract entities from free-text.
- How to persist entities and typed relationships via memory.longTerm.addEntity and addRelationship.
- How to search the graph by semantic similarity (searchEntities).
- How to walk relationships outward from a known entity (getRelatedEntities).
- How another agent in the same backend reads what this one wrote.
Prerequisites
- Node.js 20 or later.
- OPENAI_API_KEY and MEMORY_API_KEY set.
- Basic familiarity with TypeScript and async/await.
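A missing key usually surfaces only when the first API call fails, so it can help to check the environment up front. The following is a minimal sketch: `missingEnvVars` is a hypothetical helper, not part of the SDK, and it takes the environment as a parameter so the example runs without any real keys.

```typescript
// Hypothetical helper (not part of @neo4j-labs/agent-memory): report which
// required environment variables are unset or blank.
function missingEnvVars(
  required: readonly string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((name) => !env[name]?.trim());
}

// In a real script you would pass process.env. A literal object is used here
// so the example runs anywhere.
const missing = missingEnvVars(["OPENAI_API_KEY", "MEMORY_API_KEY"], {
  OPENAI_API_KEY: "sk-example",
});
if (missing.length > 0) {
  console.log(`Missing environment variables: ${missing.join(", ")}`);
}
// prints "Missing environment variables: MEMORY_API_KEY"
```

Calling this at the top of src/build.ts and src/query.ts, and exiting early when the list is non-empty, gives a clearer error than a failed request later on.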
What We’re Building
A small "product / company / founder" knowledge graph built from five short documents. The schema we’ll end up with:
(:Entity:Person {name})-[:FOUNDED]->(:Entity:Organization {name})
(:Entity:Organization {name})-[:MAKES]->(:Entity:Product {name})
(:Entity:Organization {name})-[:HEADQUARTERED_IN]->(:Entity:Location {name})
NAMS uses lowercase types ("person", "organization", "location", "concept"); we’ll use "concept" for products since NAMS doesn’t have a built-in product type.
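One way to keep this label-to-type decision in a single place is a small lookup table. This is only a sketch: `NamsType`, `NAMS_TYPE_FOR_LABEL`, and `namsTypeFor` are illustrative names, not SDK exports, and the table covers only the four types this tutorial uses.

```typescript
// The four NAMS entity types used in this tutorial.
type NamsType = "person" | "organization" | "location" | "concept";

// Labels from the target schema mapped to the NAMS type we store them under.
// "Product" has no built-in equivalent, so it is stored as "concept".
const NAMS_TYPE_FOR_LABEL: Record<string, NamsType> = {
  Person: "person",
  Organization: "organization",
  Location: "location",
  Product: "concept",
};

// Illustrative helper: fail loudly on a label the table does not cover.
function namsTypeFor(label: string): NamsType {
  const type = NAMS_TYPE_FOR_LABEL[label];
  if (type === undefined) {
    throw new Error(`No NAMS type configured for label "${label}"`);
  }
  return type;
}

console.log(namsTypeFor("Product")); // prints "concept"
```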
Step 1: Project setup
mkdir kg-from-docs && cd kg-from-docs
npm init -y
npm install @neo4j-labs/agent-memory ai @ai-sdk/openai zod
npm install --save-dev typescript tsx @types/node
Set up tsconfig.json and add "type": "module" to package.json, as in the previous tutorial.
Step 2: Prepare sample documents
Create src/documents.ts:
export const documents = [
  {
    id: "doc-1",
    text: "Brian Chesky co-founded Airbnb in 2008. The company is headquartered in San Francisco and operates an online marketplace for short-term lodging.",
  },
  {
    id: "doc-2",
    text: "Phil Knight founded Nike in 1964. Nike, based in Beaverton, Oregon, makes the Air Max running shoes and the Vaporfly racing shoes.",
  },
  {
    id: "doc-3",
    text: "Steve Jobs co-founded Apple in 1976. Apple, headquartered in Cupertino, California, makes the iPhone and the MacBook.",
  },
  {
    id: "doc-4",
    text: "Reed Hastings co-founded Netflix in 1997. Based in Los Gatos, California, Netflix operates a streaming service.",
  },
  {
    id: "doc-5",
    text: "Travis Kalanick co-founded Uber in 2009. Uber is headquartered in San Francisco and operates ride-sharing and food-delivery services.",
  },
];
Step 3: Define an extraction schema with Zod
Create src/schema.ts:
import { z } from "zod";

export const ExtractionSchema = z.object({
  people: z.array(
    z.object({
      name: z.string(),
      description: z.string().optional(),
    }),
  ),
  organizations: z.array(
    z.object({
      name: z.string(),
      description: z.string().optional(),
      headquartersLocation: z.string().optional(),
    }),
  ),
  locations: z.array(
    z.object({
      name: z.string(),
    }),
  ),
  products: z.array(
    z.object({
      name: z.string(),
      makerOrganization: z.string(),
    }),
  ),
  relationships: z.array(
    z.object({
      sourceName: z.string(),
      targetName: z.string(),
      type: z.enum(["FOUNDED", "MAKES", "HEADQUARTERED_IN"]),
    }),
  ),
});

export type Extraction = z.infer<typeof ExtractionSchema>;
The schema is what we’ll hand to the LLM as a structured-output target. Zod gives us runtime validation and a TypeScript type.
Step 4: Extract structured entities with the LLM
Create src/extract.ts:
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { ExtractionSchema, type Extraction } from "./schema.js";

export async function extractFromDocument(text: string): Promise<Extraction> {
  const { object } = await generateObject({
    model: openai("gpt-4o-mini"),
    schema: ExtractionSchema,
    prompt: `Extract people, organizations, locations, products, and the
relationships between them from the following text.
For relationships use these types:
- FOUNDED: person founded organization
- MAKES: organization makes product
- HEADQUARTERED_IN: organization is headquartered in location
Text:
${text}`,
  });
  return object;
}
generateObject is the Vercel AI SDK’s structured-output helper. It enforces our Zod schema on the model’s response.
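Schema validation checks the shape of the response, but it cannot check that every relationship's sourceName and targetName refers to an entity the model actually listed. A small consistency check can catch those cases before anything is persisted. The sketch below is an illustration only: `ExtractionLike` is a local stand-in for the Extraction type so the snippet runs on its own, and `findDanglingRelationships` is a hypothetical helper name.

```typescript
// Local mirror of the fields this check needs. In the project you would use
// the Extraction type from ./schema.js instead.
interface Named {
  name: string;
}
interface Relationship {
  sourceName: string;
  targetName: string;
  type: string;
}
interface ExtractionLike {
  people: Named[];
  organizations: Named[];
  locations: Named[];
  products: Named[];
  relationships: Relationship[];
}

// Hypothetical helper: return the relationships whose source or target name
// does not match any extracted entity name exactly.
function findDanglingRelationships(extraction: ExtractionLike): Relationship[] {
  const known = new Set(
    [
      ...extraction.people,
      ...extraction.organizations,
      ...extraction.locations,
      ...extraction.products,
    ].map((entity) => entity.name),
  );
  return extraction.relationships.filter(
    (rel) => !known.has(rel.sourceName) || !known.has(rel.targetName),
  );
}

const sample: ExtractionLike = {
  people: [{ name: "Phil Knight" }],
  organizations: [{ name: "Nike" }],
  locations: [],
  products: [],
  relationships: [
    { sourceName: "Phil Knight", targetName: "Nike", type: "FOUNDED" },
    { sourceName: "Nike", targetName: "Beaverton, Oregon", type: "HEADQUARTERED_IN" },
  ],
};
console.log(findDanglingRelationships(sample).map((rel) => rel.type));
// prints [ 'HEADQUARTERED_IN' ]
```

Logging or retrying on a non-empty result is a design choice; the point of the check is that a mismatch becomes visible at extraction time rather than as a silently skipped edge later.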
Step 5: Persist the extraction into the knowledge graph
Create src/build.ts:
import { MemoryClient } from "@neo4j-labs/agent-memory";
import { documents } from "./documents.js";
import { extractFromDocument } from "./extract.js";
import type { Extraction } from "./schema.js";

const memory = new MemoryClient();

async function persist(extraction: Extraction): Promise<Map<string, string>> {
  // Map entity name -> persisted entity id, for relationship wiring.
  const nameToId = new Map<string, string>();

  // 1. People
  for (const person of extraction.people) {
    const entity = await memory.longTerm.addEntity(person.name, "person", {
      description: person.description,
    });
    nameToId.set(person.name, entity.id);
  }

  // 2. Organizations
  for (const org of extraction.organizations) {
    const entity = await memory.longTerm.addEntity(org.name, "organization", {
      description: org.description,
    });
    nameToId.set(org.name, entity.id);
  }

  // 3. Locations
  for (const loc of extraction.locations) {
    const entity = await memory.longTerm.addEntity(loc.name, "location");
    nameToId.set(loc.name, entity.id);
  }

  // 4. Products (NAMS doesn't have a built-in product type; use "concept")
  for (const product of extraction.products) {
    const entity = await memory.longTerm.addEntity(product.name, "concept", {
      description: `Product made by ${product.makerOrganization}`,
    });
    nameToId.set(product.name, entity.id);
  }

  // 5. Relationships
  for (const rel of extraction.relationships) {
    const sourceId = nameToId.get(rel.sourceName);
    const targetId = nameToId.get(rel.targetName);
    if (!sourceId || !targetId) {
      console.warn(`Skipping ${rel.type}: missing endpoint`);
      continue;
    }
    await memory.longTerm.addRelationship(sourceId, targetId, rel.type);
  }

  return nameToId;
}

async function main() {
  for (const doc of documents) {
    process.stdout.write(`\nDocument ${doc.id} — extracting... `);
    const extraction = await extractFromDocument(doc.text);
    process.stdout.write(
      `${extraction.people.length}p, ${extraction.organizations.length}o, ` +
        `${extraction.locations.length}l, ${extraction.products.length}prod, ` +
        `${extraction.relationships.length}rel\n`,
    );
    await persist(extraction);
  }
  console.log("\nKnowledge graph built. Move on to Step 6 to query it.");
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
Run it:
npx tsx src/build.ts
Expected output (counts can vary slightly between runs because the extraction is model-generated):
Document doc-1 — extracting... 1p, 1o, 1l, 0prod, 2rel
Document doc-2 — extracting... 1p, 1o, 1l, 2prod, 4rel
Document doc-3 — extracting... 1p, 1o, 1l, 2prod, 4rel
Document doc-4 — extracting... 1p, 1o, 1l, 0prod, 2rel
Document doc-5 — extracting... 1p, 1o, 1l, 0prod, 2rel
Knowledge graph built. Move on to Step 6 to query it.
NAMS deduplicates entities server-side, so re-running this script is idempotent.
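Server-side deduplication does not help with the local nameToId lookup in persist, which matches names by exact string. If the model writes "Nike" in the entity list and "nike " in a relationship, that relationship is skipped. One possible mitigation, sketched below under the assumption that case and whitespace differences are never meaningful in your corpus, is to normalize keys before storing and looking them up. `nameKey` and `NameIndex` are hypothetical names, not part of the SDK.

```typescript
// Hypothetical normalization: trim, collapse internal whitespace, lowercase.
// This treats "Nike", " nike" and "NIKE" as the same key, which may be too
// aggressive if your data contains names that differ only by case.
function nameKey(name: string): string {
  return name.trim().replace(/\s+/g, " ").toLowerCase();
}

// Drop-in replacement for the Map<string, string> used in persist.
class NameIndex {
  private readonly ids = new Map<string, string>();

  set(name: string, id: string): void {
    this.ids.set(nameKey(name), id);
  }

  get(name: string): string | undefined {
    return this.ids.get(nameKey(name));
  }
}

const index = new NameIndex();
index.set("Beaverton,  Oregon", "entity-1");
console.log(index.get(" beaverton, oregon ")); // prints "entity-1"
```

This only smooths over formatting noise. Genuinely different spellings, such as "Nike" versus "Nike, Inc.", still need either a stricter prompt or a separate alias step.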
Step 6: Query the knowledge graph
Create src/query.ts:
import { MemoryClient } from "@neo4j-labs/agent-memory";

const memory = new MemoryClient();

async function main() {
  // 1. Semantic search across people
  console.log("\n=== Founders ===");
  const founders = await memory.longTerm.searchEntities(
    "company founder",
    { type: "person", limit: 5 },
  );
  for (const f of founders) {
    console.log(` ${f.name} — ${f.description ?? "(no description)"}`);
  }

  // 2. Walk relationships outward from a known entity
  console.log("\n=== What does Nike make? ===");
  const nike = await memory.longTerm.getEntityByName("Nike");
  if (nike) {
    const products = await memory.longTerm.getRelatedEntities(nike.id, {
      relationshipType: "MAKES",
    });
    for (const p of products) {
      console.log(` Nike makes: ${p.name}`);
    }
  }

  // 3. Multi-hop traversal
  console.log("\n=== Where is each founder's company headquartered? ===");
  for (const founder of founders) {
    const orgs = await memory.longTerm.getRelatedEntities(founder.id, {
      relationshipType: "FOUNDED",
    });
    for (const org of orgs) {
      const locations = await memory.longTerm.getRelatedEntities(org.id, {
        relationshipType: "HEADQUARTERED_IN",
      });
      for (const loc of locations) {
        console.log(` ${founder.name} → ${org.name} → ${loc.name}`);
      }
    }
  }
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
Run it:
npx tsx src/query.ts
Expected output:
=== Founders ===
 Brian Chesky — co-founded Airbnb in 2008
 Phil Knight — founded Nike in 1964
 Steve Jobs — co-founded Apple in 1976
 Reed Hastings — co-founded Netflix in 1997
 Travis Kalanick — co-founded Uber in 2009
=== What does Nike make? ===
 Nike makes: Air Max
 Nike makes: Vaporfly
=== Where is each founder's company headquartered? ===
 Brian Chesky → Airbnb → San Francisco
 Phil Knight → Nike → Beaverton, Oregon
 Steve Jobs → Apple → Cupertino, California
 Reed Hastings → Netflix → Los Gatos, California
 Travis Kalanick → Uber → San Francisco
We never wrote any Cypher. Three API calls (searchEntities, getEntityByName, and getRelatedEntities) covered semantic discovery, lookup by name, and graph traversal.
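The nested loops in the multi-hop query generalize to any fixed sequence of relationship types. The sketch below shows one way to factor that out. `walkPath`, `EntityRef`, and `GetRelated` are illustrative names, and the lookup function is injected so the example can run against an in-memory stub rather than the real service.

```typescript
interface EntityRef {
  id: string;
  name: string;
}

// The lookup the walker needs. With the SDK, this role could be filled by
// (id, type) => memory.longTerm.getRelatedEntities(id, { relationshipType: type }).
type GetRelated = (entityId: string, relationshipType: string) => Promise<EntityRef[]>;

// Follow a fixed sequence of relationship types from a start entity and
// return every complete path. Paths that dead-end early are dropped.
async function walkPath(
  start: EntityRef,
  relationshipTypes: readonly string[],
  getRelated: GetRelated,
): Promise<EntityRef[][]> {
  let paths: EntityRef[][] = [[start]];
  for (const relType of relationshipTypes) {
    const extended: EntityRef[][] = [];
    for (const path of paths) {
      const tail = path[path.length - 1];
      if (!tail) continue;
      for (const neighbor of await getRelated(tail.id, relType)) {
        extended.push([...path, neighbor]);
      }
    }
    paths = extended;
  }
  return paths;
}

// In-memory stand-in for the graph, keyed by "<entityId>:<relationshipType>".
const edges: Record<string, EntityRef[]> = {
  "p1:FOUNDED": [{ id: "o1", name: "Nike" }],
  "o1:HEADQUARTERED_IN": [{ id: "l1", name: "Beaverton, Oregon" }],
};
const getRelatedStub: GetRelated = async (entityId, relationshipType) =>
  edges[`${entityId}:${relationshipType}`] ?? [];

walkPath({ id: "p1", name: "Phil Knight" }, ["FOUNDED", "HEADQUARTERED_IN"], getRelatedStub)
  .then((paths) => {
    for (const path of paths) {
      console.log(path.map((entity) => entity.name).join(" -> "));
    }
  });
// prints "Phil Knight -> Nike -> Beaverton, Oregon"
```

Each hop costs one request per path, so this approach suits short traversals over small result sets, which is the case in this tutorial.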
Step 7: Use the graph from a different agent
The whole point of a shared knowledge graph is that other agents can use what this one built. From any other process pointed at the same NAMS endpoint:
import { MemoryClient } from "@neo4j-labs/agent-memory";

const memory = new MemoryClient();

// A chat agent now has access to the entire graph.
const matches = await memory.longTerm.searchEntities(
  "Who founded the company that makes the Vaporfly?",
  { limit: 5 },
);
// Likely top matches (exact ranking depends on the semantic search):
// matches[0] = Vaporfly (concept)
// matches[1] = Nike (organization)
// matches[2] = Phil Knight (person)
Try this from a Python agent on the same backend — entity types map (PERSON↔person) and the data is the same. See Cross-Agent Memory Sharing.
What You’ve Built
- A workflow for going from documents → typed entities + relationships using an LLM with structured output.
- A queryable knowledge graph backed by NAMS, with no Neo4j to operate and no extraction pipeline to maintain.
- A graph that any other agent on the same backend can read.
Extending the Knowledge Graph
- Stream documents. Pipe a larger corpus through extractFromDocument and persist in batches.
- Custom entity types. NAMS’s "custom" type accepts a free-text subtype: call addEntity(name, "custom", { description }) with a subtype convention in the description.
- Provenance. After each extraction, record a reasoning trace tying entities back to the source document: call memory.reasoning.startTrace(documentId, "extract entities"), then recordToolCall per entity created. Query later with getEntityProvenance(entityId).
- Wikipedia enrichment. The bolt-side Python SDK can auto-enrich entities from Wikipedia. On NAMS, server-side enrichment is opt-in via account settings; see Use NAMS.
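For the batching idea in the first item above, a minimal sketch might look like the following. `chunk` and `processInBatches` are hypothetical helpers, and the batch size is an assumption you would tune against your own rate limits. In the tutorial's setting, the worker would be a function that calls extractFromDocument and then persist for one document.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `worker` over all items. Items within a batch run concurrently;
// batches run one after another, which caps the number of in-flight calls.
async function processInBatches<T, R>(
  items: readonly T[],
  size: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, size)) {
    results.push(...(await Promise.all(batch.map((item) => worker(item)))));
  }
  return results;
}

// Stand-in worker so the example runs without any API calls.
processInBatches(["doc-1", "doc-2", "doc-3"], 2, async (id) => id.toUpperCase())
  .then((results) => console.log(results));
// prints [ 'DOC-1', 'DOC-2', 'DOC-3' ]
```

Because Promise.all rejects on the first failure, one bad document aborts the whole run. Wrapping the worker in a try/catch that returns an error marker is one way to keep going.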
Next Steps
- Connect Claude Desktop to Your Memory (TypeScript): expose this graph through MCP so any MCP-aware client can query it.
- Work with Entities: full how-to with all the entity operations.
- Record Reasoning Traces: tie reasoning steps to the entities they touched.
See Also
- The POLE+O Data Model: the type taxonomy NAMS implements as person/organization/location/concept/tool/custom.