Build a Knowledge Graph from Documents (TypeScript)

A step-by-step tutorial to extract entities and relationships from a corpus of text and persist them as a queryable knowledge graph using the TypeScript SDK and the hosted NAMS service.

In this tutorial we will turn a small set of company / product / person descriptions into a knowledge graph: an LLM extracts structured entities and relationships from each document, we persist them via @neo4j-labs/agent-memory, and then we query the graph by semantic search and by relationship traversal. By the end you’ll be able to point any TS agent at the same backend and have it use that graph.

What You’ll Learn

  • How to use an LLM with structured output to extract entities from free text.

  • How to persist entities and typed relationships via memory.longTerm.addEntity and addRelationship.

  • How to search the graph by semantic similarity (searchEntities).

  • How to walk relationships outward from a known entity (getRelatedEntities).

  • How another agent on the same backend reads what this one wrote.

Prerequisites

  • Node.js 20 or later.

  • OPENAI_API_KEY and MEMORY_API_KEY set.

  • Basic familiarity with TypeScript and async/await.

Time Required

About 30 minutes.

What We’re Building

A small "product / company / founder" knowledge graph built from five short documents. The schema we’ll end up with:

(:Entity:Person {name})-[:FOUNDED]->(:Entity:Organization {name})
(:Entity:Organization {name})-[:MAKES]->(:Entity:Product {name})
(:Entity:Organization {name})-[:HEADQUARTERED_IN]->(:Entity:Location {name})

NAMS uses lowercase types ("person", "organization", "location", "concept"); we’ll use "concept" for products since NAMS doesn’t have a built-in product type.
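
The mapping from the diagram's labels to NAMS types can be written down as a small lookup. This is only a sketch: GraphLabel, NamsType, and toNamsType are hypothetical names, and the list of NAMS types is limited to the four quoted above.

```typescript
// Labels used in the schema diagram above.
type GraphLabel = "Person" | "Organization" | "Location" | "Product";

// The lowercase NAMS types quoted in this tutorial (assumed subset).
type NamsType = "person" | "organization" | "location" | "concept";

// Hypothetical lookup: which NAMS type each diagram label is stored under.
const LABEL_TO_NAMS_TYPE: Record<GraphLabel, NamsType> = {
  Person: "person",
  Organization: "organization",
  Location: "location",
  Product: "concept", // no built-in product type, so products become concepts
};

function toNamsType(label: GraphLabel): NamsType {
  return LABEL_TO_NAMS_TYPE[label];
}

console.log(toNamsType("Product")); // prints "concept"
```

Typing the table as Record<GraphLabel, NamsType> means the compiler flags any label that is added to the diagram without a matching NAMS type.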

Step 1: Project setup

mkdir kg-from-docs && cd kg-from-docs
npm init -y
npm install @neo4j-labs/agent-memory ai @ai-sdk/openai zod
npm install --save-dev typescript tsx @types/node

Create tsconfig.json and set "type": "module" in package.json, as in the previous tutorial.

Step 2: Prepare sample documents

Create src/documents.ts:

export const documents = [
  {
    id: "doc-1",
    text: "Brian Chesky co-founded Airbnb in 2008. The company is headquartered in San Francisco and operates an online marketplace for short-term lodging.",
  },
  {
    id: "doc-2",
    text: "Phil Knight founded Nike in 1964. Nike, based in Beaverton, Oregon, makes the Air Max running shoes and the Vaporfly racing shoes.",
  },
  {
    id: "doc-3",
    text: "Steve Jobs co-founded Apple in 1976. Apple, headquartered in Cupertino, California, makes the iPhone and the MacBook.",
  },
  {
    id: "doc-4",
    text: "Reed Hastings co-founded Netflix in 1997. Based in Los Gatos, California, Netflix operates a streaming service.",
  },
  {
    id: "doc-5",
    text: "Travis Kalanick co-founded Uber in 2009. Uber is headquartered in San Francisco and operates ride-sharing and food-delivery services.",
  },
];

Step 3: Define an extraction schema with Zod

Create src/schema.ts:

import { z } from "zod";

export const ExtractionSchema = z.object({
  people: z.array(
    z.object({
      name: z.string(),
      description: z.string().optional(),
    }),
  ),
  organizations: z.array(
    z.object({
      name: z.string(),
      description: z.string().optional(),
      headquartersLocation: z.string().optional(),
    }),
  ),
  locations: z.array(
    z.object({
      name: z.string(),
    }),
  ),
  products: z.array(
    z.object({
      name: z.string(),
      makerOrganization: z.string(),
    }),
  ),
  relationships: z.array(
    z.object({
      sourceName: z.string(),
      targetName: z.string(),
      type: z.enum(["FOUNDED", "MAKES", "HEADQUARTERED_IN"]),
    }),
  ),
});

export type Extraction = z.infer<typeof ExtractionSchema>;

The schema is what we’ll hand to the LLM as a structured-output target. Zod gives us runtime validation and a TypeScript type.

Step 4: Extract structured entities with the LLM

Create src/extract.ts:

import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { ExtractionSchema, type Extraction } from "./schema.js";

export async function extractFromDocument(text: string): Promise<Extraction> {
  const { object } = await generateObject({
    model: openai("gpt-4o-mini"),
    schema: ExtractionSchema,
    prompt: `Extract people, organizations, locations, products, and the
relationships between them from the following text.

For relationships use these types:
- FOUNDED: person founded organization
- MAKES: organization makes product
- HEADQUARTERED_IN: organization is headquartered in location

Text:
${text}`,
  });
  return object;
}

generateObject is the Vercel AI SDK’s structured-output helper. It enforces our Zod schema on the model’s response.
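
Schema enforcement covers the shape of the response, but the schema alone cannot guarantee that every relationship endpoint also appears in one of the entity arrays. A minimal sketch of that cross-check follows. findDanglingRelationships is a hypothetical helper, and ExtractionLike is a hand-written mirror of the parts of Extraction it needs, so the snippet runs on its own.

```typescript
// Hand-written mirror of the Extraction fields used here (assumption: in the
// tutorial project you would import the real type from ./schema.js instead).
type Named = { name: string };
type Relationship = {
  sourceName: string;
  targetName: string;
  type: "FOUNDED" | "MAKES" | "HEADQUARTERED_IN";
};
type ExtractionLike = {
  people: Named[];
  organizations: Named[];
  locations: Named[];
  products: Named[];
  relationships: Relationship[];
};

// Hypothetical helper: return relationships whose source or target name
// does not appear in any of the extracted entity arrays.
function findDanglingRelationships(e: ExtractionLike): Relationship[] {
  const known = new Set<string>(
    [...e.people, ...e.organizations, ...e.locations, ...e.products].map(
      (entity) => entity.name,
    ),
  );
  return e.relationships.filter(
    (rel) => !known.has(rel.sourceName) || !known.has(rel.targetName),
  );
}

const sample: ExtractionLike = {
  people: [{ name: "Phil Knight" }],
  organizations: [{ name: "Nike" }],
  locations: [{ name: "Beaverton, Oregon" }],
  products: [{ name: "Air Max" }],
  relationships: [
    { sourceName: "Phil Knight", targetName: "Nike", type: "FOUNDED" },
    // "Beaverton" does not match the location name "Beaverton, Oregon".
    { sourceName: "Nike", targetName: "Beaverton", type: "HEADQUARTERED_IN" },
  ],
};

const dangling = findDanglingRelationships(sample);
console.log(dangling.map((r) => `${r.type} -> ${r.targetName}`));
```

Running a check like this right after extraction surfaces name mismatches before anything is written to the graph.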

Step 5: Persist the extraction into the knowledge graph

Create src/build.ts:

import { MemoryClient } from "@neo4j-labs/agent-memory";
import { documents } from "./documents.js";
import { extractFromDocument } from "./extract.js";
import type { Extraction } from "./schema.js";

const memory = new MemoryClient();

async function persist(extraction: Extraction): Promise<Map<string, string>> {
  // Map entity name -> persisted entity id, for relationship wiring.
  const nameToId = new Map<string, string>();

  // 1. People
  for (const person of extraction.people) {
    const entity = await memory.longTerm.addEntity(person.name, "person", {
      description: person.description,
    });
    nameToId.set(person.name, entity.id);
  }

  // 2. Organizations
  for (const org of extraction.organizations) {
    const entity = await memory.longTerm.addEntity(org.name, "organization", {
      description: org.description,
    });
    nameToId.set(org.name, entity.id);
  }

  // 3. Locations
  for (const loc of extraction.locations) {
    const entity = await memory.longTerm.addEntity(loc.name, "location");
    nameToId.set(loc.name, entity.id);
  }

  // 4. Products (NAMS doesn't have a built-in product type; use "concept")
  for (const product of extraction.products) {
    const entity = await memory.longTerm.addEntity(product.name, "concept", {
      description: `Product made by ${product.makerOrganization}`,
    });
    nameToId.set(product.name, entity.id);
  }

  // 5. Relationships
  for (const rel of extraction.relationships) {
    const sourceId = nameToId.get(rel.sourceName);
    const targetId = nameToId.get(rel.targetName);
    if (!sourceId || !targetId) {
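      // Name both endpoints so a skipped relationship can be traced to its document.
      console.warn(
        `Skipping ${rel.type} (${rel.sourceName} -> ${rel.targetName}): missing endpoint`,
      );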
      continue;
    }
    await memory.longTerm.addRelationship(sourceId, targetId, rel.type);
  }

  return nameToId;
}

async function main() {
  for (const doc of documents) {
    process.stdout.write(`\nDocument ${doc.id} — extracting... `);
    const extraction = await extractFromDocument(doc.text);
    process.stdout.write(
      `${extraction.people.length}p, ${extraction.organizations.length}o, ` +
        `${extraction.locations.length}l, ${extraction.products.length}prod, ` +
        `${extraction.relationships.length}rel\n`,
    );
    await persist(extraction);
  }

  console.log("\nKnowledge graph built. Move on to Step 6 to query it.");
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

Run it:

npx tsx src/build.ts

Expected output (exact counts can vary between model runs):

Document doc-1 — extracting... 1p, 1o, 1l, 0prod, 2rel
Document doc-2 — extracting... 1p, 1o, 1l, 2prod, 4rel
Document doc-3 — extracting... 1p, 1o, 1l, 2prod, 4rel
Document doc-4 — extracting... 1p, 1o, 1l, 0prod, 2rel
Document doc-5 — extracting... 1p, 1o, 1l, 0prod, 2rel

Knowledge graph built. Move on to Step 6 to query it.

NAMS deduplicates entities server-side, so re-running this script is idempotent.
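
The "missing endpoint" warning in persist fires whenever a relationship spells an entity name differently from the entity arrays, because nameToId is keyed on the exact string. One possible mitigation is to normalize keys before storing and looking them up. The sketch assumes that case and whitespace are the only differences worth ignoring. normalizeName and NameIndex are hypothetical names, and the id is made up.

```typescript
// Hypothetical helper: collapse case and whitespace so "Nike", " nike " and
// "NIKE" all resolve to the same key.
function normalizeName(name: string): string {
  return name.trim().replace(/\s+/g, " ").toLowerCase();
}

// Thin wrapper around Map that normalizes on both write and read.
class NameIndex {
  private readonly ids = new Map<string, string>();

  set(name: string, id: string): void {
    this.ids.set(normalizeName(name), id);
  }

  get(name: string): string | undefined {
    return this.ids.get(normalizeName(name));
  }
}

const index = new NameIndex();
index.set("Nike", "entity-123"); // "entity-123" is a made-up id
console.log(index.get("  NIKE ")); // prints "entity-123"
console.log(index.get("Adidas")); // prints undefined
```

This does not reconcile genuinely different strings such as "Beaverton" and "Beaverton, Oregon". Those still need a prompt instruction or a manual alias table.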

Step 6: Query the knowledge graph

Create src/query.ts:

import { MemoryClient } from "@neo4j-labs/agent-memory";

const memory = new MemoryClient();

async function main() {
  // 1. Semantic search across people
  console.log("\n=== Founders ===");
  const founders = await memory.longTerm.searchEntities(
    "company founder",
    { type: "person", limit: 5 },
  );
  for (const f of founders) {
    console.log(`  ${f.name} — ${f.description ?? "(no description)"}`);
  }

  // 2. Walk relationships outward from a known entity
  console.log("\n=== What does Nike make? ===");
  const nike = await memory.longTerm.getEntityByName("Nike");
  if (nike) {
    const products = await memory.longTerm.getRelatedEntities(nike.id, {
      relationshipType: "MAKES",
    });
    for (const p of products) {
      console.log(`  Nike makes: ${p.name}`);
    }
  }

  // 3. Multi-hop traversal
  console.log("\n=== Where is each founder's company headquartered? ===");
  for (const founder of founders) {
    const orgs = await memory.longTerm.getRelatedEntities(founder.id, {
      relationshipType: "FOUNDED",
    });
    for (const org of orgs) {
      const locations = await memory.longTerm.getRelatedEntities(org.id, {
        relationshipType: "HEADQUARTERED_IN",
      });
      for (const loc of locations) {
        console.log(`  ${founder.name} → ${org.name} → ${loc.name}`);
      }
    }
  }
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

Run it:

npx tsx src/query.ts

Expected output:

=== Founders ===
  Brian Chesky — co-founded Airbnb in 2008
  Phil Knight — founded Nike in 1964
  Steve Jobs — co-founded Apple in 1976
  Reed Hastings — co-founded Netflix in 1997
  Travis Kalanick — co-founded Uber in 2009

=== What does Nike make? ===
  Nike makes: Air Max
  Nike makes: Vaporfly

=== Where is each founder's company headquartered? ===
  Brian Chesky → Airbnb → San Francisco
  Phil Knight → Nike → Beaverton, Oregon
  Steve Jobs → Apple → Cupertino, California
  Reed Hastings → Netflix → Los Gatos, California
  Travis Kalanick → Uber → San Francisco

We never wrote any Cypher. Three API calls (searchEntities, getEntityByName, and getRelatedEntities) covered semantic discovery, lookup by name, and graph traversal.
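
The nested loops in the multi-hop query generalize to any chain of relationship types. The sketch below assumes a lookup function with the same call shape as the getRelatedEntities calls in query.ts (an entity id plus a relationshipType option). followPath, GetRelated, and the in-memory edges table are hypothetical stand-ins so the snippet runs without a NAMS account.

```typescript
type Entity = { id: string; name: string };

// Same call shape as memory.longTerm.getRelatedEntities in query.ts (assumed).
type GetRelated = (
  id: string,
  opts: { relationshipType: string },
) => Promise<Entity[]>;

// Hypothetical helper: follow a chain of relationship types from a start
// entity and return every complete path found.
async function followPath(
  start: Entity,
  types: string[],
  getRelated: GetRelated,
): Promise<Entity[][]> {
  let paths: Entity[][] = [[start]];
  for (const relationshipType of types) {
    const next: Entity[][] = [];
    for (const path of paths) {
      const tail = path[path.length - 1];
      const related = await getRelated(tail.id, { relationshipType });
      for (const entity of related) next.push([...path, entity]);
    }
    paths = next; // paths that found no neighbor are dropped
  }
  return paths;
}

// In-memory stand-in for the graph, keyed by "<sourceId>|<type>".
const edges: Record<string, Entity[]> = {
  "p1|FOUNDED": [{ id: "o1", name: "Nike" }],
  "o1|HEADQUARTERED_IN": [{ id: "l1", name: "Beaverton, Oregon" }],
};
const fakeGetRelated: GetRelated = async (id, { relationshipType }) =>
  edges[`${id}|${relationshipType}`] ?? [];

const pathsPromise = followPath(
  { id: "p1", name: "Phil Knight" },
  ["FOUNDED", "HEADQUARTERED_IN"],
  fakeGetRelated,
);
pathsPromise.then((paths) => {
  for (const path of paths) console.log(path.map((e) => e.name).join(" -> "));
});
```

Against the real client, the third argument would be a thin wrapper around memory.longTerm.getRelatedEntities. Each hop still costs one request per path, so long chains over many start entities can get slow.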

Step 7: Use the graph from a different agent

The whole point of a shared knowledge graph is that other agents can use what this one built. From any other process pointed at the same NAMS endpoint:

import { MemoryClient } from "@neo4j-labs/agent-memory";

const memory = new MemoryClient();

// A chat agent now has access to the entire graph.
const matches = await memory.longTerm.searchEntities(
  "Who founded the company that makes the Vaporfly?",
  { limit: 5 },
);
// Likely top matches (ranking depends on the embedding model):
//   Vaporfly (concept), Nike (organization), Phil Knight (person)

Try this from a Python agent on the same backend: entity types map between the SDKs (PERSON ↔ person) and the data is the same. See Cross-Agent Memory Sharing.

What You’ve Built

  • A workflow for going from documents → typed entities + relationships using an LLM with structured output.

  • A queryable knowledge graph backed by NAMS — no Neo4j to operate, no extraction pipeline to maintain.

  • A graph that any other agent on the same backend can read.

Extending the Knowledge Graph

  • Stream documents. Pipe a larger corpus through extractFromDocument and persist in batches.

  • Custom entity types. NAMS’s "custom" type accepts a free-text subtype — addEntity(name, "custom", { description }) with a subtype convention in the description.

  • Provenance. After each extraction, record a reasoning trace tying entities back to the source document — memory.reasoning.startTrace(documentId, "extract entities") then recordToolCall per entity created. Query later with getEntityProvenance(entityId).

  • Wikipedia enrichment. The bolt-side Python SDK can auto-enrich entities from Wikipedia. On NAMS, server-side enrichment is opt-in via account settings — see Use NAMS.
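
For the "Stream documents" idea above, one way to persist in batches is to split the corpus into fixed-size chunks, run the items within a chunk concurrently, and keep the chunks sequential. This is a sketch only: chunk, processInBatches, and the batch size of 2 are hypothetical, and the worker passed in would be whatever wraps extractFromDocument and persist in your project.

```typescript
// Hypothetical helper: split an array into consecutive chunks of `size`.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Run `worker` over all items, one batch at a time: items within a batch
// run concurrently, batches run sequentially.
async function processInBatches<T, R>(
  items: T[],
  size: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, size)) {
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}

// Stand-in worker; in the tutorial it would extract and persist one document.
const docIds = ["doc-1", "doc-2", "doc-3", "doc-4", "doc-5"];
const batchesDone = processInBatches(docIds, 2, async (id) => `${id}:ok`);
batchesDone.then((results) => console.log(results.length)); // prints 5
```

A small batch size keeps the number of in-flight LLM and NAMS requests bounded. Promise.all rejects on the first failure, so a production version would probably want per-item error handling.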

Next Steps

See Also