POLE+O Data Model
The POLE+O data model is the default entity classification system in neo4j-agent-memory. It provides a structured, extensible framework for categorizing entities extracted from text.
What is POLE+O?
POLE+O stands for Person, Object, Location, Event + Organization. Originally developed for law enforcement and intelligence analysis, it has been adapted for general-purpose entity extraction in AI applications.
┌─────────────────────────────────────────────────────────────┐
│                        POLE+O Model                         │
├─────────────┬─────────────┬─────────────┬─────────────┬─────┤
│ PERSON      │ OBJECT      │ LOCATION    │ EVENT       │ ORG │
├─────────────┼─────────────┼─────────────┼─────────────┼─────┤
│ Individuals │ Physical/   │ Places &    │ Things that │ Com-│
│ & people    │ digital     │ geographic  │ happen or   │ pan-│
│ mentioned   │ items       │ areas       │ occurred    │ ies │
└─────────────┴─────────────┴─────────────┴─────────────┴─────┘
Entity Types & Subtypes
PERSON
Individuals mentioned by name, role, or description.
| Subtype | Description |
|---|---|
| INDIVIDUAL | A specific named person |
| ALIAS | An alternative name or identity |
| | A role or character |
| | Person of interest (law enforcement context) |
| | Someone who observed an event |
| | Someone affected by an event |
Examples:
"John Smith" → PERSON
"CEO" → PERSON (role)
"@johndoe" → PERSON:ALIAS
"Dr. Jane Wilson" → PERSON:INDIVIDUAL
OBJECT
Physical or digital items, artifacts, or things.
| Subtype | Description |
|---|---|
| VEHICLE | Cars, trucks, boats, aircraft |
| PHONE | Phone numbers and devices |
| EMAIL | Email addresses |
| DOCUMENT | Papers, files, records |
| DEVICE | Electronic devices |
| | Weapons and armaments |
| | Currency and financial instruments |
| | Controlled substances |
| | Applications and programs |
| | Commercial products |
Examples:
"Tesla Model 3" → OBJECT:VEHICLE
"555-123-4567" → OBJECT:PHONE
"john@example.com" → OBJECT:EMAIL
"passport" → OBJECT:DOCUMENT
"iPhone 15" → OBJECT:DEVICE
LOCATION
Places, addresses, and geographic areas.
| Subtype | Description |
|---|---|
| ADDRESS | Street addresses |
| CITY | Cities and towns |
| REGION | States, provinces, regions |
| COUNTRY | Countries and nations |
| LANDMARK | Notable places and monuments |
| FACILITY | Buildings and structures |
Examples:
"123 Main Street" → LOCATION:ADDRESS
"San Francisco" → LOCATION:CITY
"California" → LOCATION:REGION
"United States" → LOCATION:COUNTRY
"Eiffel Tower" → LOCATION:LANDMARK
"JFK Airport" → LOCATION:FACILITY
EVENT
Things that happened, meetings, transactions, or temporal occurrences.
| Subtype | Description |
|---|---|
| INCIDENT | Accidents, crimes, occurrences |
| MEETING | Scheduled gatherings |
| TRANSACTION | Financial or business exchanges |
| | Calls, messages, correspondence |
| DATE | Calendar dates |
| TIME | Times of day |
Examples:
"car accident" → EVENT:INCIDENT
"board meeting" → EVENT:MEETING
"wire transfer" → EVENT:TRANSACTION
"January 15, 2024" → EVENT:DATE
"3:30 PM" → EVENT:TIME
ORGANIZATION
Companies, institutions, groups, and collective entities.
| Subtype | Description |
|---|---|
| COMPANY | Businesses and corporations |
| NONPROFIT | Charitable organizations |
| GOVERNMENT | Government agencies |
| EDUCATIONAL | Schools and universities |
| GROUP | Informal groups and associations |
Examples:
"Apple Inc." → ORGANIZATION:COMPANY
"Red Cross" → ORGANIZATION:NONPROFIT
"FBI" → ORGANIZATION:GOVERNMENT
"MIT" → ORGANIZATION:EDUCATIONAL
"Book Club" → ORGANIZATION:GROUP
Neo4j Schema
Entities are stored as nodes in Neo4j with multiple labels for efficient querying.
Each entity has the base :Entity label plus the type and subtype as additional labels.
(:Entity:Person:Individual {
  id: "uuid",
  name: "John Smith",
  type: "PERSON",                // POLE+O type (stored as uppercase)
  subtype: "INDIVIDUAL",         // Optional subtype (stored as uppercase)
  canonical_name: "John Smith",  // Resolved name
  description: "CEO of Acme Corp",
  confidence: 0.92,
  embedding: [0.1, 0.2, ...],    // Vector for search
  created_at: datetime(),
  metadata: "{...}"              // JSON metadata
})
Entity types and subtypes are stored as uppercase properties (e.g., type: "PERSON"), but labels use PascalCase (e.g., :Person), following Neo4j naming conventions.
Label Structure
Each entity has:
- :Entity - Base label (always present)
- :<Type> - Entity type as PascalCase label (e.g., :Person, :Object, :Location, :Event, :Organization)
- :<Subtype> - Subtype as PascalCase label when present (e.g., :Vehicle, :Address, :Company)
Both POLE+O types and custom types are added as PascalCase labels, as long as they are valid Neo4j label identifiers (start with a letter, contain only letters, numbers, and underscores).
This enables efficient queries like:
// Find all people
MATCH (p:Person) RETURN p
// Find all vehicles (regardless of whether they're Object type)
MATCH (v:Vehicle) RETURN v
// Find all entities (any type)
MATCH (e:Entity) RETURN e
// Find people who are individuals
MATCH (p:Person:Individual) RETURN p
// Combine with relationship traversal
MATCH (p:Person)-[:WORKS_AT]->(o:Organization:Company)
RETURN p.name, o.name
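The label rules described above (uppercase property values, PascalCase labels, valid-identifier check) can be sketched in Python. This is a minimal illustration, not the library's implementation: the helper names `to_pascal_label` and `build_labels` are hypothetical, and both the handling of underscores (each underscore-separated word is capitalized and joined) and the choice to skip invalid values silently are assumptions.

```python
import re
from typing import List, Optional

# Valid label identifiers as described above: start with a letter,
# then only letters, digits, or underscores.
_LABEL_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def to_pascal_label(value: str) -> Optional[str]:
    """Convert an uppercase type such as "PERSON" to a label such as "Person".

    Returns None when the value is not a valid label identifier.
    Assumption: underscore-separated words are capitalized and joined.
    """
    if _LABEL_RE.fullmatch(value) is None:
        return None
    return "".join(word.capitalize() for word in value.split("_") if word)


def build_labels(entity_type: str, subtype: Optional[str] = None) -> List[str]:
    """Return the node labels: Entity first, then type, then subtype."""
    labels = ["Entity"]
    for value in (entity_type, subtype):
        if value:
            label = to_pascal_label(value)
            if label:  # assumption: invalid identifiers are skipped
                labels.append(label)
    return labels


print(build_labels("PERSON", "INDIVIDUAL"))    # ['Entity', 'Person', 'Individual']
print(build_labels("PRODUCT", "ELECTRONICS"))  # ['Entity', 'Product', 'Electronics']
print(build_labels("3D_MODEL"))                # ['Entity']
```

Joining the returned labels with ":" gives the label part of a node pattern, for example ":Entity:Person:Individual".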
Custom Entity Types
Custom entity types outside the POLE+O model also become PascalCase labels:
# Custom types become PascalCase labels
await client.long_term.add_entity(
    name="Widget Pro",
    entity_type="PRODUCT",
    subtype="ELECTRONICS",
)
# Creates: (:Entity:Product:Electronics {name: "Widget Pro", ...})
Query by custom type:
MATCH (p:Product) RETURN p
MATCH (e:Electronics) RETURN e
For POLE+O types, subtypes are validated against the known subtypes for that type. For custom types, any valid Neo4j label identifier can be used as a subtype.
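The subtype rule above can be illustrated with a small check. This is a sketch under stated assumptions: `KNOWN_SUBTYPES` lists only the subtypes that appear in this document's examples (the library may define more), and `is_valid_subtype` is a hypothetical helper, not part of neo4j-agent-memory.

```python
import re
from typing import Optional

# Subtypes attested in this document's examples; illustrative only.
KNOWN_SUBTYPES = {
    "PERSON": {"INDIVIDUAL", "ALIAS"},
    "OBJECT": {"VEHICLE", "PHONE", "EMAIL", "DOCUMENT", "DEVICE"},
    "LOCATION": {"ADDRESS", "CITY", "REGION", "COUNTRY", "LANDMARK", "FACILITY"},
    "EVENT": {"INCIDENT", "MEETING", "TRANSACTION", "DATE", "TIME"},
    "ORGANIZATION": {"COMPANY", "NONPROFIT", "GOVERNMENT", "EDUCATIONAL", "GROUP"},
}

# A valid label identifier: a letter, then letters, digits, or underscores.
_IDENTIFIER_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_subtype(entity_type: str, subtype: Optional[str]) -> bool:
    """Hypothetical check mirroring the rule stated above."""
    if subtype is None:
        return True  # subtypes are optional
    if entity_type in KNOWN_SUBTYPES:
        # POLE+O type: the subtype must be one of the known subtypes.
        return subtype in KNOWN_SUBTYPES[entity_type]
    # Custom type: any valid label identifier is accepted.
    return _IDENTIFIER_RE.fullmatch(subtype) is not None


print(is_valid_subtype("OBJECT", "VEHICLE"))       # True
print(is_valid_subtype("OBJECT", "ELECTRONICS"))   # False
print(is_valid_subtype("PRODUCT", "ELECTRONICS"))  # True
```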
Relationships
Entities can have relationships to other entities:
// Person works at organization
(:Person)-[:WORKS_AT]->(:Organization)
// Person lives at location
(:Person)-[:LIVES_IN]->(:Location)
// Organization located at location
(:Organization)-[:LOCATED_AT]->(:Location)
// Person owns object
(:Person)-[:OWNS]->(:Object)
// Person participated in event
(:Person)-[:PARTICIPATED_IN]->(:Event)
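For tooling that needs these patterns as data, the five examples above can be held as (source, type, target) triples. The sketch below is illustrative only: the document does not state that these are the only permitted relationships, and `RELATIONSHIP_PATTERNS`, `is_documented_pattern`, and `to_cypher_pattern` are hypothetical names.

```python
# The relationship examples listed above as (source label, type, target label).
RELATIONSHIP_PATTERNS = {
    ("Person", "WORKS_AT", "Organization"),
    ("Person", "LIVES_IN", "Location"),
    ("Organization", "LOCATED_AT", "Location"),
    ("Person", "OWNS", "Object"),
    ("Person", "PARTICIPATED_IN", "Event"),
}


def is_documented_pattern(source: str, rel_type: str, target: str) -> bool:
    """Return True if the triple is one of the examples listed above."""
    return (source, rel_type, target) in RELATIONSHIP_PATTERNS


def to_cypher_pattern(source: str, rel_type: str, target: str) -> str:
    """Render a triple in the Cypher pattern notation used above."""
    return f"(:{source})-[:{rel_type}]->(:{target})"


print(to_cypher_pattern("Person", "WORKS_AT", "Organization"))
# (:Person)-[:WORKS_AT]->(:Organization)
```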
Configuring POLE+O
Using Default POLE+O
from neo4j_agent_memory import MemoryClient, MemorySettings

# Default configuration uses POLE+O
settings = MemorySettings()

async with MemoryClient(settings) as memory:
    # Entities are extracted using POLE+O types
    await memory.short_term.add_message(
        session_id="session-1",
        role="user",
        content="I work at Acme Corp in San Francisco"
    )
    # Extracts: "Acme Corp" (ORGANIZATION), "San Francisco" (LOCATION)
Selecting Specific POLE+O Types
from neo4j_agent_memory import MemorySettings, ExtractionConfig

# Only extract people and organizations
settings = MemorySettings(
    extraction=ExtractionConfig(
        entity_types=["PERSON", "ORGANIZATION"]
    )
)
Using Subtypes
Subtypes are automatically extracted by the LLM extractor and can be specified when manually adding entities:
async with MemoryClient(settings) as memory:
    # Add entity with subtype
    await memory.long_term.add_entity(
        name="Tesla Model 3",
        entity_type="OBJECT",
        subtype="VEHICLE",
        description="Electric sedan"
    )

    # Search within a type (this call filters by entity_type only)
    vehicles = await memory.long_term.search_entities(
        query="cars",
        entity_type="OBJECT"
    )
Custom Schema Models
While POLE+O is the default, you can use alternative schema models:
SchemaModel Options
| Model | Description |
|---|---|
| | Default POLE+O model with all subtypes |
| | Backward-compatible with older EntityType enum |
| CUSTOM | User-defined entity types |
Using Custom Entity Types
from neo4j_agent_memory import MemorySettings, SchemaConfig, SchemaModel, ExtractionConfig

# E-commerce domain example
settings = MemorySettings(
    schema=SchemaConfig(
        model=SchemaModel.CUSTOM,
        entity_types=["CUSTOMER", "PRODUCT", "ORDER", "STORE", "REVIEW"],
        strict_types=True  # Reject unknown types
    ),
    extraction=ExtractionConfig(
        entity_types=["CUSTOMER", "PRODUCT", "ORDER", "STORE", "REVIEW"]
    )
)
Loading Schema from File
Create a JSON schema file:
{
  "name": "ecommerce_schema",
  "version": "1.0",
  "entity_types": [
    {
      "name": "CUSTOMER",
      "description": "A person who purchases products",
      "subtypes": ["PREMIUM", "REGULAR", "NEW"],
      "attributes": ["name", "email", "tier"]
    },
    {
      "name": "PRODUCT",
      "description": "An item available for purchase",
      "subtypes": ["ELECTRONICS", "CLOTHING", "FOOD"],
      "attributes": ["sku", "price", "category"]
    }
  ],
  "relationship_types": [
    {
      "name": "PURCHASED",
      "source": "CUSTOMER",
      "target": "PRODUCT"
    }
  ]
}
Load it in configuration:
from neo4j_agent_memory import MemorySettings, SchemaConfig, SchemaModel

settings = MemorySettings(
    schema=SchemaConfig(
        model=SchemaModel.CUSTOM,
        custom_schema_path="./ecommerce_schema.json"
    )
)
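Before pointing the configuration at a schema file, it can help to sanity-check the file yourself. The sketch below uses only the standard library and the field names from the JSON example above; `check_schema` is a hypothetical helper, and the rule it enforces (every relationship endpoint must be a declared entity type) is an assumption about what a sensible schema looks like, not a documented behavior of the library's loader.

```python
import json

# A trimmed copy of the schema shown above, inlined to keep the sketch runnable.
SAMPLE_SCHEMA = """
{
  "name": "ecommerce_schema",
  "version": "1.0",
  "entity_types": [
    {"name": "CUSTOMER", "subtypes": ["PREMIUM", "REGULAR", "NEW"]},
    {"name": "PRODUCT", "subtypes": ["ELECTRONICS", "CLOTHING", "FOOD"]}
  ],
  "relationship_types": [
    {"name": "PURCHASED", "source": "CUSTOMER", "target": "PRODUCT"}
  ]
}
"""


def check_schema(text: str) -> dict:
    """Parse a schema document and verify its relationship endpoints.

    Raises ValueError if a relationship refers to an undeclared entity type.
    """
    schema = json.loads(text)
    declared = {entity["name"] for entity in schema.get("entity_types", [])}
    for rel in schema.get("relationship_types", []):
        for end in ("source", "target"):
            if rel[end] not in declared:
                raise ValueError(
                    f"{rel['name']}: {end} type {rel[end]!r} is not declared"
                )
    return schema


schema = check_schema(SAMPLE_SCHEMA)
print(sorted(entity["name"] for entity in schema["entity_types"]))
# ['CUSTOMER', 'PRODUCT']
```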
Working with Entities
Searching Entities
async with MemoryClient(settings) as memory:
    # Semantic search across all entities
    entities = await memory.long_term.search_entities(
        query="companies in technology",
        limit=10
    )

    # Filter by type
    people = await memory.long_term.search_entities(
        query="engineers",
        entity_type="PERSON",
        limit=10
    )

    # Get entity by name
    entity = await memory.long_term.get_entity(
        name="Acme Corp",
        entity_type="ORGANIZATION"
    )
Entity Resolution
Entity resolution merges duplicate entities and links aliases:
from neo4j_agent_memory import MemorySettings, ResolutionConfig, ResolverStrategy

settings = MemorySettings(
    resolution=ResolutionConfig(
        strategy=ResolverStrategy.COMPOSITE,
        exact_threshold=1.0,     # Exact match
        fuzzy_threshold=0.85,    # Fuzzy match threshold
        semantic_threshold=0.9,  # Embedding similarity
    )
)

async with MemoryClient(settings) as memory:
    # These might resolve to the same entity
    await memory.short_term.add_message(
        session_id="s1", role="user",
        content="I talked to John Smith"
    )
    await memory.short_term.add_message(
        session_id="s1", role="user",
        content="Johnny Smith called me back"
    )
    # "John Smith" and "Johnny Smith" may be resolved as the same person
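To build intuition for how the exact and fuzzy thresholds interact, here is a rough sketch using the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. The library's actual fuzzy matcher is not described here and may score differently; `match_kind` is a hypothetical helper, the case-insensitive normalization is an assumption, and the semantic (embedding) stage is omitted because it needs an embedding model.

```python
from difflib import SequenceMatcher
from typing import Optional


def match_kind(
    a: str,
    b: str,
    exact_threshold: float = 1.0,
    fuzzy_threshold: float = 0.85,
) -> Optional[str]:
    """Classify two names as an "exact" match, a "fuzzy" match, or None.

    Assumption: names are compared case-insensitively after trimming.
    """
    left, right = a.strip().lower(), b.strip().lower()
    score = SequenceMatcher(None, left, right).ratio()
    if score >= exact_threshold:
        return "exact"
    if score >= fuzzy_threshold:
        return "fuzzy"
    return None


print(match_kind("John Smith", "john smith"))    # exact
print(match_kind("John Smith", "Johnny Smith"))  # fuzzy
print(match_kind("John Smith", "Jane Wilson"))   # None
```

With this measure, "John Smith" and "Johnny Smith" score about 0.91, which clears the 0.85 fuzzy threshold but not the 1.0 exact threshold. Raising `fuzzy_threshold` makes merging more conservative.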
POLE+O in the Extraction Pipeline
Each extractor maps to POLE+O types differently:
spaCy Mapping
SPACY_TO_POLEO = {
    "PERSON": "PERSON",
    "ORG": "ORGANIZATION",
    "GPE": "LOCATION",      # Geopolitical entities
    "LOC": "LOCATION",      # Other locations
    "FAC": "LOCATION",      # Facilities
    "EVENT": "EVENT",
    "PRODUCT": "OBJECT",
    "WORK_OF_ART": "OBJECT",
    "DATE": "EVENT",
    "TIME": "EVENT",
    "MONEY": "OBJECT",
}
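As an illustration of how such a mapping might be applied, the sketch below converts (text, spaCy label) pairs into (text, POLE+O type) pairs. It runs without spaCy installed. `map_spacy_entities` is a hypothetical helper, and dropping entities whose label has no mapping (for example CARDINAL) is an assumption rather than documented behavior.

```python
from typing import Iterable, List, Tuple

# Copied from the mapping above so the sketch is self-contained.
SPACY_TO_POLEO = {
    "PERSON": "PERSON",
    "ORG": "ORGANIZATION",
    "GPE": "LOCATION",
    "LOC": "LOCATION",
    "FAC": "LOCATION",
    "EVENT": "EVENT",
    "PRODUCT": "OBJECT",
    "WORK_OF_ART": "OBJECT",
    "DATE": "EVENT",
    "TIME": "EVENT",
    "MONEY": "OBJECT",
}


def map_spacy_entities(
    ents: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Map (text, spaCy label) pairs to (text, POLE+O type) pairs.

    Assumption: entities with an unmapped label are dropped.
    """
    mapped = []
    for text, label in ents:
        poleo_type = SPACY_TO_POLEO.get(label)
        if poleo_type is not None:
            mapped.append((text, poleo_type))
    return mapped


sample = [("Acme Corp", "ORG"), ("San Francisco", "GPE"), ("three", "CARDINAL")]
print(map_spacy_entities(sample))
# [('Acme Corp', 'ORGANIZATION'), ('San Francisco', 'LOCATION')]
```

With a real spaCy pipeline, the input pairs would come from `(ent.text, ent.label_)` for each entity in `doc.ents`.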
GLiNER Labels
GLiNER uses lowercase labels that map to POLE+O:
GLINER_LABELS = [
    "person",        # → PERSON
    "organization",  # → ORGANIZATION
    "company",       # → ORGANIZATION:COMPANY
    "location",      # → LOCATION
    "city",          # → LOCATION:CITY
    "country",       # → LOCATION:COUNTRY
    "event",         # → EVENT
    "meeting",       # → EVENT:MEETING
    "object",        # → OBJECT
    "vehicle",       # → OBJECT:VEHICLE
    "document",      # → OBJECT:DOCUMENT
]
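The comments in the list above imply a lookup from each GLiNER label to a POLE+O type and optional subtype. One way to express that lookup is a dictionary, sketched below. `GLINER_TO_POLEO` and `resolve_gliner_label` are hypothetical names, and returning None for an unknown label is an assumption.

```python
from typing import Optional, Tuple

# (type, subtype) pairs taken from the comments in the label list above.
GLINER_TO_POLEO = {
    "person": ("PERSON", None),
    "organization": ("ORGANIZATION", None),
    "company": ("ORGANIZATION", "COMPANY"),
    "location": ("LOCATION", None),
    "city": ("LOCATION", "CITY"),
    "country": ("LOCATION", "COUNTRY"),
    "event": ("EVENT", None),
    "meeting": ("EVENT", "MEETING"),
    "object": ("OBJECT", None),
    "vehicle": ("OBJECT", "VEHICLE"),
    "document": ("OBJECT", "DOCUMENT"),
}


def resolve_gliner_label(label: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (type, subtype) for a GLiNER label, or None if it is unknown."""
    return GLINER_TO_POLEO.get(label.lower())


print(resolve_gliner_label("company"))  # ('ORGANIZATION', 'COMPANY')
print(resolve_gliner_label("person"))   # ('PERSON', None)
print(resolve_gliner_label("weather"))  # None
```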
LLM Prompt
The LLM extractor uses explicit POLE+O instructions:
Extract entities using the POLE+O model:
- PERSON: Individuals, people mentioned by name or role
- OBJECT: Physical or digital items (vehicles, phones, documents)
- LOCATION: Places, addresses, geographic areas
- EVENT: Incidents, meetings, transactions, things that happened
- ORGANIZATION: Companies, groups, institutions
For each entity, also identify:
- Subtype (e.g., VEHICLE for OBJECT, COMPANY for ORGANIZATION)
- Confidence score (0.0 to 1.0)
- Description (brief context)
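The prompt asks for a type, subtype, confidence score, and description per entity. A record carrying those fields could be validated as sketched below. The output format the library actually expects from the LLM is not shown in this document, so `ExtractedEntity` and its field names are hypothetical and only mirror the items listed in the prompt.

```python
from dataclasses import dataclass
from typing import Optional

POLEO_TYPES = {"PERSON", "OBJECT", "LOCATION", "EVENT", "ORGANIZATION"}


@dataclass
class ExtractedEntity:
    """Hypothetical record for one entity returned by an LLM extractor."""

    name: str
    type: str
    subtype: Optional[str] = None
    confidence: float = 1.0
    description: str = ""

    def __post_init__(self) -> None:
        # Normalize to the uppercase form used for stored type values.
        self.type = self.type.upper()
        if self.type not in POLEO_TYPES:
            raise ValueError(f"unknown POLE+O type: {self.type!r}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")


entity = ExtractedEntity(
    name="Acme Corp",
    type="organization",
    subtype="COMPANY",
    confidence=0.92,
    description="Employer mentioned by the user",
)
print(entity.type)  # ORGANIZATION
```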
Best Practices
Choosing Entity Types
- Start with POLE+O - The default types cover most use cases
- Add subtypes - Use subtypes for finer classification without changing core types
- Custom types - Only create custom types for domain-specific needs