Configuration Reference

Complete reference for all configuration options in neo4j-agent-memory.

Configuration Methods

Neo4j Agent Memory supports multiple configuration methods:

  1. Python Configuration - Direct instantiation of settings objects

  2. Environment Variables - Using the NAM_ prefix

  3. Configuration Files - YAML or JSON files

  4. Mixed - Combine methods with environment variables taking precedence
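The environment-variable names follow a mechanical rule: the NAM_ prefix, then the nested field path upper-cased and joined with double underscores. A small helper makes the rule explicit (illustrative only, not part of the library):

```python
def env_var_name(*path: str, prefix: str = "NAM") -> str:
    """Build the environment-variable name for a nested settings field,
    following the NAM_ prefix / double-underscore nesting convention."""
    return prefix + "_" + "__".join(part.upper() for part in path)

print(env_var_name("neo4j", "uri"))                    # NAM_NEO4J__URI
print(env_var_name("extraction", "gliner_threshold"))  # NAM_EXTRACTION__GLINER_THRESHOLD
```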

MemorySettings

The main configuration class that contains all settings.

from neo4j_agent_memory import (
    MemorySettings,
    Neo4jConfig,
    EmbeddingConfig,
    ExtractionConfig,
    ResolutionConfig,
    SchemaConfig,
    LLMConfig,
    MemoryConfig,
    SearchConfig,
)

settings = MemorySettings(
    neo4j=Neo4jConfig(...),
    embedding=EmbeddingConfig(...),
    extraction=ExtractionConfig(...),
    resolution=ResolutionConfig(...),
    schema=SchemaConfig(...),
    llm=LLMConfig(...),
    memory=MemoryConfig(...),
    search=SearchConfig(...),
)

Neo4j Configuration

Connection settings for Neo4j database.

Python Configuration

from neo4j_agent_memory import Neo4jConfig
from pydantic import SecretStr

neo4j_config = Neo4jConfig(
    uri="bolt://localhost:7687",
    username="neo4j",
    password=SecretStr("password"),
    database="neo4j",                    # Database name (default: neo4j)
    max_connection_pool_size=50,         # Connection pool size
    connection_timeout=30.0,             # Connection timeout in seconds
    max_transaction_retry_time=30.0,     # Max retry time for transactions
)

Environment Variables

NAM_NEO4J__URI=bolt://localhost:7687
NAM_NEO4J__USERNAME=neo4j
NAM_NEO4J__PASSWORD=your-password
NAM_NEO4J__DATABASE=neo4j
NAM_NEO4J__MAX_CONNECTION_POOL_SIZE=50
NAM_NEO4J__CONNECTION_TIMEOUT=30.0

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| uri | str | bolt://localhost:7687 | Neo4j connection URI |
| username | str | neo4j | Authentication username |
| password | SecretStr | Required | Authentication password |
| database | str | neo4j | Database name |
| max_connection_pool_size | int | 50 | Maximum connection pool size |
| connection_timeout | float | 30.0 | Connection timeout (seconds) |
| max_transaction_retry_time | float | 30.0 | Maximum transaction retry time |

Embedding Configuration

Settings for vector embeddings.

Python Configuration

from neo4j_agent_memory import EmbeddingConfig, EmbeddingProvider
from pydantic import SecretStr

embedding_config = EmbeddingConfig(
    provider=EmbeddingProvider.OPENAI,
    model="text-embedding-3-small",
    api_key=SecretStr("sk-..."),         # Or use OPENAI_API_KEY env var
    dimensions=1536,                      # Embedding dimensions
    batch_size=100,                       # Batch size for bulk embedding
    device="cpu",                         # Device for local models
)

Environment Variables

NAM_EMBEDDING__PROVIDER=openai
NAM_EMBEDDING__MODEL=text-embedding-3-small
NAM_EMBEDDING__API_KEY=sk-...
NAM_EMBEDDING__DIMENSIONS=1536
NAM_EMBEDDING__BATCH_SIZE=100
NAM_EMBEDDING__DEVICE=cpu

# Alternative: use standard OpenAI env var
OPENAI_API_KEY=sk-...

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| provider | EmbeddingProvider | OPENAI | Embedding provider (OPENAI, SENTENCE_TRANSFORMERS, VERTEX_AI, BEDROCK) |
| model | str | text-embedding-3-small | Model name |
| api_key | SecretStr | None | API key (for OpenAI) |
| dimensions | int | 1536 | Embedding dimensions |
| batch_size | int | 100 | Batch size for bulk operations |
| device | str | cpu | Device for local models (cpu/cuda) |
| project_id | str | None | GCP project ID (for Vertex AI) |
| location | str | us-central1 | GCP region (for Vertex AI) |
| task_type | str | RETRIEVAL_DOCUMENT | Vertex AI task type |
| aws_region | str | None | AWS region (for Bedrock) |
| aws_profile | str | None | AWS credentials profile name (for Bedrock) |

Embedding Providers

| Provider | Models | Notes |
|---|---|---|
| OPENAI | text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002 | Requires API key |
| SENTENCE_TRANSFORMERS | all-MiniLM-L6-v2, all-mpnet-base-v2, etc. | Runs locally |
| VERTEX_AI | text-embedding-004, textembedding-gecko@003 | Requires GCP project. Install with pip install neo4j-agent-memory[vertex-ai] |
| BEDROCK | amazon.titan-embed-text-v2:0, amazon.titan-embed-text-v1, cohere.embed-english-v3, cohere.embed-multilingual-v3 | Requires AWS credentials. Install with pip install neo4j-agent-memory[bedrock] |
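For a fully local setup, the SENTENCE_TRANSFORMERS provider needs no API key. A sketch using the parameters documented above (the model choice is illustrative):

```python
from neo4j_agent_memory import EmbeddingConfig, EmbeddingProvider

# Local embeddings: no API key required; the model runs on the chosen device.
embedding_config = EmbeddingConfig(
    provider=EmbeddingProvider.SENTENCE_TRANSFORMERS,
    model="all-MiniLM-L6-v2",   # a small 384-dimensional model
    dimensions=384,             # must match the model's output size
    device="cpu",               # or "cuda" for GPU inference
)
```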

Extraction Configuration

Settings for entity extraction pipeline.

Python Configuration

from neo4j_agent_memory import ExtractionConfig, ExtractorType, MergeStrategy

extraction_config = ExtractionConfig(
    # Extractor type
    extractor_type=ExtractorType.PIPELINE,

    # Pipeline stages
    enable_spacy=True,
    enable_gliner=True,
    enable_llm_fallback=True,

    # Merge strategy
    merge_strategy=MergeStrategy.CONFIDENCE,
    fallback_on_empty=True,

    # spaCy settings
    spacy_model="en_core_web_sm",
    spacy_confidence=0.85,

    # GLiNER settings
    gliner_model="urchade/gliner_medium-v2.1",
    gliner_threshold=0.5,
    gliner_device="cpu",

    # GLiREL relation extraction (optional)
    enable_gliner_relations=False,
    gliner_relations_model="jackboyla/glirel-large-v0",
    gliner_relations_threshold=0.3,

    # LLM settings
    llm_model="gpt-4o-mini",

    # Entity types
    entity_types=["PERSON", "ORGANIZATION", "LOCATION", "EVENT", "OBJECT"],

    # Extraction options
    extract_relations=True,
    extract_preferences=True,

    # Batch extraction settings
    batch_size=10,
    batch_max_concurrent=5,

    # Streaming extraction settings
    streaming_chunk_size=4000,
    streaming_chunk_overlap=200,
)

Environment Variables

# Extractor type
NAM_EXTRACTION__EXTRACTOR_TYPE=pipeline    # none, llm, spacy, gliner, pipeline

# Pipeline stages
NAM_EXTRACTION__ENABLE_SPACY=true
NAM_EXTRACTION__ENABLE_GLINER=true
NAM_EXTRACTION__ENABLE_LLM_FALLBACK=true

# Merge strategy
NAM_EXTRACTION__MERGE_STRATEGY=confidence  # union, intersection, confidence, cascade
NAM_EXTRACTION__FALLBACK_ON_EMPTY=true

# spaCy settings
NAM_EXTRACTION__SPACY_MODEL=en_core_web_sm
NAM_EXTRACTION__SPACY_CONFIDENCE=0.85

# GLiNER settings
NAM_EXTRACTION__GLINER_MODEL=urchade/gliner_medium-v2.1
NAM_EXTRACTION__GLINER_THRESHOLD=0.5
NAM_EXTRACTION__GLINER_DEVICE=cpu

# GLiREL relation extraction (optional)
NAM_EXTRACTION__ENABLE_GLINER_RELATIONS=false
NAM_EXTRACTION__GLINER_RELATIONS_MODEL=jackboyla/glirel-large-v0
NAM_EXTRACTION__GLINER_RELATIONS_THRESHOLD=0.3

# LLM settings
NAM_EXTRACTION__LLM_MODEL=gpt-4o-mini

# Entity types (JSON array)
NAM_EXTRACTION__ENTITY_TYPES='["PERSON","ORGANIZATION","LOCATION","EVENT","OBJECT"]'

# Extraction options
NAM_EXTRACTION__EXTRACT_RELATIONS=true
NAM_EXTRACTION__EXTRACT_PREFERENCES=true

# Batch extraction settings
NAM_EXTRACTION__BATCH_SIZE=10
NAM_EXTRACTION__BATCH_MAX_CONCURRENT=5

# Streaming extraction settings
NAM_EXTRACTION__STREAMING_CHUNK_SIZE=4000
NAM_EXTRACTION__STREAMING_CHUNK_OVERLAP=200
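The streaming settings slice long input into overlapping character windows so entities that span a chunk boundary are not lost. A minimal sketch of that chunking (illustrative, not the library's implementation):

```python
def chunk_text(text: str, chunk_size: int = 4000, overlap: int = 200):
    """Yield overlapping character windows, mirroring the
    streaming_chunk_size / streaming_chunk_overlap settings."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        yield text[start:start + chunk_size]

chunks = list(chunk_text("x" * 9000))
print([len(c) for c in chunks])  # [4000, 4000, 1400]
```

Consecutive windows share 200 characters, so an entity mention cut off at one boundary reappears whole at the start of the next chunk.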

Extractor Types

| Type | Description |
|---|---|
| NONE | Disable extraction |
| SPACY | spaCy statistical NER only |
| GLINER | GLiNER zero-shot NER only |
| LLM | LLM-based extraction only |
| PIPELINE | Multi-stage pipeline (default) |

Merge Strategies

| Strategy | Description |
|---|---|
| UNION | Keep all unique entities from all stages |
| INTERSECTION | Only keep entities found by multiple extractors |
| CONFIDENCE | Keep highest-confidence version (default) |
| CASCADE | Use first stage, fill gaps with subsequent stages |
| FIRST_SUCCESS | Stop after first successful stage |
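The CONFIDENCE strategy can be sketched as a keyed maximum over all stage outputs (illustrative, not the library's implementation):

```python
def merge_by_confidence(*stages):
    """Sketch of the CONFIDENCE merge: for each (name, type) key,
    keep the candidate with the highest confidence across stages."""
    best = {}
    for stage in stages:
        for entity in stage:
            key = (entity["name"].lower(), entity["type"])
            if key not in best or entity["confidence"] > best[key]["confidence"]:
                best[key] = entity
    return list(best.values())

spacy_out = [{"name": "Acme", "type": "ORGANIZATION", "confidence": 0.85}]
gliner_out = [{"name": "acme", "type": "ORGANIZATION", "confidence": 0.91}]
print(merge_by_confidence(spacy_out, gliner_out))
# [{'name': 'acme', 'type': 'ORGANIZATION', 'confidence': 0.91}]
```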

Schema Configuration

Settings for the knowledge graph schema.

Python Configuration

from neo4j_agent_memory import SchemaConfig, SchemaModel

schema_config = SchemaConfig(
    model=SchemaModel.POLEO,              # Schema model
    entity_types=None,                     # Custom types (for CUSTOM model)
    enable_subtypes=True,                  # Track entity subtypes
    strict_types=False,                    # Reject unknown types
    custom_schema_path=None,               # Path to schema file
)

Environment Variables

NAM_SCHEMA__MODEL=poleo                   # poleo, legacy, custom
NAM_SCHEMA__ENTITY_TYPES='["PERSON","PRODUCT"]'  # For custom model
NAM_SCHEMA__ENABLE_SUBTYPES=true
NAM_SCHEMA__STRICT_TYPES=false
NAM_SCHEMA__CUSTOM_SCHEMA_PATH=/path/to/schema.json

Schema Models

| Model | Description |
|---|---|
| POLEO | Default POLE+O model (Person, Object, Location, Event, Organization) |
| LEGACY | Backward-compatible with older versions |
| CUSTOM | User-defined entity types |
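A CUSTOM schema restricts extraction to user-defined types. A sketch using the parameters above (the type names are illustrative):

```python
from neo4j_agent_memory import SchemaConfig, SchemaModel

# Custom domain schema: only the listed types are used;
# strict_types=True rejects anything outside the list.
schema_config = SchemaConfig(
    model=SchemaModel.CUSTOM,
    entity_types=["PERSON", "PRODUCT", "FEATURE"],
    strict_types=True,
)
```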

Resolution Configuration

Settings for entity resolution (deduplication).

Python Configuration

from neo4j_agent_memory import ResolutionConfig, ResolverStrategy

resolution_config = ResolutionConfig(
    strategy=ResolverStrategy.COMPOSITE,
    exact_threshold=1.0,                  # Exact match threshold
    fuzzy_threshold=0.85,                 # Fuzzy match threshold
    semantic_threshold=0.9,               # Semantic similarity threshold
)

Environment Variables

NAM_RESOLUTION__STRATEGY=composite        # none, exact, fuzzy, semantic, composite
NAM_RESOLUTION__EXACT_THRESHOLD=1.0
NAM_RESOLUTION__FUZZY_THRESHOLD=0.85
NAM_RESOLUTION__SEMANTIC_THRESHOLD=0.9

Resolver Strategies

| Strategy | Description |
|---|---|
| NONE | No resolution |
| EXACT | Exact string matching only |
| FUZZY | Fuzzy string matching (using rapidfuzz) |
| SEMANTIC | Embedding similarity matching |
| COMPOSITE | Combine all strategies |
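The COMPOSITE strategy layers the cheap matchers before the expensive ones. A minimal sketch of that cascade, using difflib's ratio as a stand-in for rapidfuzz and the fuzzy_threshold default above (a real setup would fall through to embedding similarity):

```python
from difflib import SequenceMatcher

def resolve(name: str, candidate: str, fuzzy_threshold: float = 0.85) -> str:
    """Sketch of a composite cascade: exact match first, then fuzzy match."""
    if name.lower() == candidate.lower():
        return "exact"
    if SequenceMatcher(None, name.lower(), candidate.lower()).ratio() >= fuzzy_threshold:
        return "fuzzy"
    return "no-match"

print(resolve("Acme Corp", "acme corp"))   # exact
print(resolve("Acme Corp", "Acme Corp."))  # fuzzy
```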

LLM Configuration

Settings for LLM-based operations.

Python Configuration

from neo4j_agent_memory import LLMConfig, LLMProvider
from pydantic import SecretStr

llm_config = LLMConfig(
    provider=LLMProvider.OPENAI,
    model="gpt-4o-mini",
    api_key=SecretStr("sk-..."),
    temperature=0.0,
    max_tokens=4096,
)

Environment Variables

NAM_LLM__PROVIDER=openai
NAM_LLM__MODEL=gpt-4o-mini
NAM_LLM__API_KEY=sk-...
NAM_LLM__TEMPERATURE=0.0
NAM_LLM__MAX_TOKENS=4096

# Alternative
OPENAI_API_KEY=sk-...

Memory Configuration

Settings for memory behavior.

Python Configuration

from neo4j_agent_memory import MemoryConfig

memory_config = MemoryConfig(
    default_session_ttl=86400,            # Session TTL in seconds (24 hours)
    max_messages_per_session=1000,        # Max messages per session
    auto_summarize=False,                 # Auto-summarize long conversations
    summarize_threshold=50,               # Messages before summarization
)

Environment Variables

NAM_MEMORY__DEFAULT_SESSION_TTL=86400
NAM_MEMORY__MAX_MESSAGES_PER_SESSION=1000
NAM_MEMORY__AUTO_SUMMARIZE=false
NAM_MEMORY__SUMMARIZE_THRESHOLD=50

Search Configuration

Settings for search operations.

Python Configuration

from neo4j_agent_memory import SearchConfig

search_config = SearchConfig(
    default_limit=10,                     # Default result limit
    max_limit=100,                        # Maximum result limit
    similarity_threshold=0.7,             # Minimum similarity score
    include_metadata=True,                # Include metadata in results
)

Environment Variables

NAM_SEARCH__DEFAULT_LIMIT=10
NAM_SEARCH__MAX_LIMIT=100
NAM_SEARCH__SIMILARITY_THRESHOLD=0.7
NAM_SEARCH__INCLUDE_METADATA=true

Geocoding Configuration

Settings for automatic geocoding of LOCATION entities. When enabled, location names are converted to latitude/longitude coordinates stored as Neo4j Point properties, enabling geospatial queries.

Python Configuration (Nominatim - Free)

from neo4j_agent_memory import GeocodingConfig, GeocodingProvider

# Nominatim is free but rate-limited to 1 request/second
geocoding_config = GeocodingConfig(
    enabled=True,
    provider=GeocodingProvider.NOMINATIM,
    cache_results=True,                   # Cache to avoid repeated API calls
    rate_limit_per_second=1.0,            # Nominatim requires <= 1 req/sec
    user_agent="my-app/1.0",              # Required by Nominatim ToS
)

Python Configuration (Google Maps - Higher Accuracy)

from neo4j_agent_memory import GeocodingConfig, GeocodingProvider
from pydantic import SecretStr

# Google Maps API requires an API key and has usage costs
geocoding_config = GeocodingConfig(
    enabled=True,
    provider=GeocodingProvider.GOOGLE,
    api_key=SecretStr("your-google-api-key"),
    cache_results=True,
)

Environment Variables

# Enable geocoding
NAM_GEOCODING__ENABLED=true

# Provider selection
NAM_GEOCODING__PROVIDER=nominatim        # nominatim or google

# API key (required for Google)
NAM_GEOCODING__API_KEY=your-google-api-key

# Caching
NAM_GEOCODING__CACHE_RESULTS=true

# Nominatim-specific settings
NAM_GEOCODING__RATE_LIMIT_PER_SECOND=1.0
NAM_GEOCODING__USER_AGENT=my-app/1.0

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| enabled | bool | False | Enable automatic geocoding of LOCATION entities |
| provider | GeocodingProvider | NOMINATIM | Geocoding provider (NOMINATIM or GOOGLE) |
| api_key | SecretStr | None | API key (required for Google) |
| cache_results | bool | True | Cache geocoding results in-memory |
| rate_limit_per_second | float | 1.0 | Rate limit for requests (Nominatim requires ≤1) |
| user_agent | str | neo4j-agent-memory | User-Agent header for Nominatim (required by ToS) |

Geocoding Providers

| Provider | Cost | Rate Limit | Notes |
|---|---|---|---|
| NOMINATIM | Free | 1 request/second | Uses OpenStreetMap data. Good for most use cases. |
| GOOGLE | Pay per use | 50 requests/second | Higher accuracy, better address parsing. Requires API key. |

Usage Example

from neo4j_agent_memory import (
    MemoryClient,
    MemorySettings,
    GeocodingConfig,
    GeocodingProvider,
)

# Configure with geocoding enabled
settings = MemorySettings(
    geocoding=GeocodingConfig(
        enabled=True,
        provider=GeocodingProvider.NOMINATIM,
    )
)

async with MemoryClient(settings) as client:
    # LOCATION entities are automatically geocoded
    entity, dedup_result = await client.long_term.add_entity(
        "Empire State Building, New York",
        "LOCATION",
    )

    # Get coordinates
    coords = await client.long_term.get_entity_coordinates(entity.id)
    if coords:
        lat, lon = coords
        print(f"Coordinates: {lat}, {lon}")

    # Search for nearby locations (within 5km)
    nearby = await client.long_term.search_locations_near(
        latitude=40.7484,
        longitude=-73.9857,
        radius_km=5.0,
    )

    # Batch geocode existing locations without coordinates
    stats = await client.long_term.geocode_locations()
    print(f"Geocoded {stats['geocoded']} of {stats['processed']} locations")

Geospatial Queries

Once locations are geocoded, you can run geospatial queries:

# Find locations within a bounding box
locations = await client.long_term.search_locations_in_bounds(
    min_lat=40.70,
    max_lat=40.80,
    min_lon=-74.02,
    max_lon=-73.95,
)

# Find locations near a point
nearby = await client.long_term.search_locations_near(
    latitude=40.7484,
    longitude=-73.9857,
    radius_km=10.0,
    limit=20,
)
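Radius filtering boils down to great-circle distance between the stored Point coordinates. For intuition, the haversine formula (illustrative only; the library delegates the actual distance computation to Neo4j):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    rlat1, rlon1, rlat2, rlon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((rlat2 - rlat1) / 2) ** 2
         + cos(rlat1) * cos(rlat2) * sin((rlon2 - rlon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km = mean Earth radius

# Empire State Building -> Statue of Liberty, roughly 8 km apart
print(haversine_km(40.7484, -73.9857, 40.6892, -74.0445))
```

A point is inside radius_km exactly when this distance is at most the radius.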

Enrichment Configuration

Settings for background entity enrichment from external knowledge sources (Wikipedia, Diffbot).

Python Configuration

from neo4j_agent_memory.config.settings import EnrichmentConfig, EnrichmentProvider
from pydantic import SecretStr

enrichment_config = EnrichmentConfig(
    enabled=True,                                    # Enable enrichment
    providers=[EnrichmentProvider.WIKIMEDIA],        # Providers to use

    # API keys (Diffbot only)
    diffbot_api_key=SecretStr("your-api-key"),       # Or set DIFFBOT_API_KEY env var

    # Rate limiting
    wikimedia_rate_limit=0.5,                        # Seconds between requests
    diffbot_rate_limit=0.2,                          # Seconds between requests

    # Caching
    cache_results=True,                              # Cache results in memory
    cache_ttl_hours=168,                             # Cache TTL (1 week)

    # Background processing
    background_enabled=True,                         # Enable async processing
    queue_max_size=1000,                             # Max queue size
    max_retries=3,                                   # Retry count
    retry_delay_seconds=60.0,                        # Delay between retries

    # Filtering
    entity_types=["PERSON", "ORGANIZATION", "LOCATION", "EVENT"],
    min_confidence=0.7,                              # Minimum confidence threshold

    # API settings
    language="en",                                   # Wikipedia language
    user_agent="neo4j-agent-memory/1.0",             # User-Agent header
)

Environment Variables

# Enable enrichment
NAM_ENRICHMENT__ENABLED=true

# Providers (JSON array)
NAM_ENRICHMENT__PROVIDERS='["wikimedia","diffbot"]'

# Diffbot API key
NAM_ENRICHMENT__DIFFBOT_API_KEY=your-api-key
# Or use the standard env var
DIFFBOT_API_KEY=your-api-key

# Rate limiting
NAM_ENRICHMENT__WIKIMEDIA_RATE_LIMIT=0.5
NAM_ENRICHMENT__DIFFBOT_RATE_LIMIT=0.2

# Caching
NAM_ENRICHMENT__CACHE_RESULTS=true
NAM_ENRICHMENT__CACHE_TTL_HOURS=168

# Background processing
NAM_ENRICHMENT__BACKGROUND_ENABLED=true
NAM_ENRICHMENT__QUEUE_MAX_SIZE=1000
NAM_ENRICHMENT__MAX_RETRIES=3
NAM_ENRICHMENT__RETRY_DELAY_SECONDS=60.0

# Filtering
NAM_ENRICHMENT__ENTITY_TYPES='["PERSON","ORGANIZATION","LOCATION","EVENT"]'
NAM_ENRICHMENT__MIN_CONFIDENCE=0.7

# API settings
NAM_ENRICHMENT__LANGUAGE=en
NAM_ENRICHMENT__USER_AGENT=neo4j-agent-memory/1.0

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| enabled | bool | False | Enable the enrichment system |
| providers | list[EnrichmentProvider] | [WIKIMEDIA] | Providers to use (tried in order) |
| diffbot_api_key | SecretStr | None | API key for Diffbot |
| wikimedia_rate_limit | float | 0.5 | Seconds between Wikimedia requests |
| diffbot_rate_limit | float | 0.2 | Seconds between Diffbot requests |
| cache_results | bool | True | Cache enrichment results |
| cache_ttl_hours | int | 168 | Cache TTL in hours (1 week) |
| background_enabled | bool | True | Enable background processing |
| queue_max_size | int | 1000 | Maximum enrichment queue size |
| max_retries | int | 3 | Retry count for failures |
| retry_delay_seconds | float | 60.0 | Delay between retries |
| entity_types | list[str] | See above | Entity types to enrich |
| min_confidence | float | 0.7 | Minimum confidence threshold |
| language | str | en | Wikipedia language code |
| user_agent | str | neo4j-agent-memory/1.0 | User-Agent header |

Enrichment Providers

| Provider | Cost | Rate Limit | Notes |
|---|---|---|---|
| WIKIMEDIA | Free | 2 requests/second | Uses Wikipedia REST API. Good for general entities. |
| DIFFBOT | Pay per use | 5 requests/second | Richer structured data; requires API key. |

See the Working with Entities guide for detailed usage documentation.

Deduplication Configuration

Settings for entity deduplication during ingest.

Python Configuration

from neo4j_agent_memory import DeduplicationConfig, DeduplicationStrategy

dedup_config = DeduplicationConfig(
    enabled=True,                         # Enable deduplication
    strategy=DeduplicationStrategy.COMPOSITE,
    embedding_threshold=0.92,             # Similarity for auto-merge
    fuzzy_threshold=0.85,                 # Fuzzy match threshold
    create_same_as=True,                  # Create SAME_AS for ambiguous matches
    same_as_threshold=0.85,               # Threshold for SAME_AS relationships
    batch_size=100,                       # Entities to process per batch
)

Environment Variables

NAM_DEDUPLICATION__ENABLED=true
NAM_DEDUPLICATION__STRATEGY=composite     # none, exact, fuzzy, embedding, composite
NAM_DEDUPLICATION__EMBEDDING_THRESHOLD=0.92
NAM_DEDUPLICATION__FUZZY_THRESHOLD=0.85
NAM_DEDUPLICATION__CREATE_SAME_AS=true
NAM_DEDUPLICATION__SAME_AS_THRESHOLD=0.85
NAM_DEDUPLICATION__BATCH_SIZE=100

Deduplication Strategies

| Strategy | Description |
|---|---|
| NONE | Disable deduplication |
| EXACT | Exact name matching only |
| FUZZY | Fuzzy string matching (rapidfuzz) |
| EMBEDDING | Vector similarity matching |
| COMPOSITE | Combine fuzzy and embedding (default) |
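For a candidate pair, the COMPOSITE outcome can be sketched as a threshold cascade over the similarity scores (illustrative, using the defaults shown above):

```python
def dedup_decision(embedding_sim: float, fuzzy_sim: float,
                   embedding_threshold: float = 0.92,
                   same_as_threshold: float = 0.85) -> str:
    """Sketch of the COMPOSITE strategy's outcome for a candidate pair:
    auto-merge on strong embedding similarity, link ambiguous pairs
    with SAME_AS, otherwise keep both entities."""
    if embedding_sim >= embedding_threshold:
        return "merge"
    if max(embedding_sim, fuzzy_sim) >= same_as_threshold:
        return "same_as"
    return "keep_separate"

print(dedup_decision(0.95, 0.70))  # merge
print(dedup_decision(0.88, 0.90))  # same_as
print(dedup_decision(0.40, 0.50))  # keep_separate
```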

Observability Configuration

Settings for tracing and monitoring with OpenTelemetry or Opik.

Python Configuration (OpenTelemetry)

from neo4j_agent_memory import ObservabilityConfig, TracingProvider

observability_config = ObservabilityConfig(
    enabled=True,
    provider=TracingProvider.OPENTELEMETRY,
    service_name="my-agent-memory",
    endpoint="http://localhost:4317",     # OTLP endpoint
    sample_rate=1.0,                      # Trace all requests
    log_level="INFO",
)

Python Configuration (Opik)

from neo4j_agent_memory import ObservabilityConfig, TracingProvider

observability_config = ObservabilityConfig(
    enabled=True,
    provider=TracingProvider.OPIK,
    project_name="my-agent-memory",
    workspace="my-workspace",             # Optional Opik workspace
    track_llm_calls=True,                 # Track LLM interactions
    track_extraction=True,                # Track extraction pipeline
    track_memory_ops=True,                # Track memory operations
)

Environment Variables

# Common settings
NAM_OBSERVABILITY__ENABLED=true
NAM_OBSERVABILITY__PROVIDER=opentelemetry  # opentelemetry, opik, auto

# OpenTelemetry settings
NAM_OBSERVABILITY__SERVICE_NAME=my-agent-memory
NAM_OBSERVABILITY__ENDPOINT=http://localhost:4317
NAM_OBSERVABILITY__SAMPLE_RATE=1.0
NAM_OBSERVABILITY__LOG_LEVEL=INFO

# Opik settings
NAM_OBSERVABILITY__PROJECT_NAME=my-agent-memory
NAM_OBSERVABILITY__WORKSPACE=my-workspace
NAM_OBSERVABILITY__TRACK_LLM_CALLS=true
NAM_OBSERVABILITY__TRACK_EXTRACTION=true
NAM_OBSERVABILITY__TRACK_MEMORY_OPS=true

# Opik API (if using cloud)
OPIK_API_KEY=your-api-key
OPIK_WORKSPACE=your-workspace

Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| enabled | bool | False | Enable observability |
| provider | TracingProvider | AUTO | Tracing provider to use |
| service_name | str | neo4j-agent-memory | Service name for traces |
| sample_rate | float | 1.0 | Fraction of requests to trace (0.0-1.0) |
| track_llm_calls | bool | True | Track LLM API calls |
| track_extraction | bool | True | Track extraction pipeline stages |
| track_memory_ops | bool | True | Track memory read/write operations |

Tracing Providers

| Provider | Description | Required Package |
|---|---|---|
| OPENTELEMETRY | Standard OpenTelemetry tracing | pip install neo4j-agent-memory[opentelemetry] |
| OPIK | Comet Opik for LLM observability | pip install neo4j-agent-memory[opik] |
| AUTO | Auto-detect available provider | Whichever is installed |
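AUTO picks whichever tracing backend is importable. A sketch of that detection idea (illustrative only; the library's actual detection order is not specified here):

```python
from importlib.util import find_spec

def detect_provider(is_installed=lambda pkg: find_spec(pkg) is not None):
    """Sketch of AUTO detection: prefer OpenTelemetry if its package is
    importable, fall back to Opik, otherwise disable tracing."""
    for package, provider in (("opentelemetry", "OPENTELEMETRY"),
                              ("opik", "OPIK")):
        if is_installed(package):
            return provider
    return None

# Injecting a checker makes the result deterministic for this demo
print(detect_provider(lambda pkg: pkg == "opik"))  # OPIK
```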

CLI Configuration

The CLI tool uses environment variables and optional configuration files.

Environment Variables

# Neo4j connection (required for memory commands)
NAM_NEO4J__URI=bolt://localhost:7687
NAM_NEO4J__USERNAME=neo4j
NAM_NEO4J__PASSWORD=your-password

# Extraction settings
NAM_EXTRACTION__EXTRACTOR_TYPE=pipeline
NAM_EXTRACTION__ENABLE_GLINER=true

# Output format
NAM_CLI__OUTPUT_FORMAT=json              # json, table, yaml
NAM_CLI__VERBOSE=false
NAM_CLI__COLOR=true

Configuration File

Create a .neo4j-memory.yaml file in your project or home directory:

# .neo4j-memory.yaml
neo4j:
  uri: bolt://localhost:7687
  username: neo4j
  password: your-password

extraction:
  extractor_type: pipeline
  enable_gliner: true
  gliner_threshold: 0.5

cli:
  output_format: json
  verbose: false
  color: true

CLI Commands

# Extract entities from text
neo4j-memory extract "John works at Acme Corp in New York"

# Extract with specific schema
neo4j-memory extract --schema poleo "..."

# Extract from file
neo4j-memory extract --input document.txt

# List available schemas
neo4j-memory schemas list

# Show schema details
neo4j-memory schemas show poleo

# Get extraction statistics
neo4j-memory stats

# Output as table
neo4j-memory extract --format table "..."

Complete Example

Python Configuration

from neo4j_agent_memory import (
    MemorySettings,
    Neo4jConfig,
    EmbeddingConfig,
    EmbeddingProvider,
    ExtractionConfig,
    ExtractorType,
    MergeStrategy,
    SchemaConfig,
    SchemaModel,
    ResolutionConfig,
    ResolverStrategy,
    LLMConfig,
    LLMProvider,
    DeduplicationConfig,
    DeduplicationStrategy,
    GeocodingConfig,
    GeocodingProvider,
    ObservabilityConfig,
    TracingProvider,
)
from pydantic import SecretStr

settings = MemorySettings(
    neo4j=Neo4jConfig(
        uri="bolt://localhost:7687",
        username="neo4j",
        password=SecretStr("password"),
    ),
    embedding=EmbeddingConfig(
        provider=EmbeddingProvider.OPENAI,
        model="text-embedding-3-small",
    ),
    extraction=ExtractionConfig(
        extractor_type=ExtractorType.PIPELINE,
        enable_spacy=True,
        enable_gliner=True,
        enable_llm_fallback=True,
        merge_strategy=MergeStrategy.CONFIDENCE,
        # GLiREL for relation extraction
        enable_gliner_relations=True,
    ),
    schema=SchemaConfig(
        model=SchemaModel.POLEO,
        enable_subtypes=True,
    ),
    resolution=ResolutionConfig(
        strategy=ResolverStrategy.COMPOSITE,
    ),
    deduplication=DeduplicationConfig(
        enabled=True,
        strategy=DeduplicationStrategy.COMPOSITE,
        embedding_threshold=0.92,
    ),
    geocoding=GeocodingConfig(
        enabled=True,
        provider=GeocodingProvider.NOMINATIM,
        cache_results=True,
    ),
    llm=LLMConfig(
        provider=LLMProvider.OPENAI,
        model="gpt-4o-mini",
    ),
    observability=ObservabilityConfig(
        enabled=True,
        provider=TracingProvider.OPIK,
        project_name="my-agent-memory",
    ),
)

Environment Variables (.env file)

# Neo4j
NAM_NEO4J__URI=bolt://localhost:7687
NAM_NEO4J__USERNAME=neo4j
NAM_NEO4J__PASSWORD=your-password

# Embedding
NAM_EMBEDDING__PROVIDER=openai
NAM_EMBEDDING__MODEL=text-embedding-3-small

# Extraction
NAM_EXTRACTION__EXTRACTOR_TYPE=pipeline
NAM_EXTRACTION__ENABLE_SPACY=true
NAM_EXTRACTION__ENABLE_GLINER=true
NAM_EXTRACTION__ENABLE_LLM_FALLBACK=true
NAM_EXTRACTION__MERGE_STRATEGY=confidence
NAM_EXTRACTION__ENABLE_GLINER_RELATIONS=true

# Schema
NAM_SCHEMA__MODEL=poleo
NAM_SCHEMA__ENABLE_SUBTYPES=true

# Resolution
NAM_RESOLUTION__STRATEGY=composite

# Deduplication
NAM_DEDUPLICATION__ENABLED=true
NAM_DEDUPLICATION__STRATEGY=composite
NAM_DEDUPLICATION__EMBEDDING_THRESHOLD=0.92

# Geocoding (for LOCATION entities)
NAM_GEOCODING__ENABLED=true
NAM_GEOCODING__PROVIDER=nominatim
NAM_GEOCODING__CACHE_RESULTS=true
# For Google Maps (instead of Nominatim):
# NAM_GEOCODING__PROVIDER=google
# NAM_GEOCODING__API_KEY=your-google-api-key

# Observability
NAM_OBSERVABILITY__ENABLED=true
NAM_OBSERVABILITY__PROVIDER=opik
NAM_OBSERVABILITY__PROJECT_NAME=my-agent-memory

# LLM
NAM_LLM__PROVIDER=openai
NAM_LLM__MODEL=gpt-4o-mini

# OpenAI API Key
OPENAI_API_KEY=sk-...

Configuration Precedence

When using multiple configuration methods, the precedence is:

  1. Explicit Python arguments (highest priority)

  2. Environment variables with NAM_ prefix

  3. Default values (lowest priority)

import os

# Set environment variable
os.environ["NAM_NEO4J__URI"] = "bolt://env-server:7687"

# This will use the environment variable
settings = MemorySettings()
print(settings.neo4j.uri)  # bolt://env-server:7687

# This will override the environment variable
settings = MemorySettings(
    neo4j={"uri": "bolt://explicit-server:7687"}
)
print(settings.neo4j.uri)  # bolt://explicit-server:7687

Validation

Settings are validated using Pydantic. Invalid configurations raise ValidationError:

from neo4j_agent_memory import MemorySettings, ExtractionConfig
from pydantic import ValidationError

try:
    settings = MemorySettings(
        extraction=ExtractionConfig(
            gliner_threshold=1.5  # Invalid: must be 0.0-1.0
        )
    )
except ValidationError as e:
    print(f"Configuration error: {e}")