LLM Provider API
Public exports
from neo4j_agent_memory.llm import (
    # Protocols
    LLMProvider,
    StructuredExtractor,
    EmbeddingProvider,
    # Types
    ChatMessage,
    Completion,
    Usage,
    # Errors
    ProviderError,
    ProviderAuthError,
    ProviderRateLimitError,
    ProviderTimeoutError,
    ProviderInvalidRequestError,
    ProviderServiceError,
    StructuredExtractionError,
    EmbeddingDimensionMismatchError,
    # Factory
    from_provider,
    # Helpers
    schema_aligned_extract,
)
Protocols
LLMProvider
Chat completions. Bronze TCK tier.
class LLMProvider(Protocol):
    model: str

    async def complete(
        self,
        messages: Sequence[ChatMessage],
        *,
        temperature: float = 0.0,
        max_tokens: int | None = None,
        stop: Sequence[str] | None = None,
        timeout: float | None = None,
    ) -> Completion: ...
Implementations MUST:
- Be safe to call concurrently, with no shared mutable state.
- Translate SDK-specific errors to ProviderError subclasses.
- Honor temperature=0.0 as deterministic where the provider supports it.
The model attribute is the canonical "provider/model" identifier. Adapters that accept bare names normalise to include the prefix.
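As a rough illustration of the contract above, the sketch below shows a toy adapter that echoes the last user message. It is not one of the library's adapters. ChatMessage, Completion, and the error classes are local stand-ins so the snippet runs on its own; the Completion field name `text`, the `EchoProvider` class, and its `prefix` argument are assumptions made for this example.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass
from typing import Sequence


# Local stand-ins; the real library types may carry different fields.
@dataclass(frozen=True)
class ChatMessage:
    role: str
    content: str


@dataclass(frozen=True)
class Completion:
    text: str  # hypothetical field name


class ProviderError(Exception):
    pass


class ProviderTimeoutError(ProviderError):
    pass


class EchoProvider:
    """Toy adapter: echoes the last user message instead of calling an SDK."""

    def __init__(self, model: str, *, prefix: str = "echo") -> None:
        # Normalise bare names to the canonical "provider/model" form.
        self.model = model if "/" in model else f"{prefix}/{model}"

    async def complete(
        self,
        messages: Sequence[ChatMessage],
        *,
        temperature: float = 0.0,
        max_tokens: int | None = None,
        stop: Sequence[str] | None = None,
        timeout: float | None = None,
    ) -> Completion:
        try:
            # All per-call state is local, so concurrent calls cannot interfere.
            return await asyncio.wait_for(self._call(messages), timeout)
        except asyncio.TimeoutError as exc:
            # Translate the low-level error into the ProviderError hierarchy.
            raise ProviderTimeoutError(f"{self.model}: timed out") from exc

    async def _call(self, messages: Sequence[ChatMessage]) -> Completion:
        await asyncio.sleep(0)  # placeholder for the real network call
        last_user = next(
            (m.content for m in reversed(messages) if m.role == "user"), ""
        )
        return Completion(text=last_user)
```

A real adapter would map each SDK exception to the matching ProviderError subclass in the same way the timeout is mapped here.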
StructuredExtractor
Validated Pydantic outputs. Silver TCK tier.
class StructuredExtractor(Protocol):
    async def complete_structured(
        self,
        messages: Sequence[ChatMessage],
        response_model: type[T],
        *,
        temperature: float = 0.0,
        max_retries: int = 2,
        timeout: float | None = None,
    ) -> T: ...
Implementations MUST:
- Use the most reliable structured-output mode the underlying provider supports: strict JSON schema (OpenAI), forced tool use (Anthropic), response_format (LiteLLM), etc.
- Retry on ValidationError up to max_retries times, with feedback.
- Raise StructuredExtractionError after exhausting retries.
Adapters without a native structured mode delegate to schema_aligned_extract.
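The retry-with-feedback requirement can be sketched as follows. This is a simplified illustration, not the library's implementation. It uses only the standard library: `ask` and `validate` are hypothetical callables standing in for the provider call and for Pydantic validation, and the function name `extract_with_retries` is made up. It also assumes that max_retries counts retries after one initial attempt.

```python
from __future__ import annotations

import asyncio
import json
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class StructuredExtractionError(Exception):
    """Local stand-in for the library error of the same name."""


async def extract_with_retries(
    ask: Callable[[list[str]], Awaitable[str]],  # hypothetical: history -> raw text
    validate: Callable[[dict], T],  # stand-in for Pydantic validation
    prompt: str,
    *,
    max_retries: int = 2,
) -> T:
    history = [prompt]
    last_error: Exception | None = None
    # One initial attempt plus up to max_retries retries.
    for _ in range(max_retries + 1):
        raw = await ask(history)
        try:
            return validate(json.loads(raw))
        except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
            last_error = exc
            # Feed the failure back so the next attempt can correct it.
            feedback = f"Invalid output: {exc}. Reply with corrected JSON only."
            history = history + [raw, feedback]
    raise StructuredExtractionError(
        f"gave up after {max_retries} retries"
    ) from last_error
```

Chaining the last validation error with `from` keeps the root cause visible to callers that catch StructuredExtractionError.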
EmbeddingProvider
Text embeddings. Bronze TCK tier.
class EmbeddingProvider(Protocol):
    model: str
    dimensions: int

    async def embed(self, texts: Sequence[str]) -> list[list[float]]: ...
    async def embed_one(self, text: str) -> list[float]: ...
Contract:
- dimensions must be available at construction time (not lazily).
- embed([]) returns [].
- Every returned vector has length dimensions.
MemoryClient.connect() reads dimensions to size vector indexes and validates against existing indexes — see the migration runbook.
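To make the contract concrete, here is a minimal sketch of a toy embedder together with a check that exercises each rule. `HashEmbedder` and `check_contract` are hypothetical names invented for this example. The vectors come from a SHA-256 digest and carry no semantic meaning, so the class is only suitable for tests.

```python
from __future__ import annotations

import asyncio
import hashlib
from typing import Sequence


class HashEmbedder:
    """Toy embedder: deterministic pseudo-vectors from a SHA-256 digest."""

    def __init__(self, model: str, dimensions: int) -> None:
        self.model = model
        self.dimensions = dimensions  # known at construction, never lazy

    async def embed(self, texts: Sequence[str]) -> list[list[float]]:
        return [self._vector(t) for t in texts]  # empty input yields []

    async def embed_one(self, text: str) -> list[float]:
        return (await self.embed([text]))[0]

    def _vector(self, text: str) -> list[float]:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Cycle through the 32 digest bytes to fill exactly `dimensions` slots.
        return [digest[i % len(digest)] / 255.0 for i in range(self.dimensions)]


async def check_contract(provider) -> None:
    """Assert the three contract rules against any embedding provider."""
    assert isinstance(provider.dimensions, int) and provider.dimensions > 0
    assert await provider.embed([]) == []
    vectors = await provider.embed(["alpha", "beta"])
    assert len(vectors) == 2
    assert all(len(v) == provider.dimensions for v in vectors)
    assert len(await provider.embed_one("alpha")) == provider.dimensions
```

Running `asyncio.run(check_contract(HashEmbedder("toy/hash", 8)))` completes silently when the contract holds.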
Types
ChatMessage
Pydantic, frozen.
class ChatMessage(BaseModel):
    role: Literal["system", "user", "assistant", "tool"]
    content: str
    name: str | None = None
    tool_call_id: str | None = None
content is str only — multimodal is a v0.4+ concern.
Exception hierarchy
All inherit from ProviderError, which inherits from Exception. None inherit from MemoryError — provider errors are intentionally separate from storage errors.
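| Class | Raised when |
|---|---|
| ProviderError | Base class. |
| ProviderAuthError | API key invalid / missing / expired. |
| ProviderRateLimitError | 429 / quota exceeded. Carries |
| ProviderTimeoutError | Request exceeded configured timeout. |
| ProviderInvalidRequestError | Malformed request (unknown model, bad params). |
| ProviderServiceError | Retriable 5xx server error. |
| StructuredExtractionError | SAP exhausted retries. Carries |
| EmbeddingDimensionMismatchError | Existing vector index disagrees with embedder. Carries |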
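One way a caller might use the hierarchy is to retry only the class documented as retriable and let everything else propagate. The sketch below is an assumption about caller-side handling, not behaviour the library is stated to provide. The error classes are local stand-ins, and `call_with_retry` with its `attempts` and `base_delay` parameters is hypothetical.

```python
import asyncio


# Local stand-ins that mirror part of the documented hierarchy.
class ProviderError(Exception):
    pass


class ProviderAuthError(ProviderError):
    pass


class ProviderServiceError(ProviderError):
    pass


async def call_with_retry(call, *, attempts: int = 3, base_delay: float = 0.5):
    """Retry only ProviderServiceError; any other error propagates at once."""
    for attempt in range(attempts):
        try:
            return await call()
        except ProviderServiceError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last service error
            # Exponential backoff: base_delay, then 2x, 4x, ...
            await asyncio.sleep(base_delay * 2**attempt)
```

Because the provider errors derive from Exception rather than MemoryError, an `except ProviderError` clause will not swallow storage errors.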
Factory
from_provider(model: str, *, kind="llm", prefer_litellm=False, **kwargs) — string-shorthand factory. Returns an LLMProvider or EmbeddingProvider depending on kind. See Factory Reference.
Helper functions
schema_aligned_extract
async def schema_aligned_extract(
    provider: LLMProvider,
    messages: Sequence[ChatMessage],
    response_model: type[T],
    *,
    temperature: float = 0.0,
    max_retries: int = 2,
    timeout: float | None = None,
) -> T: ...
Generic structured-output path. Builds a system message containing the schema, parses tolerantly, validates against response_model, retries with feedback on failure. Use directly when your provider does not implement StructuredExtractor.
Tolerant JSON parser
schema_aligned_extract strips markdown fences, smart quotes, trailing commas, and finds the largest balanced {…} block. Truncated JSON (mid-string cutoff) surfaces as json.JSONDecodeError — the correct outcome, since it must trigger a retry.
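The cleanup steps described above could look roughly like the sketch below. This is an approximation written for illustration, not the parser shipped with the library. The name `tolerant_parse` and the helper names are hypothetical. The trailing-comma regex is deliberately simple and would also alter a `,}` sequence inside a string value.

```python
import json
import re

# Built with "`" * 3 so this sketch never contains a literal fence marker.
_FENCE = re.compile("`" * 3 + r"[\w-]*")
_TRAILING_COMMA = re.compile(r",\s*([}\]])")
_SMART_QUOTES = str.maketrans(
    {"\u201c": '"', "\u201d": '"', "\u2018": "'", "\u2019": "'"}
)


def _balanced_blocks(text: str) -> list:
    """Return every top-level balanced {...} span, ignoring braces in strings."""
    blocks, depth, start = [], 0, 0
    in_string = escaped = False
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"' and depth > 0:
            in_string = True  # only track strings inside a candidate block
        elif ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                blocks.append(text[start : i + 1])
    return blocks


def tolerant_parse(raw: str) -> dict:
    text = _FENCE.sub("", raw).translate(_SMART_QUOTES)
    blocks = _balanced_blocks(text)
    if not blocks:
        # Truncated or brace-free output: fail loudly so the caller retries.
        raise json.JSONDecodeError("no balanced JSON object found", text, 0)
    candidate = _TRAILING_COMMA.sub(r"\1", max(blocks, key=len))
    return json.loads(candidate)
```

An output cut off mid-string never closes its outer brace, so no balanced block is found and json.JSONDecodeError is raised rather than a partial object being returned.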
Default model dimensions
neo4j_agent_memory.llm.defaults exposes:
- EMBEDDING_DIMENSIONS: dict[str, int], mapping known model names to dimensions.
- lookup_embedding_dimensions(model: str) -> int | None, a tolerant lookup that works with and without the provider prefix.
For models not in the table, embedding adapters require an explicit dimensions=N constructor argument.
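A prefix-tolerant lookup of this kind might behave like the sketch below. The table is an illustrative two-entry subset, not the library's actual contents, and storing keys in prefixed form is an assumption. `resolve_dimensions` is a hypothetical helper showing how an adapter constructor could fall back to an explicit dimensions argument.

```python
from __future__ import annotations

# Illustrative subset only; the library's real table is not reproduced here.
EMBEDDING_DIMENSIONS: dict[str, int] = {
    "openai/text-embedding-3-small": 1536,
    "openai/text-embedding-3-large": 3072,
}


def lookup_embedding_dimensions(model: str) -> int | None:
    if model in EMBEDDING_DIMENSIONS:
        return EMBEDDING_DIMENSIONS[model]
    if "/" not in model:
        # Bare name: compare against the part after each provider prefix.
        for known, dims in EMBEDDING_DIMENSIONS.items():
            if known.split("/", 1)[1] == model:
                return dims
    return None


def resolve_dimensions(model: str, dimensions: int | None = None) -> int:
    """Hypothetical helper: an explicit value wins, otherwise use the table."""
    if dimensions is not None:
        return dimensions
    found = lookup_embedding_dimensions(model)
    if found is None:
        raise ValueError(f"unknown model {model!r}: pass dimensions=N explicitly")
    return found
```

Resolving the value eagerly in the constructor keeps dimensions available before any embedding call is made.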
Related
- Adapters Reference: constructor signatures for every adapter.
- Factory Reference: from_provider details.
- Configuration: how to wire providers into MemorySettings.
- Why the Provider Protocol?: design rationale.