Semiont: an Annotated Semantic Layer

Semiont is a new semantic layer for agentic AI from the AI Alliance. Its creator and main author is Adam Pingel, now an engineer at IBM Research and the head of the Knowledge working group at the AI Alliance. Adam is a veteran By the Bay speaker who has led data engineering teams in the legal and document space for more than a decade. Prior to IBM, Adam was a CTO at LexisNexis.

Semiont addresses the key problem in knowledge graphs: the cold start, that is, creating the graph in the first place.

Semiont preserves all of the original documents and builds a wiki around them. It does this through annotation and resource generation, aided by named entity extraction. An annotation selects a lexical span describing a concept and attaches metadata to it. Annotations are standardized by the W3C (the Web Annotation Data Model), so they can be represented as JSON-LD and stored in a variety of data stores.
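To make the idea of an annotation concrete, here is a minimal sketch of one in TypeScript, shaped after the W3C Web Annotation Data Model. It is an illustration only, not Semiont's actual schema or API: the `annotateSpan` helper, the `example.org` identifiers, and the choice of a "tagging" motivation are assumptions made for this example.

```typescript
// Subset of the W3C Web Annotation Data Model, typed for illustration.
// These interfaces are an assumption for this sketch, not Semiont's schema.
interface TextQuoteSelector {
  type: "TextQuoteSelector";
  exact: string;
}

interface TextPositionSelector {
  type: "TextPositionSelector";
  start: number; // inclusive character offset
  end: number; // exclusive character offset
}

interface TextualBody {
  type: "TextualBody";
  value: string;
  purpose: "tagging";
}

interface Annotation {
  "@context": "http://www.w3.org/ns/anno.jsonld";
  id: string;
  type: "Annotation";
  motivation: "tagging";
  body: TextualBody;
  target: {
    source: string;
    selector: [TextQuoteSelector, TextPositionSelector];
  };
}

// Hypothetical helper: annotate the first occurrence of `exact` in `text`
// with an entity-type tag. The original document is left untouched.
function annotateSpan(
  id: string,
  source: string,
  text: string,
  exact: string,
  tag: string
): Annotation {
  const start = text.indexOf(exact);
  if (start < 0) {
    throw new Error(`Span not found in document: ${exact}`);
  }
  return {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    id,
    type: "Annotation",
    motivation: "tagging",
    body: { type: "TextualBody", value: tag, purpose: "tagging" },
    target: {
      source,
      selector: [
        { type: "TextQuoteSelector", exact },
        { type: "TextPositionSelector", start, end: start + exact.length },
      ],
    },
  };
}

const documentText = "Adam Pingel works at IBM Research.";
const annotation = annotateSpan(
  "https://example.org/annotations/1", // placeholder IRI
  "https://example.org/docs/1", // placeholder IRI
  documentText,
  "IBM Research",
  "Organization"
);

// Plain objects like this serialize directly to JSON-LD.
console.log(JSON.stringify(annotation, null, 2));
```

Because the annotation points at the document by IRI and by selector, the source text stays as it was, and the metadata can live in whatever store suits the deployment.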

A typical annotation would be for a named entity in the document, such as a place or a name. Semiont uses LLMs to create annotations for named entities. Once an entity is annotated, you can also generate a resource for it and link to it from the document, adding to the graph.
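The step from an annotated entity to a linked resource can be sketched as a tiny in-memory graph. This is a hedged illustration under stated assumptions: the `KnowledgeGraph` class, its `linkEntity` and `linkedFrom` methods, and the `example.org` IRIs are hypothetical names invented here, and the LLM-driven detection and content generation that Semiont performs are left out entirely.

```typescript
// A generated resource: a new node in the graph (hypothetical shape).
interface Resource {
  id: string;
  name: string;
  entityType: string;
}

// A linking annotation in the style of the W3C Web Annotation Data Model:
// the body is the IRI of the resource that the span refers to.
interface LinkingAnnotation {
  type: "Annotation";
  motivation: "linking";
  body: string;
  target: {
    source: string;
    selector: { type: "TextQuoteSelector"; exact: string };
  };
}

class KnowledgeGraph {
  resources = new Map<string, Resource>();
  annotations: LinkingAnnotation[] = [];

  // Create the resource for an entity if it does not exist yet, then record
  // an edge from the document span to that resource.
  linkEntity(source: string, exact: string, entityType: string): Resource {
    const id = `https://example.org/resources/${encodeURIComponent(
      exact.toLowerCase()
    )}`;
    let resource = this.resources.get(id);
    if (!resource) {
      resource = { id, name: exact, entityType };
      this.resources.set(id, resource);
    }
    this.annotations.push({
      type: "Annotation",
      motivation: "linking",
      body: resource.id,
      target: { source, selector: { type: "TextQuoteSelector", exact } },
    });
    return resource;
  }

  // Resource IRIs reachable from a given document, without duplicates.
  linkedFrom(source: string): string[] {
    const ids = this.annotations
      .filter((a) => a.target.source === source)
      .map((a) => a.body);
    return [...new Set(ids)];
  }
}

const graph = new KnowledgeGraph();
const first = graph.linkEntity("https://example.org/docs/1", "IBM Research", "Organization");
const second = graph.linkEntity("https://example.org/docs/2", "IBM Research", "Organization");

// Both documents now link to the same node.
console.log(first.id === second.id, graph.linkedFrom("https://example.org/docs/1"));
```

Keying resources by a normalized name is a deliberate simplification; real entity resolution has to decide when two different spans refer to the same thing.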

Resource generation helps create new nodes. The underlying graph structure connects resources, and their textual renderings are derived from their definitions. This way, Semiont builds on the large body of work led by the W3C on separating definition from presentation, and uses it as the appropriate abstraction.

Annotations provide a new way to create, persist, and transfer context between agents. In the current GraphRAG approaches, documents are chunked, entities are extracted, and the resulting graph is queried and traversed by bespoke agents with domain-specific logic. Their AI Memory is reconstructed from a disparate set of data sources. Passing context between agents or aggregating it is bespoke and differs between AI Memory providers. Since annotations are a standard, leveraging them for AI Memory can be a powerful step towards interoperability.

Semiont is built on four principles:
* Annotation Detection
* Entity Resolution
* Context Retrieval
* Resource Generation

The project is written in TypeScript with the help of AI coding agents, and you can get started immediately using GitHub Codespaces.

For now, Semiont welcomes new use cases, where interaction will inform new features. There’s a robust backlog for potential contributors to operationalize the system.

The APIs and rigorous engineering enable interaction over MCP (Model Context Protocol) and agentic integration. There’s an example agent in the repository.

In this interview, Alexy Khrabrov, the founder of AI By the Bay meetups and conferences, goes over the motivation, design choices, and workings of Semiont. We invite you to bring your use cases and try it out!