ChromaDB

Here is a list of all available ChromaDB procedures, note that the list and the signature procedures are consistent with the others, like the Qdrant ones:

name description

apoc.vectordb.chroma.createCollection(hostOrKey, collection, similarity, size, $config)

Creates a collection, with the name specified in the 2nd parameter, and with the specified similarity and size. The default endpoint is <hostOrKey param>/api/v1/collections.

apoc.vectordb.chroma.deleteCollection(hostOrKey, collection, $config)

Deletes a collection with the name specified in the 2nd parameter. The default endpoint is <hostOrKey param>/api/v1/collections/<collection param>.

apoc.vectordb.chroma.upsert(hostOrKey, collection, vectors, $config)

Upserts, in the collection with the name specified in the 2nd parameter, the vectors [{id: 'id', vector: '<vectorDb>', medatada: '<metadata>'}]. The default endpoint is <hostOrKey param>/api/v1/collections/<collection param>/upsert.

apoc.vectordb.chroma.delete(hostOrKey, collection, ids, $config)

Deletes the vectors with the specified ids. The default endpoint is <hostOrKey param>/api/v1/collections/<collection param>/delete.

apoc.vectordb.chroma.get(hostOrKey, collection, ids, $config)

Gets the vectors with the specified ids. The default endpoint is <hostOrKey param>/api/v1/collections/<collection param>/get.

apoc.vectordb.chroma.query(hostOrKey, collection, vector, filter, limit, $config)

Retrieve closest vectors from the defined vector, limit of results, in the collection with the name specified in the 2nd parameter. The default endpoint is <hostOrKey param>/api/v1/collections/<collection param>/query.

apoc.vectordb.chroma.getAndUpdate(hostOrKey, collection, ids, $config)

Gets the vectors with the specified ids, and optionally creates/updates neo4j entities. The default endpoint is <hostOrKey param>/api/v1/collections/<collection param>/get.

apoc.vectordb.chroma.queryAndUpdate(hostOrKey, collection, vector, filter, limit, $config)

Retrieve closest vectors from the defined vector, limit of results, in the collection with the name specified in the 2nd parameter, and optionally creates/updates neo4j entities. The default endpoint is <hostOrKey param>/api/v1/collections/<collection param>/query.

where the 1st parameter can be a key defined by the apoc config apoc.chroma.<key>.host=myHost. With hostOrKey=null, the default is 'http://localhost:8000'.

Examples

Create a collection (it leverages this API)
CALL apoc.vectordb.chroma.createCollection($host, 'test_collection', 'Cosine', 4, {<optional config>})
Table 1. Example results
name metadata database id tenant

test_collection

{"size": 4, "hnsw:space": "cosine"}

default_database

9c046861-f46f-417d-bd01-ca8c9f99aee5

default_tenant

Delete a collection (it leverages this API)
CALL apoc.vectordb.chroma.deleteCollection($host, '<collection_id>', {<optional config>})

which returns an empty result.

Upsert vectors (it leverages this API)
CALL apoc.vectordb.qdrant.upsert($host, '<collection_id>',
    [
        {id: 1, vector: [0.05, 0.61, 0.76, 0.74], metadata: {city: "Berlin", foo: "one"}, text: 'ajeje'},
        {id: 2, vector: [0.19, 0.81, 0.75, 0.11], metadata: {city: "London", foo: "two"}, text: 'brazorf'}
    ],
    {<optional config>})

which returns an empty result.

Get vectors (it leverages this API)
CALL apoc.vectordb.chroma.get($host, '<collection_id>', ['1','2'], {<optional config>}), text
Table 2. Example results
score metadata id vector text entity

null

{city: "Berlin", foo: "one"}

null

null

null

null

null

{city: "Berlin", foo: "two"}

null

null

null

null

Get vectors with {allResults: true}
CALL apoc.vectordb.chroma.get($host, '<collection_id>', ['1','2'], {<optional config>}), text
Table 3. Example results
score metadata id vector text entity

null

{city: "Berlin", foo: "one"}

1

[…​]

ajeje

null

null

{city: "Berlin", foo: "two"}

2

[…​]

brazorf

null

Query vectors (it leverages this API)
CALL apoc.vectordb.chroma.query($host,
    '<collection_id>',
    [0.2, 0.1, 0.9, 0.7],
    {city: 'London'},
    5,
    {allResults: true, <optional config>}), text
Table 4. Example results
score metadata id vector text

1,

{city: "Berlin", foo: "one"}

1

[…​]

ajeje

0.1

{city: "Berlin", foo: "two"}

2

[…​]

brazorf

To optimize performances, we can choose what to YIELD with the apoc.vectordb.chroma.query and the apoc.vectordb.chroma.get procedures. For example, by executing a CALL apoc.vectordb.chroma.query(…​) YIELD metadata, score, id, the RestAPI request will have an {"include": ["metadatas", "documents", "distances"]}, so that we do not return the other values that we do not need.

In the same way as other procedures, we can define a mapping, to fetch the associated nodes and relationships and optionally create them, by leveraging the vector metadata. For example:

Query vectors
CALL apoc.vectordb.chroma.query($host, '<collection_id>',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { mapping: {
            embeddingKey: "vect",
            nodeLabel: "Test",
            entityKey: "myId",
            metadataKey: "foo"
        }
    })

To optimize performances, we can choose what to YIELD with the apoc.vectordb.chroma.query and the apoc.vectordb.chroma.get procedures. For example, by executing a CALL apoc.vectordb.chroma.query(…​) YIELD metadata, score, id, the RestAPI request will have an {"include": ["metadatas", "documents", "distances"]}, so that we do not return the other values that we do not need.

It is possible to execute vector db procedures together with the apoc.ml.rag as follow:

CALL apoc.vectordb.chroma.getAndUpdate($host, $collection, [<id1>, <id2>], $conf) YIELD node, metadata, id, vector
WITH collect(node) as paths
CALL apoc.ml.rag(paths, $attributes, $question, $confPrompt) YIELD value
RETURN value

which returns a string that answers the $question by leveraging the embeddings of the db vector.

Delete vectors (it leverages this API)
CALL apoc.vectordb.chroma.delete($host, '<collection_id>', [1,2], {<optional config>})

which returns an array of strings of deleted ids. For example, ["1", "2"]