Weaviate

Here is a list of all available Weaviate procedures. Note that the list and the procedure signatures are consistent with those of the other vector databases, such as Qdrant:

name description

apoc.vectordb.weaviate.createCollection(hostOrKey, collection, similarity, size, $config)

Creates a collection, with the name specified in the 2nd parameter, and with the specified similarity and size. The default endpoint is <hostOrKey param>/schema.

apoc.vectordb.weaviate.deleteCollection(hostOrKey, collection, $config)

Deletes a collection with the name specified in the 2nd parameter. The default endpoint is <hostOrKey param>/schema/<collection param>.

apoc.vectordb.weaviate.upsert(hostOrKey, collection, vectors, $config)

Upserts, in the collection with the name specified in the 2nd parameter, the vectors [{id: 'id', vector: '<vectorDb>', metadata: '<metadata>'}]. The default endpoint is <hostOrKey param>/objects.

apoc.vectordb.weaviate.delete(hostOrKey, collection, ids, $config)

Deletes the vectors with the specified ids. The default endpoint is <hostOrKey param>/schema.

apoc.vectordb.weaviate.get(hostOrKey, collection, ids, $config)

Gets the vectors with the specified ids. The default endpoint is <hostOrKey param>/schema.

apoc.vectordb.weaviate.query(hostOrKey, collection, vector, filter, limit, $config)

Retrieves the vectors closest to the specified vector, up to limit results, from the collection with the name specified in the 2nd parameter. Note that, besides the common config parameters, this procedure requires a fields: [listOfProperty] config entry, which defines the properties to be retrieved by the GraphQL query running under the hood. The default endpoint is <hostOrKey param>/graphql.

apoc.vectordb.weaviate.getAndUpdate(hostOrKey, collection, ids, $config)

Gets the vectors with the specified ids, and optionally creates/updates Neo4j entities. The default endpoint is <hostOrKey param>/schema.

apoc.vectordb.weaviate.queryAndUpdate(hostOrKey, collection, vector, filter, limit, $config)

Retrieves the vectors closest to the specified vector, up to limit results, from the collection with the name specified in the 2nd parameter, and optionally creates/updates Neo4j entities. Note that, besides the common config parameters, this procedure requires a fields: [listOfProperty] config entry, which defines the properties to be retrieved by the GraphQL query running under the hood. The default endpoint is <hostOrKey param>/graphql.

In all of these procedures, the 1st parameter can be either a host or a key defined by the APOC config apoc.weaviate.<key>.host=myHost. With hostOrKey=null, the default host is 'http://localhost:8080/v1'.
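As a sketch of the key-based form, assume the APOC configuration contains an entry for a hypothetical key named myWeaviate (the key name and the host value are illustrative assumptions, not taken from the list above). The key can then be passed in place of the host, and null falls back to the default host:

```cypher
// Assumed entry in the APOC configuration (the key name 'myWeaviate' is hypothetical):
//   apoc.weaviate.myWeaviate.host=http://localhost:8080/v1

// Pass the configured key instead of a host as the 1st parameter
CALL apoc.vectordb.weaviate.createCollection('myWeaviate', 'TestCollection', 'cosine', 4, {});

// Pass null to use the default host 'http://localhost:8080/v1'
CALL apoc.vectordb.weaviate.deleteCollection(null, 'TestCollection', {})
```

Using a key keeps the host out of individual queries, so it can be changed in one place.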

Examples

Create a collection (it leverages this API)
CALL apoc.vectordb.weaviate.createCollection($host, 'test_collection', 'Cosine', 4, {<optional config>})
Create a collection against a remote connection using an API key (see here)
CALL apoc.vectordb.weaviate.createCollection("https://<weaviateInstanceId>.weaviate.network",
    'TestCollection',
    'cosine',
    4,
    {headers: {Authorization: 'Bearer <apiKey>'}})
Delete a collection (it leverages this API)
CALL apoc.vectordb.weaviate.deleteCollection($host, 'test_collection', {<optional config>})
Upsert vectors (it leverages this API)
CALL apoc.vectordb.weaviate.upsert($host, 'test_collection',
    [
        {id: "8ef2b3a7-1e56-4ddd-b8c3-2ca8901ce308", vector: [0.05, 0.61, 0.76, 0.74], metadata: {city: "Berlin", foo: "one"}},
        {id: "9ef2b3a7-1e56-4ddd-b8c3-2ca8901ce308", vector: [0.19, 0.81, 0.75, 0.11], metadata: {city: "London", foo: "two"}}
    ],
    {<optional config>})
Get vectors (it leverages this API)
CALL apoc.vectordb.weaviate.get($host, 'test_collection', [1,2], {<optional config>})
Table 1. Example results
score | metadata | id | vector | text | entity
null | {city: "Berlin", foo: "one"} | null | null | null | null
null | {city: "Berlin", foo: "two"} | null | null | null | null

Get vectors with {allResults: true}
CALL apoc.vectordb.weaviate.get($host, 'test_collection', [1,2], {allResults: true, <optional config>})
Table 2. Example results
score | metadata | id | vector | text | entity
null | {city: "Berlin", foo: "one"} | 1 | […] | null | null
null | {city: "Berlin", foo: "two"} | 2 | […] | null | null

Query vectors (it leverages this API)
CALL apoc.vectordb.weaviate.query($host,
    'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    '{operator: Equal, valueString: "London", path: ["city"]}',
    5,
    {fields: ["city", "foo"], allResults: true, <other optional config>})
Table 3. Example results
score | metadata | id | vector | text
1 | {city: "Berlin", foo: "one"} | 1 | […] | null
0.1 | {city: "Berlin", foo: "two"} | 2 | […] | null

We can define a mapping, to fetch the associated nodes and relationships and optionally create them, by leveraging the vector metadata.

For example, if we have created 2 vectors with the above upsert procedure, we can populate some existing nodes (e.g. (:Test {myId: 'one'}) and (:Test {myId: 'two'})):

CALL apoc.vectordb.weaviate.query($host, 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { fields: ["city", "foo"],
      mapping: {
        embeddingKey: "vect",
        nodeLabel: "Test",
        entityKey: "myId",
        metadataKey: "foo"
      }
    })

This populates the two nodes as (:Test {myId: 'one', city: 'Berlin', vect: [vector1]}) and (:Test {myId: 'two', city: 'London', vect: [vector2]}), which are returned in the entity column of the result.
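To try the node mapping end to end, the following sketch creates the two target nodes before the call above is run, then inspects them afterwards. It uses only plain Cypher and assumes the vectors were upserted as in the earlier upsert example.

```cypher
// Run before the query call: create the nodes that the mapping matches on myId
CREATE (:Test {myId: 'one'}), (:Test {myId: 'two'});

// Run after the query call: inspect the properties populated from the vector metadata
MATCH (n:Test)
WHERE n.myId IN ['one', 'two']
RETURN n.myId AS myId, n.city AS city, n.vect AS vect
ORDER BY myId
```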

Alternatively, we can create the nodes if they do not exist, via create: true:

CALL apoc.vectordb.weaviate.query($host, 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { fields: ["city", "foo"],
      mapping: {
        create: true,
        embeddingKey: "vect",
        nodeLabel: "Test",
        entityKey: "myId",
        metadataKey: "foo"
      }
    })

which creates 2 new nodes as above.
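The procedure list also includes apoc.vectordb.weaviate.queryAndUpdate, which shares the signature of query and is described as optionally creating or updating Neo4j entities. A minimal sketch, assuming it accepts the same mapping config as the call above:

```cypher
// Same arguments as the query example; only the procedure name differs.
// Assumption: queryAndUpdate honours the same mapping keys as query.
CALL apoc.vectordb.weaviate.queryAndUpdate($host, 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { fields: ["city", "foo"],
      mapping: {
        create: true,
        embeddingKey: "vect",
        nodeLabel: "Test",
        entityKey: "myId",
        metadataKey: "foo"
      }
    })
```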

Or, we can populate existing relationships (e.g. (:Start)-[:TEST {myId: 'one'}]->(:End) and (:Start)-[:TEST {myId: 'two'}]->(:End)):

CALL apoc.vectordb.weaviate.query($host, 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { fields: ["city", "foo"],
      mapping: {
        embeddingKey: "vect",
        relType: "TEST",
        entityKey: "myId",
        metadataKey: "foo"
      }
    })

This populates the two relationships as ()-[:TEST {myId: 'one', city: 'Berlin', vect: [vector1]}]-() and ()-[:TEST {myId: 'two', city: 'London', vect: [vector2]}]-(), which are returned in the entity column of the result.
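The relationship case can be tried the same way. The sketch below creates the two relationships before the call above is run and inspects them afterwards. It uses only plain Cypher and assumes the vectors were upserted as in the earlier upsert example.

```cypher
// Run before the query call: create the relationships that the mapping matches on myId
CREATE (:Start)-[:TEST {myId: 'one'}]->(:End),
       (:Start)-[:TEST {myId: 'two'}]->(:End);

// Run after the query call: inspect the populated relationship properties
MATCH (:Start)-[r:TEST]->(:End)
RETURN r.myId AS myId, r.city AS city, r.vect AS vect
ORDER BY myId
```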

To optimize performance, we can choose what to YIELD with the apoc.vectordb.weaviate.query and the apoc.vectordb.weaviate.get procedures.

For example, by executing CALL apoc.vectordb.weaviate.query(…) YIELD metadata, score, id, the REST API request will include {"with_payload": false, "with_vectors": false}, so that the values we do not need are not returned.
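Written out in full, such a call might look like the sketch below. It reuses the query vector and fields from the earlier examples, and yields only the three columns named above:

```cypher
// Yield only the columns that are needed; vector, text and entity are left out
CALL apoc.vectordb.weaviate.query($host, 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    {fields: ["city", "foo"]})
YIELD metadata, score, id
RETURN metadata, score, id
```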

Delete vectors (it leverages this API)
CALL apoc.vectordb.weaviate.delete($host, 'test_collection', [1,2], {<optional config>})