Weaviate
Here is a list of all available Weaviate procedures. Note that the list and the procedure signatures are consistent with those of the other vector databases, such as Qdrant:
| name | description |
|---|---|
| apoc.vectordb.weaviate.info($host, $collectionName, $config) | Gets information about the specified existing collection, or throws a FileNotFoundException if it does not exist. |
| apoc.vectordb.weaviate.createCollection(hostOrKey, collection, similarity, size, $config) | Creates a collection, with the name specified in the 2nd parameter, and with the specified similarity and size. |
| apoc.vectordb.weaviate.deleteCollection(hostOrKey, collection, $config) | Deletes a collection with the name specified in the 2nd parameter. |
| apoc.vectordb.weaviate.upsert(hostOrKey, collection, vectors, $config) | Upserts, in the collection with the name specified in the 2nd parameter, the vectors [{id: 'id', vector: '<vectorDb>', metadata: '<metadata>'}]. |
| apoc.vectordb.weaviate.delete(hostOrKey, collection, ids, $config) | Deletes the vectors with the specified ids. |
| apoc.vectordb.weaviate.get(hostOrKey, collection, ids, $config) | Gets the vectors with the specified ids. |
| apoc.vectordb.weaviate.query(hostOrKey, collection, vector, filter, limit, $config) | Retrieves the closest vectors to the defined vector, up to limit results. |
| apoc.vectordb.weaviate.getAndUpdate(hostOrKey, collection, ids, $config) | Gets the vectors with the specified ids. |
| apoc.vectordb.weaviate.queryAndUpdate(hostOrKey, collection, vector, filter, limit, $config) | Retrieves the closest vectors to the defined vector, up to limit results. |
In these signatures, the 1st parameter can be either a host or a key defined in the APOC config as apoc.weaviate.<key>.host=myHost.
With hostOrKey=null, the default host is 'http://localhost:8080/v1'.
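As a minimal sketch of the key-based form, assuming a hypothetical key named myKey (the key name is an illustration, and myHost is the placeholder used above), the config entry and the corresponding calls could look like this:

```cypher
// Assumed entry in the APOC configuration (myKey is a hypothetical key name):
//   apoc.weaviate.myKey.host=myHost

// Pass the key instead of the host as the 1st parameter
CALL apoc.vectordb.weaviate.info('myKey', 'test_collection', {});

// Pass null to fall back to the default host, 'http://localhost:8080/v1'
CALL apoc.vectordb.weaviate.info(null, 'test_collection', {});
```

Using a key keeps the host out of the queries themselves, so the same Cypher can run against different environments.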
Examples
CALL apoc.vectordb.weaviate.info($host, 'test_collection', {<optional config>})
| value |
|---|
| {"vectorizer": "none", "invertedIndexConfig": {"bm25": {"b": 0.75, "k1": 1.2}, "stopwords": {"additions": null, "removals": null, "preset": "en"}, "cleanupIntervalSeconds": 60}, "vectorIndexConfig": {"ef": -1, "dynamicEfMin": 100, "pq": {"centroids": 256, "trainingLimit": 100000, "encoder": {"type": "kmeans", "distribution": "log-normal"}, "enabled": false, "bitCompression": false, "segments": 0}, "distance": "cosine", "skip": false, "dynamicEfFactor": 8, "bq": {"enabled": false}, "vectorCacheMaxObjects": 1000000000000, "cleanupIntervalSeconds": 300, "dynamicEfMax": 500, "efConstruction": 128, "flatSearchCutoff": 40000, "maxConnections": 64}, "multiTenancyConfig": {"enabled": false}, "vectorIndexType": "hnsw", "replicationConfig": {"factor": 1}, "shardingConfig": {"desiredVirtualCount": 128, "desiredCount": 1, "actualCount": 1, "function": "murmur3", "virtualPerPhysical": 128, "strategy": "hash", "actualVirtualCount": 128, "key": "_id"}, "class": "TestCollection", "properties": [{"name": "city", "description": "This property was generated by Weaviate’s auto-schema feature on Wed Jul 10 12:50:18 2024", "indexFilterable": true, "tokenization": "word", "indexSearchable": true, "dataType": ["text"]}, {"name": "foo", "description": "This property was generated by Weaviate’s auto-schema feature on Wed Jul 10 12:50:18 2024", "indexFilterable": true, "tokenization": "word", "indexSearchable": true, "dataType": ["text"]}]} |
CALL apoc.vectordb.weaviate.createCollection($host, 'test_collection', 'Cosine', 4, {<optional config>})
| vectorizer | invertedIndexConfig | vectorIndexConfig | multiTenancyConfig | vectorIndexType | replicationConfig | shardingConfig | class | properties |
|---|---|---|---|---|---|---|---|---|
| none | {"bm25": {"b": 0.75, "k1": 1.2}, "stopwords": {"additions": null, "removals": null, "preset": "en"}, "cleanupIntervalSeconds": 60} | {"ef": -1, "dynamicEfMin": 100, "pq": {"centroids": 256, "trainingLimit": 100000, "encoder": {"type": "kmeans", "distribution": "log-normal"}, "enabled": false, "bitCompression": false, "segments": 0}, "distance": "cosine", "skip": false, "dynamicEfFactor": 8, "bq": {"enabled": false}, "vectorCacheMaxObjects": 1000000000000, "cleanupIntervalSeconds": 300, "dynamicEfMax": 500, "efConstruction": 128, "flatSearchCutoff": 40000, "maxConnections": 64} | {"enabled": false} | hnsw | {"factor": 1} | {"desiredVirtualCount": 128, "desiredCount": 1, "actualCount": 1, "function": "murmur3", "virtualPerPhysical": 128, "strategy": "hash", "actualVirtualCount": 128, "key": "_id"} | TestCollection | null |
CALL apoc.vectordb.weaviate.createCollection("https://<weaviateInstanceId>.weaviate.network",
    'TestCollection',
    'cosine',
    4,
    {headers: {Authorization: 'Bearer <apiKey>'}})
| vectorizer | invertedIndexConfig | vectorIndexConfig | multiTenancyConfig | vectorIndexType | replicationConfig | shardingConfig | class | properties |
|---|---|---|---|---|---|---|---|---|
| none | {"bm25": {"b": 0.75, "k1": 1.2}, "stopwords": {"additions": null, "removals": null, "preset": "en"}, "cleanupIntervalSeconds": 60} | {"ef": -1, "dynamicEfMin": 100, "pq": {"centroids": 256, "trainingLimit": 100000, "encoder": {"type": "kmeans", "distribution": "log-normal"}, "enabled": false, "bitCompression": false, "segments": 0}, "distance": "cosine", "skip": false, "dynamicEfFactor": 8, "bq": {"enabled": false}, "vectorCacheMaxObjects": 1000000000000, "cleanupIntervalSeconds": 300, "dynamicEfMax": 500, "efConstruction": 128, "flatSearchCutoff": 40000, "maxConnections": 64} | {"enabled": false} | hnsw | {"factor": 1} | {"desiredVirtualCount": 128, "desiredCount": 1, "actualCount": 1, "function": "murmur3", "virtualPerPhysical": 128, "strategy": "hash", "actualVirtualCount": 128, "key": "_id"} | TestCollection | null |
CALL apoc.vectordb.weaviate.deleteCollection($host, 'test_collection', {<optional config>})
which returns an empty result.
CALL apoc.vectordb.weaviate.upsert($host, 'test_collection',
    [
        {id: "8ef2b3a7-1e56-4ddd-b8c3-2ca8901ce308", vector: [0.05, 0.61, 0.76, 0.74], metadata: {city: "Berlin", foo: "one"}},
        {id: "9ef2b3a7-1e56-4ddd-b8c3-2ca8901ce308", vector: [0.19, 0.81, 0.75, 0.11], metadata: {city: "London", foo: "two"}}
    ],
    {<optional config>})
| lastUpdateTimeUnix | vector | id | creationTimeUnix | class | properties |
|---|---|---|---|---|---|
| 1721293838439 | [0.05, 0.61, 0.76, 0.74] | 8ef2b3a7-1e56-4ddd-b8c3-2ca8901ce308 | 1721293838439 | TestCollection | {city: "Berlin", foo: "one"} |
| 1721293838439 | [0.19, 0.81, 0.75, 0.11] | 9ef2b3a7-1e56-4ddd-b8c3-2ca8901ce308 | 1721293838439 | TestCollection | {city: "London", foo: "two"} |
CALL apoc.vectordb.weaviate.get($host, 'test_collection', [1,2], {<optional config>})
| score | metadata | id | vector | text | entity |
|---|---|---|---|---|---|
| null | {city: "Berlin", foo: "one"} | null | null | null | null |
| null | {city: "London", foo: "two"} | null | null | null | null |
With {allResults: true}, the id and vector columns are returned as well:
CALL apoc.vectordb.weaviate.get($host, 'test_collection', [1,2], {allResults: true, <optional config>})
| score | metadata | id | vector | text | entity |
|---|---|---|---|---|---|
| null | {city: "Berlin", foo: "one"} | 1 | […] | null | null |
| null | {city: "London", foo: "two"} | 2 | […] | null | null |
CALL apoc.vectordb.weaviate.query($host,
    'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    '{operator: Equal, valueString: "London", path: ["city"]}',
    5,
    {fields: ["city", "foo"], allResults: true, <other optional config>})
| score | metadata | id | vector | text |
|---|---|---|---|---|
| 1 | {city: "Berlin", foo: "one"} | 1 | […] | null |
| 0.1 | {city: "London", foo: "two"} | 2 | […] | null |
We can define a mapping to fetch the associated nodes and relationships, and optionally create them, by leveraging the vector metadata.
For example, if we have created the 2 vectors with the upsert procedure above,
we can populate some existing nodes (i.e. (:Test {myId: 'one'}) and (:Test {myId: 'two'})):
CALL apoc.vectordb.weaviate.queryAndUpdate($host, 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { fields: ["city", "foo"],
      mapping: {
        embeddingKey: "vect",
        nodeLabel: "Test",
        entityKey: "myId",
        metadataKey: "foo"
      }
    })
which populates the two nodes as: (:Test {myId: 'one', city: 'Berlin', vect: [vector1]})
and (:Test {myId: 'two', city: 'London', vect: [vector2]}),
which will be returned in the entity column result.
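To try this example end to end, the two existing nodes can be created beforehand and inspected afterwards. The following is a minimal sketch that uses only the label and property names from the example above. The two statements are meant to be run separately, before and after the queryAndUpdate call:

```cypher
// Run BEFORE queryAndUpdate: create the two nodes the mapping matches on myId
CREATE (:Test {myId: 'one'}), (:Test {myId: 'two'});

// Run AFTER queryAndUpdate: inspect the properties written by the mapping
MATCH (n:Test)
WHERE n.myId IN ['one', 'two']
RETURN n.myId AS myId, n.city AS city, n.vect AS vect
ORDER BY myId;
```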
We can also set the mapping configuration mode to CREATE_IF_MISSING (which creates the nodes if they do not exist), READ_ONLY (to search for nodes/rels without making updates) or UPDATE_EXISTING (the default behavior):
CALL apoc.vectordb.weaviate.queryAndUpdate($host, 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { fields: ["city", "foo"],
      mapping: {
        mode: "CREATE_IF_MISSING",
        embeddingKey: "vect",
        nodeLabel: "Test",
        entityKey: "myId",
        metadataKey: "foo"
      }
    })
which creates 2 new nodes as above.
Or, we can populate existing relationships (i.e. (:Start)-[:TEST {myId: 'one'}]->(:End) and (:Start)-[:TEST {myId: 'two'}]->(:End)):
CALL apoc.vectordb.weaviate.queryAndUpdate($host, 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { fields: ["city", "foo"],
      mapping: {
        embeddingKey: "vect",
        relType: "TEST",
        entityKey: "myId",
        metadataKey: "foo"
      }
    })
which populates the two relationships as: ()-[:TEST {myId: 'one', city: 'Berlin', vect: [vector1]}]-()
and ()-[:TEST {myId: 'two', city: 'London', vect: [vector2]}]-(),
which will be returned in the entity column result.
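As with the node example, the relationships can be set up and checked with plain Cypher. This is a sketch that uses only the labels, relationship type and property names from the example above, with the two statements run before and after the queryAndUpdate call respectively:

```cypher
// Run BEFORE queryAndUpdate: create the two relationships matched on myId
CREATE (:Start)-[:TEST {myId: 'one'}]->(:End),
       (:Start)-[:TEST {myId: 'two'}]->(:End);

// Run AFTER queryAndUpdate: inspect the properties written by the mapping
MATCH (:Start)-[r:TEST]->(:End)
RETURN r.myId AS myId, r.city AS city, r.vect AS vect
ORDER BY myId;
```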
We can also use mapping with the apoc.vectordb.weaviate.query procedure, to search for nodes/rels matching the label/type and metadataKey, without making updates
(i.e. equivalent to the *.queryAndUpdate procedure with a mapping config having mode: "READ_ONLY").
For example, with the previous relationships, we can execute the following procedure, which just returns the relationships in the column rel:
CALL apoc.vectordb.weaviate.query($host, 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { fields: ["city", "foo"],
      mapping: {
        relType: "TEST",
        entityKey: "myId",
        metadataKey: "foo"
      }
    })
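For comparison, the call described above as equivalent would go through queryAndUpdate with mode: "READ_ONLY" in the mapping. This is a sketch derived from the previous examples and has not been verified separately:

```cypher
// Same search as above, but through queryAndUpdate in read-only mode:
// matching relationships are returned and no properties are written
CALL apoc.vectordb.weaviate.queryAndUpdate($host, 'test_collection',
    [0.2, 0.1, 0.9, 0.7],
    {},
    5,
    { fields: ["city", "foo"],
      mapping: {
        mode: "READ_ONLY",
        relType: "TEST",
        entityKey: "myId",
        metadataKey: "foo"
      }
    })
```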
We can also use mapping with the apoc.vectordb.weaviate.get and apoc.vectordb.weaviate.getAndUpdate procedures.
To optimize performance, we can choose what to YIELD from these procedures.
It is possible to execute the vector db procedures together with the apoc.ml.rag procedure as follows:
CALL apoc.vectordb.weaviate.getAndUpdate($host, $collection, [<id1>, <id2>], $conf) YIELD score, node, metadata, id, vector
WITH collect(node) as paths
CALL apoc.ml.rag(paths, $attributes, $question, $confPrompt) YIELD value
RETURN value
which returns a string that answers the $question by leveraging the embeddings stored in the vector db.
CALL apoc.vectordb.weaviate.delete($host, 'test_collection', [1,2], {<optional config>})
| value |
|---|
| ["1", "2"] |