Vector Databases
APOC provides the following sets of procedures, which use REST APIs to interact with vector databases:
- apoc.vectordb.qdrant.* (to interact with Qdrant)
- apoc.vectordb.chroma.* (to interact with Chroma)
- apoc.vectordb.weaviate.* (to interact with Weaviate)
- apoc.vectordb.custom.* (to interact with other vector databases)
- apoc.vectordb.configure (to store host, credentials and mapping into the system database)
All procedures except apoc.vectordb.configure accept, as their final parameter,
a configuration map with these optional keys:
key | description
headers | additional HTTP headers
method | HTTP method
endpoint | endpoint key, can be used to override the default endpoint created via the 1st parameter of the procedures, to handle potential endpoint changes.
body | body HTTP request
jsonPath | To customize JSONPath parsing of the response. The default is
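As a minimal sketch of how the configuration map is passed, the call below reuses the apoc.vectordb.qdrant.query argument order shown in the example at the end of this page and supplies only the headers key as the final argument. The $host parameter, the collection name and the <apiKey> placeholder are assumptions for illustration, not values required by APOC.

```cypher
// Sketch only: $host, 'test_collection' and <apiKey> are placeholders.
// The last argument is the optional configuration map described above;
// here it only adds an extra HTTP header to the request.
CALL apoc.vectordb.qdrant.query($host, 'test_collection', [0.2, 0.1, 0.9, 0.7], {}, 5,
  { headers: {Authorization: 'Bearer <apiKey>'} })
```

Any key left out of the map is simply not overridden, so the map can be as small as a single entry.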
Besides the above config, the apoc.vectordb.<type>.get and apoc.vectordb.<type>.query procedures accept these additional parameters:
key | description
mapping | to fetch the associated entities and optionally create them. See examples below.
allResults | if true, returns the vector, metadata and text (if present), otherwise returns null values for those columns.
vectorKey, metadataKey, scoreKey, textKey | used with the
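The effect of allResults can be sketched with the same kind of call. This is an illustrative example only: the $host parameter and the collection name are placeholders, and the argument order is taken from the apoc.vectordb.qdrant.query example at the end of this page.

```cypher
// Sketch only: $host and 'test_collection' are placeholders.
// With allResults set to true, the vector, metadata and text columns
// are populated (if present) instead of being returned as null.
CALL apoc.vectordb.qdrant.query($host, 'test_collection', [0.2, 0.1, 0.9, 0.7], {}, 5,
  { allResults: true })
```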
Store Vector db info (i.e. apoc.vectordb.configure)
We can save some information in the system database for later reuse, namely the host, login credentials, and mapping,
to be used in the *.get and *.query procedures, except for apoc.vectordb.custom.get.
To store this vector database info, we can execute CALL apoc.vectordb.configure(vectorName, keyConfig, databaseName, $configMap),
where vectorName can be "QDRANT", "CHROMA", "PINECONE", "MILVUS" or "WEAVIATE"
and indicates which procedures will reuse the info, for example apoc.vectordb.qdrant.*, apoc.vectordb.chroma.* and apoc.vectordb.weaviate.*.
keyConfig is the configuration name, databaseName is the database where the config will be set,
and configMap is a map that can contain:
- host is the host base name
- credentialsValue is the API key
- mapping is a map that can be used by the apoc.vectordb.*.getAndUpdate and apoc.vectordb.*.queryAndUpdate procedures

NOTE
This procedure is only executable by a user with admin permissions and against the system database.
For example:
// -- within the system database or using the Cypher clause `USE SYSTEM ..` as a prefix
CALL apoc.vectordb.configure('QDRANT', 'qdrant-config-test', 'neo4j',
{
mapping: { embeddingKey: "vect", nodeLabel: "Test", entityKey: "myId", metadataKey: "foo" },
host: 'custom-host-name',
credentials: '<apiKey>'
}
)
We can then execute, for example, the following procedure (within the neo4j database):
CALL apoc.vectordb.qdrant.query('qdrant-config-test', 'test_collection', [0.2, 0.1, 0.9, 0.7], {}, 5)
instead of:
CALL apoc.vectordb.qdrant.query($host, 'test_collection', [0.2, 0.1, 0.9, 0.7], {}, 5,
{ mapping: {
embeddingKey: "vect",
nodeLabel: "Test",
entityKey: "myId",
metadataKey: "foo"
},
headers: {Authorization: 'Bearer <apiKey>'},
endpoint: 'custom-host-name'
})