Vector functions

Vector functions allow you to compute the similarity scores of vector pairs. These vector similarity functions are identical to those used by Neo4j’s vector search indexes.

Consider, for example, k-nearest neighbor (k-NN) queries, which return the k entities with the highest similarity scores based on comparing their associated vectors with a query vector. Such queries can be run against vector indexes in the form of approximate k-nearest neighbor (k-ANN) queries, whose returned entities have a high probability of being among the true k nearest neighbors. However, they can also be expressed as an exhaustive search using vector similarity functions directly. An exhaustive search is typically much slower than an index lookup, but it is exact rather than approximate and does not require an existing index. This makes it useful for one-off queries on small sets of data.

Example 1. k-Nearest Neighbors

To create the graph used in this example, run the following query in an empty Neo4j database:

CREATE
  (:Node { id: 1, vector: [1.0, 4.0, 2.0]}),
  (:Node { id: 2, vector: [3.0, -2.0, 1.0]}),
  (:Node { id: 3, vector: [2.0, 8.0, 3.0]});

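Given a parameter query (here set to [4.0, 5.0, 6.0]), you can query for the two nearest neighbors of that query vector by Euclidean distance. This is achieved by matching on all candidate vectors, computing each similarity score, and ordering by that score. Because a higher score means a smaller distance, the results are ordered in descending order: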

Query
MATCH (node:Node)
WITH node, vector.similarity.euclidean($query, node.vector) AS score
RETURN node, score
ORDER BY score DESCENDING
LIMIT 2;

This returns the two nearest neighbors.

Table 1. Result

node                                       score
(:Node {vector: [2.0, 8.0, 3.0], id: 3})   0.043478261679410934
(:Node {vector: [1.0, 4.0, 2.0], id: 1})   0.03703703731298447

Rows: 2
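The ranking in Table 1 can be checked outside the database with a brute-force scan. The Python sketch below is illustrative only and is not Neo4j's implementation. It assumes the Euclidean similarity score is 1 / (1 + d²), where d is the Euclidean distance. That assumption agrees closely with the scores above (1/23 ≈ 0.0434783 and 1/27 ≈ 0.0370370); small differences in the trailing digits are presumably due to floating-point precision. The helper names euclidean_similarity and k_nearest are hypothetical.

```python
import heapq


def euclidean_similarity(a, b):
    """Assumed score: 1 / (1 + squared Euclidean distance)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 / (1.0 + squared_distance)


def k_nearest(query, nodes, k):
    """Score every candidate (exhaustive scan) and keep the k highest scores."""
    scored = [(euclidean_similarity(query, n["vector"]), n["id"]) for n in nodes]
    # nlargest returns (score, id) pairs sorted by descending score
    return heapq.nlargest(k, scored)


# Same data as the CREATE statement in this example
nodes = [
    {"id": 1, "vector": [1.0, 4.0, 2.0]},
    {"id": 2, "vector": [3.0, -2.0, 1.0]},
    {"id": 3, "vector": [2.0, 8.0, 3.0]},
]

result = k_nearest([4.0, 5.0, 6.0], nodes, 2)
for score, node_id in result:
    print(node_id, score)  # node 3 (score 1/23) first, then node 1 (score 1/27)
```

Like the Cypher query, this scores every candidate, so its cost grows linearly with the number of vectors.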

vector.similarity.euclidean()

vector.similarity.euclidean() returns a FLOAT representing the similarity between the argument vectors based on their Euclidean distance. For more details, see the vector index documentation.

Syntax:

vector.similarity.euclidean(a, b)

Returns:

FLOAT

Arguments:

Name   Description
a      A LIST<INTEGER | FLOAT> representing the first vector.
b      A LIST<INTEGER | FLOAT> representing the second vector.

Considerations:

vector.similarity.euclidean(NULL, NULL) returns NULL.

vector.similarity.euclidean(NULL, b) returns NULL.

vector.similarity.euclidean(a, NULL) returns NULL.

Both vectors must be of the same dimension.

Both vectors must be valid with respect to Euclidean similarity.

The implementation is identical to that of the latest available vector index provider (vector-2.0).
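The considerations above can be mirrored in a small client-side sketch. This Python function is hypothetical and is not Neo4j's implementation. It makes several assumptions: the score is 1 / (1 + d²), None stands in for Cypher's NULL, a ValueError stands in for whatever error the database reports when a requirement is not met, and "valid" is read as "all components are finite".

```python
import math


def euclidean_similarity_checked(a, b):
    """Illustrative Euclidean similarity with NULL and dimension handling."""
    # A NULL (None) argument yields NULL (None)
    if a is None or b is None:
        return None
    # Both vectors must be of the same dimension
    if len(a) != len(b):
        raise ValueError("vectors must be of the same dimension")
    # Assumed validity rule: every component is a finite number
    if not all(math.isfinite(x) for x in (*a, *b)):
        raise ValueError("vector components must be finite")
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 / (1.0 + squared_distance)


print(euclidean_similarity_checked(None, [1.0, 2.0]))      # None
print(euclidean_similarity_checked([1, 2], [1.0, 2.0]))    # 1.0 (identical vectors)
```

Under the assumed formula, scores fall in the range (0, 1], and identical vectors score exactly 1.0.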

vector.similarity.cosine()

vector.similarity.cosine() returns a FLOAT representing the similarity between the argument vectors based on the cosine of the angle between them. For more details, see the vector index documentation.

Syntax:

vector.similarity.cosine(a, b)

Returns:

FLOAT

Arguments:

Name   Description
a      A LIST<INTEGER | FLOAT> representing the first vector.
b      A LIST<INTEGER | FLOAT> representing the second vector.

Considerations:

vector.similarity.cosine(NULL, NULL) returns NULL.

vector.similarity.cosine(NULL, b) returns NULL.

vector.similarity.cosine(a, NULL) returns NULL.

Both vectors must be of the same dimension.

Both vectors must be valid with respect to cosine similarity.

The implementation is identical to that of the latest available vector index provider (vector-2.0).
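To make the cosine score concrete, the following Python sketch computes it by hand. It is hypothetical and is not Neo4j's implementation. It assumes the raw cosine is rescaled from [-1, 1] to [0, 1] as (1 + cos) / 2, and that "valid" means each vector has a non-zero norm. None stands in for NULL, and a ValueError stands in for whatever error the database reports.

```python
import math


def cosine_similarity_score(a, b):
    """Illustrative cosine similarity, rescaled to the range [0, 1]."""
    # A NULL (None) argument yields NULL (None)
    if a is None or b is None:
        return None
    # Both vectors must be of the same dimension
    if len(a) != len(b):
        raise ValueError("vectors must be of the same dimension")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumed validity rule: the cosine is undefined for a zero vector
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    cosine = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    # Assumed normalization: map [-1, 1] onto [0, 1]
    return (1.0 + cosine) / 2.0


print(cosine_similarity_score([1.0, 0.0], [0.0, 1.0]))   # 0.5 (orthogonal vectors)
print(cosine_similarity_score([1.0, 0.0], [2.0, 0.0]))   # 1.0 (same direction)
```

Unlike the Euclidean score, this score ignores vector magnitude: [1.0, 0.0] and [2.0, 0.0] point in the same direction and therefore receive the maximum score.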