Create and store embeddings

Neo4j’s vector indexes and vector functions allow you to calculate the similarity between node and relationship properties in a graph. A prerequisite for using these features is that vector embeddings have been set as properties of these entities. This page shows how these embeddings can be created and stored as properties on nodes and relationships in a Neo4j database using the GenAI plugin.

For a hands-on guide on how to use the GenAI plugin on a Neo4j database, see Embeddings & Vector Indexes Tutorial → Create embeddings with cloud AI providers.

Generate and store a single embedding

Use the ai.text.embed() function to generate a vector embedding for a single value.

Signature for `ai.text.embed()`
FunctionIntroduced in 2025.11
Syntax	`ai.text.embed(resource, provider, configuration = {}) :: VECTOR`
Description	Encode a resource as a vector using the named provider.
Inputs	Name	Type	Description
	`resource`	`STRING`	The string to transform into an embedding.
	`provider`	`STRING`	Case-insensitive identifier of the AI provider to use. See Providers for supported options.
	`configuration`	`MAP`	Provider-specific options. See Providers for details of each supported provider. Note that because this argument may contain sensitive data, it is obfuscated in the query.log. However, if the function call is misspelled or the query is otherwise malformed, it will be logged without obfuscation.
Returns		`VECTOR`	The generated vector embedding for the resource.

This function sends one API request every time it is called, which may result in a lot of overhead in terms of both network traffic and latency. If you want to generate many embeddings at once, use Generate and store a batch of embeddings.

Enteprise Edition

ai.text.embed() returns a VECTOR. Storing VECTOR values on self-managed instances requires Enterprise Edition and block format.

Create a VECTOR embedding property for the Godfather

MATCH (m:Movie {title:'Godfather, The'})
WHERE m.plot IS NOT NULL AND m.title IS NOT NULL
WITH m, m.title + ' ' + m.plot AS titleAndPlot (1)
WITH m, ai.text.embed(titleAndPlot, 'OpenAI', { token: $openaiToken, model: 'text-embedding-3-small' }) AS vector (2)
SET m.embedding = vector (3)
RETURN m.embedding AS embedding

1	Concatenate the `title` and `plot` of the `Movie` into a single `STRING`.
2	Create an embedding from `titleAndPlot`.
3	Store the `propertyVector` into an `embedding` property (type `VECTOR`) on `The Godfather` node.

Result (output capped after 4 entries)

+----------------------------------------------------------------------------------------------------+
| embedding                                                                                          |
+----------------------------------------------------------------------------------------------------+
| [0.005239539314061403, -0.039358530193567276, -0.0005175105179660022, -0.038706034421920776, ... ] |
+----------------------------------------------------------------------------------------------------+

ai.text.embed() returns a VECTOR, which can be converted into a list with toFloatList().

To run this workflow on Community Edition, you need to use this dump file (storage format aligned).

Create a LIST<FLOAT> embedding property for the Godfather

MATCH (m:Movie {title:'Godfather, The'})
WHERE m.plot IS NOT NULL AND m.title IS NOT NULL
WITH m, m.title + ' ' + m.plot AS titleAndPlot (1)
WITH m, ai.text.embed(titleAndPlot, 'OpenAI', { token: $openaiToken, model: 'text-embedding-3-small' }) AS vector (2)
CALL db.create.setNodeVectorProperty(m, 'embedding', toFloatList(vector)) (3)
RETURN m.embedding AS embedding

1	Concatenate the `title` and `plot` of the `Movie` into a single `STRING`.
2	Create an embedding from `titleAndPlot`.
3	Store the `propertyVector` into an `embedding` property (type `LIST<FLOAT>`) on `The Godfather` node. The procedures `db.create.setNodeVectorProperty` and `db.create.setRelationshipVectorProperty` store the list with a more space-efficient representation.

Result (output capped after 4 entries)

+----------------------------------------------------------------------------------------------------+
| embedding                                                                                          |
+----------------------------------------------------------------------------------------------------+
| [0.005239539314061403, -0.039358530193567276, -0.0005175105179660022, -0.038706034421920776, ... ] |
+----------------------------------------------------------------------------------------------------+

Generate and store a batch of embeddings

Use the ai.text.embedBatch procedure to generate many vector embeddings with a single API request. This procedure takes a list of resources as an input, and returns the same number of result rows.

Signature for `ai.text.embedBatch()`
ProcedureIntroduced in 2025.11
Syntax	`ai.text.embedBatch(resources, provider, configuration = {}) :: (index, resource, vector)`
Description	Encode a given batch of resources as vectors using the named provider.
Inputs	Name	Type	Description
	`resources`	`LIST<STRING>`	The strings to transform into an embedding.
	`provider`	`STRING`	Case-insensitive identifier of the AI provider to use. See Providers for supported options.
	`configuration`	`MAP`	Provider-specific options. See Providers for details of each supported provider. Note that because this argument may contain sensitive data, it is obfuscated in the query.log. However, if the function call is misspelled or the query is otherwise malformed, it will be logged without obfuscation.
Returns	Name	Type	Description
	`index`	`INTEGER`	The index of the corresponding element in the input list, to correlate results back to inputs.
	`resource`	`STRING`	The given input resource.
	`vector`	`VECTOR`	The generated vector embedding for this resource.

This procedure attempts to generate embeddings for all supplied resources in a single API request. Providing too many resources may cause the AI provider to time out or to reject the request.

Enterprise Edition

ai.text.embedBatch() returns a VECTOR for each input resource. Storing VECTOR values on an on-prem instance requires Enterprise Edition and block format.

Create embeddings from a limited number of properties and store them as VECTOR properties

MATCH (m:Movie WHERE m.plot IS NOT NULL) LIMIT 20
WITH collect(m) AS moviesList (1)
WITH moviesList, [movie IN moviesList | movie.title + ': ' + movie.plot] AS batch (2)
CALL ai.text.embedBatch(batch, 'OpenAI', { token: $openaiToken, model: 'text-embedding-3-small' }) YIELD index, vector
WITH moviesList, index, vector
MATCH (toUpdate:Movie {title: moviesList[index]['title']})
SET toUpdate.embedding = vector  (3)

1	Collect all 20 `Movie` nodes into a `LIST<NODE>`.
2	A list comprehension (`[]`) extracts the `title` and `plot` properties of the movies in `moviesList` into a new `LIST<STRING>`.
3	`SET` is run for each `vector` returned by `ai.text.embedBatch()`, and stores that vector as a property named `embedding` on the corresponding node.

Create embeddings from a large number properties and store them as VECTOR properties

MATCH (m:Movie WHERE m.plot IS NOT NULL) LIMIT 500
WITH collect(m) AS moviesList, (1)
     count(*) AS total,
     100 AS batchSize (2)
UNWIND range(0, total-1, batchSize) AS batchStart (3)
CALL (moviesList, batchStart, batchSize) { (4)
    WITH [movie IN moviesList[batchStart .. batchStart + batchSize] | movie.title + ': ' + movie.plot] AS batch (5)
    CALL ai.text.embedBatch(batch, 'OpenAI', { token: $openaiToken, model: 'text-embedding-3-small' }) YIELD index, vector
    MATCH (toUpdate:Movie {title: moviesList[batchStart + index]['title']})
    SET toUpdate.embedding = vector (6)
} IN CONCURRENT TRANSACTIONS OF 1 ROW (7)

1	Collect all returned `Movie` nodes into a `LIST<NODE>`.
2	`batchSize` defines the number of nodes in `moviesList` to be processed at once. Because vector embeddings can be very large, a larger batch size may require significantly more memory on the Neo4j server. Too large a batch size may also exceed the provider’s threshold.
3	Process `Movie` nodes in increments of `batchSize`. The end range `total-1` is due to `range` being inclusive on both ends.
4	A `CALL` subquery executes a separate transaction for each batch. Note that this `CALL` subquery uses a variable scope clause.
5	`batch` is a list of strings, each being the concatenation of `title` and `plot` of one movie.
6	The procedure sets `vector` as value for the property named `embedding` for the node at position `batchStart + index` in the `moviesList`.
7	Set to `1` the amount of batches to be processed at once. For more information on concurrency in transactions, see `CALL` subqueries → Concurrent transactions.

This example may not scale to larger datasets, as collect(m) requires the whole result set to be loaded in memory. For an alternative method more suitable to processing large amounts of data, see Embeddings & Vector Indexes Tutorial → Create embeddings with cloud AI providers.

ai.text.embedBatch() returns a VECTOR for each input resource, which can be converted into a list with toFloatList().

To run these workflows on Community Edition, you need to use this dump file (storage format aligned).

Create embeddings from a limited number of properties and store them as LIST<FLOAT> properties

MATCH (m:Movie WHERE m.plot IS NOT NULL) LIMIT 20
WITH collect(m) AS moviesList (1)
WITH moviesList, [movie IN moviesList | movie.title + ': ' + movie.plot] AS batch (2)
CALL ai.text.embedBatch(batch, 'OpenAI', { token: $openaiToken, model: 'text-embedding-3-small' }) YIELD index, vector
WITH moviesList, index, vector
CALL db.create.setNodeVectorProperty(moviesList[index], 'embedding', toFloatList(vector)) (3)

1	Collect all 20 `Movie` nodes into a `LIST<NODE>`.
2	A list comprehension (`[]`) extracts the `title` and `plot` properties of the movies in `moviesList` into a new `LIST<STRING>`.
3	Each vector is converted into a list of floats and stored as a property named `embedding` (type `LIST<FLOAT>`) on the corresponding node. The procedures `db.create.setNodeVectorProperty` and `db.create.setRelationshipVectorProperty` store the list with a more space-efficient representation.

Create embeddings from a large number properties and store them as LIST<FLOAT> values

MATCH (m:Movie WHERE m.plot IS NOT NULL) LIMIT 500
WITH collect(m) AS moviesList, (1)
     count(*) AS total,
     100 AS batchSize (2)
UNWIND range(0, total-1, batchSize) AS batchStart (3)
CALL (moviesList, batchStart, batchSize) { (4)
    WITH [movie IN moviesList[batchStart .. batchStart + batchSize] | movie.title + ': ' + movie.plot] AS batch (5)
    CALL ai.text.embedBatch(batch, 'OpenAI', { token: $openaiToken, model: 'text-embedding-3-small' }) YIELD index, vector
    CALL db.create.setNodeVectorProperty(moviesList[batchStart + index], 'embedding', toFloatList(vector)) (6)
} IN CONCURRENT TRANSACTIONS OF 1 ROW (7)

1	Collect all returned `Movie` nodes into a `LIST<NODE>`.
2	`batchSize` defines the number of nodes in `moviesList` to be processed at once. Because vector embeddings can be very large, a larger batch size may require significantly more memory on the Neo4j server. Too large a batch size may also exceed the provider’s threshold.
3	Process `Movie` nodes in increments of `batchSize`. The end range `total-1` is due to `range` being inclusive on both ends.
4	A `CALL` subquery executes a separate transaction for each batch. Note that this `CALL` subquery uses a variable scope clause.
5	`batch` is a list of strings, each being the concatenation of `title` and `plot` of one movie.
6	The procedure sets `vector` as value for the property named `embedding` for the node at position `batchStart + index` in the `moviesList`.
7	Set to `1` the amount of batches to be processed at once. For more information on concurrency in transactions, see `CALL` subqueries → Concurrent transactions.

This example may not scale to larger datasets, as collect(m) requires the whole result set to be loaded in memory. For an alternative method more suitable to processing large amounts of data, see GenAI documentation - Embeddings & Vector Indexes Tutorial → Create embeddings with cloud AI providers.

Providers

You can crate vector embeddings via the following providers:

OpenAI (openai)
Azure OpenAI (azure-openai)
Google Vertex AI (vertexai)
Amazon Bedrock Titan Models (bedrock-titan)

The query CALL ai.text.embed.providers() (see reference) shows the list of supported providers in the installed version of the plugin.

OpenAI

OpenAI parameters
Name	Type	Default	Description
`model`	`STRING`	-	Model ID (see OpenAI → Models).
`token`	`STRING`	-	OpenAI API key (see OpenAI → API Keys).
`vendorOptions`	`MAP`	`{}`	Optional vendor options that will be passed on as-is in the request to Open AI (see OpenAI → Create embeddings).

Example — Embed the string Hello World!

WITH
  {
    token: $openaiToken,
    model: 'text-embedding-3-small',
    vendorOptions: {
      dimensions: 1024
    }
  } AS conf
RETURN ai.text.embed('Hello World!', 'openai', conf) AS result

You can change OpenAI’s base URL (default: https://api.openai.com) via the genai.openai.baseurl setting. The change applies to all ai.text.* calls that use OpenAI, including ai.text.embed, ai.text.embedBatch and ai.text.completion. See Configuration Options → genai.openai.baseurl.

Azure OpenAI

OpenAI parameters
Name	Type	Default	Description
`model`	`STRING`	-	Model id (see Azure → Azure OpenAI in Foundry Models).
`resource`	`STRING`	-	Azure resource name.
`token`	`STRING`	-	Azure OAuth2 bearer token.
`vendorOptions`	`MAP`	`{}`	Optional vendor options that will be passed on as-is in the request to Azure.

Example — Embed the string Hello World!

WITH
  {
    token: $azureToken,
    resource: 'my-azure-openai-resource',
    model: 'text-embedding-3-small',
    vendorOptions: {
      dimensions: 1024
    }
  } AS conf
RETURN ai.text.embed('Hello World!', 'azure-openai', conf) AS result

Google VertexAI

VertexAI parameters
Name	Type	Default	Description
`model`	`STRING`	-	Model resource name (see Vertex AI → Model Garden).
`project`	`STRING`	-	Google Cloud project ID.
`region`	`STRING`	-	Google cloud region (see Vertex AI → Locations).
`publisher`	`STRING`	`'google'`	Model publisher.
`apiKey`	`STRING`	-	Vertex AI API key.
`token`	`STRING`	-	Vertex AI API access token.
`vendorOptions`	`MAP`	`{}`	Optional vendor options that will be passed on as-is in the request to Vertex (see Vertex AI → Method: models.predict).

Exactly one of apiKey or token must be provided.

Example — Embed the string Hello World!

WITH
  {
    token: $vertexaiApiAccessKey,
    model: 'gemini-embedding-001',
    publisher: 'google',
    project: 'my-google-cloud-project',
    region: 'asia-northeast1',
    vendorOptions: {
      outputDimensionality: 1024
    }
  } AS conf
RETURN ai.text.embed('Hello World!', 'vertexai', conf) AS result

Amazon Bedrock Titan Models

This provider supports all models that use the same request parameters and response fields as the Titan text models.

Amazon Bedrock Titan Configuration
Name	Type	Default	Description
`model`	`STRING`	-	Model ID or its ARN.
`region`	`STRING`	-	Amazon region (see Amazon Bedrock → Model Support).
`accessKeyId`	`STRING`	-	Amazon access key ID.
`secretAccessKey`	`STRING`	-	Amazon secret access key.
`vendorOptions`	`MAP`	`{}`	Optional vendor options that will be passed on as-is in the request to Bedrock (see Amazon Bedrock → Inference request parameters and response fields).

Example — Embed the string Hello World!

WITH
  {
    accessKeyId: $awsAccessKeyId,
    secretAccessKey: $secretAccessKey,
    model: 'amazon.titan-embed-text-v1',
    region: 'eu-west-2',
    vendorOptions: {
      dimensions: 1024
    }
  } AS conf
RETURN ai.text.embed('Hello World!', 'bedrock-titan', conf) AS result

(Legacy) Providers
Deprecated in 2025.11

The following provider configurations are for the genai.vector.encode function and the genai.vector.encodeBatch procedure. Both callables have been deprecated in 2025.11 and superseded by ai.text.embed and ai.text.embedBatch. For more information on the old callables, see documentation for the previous version.

OpenAI

Identifier (provider argument): "OpenAI"
Official OpenAI documentation

Configuration map
Key	Type	Description	Default
`token`	`STRING`	API access token.	Required
`model`	`STRING`	The name of the model to invoke.	`"text-embedding-ada-002"`
`dimensions`	`INTEGER`	The number of dimensions to reduce the vector to. Only supported for certain models.	Model-dependent.

Vertex AI

Identifier (provider argument): "VertexAI"
Official Vertex AI documentation

Configuration map
Key	Type	Description	Default
`token`	`STRING`	API access token.	Required
`projectId`	`STRING`	GCP project ID.	Required
`model`	`STRING`	The name of the model to invoke.	`"textembedding-gecko@001"`
`region`	`STRING`	GCP region where to send the API requests. Supported values `"us-west1"` `"us-west2"` `"us-west3"` `"us-west4"` `"us-central1"` `"us-east1"` `"us-east4"` `"us-south1"` `"northamerica-northeast1"` `"northamerica-northeast2"` `"southamerica-east1"` `"southamerica-west1"` `"europe-west2"` `"europe-west1"` `"europe-west4"` `"europe-west6"` `"europe-west3"` `"europe-north1"` `"europe-central2"` `"europe-west8"` `"europe-west9"` `"europe-southwest1"` `"asia-south1"` `"asia-southeast1"` `"asia-southeast2"` `"asia-east2"` `"asia-east1"` `"asia-northeast1"` `"asia-northeast2"` `"australia-southeast1"` `"australia-southeast2"` `"asia-northeast3"` `"me-west1"`	`"us-central1"`
`taskType`	`STRING`	The intended downstream application (see provider documentation). The specified `taskType` will apply to all resources in a batch.
`title`	`STRING`	The title of the document that is being encoded (see provider documentation). The specified `title` will apply to all resources in a batch.

Azure OpenAI

Identifier (provider argument): "AzureOpenAI"
Official Azure OpenAI documentation

Unlike the other providers, the model is configured when creating the deployment on Azure, and is thus not part of the configuration map.

Configuration map
Key	Type	Description	Default
`token`	`STRING`	API access token.	Required
`resource`	`STRING`	The name of the resource to which the model has been deployed.	Required
`deployment`	`STRING`	The name of the model deployment.	Required
`dimensions`	`INTEGER`	The number of dimensions to reduce the vector to. Only supported for certain models.	Model-dependent.

Amazon Bedrock

Identifier (provider argument): "Bedrock"
Official Bedrock documentation

Configuration map
Key	Type	Description	Default
`accessKeyId`	`STRING`	AWS access key ID.	Required
`secretAccessKey`	`STRING`	AWS secret key.	Required
`model`	`STRING`	The name of the model to invoke. Supported values: `"amazon.titan-embed-text-v1"`	`"amazon.titan-embed-text-v1"`
`region`	`STRING`	AWS region where to send the API requests. Supported values: `"us-east-1"` `"us-west-2"` `"ap-southeast-1"` `"ap-northeast-1"` `"eu-central-1"`	`"us-east-1"`

Create and store embeddings

Generate and store a single embedding

Generate and store a batch of embeddings

Providers

OpenAI

Azure OpenAI

Google VertexAI

Amazon Bedrock Titan Models

(Legacy) ProvidersDeprecated in 2025.11

OpenAI

Vertex AI

Azure OpenAI

Amazon Bedrock

(Legacy) Providers
Deprecated in 2025.11