Introducing Neo4j’s Native Vector Data Type

David Pond

Principal Database Product Manager at Neo4j

TL;DR: Neo4j now has a first‑class Vector type for storing embedding vectors. It’s supported end‑to‑end (drivers → Bolt → Cypher → storage → constraints). Your app code gets simpler, your data integrity gets stronger, and future vector‑specific optimizations become possible.

A vector embedding is a fixed‑length numeric array whose elements all share one data type (for example, 1024 32‑bit floats, typed as VECTOR<FLOAT32>(1024)). Embeddings are used in semantic search, GraphRAG, and many other GenAI patterns.

Why a Native Vector?

Until now, Neo4j stored vector embeddings as lists of numbers. That works — but as usage has grown, we’ve seen common pain points: type/length mismatches, boilerplate conversions, and slower evolution of vector‑aware features. The native Vector type fixes this with:

  • Simpler code — First‑class vectors in the drivers mean fewer helpers and fewer mistakes.
  • Integrity by default — Property‑type constraints enforce shape and dtype and protect your indexes.
  • Future‑ready — Unlocks vector‑specific functions, indexes, and storage optimizations over time.

The native Vector type is supported end to end: Neo4j drivers → Bolt protocol → Cypher → Storage engine → Constraints. For details, see the Cypher manual or the driver documentation for your preferred language.

Requirements

To use the Vector type, you’ll need:

  • Neo4j Aura or Neo4j 2025.10
  • Official drivers v6.0+
  • Cypher 25
  • Block store format for the database

Getting Started

Working With Vectors in Application Code

Driver v6 brings a small, explicit API to create and consume vectors. Python‑style example:

from neo4j.vector import Vector  # requires the Neo4j Python driver v6.0+

# From a native list → Vector
a = [1, 2, 3]
vector = Vector.from_native(a, "i32")
print(f"Neo4j Vector: {vector}")

# Back to a native list
b = vector.to_native()
print(f"Back to native: {b}")

Notes:

  • Element dtype is explicit (e.g., “f32”, “f64”, “i32”).
  • The driver validates length and dtype early — before your data hits the database.
  • The drivers integrate with common language tooling; in Python’s case, vectors convert easily to and from NumPy arrays, among others.
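To see why an explicit element dtype matters, here is a minimal sketch that uses only Python's standard `struct` module (not the Neo4j driver) to store the same values as 32‑bit and 64‑bit floats. The helper name `roundtrip` is made up for this illustration.

```python
import struct

def roundtrip(values, fmt):
    """Pack values with a struct format code, then unpack them again."""
    layout = f"<{len(values)}{fmt}"
    packed = struct.pack(layout, *values)
    return packed, list(struct.unpack(layout, packed))

values = [0.1, 0.25, 1.05]

f32_bytes, as_f32 = roundtrip(values, "f")  # 4 bytes per element
f64_bytes, as_f64 = roundtrip(values, "d")  # 8 bytes per element

print(len(f32_bytes), len(f64_bytes))  # storage size for three elements
print(as_f64 == values)  # 64-bit floats round-trip exactly
print(as_f32 == values)  # 32-bit floats are rounded: 0.25 survives, 0.1 does not
```

Halving the element width halves the storage, at the cost of rounding values that have no exact 32‑bit representation. Most embedding models already emit 32‑bit floats, so "f32" is usually a natural fit.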

Use vectors the same way as other parameters:

session.run(
"""
MERGE (p:Page {id: $id})
SET p.embedding = $vector
""",
id="/docs/setup",
vector=vector,
)

Using Vectors in Cypher

Store and query vectors as regular properties:

CYPHER 25
// Write a vector value to a property
CREATE (chunk:DocumentChunk)
SET chunk.vector_embedding = vector([1.05, 0.123, 5], 3, FLOAT32);

// Read it back
MATCH (chunk:DocumentChunk)--(doc:Document {id: "/docs/setup"})
RETURN chunk.vector_embedding AS embedding;

Cypher 25 provides vector‑aware functions for measuring similarity, and vectors can be converted to and from standard lists when needed. The similarity functions enable exact nearest neighbor (ENN) searches with a graph pre-filter, like this:

//ENN search with graph pre-filter
MATCH (chunk:CHUNK)-[]-(doc:DOCUMENT)
WHERE doc.year = 2025
RETURN chunk.embedding, doc.title
ORDER BY vector.similarity.cosine(chunk.embedding, $query_vec) DESC
LIMIT $top_k
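As a rough mental model of what the query above computes, here is a plain‑Python sketch: filter on the graph condition first, then score every remaining candidate exactly and keep the best `top_k`. It does not use Neo4j; the toy data and the function names `cosine` and `enn_prefilter` are hypothetical. It computes the raw cosine, and any monotonic rescaling of that score would produce the same ordering.

```python
import heapq
import math

def cosine(a, b):
    """Raw cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

def enn_prefilter(chunks, query_vec, year, top_k):
    """Filter by document year first, then score every survivor exactly."""
    candidates = [c for c in chunks if c["year"] == year]
    return heapq.nlargest(
        top_k, candidates, key=lambda c: cosine(c["embedding"], query_vec)
    )

# Hypothetical toy data: each chunk carries its document's title and year.
chunks = [
    {"title": "Setup", "year": 2025, "embedding": [1.0, 0.0, 0.0]},
    {"title": "Tuning", "year": 2025, "embedding": [0.6, 0.8, 0.0]},
    {"title": "Legacy", "year": 2023, "embedding": [1.0, 0.0, 0.0]},
    {"title": "Backup", "year": 2025, "embedding": [0.0, 0.0, 1.0]},
]

top = enn_prefilter(chunks, query_vec=[1.0, 0.0, 0.0], year=2025, top_k=2)
print([c["title"] for c in top])
```

Because every surviving candidate is scored, the result is exact, and the cost grows with the number of chunks that pass the filter. A selective graph filter keeps that cost low.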

This graph pre-filter approach complements the approximate nearest neighbor (ANN) search followed by graph filter/expansion:

//ANN search with graph post-filter/expansion
CALL db.index.vector.queryNodes($index_name, $top_k, $query_vec)
YIELD node as chunk, score
MATCH (chunk)-[]-(doc:DOCUMENT)
WHERE doc.year = 2025
RETURN chunk.embedding, doc.title
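One practical consequence of filtering after the index lookup is that fewer than `top_k` rows may come back. The plain‑Python sketch below illustrates this under stated assumptions: a brute‑force `index_top_k` stands in for the vector index, and the toy data and function names are hypothetical.

```python
def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def index_top_k(chunks, query_vec, top_k):
    """Stand-in for the vector index: best top_k chunks, ignoring the graph."""
    ranked = sorted(chunks, key=lambda c: dot(c["embedding"], query_vec), reverse=True)
    return ranked[:top_k]

def ann_postfilter(chunks, query_vec, year, top_k):
    """Take top_k candidates from the index, then keep matching documents only."""
    candidates = index_top_k(chunks, query_vec, top_k)
    return [c for c in candidates if c["year"] == year]

# Hypothetical toy data with unit-length embeddings.
chunks = [
    {"title": "Setup", "year": 2025, "embedding": [1.0, 0.0, 0.0]},
    {"title": "Legacy", "year": 2023, "embedding": [0.8, 0.6, 0.0]},
    {"title": "Tuning", "year": 2025, "embedding": [0.6, 0.8, 0.0]},
    {"title": "Backup", "year": 2025, "embedding": [0.0, 0.0, 1.0]},
]

hits = ann_postfilter(chunks, [1.0, 0.0, 0.0], year=2025, top_k=2)
print([c["title"] for c in hits])  # "Legacy" took a slot, then was filtered out
```

A common workaround is to ask the index for more candidates than you ultimately need and apply the limit after the graph filter.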

Enforce Integrity With Property‑Type Constraints

A common pitfall is writing vectors that don’t match your index’s expectations. Use property‑type constraints to lock things down:

CYPHER 25
CREATE CONSTRAINT vector_type_constraint IF NOT EXISTS
FOR (chunk:DocumentChunk)
REQUIRE chunk.vector_embedding IS :: VECTOR<FLOAT32>(1024);

This ensures that whenever a :DocumentChunk node has a vector_embedding property, its value is a Vector of exactly 1024 FLOAT32 elements. The constraint does not require the property to be present; add a property existence constraint if every :DocumentChunk must have a vector_embedding.
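The difference between the two kinds of constraint can be modeled in a few lines of plain Python. This is only an illustration of the rules described above, not how Neo4j implements them: `array("f", ...)` from the standard library stands in for a FLOAT32 vector, and both function names are hypothetical.

```python
from array import array

def satisfies_type_constraint(value, dimension=1024, typecode="f"):
    """A missing property passes; a present one must match dtype and length."""
    if value is None:  # property absent: the type constraint does not apply
        return True
    return (
        isinstance(value, array)
        and value.typecode == typecode  # "f" is a single-precision float
        and len(value) == dimension
    )

def satisfies_existence_constraint(value):
    """The property must be present, whatever its type."""
    return value is not None

ok = array("f", [0.0] * 1024)
too_short = array("f", [0.0] * 768)      # wrong dimension
wrong_dtype = array("d", [0.0] * 1024)   # double precision, not single
plain_list = [0.0] * 1024                # a list, not a vector

for name, value in [("ok", ok), ("too_short", too_short),
                    ("wrong_dtype", wrong_dtype), ("plain_list", plain_list),
                    ("missing", None)]:
    print(name, satisfies_type_constraint(value), satisfies_existence_constraint(value))
```

Only the combination of both checks guarantees that every node carries a well‑formed embedding.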

Vector Indexes

Vector values work with Neo4j’s vector indexes for efficient semantic retrieval. If you already use vector indexes with list properties, they continue to work; new projects should prefer the Vector type. An index will handle a mixture of lists and vectors, but I recommend choosing one consistent format. Lists can be rewritten as Vector values using Cypher along these lines:

CYPHER 25
MATCH (n)
WHERE n.list IS NOT NULL
SET n.vector = vector(n.list, size(n.list), FLOAT32);

Neo4j’s vector index is powered by Apache Lucene’s Hierarchical Navigable Small World (HNSW) implementation, which offers competitive performance and scale. As indexes grow, it’s important to pay attention to the allocation of memory to Lucene; it doesn’t share the Neo4j page cache.
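For a sense of scale, a back‑of‑the‑envelope estimate of the raw vector data helps. The sketch below multiplies vector count, dimension, and bytes per element; it deliberately ignores the HNSW graph links and any other index overhead, so treat the result as a lower bound. The function name `raw_vector_bytes` is hypothetical.

```python
# Bytes per element for the two float dtypes mentioned in this post.
BYTES_PER_ELEMENT = {"f32": 4, "f64": 8}

def raw_vector_bytes(num_vectors, dimension, dtype="f32"):
    """Size of the raw vector payload only, excluding index structures."""
    return num_vectors * dimension * BYTES_PER_ELEMENT[dtype]

size = raw_vector_bytes(1_000_000, 1024, "f32")
print(f"{size:,} bytes = {size / 2**30:.2f} GiB")
```

One million 1024‑dimensional FLOAT32 vectors come to 4,096,000,000 bytes, or about 3.81 GiB, before any index overhead.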

Compatibility and Rollout

  • Existing projects: No requirement to upgrade; list properties and existing vector indexes continue to work. Adopt the Vector type when convenient.
  • Aura: Available to all Aura users with the 2025.10 rollout, provided you use Cypher 25 and v6 drivers.
  • Older drivers: Reading a Vector with an older driver yields a warning and a fallback (a descriptive map of the unsupported value). The v6 drivers also expose an Unknown Type placeholder for future type extensions.

Summary

The Vector type unlocks more vector‑specific features in Cypher and the index layer, improved storage layouts, and tighter integrations across GenAI tooling. If you’re starting a new project with embeddings, start with Vector today — and add constraints early.


Introducing Neo4j’s Native Vector Data Type was originally published in Neo4j Developer Blog on Medium.