Building retail assistants customers can trust with Databricks and Neo4j

Shyam Kathiresan

Global Cloud Partnership Director

Retail AI does not earn trust by sounding conversational. It earns trust by remembering context, understanding product relationships, checking live business data, and grounding answers in source documentation. That requires more than a chatbot over a catalog. It requires a connected intelligence layer where product knowledge, customer context, inventory, pricing, documentation, and prior interactions can be reasoned over together.

The retail AI problem

A Gartner study of 377 US consumers conducted in June and July 2025 found that 53% distrust or lack confidence in the reliability of AI search and summary results, and that the perceived usefulness of AI declines sharply as consumers move from initial product exploration toward comparing options and making a purchase. Retailers are deploying AI assistants at scale into a market where consumer confidence in those assistants is already fragile. 

That gap has a structural cause. Most retail AI systems today are built without persistent memory across sessions, without the ability to reason across product relationships, and without grounding in actual product documentation, which means they generate answers with no reliable way to verify accuracy.

Fixing that requires a governed knowledge layer that reasons across product relationships, retrieves answers from source documents, and carries customer context through the full purchase journey. That’s the pattern the Databricks Retail Assistant is built on.

What an agentic shopping assistant actually does

A shopper saying, “I need something for a home gym, I’ve got about $800, and I already have a barbell,” is expressing a budget, a compatibility context, and an equipment constraint simultaneously. Parsing all three and returning a reasoned recommendation requires a knowledge graph that holds specs, compatibility rules, availability, pricing, and extracted content from manuals and FAQs, all linked to product nodes and all governed. A recommendation engine that works solely on product names and descriptions will miss most of what that question is actually asking.

The assistant also needs to carry that context through the full purchase. Moving a customer from the first expression of a need through a cart or quote to a completed transaction, without the customer having to repeat themselves or restart the conversation, is a different engineering problem from search, and it is the one this architecture is built to solve.

The architecture

Databricks provides the governed lakehouse foundation for transactional, customer, inventory, pricing, and operational data. Neo4j provides the connected product intelligence layer: product relationships, compatibility rules, documentation-derived knowledge, GraphRAG, and persistent agent memory. Together, they enable an assistant to answer both business and shopper questions within a single experience.

The Databricks Retail Assistant is a two-agent system coordinated by a supervisor agent, which decides which system is best suited to answer each question. Analytics questions, such as revenue trends, customer segments, return rates, or inventory availability, go to Databricks. Product, compatibility, recommendation, and memory-based questions go to Neo4j. When a request requires both, the supervisor combines the results into one response.
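The routing decision described above can be sketched in a few lines of Python. This is a minimal illustration, not the accelerator's implementation: the keyword lists, the `Route` type, and the `ask_lakehouse` / `ask_graph` callables are all hypothetical, and a production supervisor would more likely use an LLM to classify each request.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical keyword lists, invented for this sketch.
ANALYTICS_TERMS = ("revenue", "segment", "return rate", "inventory", "trend")
GRAPH_TERMS = ("compatible", "recommend", "warranty", "bundle", "remember")


@dataclass
class Route:
    lakehouse: bool  # send the question to the analytics agent
    graph: bool      # send the question to the knowledge graph agent


def route(question: str) -> Route:
    """Decide which agent(s) should handle a question."""
    text = question.lower()
    lakehouse = any(term in text for term in ANALYTICS_TERMS)
    graph = any(term in text for term in GRAPH_TERMS)
    if not (lakehouse or graph):
        graph = True  # assumption: unmatched questions default to the product agent
    return Route(lakehouse=lakehouse, graph=graph)


def answer(question: str,
           ask_lakehouse: Callable[[str], str],
           ask_graph: Callable[[str], str]) -> str:
    """Call the selected agent(s) and combine their replies into one response."""
    decision = route(question)
    parts = []
    if decision.lakehouse:
        parts.append(ask_lakehouse(question))
    if decision.graph:
        parts.append(ask_graph(question))
    return "\n".join(parts)
```

A question such as "Recommend a compatible bench that is in inventory" matches both lists, so both agents are called and their replies are joined into a single answer.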

The Genie Lakehouse Agent handles questions about revenue performance, customer segments, return rates, and inventory availability across locations. Genie translates each natural-language question to SQL natively over the retail Delta tables in Unity Catalog.

The Knowledge Graph Agent is a LangGraph ReAct agent with persistent memory, deployed to a Databricks Model Serving endpoint via MLflow. It has tools for semantic product search, structured lookup covering specs and compatibility, graph traversal for related and compatible items, cross-sell and bundle recommendations with confidence scores, cart and quote management, and memory tools for storing and retrieving preferences, constraints, and prior context.

The two agents draw from complementary data stores tied by a shared product identifier that runs through both systems. The Lakehouse holds the transactional and operational record, covering revenue, customer history, live inventory, and pricing. The Knowledge Graph holds connected product intelligence: categories, brands, attributes, compatibility rules, pricing, and extracted content from manuals and FAQs, all vector-indexed and queryable in real time. That shared identifier means the Knowledge Graph Agent can pull graph attributes from Neo4j and live inventory and pricing from the Lakehouse in the same reasoning turn, without duplicating data or weakening governance on either side.
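To make the role of the shared identifier concrete, the sketch below merges graph attributes with live lakehouse values by product ID. The dictionaries stand in for a Neo4j query result and a lakehouse query result; the field names and SKU values are invented for illustration and are not the accelerator's schema.

```python
# Stand-in for attributes returned from the knowledge graph, keyed by product ID.
graph_attrs = {
    "SKU-1": {"name": "Adjustable bench", "compatible_with": ["SKU-2"]},
    "SKU-2": {"name": "Squat rack", "compatible_with": ["SKU-1"]},
    "SKU-3": {"name": "Cable machine", "compatible_with": []},
}

# Stand-in for live rows returned from the lakehouse, keyed by the same ID.
lakehouse_rows = {
    "SKU-1": {"price": 199.0, "in_stock": 4},
    "SKU-2": {"price": 450.0, "in_stock": 0},
}


def enrich(product_ids, graph_attrs, lakehouse_rows):
    """Join graph attributes with live price and stock on the shared product ID."""
    merged = []
    for pid in product_ids:
        attrs = graph_attrs.get(pid)
        live = lakehouse_rows.get(pid)
        if attrs is None or live is None:
            continue  # skip products that are missing from either store
        merged.append({"product_id": pid, **attrs, **live})
    return merged
```

Each store keeps its own copy of only what it owns; the join happens at answer time, so neither side has to replicate the other's data.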

Figures shown reflect the synthetic dataset included in the solution accelerator for demonstration purposes.

Agent memory and GraphRAG

The Retail Assistant uses two of Neo4j’s open-source Python libraries, neo4j-agent-memory for persistent agent memory and neo4j-graphrag for GraphRAG, both running against the same Neo4j Aura instance.

The neo4j-agent-memory library gives the Knowledge Graph Agent memory that persists across sessions. Conversations, preferences, compatibility constraints, and learned facts are stored as nodes and relationships in the same instance as the product catalog, embedded as vectors so the agent can search its own history semantically. Short-term memory tracks the current session, including cart state and expressed constraints; long-term memory holds durable preferences, past purchases, and time-stamped facts; and reasoning memory records how the agent solved previous problems so that similar requests resolve faster without re-running the full inference loop. On each turn, the agent loads relevant context from all three layers before responding.
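The three memory layers can be pictured with a small in-memory model. This sketch does not use the neo4j-agent-memory API: the `AgentMemory` class and its `load_context` method are hypothetical, and simple word overlap stands in for the vector search the library performs.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    short_term: list[str] = field(default_factory=list)  # current session: cart, constraints
    long_term: list[str] = field(default_factory=list)   # durable preferences, past purchases
    reasoning: list[str] = field(default_factory=list)   # how earlier problems were solved

    def load_context(self, query: str, limit: int = 2) -> dict[str, list[str]]:
        """Return the entries from each layer that overlap most with the query."""
        words = set(query.lower().split())

        def relevant(items: list[str]) -> list[str]:
            # Score each entry by shared words; a real system would compare embeddings.
            scored = [(len(words & set(item.lower().split())), item) for item in items]
            scored.sort(key=lambda pair: pair[0], reverse=True)
            return [item for score, item in scored if score > 0][:limit]

        return {
            "short_term": relevant(self.short_term),
            "long_term": relevant(self.long_term),
            "reasoning": relevant(self.reasoning),
        }
```

On each turn the agent would call something like `load_context` once and prepend the result to the prompt, which is how a barbell mentioned in an earlier session can shape a recommendation made today.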

The neo4j-graphrag library builds a retrieval layer on top of the product graph, combining embeddings with graph structure so the agent can move from a matched chunk through extracted entities to find related products, shared compatibility issues, and solutions sourced from actual documentation. Knowledge articles, support tickets, manuals, and FAQs are chunked, embedded, and linked back to product nodes through entity extraction, producing Feature, Symptom, and Solution nodes connected to the products they describe. When a customer asks a product question, the agent retrieves the answer from the source document and surfaces it with a traceable reference, giving the customer something they can verify.
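The hop from a matched chunk through extracted entities to products can be illustrated without a database. The sketch below is not the neo4j-graphrag API: the dictionaries imitate chunk, entity, and product links, every ID and document name is invented, and the vector match that selects the starting chunk is assumed to have already happened.

```python
# Hypothetical miniature graph: chunks mention entities, entities describe products.
CHUNKS = {
    "chunk-1": {
        "source": "cable machine manual, maintenance section",
        "entities": ["Symptom: frayed cable", "Solution: replace cable kit"],
    },
    "chunk-2": {
        "source": "bench FAQ, assembly",
        "entities": ["Feature: folding frame"],
    },
}

ENTITY_TO_PRODUCTS = {
    "Symptom: frayed cable": ["SKU-CABLE-01", "SKU-PULLEY-02"],
    "Solution: replace cable kit": ["SKU-CABLE-01"],
    "Feature: folding frame": ["SKU-BENCH-03"],
}


def expand_chunk(chunk_id: str) -> dict:
    """Follow a matched chunk to its entities and on to the related products."""
    chunk = CHUNKS[chunk_id]
    products = []
    for entity in chunk["entities"]:
        for product_id in ENTITY_TO_PRODUCTS.get(entity, []):
            if product_id not in products:  # keep first-seen order, no duplicates
                products.append(product_id)
    return {
        "source": chunk["source"],      # the traceable reference shown to the customer
        "entities": chunk["entities"],
        "products": products,
    }
```

Returning the source alongside the products is the point of the pattern: the answer and the reference to the document it came from travel together.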

A full purchase in a single conversation

A shopper asking about warranty, compatibility, or setup instructions does not need a plausible answer. They need an answer grounded in the actual manual, FAQ, support article, or product documentation.

A customer opens the assistant and says they’re setting up a home gym on an $800 budget, already have a barbell and a bench, and need to fill out the rest. The Knowledge Graph Agent pulls compatible accessories using purchase relationship edges and spec attributes for space and weight, checks live inventory from the Lakehouse, and returns a bundle recommendation with a specific explanation of why each item fits the existing setup and how the total lands within budget. 
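One simple way to assemble a bundle like this is a greedy pass over scored candidates. The sketch below is an illustrative heuristic, not the accelerator's recommendation logic: the `Candidate` fields, the scores, and the prices are all invented, and the real agent draws candidates and scores from the graph and stock levels from the Lakehouse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    product_id: str
    price: float
    in_stock: int
    score: float  # hypothetical relevance or confidence score


def build_bundle(candidates, budget, owned):
    """Greedily add the highest-scoring in-stock items that still fit the budget."""
    bundle, total = [], 0.0
    for item in sorted(candidates, key=lambda c: c.score, reverse=True):
        if item.product_id in owned or item.in_stock <= 0:
            continue  # skip what the customer already has or cannot buy today
        if total + item.price <= budget:
            bundle.append(item)
            total += item.price
    return bundle, total


candidates = [
    Candidate("barbell", 200.0, 8, 0.95),
    Candidate("squat-rack", 450.0, 3, 0.90),
    Candidate("plate-set", 250.0, 5, 0.80),
    Candidate("cable-machine", 400.0, 2, 0.70),
    Candidate("floor-mat", 60.0, 0, 0.50),
    Candidate("collars", 25.0, 10, 0.40),
]

bundle, total = build_bundle(candidates, budget=800.0, owned={"barbell", "bench"})
```

With these made-up numbers the bundle is the squat rack, plate set, and collars for a total of 725.0. The cable machine is skipped because it would push the total over budget, and the floor mat is skipped because it is out of stock.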

When the customer asks about the warranty on the cable machine, the agent traverses to the relevant product node and retrieves the applicable section from the product manual through neo4j-graphrag, surfacing the warranty terms with a source reference. When the customer confirms, the agent updates the cart state in memory and surfaces the checkout page, carrying the full context of the conversation through to the transaction.

What this makes possible

With Databricks and Neo4j working together, the Retail Assistant can support a more trusted, context-aware shopping journey and do more than answer product questions. It can recommend with context, explain with sources, remember what the customer has already shared, and connect the full journey from discovery to checkout.

For retailers, that means more relevant recommendations, fewer unsupported answers, better use of live inventory and pricing data, and a shopping experience that feels less like search and more like guided assistance.

Getting started

Retail assistants will not earn customer trust by sounding more conversational. They will earn it by remembering context, reasoning across product relationships, checking live business data, and grounding answers in source documentation.

Explore the Retail Assistant solution accelerator on GitHub to see how Databricks and Neo4j combine lakehouse data, graph intelligence, GraphRAG, and persistent agent memory in a working agentic commerce app.