text2cypher
This is an experimental translator inspired by the Neo4j Labs project text2cypher.
If you add this translator to the classpath or use the text2cypher bundle, all queries that start with 🤖, are treated as natural language queries written in plain English. The driver strips the prefix and uses OpenAI to translate the input into a Cypher® statement. To improve the generated query, the driver passes the current graph schema along with the input question.
The following data will be sent to an external API: the current graph schema and the natural language question.
Don’t use this translator if you don’t want that, or are not allowed to do so.
This module requires one additional configuration: the OpenAI API key. You can use either a URL parameter, a JDBC property entry, or an environment variable:
- URL parameter/property name is openAIApiKey
- Environment variable name is OPEN_AI_API_KEY
jdbc:neo4j://localhost:7687?openAIApiKey=sk-xxx-your-key
Additional configuration properties are:
property name | default value |
---|---|
openAIBaseUrl | https://api.openai.com/v1 (defined by langchain4j) |
openAIModelName | gpt-4-turbo |
openAITemperature | 0.0 |
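The configuration described above can also be supplied as JDBC property entries instead of URL parameters. The following is a minimal sketch: the class and method names are hypothetical, and it assumes that the optional settings from the table are accepted as JDBC properties in the same way as the API key. Database credentials are omitted for brevity.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical helper class, not part of the driver.
final class Text2CypherConfig {

    // Property names as listed in the documentation above.
    static final String API_KEY = "openAIApiKey";
    static final String MODEL_NAME = "openAIModelName";
    static final String TEMPERATURE = "openAITemperature";

    /** Builds the JDBC properties for the translator from the given API key. */
    static Properties buildProperties(String apiKey) {
        Properties properties = new Properties();
        properties.setProperty(API_KEY, apiKey);
        // Assumption: the optional settings can be passed as properties too.
        // The values below simply repeat the documented defaults.
        properties.setProperty(MODEL_NAME, "gpt-4-turbo");
        properties.setProperty(TEMPERATURE, "0.0");
        return properties;
    }

    /** Opens a connection; needs the driver and the translator on the classpath. */
    static Connection connect(String url, String apiKey) throws SQLException {
        return DriverManager.getConnection(url, buildProperties(apiKey));
    }
}
```

A call such as `Text2CypherConfig.connect("jdbc:neo4j://localhost:7687", apiKey)` keeps the key out of the URL, which helps when URLs end up in logs.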
With that in place, a query such as the following can be translated into Cypher:
🤖, How was The Da Vinci Code rated?
The output of the LLM is not deterministic and is likely to vary between runs.
While you can execute the query directly, we strongly recommend using Connection#nativeSQL
to retrieve the Cypher statement, inspect it, and then run it separately.
In our test runs, the above question was most often correctly translated to
MATCH (m:`Movie` {
title: 'The Da Vinci Code'
})<-[r:`REVIEWED`]-(p:`Person`)
RETURN r.rating AS Rating, p.name AS ReviewerName
Other times the result was a syntactically correct statement, but it would only return the reviewers and the movie itself. Also note that while a human likely recognizes that you are actually thinking about the average rating, the LLM does not infer this. Making the question more explicit gives better results:
🤖, How was The Da Vinci Code rated on average?
is translated more accurately to:
MATCH (m:`Movie` {
title: 'The Da Vinci Code'
})<-[r:`REVIEWED`]-(p:`Person`)
RETURN avg(r.rating) AS AverageRating
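Because translations vary like this, the recommendation to inspect the statement before running it can be sketched as follows. `Connection#nativeSQL` is the standard JDBC method; the class and method names are hypothetical, and the connection is assumed to have been opened with the translator on the classpath.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class, not part of the driver.
final class InspectBeforeRun {

    /** Asks the driver for the translated Cypher without executing it. */
    static String translate(Connection connection, String question) throws SQLException {
        return connection.nativeSQL(question);
    }

    /** Runs a Cypher statement that has already been reviewed and prints the first column. */
    static void runReviewed(Connection connection, String cypher) throws SQLException {
        try (Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery(cypher)) {
            while (resultSet.next()) {
                System.out.println(resultSet.getObject(1));
            }
        }
    }
}
```

Keeping translation and execution in separate steps leaves room for a manual or automated review of the generated Cypher in between.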
Once a natural language query has been translated into Cypher, the result is cached, and further invocations of that query use the cached result.
All statements that do not start with 🤖 are used as-is and treated as Cypher.
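The prefix handling and caching described above can be illustrated with a small sketch. This is not the driver's actual implementation: the class is hypothetical, the LLM call is replaced by a pluggable function, and the cache key (the trimmed question) and the handling of the comma after the marker are assumptions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Illustration only; names and caching strategy are assumptions.
final class PrefixDispatchSketch {

    // U+1F916 (robot face) as a surrogate pair, avoiding source-encoding issues.
    static final String MARKER = "\uD83E\uDD16";

    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Hypothetical stand-in for the call to the external LLM.
    private final UnaryOperator<String> llm;

    PrefixDispatchSketch(UnaryOperator<String> llm) {
        this.llm = llm;
    }

    String toCypher(String statement) {
        if (!statement.startsWith(MARKER)) {
            return statement; // plain Cypher passes through unchanged
        }
        String question = statement.substring(MARKER.length());
        if (question.startsWith(",")) {
            question = question.substring(1); // drop the comma after the marker
        }
        // Translate once per distinct question; later calls reuse the cached Cypher.
        return cache.computeIfAbsent(question.trim(), llm);
    }
}
```

With a stub such as `new PrefixDispatchSketch(q -> "MATCH (n) RETURN n")`, repeated calls with the same question invoke the stub only once, while a statement like `RETURN 1` is returned unchanged.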
Get the full, ready-to-use bundle here: https://repo.maven.apache.org/maven2/org/neo4j/neo4j-jdbc-text2cypher-bundle/6.1.1/neo4j-jdbc-text2cypher-bundle-6.1.1.jar. See Available bundles for more information.