Learn to implement a Mixtral agent that interacts with the Neo4j graph database through a semantic layer
By now, we have all probably recognized that we can significantly enhance the capabilities of LLMs by providing them with additional tools. For example, even ChatGPT can use Bing Search and a Python interpreter out of the box in the paid version. OpenAI goes a step further and provides models fine-tuned for tool usage, where you can pass the available tools along with the prompt to the API endpoint. The LLM then decides whether it can directly provide a response or whether it should first use any of the available tools. Note that the tools don’t have to be limited to retrieving additional information; they can be anything, even allowing an LLM to book a dinner reservation. I have previously implemented a project that allows an LLM to interact with a graph database through a set of predefined tools, which I call a semantic layer.
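To make the idea of a semantic layer a bit more concrete, here is a minimal sketch of what one such predefined tool could look like: a fixed, parameterized Cypher query wrapped in a plain Python function. The connection details, graph schema, and query below are illustrative assumptions, not the actual implementation from the project.

```python
# Illustrative sketch only: connection details, schema, and query are assumptions.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))


def get_movie_info(title: str) -> str:
    """Fetch basic information about a movie from Neo4j with a fixed Cypher query."""
    query = """
    MATCH (m:Movie {title: $title})
    OPTIONAL MATCH (m)<-[:ACTED_IN]-(a:Person)
    RETURN m.title AS title, collect(a.name) AS actors
    """
    with driver.session() as session:
        record = session.run(query, title=title).single()
    if record is None:
        return f"Could not find a movie titled {title}"
    return f"{record['title']}, starring {', '.join(record['actors'])}"
```

The point is that the LLM never generates Cypher itself; it only decides which predefined tool to call and with which parameters.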
Tools in the Semantic Layer
The examples in the LangChain documentation (JSON agent, HuggingFace example) use tools with a single string input. Since the tools in the semantic layer use slightly more complex inputs, I had to dig a little deeper. Here is an example input for the recommender tool.

```python
from typing import Optional

from langchain.pydantic_v1 import BaseModel, Field  # pydantic v1 shim used by LangChain tools

all_genres = [
    "Action", "Adventure", "Animation", "Children", "Comedy", "Crime",
    "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror", "IMAX",
    "Musical", "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western",
]


class RecommenderInput(BaseModel):
    movie: Optional[str] = Field(description="movie used for recommendation")
    genre: Optional[str] = Field(
        description=(
            "genre used for recommendation. Available options are:"
            f"{all_genres}"
        )
    )
```

The recommender tool has two optional inputs: movie and genre. Additionally, we use an enumeration of available values for the genre parameter. While the inputs are not highly complex, they are still more advanced than a single string input, so the implementation has to be slightly different.
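For illustration, this is how an action_input produced by the agent would map onto that schema (a hypothetical input, assuming the pydantic v1 behavior LangChain tools use, where Optional fields default to None):

```python
# Hypothetical action_input, as if parsed from the agent's $JSON_BLOB.
args = {"genre": "Comedy"}

validated = RecommenderInput(**args)
print(validated.movie, validated.genre)
# None Comedy -- the omitted optional field falls back to None
```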
JSON-based Prompt for an LLM Agent
In my implementation, I took heavy inspiration from the existing hwchase17/react-json prompt available in the LangChain hub. The prompt uses the following system message.

````
Answer the following questions as best you can. You have access to the following tools:

{tools}

The way you use the tools is by specifying a json blob.
Specifically, this json should have a `action` key (with the name of the tool to use) and a `action_input` key (with the input to the tool going here).

The only values that should be in the "action" field are: {tool_names}

The $JSON_BLOB should only contain a SINGLE action, do NOT return a list of multiple actions. Here is an example of a valid $JSON_BLOB:

```
{{
  "action": $TOOL_NAME,
  "action_input": $INPUT
}}
```

ALWAYS use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action:
```
$JSON_BLOB
```
Observation: the result of the action
... (this Thought/Action/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin! Reminder to always use the exact characters `Final Answer` when responding.
````

The prompt starts by defining the available tools, which we will get to a bit later. The most important part of the prompt is instructing the LLM on what the output should look like. When the LLM needs to call a function, it should use the following JSON structure:
{{ "action": $TOOL_NAME, "action_input": $INPUT }}That’s why it is called a JSON-based agent: we instruct the LLM to produce a JSON when it wants to use any available tools. However, that is only a part of the output definition. The full output should have the following structure:
````
Thought: you should always think about what to do
Action:
```
$JSON_BLOB
```
Observation: the result of the action
... (this Thought/Action/Observation can repeat N times)
Final Answer: the final answer to the original input question
````

The LLM should always explain what it is doing in the Thought part of the output. When it wants to use any of the available tools, it should provide the action input as a JSON blob. The Observation part is reserved for tool outputs, and when the agent decides it can return an answer to the user, it should use the Final Answer key. Here is an example of how the movie agent uses this structure.
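A run that follows this format might look something like the trace below (a hypothetical example; the exact wording produced by the agent will differ):

````
Question: Can you recommend a good comedy?
Thought: The user wants a recommendation for the Comedy genre, so I should use the Recommender tool.
Action:
```
{
  "action": "Recommender",
  "action_input": {"genre": "Comedy"}
}
```
Observation: a list of comedy movies returned by the tool
Thought: I now know the final answer
Final Answer: Here are a few comedies you might enjoy: ...
````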

{{ "action": Null, "action_input": "" }}The output parsing function in LangChain doesn’t ignore the action if it is null or similar but returns an error that the null tool is not defined. I tried to prompt the engineer a solution to this problem, but I couldn’t do it in a consistent manner. Therefore, I decided to add a dummy smalltalk tool that the agent can call when the user wants to smalltalk.
```python
from typing import Optional, Type

from langchain.callbacks.manager import CallbackManagerForToolRun
from langchain.pydantic_v1 import BaseModel, Field
from langchain.tools import BaseTool

response = (
    "Create a final answer that says if they "
    "have any questions about movies or actors"
)


class SmalltalkInput(BaseModel):
    query: Optional[str] = Field(description="user query")


class SmalltalkTool(BaseTool):
    name = "Smalltalk"
    description = "useful for when user greets you or wants to smalltalk"
    args_schema: Type[BaseModel] = SmalltalkInput

    def _run(
        self,
        query: Optional[str] = None,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool."""
        return response
```

This way, the agent can decide to use the dummy Smalltalk tool when the user greets it, and we no longer have problems parsing null or missing tool names.
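With the tool registered, a greeting now maps onto a valid tool call instead of a null action, for example a blob along these lines (hypothetical output):

```
{
  "action": "Smalltalk",
  "action_input": {"query": "Hello!"}
}
```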

Defining Tool Inputs in the System Prompt
As mentioned, I had to figure out how to define slightly more complex tool inputs so that the LLM could interpret them correctly. Funnily enough, after implementing a custom function, I found an existing LangChain function that transforms custom Pydantic tool input definitions into a JSON object that Mixtral recognizes.

```python
from langchain.tools.render import render_text_description_and_args

tools = [RecommenderTool(), InformationTool(), SmalltalkTool()]
tool_input = render_text_description_and_args(tools)
print(tool_input)
```

This produces the following string description:
"Recommender":"useful for when you need to recommend a movie", "args":{ { "movie":{ { "title":"Movie", "description":"movie used for recommendation", "type":"string" } }, "genre":{ { "title":"Genre", "description":"genre used for recommendation. Available options are:['Action', 'Adventure', 'Animation', 'Children', 'Comedy', 'Crime', 'Documentary', 'Drama', 'Fantasy', 'Film-Noir', 'Horror', 'IMAX', 'Musical', 'Mystery', 'Romance', 'Sci-Fi', 'Thriller', 'War', 'Western']", "type":"string" } } } }, "Information":"useful for when you need to answer questions about various actors or movies", "args":{ { "entity":{ { "title":"Entity", "description":"movie or a person mentioned in the question", "type":"string" } }, "entity_type":{ { "title":"Entity Type", "description":"type of the entity. Available options are 'movie' or 'person'", "type":"string" } } } }, "Smalltalk":"useful for when user greets you or wants to smalltalk", "args":{ { "query":{ { "title":"Query", "description":"user query", "type":"string" } } } }We can simply copy this tool description in the system prompt, and Mixtral will be able to use the defined tools, which is quite cool.
Conclusion
Most of the work to implement the JSON-based agent was done by Harrison Chase and the LangChain team, for which I am grateful. All I had to do was find the puzzle pieces and put them together. As mentioned, don’t expect the same level of agent performance as with GPT-4. However, I think the more powerful OSS LLMs like Mixtral could be used as agents today (with a bit more exception handling than GPT-4). I am looking forward to more open-source LLMs being fine-tuned as agents. The code is available as a LangChain template and as a Jupyter notebook.