Commit b86a873e15 by vowelparrot: Add Tool Retrieval Example with the OpenAPI NLA Toolkit (2023-04-11 13:05:35 -07:00)

@@ -0,0 +1,519 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "c7ad998d",
"metadata": {},
"source": [
"# Tool Retrieval over large Natural Language APIs\n",
"\n",
"This tutorial assumes familiarity with [Natural Language API Toolkits](../../toolkits/examples/openapi_nla.ipynb). NLAToolkits parse whole OpenAPI specs, which can be too large to reasonably fit into an agent's context. This tutorial walks through using a Retriever object to fetch tools.\n",
"\n",
"### First, import dependencies and load the LLM"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6593f793",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import re\n",
"from typing import Callable, Union\n",
"\n",
"from langchain import OpenAI, LLMChain\n",
"from langchain.prompts import StringPromptTemplate\n",
"from langchain.requests import Requests\n",
"from langchain.agents import AgentExecutor, AgentOutputParser, AgentType, initialize_agent, LLMSingleActionAgent\n",
"from langchain.agents.agent_toolkits import NLAToolkit\n",
"from langchain.schema import AgentAction, AgentFinish"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "37141072",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"spoonacular_api_key = \"\" # Copy from the API Console at https://spoonacular.com/food-api/console#Profile"
]
},
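{
"cell_type": "markdown",
"id": "f3a9b1c7",
"metadata": {},
"source": [
"If you prefer not to paste the key into the notebook, the optional sketch below reads it from an environment variable instead; `SPOONACULAR_API_KEY` is just an assumed variable name."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a7c2d4e9",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Optional: read the key from an environment variable rather than hardcoding it.\n",
"# SPOONACULAR_API_KEY is an assumed name; export it in your shell beforehand.\n",
"import os\n",
"\n",
"spoonacular_api_key = os.environ.get(\"SPOONACULAR_API_KEY\", spoonacular_api_key)"
]
},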
{
"cell_type": "code",
"execution_count": 3,
"id": "dd720860",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Select the LLM to use. Here, we use text-davinci-003\n",
"llm = OpenAI(temperature=0, max_tokens=700) # You can swap between different core LLM's here."
]
},
{
"cell_type": "markdown",
"id": "4cadac9d",
"metadata": {
"tags": []
},
"source": [
"### Next, load the Natural Language API Toolkits"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "6b208ab0",
"metadata": {
"scrolled": true,
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Attempting to load an OpenAPI 3.0.1 spec. This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
"Attempting to load an OpenAPI 3.0.1 spec. This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
"Attempting to load an OpenAPI 3.0.1 spec. This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
"Attempting to load an OpenAPI 3.0.0 spec. This may result in degraded performance. Convert your OpenAPI spec to 3.1.* spec for better support.\n",
"Unsupported APIPropertyLocation \"header\" for parameter Content-Type. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Accept. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Content-Type. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Accept. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Content-Type. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Accept. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Content-Type. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Accept. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Content-Type. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Content-Type. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Content-Type. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Content-Type. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Accept. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Content-Type. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Accept. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Accept. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Accept. Valid values are ['path', 'query'] Ignoring optional parameter\n",
"Unsupported APIPropertyLocation \"header\" for parameter Content-Type. Valid values are ['path', 'query'] Ignoring optional parameter\n"
]
}
],
"source": [
"speak_toolkit = NLAToolkit.from_llm_and_url(llm, \"https://api.speak.com/openapi.yaml\")\n",
"klarna_toolkit = NLAToolkit.from_llm_and_url(llm, \"https://www.klarna.com/us/shopping/public/openai/v0/api-docs/\")\n",
"\n",
"# Add the API key for authenticating to the API\n",
"requests = Requests(headers={\"x-api-key\": spoonacular_api_key})\n",
"spoonacular_toolkit = NLAToolkit.from_llm_and_url(\n",
" llm, \n",
" \"https://spoonacular.com/application/frontend/downloads/spoonacular-openapi-3.json\",\n",
" requests=requests,\n",
" max_text_length=1800, # If you want to truncate the response text\n",
")\n",
"toolkits = (speak_toolkit, spoonacular_toolkit, klarna_toolkit)\n",
"ALL_TOOLS = [tool for toolkit in toolkits for tool in toolkit.get_tools()]"
]
},
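{
"cell_type": "markdown",
"id": "1d4f7a92",
"metadata": {},
"source": [
"Together these toolkits expose far more endpoints than we would want to paste into a single prompt, which is why we retrieve tools per query instead. As an optional sanity check, the sketch below just counts the loaded tools and peeks at a few of their names; the exact numbers depend on the specs at load time."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2e5a8b03",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Optional sanity check: how many tools did the three toolkits produce?\n",
"# The exact count depends on the OpenAPI specs fetched above.\n",
"print(f\"Loaded {len(ALL_TOOLS)} tools\")\n",
"[tool.name for tool in ALL_TOOLS[:5]]"
]
},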
{
"cell_type": "markdown",
"id": "e5dfc494",
"metadata": {},
"source": [
"## Tool Retriever\n",
"\n",
"We will use a vectorstore to create embeddings for each tool description. Then, for an incoming query we can create embeddings for that query and do a similarity search for relevant tools."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "f7c29e82",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.vectorstores import FAISS\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.schema import Document"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "5cd940a8",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"docs = [Document(page_content=t.description, metadata={\"index\": i}) for i, t in enumerate(ALL_TOOLS)]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "19d77004",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"vector_store = FAISS.from_documents(docs, OpenAIEmbeddings())"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "3d84c14d-c3eb-4381-8852-fd6175b02239",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Create the retriever object\n",
"retriever = vector_store.as_retriever()\n",
"\n",
"def get_tools(query):\n",
" docs = retriever.get_relevant_documents(query)\n",
" return [ALL_TOOLS[d.metadata[\"index\"]] for d in docs]"
]
},
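{
"cell_type": "markdown",
"id": "3f6b9c14",
"metadata": {},
"source": [
"By default the retriever returns four matches per query. If you want the agent to see more (or fewer) candidate tools, you can pass `search_kwargs` when creating the retriever, and you can inspect the raw scores to gauge retrieval quality. The cell below is an optional sketch; `wide_retriever` is just an illustrative name and is not used later in the notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4a7c0d25",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Optionally widen (or narrow) the number of candidate tools retrieved per query.\n",
"# This retriever is only for illustration and is not used by the agent below.\n",
"wide_retriever = vector_store.as_retriever(search_kwargs={\"k\": 8})\n",
"\n",
"# Peek at raw scores; with the default FAISS index these are L2 distances, so lower means closer.\n",
"vector_store.similarity_search_with_score(\"wine pairing\", k=3)"
]
},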
{
"cell_type": "code",
"execution_count": 9,
"id": "d37782ab-a74c-4cd0-8712-ba9166152206",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"('Speak.explainPhrase',\n",
" \"I'm an AI from Speak. Instruct what you want, and I'll assist via an API with description: Explain the meaning and usage of a specific foreign language phrase that the user is asking about.\")"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tool = get_tools(\"How would I ask 'What is the most important thing?' in Maori?\")[0]\n",
"tool.name, tool.description"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "88814a90-f1fe-49e7-88e4-fcd662142cfb",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"('spoonacular_API.ingredientSearch',\n",
" \"I'm an AI from spoonacular API. Instruct what you want, and I'll assist via an API with description: Search for simple whole foods (e.g. fruits, vegetables, nuts, grains, meat, fish, dairy etc.).\")"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tool = get_tools(\"What's a good vegetarian Thanksgiving dish?\")[0]\n",
"tool.name, tool.description"
]
},
{
"cell_type": "markdown",
"id": "16c7336f",
"metadata": {},
"source": [
"### Create the Agent"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "3bc3de7c-dca9-4e29-9efb-dadb714c7100",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Set up a prompt template\n",
"class CustomPromptTemplate(StringPromptTemplate):\n",
" # The template to use\n",
" template: str\n",
" ############## NEW ######################\n",
" # The list of tools available\n",
" tools_getter: Callable\n",
" \n",
" def format(self, **kwargs) -> str:\n",
" # Get the intermediate steps (AgentAction, Observation tuples)\n",
" # Format them in a particular way\n",
" intermediate_steps = kwargs.pop(\"intermediate_steps\")\n",
" thoughts = \"\"\n",
" for action, observation in intermediate_steps:\n",
" thoughts += action.log\n",
" thoughts += f\"\\nObservation: {observation}\\nThought: \"\n",
" # Set the agent_scratchpad variable to that value\n",
" kwargs[\"agent_scratchpad\"] = thoughts\n",
" ############## NEW ######################\n",
" tools = self.tools_getter(kwargs['input'])\n",
" # Create a tools variable from the list of tools provided\n",
" kwargs[\"tools\"] = \"\\n\".join([f\"{tool.name}: {tool.description}\" for tool in tools])\n",
" # Create a list of tool names for the tools provided\n",
" kwargs[\"tool_names\"] = \", \".join([tool.name for tool in tools])\n",
" return self.template.format(**kwargs)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "9302880c-ad1d-490e-912f-bb166387c200",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Slightly tweak the instructions from the default agent\n",
"template = \"\"\"Answer the following questions as best you can. You have access to the following tools:\n",
"\n",
"{tools}\n",
"\n",
"Use the following format:\n",
"\n",
"Question: the input question you must answer\n",
"Thought: you should always think about what to do\n",
"Action: the action to take, should be one of [{tool_names}]\n",
"Action Input: full description of what you want to accomplish so the tool AI can assist.\n",
"Observation: The Agent's response\n",
"... (this Thought/Action/Action Input/Observation can repeat N times)\n",
"Thought: I now know the final answer. User can't see any of my observations, API responses, links, or tools.\n",
"Final Answer: the final answer to the original input question with the right amount of detail\n",
"\n",
"When responding with your Final Answer, remember that the person you are responding to CANNOT see any of your Thought/Action/Action Input/Observations, so if there is any relevant information there you need to include it explicitly in your response.\n",
"Begin!\n",
"\n",
"Question: {input}\n",
"Thought:{agent_scratchpad}\"\"\"\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "36dedac3-94c2-4f02-b3ed-f44793c84eac",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"prompt = CustomPromptTemplate(\n",
" template=template,\n",
" tools_getter=get_tools,\n",
" # This omits the `agent_scratchpad`, `tools`, and `tool_names` variables because those are generated dynamically\n",
" # This includes the `intermediate_steps` variable because that is needed\n",
" input_variables=[\"input\", \"intermediate_steps\"]\n",
")"
]
},
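{
"cell_type": "markdown",
"id": "5b8d1e36",
"metadata": {},
"source": [
"Before wiring this prompt into an agent, it can help to render it once for a sample question and confirm that only the retrieved tools are injected. The cell below is a quick inspection step; since no intermediate steps have happened yet, the agent scratchpad is empty."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6c9e2f47",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Render the prompt for a sample question to confirm tool retrieval is wired in.\n",
"# No intermediate steps have happened yet, so the scratchpad is empty.\n",
"print(prompt.format(input=\"What's a good vegetarian Thanksgiving dish?\", intermediate_steps=[]))"
]
},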
{
"cell_type": "code",
"execution_count": 14,
"id": "8f2fe3cb-1060-4e71-8c9f-31cbb1931f1a",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"class CustomOutputParser(AgentOutputParser):\n",
" \n",
" def parse(self, llm_output: str) -> Union[AgentAction, AgentFinish]:\n",
" # Check if agent should finish\n",
" if \"Final Answer:\" in llm_output:\n",
" return AgentFinish(\n",
" # Return values is generally always a dictionary with a single `output` key\n",
" # It is not recommended to try anything else at the moment :)\n",
" return_values={\"output\": llm_output.split(\"Final Answer:\")[-1].strip()},\n",
" log=llm_output,\n",
" )\n",
" # Parse out the action and action input\n",
" regex = r\"Action: (.*?)[\\n]*Action Input:[\\s]*(.*)\"\n",
" match = re.search(regex, llm_output, re.DOTALL)\n",
" if not match:\n",
" raise ValueError(f\"Could not parse LLM output: `{llm_output}`\")\n",
" action = match.group(1).strip()\n",
" action_input = match.group(2)\n",
" # Return the action and action input\n",
" return AgentAction(tool=action, tool_input=action_input.strip(\" \").strip('\"'), log=llm_output)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "544ae3eb-9e10-40c9-b134-de752761f4f2",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"output_parser = CustomOutputParser()"
]
},
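{
"cell_type": "markdown",
"id": "7d0f3a58",
"metadata": {},
"source": [
"To see how the parser behaves, you can feed it a hand-written completion in the expected format. The string below is made up purely for illustration; real completions come from the LLM inside the agent loop."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8e1a4b69",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Exercise the parser on a made-up completion (illustration only).\n",
"sample_completion = \"I should ask the Speak tool.\\nAction: Speak.explainPhrase\\nAction Input: How would I ask 'What is the most important thing?' in Maori?\"\n",
"output_parser.parse(sample_completion)"
]
},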
{
"cell_type": "markdown",
"id": "4609a1ce-3f72-403e-bf83-958491f5805a",
"metadata": {},
"source": [
"## Set up LLM, stop sequence, and the agent\n",
"\n",
"Also the same as the previous notebook"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "af14b84e-2713-4515-bc3c-c602b5527a06",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# LLM chain consisting of the LLM and a prompt\n",
"llm_chain = LLMChain(llm=llm, prompt=prompt)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "bc534f60-5042-454a-afd0-c2809b82fa6a",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"tool_names = [tool.name for tool in ALL_TOOLS]\n",
"agent = LLMSingleActionAgent(\n",
" llm_chain=llm_chain, \n",
" output_parser=output_parser,\n",
" stop=[\"\\nObservation:\"], \n",
" allowed_tools=tool_names,\n",
" verbose=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "7283d902-f682-4631-aacc-a37616999de9",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"agent_executor = AgentExecutor.from_agent_and_tools(agent=agent, tools=ALL_TOOLS, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "ffe44871-25b0-4ef5-89a4-5a643fa6425d",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Make the query more complex!\n",
"user_input = (\n",
" \"My Spanish relatives are coming and I need to pick some good wine and some food. Also, any advice on how to talk about it to them would be much appreciated\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "2f36278e-ec09-4237-8b7f-c9859ccd12e0",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find a wine and a dish that go well together, and also provide some advice on how to talk about it\n",
"Action: spoonacular_API.getWinePairing\n",
"Action Input: \"Spanish cuisine\"\u001b[0m\n",
"\n",
"Observation:\u001b[36;1m\u001b[1;3mI attempted to call an API to find a wine pairing for Spanish cuisine, but the API returned an error saying it could not find a pairing. It may be that there is no known wine pairing for Spanish cuisine.\u001b[0m\u001b[32;1m\u001b[1;3m I should try to find a specific dish and then find a wine that goes well with it\n",
"Action: spoonacular_API.getWinePairing\n",
"Action Input: \"paella\"\u001b[0m\n",
"\n",
"Observation:\u001b[36;1m\u001b[1;3mWhen pairing wine with Spanish dishes, it is recommended to follow the rule 'what grows together goes together'. For paella, we recommend albariño for white wine and garnachan and tempranillo for red. One wine you could try is Becker Vineyards Tempranillo, which has 4.4 out of 5 stars and a bottle costs about 18 dollars.\u001b[0m\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: For your Spanish relatives, you should try pairing paella with Becker Vineyards Tempranillo. This red wine has 4.4 out of 5 stars and a bottle costs about 18 dollars. When pairing wine with Spanish dishes, it is recommended to follow the rule 'what grows together goes together'.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"\"For your Spanish relatives, you should try pairing paella with Becker Vineyards Tempranillo. This red wine has 4.4 out of 5 stars and a bottle costs about 18 dollars. When pairing wine with Spanish dishes, it is recommended to follow the rule 'what grows together goes together'.\""
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent_executor.run(user_input)"
]
},
{
"cell_type": "markdown",
"id": "a2959462",
"metadata": {},
"source": [
"## Thank you!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}