Files
langchain/docs/docs/how_to/custom_tools.ipynb
Eugene Yurtsev b2f58d37db docs: run migration script against how-to docs (#21927)
Upgrade imports in how-to docs
2024-05-20 17:32:59 +00:00

623 lines
20 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"id": "5436020b",
"metadata": {},
"source": [
"# How to create custom tools\n",
"\n",
"When constructing an agent, you will need to provide it with a list of `Tool`s that it can use. Besides the actual function that is called, the Tool consists of several components:\n",
"\n",
"| Attribute | Type | Description |\n",
"|-----------------|---------------------------|------------------------------------------------------------------------------------------------------------------|\n",
"| name | str | Must be unique within a set of tools provided to an LLM or agent. |\n",
"| description | str | Describes what the tool does. Used as context by the LLM or agent. |\n",
"| args_schema | Pydantic BaseModel | Optional but recommended, can be used to provide more information (e.g., few-shot examples) or validation for expected parameters |\n",
"| return_direct | boolean | Only relevant for agents. When True, after invoking the given tool, the agent will stop and return the result direcly to the user. |\n",
"\n",
"LangChain provides 3 ways to create tools:\n",
"\n",
"1. Using [@tool decorator](https://api.python.langchain.com/en/latest/tools/langchain_core.tools.tool.html#langchain_core.tools.tool) -- the simplest way to define a custom tool.\n",
"2. Using [StructuredTool.from_function](https://api.python.langchain.com/en/latest/tools/langchain_core.tools.StructuredTool.html#langchain_core.tools.StructuredTool.from_function) class method -- this is similar to the `@tool` decorator, but allows more configuration and specification of both sync and async implementations.\n",
"3. By sub-classing from [BaseTool](https://api.python.langchain.com/en/latest/tools/langchain_core.tools.BaseTool.html) -- This is the most flexible method, it provides the largest degree of control, at the expense of more effort and code.\n",
"\n",
"The `@tool` or the `StructuredTool.from_function` class method should be sufficient for most use cases.\n",
"\n",
":::{.callout-tip}\n",
"\n",
"Models will perform better if the tools have well chosen names, descriptions and JSON schemas.\n",
":::"
]
},
{
"cell_type": "markdown",
"id": "c7326b23",
"metadata": {},
"source": [
"## @tool decorator\n",
"\n",
"This `@tool` decorator is the simplest way to define a custom tool. The decorator uses the function name as the tool name by default, but this can be overridden by passing a string as the first argument. Additionally, the decorator will use the function's docstring as the tool's description - so a docstring MUST be provided. "
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "cc7005cd-072f-4d37-8453-6297468e5192",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"multiply\n",
"multiply(a: int, b: int) -> int - Multiply two numbers.\n",
"{'a': {'title': 'A', 'type': 'integer'}, 'b': {'title': 'B', 'type': 'integer'}}\n"
]
}
],
"source": [
"from langchain_core.tools import tool\n",
"\n",
"\n",
"@tool\n",
"def multiply(a: int, b: int) -> int:\n",
" \"\"\"Multiply two numbers.\"\"\"\n",
" return a * b\n",
"\n",
"\n",
"# Let's inspect some of the attributes associated with the tool.\n",
"print(multiply.name)\n",
"print(multiply.description)\n",
"print(multiply.args)"
]
},
{
"cell_type": "markdown",
"id": "96698b67-993a-4c97-b867-333132e1eb14",
"metadata": {},
"source": [
"Or create an **async** implementation, like this:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "0c0991db-b997-4611-be37-4346e660506b",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.tools import tool\n",
"\n",
"\n",
"@tool\n",
"async def amultiply(a: int, b: int) -> int:\n",
" \"\"\"Multiply two numbers.\"\"\"\n",
" return a * b"
]
},
{
"cell_type": "markdown",
"id": "98d6eee9",
"metadata": {},
"source": [
"You can also customize the tool name and JSON args by passing them into the tool decorator."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "9216d03a-f6ea-4216-b7e1-0661823a4c0b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"multiplication-tool\n",
"multiplication-tool(a: int, b: int) -> int - Multiply two numbers.\n",
"{'a': {'title': 'A', 'description': 'first number', 'type': 'integer'}, 'b': {'title': 'B', 'description': 'second number', 'type': 'integer'}}\n",
"True\n"
]
}
],
"source": [
"from langchain.pydantic_v1 import BaseModel, Field\n",
"\n",
"\n",
"class CalculatorInput(BaseModel):\n",
" a: int = Field(description=\"first number\")\n",
" b: int = Field(description=\"second number\")\n",
"\n",
"\n",
"@tool(\"multiplication-tool\", args_schema=CalculatorInput, return_direct=True)\n",
"def multiply(a: int, b: int) -> int:\n",
" \"\"\"Multiply two numbers.\"\"\"\n",
" return a * b\n",
"\n",
"\n",
"# Let's inspect some of the attributes associated with the tool.\n",
"print(multiply.name)\n",
"print(multiply.description)\n",
"print(multiply.args)\n",
"print(multiply.return_direct)"
]
},
{
"cell_type": "markdown",
"id": "b63fcc3b",
"metadata": {},
"source": [
"## StructuredTool\n",
"\n",
"The `StrurcturedTool.from_function` class method provides a bit more configurability than the `@tool` decorator, without requiring much additional code."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "564fbe6f-11df-402d-b135-ef6ff25e1e63",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"6\n",
"10\n"
]
}
],
"source": [
"from langchain_core.tools import StructuredTool\n",
"\n",
"\n",
"def multiply(a: int, b: int) -> int:\n",
" \"\"\"Multiply two numbers.\"\"\"\n",
" return a * b\n",
"\n",
"\n",
"async def amultiply(a: int, b: int) -> int:\n",
" \"\"\"Multiply two numbers.\"\"\"\n",
" return a * b\n",
"\n",
"\n",
"calculator = StructuredTool.from_function(func=multiply, coroutine=amultiply)\n",
"\n",
"print(calculator.invoke({\"a\": 2, \"b\": 3}))\n",
"print(await calculator.ainvoke({\"a\": 2, \"b\": 5}))"
]
},
{
"cell_type": "markdown",
"id": "26b3712a-b38d-4582-b6e6-bc7cfb1d6680",
"metadata": {},
"source": [
"To configure it:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "6bc055d4-1fbe-4db5-8881-9c382eba6b1b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"6\n",
"Calculator\n",
"Calculator(a: int, b: int) -> int - multiply numbers\n",
"{'a': {'title': 'A', 'description': 'first number', 'type': 'integer'}, 'b': {'title': 'B', 'description': 'second number', 'type': 'integer'}}\n"
]
}
],
"source": [
"class CalculatorInput(BaseModel):\n",
" a: int = Field(description=\"first number\")\n",
" b: int = Field(description=\"second number\")\n",
"\n",
"\n",
"def multiply(a: int, b: int) -> int:\n",
" \"\"\"Multiply two numbers.\"\"\"\n",
" return a * b\n",
"\n",
"\n",
"calculator = StructuredTool.from_function(\n",
" func=multiply,\n",
" name=\"Calculator\",\n",
" description=\"multiply numbers\",\n",
" args_schema=CalculatorInput,\n",
" return_direct=True,\n",
" # coroutine= ... <- you can specify an async method if desired as well\n",
")\n",
"\n",
"print(calculator.invoke({\"a\": 2, \"b\": 3}))\n",
"print(calculator.name)\n",
"print(calculator.description)\n",
"print(calculator.args)"
]
},
{
"cell_type": "markdown",
"id": "b840074b-9c10-4ca0-aed8-626c52b2398f",
"metadata": {},
"source": [
"## Subclass BaseTool\n",
"\n",
"You can define a custom tool by sub-classing from `BaseTool`. This provides maximal control over the tool definition, but requires writing more code."
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "1dad8f8e",
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional, Type\n",
"\n",
"from langchain.pydantic_v1 import BaseModel\n",
"from langchain_core.callbacks import (\n",
" AsyncCallbackManagerForToolRun,\n",
" CallbackManagerForToolRun,\n",
")\n",
"from langchain_core.tools import BaseTool\n",
"\n",
"\n",
"class CalculatorInput(BaseModel):\n",
" a: int = Field(description=\"first number\")\n",
" b: int = Field(description=\"second number\")\n",
"\n",
"\n",
"class CustomCalculatorTool(BaseTool):\n",
" name = \"Calculator\"\n",
" description = \"useful for when you need to answer questions about math\"\n",
" args_schema: Type[BaseModel] = CalculatorInput\n",
" return_direct: bool = True\n",
"\n",
" def _run(\n",
" self, a: int, b: int, run_manager: Optional[CallbackManagerForToolRun] = None\n",
" ) -> str:\n",
" \"\"\"Use the tool.\"\"\"\n",
" return a * b\n",
"\n",
" async def _arun(\n",
" self,\n",
" a: int,\n",
" b: int,\n",
" run_manager: Optional[AsyncCallbackManagerForToolRun] = None,\n",
" ) -> str:\n",
" \"\"\"Use the tool asynchronously.\"\"\"\n",
" # If the calculation is cheap, you can just delegate to the sync implementation\n",
" # as shown below.\n",
" # If the sync calculation is expensive, you should delete the entire _arun method.\n",
" # LangChain will automatically provide a better implementation that will\n",
" # kick off the task in a thread to make sure it doesn't block other async code.\n",
" return self._run(a, b, run_manager=run_manager.get_sync())"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "bb551c33",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Calculator\n",
"useful for when you need to answer questions about math\n",
"{'a': {'title': 'A', 'description': 'first number', 'type': 'integer'}, 'b': {'title': 'B', 'description': 'second number', 'type': 'integer'}}\n",
"True\n",
"6\n",
"6\n"
]
}
],
"source": [
"multiply = CustomCalculatorTool()\n",
"print(multiply.name)\n",
"print(multiply.description)\n",
"print(multiply.args)\n",
"print(multiply.return_direct)\n",
"\n",
"print(multiply.invoke({\"a\": 2, \"b\": 3}))\n",
"print(await multiply.ainvoke({\"a\": 2, \"b\": 3}))"
]
},
{
"cell_type": "markdown",
"id": "97aba6cc-4bdf-4fab-aff3-d89e7d9c3a09",
"metadata": {},
"source": [
"## How to create async tools\n",
"\n",
"LangChain Tools implement the [Runnable interface 🏃](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.Runnable.html).\n",
"\n",
"All Runnables expose the `invoke` and `ainvoke` methods (as well as other methods like `batch`, `abatch`, `astream` etc).\n",
"\n",
"So even if you only provide an `sync` implementation of a tool, you could still use the `ainvoke` interface, but there\n",
"are some important things to know:\n",
"\n",
"* LangChain's by default provides an async implementation that assumes that the function is expensive to compute, so it'll delegate execution to another thread.\n",
"* If you're working in an async codebase, you should create async tools rather than sync tools, to avoid incuring a small overhead due to that thread.\n",
"* If you need both sync and async implementations, use `StructuredTool.from_function` or sub-class from `BaseTool`.\n",
"* If implementing both sync and async, and the sync code is fast to run, override the default LangChain async implementation and simply call the sync code.\n",
"* You CANNOT and SHOULD NOT use the sync `invoke` with an `async` tool."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "6615cb77-fd4c-4676-8965-f92cc71d4944",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"6\n",
"10\n"
]
}
],
"source": [
"from langchain_core.tools import StructuredTool\n",
"\n",
"\n",
"def multiply(a: int, b: int) -> int:\n",
" \"\"\"Multiply two numbers.\"\"\"\n",
" return a * b\n",
"\n",
"\n",
"calculator = StructuredTool.from_function(func=multiply)\n",
"\n",
"print(calculator.invoke({\"a\": 2, \"b\": 3}))\n",
"print(\n",
" await calculator.ainvoke({\"a\": 2, \"b\": 5})\n",
") # Uses default LangChain async implementation incurs small overhead"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "bb2af583-eadd-41f4-a645-bf8748bd3dcd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"6\n",
"10\n"
]
}
],
"source": [
"from langchain_core.tools import StructuredTool\n",
"\n",
"\n",
"def multiply(a: int, b: int) -> int:\n",
" \"\"\"Multiply two numbers.\"\"\"\n",
" return a * b\n",
"\n",
"\n",
"async def amultiply(a: int, b: int) -> int:\n",
" \"\"\"Multiply two numbers.\"\"\"\n",
" return a * b\n",
"\n",
"\n",
"calculator = StructuredTool.from_function(func=multiply, coroutine=amultiply)\n",
"\n",
"print(calculator.invoke({\"a\": 2, \"b\": 3}))\n",
"print(\n",
" await calculator.ainvoke({\"a\": 2, \"b\": 5})\n",
") # Uses use provided amultiply without additional overhead"
]
},
{
"cell_type": "markdown",
"id": "c80ffdaa-e4ba-4a70-8500-32bf4f60cc1a",
"metadata": {},
"source": [
"You should not and cannot use `.invoke` when providing only an async definition."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "4ad0932c-8610-4278-8c57-f9218f654c8a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Raised not implemented error. You should not be doing this.\n"
]
}
],
"source": [
"@tool\n",
"async def multiply(a: int, b: int) -> int:\n",
" \"\"\"Multiply two numbers.\"\"\"\n",
" return a * b\n",
"\n",
"\n",
"try:\n",
" multiply.invoke({\"a\": 2, \"b\": 3})\n",
"except NotImplementedError:\n",
" print(\"Raised not implemented error. You should not be doing this.\")"
]
},
{
"cell_type": "markdown",
"id": "f9c746a7-88d7-4afb-bcb8-0e98b891e8b6",
"metadata": {},
"source": [
"## Handling Tool Errors \n",
"\n",
"If you're using tools with agents, you will likely need an error handling strategy, so the agent can recover from the error and continue execution.\n",
"\n",
"A simple strategy is to throw a `ToolException` from inside the tool and specify an error handler using `handle_tool_error`. \n",
"\n",
"When the error handler is specified, the exception will be caught and the error handler will decide which output to return from the tool.\n",
"\n",
"You can set `handle_tool_error` to `True`, a string value, or a function. If it's a function, the function should take a `ToolException` as a parameter and return a value.\n",
"\n",
"Please note that only raising a `ToolException` won't be effective. You need to first set the `handle_tool_error` of the tool because its default value is `False`."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "7094c0e8-6192-4870-a942-aad5b5ae48fd",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.tools import ToolException\n",
"\n",
"\n",
"def get_weather(city: str) -> int:\n",
" \"\"\"Get weather for the given city.\"\"\"\n",
" raise ToolException(f\"Error: There is no city by the name of {city}.\")"
]
},
{
"cell_type": "markdown",
"id": "9d93b217-1d44-4d31-8956-db9ea680ff4f",
"metadata": {},
"source": [
"Here's an example with the default `handle_tool_error=True` behavior."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "b4d22022-b105-4ccc-a15b-412cb9ea3097",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Error: There is no city by the name of foobar.'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"get_weather_tool = StructuredTool.from_function(\n",
" func=get_weather,\n",
" handle_tool_error=True,\n",
")\n",
"\n",
"get_weather_tool.invoke({\"city\": \"foobar\"})"
]
},
{
"cell_type": "markdown",
"id": "f91d6dc0-3271-4adc-a155-21f2e62ffa56",
"metadata": {},
"source": [
"We can set `handle_tool_error` to a string that will always be returned."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "3fad1728-d367-4e1b-9b54-3172981271cf",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"There is no such city, but it's probably above 0K there!\""
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"get_weather_tool = StructuredTool.from_function(\n",
" func=get_weather,\n",
" handle_tool_error=\"There is no such city, but it's probably above 0K there!\",\n",
")\n",
"\n",
"get_weather_tool.invoke({\"city\": \"foobar\"})"
]
},
{
"cell_type": "markdown",
"id": "b0a640c1-e08f-4413-83b6-f599f304935f",
"metadata": {},
"source": [
"Handling the error using a function:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "ebfe7c1f-318d-4e58-99e1-f31e69473c46",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The following errors occurred during tool execution: `Error: There is no city by the name of foobar.`'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def _handle_error(error: ToolException) -> str:\n",
" return f\"The following errors occurred during tool execution: `{error.args[0]}`\"\n",
"\n",
"\n",
"get_weather_tool = StructuredTool.from_function(\n",
" func=get_weather,\n",
" handle_tool_error=_handle_error,\n",
")\n",
"\n",
"get_weather_tool.invoke({\"city\": \"foobar\"})"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
},
"vscode": {
"interpreter": {
"hash": "e90c8aa204a57276aa905271aff2d11799d0acb3547adabc5892e639a5e45e34"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}