anthropic: support built-in tools, improve docs (#30274)

- Support features from recent update:
https://www.anthropic.com/news/token-saving-updates (mostly adding
support for built-in tools in `bind_tools`)
- Add documentation around prompt caching, token-efficient tool use, and
built-in tools.
Author: ccurme
Date: 2025-03-14 12:18:50 -04:00
Committed by: GitHub
Parent: f27e2d7ce7
Commit: 226f29bc96
7 changed files with 629 additions and 30 deletions

@@ -102,6 +102,16 @@
"%pip install -qU langchain-anthropic"
]
},
{
"cell_type": "markdown",
"id": "fe4993ad-4a9b-4021-8ebd-f0fbbc739f49",
"metadata": {},
"source": [
":::info This guide requires ``langchain-anthropic>=0.3.10``\n",
"\n",
":::"
]
},
{
"cell_type": "markdown",
"id": "a38cde65-254d-4219-a441-068766c0d4b5",
@@ -245,7 +255,7 @@
"source": [
"## Content blocks\n",
"\n",
"One key difference to note between Anthropic models and most others is that the contents of a single Anthropic AI message can either be a single string or a **list of content blocks**. For example when an Anthropic model invokes a tool, the tool invocation is part of the message content (as well as being exposed in the standardized `AIMessage.tool_calls`):"
"Content from a single Anthropic AI message can either be a single string or a **list of content blocks**. For example when an Anthropic model invokes a tool, the tool invocation is part of the message content (as well as being exposed in the standardized `AIMessage.tool_calls`):"
]
},
{
@@ -368,6 +378,377 @@
"print(json.dumps(response.content, indent=2))"
]
},
{
"cell_type": "markdown",
"id": "34349dfe-5d81-4887-a4f4-cd01e9587cdc",
"metadata": {},
"source": [
"## Prompt caching\n",
"\n",
"Anthropic supports [caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching) of [elements of your prompts](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#what-can-be-cached), including messages, tool definitions, tool results, images and documents. This allows you to re-use large documents, instructions, [few-shot documents](/docs/concepts/few_shot_prompting/), and other data to reduce latency and costs.\n",
"\n",
"To enable caching on an element of a prompt, mark its associated content block using the `cache_control` key. See examples below:\n",
"\n",
"### Messages"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "babb44a5-33f7-4200-9dfc-be867cf2c217",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"First invocation:\n",
"{'cache_read': 0, 'cache_creation': 1458}\n",
"\n",
"Second:\n",
"{'cache_read': 1458, 'cache_creation': 0}\n"
]
}
],
"source": [
"import requests\n",
"from langchain_anthropic import ChatAnthropic\n",
"\n",
"llm = ChatAnthropic(model=\"claude-3-7-sonnet-20250219\")\n",
"\n",
"# Pull LangChain readme\n",
"get_response = requests.get(\n",
" \"https://raw.githubusercontent.com/langchain-ai/langchain/master/README.md\"\n",
")\n",
"readme = get_response.text\n",
"\n",
"messages = [\n",
" {\n",
" \"role\": \"system\",\n",
" \"content\": [\n",
" {\n",
" \"type\": \"text\",\n",
" \"text\": \"You are a technology expert.\",\n",
" },\n",
" {\n",
" \"type\": \"text\",\n",
" \"text\": f\"{readme}\",\n",
" # highlight-next-line\n",
" \"cache_control\": {\"type\": \"ephemeral\"},\n",
" },\n",
" ],\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": \"What's LangChain, according to its README?\",\n",
" },\n",
"]\n",
"\n",
"response_1 = llm.invoke(messages)\n",
"response_2 = llm.invoke(messages)\n",
"\n",
"usage_1 = response_1.usage_metadata[\"input_token_details\"]\n",
"usage_2 = response_2.usage_metadata[\"input_token_details\"]\n",
"\n",
"print(f\"First invocation:\\n{usage_1}\")\n",
"print(f\"\\nSecond:\\n{usage_2}\")"
]
},
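{
"cell_type": "markdown",
"metadata": {},
"source": [
"On the first invocation the prompt prefix is written to the cache; on the second it is read back. Note that Anthropic's ephemeral cache has a short time-to-live (about five minutes at the time of writing), refreshed each time the cached content is used."
]
},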
{
"cell_type": "markdown",
"id": "141ce9c5-012d-4502-9d61-4a413b5d959a",
"metadata": {},
"source": [
"### Tools"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "1de82015-810f-4ed4-a08b-9866ea8746ce",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"First invocation:\n",
"{'cache_read': 0, 'cache_creation': 1809}\n",
"\n",
"Second:\n",
"{'cache_read': 1809, 'cache_creation': 0}\n"
]
}
],
"source": [
"from langchain_anthropic import convert_to_anthropic_tool\n",
"from langchain_core.tools import tool\n",
"\n",
"# For demonstration purposes, we artificially expand the\n",
"# tool description.\n",
"description = (\n",
" f\"Get the weather at a location. By the way, check out this readme: {readme}\"\n",
")\n",
"\n",
"\n",
"@tool(description=description)\n",
"def get_weather(location: str) -> str:\n",
" return \"It's sunny.\"\n",
"\n",
"\n",
"# Enable caching on the tool\n",
"# highlight-start\n",
"weather_tool = convert_to_anthropic_tool(get_weather)\n",
"weather_tool[\"cache_control\"] = {\"type\": \"ephemeral\"}\n",
"# highlight-end\n",
"\n",
"llm = ChatAnthropic(model=\"claude-3-7-sonnet-20250219\")\n",
"llm_with_tools = llm.bind_tools([weather_tool])\n",
"query = \"What's the weather in San Francisco?\"\n",
"\n",
"response_1 = llm_with_tools.invoke(query)\n",
"response_2 = llm_with_tools.invoke(query)\n",
"\n",
"usage_1 = response_1.usage_metadata[\"input_token_details\"]\n",
"usage_2 = response_2.usage_metadata[\"input_token_details\"]\n",
"\n",
"print(f\"First invocation:\\n{usage_1}\")\n",
"print(f\"\\nSecond:\\n{usage_2}\")"
]
},
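{
"cell_type": "markdown",
"metadata": {},
"source": [
"Tool results, images and documents can also be cached. As a minimal sketch (assuming the `llm` and `readme` objects from the examples above, and Anthropic's plain-text document block format), a document content block can be marked for caching like this:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: caching a plain-text document content block.\n",
"# Assumes `llm` and `readme` from the examples above.\n",
"messages = [\n",
"    {\n",
"        \"role\": \"user\",\n",
"        \"content\": [\n",
"            {\n",
"                \"type\": \"document\",\n",
"                \"source\": {\n",
"                    \"type\": \"text\",\n",
"                    \"media_type\": \"text/plain\",\n",
"                    \"data\": readme,\n",
"                },\n",
"                # highlight-next-line\n",
"                \"cache_control\": {\"type\": \"ephemeral\"},\n",
"            },\n",
"            {\"type\": \"text\", \"text\": \"Summarize this document.\"},\n",
"        ],\n",
"    }\n",
"]\n",
"\n",
"response = llm.invoke(messages)\n",
"print(response.usage_metadata[\"input_token_details\"])"
]
},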
{
"cell_type": "markdown",
"id": "a763830d-82cb-448a-ab30-f561522791b9",
"metadata": {},
"source": [
"### Incremental caching in conversational applications\n",
"\n",
"Prompt caching can be used in [multi-turn conversations](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#continuing-a-multi-turn-conversation) to maintain context from earlier messages without redundant processing.\n",
"\n",
"We can enable incremental caching by marking the final message with `cache_control`. Claude will automatically use the longest previously-cacched prefix for follow-up messages.\n",
"\n",
"Below, we implement a simple chatbot that incorporates this feature. We follow the LangChain [chatbot tutorial](/docs/tutorials/chatbot/), but add a custom [reducer](https://langchain-ai.github.io/langgraph/concepts/low_level/#reducers) that automatically marks the last content block in each user message with `cache_control`. See below:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "07fde4db-344c-49bc-a5b4-99e2d20fb394",
"metadata": {},
"outputs": [],
"source": [
"import requests\n",
"from langchain_anthropic import ChatAnthropic\n",
"from langgraph.checkpoint.memory import MemorySaver\n",
"from langgraph.graph import START, StateGraph, add_messages\n",
"from typing_extensions import Annotated, TypedDict\n",
"\n",
"llm = ChatAnthropic(model=\"claude-3-7-sonnet-20250219\")\n",
"\n",
"# Pull LangChain readme\n",
"get_response = requests.get(\n",
" \"https://raw.githubusercontent.com/langchain-ai/langchain/master/README.md\"\n",
")\n",
"readme = get_response.text\n",
"\n",
"\n",
"def messages_reducer(left: list, right: list) -> list:\n",
" # Update last user message\n",
" for i in range(len(right) - 1, -1, -1):\n",
" if right[i].type == \"human\":\n",
" right[i].content[-1][\"cache_control\"] = {\"type\": \"ephemeral\"}\n",
" break\n",
"\n",
" return add_messages(left, right)\n",
"\n",
"\n",
"class State(TypedDict):\n",
" messages: Annotated[list, messages_reducer]\n",
"\n",
"\n",
"workflow = StateGraph(state_schema=State)\n",
"\n",
"\n",
"# Define the function that calls the model\n",
"def call_model(state: State):\n",
" response = llm.invoke(state[\"messages\"])\n",
" return {\"messages\": [response]}\n",
"\n",
"\n",
"# Define the (single) node in the graph\n",
"workflow.add_edge(START, \"model\")\n",
"workflow.add_node(\"model\", call_model)\n",
"\n",
"# Add memory\n",
"memory = MemorySaver()\n",
"app = workflow.compile(checkpointer=memory)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "40013035-eb22-4327-8aaf-1ee974d9ff46",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
"\n",
"Hello, Bob! It's nice to meet you. How are you doing today? Is there something I can help you with?\n",
"\n",
"{'cache_read': 0, 'cache_creation': 0}\n"
]
}
],
"source": [
"from langchain_core.messages import HumanMessage\n",
"\n",
"config = {\"configurable\": {\"thread_id\": \"abc123\"}}\n",
"\n",
"query = \"Hi! I'm Bob.\"\n",
"\n",
"input_message = HumanMessage([{\"type\": \"text\", \"text\": query}])\n",
"output = app.invoke({\"messages\": [input_message]}, config)\n",
"output[\"messages\"][-1].pretty_print()\n",
"print(f'\\n{output[\"messages\"][-1].usage_metadata[\"input_token_details\"]}')"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "22371f68-7913-4c4f-ab4a-2b4265095469",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
"\n",
"I can see you've shared the README from the LangChain GitHub repository. This is the documentation for LangChain, which is a popular framework for building applications powered by Large Language Models (LLMs). Here's a summary of what the README contains:\n",
"\n",
"LangChain is:\n",
"- A framework for developing LLM-powered applications\n",
"- Helps chain together components and integrations to simplify AI application development\n",
"- Provides a standard interface for models, embeddings, vector stores, etc.\n",
"\n",
"Key features/benefits:\n",
"- Real-time data augmentation (connect LLMs to diverse data sources)\n",
"- Model interoperability (swap models easily as needed)\n",
"- Large ecosystem of integrations\n",
"\n",
"The LangChain ecosystem includes:\n",
"- LangSmith - For evaluations and observability\n",
"- LangGraph - For building complex agents with customizable architecture\n",
"- LangGraph Platform - For deployment and scaling of agents\n",
"\n",
"The README also mentions installation instructions (`pip install -U langchain`) and links to various resources including tutorials, how-to guides, conceptual guides, and API references.\n",
"\n",
"Is there anything specific about LangChain you'd like to know more about, Bob?\n",
"\n",
"{'cache_read': 0, 'cache_creation': 1498}\n"
]
}
],
"source": [
"query = f\"Check out this readme: {readme}\"\n",
"\n",
"input_message = HumanMessage([{\"type\": \"text\", \"text\": query}])\n",
"output = app.invoke({\"messages\": [input_message]}, config)\n",
"output[\"messages\"][-1].pretty_print()\n",
"print(f'\\n{output[\"messages\"][-1].usage_metadata[\"input_token_details\"]}')"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "0e6798fc-8a80-4324-b4e3-f18706256c61",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
"\n",
"Your name is Bob. You introduced yourself at the beginning of our conversation.\n",
"\n",
"{'cache_read': 1498, 'cache_creation': 269}\n"
]
}
],
"source": [
"query = \"What was my name again?\"\n",
"\n",
"input_message = HumanMessage([{\"type\": \"text\", \"text\": query}])\n",
"output = app.invoke({\"messages\": [input_message]}, config)\n",
"output[\"messages\"][-1].pretty_print()\n",
"print(f'\\n{output[\"messages\"][-1].usage_metadata[\"input_token_details\"]}')"
]
},
{
"cell_type": "markdown",
"id": "aa4b3647-c672-4782-a88c-a55fd3bf969f",
"metadata": {},
"source": [
"In the [LangSmith trace](https://smith.langchain.com/public/4d0584d8-5f9e-4b91-8704-93ba2ccf416a/r), toggling \"raw output\" will show exactly what messages are sent to the chat model, including `cache_control` keys."
]
},
{
"cell_type": "markdown",
"id": "029009f2-2795-418b-b5fc-fb996c6fe99e",
"metadata": {},
"source": [
"## Token-efficient tool use\n",
"\n",
"Anthropic supports a (beta) [token-efficient tool use](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/token-efficient-tool-use) feature. To use it, specify the relevant beta-headers when instantiating the model."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "206cff65-33b8-4a88-9b1a-050b4d57772a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[{'name': 'get_weather', 'args': {'location': 'San Francisco'}, 'id': 'toolu_01EoeE1qYaePcmNbUvMsWtmA', 'type': 'tool_call'}]\n",
"\n",
"Total tokens: 408\n"
]
}
],
"source": [
"from langchain_anthropic import ChatAnthropic\n",
"from langchain_core.tools import tool\n",
"\n",
"llm = ChatAnthropic(\n",
" model=\"claude-3-7-sonnet-20250219\",\n",
" temperature=0,\n",
" # highlight-start\n",
" model_kwargs={\n",
" \"extra_headers\": {\"anthropic-beta\": \"token-efficient-tools-2025-02-19\"}\n",
" },\n",
" # highlight-end\n",
")\n",
"\n",
"\n",
"@tool\n",
"def get_weather(location: str) -> str:\n",
" \"\"\"Get the weather at a location.\"\"\"\n",
" return \"It's sunny.\"\n",
"\n",
"\n",
"llm_with_tools = llm.bind_tools([get_weather])\n",
"response = llm_with_tools.invoke(\"What's the weather in San Francisco?\")\n",
"print(response.tool_calls)\n",
"print(f'\\nTotal tokens: {response.usage_metadata[\"total_tokens\"]}')"
]
},
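{
"cell_type": "markdown",
"metadata": {},
"source": [
"For a rough comparison, the sketch below runs the same request without the beta header; it will typically consume more tokens (exact counts vary between runs):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: the same request without the token-efficient tools beta header.\n",
"llm_plain = ChatAnthropic(model=\"claude-3-7-sonnet-20250219\", temperature=0)\n",
"\n",
"response = llm_plain.bind_tools([get_weather]).invoke(\n",
"    \"What's the weather in San Francisco?\"\n",
")\n",
"print(f'Total tokens: {response.usage_metadata[\"total_tokens\"]}')"
]
},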
{
"cell_type": "markdown",
"id": "301d372f-4dec-43e6-b58c-eee25633e1a6",
@@ -525,6 +906,58 @@
"response.content"
]
},
{
"cell_type": "markdown",
"id": "cbfec7a9-d9df-4d12-844e-d922456dd9bf",
"metadata": {},
"source": [
"## Built-in tools\n",
"\n",
"Anthropic supports a variety of [built-in tools](https://docs.anthropic.com/en/docs/build-with-claude/tool-use/text-editor-tool), which can be bound to the model in the [usual way](/docs/how_to/tool_calling/). Claude will generate tool calls adhering to its internal schema for the tool:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "30a0af36-2327-4b1d-9ba5-e47cb72db0be",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"I'd be happy to help you fix the syntax error in your primes.py file. First, let's look at the current content of the file to identify the error.\n"
]
},
{
"data": {
"text/plain": [
"[{'name': 'str_replace_editor',\n",
" 'args': {'command': 'view', 'path': '/repo/primes.py'},\n",
" 'id': 'toolu_01VdNgt1YV7kGfj9LFLm6HyQ',\n",
" 'type': 'tool_call'}]"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_anthropic import ChatAnthropic\n",
"\n",
"llm = ChatAnthropic(model=\"claude-3-7-sonnet-20250219\")\n",
"\n",
"tool = {\"type\": \"text_editor_20250124\", \"name\": \"str_replace_editor\"}\n",
"llm_with_tools = llm.bind_tools([tool])\n",
"\n",
"response = llm_with_tools.invoke(\n",
" \"There's a syntax error in my primes.py file. Can you help me fix it?\"\n",
")\n",
"print(response.text())\n",
"response.tool_calls"
]
},
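{
"cell_type": "markdown",
"metadata": {},
"source": [
"From here, an application would execute the editor command itself (here, viewing `primes.py`) and pass the result back as a tool message. A minimal sketch of the follow-up turn, using hypothetical file contents:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.messages import HumanMessage, ToolMessage\n",
"\n",
"# Hypothetical file contents (note the missing colon on the first line)\n",
"file_contents = \"def is_prime(n)\\n    return n > 1\"\n",
"\n",
"messages = [\n",
"    HumanMessage(\n",
"        \"There's a syntax error in my primes.py file. Can you help me fix it?\"\n",
"    ),\n",
"    response,  # AIMessage containing the tool call\n",
"    ToolMessage(content=file_contents, tool_call_id=response.tool_calls[0][\"id\"]),\n",
"]\n",
"follow_up = llm_with_tools.invoke(messages)\n",
"print(follow_up.text())"
]
},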
{
"cell_type": "markdown",
"id": "3a5bb5ca-c3ae-4a58-be67-2cd18574b9a3",