docs: update (#27520)

Collecting some feedback

---------

Co-authored-by: Lance Martin <lance@langchain.dev>
Author: Eugene Yurtsev
Date: 2024-10-21 21:37:51 -04:00 (committed by GitHub)
Commit: e442e485fe (parent 606225e774)
9 changed files with 220 additions and 226 deletions


@@ -1,40 +1,44 @@
# Agents

## What is an agent?

Many LLM applications implement a particular control flow of steps before and/or after LLM calls.
As an example, [RAG](/docs/concepts/rag/) performs [retrieval](/docs/concepts/retrieval/) of documents relevant to a question, and passes those documents to an LLM in order to ground the [model](/docs/concepts/chat_models/)'s response.

Instead of hard-coding a fixed control flow, we sometimes want LLM systems that can pick their own control flow to solve more complex problems!
This is one definition of an agent: an agent is a system that uses an LLM to decide the control flow of an application.
There are many ways that an LLM can control an application:

- An LLM can route between two potential paths (see the sketch below)
- An LLM can decide which of many tools to call
- An LLM can decide whether the generated answer is sufficient or more work is needed

![Agent types](/img/agent_types.png)

As a result, there are many different types of [agent architectures](https://blog.langchain.dev/what-is-a-cognitive-architecture/), which give an LLM varying levels of control.
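To make the routing idea concrete, here is a minimal sketch of a graph whose control flow is decided at runtime. It assumes `langgraph` is installed, and a stub function stands in for the LLM call that would normally make the routing decision:

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class State(TypedDict):
    question: str
    answer: str


def route(state: State) -> str:
    # In a real agent, an LLM call would pick the path; stubbed here.
    return "search" if "?" in state["question"] else "respond"


def search(state: State) -> dict:
    return {"answer": f"Searched for: {state['question']}"}


def respond(state: State) -> dict:
    return {"answer": "No search needed."}


builder = StateGraph(State)
builder.add_node("search", search)
builder.add_node("respond", respond)
builder.add_conditional_edges(START, route)  # the LLM-controlled decision point
builder.add_edge("search", END)
builder.add_edge("respond", END)
graph = builder.compile()

print(graph.invoke({"question": "What is LangGraph?"}))
```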
## LangGraph

[LangGraph](https://github.com/langchain-ai/langgraph) is an extension of LangChain specifically aimed at creating highly controllable and customizable agents.
The motivation of LangGraph is to help preserve high reliability as we give the agent more control over the application.

:::info[Further reading]

* See our LangGraph overview [here](https://langchain-ai.github.io/langgraph/concepts/high_level/#core-principles).
* See our LangGraph Academy Course [here](https://academy.langchain.com/courses/intro-to-langgraph).

:::

## Legacy Agent Concept: AgentExecutor

LangChain previously introduced the `AgentExecutor` as a runtime for agents.
While it served as an excellent starting point, its limitations became apparent when dealing with more sophisticated and customized agents.
As a result, we're gradually phasing out `AgentExecutor` in favor of more flexible solutions in LangGraph.

### Transitioning from AgentExecutor to LangGraph

If you're currently using `AgentExecutor`, don't worry! We've prepared resources to help you:

1. For those who still need to use `AgentExecutor`, we offer a comprehensive guide on [how to use AgentExecutor](/docs/how_to/agent_executor).
2. However, we strongly recommend transitioning to LangGraph for improved flexibility and control. To facilitate this transition, we've created a detailed [migration guide](/docs/how_to/migrate_agent) to help you move from `AgentExecutor` to LangGraph.
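For a flavor of what the LangGraph side looks like, here is a minimal sketch using the prebuilt ReAct-style agent. It assumes `langgraph` and `langchain-openai` are installed and an OpenAI API key is set; the tool and model choice are illustrative:

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent


@tool
def multiply(a: int, b: int) -> int:
    """Multiply a and b."""
    return a * b


model = ChatOpenAI(model="gpt-4o")
agent = create_react_agent(model, [multiply])

# The agent loops between the model and tools until it decides to finish.
result = agent.invoke({"messages": [("human", "What is 6 times 7?")]})
print(result["messages"][-1].content)
```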


@@ -3,7 +3,7 @@

:::info[Prerequisites]
* [Documents](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.base.Document.html)
:::

@@ -29,7 +29,7 @@ Embeddings allow search system to find relevant documents not just based on keyw

(2) **Measure similarity**: Embedding vectors can be compared using simple mathematical operations.

## Embedding

### Historical context

@@ -51,7 +51,6 @@ To navigate this variety, researchers and practitioners often turn to benchmarks

### LangChain Interface

LangChain provides a universal interface for working with embedding models, providing standard methods for common operations.
This common interface simplifies interaction with various embedding providers through two central methods:
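For instance, a minimal sketch of the two methods (using `langchain-openai` as an illustrative provider; any embeddings integration exposes the same interface):

```python
from langchain_openai import OpenAIEmbeddings

embeddings_model = OpenAIEmbeddings(model="text-embedding-3-small")

# Embed a list of documents: returns one vector per input text
document_embeddings = embeddings_model.embed_documents(
    ["Hi there!", "Oh, hello!", "What's your name?"]
)

# Embed a single query: returns one vector
query_embedding = embeddings_model.embed_query("What is the meaning of life?")
```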
@@ -90,9 +89,13 @@ query_embedding = embeddings_model.embed_query("What is the meaning of life?")

:::

### Available integrations

LangChain offers many embedding model integrations, which you can find on the [embedding models integrations page](/docs/integrations/text_embedding/).

## Measure similarity

Each embedding is essentially a set of coordinates, often in a high-dimensional space.
In this space, the position of each point (embedding) reflects the meaning of its corresponding text.
Just as similar words might be close to each other in a thesaurus, similar concepts end up close to each other in this embedding space.
This allows for intuitive comparisons between different pieces of text.

@@ -103,7 +106,8 @@ Some common similarity metrics include:

- **Euclidean Distance**: Measures the straight-line distance between two points.
- **Dot Product**: Measures the projection of one vector onto another.

The similarity metric should be chosen based on the model.
As an example, [OpenAI suggests cosine similarity for their embeddings](https://platform.openai.com/docs/guides/embeddings/which-distance-function-should-i-use), which can be easily implemented:

```python
import numpy as np

def cosine_similarity(vec1, vec2):
    """Compute the cosine similarity between two vectors."""
    dot_product = np.dot(vec1, vec2)
    return dot_product / (np.linalg.norm(vec1) * np.linalg.norm(vec2))

# Embed two texts to compare (illustrative; any embedding model works)
query_embedding = embeddings_model.embed_query("What is the meaning of life?")
document_embedding = embeddings_model.embed_query("The meaning of life is 42.")

similarity = cosine_similarity(query_embedding, document_embedding)
print("Cosine Similarity:", similarity)
```

@@ -126,40 +130,3 @@ print("Cosine Similarity:", similarity)
* See OpenAI's [FAQ](https://platform.openai.com/docs/guides/embeddings/faq) on what similarity metric to use with OpenAI embeddings.
:::


@@ -1,4 +1,13 @@
# LangGraph

[LangGraph](https://github.com/langchain-ai/langgraph) is an extension of LangChain specifically aimed at creating highly controllable and customizable agents.
The motivation of LangGraph is to help preserve high reliability as we give the agent more control over the application.

:::info[Further reading]

* See our LangGraph overview [here](https://langchain-ai.github.io/langgraph/concepts/high_level/#core-principles).
* See our LangGraph Academy Course [here](https://academy.langchain.com/courses/intro-to-langgraph).

:::
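As a quick taste of the API, here is a minimal sketch of building and running a one-node graph (assuming `langgraph` is installed; the state and node are illustrative):

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class State(TypedDict):
    count: int


def increment(state: State) -> dict:
    # Nodes return partial state updates that LangGraph merges in.
    return {"count": state["count"] + 1}


builder = StateGraph(State)
builder.add_node("increment", increment)
builder.add_edge(START, "increment")
builder.add_edge("increment", END)
graph = builder.compile()

print(graph.invoke({"count": 0}))  # {'count': 1}
```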


@@ -29,7 +29,7 @@ All retrievers implement a simple interface for retrieving documents using natur

## Interface

The only requirement for a retriever is the ability to accept a query and return documents.
In particular, [LangChain's retriever class](https://api.python.langchain.com/en/latest/retrievers/langchain_core.retrievers.BaseRetriever.html) only requires that the `_get_relevant_documents` method is implemented, which takes a `query: str` and returns a list of [Document](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.base.Document.html) objects that are most relevant to the query.
The underlying logic used to get relevant documents is specified by the retriever and can be whatever is most useful for the application.
A LangChain retriever is a [runnable](/docs/how_to/lcel_cheatsheet/), which is a standard interface for LangChain components.
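To illustrate, here is a minimal custom retriever sketch. The substring-matching logic is a placeholder; any logic that maps a query to documents works:

```python
from typing import List

from langchain_core.callbacks import CallbackManagerForRetrieverRun
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever


class ToyRetriever(BaseRetriever):
    """Return up to k documents that contain the query string."""

    documents: List[Document]
    k: int = 3

    def _get_relevant_documents(
        self, query: str, *, run_manager: CallbackManagerForRetrieverRun
    ) -> List[Document]:
        matches = [
            d for d in self.documents if query.lower() in d.page_content.lower()
        ]
        return matches[: self.k]


retriever = ToyRetriever(
    documents=[
        Document(page_content="Goldfish are popular pets for beginners."),
        Document(page_content="Cats are independent pets."),
    ]
)
# Retrievers are runnables, so they expose `invoke`.
print(retriever.invoke("pets"))
```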


@@ -1,3 +1,148 @@
# Structured Outputs

## Overview
For many applications, such as chatbots, models need to respond to users directly in natural language.
However, there are scenarios where we need models to output in a *structured format*.
For example, we might want to store the model output in a database and ensure that the output conforms to the database schema.
This need motivates the concept of structured output, where models can be instructed to respond with a particular output structure.
![Structured output](/img/structured_output.png)
## Key Concepts
**(1) Schema definition:** The output structure is represented as a schema, which can be defined in several ways.
**(2) Returning structured output:** The model is given this schema, and is instructed to return output that conforms to it.
## Recommended usage
This pseudo-code illustrates the recommended workflow when using structured output.
LangChain provides a method, [`with_structured_output()`](/docs/how_to/structured_output/#the-with_structured_output-method), that automates the process of binding the schema to the [model](/docs/concepts/chat_models/) and parsing the output.
This helper function is available for all model providers that support structured output.
```python
# Define schema
schema = {"foo": "bar"}
# Bind schema to model
model_with_structure = model.with_structured_output(schema)
# Invoke the model to produce structured output that matches the schema
structured_output = model_with_structure.invoke(user_input)
```
## Schema definition
The central concept is that the output structure of model responses needs to be represented in some way.
While the types of objects you can use depend on the model you're working with, there are common types that are typically allowed or recommended for structured output in Python.
The simplest and most common format for structured output is a JSON-like structure, which in Python can be represented as a dictionary (`dict`) or list (`list`).
JSON objects (or dicts in Python) are often used directly when the tool requires raw, flexible, and minimal-overhead structured data.
```json
{
    "answer": "The answer to the user's question",
    "followup_question": "A followup question the user could ask"
}
```
As a second example, [Pydantic](https://docs.pydantic.dev/latest/) is particularly useful for defining structured output schemas because it offers type hints and validation.
Here's an example of a Pydantic schema:
```python
from pydantic import BaseModel, Field
class ResponseFormatter(BaseModel):
    """Always use this tool to structure your response to the user."""

    answer: str = Field(description="The answer to the user's question")
    followup_question: str = Field(description="A followup question the user could ask")
```
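Schemas can also be defined in other ways, for example as a `TypedDict`. A minimal sketch, assuming the target model integration accepts `TypedDict` schemas and reads the `Annotated[type, default, description]` convention (`ResponseFormatterDict` is an illustrative name):

```python
from typing_extensions import Annotated, TypedDict


class ResponseFormatterDict(TypedDict):
    """Always use this tool to structure your response to the user."""

    answer: Annotated[str, ..., "The answer to the user's question"]
    followup_question: Annotated[str, ..., "A followup question the user could ask"]
```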
## Returning structured output
With a schema defined, we need a way to instruct the model to use it.
While one approach is to include this schema in the prompt and *ask nicely* for the model to use it, this is not recommended.
Several more powerful methods that utilize native features in the model provider's API are available.
### Using Tool Calling
Many [model providers support](/docs/integrations/chat/) tool calling, a concept discussed in more detail in our [tool calling guide](/docs/concepts/tool_calling/).
In short, tool calling involves binding a tool to a model and, when appropriate, the model can *decide* to call this tool and ensure its response conforms to the tool's schema.
With this in mind, the central concept is straightforward: *simply bind our schema to a model as a tool!*
Here is an example using the `ResponseFormatter` schema defined above:
```python
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o", temperature=0)
# Bind ResponseFormatter schema as a tool to the model
model_with_tools = model.bind_tools([ResponseFormatter])
# Invoke the model
ai_msg = model_with_tools.invoke("What is the powerhouse of the cell?")
```
The arguments of the tool call are already extracted as a dictionary.
This dictionary can be optionally parsed into a Pydantic object, matching our original `ResponseFormatter` schema.
```python
# Get the tool call arguments
ai_msg.tool_calls[0]["args"]
{'answer': "The powerhouse of the cell is the mitochondrion. Mitochondria are organelles that generate most of the cell's supply of adenosine triphosphate (ATP), which is used as a source of chemical energy.",
'followup_question': 'What is the function of ATP in the cell?'}
# Parse the dictionary into a Pydantic object
pydantic_object = ResponseFormatter.model_validate(ai_msg.tool_calls[0]["args"])
```
### JSON mode
In addition to tool calling, some model providers support a feature called `JSON mode`.
This takes a JSON schema definition as input and forces the model to produce a conforming JSON output.
You can find a table of model providers that support JSON mode [here](/docs/integrations/chat/).
Here is an example of how to use JSON mode with OpenAI:
```python
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o", model_kwargs={ "response_format": { "type": "json_object" } })
ai_msg = model.invoke("Return a JSON object with key 'random_ints' and a value of 10 random ints in [0-99]")
ai_msg.content
'\n{\n "random_ints": [23, 47, 89, 15, 34, 76, 58, 3, 62, 91]\n}'
```
One important point to flag: the model *still* returns a string, which needs to be parsed into a JSON object.
This can, of course, simply use the `json` library or a JSON output parser if you need more advanced functionality.
See this [how-to guide on the JSON output parser](/docs/how_to/output_parser_json) for more details.
```python
import json
json_object = json.loads(ai_msg.content)
{'random_ints': [23, 47, 89, 15, 34, 76, 58, 3, 62, 91]}
```
## Structured output method
There are a few challenges when producing structured output with the above methods:

(1) If using tool calling, the tool call arguments need to be parsed from a dictionary back to the original schema.

(2) In addition, the model needs to be instructed to *always* use the tool when we want to enforce structured output, which is a provider-specific setting.

(3) If using JSON mode, the output needs to be parsed into a JSON object.
With these challenges in mind, LangChain provides a helper function (`with_structured_output()`) to streamline the process.
![Diagram of with structured output](/img/with_structured_output.png)
This both binds the schema to the model as a tool and parses the output to the specified output schema.
```python
# Bind the schema to the model
model_with_structure = model.with_structured_output(ResponseFormatter)
# Invoke the model
structured_output = model_with_structure.invoke("What is the powerhouse of the cell?")
# Get back the Pydantic object
structured_output
ResponseFormatter(answer="The powerhouse of the cell is the mitochondrion. Mitochondria are organelles that generate most of the cell's supply of adenosine triphosphate (ATP), which is used as a source of chemical energy.", followup_question='What is the function of ATP in the cell?')
```
:::info[Further reading]
For more details on usage, see our [how-to guide](/docs/how_to/structured_output/#the-with_structured_output-method).
:::


@@ -30,7 +30,7 @@ You will sometimes hear the term `function calling`. We use this term interchang

## Recommended usage

This pseudo-code illustrates the recommended workflow for using tool calling.
Created tools are passed to the `.bind_tools()` method as a list.
The model can be called as usual. If a tool call is made, the model's response will contain the tool call arguments.
The tool call arguments can be passed directly to the tool.

@@ -39,11 +39,9 @@ The tool call arguments can be passed directly to the tool.

```python
# Tool creation
tools = [my_tool]
# Tool binding
model_with_tools = model.bind_tools(tools)
# Tool calling
response = model_with_tools.invoke(user_input)
```

## Tool Creation

@@ -59,10 +57,13 @@ def multiply(a: int, b: int) -> int:

    return a * b
```

:::info[Further reading]

* See our conceptual guide on [tools](/docs/concepts/tools/) for more details.
* See our [model integrations](/docs/integrations/chat/) that support tool calling.
* See our [how-to guide](/docs/how_to/tool_calling/) on tool calling.

:::

## Tool Binding
@@ -125,9 +126,20 @@ For more details on usage, see our [how-to guides](/docs/how_to/#tools)!

## Tool execution

[Tools](/docs/concepts/tools/) implement the [Runnable](/docs/concepts/runnables/) interface, which means that they can be invoked directly (e.g., `tool.invoke(args)`).
[LangGraph](https://langchain-ai.github.io/langgraph/) offers pre-built components (e.g., [`ToolNode`](https://langchain-ai.github.io/langgraph/reference/prebuilt/#toolnode)) that will often invoke the tool on behalf of the user.
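For example, a minimal sketch of direct tool invocation, using an illustrative `multiply` tool:

```python
from langchain_core.tools import tool


@tool
def multiply(a: int, b: int) -> int:
    """Multiply a and b."""
    return a * b


# Tools are runnables: pass a dict of arguments to `invoke`,
# e.g. the `args` of a tool call produced by the model.
print(multiply.invoke({"a": 2, "b": 3}))  # -> 6
```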
:::info[Further reading]
* See our [how-to guide](/docs/how_to/tool_calling/) on tool calling.
* See the [LangGraph documentation on using ToolNode](https://langchain-ai.github.io/langgraph/how-tos/tool-calling/).
:::
## Best practices

When designing [tools](/docs/concepts/tools/) to be used by a model, it is important to keep in mind that:

* Models that have explicit [tool-calling APIs](/docs/concepts/#functiontool-calling) will be better at tool calling than non-fine-tuned models.
* Models will perform better if the tools have well-chosen names and descriptions.

Binary files changed:
* docs/static/img/agent_types.png: added (71 KiB)
* one other binary file changed (82 KiB before; not shown)