Compare commits

25 Commits

Author SHA1 Message Date
Eugene Yurtsev
a5cfcbc849 Merge branch 'master' into eugene/foo_foo 2024-10-24 13:46:20 -04:00
Tibor Reiss
20b56a0233 core[patch]: fix repr and str for Serializable (#26786)
Fixes #26499

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-10-24 08:36:35 -07:00
Adarsh Sahu
2d58a8a08d docs: Update structured_outputs.mdx (#27613)
`strightforward` => `straightforward`
`adavanced` => `advanced`
`There a few challenges` => `There are a few challenges`

Documentation Correction:

* [`docs/docs/concepts/structured_output.mdx`]: Corrected several typos
in the sentence directing users to the API reference.
2024-10-24 15:13:28 +00:00
Daniel Vu Dao
da6b526770 docs: Update Runnable documentation (#27606)
**Description**
Adds better code formatting for one of the docs.
2024-10-24 15:05:43 +00:00
QiQi
133c1b4f76 docs: Update passthrough.ipynb -- Grammar correction (#27601)
Grammar correction needed in passthrough.ipynb
The sentence is:

"Now you've learned how to pass data through your chains to help to help
format the data flowing through your chains."

There's a redundant "to help", and it could be more succinctly written
as:

"Now you've learned how to pass data through your chains to help format
the data flowing through your chains."
2024-10-24 15:05:06 +00:00
hippopond
61897aef90 docs: Fix for spelling mistake (#27599)
Fixes #26009

- [x] **PR title**: "docs: Correcting spelling mistake"


- [x] **PR message**: 
    - **Description:** Corrected spelling from "trianed" to "trained"
    - **Issue:** #26009
    - **Dependencies:** NA
    - **Twitter handle:** NA

Co-authored-by: Libby Lin <libbylin@Libbys-MacBook-Pro.local>
2024-10-24 15:04:18 +00:00
Eugene Yurtsev
7347ec8251 x 2024-10-24 10:20:42 -04:00
Eugene Yurtsev
d081a5400a docs: fix more links (#27598)
Fix more links
2024-10-23 21:26:38 -04:00
Lei Zhang
f203229b51 community: Fix the failure of ChatSparkLLM after upgrading to Pydantic V2 (#27418)
**Description:**

The test in test_sparkllm.py reproduces this issue:


https://github.com/langchain-ai/langchain/blob/master/libs/community/tests/integration_tests/chat_models/test_sparkllm.py#L66

```
Testing started at 18:27 ...
Launching pytest with arguments test_sparkllm.py::test_chat_spark_llm --no-header --no-summary -q in /Users/zhanglei/Work/github/langchain/libs/community/tests/integration_tests/chat_models

============================= test session starts ==============================
collecting ... collected 1 item

test_sparkllm.py::test_chat_spark_llm 

============================== 1 failed in 0.45s ===============================
FAILED                             [100%]
tests/integration_tests/chat_models/test_sparkllm.py:65 (test_chat_spark_llm)
def test_chat_spark_llm() -> None:
>       chat = ChatSparkLLM(
            spark_app_id="your spark_app_id",
            spark_api_key="your spark_api_key",
            spark_api_secret="your spark_api_secret",
        )  # type: ignore[call-arg]

test_sparkllm.py:67: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../../core/langchain_core/load/serializable.py:111: in __init__
    super().__init__(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'langchain_community.chat_models.sparkllm.ChatSparkLLM'>
values = {'spark_api_key': 'your spark_api_key', 'spark_api_secret': 'your spark_api_secret', 'spark_api_url': 'wss://spark-api.xf-yun.com/v3.5/chat', 'spark_app_id': 'your spark_app_id', ...}

    @model_validator(mode="before")
    @classmethod
    def validate_environment(cls, values: Dict) -> Any:
        values["spark_app_id"] = get_from_dict_or_env(
            values,
            ["spark_app_id", "app_id"],
            "IFLYTEK_SPARK_APP_ID",
        )
        values["spark_api_key"] = get_from_dict_or_env(
            values,
            ["spark_api_key", "api_key"],
            "IFLYTEK_SPARK_API_KEY",
        )
        values["spark_api_secret"] = get_from_dict_or_env(
            values,
            ["spark_api_secret", "api_secret"],
            "IFLYTEK_SPARK_API_SECRET",
        )
        values["spark_api_url"] = get_from_dict_or_env(
            values,
            "spark_api_url",
            "IFLYTEK_SPARK_API_URL",
            SPARK_API_URL,
        )
        values["spark_llm_domain"] = get_from_dict_or_env(
            values,
            "spark_llm_domain",
            "IFLYTEK_SPARK_LLM_DOMAIN",
            SPARK_LLM_DOMAIN,
        )
    
        # put extra params into model_kwargs
        default_values = {
            name: field.default
            for name, field in get_fields(cls).items()
            if field.default is not None
        }
>       values["model_kwargs"]["temperature"] = default_values.get("temperature")
E       KeyError: 'model_kwargs'

../../../langchain_community/chat_models/sparkllm.py:368: KeyError
``` 

I found that when upgrading to Pydantic v2, `@root_validator` was changed
to `@model_validator`. When a class declares multiple
`@model_validator(mode="before")` validators, the execution order in V1
and V2 is opposite. This is the reason for ChatSparkLLM's failure.

The correct execution order is to execute build_extra first.


https://github.com/langchain-ai/langchain/blob/langchain%3D%3D0.2.16/libs/community/langchain_community/chat_models/sparkllm.py#L302

And then execute validate_environment.


https://github.com/langchain-ai/langchain/blob/langchain%3D%3D0.2.16/libs/community/langchain_community/chat_models/sparkllm.py#L329

The Pydantic community has also discussed this, but there hasn't been a
conclusion yet: https://github.com/pydantic/pydantic/discussions/7434
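
To see the ordering difference in isolation, here is a minimal
standalone sketch (not code from this PR; it assumes the
reverse-ordering behavior of Pydantic v2 described above):

```python
from pydantic import BaseModel, model_validator  # Pydantic v2


class Demo(BaseModel):
    x: int = 0

    @model_validator(mode="before")
    @classmethod
    def defined_first(cls, values):
        print("defined_first")
        return values

    @model_validator(mode="before")
    @classmethod
    def defined_second(cls, values):
        print("defined_second")
        return values


# Under Pydantic v2 this prints "defined_second" first -- the reverse of
# Pydantic v1's @root_validator(pre=True) order -- which is why
# validate_environment ran before build_extra had set up model_kwargs.
Demo()
```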

**Issue:** #27416 

**Twitter handle:** coolbeevip

---------

Co-authored-by: vbarda <vadym@langchain.dev>
2024-10-23 21:17:10 -04:00
Andrew Effendi
8f151223ad Community: Fix DuckDuckGo search tool Output Format (#27479)
**Issue:** https://github.com/langchain-ai/langchain/issues/22961
**Description:**

Previously, the documentation for `DuckDuckGoSearchResults` said that it
returns a JSON string; however, the code returns a regular string that
can't be parsed as-is. For example, running

```python
from langchain_community.tools import DuckDuckGoSearchResults

# Create a DuckDuckGo search instance
search = DuckDuckGoSearchResults()

# Invoke the search
result = search.invoke("Obama")

# Print the result
print(result)
# Print the type of the result
print("Result Type:", type(result))
```
will return
```
snippet: Harris will hold a campaign event with former President Barack Obama in Georgia next Thursday, the first time the pair has campaigned side by side, a senior campaign official said. A week from ..., title: Obamas to hit the campaign trail in first joint appearances with Harris, link: https://www.nbcnews.com/politics/2024-election/obamas-hit-campaign-trail-first-joint-appearances-harris-rcna176034, snippet: Item 1 of 3 Former U.S. first lady Michelle Obama and her husband, former U.S. President Barack Obama, stand on stage during Day 2 of the Democratic National Convention (DNC) in Chicago, Illinois ..., title: Obamas set to hit campaign trail with Kamala Harris for first time, link: https://www.reuters.com/world/us/obamas-set-hit-campaign-trail-with-kamala-harris-first-time-2024-10-18/, snippet: Barack and Michelle Obama will make their first campaign appearances alongside Kamala Harris at rallies in Georgia and Michigan. By Reid J. Epstein Reporting from Ashwaubenon, Wis. Here come the ..., title: Harris Will Join Michelle Obama and Barack Obama on Campaign Trail, link: https://www.nytimes.com/2024/10/18/us/politics/kamala-harris-michelle-obama-barack-obama.html, snippet: Obama's leaving office was "a turning point," Mirsky said. "That was the last time anybody felt normal." A few feet over, a 64-year-old physics professor named Eric Swanson who had grown ..., title: Obama's reemergence on the campaign trail for Harris comes as he ..., link: https://www.cnn.com/2024/10/13/politics/obama-campaign-trail-harris-biden/index.html
Result Type: <class 'str'>
```

After the change in this PR, `DuckDuckGoSearchResults` takes an
additional `output_format = "list" | "json" | "string"` parameter
("string" = current behavior, default). For example, invoking
`DuckDuckGoSearchResults(output_format="list")` returns a list of
dictionaries in the format
```
[{'snippet': '...', 'title': '...', 'link': '...'}, ...]
```
e.g.

```
[{'snippet': "Obama has in a sense been wrestling with Trump's impact since the real estate magnate broke onto the political stage in 2015. Trump's victory the next year, defeating Obama's secretary of ...", 'title': "Obama's fears about Trump drive his stepped-up campaigning", 'link': 'https://www.washingtonpost.com/politics/2024/10/18/obama-trump-anxiety-harris-campaign/'}, {'snippet': 'Harris will hold a campaign event with former President Barack Obama in Georgia next Thursday, the first time the pair has campaigned side by side, a senior campaign official said. A week from ...', 'title': 'Obamas to hit the campaign trail in first joint appearances with Harris', 'link': 'https://www.nbcnews.com/politics/2024-election/obamas-hit-campaign-trail-first-joint-appearances-harris-rcna176034'}, {'snippet': 'Item 1 of 3 Former U.S. first lady Michelle Obama and her husband, former U.S. President Barack Obama, stand on stage during Day 2 of the Democratic National Convention (DNC) in Chicago, Illinois ...', 'title': 'Obamas set to hit campaign trail with Kamala Harris for first time', 'link': 'https://www.reuters.com/world/us/obamas-set-hit-campaign-trail-with-kamala-harris-first-time-2024-10-18/'}, {'snippet': 'Barack and Michelle Obama will make their first campaign appearances alongside Kamala Harris at rallies in Georgia and Michigan. By Reid J. Epstein Reporting from Ashwaubenon, Wis. Here come the ...', 'title': 'Harris Will Join Michelle Obama and Barack Obama on Campaign Trail', 'link': 'https://www.nytimes.com/2024/10/18/us/politics/kamala-harris-michelle-obama-barack-obama.html'}]
Result Type: <class 'list'>
```
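
For callers who want machine-readable output, a short usage sketch
(assuming the new parameter behaves as described above; the `"json"`
variant should now produce a string that parses cleanly):

```python
import json

from langchain_community.tools import DuckDuckGoSearchResults

# Request JSON output and parse it -- with the old "string" format this
# json.loads call would fail.
search = DuckDuckGoSearchResults(output_format="json")
results = json.loads(search.invoke("Obama"))

for item in results:
    print(item["title"], "->", item["link"])
```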

---------

Co-authored-by: vbarda <vadym@langchain.dev>
2024-10-23 20:18:11 -04:00
Erick Friis
5e5647b5dd docs: render api ref urls in search (#27594) 2024-10-23 16:18:21 -07:00
Bagatur
948e2e6322 docs: concept nits (#27586) 2024-10-23 14:52:44 -07:00
Eugene Yurtsev
562cf416c2 docs: Update messages.mdx (#27592)
Add missing `.`
2024-10-23 20:18:27 +00:00
Ankur Singh
71e0f4cd62 docs: Fix spelling mistake in concepts (#27589)
`Fore` => `For`

Documentation Correction:

*
[`docs/docs/concepts/async.mdx`](diffhunk://#diff-4959e81c20607c20c7a9c38db4405a687c5d94f24fc8220377701afeee7562b0L40-R40):
Corrected a typo from "Fore" to "For" in the sentence directing users to
the API reference.
2024-10-23 16:10:21 -04:00
Bagatur
968dccee04 core[patch]: convert_to_openai_tool Anthropic support (#27591) 2024-10-23 12:27:06 -07:00
Bagatur
217de4e6a6 langchain[patch]: de-beta init_chat_model (#27558) 2024-10-23 08:35:15 -07:00
Eugene Yurtsev
4466caadba concepts: update llm stub page and re-link (#27567)
Update text llm stub page and re-link content
2024-10-22 23:03:36 -04:00
Eugene Yurtsev
f2dbf01d4a Docs: Re-organize conceptual docs (#27047)
Reorganization of conceptual documentation

---------

Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-10-22 22:08:20 -04:00
Kwan Kin Chan
6d2a76ac05 langchain_huggingface: Fix multiple GPU usage bug in from_model_id function (#23628)
- [ ] **Description:**
  - passes the device_map into model_kwargs
  - removes the unused device_map variable in the hf_pipeline function call
- [ ] **Issue:** #13128

When using the from_model_id function to load a Hugging Face model for
text generation across multiple GPUs, the model defaults to loading on
the CPU despite multiple GPUs being available, even when using the
expected format
``` python
llm = HuggingFacePipeline.from_model_id(
    model_id="model-id",
    task="text-generation",
    device_map="auto",
)
```
Currently, to enable multiple GPUs, we have to pass the arguments in
this format instead
``` python
llm = HuggingFacePipeline.from_model_id(
    model_id="model-id",
    task="text-generation",
    device=None,
    model_kwargs={
        "device_map": "auto",
    }
)
```
This issue arises due to improper handling of the device and device_map
parameters.

- [ ] **Explanation:**
1. In from_model_id, the model is created using model_kwargs and passed
as the model variable of the pipeline function. So at this moment, to
load the model with multiple GPUs, "device_map" needs to be set to
"auto" within model_kwargs. Otherwise, the model defaults to loading on
the CPU.
2. The device_map variable in from_model_id is not utilized correctly.
In the `pipeline` function's source code in transformers:
- The device_map variable is stored in the model_kwargs dictionary
(lines 867-878 of transformers/src/transformers/pipelines/\__init__.py).
```python
    if device_map is not None:
        ......
        model_kwargs["device_map"] = device_map
```
- The model is constructed with model_kwargs containing the device_map
value ONLY IF it is a string (lines 893-903 of
transformers/src/transformers/pipelines/\__init__.py).
```python
    if isinstance(model, str) or framework is None:
        model_classes = {"tf": targeted_task["tf"], "pt": targeted_task["pt"]}
        framework, model = infer_framework_load_model( ... , **model_kwargs, )
```
- Consequently, since a model object is already passed to the pipeline
function, the device_map variable from from_model_id is never used.

3. The device_map variable in from_model_id not only appears unused but
also causes errors. Without explicitly setting device=None, attempting
to load the model on multiple GPUs may result in the following error:
```
Device has 2 GPUs available. Provide device={deviceId} to
`from_model_id` to use available GPUs for execution. deviceId is -1
(default) for CPU and can be a positive integer associated with CUDA
device id.
  Traceback (most recent call last):
    File "foo.py", line 15, in <module>
      llm = HuggingFacePipeline.from_model_id(
File
"foo\site-packages\langchain_huggingface\llms\huggingface_pipeline.py",
line 217, in from_model_id
      pipeline = hf_pipeline(
File "foo\lib\site-packages\transformers\pipelines\__init__.py", line
1108, in pipeline
return pipeline_class(model=model, framework=framework, task=task,
**kwargs)
File "foo\lib\site-packages\transformers\pipelines\text_generation.py",
line 96, in __init__
      super().__init__(*args, **kwargs)
File "foo\lib\site-packages\transformers\pipelines\base.py", line 835,
in __init__
      raise ValueError(
ValueError: The model has been loaded with `accelerate` and therefore
cannot be moved to a specific device. Please discard the `device`
argument when creating your pipeline object.
```
This error occurs because, in from_model_id, the default values for
device and device_map are -1 and None, respectively. This passes the
check (`device_map is not None and device < 0`) and keeps device as -1,
so the pipeline function later raises an error when trying to move a
GPU-loaded model back to the CPU.
19eb82e68b/libs/community/langchain_community/llms/huggingface_pipeline.py (L204-L213)
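
A minimal sketch of the fix this PR describes (hypothetical code, not
the PR's exact diff): fold `device_map` into `model_kwargs` before the
model is constructed, so placement happens at load time and nothing is
forwarded to `pipeline()` for an already-placed model. The helper name
and signature below are illustrative only.

```python
from typing import Any, Optional


def resolve_model_kwargs(
    model_kwargs: Optional[dict],
    device: Optional[int],
    device_map: Optional[str],
) -> dict:
    """Hypothetical helper: route device_map into model_kwargs so that
    model loading (rather than the later pipeline() call, which ignores
    device_map when handed a model object) handles GPU placement."""
    resolved: dict[str, Any] = dict(model_kwargs or {})
    if device_map is not None:
        if device is not None and device >= 0:
            raise ValueError("Specify either device or device_map, not both.")
        resolved["device_map"] = device_map  # e.g. "auto" for multi-GPU
    return resolved


print(resolve_model_kwargs(None, None, "auto"))  # {'device_map': 'auto'}
```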




---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: vbarda <vadym@langchain.dev>
2024-10-22 21:41:47 -04:00
Prakul
031d0e4725 docs:update to MongoDB Docs (#27531)
**Description:** Update to MongoDB docs

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-23 00:21:37 +00:00
Fernando de Oliveira
ab205e7389 partners/openai + community: Async Azure AD token provider support for Azure OpenAI (#27488)
This PR introduces a new `azure_ad_async_token_provider` attribute to
the `AzureOpenAI` and `AzureChatOpenAI` classes in the `partners/openai`
and `community` packages, given that it's already supported in the
`openai` package as the
[AsyncAzureADTokenProvider](https://github.com/openai/openai-python/blob/main/src/openai/lib/azure.py#L33)
type.

The reason for creating a new attribute is to avoid breaking changes.
Say you have existing code that uses an `AzureOpenAI` or
`AzureChatOpenAI` instance to perform both sync and async operations.
The `azure_ad_token_provider` will work exactly as it does today, while
`azure_ad_async_token_provider` will override it for async requests.
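
A hedged usage sketch: the two LangChain attribute names are the ones
this PR describes, while the `azure-identity` helpers and the endpoint,
deployment, and API-version values are illustrative assumptions.

```python
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from azure.identity.aio import (
    DefaultAzureCredential as AsyncDefaultAzureCredential,
    get_bearer_token_provider as get_async_bearer_token_provider,
)
from langchain_openai import AzureChatOpenAI

scope = "https://cognitiveservices.azure.com/.default"

llm = AzureChatOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_version="2024-06-01",
    azure_deployment="<your-deployment>",
    # Sync requests keep using the existing provider, unchanged.
    azure_ad_token_provider=get_bearer_token_provider(
        DefaultAzureCredential(), scope
    ),
    # Async requests (ainvoke, astream, ...) use the new provider.
    azure_ad_async_token_provider=get_async_bearer_token_provider(
        AsyncDefaultAzureCredential(), scope
    ),
)
```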


2024-10-22 21:43:06 +00:00
Bagatur
34684423bf docs: rm Legacy API ref link (#27559) 2024-10-22 14:12:38 -07:00
Savar Bhasin
0cae37b0a9 docs: fix docker command for RedisChatMessageHistory (#27484)
docs: "fix docker command"

- **Description**: The Redis chat message history component requires the
Redis Stack to create indexes. When using only Redis, the following
error occurs: "Unknown command 'FT.INFO', with args beginning with:
'chat_history'".
- **Twitter handle**: savar_bhasin

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-22 19:42:51 +00:00
orkhank
9a277cbe00 community: Update file_path type in JSONLoader.__init__() signature (#27535)
- **Description:** Change the type of the `file_path` argument from `str
| pathlib.Path` to `str | os.PathLike`, since the latter is more widely
used: https://stackoverflow.com/a/58541858
  
This is a very minor fix. I was just annoyed to see the red underline
displayed by Pylance in VS Code: `reportArgumentType`.

![image](https://github.com/user-attachments/assets/719a7f8e-acca-4dfa-89df-925e1d938c71)
  
  The changes do not affect the behavior of the code.
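
For context, `os.PathLike` is the broader type here: `pathlib.Path`
implements it, as does any object with a `__fspath__` method. A minimal
standalone illustration (not the loader's actual code):

```python
import os
from pathlib import Path


def normalize(file_path: str | os.PathLike) -> str:
    # os.fspath accepts str or any object implementing __fspath__,
    # which includes pathlib.Path -- so Path callers still type-check.
    return os.fspath(file_path)


print(normalize("data/records.json"))
print(normalize(Path("data") / "records.json"))
```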
2024-10-22 11:18:36 -07:00
Eric Pinzur
f636c83321 community: Cassandra Vector Store: modernize implementation (#27253)
**Description:** 

This PR updates `CassandraGraphVectorStore` to be based off
`CassandraVectorStore`, instead of using a custom CQL implementation.
This allows users using a `CassandraVectorStore` to upgrade to a
`GraphVectorStore` without having to change their database schema or
re-embed documents.

This PR also updates the documentation of the `GraphVectorStore` base
class and contains native async implementations for the standard graph
methods: `traversal_search` and `mmr_traversal_search` in
`CassandraVectorStore`.

**Issue:** No issue number.

**Dependencies:** https://github.com/langchain-ai/langchain/pull/27078
(already-merged)

**Lint and test**: 
- Lint and tests all pass, including existing
`CassandraGraphVectorStore` tests.
- Also added numerous additional tests based on the tests in
`langchain-astradb`, which cover many more scenarios than the existing
tests for `Cassandra` and `CassandraGraphVectorStore`.

**BREAKING CHANGE**

Note that this is a breaking change for existing users of
`CassandraGraphVectorStore`. They will need to wipe their database table
and restart.

However:
- The interfaces have not changed. Just the underlying storage
mechanism.
- Anyone using `langchain_community.vectorstores.Cassandra` can instead
use `langchain_community.graph_vectorstores.CassandraGraphVectorStore`
and gain graph capabilities without having to re-embed their existing
documents. This is the primary goal of this PR (see the sketch below).
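
A hypothetical upgrade sketch (the constructor arguments are assumed to
mirror the existing `Cassandra` vector store's; verify against the class
documentation before relying on them):

```python
from cassandra.cluster import Cluster
from langchain_community.graph_vectorstores import CassandraGraphVectorStore
from langchain_openai import OpenAIEmbeddings

session = Cluster(["127.0.0.1"]).connect()

# Hypothetical: point the graph store at the same table the plain
# Cassandra vector store was already using -- per this PR, the schema
# and stored embeddings can be reused without re-embedding.
store = CassandraGraphVectorStore(
    embedding=OpenAIEmbeddings(),
    session=session,
    keyspace="demo_keyspace",
    table_name="docs",
)
docs = store.similarity_search("graph traversal", k=4)
```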

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-22 18:11:11 +00:00
233 changed files with 6087 additions and 1460 deletions

View File

@@ -222,9 +222,7 @@ html_theme_options = {
},
],
"icon_links_label": "Quick Links",
"external_links": [
{"name": "Legacy reference", "url": "https://api.python.langchain.com/"},
],
"external_links": [],
}

View File

@@ -1,19 +1,21 @@
# Agents
We recommend that you use [LangGraph](/docs/concepts/architecture#langgraph) for building agents.
By themselves, language models can't take actions - they just output text. Agents are systems that take a high-level task and use an LLM as a reasoning engine to decide what actions to take and execute those actions.
[LangGraph](/docs/concepts/architecture#langgraph) is an extension of LangChain specifically aimed at creating highly controllable and customizable agents. We recommend that you use LangGraph for building agents.
Please see the following resources for more information:
* LangGraph docs for conceptual architecture about [Agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/)
* [Pre-built agent in LangGraph](https://langchain-ai.github.io/langgraph/reference/prebuilt/#langgraph.prebuilt.chat_agent_executor.create_react_agent)
* LangGraph docs on [common agent architectures](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/)
* [Pre-built agents in LangGraph](https://langchain-ai.github.io/langgraph/reference/prebuilt/#langgraph.prebuilt.chat_agent_executor.create_react_agent)
## Legacy Agent Concept: AgentExecutor
## Legacy agent concept: AgentExecutor
LangChain previously introduced the `AgentExecutor` as a runtime for agents.
While it served as an excellent starting point, its limitations became apparent when dealing with more sophisticated and customized agents.
As a result, we're gradually phasing out `AgentExecutor` in favor of more flexible solutions in LangGraph.
### Transitioning from AgentExecutor to LangGraph
### Transitioning from AgentExecutor to langgraph
If you're currently using `AgentExecutor`, don't worry! We've prepared resources to help you:

View File

@@ -1,40 +1,51 @@
import ThemedImage from '@theme/ThemedImage';
import useBaseUrl from '@docusaurus/useBaseUrl';
## Architecture
# Architecture
LangChain as a framework consists of a number of packages.
LangChain is a framework that consists of a number of packages.
### langchain-core
<ThemedImage
alt="Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers."
sources={{
light: useBaseUrl('/svg/langchain_stack_062024.svg'),
dark: useBaseUrl('/svg/langchain_stack_062024_dark.svg'),
}}
title="LangChain Framework Overview"
style={{ width: "100%" }}
/>
This package contains base abstractions of different components and ways to compose them together.
The interfaces for core components like LLMs, vector stores, retrievers and more are defined here.
No third party integrations are defined here.
The dependencies are kept purposefully very lightweight.
### langchain
## langchain-core
The main `langchain` package contains chains, agents, and retrieval strategies that make up an application's cognitive architecture.
These are NOT third party integrations.
This package contains base abstractions for different components and ways to compose them together.
The interfaces for core components like chat models, vector stores, tools and more are defined here.
No third-party integrations are defined here.
The dependencies are very lightweight.
## langchain
The main `langchain` package contains chains and retrieval strategies that make up an application's cognitive architecture.
These are NOT third-party integrations.
All chains, agents, and retrieval strategies here are NOT specific to any one integration, but rather generic across all integrations.
### langchain-community
## Integration packages
This package contains third party integrations that are maintained by the LangChain community.
Key partner packages are separated out (see below).
This contains all integrations for various components (LLMs, vector stores, retrievers).
All dependencies in this package are optional to keep the package as lightweight as possible.
### Partner packages
While the long tail of integrations is in `langchain-community`, we split popular integrations into their own packages (e.g. `langchain-openai`, `langchain-anthropic`, etc). This was done in order to improve support for these important integrations.
Popular integrations have their own packages (e.g. `langchain-openai`, `langchain-anthropic`, etc) so that they can be properly versioned and appropriately lightweight.
For more information see:
* A list [LangChain integrations](/docs/integrations/providers/)
* The [LangChain API Reference](https://python.langchain.com/api_reference/) where you can find detailed information about the API reference of each partner package.
* A list [integrations packages](/docs/integrations/providers/)
* The [API Reference](https://python.langchain.com/api_reference/) where you can find detailed information about each of the integration package.
### LangGraph
## langchain-community
This package contains third-party integrations that are maintained by the LangChain community.
Key integration packages are separated out (see above).
This contains integrations for various components (chat models, vector stores, tools, etc).
All dependencies in this package are optional to keep the package as lightweight as possible.
## langgraph
`langgraph` is an extension of `langchain` aimed at building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.
@@ -47,9 +58,7 @@ LangGraph exposes high level interfaces for creating common types of agents, as
:::
### LangServe
## langserve
A package to deploy LangChain chains as REST APIs. Makes it easy to get a production ready API up and running.
@@ -62,19 +71,8 @@ If you need a deployment option for LangGraph, you should instead be looking at
For more information, see the [LangServe documentation](/docs/langserve).
### LangSmith
## LangSmith
A developer platform that lets you debug, test, evaluate, and monitor LLM applications.
For more information, see the [LangSmith documentation](https://docs.smith.langchain.com)
<ThemedImage
alt="Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers."
sources={{
light: useBaseUrl('/svg/langchain_stack_062024.svg'),
dark: useBaseUrl('/svg/langchain_stack_062024_dark.svg'),
}}
title="LangChain Framework Overview"
style={{ width: "100%" }}
/>

View File

@@ -1,12 +1,10 @@
# Async Programming with LangChain
# Async programming with langchain
:::info Prerequisites
* [Runnable Interface](/docs/concepts/runnables)
* [asyncio documentation](https://docs.python.org/3/library/asyncio.html)
* [Runnable interface](/docs/concepts/runnables)
* [asyncio](https://docs.python.org/3/library/asyncio.html)
:::
## Overview
LLM based applications often involve a lot of I/O-bound operations, such as making API calls to language models, databases, or other services. Asynchronous programming (or async programming) is a paradigm that allows a program to perform multiple tasks concurrently without blocking the execution of other tasks, improving efficiency and responsiveness, particularly in I/O-bound operations.
:::note
@@ -14,7 +12,7 @@ You are expected to be familiar with asynchronous programming in Python before r
This guide specifically focuses on what you need to know to work with LangChain in an asynchronous context, assuming that you are already familiar with asynch
:::
## LangChain Asynchronous APIs
## Langchain asynchronous APIs
Many LangChain APIs are designed to be asynchronous, allowing you to build efficient and responsive applications.
@@ -39,9 +37,9 @@ await some_vectorstore.aadd_documents(documents)
Runnables created using the [LangChain Expression Language (LCEL)](/docs/concepts/lcel) can also be run asynchronously as they implement
the full [Runnable Interface](/docs/concepts/runnables).
Fore more information, please review the [API reference](https://python.langchain.com/api_reference/) for the specific component you are using.
For more information, please review the [API reference](https://python.langchain.com/api_reference/) for the specific component you are using.
## Delegation to Sync Methods
## Delegation to sync methods
Most popular LangChain integrations implement asynchronous support of their APIs. For example, the `ainvoke` method of many ChatModel implementations uses the `httpx.AsyncClient` to make asynchronous HTTP requests to the model provider's API.
@@ -75,9 +73,9 @@ in certain scenarios.
If you are experiencing issues with streaming, callbacks or tracing in async code and are using Python 3.9 or 3.10, this is a likely cause.
Please read [Propagation RunnableConfig](/docs/concepts/runnables#propagation-runnableconfig) for more details to learn how to propagate the `RunnableConfig` down the call chain manually (or upgrade to Python 3.11 where this is no longer an issue).
Please read [Propagation RunnableConfig](/docs/concepts/runnables#propagation-RunnableConfig) for more details to learn how to propagate the `RunnableConfig` down the call chain manually (or upgrade to Python 3.11 where this is no longer an issue).
## How to use in IPython and Jupyter Notebooks
## How to use in ipython and jupyter notebooks
As of IPython 7.0, IPython supports asynchronous REPLs. This means that you can use the `await` keyword in the IPython REPL and Jupyter Notebooks without any additional setup. For more information, see the [IPython blog post](https://blog.jupyter.org/ipython-7-0-async-repl-a35ce050f7f7).

View File

@@ -1,16 +1,14 @@
# Callbacks
:::note Prerequisites
- [Runnable interface](/docs/concepts/#runnable-interface)
- [Runnable interface](/docs/concepts/runnables)
:::
## Overview
LangChain provides a callbacks system that allows you to hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.
LangChain provides a callback system that allows you to hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.
You can subscribe to these events by using the `callbacks` argument available throughout the API. This argument is list of handler objects, which are expected to implement one or more of the methods described below in more detail.
## Callback Events
## Callback events
| Event | Event Trigger | Associated Method |
|------------------|---------------------------------------------|-----------------------|

View File

@@ -1,17 +1,17 @@
# Chat History
# Chat history
:::info Prerequisites
- [Messages](/docs/concepts/messages)
- [Chat Models](/docs/concepts/chat_models)
- [Tool Calling](/docs/concepts/tool_calling)
- [Chat models](/docs/concepts/chat_models)
- [Tool calling](/docs/concepts/tool_calling)
:::
## Overview
Chat history is a record of the conversation between the user and the chat model. It is used to maintain context and state throughout the conversation. The chat history is sequence of [messages](/docs/concepts/messages), each of which is associated with a specific [role](/docs/concepts/messages#role), such as "user", "assistant", "system", or "tool".
## Conversation Patterns
## Conversation patterns
![Conversation patterns](/img/conversation_patterns.png)
Most conversations start with a **system message** that sets the context for the conversation. This is followed by a **user message** containing the user's input, and then an **assistant message** containing the model's response.
@@ -22,7 +22,7 @@ So a full conversation often involves a combination of two patterns of alternati
1. The **user** and the **assistant** representing a back-and-forth conversation.
2. The **assistant** and **tool messages** representing an ["agentic" workflow](/docs/concepts/agents) where the assistant is invoking tools to perform specific tasks.
## Managing Chat History
## Managing chat history
Since chat models have a maximum limit on input size, it's important to manage chat history and trim it as needed to avoid exceeding the [context window](/docs/concepts/chat_models#context_window).
@@ -40,7 +40,7 @@ Understanding correct conversation structure is essential for being able to prop
[memory](https://langchain-ai.github.io/langgraph/concepts/memory/) in chat models.
:::
## Related Resources
## Related resources
- [How to Trim Messages](https://python.langchain.com/docs/how_to/trim_messages/)
- [Memory Guide](https://langchain-ai.github.io/langgraph/concepts/memory/) for information on implementing short-term and long-term memory in chat models using [LangGraph](https://langchain-ai.github.io/langgraph/).
- [How to trim messages](/docs/how_to/trim_messages/)
- [Memory guide](https://langchain-ai.github.io/langgraph/concepts/memory/) for information on implementing short-term and long-term memory in chat models using [LangGraph](https://langchain-ai.github.io/langgraph/).

View File

@@ -1,21 +1,22 @@
# Chat Models
# Chat models
## Overview
Large Language Models (LLMs) are advanced machine learning models that excel in a wide range of language-related tasks such as text generation, translation, summarization, question answering, and more, without needing task-specific tuning for every scenario.
Modern LLMs are typically accessed through a chat model interface that takes [messages](/docs/concepts/messages) as input and returns [messages](/docs/concepts/messages) as output.
Modern LLMs are typically accessed through a chat model interface that takes a list of [messages](/docs/concepts/messages) as input and returns a [message](/docs/concepts/messages) as output.
The newest generation of chat models offer additional capabilities:
* [Tool Calling](/docs/concepts#tool-calling): Many popular chat models offer a native [tool calling](/docs/concepts#tool-calling) API. This API allows developers to build rich applications that enable AI to interact with external services, APIs, and databases. Tool calling can also be used to extract structured information from unstructured data and perform various other tasks.
* [Tool calling](/docs/concepts#tool-calling): Many popular chat models offer a native [tool calling](/docs/concepts#tool-calling) API. This API allows developers to build rich applications that enable AI to interact with external services, APIs, and databases. Tool calling can also be used to extract structured information from unstructured data and perform various other tasks.
* [Structured output](/docs/concepts/structured_outputs): A technique to make a chat model respond in a structured format, such as JSON that matches a given schema.
* [Multimodality](/docs/concepts/multimodality): The ability to work with data other than text; for example, images, audio, and video.
## Features
LangChain provides a consistent interface for working with chat models from different providers while offering additional features for monitoring, debugging, and optimizing the performance of applications that use LLMs.
* Integrations with many chat model providers (e.g., Anthropic, OpenAI, Ollama, Cohere, Hugging Face, Groq, Microsoft Azure, Google Vertex, Amazon Bedrock). Please see [chat model integrations](/docs/integrations/chat/) for an up-to-date list of supported models.
* Integrations with many chat model providers (e.g., Anthropic, OpenAI, Ollama, Microsoft Azure, Google Vertex, Amazon Bedrock, Hugging Face, Cohere, Groq). Please see [chat model integrations](/docs/integrations/chat/) for an up-to-date list of supported models.
* Use either LangChain's [messages](/docs/concepts/messages) format or OpenAI format.
* Standard [tool calling API](/docs/concepts#tool-calling): standard interface for binding tools to models, accessing tool call requests made by models, and sending tool results back to the model.
* Standard API for structuring outputs (/docs/concepts/structured_outputs) via the `with_structured_output` method.
@@ -23,14 +24,14 @@ LangChain provides a consistent interface for working with chat models from diff
* Integration with [LangSmith](https://docs.smith.langchain.com) for monitoring and debugging production-grade applications based on LLMs.
* Additional features like standardized [token usage](/docs/concepts/messages#token_usage), [rate limiting](#rate-limiting), [caching](#cache) and more.
## Available Integrations
## Integrations
LangChain has many chat model integrations that allow you to use a wide variety of models from different providers.
These integrations are one of two types:
1. **Official Models**: These are models that are officially supported by LangChain and/or model provider. You can find these models in the `langchain-<provider>` packages.
2. **Community Models**: There are models that are mostly contributed and supported by the community. You can find these models in the `langchain-community` package.
1. **Official models**: These are models that are officially supported by LangChain and/or model provider. You can find these models in the `langchain-<provider>` packages.
2. **Community models**: There are models that are mostly contributed and supported by the community. You can find these models in the `langchain-community` package.
LangChain chat models are named with a convention that prefixes "Chat" to their class names (e.g., `ChatOllama`, `ChatAnthropic`, `ChatOpenAI`, etc.).
@@ -56,7 +57,7 @@ However, LangChain also has implementations of older LLMs that do not follow the
These models implement the [BaseLLM](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.llms.BaseLLM.html#langchain_core.language_models.llms.BaseLLM) interface and may be named with the "LLM" suffix (e.g., `OllamaLLM`, `AnthropicLLM`, `OpenAILLM`, etc.). Generally, users should not use these models.
:::
### Key Methods
### Key methods
The key methods of a chat model are:
@@ -68,7 +69,7 @@ The key methods of a chat model are:
Other important methods can be found in the [BaseChatModel API Reference](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.chat_models.BaseChatModel.html).
### Inputs and Outputs
### Inputs and outputs
Modern LLMs are typically accessed through a chat model interface that takes [messages](/docs/concepts/messages) as input and returns [messages](/docs/concepts/messages) as output. Messages are typically associated with a role (e.g., "system", "human", "assistant") and one or more content blocks that contain text or potentially multimodal data (e.g., images, audio, video).
@@ -77,7 +78,7 @@ LangChain supports two message formats to interact with chat models:
1. **LangChain Message Format**: LangChain's own message format, which is used by default and is used internally by LangChain.
2. **OpenAI's Message Format**: OpenAI's message format.
### Standard Parameters
### Standard parameters
Many chat models have standardized parameters that can be used to configure the model:
@@ -100,12 +101,12 @@ Some important things to note:
ChatModels also accept other parameters that are specific to that integration. To find all the parameters supported by a ChatModel head to the [API reference](https://python.langchain.com/api_reference/) for that model.
## Tool Calling
## Tool calling
Chat models can call [tools](/docs/concepts/tools) to perform tasks such as fetching data from a database, making API requests, or running custom code. Please
see the [tool calling](/docs/concepts#tool-calling) guide for more information.
## Structured Outputs
## Structured outputs
Chat models can be requested to respond in a particular format (e.g., JSON or matching a particular schema). This feature is extremely
useful for information extraction tasks. Please read more about
@@ -117,7 +118,7 @@ Large Language Models (LLMs) are not limited to processing text. They can also b
Currently, only some LLMs support multimodal inputs, and almost none support multimodal outputs. Please consult the specific model documentation for details.
## Context Window
## Context window
A chat model's context window refers to the maximum size of the input sequence the model can process at one time. While the context windows of modern LLMs are quite large, they still present a limitation that developers must keep in mind when working with chat models.
@@ -125,7 +126,7 @@ If the input exceeds the context window, the model may not be able to process th
The size of the input is measured in [tokens](/docs/concepts/tokens) which are the unit of processing that the model uses.
## Advanced Topics
## Advanced topics
### Rate-limiting
@@ -135,7 +136,7 @@ If you hit a rate limit, you will typically receive a rate limit error response
You have a few options to deal with rate limits:
1. Try to avoid hitting rate limits by spacing out requests: Chat models accept a `rate_limiter` parameter that can be provided during initialization. This parameter is used to control the rate at which requests are made to the model provider. Spacing out the requests to a given model is a particularly useful strategy when benchmarking models to evaluate their performance. Please see the [how to handle rate limits](https://python.langchain.com/docs/how_to/chat_model_rate_limiting/) for more information on how to use this feature.
1. Try to avoid hitting rate limits by spacing out requests: Chat models accept a `rate_limiter` parameter that can be provided during initialization. This parameter is used to control the rate at which requests are made to the model provider. Spacing out the requests to a given model is a particularly useful strategy when benchmarking models to evaluate their performance. Please see the [how to handle rate limits](/docs/how_to/chat_model_rate_limiting/) for more information on how to use this feature.
2. Try to recover from rate limit errors: If you receive a rate limit error, you can wait a certain amount of time before retrying the request. The amount of time to wait can be increased with each subsequent rate limit error. Chat models have a `max_retries` parameter that can be used to control the number of retries. See the [standard parameters](#standard-parameters) section for more information.
3. Fallback to another chat model: If you hit a rate limit with one chat model, you can switch to another chat model that is not rate-limited.
@@ -153,7 +154,7 @@ However, there might be situations where caching chat model responses is benefic
Please see the [how to cache chat model responses](/docs/how_to/#chat-model-caching) guide for more details.
## Related Resources
## Related resources
* How-to guides on using chat models: [how-to guides](/docs/how_to/#chat-models).
* List of supported chat models: [chat model integrations](/docs/integrations/chat/).

View File

@@ -3,16 +3,14 @@
:::info[Prerequisites]
* [Document API Reference](https://python.langchain.com/docs/how_to/#document-loaders)
* [Document loaders API reference](/docs/how_to/#document-loaders)
:::
## Overview
Document loaders are designed to load document objects. LangChain has hundreds of integrations with various data sources to load data from: Slack, Notion, Google Drive, etc.
## Available Integrations
## Integrations
You can find available integrations on the [Document Loaders Integrations page](https://python.langchain.com/docs/integrations/document_loaders/).
You can find available integrations on the [Document loaders integrations page](/docs/integrations/document_loaders/).
## Interface
@@ -38,10 +36,10 @@ for document in loader.lazy_load():
print(document)
```
## Related Resources
## Related resources
Please see the following resources for more information:
* [How-to guides for document loaders](https://python.langchain.com/docs/how_to/#document-loaders)
* [Document API Reference](https://python.langchain.com/docs/how_to/#document-loaders)
* [Document Loaders Integrations](https://python.langchain.com/docs/integrations/document_loaders/)
* [How-to guides for document loaders](/docs/how_to/#document-loaders)
* [Document API reference](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html)
* [Document loaders integrations](/docs/integrations/document_loaders/)

View File

@@ -13,9 +13,7 @@ This conceptual overview focuses on text-based embedding models.
Embedding models can also be [multimodal](/docs/concepts/multimodality) though such models are not currently supported by LangChain.
:::
## Overview
Imagine being able to capture the essence of any text - a tweet, document, or book - in a single, compact representation.
Imagine being able to capture the essence of any text - a tweet, document, or book - in a single, compact representation.
This is the power of embedding models, which lie at the heart of many retrieval systems.
Embedding models transform human language into a format that machines can understand and compare with speed and accuracy.
These models take text as input and produce a fixed-length array of numbers, a numerical fingerprint of the text's semantic meaning.
@@ -49,7 +47,7 @@ To navigate this variety, researchers and practitioners often turn to benchmarks
:::
### LangChain Interface
### Interface
LangChain provides a universal interface for working with them, providing standard methods for common operations.
This common interface simplifies interaction with various embedding providers through two central methods:
@@ -89,7 +87,7 @@ query_embedding = embeddings_model.embed_query("What is the meaning of life?")
:::
### Available integrations
### Integrations
LangChain offers many embedding model integrations which you can find [on the embedding models](/docs/integrations/text_embedding/) integrations page.

View File

@@ -0,0 +1,17 @@
# Evaluation
<span data-heading-keywords="evaluation,evaluate"></span>
Evaluation is the process of assessing the performance and effectiveness of your LLM-powered applications.
It involves testing the model's responses against a set of predefined criteria or benchmarks to ensure it meets the desired quality standards and fulfills the intended purpose.
This process is vital for building reliable applications.
![](/img/langsmith_evaluate.png)
[LangSmith](https://docs.smith.langchain.com/) helps with this process in a few ways:
- It makes it easier to create and curate datasets via its tracing and annotation features
- It provides an evaluation framework that helps you define metrics and run your app against your dataset
- It allows you to track results over time and automatically run your evaluators on a schedule or as part of CI/Code
To learn more, check out [this LangSmith guide](https://docs.smith.langchain.com/concepts/evaluation).

View File

@@ -15,6 +15,6 @@ Sometimes these examples are hardcoded into the prompt, but for more advanced si
**Example Selectors** are classes responsible for selecting and then formatting examples into prompts.
## Related Resources
## Related resources
* [Example selector how-to guides](/docs/how_to/#example-selectors)

View File

@@ -70,7 +70,7 @@ Most state-of-the-art models these days are chat models, so we'll focus on forma
If we insert our examples into the system prompt as a string, we'll need to make sure it's clear to the model where each example begins and which parts are the input versus output. Different models respond better to different syntaxes, like [ChatML](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/chat-markup-language), XML, TypeScript, etc.
If we insert our examples as messages, where each example is represented as a sequence of Human, AI messages, we might want to also assign [names](/docs/concepts/#messages) to our messages like `"example_user"` and `"example_assistant"` to make it clear that these messages correspond to different actors than the latest input message.
If we insert our examples as messages, where each example is represented as a sequence of Human, AI messages, we might want to also assign [names](/docs/concepts/messages) to our messages like `"example_user"` and `"example_assistant"` to make it clear that these messages correspond to different actors than the latest input message.
**Formatting tool call examples**

View File

@@ -1,46 +1,44 @@
# Conceptual Guide
# Conceptual guide
## Overview
This guide provides explanations of the key concepts behind the LangChain framework and AI applications more broadly.
In this guide, you'll find explanations of the key concepts, providing a deeper understanding of core principles.
We recommend that you go through at least one of the [Tutorials](/docs/tutorials) before diving into the conceptual guide. This will provide practical context that will make it easier to understand the concepts discussed here.
We recommend that you go through at least one of the [Tutorials](/docs/tutorials) before diving into the conceptual guide. This will help you understand the context and practical applications of the concepts discussed here.
The conceptual guide does not cover step-by-step instructions or specific implementation examples — those are found in the [How-to guides](/docs/how_to/) and [Tutorials](/docs/tutorials). For detailed reference material, please see the [API reference](https://python.langchain.com/api_reference/).
The conceptual guide will not cover step-by-step instructions or specific implementation details — those are found in the [How-To Guides](/docs/how_to/) and [Tutorials](/docs/tutorials) sections. For detailed reference material, please visit the [API Reference](https://python.langchain.com/api_reference/).
## High level
## High Level
- **[Why LangChain?](/docs/concepts/why_langchain)**: Why LangChain is the best choice for building AI applications.
- **[Architecture](/docs/concepts/architecture)**: Overview of how packages are organized in the LangChain ecosystem.
- **[Why LangChain?](/docs/concepts/why_langchain)**: Overview of the value that LangChain provides.
- **[Architecture](/docs/concepts/architecture)**: How packages are organized in the LangChain ecosystem.
## Concepts
- **[Chat Models](/docs/concepts/chat_models)**: Modern LLMs exposed via a chat interface which process sequences of messages as input and output a message.
- **[Messages](/docs/concepts/messages)**: Messages are the unit of communication in modern LLMs, used to represent input and output of a chat model, as well as any additional context or metadata that may be associated with the conversation.
- **[Chat History](/docs/concepts/chat_history)**: Chat history is a record of the conversation between the user and the chat model, used to maintain context and state throughout the conversation.
- **[Tools](/docs/concepts/tools)**: The **tool** abstraction in LangChain associates a Python **function** with a **schema** defining the function's **name**, **description**, and **input**.
- **[Tool Calling](/docs/concepts/tool_calling)**: Tool calling is the process of invoking a tool from a chat model.
- **[Structured Output](/docs/concepts/structured_outputs)**: A technique to make the chat model respond in a structured format, such as JSON and matching a specific schema.
- **[Memory](https://langchain-ai.github.io/langgraph/concepts/memory/)**: Explanation of **short-term memory** and **long-term memory** and how to implement them using LangGraph.
- **[Chat models](/docs/concepts/chat_models)**: LLMs exposed via a chat API that process sequences of messages as input and output a message.
- **[Messages](/docs/concepts/messages)**: The unit of communication in chat models, used to represent model input and output.
- **[Chat history](/docs/concepts/chat_history)**: A conversation represented as a sequence of messages, alternating between user messages and model responses.
- **[Tools](/docs/concepts/tools)**: A function with an associated schema defining the function's name, description, and the arguments it accepts.
- **[Tool calling](/docs/concepts/tool_calling)**: A type of chat model API that accepts tool schemas, along with messages, as input and returns invocations of those tools as part of the output message.
- **[Structured output](/docs/concepts/structured_outputs)**: A technique to make a chat model respond in a structured format, such as JSON that matches a given schema.
- **[Memory](https://langchain-ai.github.io/langgraph/concepts/memory/)**: Information about a conversation that is persisted so that it can be used in future conversations.
- **[Multimodality](/docs/concepts/multimodality)**: The ability to work with data that comes in different forms, such as text, audio, images, and video.
- **[Tokens](/docs/concepts/tokens)**: Modern large language models (LLMs) are typically based on a transformer architecture that processes a sequence of units known as tokens.
- **[Runnable Interface](/docs/concepts/runnables)**: Description of the standard Runnable interface which is implemented by many components in LangChain.
- **[LangChain Expression Language (LCEL)](/docs/concepts/lcel)**: A declarative approach to building new Runnables from existing Runnables.
- **[Document Loaders](/docs/concepts/document_loaders)**: Abstraction for loading documents.
- **[Runnable interface](/docs/concepts/runnables)**: The base abstraction that many LangChain components and the LangChain Expression Language are built on.
- **[LangChain Expression Language (LCEL)](/docs/concepts/lcel)**: A syntax for orchestrating LangChain components. Most useful for simpler applications.
- **[Document loaders](/docs/concepts/document_loaders)**: Load a source as a list of documents.
- **[Retrieval](/docs/concepts/retrieval)**: Information retrieval systems can retrieve structured or unstructured data from a datasource in response to a query.
- **[Text Splitters](/docs/concepts/text_splitters)**: Use to split long content into smaller more manageable chunks.
- **[Embedding Models](/docs/concepts/embedding_models)**: Embedding models are models that can represent data in a vector space.
- **[VectorStores](/docs/concepts/vectorstores)**: A datastore that can store embeddings and associated data and supports efficient vector search.
- **[Retriever](/docs/concepts/retrievers)**: A retriever is a component that retrieves relevant documents from a knowledge base in response to a query.
- **[Retrieval Augmented Generation (RAG)](/docs/concepts/rag)**: A powerful technique that enhances language models by combining them with external knowledge bases.
- **[Agents](/docs/concepts/agents)**: Use a [language model](/docs/concepts/chat_models) to choose a sequence of actions to take. Agents can interact with external resources via [tool calling](/docs/concepts/tool_calling).
- **[Prompt Templates](/docs/concepts/prompt_templates)**: Use to define prompt **templates** that can be lazily evaluated to generate prompts for [language models](/docs/concepts/chat_models). Primarily used with [LCEL](/docs/concepts/lcel) or prompts need to be serialized and stored for later use (e.g., in a database).
- **[Async Programming with LangChain](/docs/concepts/async)**: This guide covers some basic things that one should know to work with LangChain in an asynchronous context.
- **[Callbacks](/docs/concepts/callbacks)**: Learn about the callback system in LangChain. It is composed of CallbackManagers (which dispatch events to the registered handlers) and CallbackHandlers (which handle the events). Callbacks are used to stream outputs from LLMs in LangChain, observe the progress of an LLM application, and more.
- **[Output Parsers](/docs/concepts/output_parsers)**: Output parsers are responsible for taking the output of a model and transforming it into a more suitable format for downstream tasks. Output parsers were primarily useful prior to the general availability of [chat models](/docs/concepts/chat_models) that natively support [tool calling](/docs/concepts/tool_calling) and [structured outputs](/docs/concepts/structured_outputs).
- **[Few shot prompting](/docs/concepts/few_shot_prompting)**: Few-shot prompting is a technique used improve the performance of language models by providing them with a few examples of the task they are expected to perform.
- **[Example Selectors](/docs/concepts/example_selectors)**: Example selectors are used to select examples from a dataset based on a given input. They can be used to select examples randomly, by semantic similarity, or based on some other constraints. Example selectors are used in few-shot prompting to select examples for a prompt.
- **[Text splitters](/docs/concepts/text_splitters)**: Split long text into smaller chunks that can be individually indexed to enable granular retrieval.
- **[Embedding models](/docs/concepts/embedding_models)**: Models that represent data such as text or images in a vector space.
- **[Vector stores](/docs/concepts/vectorstores)**: Storage of and efficient search over vectors and associated metadata.
- **[Retriever](/docs/concepts/retrievers)**: A component that returns relevant documents from a knowledge base in response to a query.
- **[Retrieval Augmented Generation (RAG)](/docs/concepts/rag)**: A technique that enhances language models by combining them with external knowledge bases.
- **[Agents](/docs/concepts/agents)**: Use a [language model](/docs/concepts/chat_models) to choose a sequence of actions to take. Agents can interact with external resources via [tools](/docs/concepts/tools).
- **[Prompt templates](/docs/concepts/prompt_templates)**: Component for factoring out the static parts of a model "prompt" (usually a sequence of messages). Useful for serializing, versioning, and reusing these static parts.
- **[Output parsers](/docs/concepts/output_parsers)**: Responsible for taking the output of a model and transforming it into a more suitable format for downstream tasks. Output parsers were primarily useful prior to the general availability of [tool calling](/docs/concepts/tool_calling) and [structured outputs](/docs/concepts/structured_outputs).
- **[Few-shot prompting](/docs/concepts/few_shot_prompting)**: A technique for improving model performance by providing a few examples of the task to perform in the prompt.
- **[Example selectors](/docs/concepts/example_selectors)**: Used to select the most relevant examples from a dataset based on a given input. Example selectors are used in few-shot prompting to select examples for a prompt.
- **[Async programming](/docs/concepts/async)**: The basics that one should know to use LangChain in an asynchronous context.
- **[Callbacks](/docs/concepts/callbacks)**: Callbacks enable the execution of custom auxiliary code in built-in components. Callbacks are used to stream outputs from LLMs in LangChain, trace the intermediate steps of an application, and more.
- **[Tracing](/docs/concepts/tracing)**: The process of recording the steps that an application takes to go from input to output. Tracing is essential for debugging and diagnosing issues in complex applications.
- **[Evaluation](/docs/concepts/evaluation)**: The process of assessing the performance and effectiveness of AI applications. This involves testing the model's responses against a set of predefined criteria or benchmarks to ensure it meets the desired quality standards and fulfills the intended purpose. This process is vital for building reliable applications.
## Glossary
@@ -51,16 +49,18 @@ The conceptual guide will not cover step-by-step instructions or specific implem
- **[batch](/docs/concepts/runnables)**: Use to execute a Runnable with a batch of inputs.
- **[bind_tools](/docs/concepts/chat_models#bind-tools)**: Allows models to interact with tools.
- **[Caching](/docs/concepts/chat_models#caching)**: Storing results to avoid redundant calls to a chat model.
- **[Chat Models](/docs/concepts/multimodality#chat-models)**: Chat models that handle multiple data modalities.
- **[Configurable Runnables](/docs/concepts/runnables#configurable-Runnables)**: Creating configurable Runnables.
- **[Chat models](/docs/concepts/multimodality#chat-models)**: Chat models that handle multiple data modalities.
- **[Configurable runnables](/docs/concepts/runnables#configurable-Runnables)**: Creating configurable Runnables.
- **[Context window](/docs/concepts/chat_models#context-window)**: The maximum size of input a chat model can process.
- **[Conversation patterns](/docs/concepts/chat_history#conversation-patterns)**: Common patterns in chat interactions.
- **[Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html)**: LangChain's representation of a document.
- **[Embedding models](/docs/concepts/multimodality#embedding-models)**: Models that generate vector embeddings for various data types.
- **[HumanMessage](/docs/concepts/messages#humanmessage)**: Represents a message from a human user.
- **[InjectedState](/docs/concepts/tools#injectedstate)**: A state injected into a tool function.
- **[InjectedStore](/docs/concepts/tools#injectedstore)**: A store that can be injected into a tool for data persistence.
- **[InjectedToolArg](/docs/concepts/tools#injectedtoolarg)**: Mechanism to inject arguments into tool functions.
- **[input and output types](/docs/concepts/runnables#input-and-output-types)**: Types used for input and output in Runnables.
- **[Integration packages](/docs/concepts/architecture#partner-packages)**: Third-party packages that integrate with LangChain.
- **[invoke](/docs/concepts/runnables)**: A standard method to invoke a Runnable.
- **[JSON mode](/docs/concepts/structured_outputs#json-mode)**: Returning responses in JSON format.
- **[langchain-community](/docs/concepts/architecture#langchain-community)**: Community-driven components for LangChain.
@@ -69,23 +69,21 @@ The conceptual guide will not cover step-by-step instructions or specific implem
- **[langgraph](/docs/concepts/architecture#langgraph)**: Powerful orchestration layer for LangChain. Use to build complex pipelines and workflows.
- **[langserve](/docs/concepts/architecture#langserve)**: Use to deploy LangChain Runnables as REST endpoints. Uses FastAPI. Works primarily for LangChain Runnables; it does not currently integrate with LangGraph.
- **[Managing chat history](/docs/concepts/chat_history#managing-chat-history)**: Techniques to maintain and manage the chat history.
- **[Multimodality](/docs/concepts/chat_models#multimodality)**: Capability to process different types of data like text, audio, and images.
- **[OpenAI format](/docs/concepts/messages#openai-format)**: OpenAI's message format for chat models.
- **[Partner packages](/docs/concepts/architecture#partner-packages)**: Third-party packages that integrate with LangChain.
- **[Propagation of RunnableConfig](/docs/concepts/runnables#propagation-runnableconfig)**: Propagating configuration through Runnables. Read if working with Python 3.9 or 3.10 and async.
- **[Propagation of RunnableConfig](/docs/concepts/runnables#propagation-RunnableConfig)**: Propagating configuration through Runnables. Read if working with Python 3.9 or 3.10 and async.
- **[rate-limiting](/docs/concepts/chat_models#rate-limiting)**: Client side rate limiting for chat models.
- **[RemoveMessage](/docs/concepts/messages#remove-message)**: An abstraction used to remove a message from chat history, used primarily in LangGraph.
- **[role](/docs/concepts/messages#role)**: Represents the role (e.g., user, assistant) of a chat message.
- **[RunnableConfig](/docs/concepts/runnables#runnableconfig)**: Use to pass run time information to Runnables (e.g., `run_name`, `run_id`, `tags`, `metadata`, `max_concurrency`, `recursion_limit`, `configurable`).
- **[RunnableConfig](/docs/concepts/runnables#RunnableConfig)**: Use to pass run time information to Runnables (e.g., `run_name`, `run_id`, `tags`, `metadata`, `max_concurrency`, `recursion_limit`, `configurable`).
- **[Standard parameters for chat models](/docs/concepts/chat_models#standard-parameters)**: Parameters such as API key, `temperature`, and `max_tokens`.
- **[stream](/docs/concepts/streaming)**: Use to stream output from a Runnable or a graph.
- **[Tokenization](/docs/concepts/tokens)**: The process of converting data into tokens and vice versa.
- **[Tokens](/docs/concepts/tokens)**: The basic unit that a language model reads, processes, and generates.
- **[Tokens](/docs/concepts/tokens)**: The basic unit that a language model reads, processes, and generates under the hood.
- **[Tool artifacts](/docs/concepts/tools#tool-artifacts)**: Add artifacts to the output of a tool that will not be sent to the model, but will be available for downstream processing.
- **[Tool binding](/docs/concepts/tool_calling#tool-binding)**: Binding tools to models.
- **[@tool](/docs/concepts/tools#@tool)**: Decorator for creating tools in LangChain.
- **[Toolkits](/docs/concepts/tools#toolkits)**: A collection of tools that can be used together.
- **[ToolMessage](/docs/concepts/messages#toolmessage)**: Represents a message that contains the results of a tool execution.
- **[Vectorstores](/docs/concepts/vectorstores)**: Datastores specialized for storing and efficiently searching vector embeddings.
- **[Vector stores](/docs/concepts/vectorstores)**: Datastores specialized for storing and efficiently searching vector embeddings.
- **[with_structured_output](/docs/concepts/chat_models#with-structured-output)**: A helper method for chat models that natively support [tool calling](/docs/concepts/tool_calling) to get structured output matching a given schema specified via Pydantic, JSON schema or a function.
- **[with_types](/docs/concepts/runnables#with_types)**: Method to overwrite the input and output types of a runnable. Useful when working with complex LCEL chains and deploying with LangServe.

View File

@@ -2,7 +2,7 @@
## Overview
LangChain provides a key-value store interface for storing and retrieving data.
LangChain provides a key-value store interface for storing and retrieving data.
LangChain includes a [`BaseStore`](https://python.langchain.com/api_reference/core/stores/langchain_core.stores.BaseStore.html) interface,
which allows for storage of arbitrary data. However, LangChain components that require KV-storage accept a
@@ -21,8 +21,8 @@ The key-value store interface in LangChain is used primarily for:
Please see these how-to guides for more information:
* [How to cache embeddings guide](https://python.langchain.com/docs/how_to/caching_embeddings/).
* [How to retrieve using multiple vectors per document](https://python.langchain.com/docs/how_to/custom_retriever/).
* [How to cache embeddings guide](/docs/how_to/caching_embeddings/).
* [How to retrieve using multiple vectors per document](/docs/how_to/custom_retriever/).
## Interface
@@ -33,6 +33,6 @@ All [`BaseStores`](https://python.langchain.com/api_reference/core/stores/langch
- `mdelete(key: Sequence[str]) -> None`: delete multiple keys
- `yield_keys(prefix: Optional[str] = None) -> Iterator[str]`: yield all keys in the store, optionally filtering by a prefix
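For illustration, a minimal sketch of these methods using LangChain's in-memory implementation (`InMemoryStore` from `langchain_core`):
```python
from langchain_core.stores import InMemoryStore

store = InMemoryStore()

# Write two key-value pairs, then read them back (missing keys return None)
store.mset([("user:1", {"name": "Ada"}), ("user:2", {"name": "Grace"})])
print(store.mget(["user:1", "user:3"]))  # [{'name': 'Ada'}, None]

# Delete a key and list the remaining keys by prefix
store.mdelete(["user:2"])
print(list(store.yield_keys(prefix="user:")))  # ['user:1']
```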
## Available Integrations
## Integrations
Please reference the [stores integration page](/docs/integrations/stores/) for a list of available key-value store integrations.

View File

@@ -11,7 +11,7 @@ This means that you describe what you want to happen, rather than how you want i
We often refer to a `Runnable` created using LCEL as a "chain". It's important to remember that a "chain" is a `Runnable` and it implements the full [Runnable Interface](/docs/concepts/runnables).
:::note
* The [LCEL cheatsheet](https://python.langchain.com/docs/how_to/lcel_cheatsheet/) shows common patterns that involve the Runnable interface and LCEL expressions.
* The [LCEL cheatsheet](/docs/how_to/lcel_cheatsheet/) shows common patterns that involve the Runnable interface and LCEL expressions.
* Please see the following list of [how-to guides](/docs/how_to/#langchain-expression-language-lcel) that cover common tasks with LCEL.
* A list of built-in `Runnables` can be found in the [LangChain Core API Reference](https://python.langchain.com/api_reference/core/runnables.html). Many of these Runnables are useful when composing custom "chains" in LangChain using LCEL.
:::
@@ -22,7 +22,7 @@ LangChain optimizes the run-time execution of chains built with LCEL in a number
- **Optimize parallel execution**: Run Runnables in parallel using [RunnableParallel](#RunnableParallel) or run multiple inputs through a given chain in parallel using the [Runnable Batch API](/docs/concepts/runnables#batch). Parallel execution can significantly reduce the latency as processing can be done in parallel instead of sequentially.
- **Guarantee Async support**: Any chain built with LCEL can be run asynchronously using the [Runnable Async API](/docs/concepts/runnables#async-api). This can be useful when running chains in a server environment where you want to handle a large number of requests concurrently.
- **Simplify streaming**: LCEL chains can be streamed, allowing for incremental output as the chain is executed. LangChain can optimize the streaming of the output to minimize the time-to-first-token (time elapsed until the first chunk of output from a [chat model](/docs/concepts/chat_models) or [llm](/docs/concepts/llms) comes out).
- **Simplify streaming**: LCEL chains can be streamed, allowing for incremental output as the chain is executed. LangChain can optimize the streaming of the output to minimize the time-to-first-token (time elapsed until the first chunk of output from a [chat model](/docs/concepts/chat_models) or [llm](/docs/concepts/text_llms) comes out).
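As a rough sketch of a chain built with LCEL (assuming the `langchain-openai` package and a configured OpenAI API key; any chat model works here):
```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Compose a prompt, model, and parser into a single Runnable with LCEL
prompt = ChatPromptTemplate.from_template("Tell me a short fact about {topic}.")
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

# The composed chain automatically supports invoke, batch, stream,
# and their async counterparts:
print(chain.invoke({"topic": "whales"}))
for chunk in chain.stream({"topic": "owls"}):
    print(chunk, end="")
```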
Other benefits include:
@@ -210,7 +210,7 @@ lambda x: x + 1.invoke(some_input)
```
:::
## Legacy Chains
## Legacy chains
LCEL aims to provide consistency around behavior and customization over legacy subclassed chains such as `LLMChain` and
`ConversationalRetrievalChain`. Many of these legacy chains hide important details like prompts, and as a wider variety

View File

@@ -1,3 +0,0 @@
# Large Language Models (LLMs)
Please see the [Chat Model Concept Guide](/docs/concepts/chat_models) page for more information.

View File

@@ -30,7 +30,7 @@ Roles are used to distinguish between different types of messages in a conversat
| **user** | Represents input from a user interacting with the model, usually in the form of text or other interactive input. |
| **assistant** | Represents a response from the model, which can include text or a request to invoke tools. |
| **tool** | A message used to pass the results of a tool invocation back to the model after external data or processing has been retrieved. Used with chat models that support [tool calling](/docs/concepts/tool_calling). |
| **function (legacy)** | This is a legacy role, corresponding to OpenAI's legacy function-calling API. **tool** role should be used instead. |
| **function** (legacy) | This is a legacy role, corresponding to OpenAI's legacy function-calling API. **tool** role should be used instead. |
### Content
@@ -145,7 +145,7 @@ An `AIMessage` has the following attributes. The attributes which are **standard
| `content` | Raw | Usually a string, but can be a list of content blocks. See [content](#content) for details. |
| `tool_calls` | Standardized | Tool calls associated with the message. See [tool calling](/docs/concepts/tool_calling) for details. |
| `invalid_tool_calls` | Standardized | Tool calls with parsing errors associated with the message. See [tool calling](/docs/concepts/tool_calling) for details. |
| `usage_metadata` | Standardized | Usage metadata for a message, such as [token counts](/docs/concepts/tokens). See [Usage Metadata API Reference](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.UsageMetadata.html) |
| `usage_metadata` | Standardized | Usage metadata for a message, such as [token counts](/docs/concepts/tokens). See [Usage Metadata API Reference](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.UsageMetadata.html). |
| `id` | Standardized | An optional unique identifier for the message, ideally provided by the provider/model that created the message. |
| `response_metadata` | Raw | Response metadata, e.g., response headers, logprobs, token counts. |
@@ -241,4 +241,4 @@ chat_model.invoke([
At the moment, the output of the model will be in terms of LangChain messages, so you will need to convert the output to the OpenAI format if you
need OpenAI format for the output as well.
The [convert_to_openai_messages](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.convert_to_openai_messages.html) utility function can be used to convert from LangChain messages to OpenAI format.
The [convert_to_openai_messages](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.convert_to_openai_messages.html) utility function can be used to convert from LangChain messages to OpenAI format.
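A minimal sketch of the conversion:
```python
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.messages.utils import convert_to_openai_messages

messages = [HumanMessage(content="Hello!"), AIMessage(content="Hi there!")]
print(convert_to_openai_messages(messages))
# [{'role': 'user', 'content': 'Hello!'}, {'role': 'assistant', 'content': 'Hi there!'}]
```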

View File

@@ -8,7 +8,7 @@
- **Embedding Models**: Embedding Models can represent multimodal content, embedding various forms of data—such as text, images, and audio—into vector spaces.
- **Vector Stores**: Vector stores could search over embeddings that represent multimodal data, enabling retrieval across different types of information.
## Multimodality in Chat models
## Multimodality in chat models
:::info Pre-requisites
* [Chat models](/docs/concepts/chat_models)
@@ -26,7 +26,7 @@ Multimodal support is still relatively new and less common, model providers have
#### Inputs
Some models can accept multimodal inputs, such as images, audio, video, or files. The types of multimodal inputs supported depend on the model provider. For instance, [Google's Gemini](https://python.langchain.com/docs/integrations/chat/google_generative_ai/) supports documents like PDFs as inputs.
Some models can accept multimodal inputs, such as images, audio, video, or files. The types of multimodal inputs supported depend on the model provider. For instance, [Google's Gemini](/docs/integrations/chat/google_generative_ai/) supports documents like PDFs as inputs.
Most chat models that support **multimodal inputs** also accept those values in OpenAI's content blocks format. So far this is restricted to image inputs. For models like Gemini which support video and other bytes input, the APIs also support the native, model-specific representations.
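For example, an image can be passed using OpenAI-style content blocks. This is a sketch: the URL is a placeholder, and `chat_model` is assumed to be a chat model that supports image inputs.
```python
from langchain_core.messages import HumanMessage

message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe this image."},
        # Placeholder URL; many providers also accept base64-encoded image data
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
    ]
)
# response = chat_model.invoke([message])
```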
@@ -53,7 +53,7 @@ integration documentation for the correct format. Find the integration in the [c
Virtually no popular chat models support multimodal outputs at the time of writing (October 2024).
The only exception is OpenAI's chat model ([gpt-4o-audio-preview](https://python.langchain.com/docs/integrations/chat/openai/)), which can generate audio outputs.
The only exception is OpenAI's chat model ([gpt-4o-audio-preview](/docs/integrations/chat/openai/)), which can generate audio outputs.
Multimodal outputs will appear as part of the [AIMessage](/docs/concepts/messages/#aimessage) response object.
@@ -80,7 +80,7 @@ As use cases involving multimodal search and retrieval tasks become more common,
## Multimodality in vector stores
:::info Prerequisites
* [Vectorstores](/docs/concepts/vectorstores)
* [Vector stores](/docs/concepts/vectorstores)
:::
Vector stores are databases for storing and retrieving embeddings, which are typically used in search and retrieval tasks. Similar to embeddings, vector stores are currently optimized for text-based data.

View File

@@ -7,7 +7,7 @@
The information here refers to parsers that take text output from a model and try to parse it into a more structured representation.
More and more models are supporting function (or tool) calling, which handles this automatically.
It is recommended to use function/tool calling rather than output parsing.
See documentation for that [here](/docs/concepts/#function-tool-calling).
See documentation for that [here](/docs/concepts/tool_calling).
:::
@@ -26,7 +26,7 @@ LangChain has lots of different types of output parsers. This is a list of outpu
| Name | Supports Streaming | Has Format Instructions | Calls LLM | Input Type | Output Type | Description |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------|-------------------------|-----------|--------------------|----------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [JSON](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html#langchain_core.output_parsers.json.JsonOutputParser) | ✅ | ✅ | | `str` \| `Message` | JSON object | Returns a JSON object as specified. You can specify a Pydantic model and it will return JSON for that model. Probably the most reliable output parser for getting structured data that does NOT use function calling. |
| [JSON](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html#langchain_core.output_parsers.json.JsonOutputParser) | ✅ | ✅ | | `str` \| `Message` | JSON object | Returns a JSON object as specified. You can specify a Pydantic model and it will return JSON for that model. Probably the most reliable output parser for getting structured data that does NOT use function calling. |
| [XML](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html#langchain_core.output_parsers.xml.XMLOutputParser) | ✅ | ✅ | | `str` \| `Message` | `dict` | Returns a dictionary of tags. Use when XML output is needed. Use with models that are good at writing XML (like Anthropic's). |
| [CSV](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.list.CommaSeparatedListOutputParser.html#langchain_core.output_parsers.list.CommaSeparatedListOutputParser) | ✅ | ✅ | | `str` \| `Message` | `List[str]` | Returns a list of comma separated values. |
| [OutputFixing](https://python.langchain.com/api_reference/langchain/output_parsers/langchain.output_parsers.fix.OutputFixingParser.html#langchain.output_parsers.fix.OutputFixingParser) | | | ✅ | `str` \| `Message` | | Wraps another output parser. If that output parser errors, then this will pass the error message and the bad output to an LLM and ask it to fix the output. |
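As a quick sketch, output parsers are Runnables, so they compose directly into chains:
```python
from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser()
# Parsers accept a string (or a message) and return the structured result
print(parser.invoke('{"answer": 42}'))  # {'answer': 42}

# In a chain, the parser transforms the model's raw output:
# chain = prompt | model | parser
```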

View File

@@ -1,4 +1,4 @@
# Retrieval Augmented Generation (RAG)
# Retrieval augmented generation (RAG)
:::info[Prerequisites]
@@ -15,7 +15,7 @@ The system then incorporates this retrieved information into the model's prompt.
The model uses the provided context to generate a response to the query.
By bridging the gap between vast language models and dynamic, targeted information retrieval, RAG is a powerful technique for building more capable and reliable AI systems.
## Key Concepts
## Key concepts
![Conceptual Overview](/img/rag_concepts.png)

View File

@@ -3,7 +3,7 @@
:::info[Prerequisites]
* [Retrievers](/docs/concepts/retrievers/)
* [Vectorstores](/docs/concepts/vectorstores/)
* [Vector stores](/docs/concepts/vectorstores/)
* [Embeddings](/docs/concepts/embedding_models/)
* [Text splitters](/docs/concepts/text_splitters/)
@@ -39,7 +39,7 @@ This translation enables more intuitive and flexible interactions with complex d
(2) **Information retrieval**: Search queries are used to fetch information from various retrieval systems.
## Query Analysis
## Query analysis
While users typically prefer to interact with retrieval systems using natural language, retrieval systems may require specific query syntax or benefit from particular keywords.
Query analysis serves as a bridge between raw user input and optimized search queries. Some common applications of query analysis include:
@@ -49,7 +49,7 @@ Query analysis serves as a bridge between raw user input and optimized search qu
Query analysis employs models to transform or construct optimized search queries from raw user input.
### Query Re-writing
### Query re-writing
Retrieval systems should ideally handle a wide spectrum of user inputs, from simple and poorly worded queries to complex, multi-faceted questions.
To achieve this versatility, a popular approach is to use models to transform raw user queries into more effective search queries.
@@ -78,7 +78,7 @@ from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage
# Define a Pydantic model to enforce the output structure
# Define a Pydantic model to enforce the output structure
class Questions(BaseModel):
questions: List[str] = Field(
description="A list of sub-questions related to the input query."
@@ -107,7 +107,7 @@ See our RAG from Scratch videos for a few different specific approaches:
:::
### Query Construction
### Query construction
Query analysis also can focus on translating natural language queries into specialized query languages or filters.
This translation is crucial for effectively interacting with various types of databases that house structured or semi-structured data.
@@ -149,7 +149,7 @@ retriever = SelfQueryRetriever.from_llm(
:::
## Information Retrieval
## Information retrieval
### Common retrieval systems

View File

@@ -4,7 +4,7 @@
:::info[Prerequisites]
* [Vectorstores](/docs/concepts/vectorstores/)
* [Vector stores](/docs/concepts/vectorstores/)
* [Embeddings](/docs/concepts/embedding_models/)
* [Text splitters](/docs/concepts/text_splitters/)
@@ -54,13 +54,13 @@ Retrievers return a list of [Document](https://api.python.langchain.com/en/lates
Despite the flexibility of the retriever interface, a few common types of retrieval systems are frequently used.
### Search APIs
### Search APIs
It's important to note that retrievers don't need to actually *store* documents.
For example, retrievers can be built on top of search APIs that simply return search results!
See our retriever integrations with [Amazon Kendra](https://python.langchain.com/docs/integrations/retrievers/amazon_kendra_retriever/) or [Wikipedia Search](https://python.langchain.com/docs/integrations/retrievers/wikipedia/).
See our retriever integrations with [Amazon Kendra](/docs/integrations/retrievers/amazon_kendra_retriever/) or [Wikipedia Search](/docs/integrations/retrievers/wikipedia/).
### Relational or Graph Database
### Relational or graph database
Retrievers can be built on top of relational or graph databases.
In these cases, [query analysis](/docs/concepts/retrieval/) techniques that construct a structured query from natural language are critical.
@@ -73,7 +73,7 @@ For example, you can build a retriever for a SQL database using text-to-SQL conv
:::
### Lexical Search
### Lexical search
As discussed in our conceptual review of [retrieval](/docs/concepts/retrieval/), many search engines are based upon matching words in a query to the words in each document.
[BM25](https://en.wikipedia.org/wiki/Okapi_BM25#:~:text=BM25%20is%20a%20bag%2Dof,slightly%20different%20components%20and%20parameters.) and [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) are [two popular lexical search algorithms](https://cameronrwolfe.substack.com/p/the-basics-of-ai-powered-vector-search?utm_source=profile&utm_medium=reader2).
@@ -87,9 +87,9 @@ LangChain has retrievers for many popular lexical search algorithms / engines.
:::
### Vectorstore
### Vector store
[Vectorstores](/docs/concepts/vectorstores/) are a powerful and efficient way to index and retrieve unstructured data.
[Vector stores](/docs/concepts/vectorstores/) are a powerful and efficient way to index and retrieve unstructured data.
A vector store can be used as a retriever by calling the `as_retriever()` method.
```python
@@ -106,7 +106,7 @@ This is particularly useful when you have multiple retrievers that are good at f
It is easy to create an [ensemble retriever](/docs/how_to/ensemble_retriever/) that combines multiple retrievers with linear weighted scores:
```python
# initialize the ensemble retriever
# Initialize the ensemble retriever
ensemble_retriever = EnsembleRetriever(
retrievers=[bm25_retriever, vector_store_retriever], weights=[0.5, 0.5]
)
@@ -115,7 +115,7 @@ ensemble_retriever = EnsembleRetriever(
When ensembling, how do we combine search results from many retrievers?
This motivates the concept of re-ranking, which takes the output of multiple retrievers and combines them using a more sophisticated algorithm such as [Reciprocal Rank Fusion (RRF)](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf).
### Source Document Retention
### Source document retention
Many retrievers utilize some kind of index to make documents easily searchable.
The process of indexing can include a transformation step (e.g., vectorstores often use document splitting).

View File

@@ -1,4 +1,4 @@
# Runnable Interface
# Runnable interface
The Runnable interface is foundational for working with LangChain components, and it's implemented across many of them, such as [language models](/docs/concepts/chat_models), [output parsers](/docs/concepts/output_parsers), [retrievers](/docs/concepts/retrievers), [compiled LangGraph graphs](
https://langchain-ai.github.io/langgraph/concepts/low_level/#compiling-your-graph) and more.
@@ -10,7 +10,7 @@ This guide covers the main concepts and methods of the Runnable interface, which
* A list of built-in `Runnables` can be found in the [LangChain Core API Reference](https://python.langchain.com/api_reference/core/runnables.html). Many of these Runnables are useful when composing custom "chains" in LangChain using the [LangChain Expression Language (LCEL)](/docs/concepts/lcel).
:::
## Overview of Runnable Interface
## Overview of runnable interface
The Runnable interface defines a standard set of methods that allow a Runnable component to be:
@@ -23,7 +23,7 @@ The Runnable way defines a standard interface that allows a Runnable component t
Please review the [LCEL Cheatsheet](/docs/how_to/lcel_cheatsheet) for some common patterns that involve the Runnable interface and LCEL expressions.
<a id="batch"></a>
### Optimized Parallel Execution (Batch)
### Optimized parallel execution (batch)
<span data-heading-keywords="batch"></span>
LangChain Runnables offer a built-in `batch` (and `batch_as_completed`) API that allow you to process multiple inputs in parallel.
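A minimal sketch using `RunnableLambda`:
```python
from langchain_core.runnables import RunnableLambda

double = RunnableLambda(lambda x: x * 2)

# Process several inputs in parallel; results preserve input order
print(double.batch([1, 2, 3]))  # [2, 4, 6]

# batch_as_completed yields (index, output) pairs as each input finishes
for idx, output in double.batch_as_completed([1, 2, 3]):
    print(idx, output)
```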
@@ -46,19 +46,19 @@ The async versions `abatch` and `abatch_as_completed` rely on asyncio's
:::
:::tip
When processing a large number of inputs using `batch` or `batch_as_completed`, users may want to control the maximum number of parallel calls. This can be done by setting the `max_concurrency` attribute in the `RunnableConfig` dictionary. See the [RunnableConfig](/docs/concepts/runnables#runnableconfig) for more information.
When processing a large number of inputs using `batch` or `batch_as_completed`, users may want to control the maximum number of parallel calls. This can be done by setting the `max_concurrency` attribute in the `RunnableConfig` dictionary. See the [RunnableConfig](/docs/concepts/runnables#RunnableConfig) for more information.
Chat Models also have a built-in [rate limiter](/docs/concepts/chat_models#rate-limiting) that can be used to control the rate at which requests are made.
:::
### Asynchronous Support
### Asynchronous support
<span data-heading-keywords="async-api"></span>
Runnables expose an asynchronous API, allowing them to be called using the `await` syntax in Python. Asynchronous methods can be identified by the "a" prefix (e.g., `ainvoke`, `abatch`, `astream`, `abatch_as_completed`).
Please refer to the [Async Programming with LangChain](/docs/concepts/async) guide for more details.
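A minimal sketch:
```python
import asyncio

from langchain_core.runnables import RunnableLambda

double = RunnableLambda(lambda x: x * 2)

async def main() -> None:
    # Async counterparts of invoke and batch
    print(await double.ainvoke(2))         # 4
    print(await double.abatch([1, 2, 3]))  # [2, 4, 6]

asyncio.run(main())
```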
## Streaming APIs
## Streaming APIs
<span data-heading-keywords="streaming-api"></span>
Streaming is critical in making applications based on LLMs feel responsive to end-users.
@@ -71,7 +71,7 @@ Runnables expose the following three streaming APIs:
Please refer to the [Streaming Conceptual Guide](/docs/concepts/streaming) for more details on how to stream in LangChain.
## Input and Output Types
## Input and output types
Every `Runnable` is characterized by an input and output type. These input and output types can be any Python object, and are defined by the Runnable itself.
@@ -94,7 +94,7 @@ The **input type** and **output type** vary by component:
Please refer to the individual component documentation for more information on the input and output types and how to use them.
### Inspecting Schemas
### Inspecting schemas
:::note
This is an advanced feature that is unnecessary for most users. You should probably
@@ -121,7 +121,7 @@ Please see the [Configurable Runnables](#configurable-runnables) section for mor
| `get_config_jsonschema` | Gives the JSONSchema of the config schema for the Runnable. |
#### with_types
#### with_types
LangChain will automatically try to infer the input and output types of a Runnable based on available information.
@@ -131,7 +131,7 @@ Currently, this inference does not work well for more complex Runnables that are
## RunnableConfig
Any of the methods that are used to execute the runnable (e.g., `invoke`, `batch`, `stream`, `astream_events`) accept a second argument called
`RunnableConfig` ([API Reference](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.config.RunnableConfig.html#runnableconfig)). This argument is a dictionary that contains configuration for the Runnable that will be used
`RunnableConfig` ([API Reference](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.config.RunnableConfig.html#RunnableConfig)). This argument is a dictionary that contains configuration for the Runnable that will be used
at run time during the execution of the runnable.
A `RunnableConfig` can have any of the following properties defined:
@@ -212,7 +212,7 @@ attempting to stream data using `astream_events` and `astream_log` as these meth
rely on proper propagation of [callbacks](/docs/concepts/callbacks) defined inside of `RunnableConfig`.
:::
### Setting Custom Run Name, Tags, and Metadata
### Setting custom run name, tags, and metadata
The `run_name`, `tags`, and `metadata` attributes of the `RunnableConfig` dictionary can be used to set custom values for the run name, tags, and metadata for a given Runnable.
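For example (a sketch; the names, tags, and metadata values are arbitrary):
```python
from langchain_core.runnables import RunnableLambda

greet = RunnableLambda(lambda name: f"Hello, {name}!")

# run_name applies only to this run; tags and metadata are also
# propagated to sub-calls and surfaced in tracing tools like LangSmith
greet.invoke(
    "LangChain",
    config={"run_name": "greeting", "tags": ["demo"], "metadata": {"env": "dev"}},
)
```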
@@ -229,7 +229,7 @@ The attributes will also be propagated to [callbacks](/docs/concepts/callbacks),
* [How-to trace with LangChain](https://docs.smith.langchain.com/how_to_guides/tracing/trace_with_langchain)
:::
### Setting Run ID
### Setting run ID
:::note
This is an advanced feature that is unnecessary for most users.
@@ -255,10 +255,10 @@ some_runnable.invoke(
}
)
# do something with the run_id
# Do something with the run_id
```
### Setting Recursion Limit
### Setting recursion limit
:::note
This is an advanced feature that is unnecessary for most users.
@@ -266,7 +266,7 @@ This is an advanced feature that is unnecessary for most users.
Some Runnables may return other Runnables, which can lead to infinite recursion if not handled properly. To prevent this, you can set a `recursion_limit` in the `RunnableConfig` dictionary. This will limit the number of times a Runnable can recurse.
### Setting Max Concurrency
### Setting max concurrency
If using the `batch` or `batch_as_completed` methods, you can set the `max_concurrency` attribute in the `RunnableConfig` dictionary to control the maximum number of parallel calls to make. This can be useful when you want to limit the number of parallel calls to prevent overloading a server or API.
@@ -274,7 +274,7 @@ If using the `batch` or `batch_as_completed` methods, you can set the `max_concu
:::tip
If you're trying to rate limit the number of requests made by a **Chat Model**, you can use the built-in [rate limiter](/docs/concepts/chat_models#rate-limiting) instead of setting `max_concurrency`, which will be more effective.
See the [How to handle rate limits](https://python.langchain.com/docs/how_to/chat_model_rate_limiting/) guide for more information.
See the [How to handle rate limits](/docs/how_to/chat_model_rate_limiting/) guide for more information.
:::
### Setting configurable
@@ -290,7 +290,7 @@ a `session_id` / `conversation_id` to keep track of conversation history.
In addition, you can use it to specify any custom configuration options to pass to any [Configurable Runnable](#configurable-runnables) that they create.
### Setting Callbacks
### Setting callbacks
Use this option to configure [callbacks](/docs/concepts/callbacks) for the runnable at
runtime. The callbacks will be passed to all sub-calls made by the runnable.
@@ -312,10 +312,10 @@ Please read the [Callbacks Conceptual Guide](/docs/concepts/callbacks) for more
:::important
If you're using Python 3.9 or 3.10 in an async environment, you must propagate
the `RunnableConfig` manually to sub-calls in some cases. Please see the
[Propagating RunnableConfig](#propagation-of-runnableconfig) section for more information.
[Propagating RunnableConfig](#propagation-of-RunnableConfig) section for more information.
:::
## Creating a Runnable from a function
## Creating a runnable from a function
You may need to create a custom Runnable that runs arbitrary logic. This is especially
useful if using [LangChain Expression Language (LCEL)](/docs/concepts/lcel) to compose
@@ -333,7 +333,7 @@ Users should not try to subclass Runnables to create a new custom Runnable. It i
much more complex and error-prone than simply using `RunnableLambda` or `RunnableGenerator`.
:::
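A minimal sketch of both helpers:
```python
from typing import Iterator

from langchain_core.runnables import RunnableGenerator, RunnableLambda

# RunnableLambda wraps a plain function
shout = RunnableLambda(lambda text: text.upper())
print(shout.invoke("hello"))  # HELLO

# RunnableGenerator wraps a generator function, preserving streaming:
# it receives an iterator of input chunks and yields output chunks
def upper_chunks(chunks: Iterator[str]) -> Iterator[str]:
    for chunk in chunks:
        yield chunk.upper()

streaming_shout = RunnableGenerator(upper_chunks)
print(list(streaming_shout.stream("hi")))  # ['HI']
```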
## Configurable Runnables
## Configurable runnables
:::note
This is an advanced feature that is unnecessary for most users.

View File

@@ -106,7 +106,7 @@ While this API is available for use with [LangGraph](/docs/concepts/architecture
For chains constructed using **LCEL**, the `.stream()` method only streams the output of the final step from the chain. This might be sufficient for some applications, but as you build more complex chains of several LLM calls together, you may want to use the intermediate values of the chain alongside the final output. For example, you may want to return sources alongside the final generation when building a chat-over-documents app.
There are ways to do this [using callbacks](/docs/concepts/#callbacks-1), or by constructing your chain in such a way that it passes intermediate
There are ways to do this [using callbacks](/docs/concepts/callbacks), or by constructing your chain in such a way that it passes intermediate
values to the end with something like chained [`.assign()`](/docs/how_to/passthrough/) calls, but LangChain also includes an
`.astream_events()` method that combines the flexibility of callbacks with the ergonomics of `.stream()`. When called, it returns an iterator
which yields [various types of events](/docs/how_to/streaming/#event-reference) that you can filter and process according
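A rough sketch (assumes an existing `chain` Runnable, e.g. a prompt piped into a chat model, and must be called from an async context):
```python
async def stream_tokens(chain, question: str) -> None:
    # Filter the event stream down to just the chat model's token chunks
    async for event in chain.astream_events(question, version="v2"):
        if event["event"] == "on_chat_model_stream":
            print(event["data"]["chunk"].content, end="")
```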
@@ -140,7 +140,7 @@ See [this guide](/docs/how_to/streaming/#using-stream-events) for more detailed
To write custom data to the stream, you will need to choose one of the following methods based on the component you are working with:
1. LangGraph's [StreamWriter](https://langchain-ai.github.io/langgraph/reference/types/#langgraph.types.StreamWriter) can be used to write custom data that will surface through **stream** and **astream** APIs when working with LangGraph. **Important:** this is a LangGraph feature, so it is not available when working with pure LCEL. See [how to stream custom data](https://langchain-ai.github.io/langgraph/how-tos/streaming-content/) for more information.
2. [dispatch_events](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.manager.dispatch_custom_event.html#) / [adispatch_events](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.manager.adispatch_custom_event.html) can be used to write custom data that will be surfaced through the **astream_events** API. See [how to dispatch custom callback events](https://python.langchain.com/docs/how_to/callbacks_custom_events/#astream-events-api) for more information.
2. [dispatch_events](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.manager.dispatch_custom_event.html#) / [adispatch_events](https://python.langchain.com/api_reference/core/callbacks/langchain_core.callbacks.manager.adispatch_custom_event.html) can be used to write custom data that will be surfaced through the **astream_events** API. See [how to dispatch custom callback events](/docs/how_to/callbacks_custom_events/#astream-events-api) for more information.
## "Auto-Streaming" Chat Models
@@ -188,4 +188,4 @@ Please see the following how-to guides for specific examples of streaming in Lan
For writing custom data to the stream, please see the following resources:
* If using LangGraph, see [how to stream custom data](https://langchain-ai.github.io/langgraph/how-tos/streaming-content/).
* If using LCEL, see [how to dispatch custom callback events](https://python.langchain.com/docs/how_to/callbacks_custom_events/#astream-events-api).
* If using LCEL, see [how to dispatch custom callback events](/docs/how_to/callbacks_custom_events/#astream-events-api).

View File

@@ -1,4 +1,4 @@
# Structured Outputs
# Structured outputs
## Overview
@@ -9,7 +9,7 @@ This need motivates the concept of structured output, where models can be instru
![Structured output](/img/structured_output.png)
## Key Concepts
## Key concepts
**(1) Schema definition:** The output structure is represented as a schema, which can be defined in several ways.
**(2) Returning structured output:** The model is given this schema, and is instructed to return output that conforms to it.
@@ -62,17 +62,17 @@ With a schema defined, we need a way to instruct the model to use it.
While one approach is to include this schema in the prompt and *ask nicely* for the model to use it, this is not recommended.
Several more powerful methods that utilize native features in the model provider's API are available.
### Using Tool Calling
### Using tool calling
Many [model providers support](/docs/integrations/chat/) tool calling, a concept discussed in more detail in our [tool calling guide](/docs/concepts/tool_calling/).
In short, tool calling involves binding a tool to a model and, when appropriate, the model can *decide* to call this tool and ensure its response conforms to the tool's schema.
With this in mind, the central concept is strightforward: *simply bind our schema to a model as a tool!*
With this in mind, the central concept is straightforward: *simply bind our schema to a model as a tool!*
Here is an example using the `ResponseFormatter` schema defined above:
```python
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o", temperature=0)
# Bind ResponseFormatter schema as a tool to the model
# Bind the ResponseFormatter schema as a tool to the model
model_with_tools = model.bind_tools([ResponseFormatter])
# Invoke the model
ai_msg = model_with_tools.invoke("What is the powerhouse of the cell?")
@@ -86,7 +86,7 @@ This dictionary can be optionally parsed into a Pydantic object, matching our or
ai_msg.tool_calls[0]["args"]
{'answer': "The powerhouse of the cell is the mitochondrion. Mitochondria are organelles that generate most of the cell's supply of adenosine triphosphate (ATP), which is used as a source of chemical energy.",
'followup_question': 'What is the function of ATP in the cell?'}
# Parse the dictionary into a Pydantic object
# Parse the dictionary into a Pydantic object
pydantic_object = ResponseFormatter.model_validate(ai_msg.tool_calls[0]["args"])
```
@@ -106,7 +106,7 @@ ai_msg.content
```
One important point to flag: the model *still* returns a string, which needs to be parsed into a JSON object.
This can, of course, simply use the `json` library or a JSON output parser if you need more adavanced functionality.
This can, of course, be done simply with the `json` library, or with a JSON output parser if you need more advanced functionality.
See this [how-to guide on the JSON output parser](/docs/how_to/output_parser_json) for more details.
```python
@@ -117,7 +117,7 @@ json_object = json.loads(ai_msg.content)
## Structured output method
There a few challenges when producing structured output with the above methods:
There are a few challenges when producing structured output with the above methods:
(1) If using tool calling, tool call arguments need to be parsed from a dictionary back to the original schema.
@@ -136,7 +136,7 @@ This both binds the schema to the model as a tool and parses the output to the s
model_with_structure = model.with_structured_output(ResponseFormatter)
# Invoke the model
structured_output = model_with_structure.invoke("What is the powerhouse of the cell?")
# Get back the Pydantic object
# Get back the Pydantic object
structured_output
ResponseFormatter(answer="The powerhouse of the cell is the mitochondrion. Mitochondria are organelles that generate most of the cell's supply of adenosine triphosphate (ATP), which is used as a source of chemical energy.", followup_question='What is the function of ATP in the cell?')
```
@@ -145,4 +145,4 @@ ResponseFormatter(answer="The powerhouse of the cell is the mitochondrion. Mitoc
For more details on usage, see our [how-to guide](/docs/how_to/structured_output/#the-with_structured_output-method).
:::
:::

View File

@@ -0,0 +1,10 @@
# String-in, string-out LLMs
:::tip
You are probably looking for the [Chat Model Concept Guide](/docs/concepts/chat_models) page for more information.
:::
LangChain has implementations for older language models that take a string as input and return a string as output. These models are typically named without the "Chat" prefix (e.g., `Ollama`, `Anthropic`, `OpenAI`, etc.), and may include the "LLM" suffix (e.g., `OllamaLLM`, `AnthropicLLM`, `OpenAILLM`, etc.). These models implement the [BaseLLM](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.llms.BaseLLM.html#langchain_core.language_models.llms.BaseLLM) interface.
Users should almost exclusively use the newer [Chat Models](/docs/concepts/chat_models), as most
model providers have adopted a chat-like interface for interacting with language models.

View File

@@ -1,4 +1,4 @@
# Text Splitters
# Text splitters
<span data-heading-keywords="text splitter,text splitting"></span>
:::info[Prerequisites]

View File

@@ -1,4 +1,4 @@
# Tool Calling
# Tool calling
:::info[Prerequisites]
* [Tools](/docs/concepts/tools)
@@ -19,7 +19,7 @@ You will sometimes hear the term `function calling`. We use this term interchang
![Conceptual overview of tool calling](/img/tool_calling_concept.png)
## Key Concepts
## Key concepts
**(1) Tool Creation:** Use the [@tool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.convert.tool.html) decorator to create a [tool](/docs/concepts/tools). A tool is an association between a function and its schema.
**(2) Tool Binding:** The tool needs to be connected to a model that supports tool calling. This gives the model awareness of the tool and the associated input schema required by the tool.
@@ -44,7 +44,7 @@ model_with_tools = model.bind_tools(tools)
response = model_with_tools.invoke(user_input)
```
## Tool Creation
## Tool creation
The recommended way to create a tool is using the `@tool` decorator.
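For example, a minimal sketch of the `multiply` tool used throughout this page:
```python
from langchain_core.tools import tool

@tool
def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

# The decorator infers the tool's name, description, and input schema
print(multiply.invoke({"a": 2, "b": 3}))  # 6
```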
@@ -65,7 +65,7 @@ def multiply(a: int, b: int) -> int:
:::
## Tool Binding
## Tool binding
[Many](https://platform.openai.com/docs/guides/function-calling) [model providers](https://platform.openai.com/docs/guides/function-calling) support tool calling.
@@ -95,7 +95,7 @@ def multiply(a: int, b: int) -> int:
llm_with_tools = tool_calling_model.bind_tools([multiply])
```
## Tool Calling
## Tool calling
![Diagram of a tool call by a model](/img/tool_call_example.png)
@@ -141,7 +141,7 @@ For more details on usage, see our [how-to guides](/docs/how_to/#tools)!
When designing [tools](/docs/concepts/tools/) to be used by a model, it is important to keep in mind that:
* Models that have explicit [tool-calling APIs](/docs/concepts/#functiontool-calling) will be better at tool calling than non-fine-tuned models.
* Models that have explicit [tool-calling APIs](/docs/concepts/tool_calling) will be better at tool calling than non-fine-tuned models.
* Models will perform better if the tools have well-chosen names and descriptions.
* Simple, narrowly scoped tools are easier for models to use than complex tools.
* Asking the model to select from a large list of tools poses challenges for the model.

View File

@@ -10,7 +10,7 @@ The **tool** abstraction in LangChain associates a python **function** with a **
**Tools** can be passed to [chat models](/docs/concepts/chat_models) that support [tool calling](/docs/concepts/tool_calling) allowing the model to request the execution of a specific function with specific inputs.
## Key Concepts
## Key concepts
- Tools are a way to encapsulate a function and its schema in a way that can be passed to a chat model.
- Create tools using the [@tool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.convert.tool.html) decorator, which simplifies the process of tool creation, supporting the following:
@@ -18,7 +18,7 @@ The **tool** abstraction in LangChain associates a python **function** with a **
- Defining tools that return **artifacts** (e.g. images, dataframes, etc.)
- Hiding input arguments from the schema (and hence from the model) using **injected tool arguments**.
## Tool Interface
## Tool interface
The tool interface is defined in the [BaseTool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.base.BaseTool.html#langchain_core.tools.base.BaseTool) class which is a subclass of the [Runnable Interface](/docs/concepts/runnables).
@@ -70,9 +70,9 @@ print(multiply.name) # multiply
print(multiply.description) # Multiply two numbers.
print(multiply.args)
# {
# 'type': 'object',
# 'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}},
# 'required': ['a', 'b']
# 'type': 'object',
# 'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}},
# 'required': ['a', 'b']
# }
```
@@ -134,14 +134,14 @@ def user_specific_tool(input_data: str, user_id: InjectedToolArg) -> str:
Annotating the `user_id` argument with `InjectedToolArg` tells LangChain that this argument should not be exposed as part of the
tool's schema.
See [how to pass run time values to tools](https://python.langchain.com/docs/how_to/tool_runtime/) for more details on how to use `InjectedToolArg`.
See [how to pass run time values to tools](/docs/how_to/tool_runtime/) for more details on how to use `InjectedToolArg`.
### RunnableConfig
You can use the `RunnableConfig` object to pass custom run time values to tools.
If you need to access the [RunnableConfig](/docs/concepts/runnables/#runnableconfig) object from within a tool, this can be done by using the `RunnableConfig` annotation in the tool's function signature.
If you need to access the [RunnableConfig](/docs/concepts/runnables/#RunnableConfig) object from within a tool, this can be done by using the `RunnableConfig` annotation in the tool's function signature.
```python
from langchain_core.runnables import RunnableConfig
@@ -160,7 +160,7 @@ The `config` will not be part of the tool's schema and will be injected at runti
:::note
You may need to manually propagate the `config` object to sub-calls. This happens if you're working with Python 3.9 / 3.10 in an [async](/docs/concepts/async) environment.
Please read [Propagation RunnableConfig](/docs/concepts/runnables#propagation-runnableconfig) for more details to learn how to propagate the `RunnableConfig` down the call chain manually (or upgrade to Python 3.11 where this is no longer an issue).
Please read [Propagation RunnableConfig](/docs/concepts/runnables#propagation-RunnableConfig) for more details to learn how to propagate the `RunnableConfig` down the call chain manually (or upgrade to Python 3.11 where this is no longer an issue).
:::
### InjectedState
@@ -198,13 +198,13 @@ toolkit = ExampleToolkit(...)
tools = toolkit.get_tools()
```
## Related Resources
## Related resources
See the following resources for more information:
- [API Reference for @tool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.convert.tool.html)
- [How to create custom tools](https://python.langchain.com/docs/how_to/custom_tools/)
- [How to pass run time values to tools](https://python.langchain.com/docs/how_to/tool_runtime/)
- [How to create custom tools](/docs/how_to/custom_tools/)
- [How to pass run time values to tools](/docs/how_to/tool_runtime/)
- [All LangChain tool how-to guides](https://docs.langchain.com/docs/how_to/#tools)
- [Additional how-to guides that show usage with LangGraph](https://langchain-ai.github.io/langgraph/how-tos/tool-calling/)
- Tool integrations, see the [tool integration docs](https://docs.langchain.com/docs/integrations/tools/).

View File

@@ -0,0 +1,10 @@
# Tracing
<span data-heading-keywords="trace,tracing"></span>
A trace is essentially a series of steps that your application takes to go from input to output.
Traces contain individual steps called `runs`. These can be individual calls to a model, retriever, or tool, as well as sub-chains.
Tracing gives you observability inside your chains and agents, and is vital in diagnosing issues.
For a deeper dive, check out [this LangSmith conceptual guide](https://docs.smith.langchain.com/concepts/tracing).

View File

@@ -11,46 +11,60 @@
This conceptual overview focuses on text-based indexing and retrieval for simplicity.
However, embedding models can be [multi-modal](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings)
and vectorstores can be used to store and retrieve a variety of data types beyond text.
and vector stores can be used to store and retrieve a variety of data types beyond text.
:::
## Overview
Vectorstores are a powerful and efficient way to index and retrieve unstructured data.
They leverage vector [embeddings](/docs/concepts/embedding_models/), which are numerical representations of unstructured data that capture semantic meaning.
At their core, vectorstores utilize specialized data structures called vector indices.
These indices are designed to perform efficient similarity searches over embedding vectors, allowing for rapid retrieval of relevant information based on semantic similarity rather than exact keyword matches.
Vector stores are specialized data stores that enable indexing and retrieving information based on vector representations.
## Key concept
These vectors, called [embeddings](/docs/concepts/embedding_models/), capture the semantic meaning of data that has been embedded.
![Vectorstores](/img/vectorstores.png)
Vector stores are frequently used to search over unstructured data, such as text, images, and audio, to retrieve relevant information based on semantic similarity rather than exact keyword matches.
There are [many different types of vectorstores](/docs/integrations/vectorstores/).
LangChain provides a universal interface for working with them, providing standard methods for common operations.
![Vector stores](/img/vectorstores.png)
## Integrations
LangChain has a large number of vectorstore integrations, allowing users to easily switch between different vectorstore implementations.
Please see the [full list of LangChain vectorstore integrations](/docs/integrations/vectorstores/).
## Interface
LangChain provides a standard interface for working with vector stores, allowing users to easily switch between different vector store implementations.
The interface consists of basic methods for writing, deleting and searching for documents in the vector store.
The key methods are:
- `add_documents`: Add a list of documents to the vector store.
- `delete_documents`: Delete a list of documents from the vector store.
- `similarity_search`: Search for similar documents to a given query.
## Initialization
Most vector stores in LangChain accept an embedding model as an argument when initializing the vector store.
We will use LangChain's [InMemoryVectorStore](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.in_memory.InMemoryVectorStore.html) implementation to illustrate the API.
```python
from langchain_core.vectorstores import InMemoryVectorStore
# Initialize with an embedding model
vector_store = InMemoryVectorStore(embedding=SomeEmbeddingModel())
```
## Adding documents
Using [Pinecone](https://python.langchain.com/api_reference/pinecone/vectorstores/langchain_pinecone.vectorstore.PineconeVectorStore.html#langchain_pinecone.vectorstores.PineconeVectorStore) as an example, we initialize a vectorstore with the [embedding](/docs/concepts/embedding_models/) model we want to use:
To add documents, use the `add_documents` method.
```python
from pinecone import Pinecone
from langchain_openai import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore
# Initialize Pinecone
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
# Initialize with an embedding model
vector_store = PineconeVectorStore(index=pc.Index(index_name), embedding=OpenAIEmbeddings())
```
Given a vectorstore, we need the ability to add documents to it.
The `add_texts` and `add_documents` methods can be used to add texts (strings) and documents (LangChain [Document](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.base.Document.html) objects) to a vectorstore, respectively.
`Document` objects all have `page_content` and `metadata` attributes, making them a universal way to store unstructured text and associated metadata.
As an example, we can create a list of `Document` objects:
```python
from langchain_core.documents import Document
document_1 = Document(
    page_content="I had chocolate chip pancakes and scrambled eggs for breakfast this morning.",
    metadata={"source": "tweet"},
)

document_2 = Document(
    page_content="The weather forecast for tomorrow is cloudy and overcast, with a high of 62 degrees.",
    metadata={"source": "news"},
)
documents = [document_1, document_2]
vector_store.add_documents(documents=documents)
```
When we use the `add_documents` method to add the documents to the vectorstore, the vectorstore will use the provided embedding model to create an embedding of each document.
What happens if we add the same document twice?
Many vectorstores support [`upsert`](https://docs.pinecone.io/guides/data/upsert-data) functionality, which combines the functionality of inserting and updating records.
To use this, we simply supply a unique identifier for each document when we add it to the vectorstore using `add_documents` or `add_texts`.
If the record doesn't exist, it inserts a new record.
If the record already exists, it updates the existing record.
You should usually provide IDs for the documents you add to the vector store, so
that instead of adding the same document multiple times, you can update the existing document.
```python
# Given a list of documents, supply stable unique IDs so that re-adding
# a document updates it instead of duplicating it
vector_store.add_documents(documents=documents, ids=["doc1", "doc2"])
```
:::info[Further reading]
* See the [full list of LangChain vectorstore integrations](/docs/integrations/vectorstores/).
* See Pinecone's [documentation](https://docs.pinecone.io/guides/data/upsert-data) on the `upsert` method.
:::
## Delete
To delete documents, use the `delete` method, which takes a list of document IDs to delete.
```python
vector_store.delete(ids=["doc1"])
```
## Search
Vector stores embed and store the documents that have been added.
If we pass in a query, the vectorstore will embed the query, perform a similarity search over the embedded documents, and return the most similar ones.
This captures two important concepts: first, there needs to be a way to measure the similarity between the query and *any* [embedded](/docs/concepts/embedding_models/) document.
Second, there needs to be an algorithm to efficiently perform this similarity search across *all* embedded documents.
### Similarity metrics
A critical advantage of embedding vectors is that they can be compared using many simple mathematical operations:
- **Cosine Similarity**: Measures the cosine of the angle between two vectors.
- **Euclidean Distance**: Measures the straight-line distance between two points.
- **Dot Product**: Measures the projection of one vector onto another.
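To make these metrics concrete, here is a minimal, illustrative pure-Python sketch of cosine similarity between two embedding vectors; production vectorstores use optimized native implementations of the same idea:
```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 0.0], [0.7, 0.7]))  # ~0.707: somewhat similar
```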
The choice of similarity metric can sometimes be selected when initializing the vectorstore.
As an example, Pinecone allows the user to select the [similarity metric on index creation](/docs/integrations/vectorstores/pinecone/#initialization).
```python
pc.create_index(
    name=index_name,
    dimension=3072,
    metric="cosine",
)
```
Please refer to the documentation of the specific vectorstore you are using to see what similarity metrics are supported.
### Similarity search
Given a similarity metric to measure the distance between the embedded query and any embedded document, we need an algorithm to efficiently search over *all* the embedded documents to find the most similar ones.
There are various ways to do this. As an example, many vectorstores implement [HNSW (Hierarchical Navigable Small World)](https://www.pinecone.io/learn/series/faiss/hnsw/), a graph-based index structure that allows for efficient similarity search.
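For intuition, the baseline that index structures like HNSW approximate is a brute-force top-k scan, sketched below (illustrative only; real indices avoid comparing the query against every stored vector):
```python
import heapq
import math

def top_k(query: list[float], docs: list[list[float]], k: int = 2) -> list[tuple[float, int]]:
    """Return the k (score, index) pairs with highest cosine similarity."""
    def cos(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))
    # Brute force: score every stored vector, then keep the k best
    return heapq.nlargest(k, ((cos(query, d), i) for i, d in enumerate(docs)))
```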
### Metadata filtering
Many vectorstores also support filtering on document metadata. This allows structured filters to reduce the size of the similarity search space, combining two approaches:
1. **Semantic search**: Query the unstructured data directly, often via embedding or keyword similarity.
2. **Metadata search**: Apply a structured query to the metadata, filtering for specific documents.
Vector store support for metadata filtering is typically dependent on the underlying vector store implementation.
Here is example usage with [Pinecone](/docs/integrations/vectorstores/pinecone/#query-directly), where we filter for all documents that have the metadata key `source` with value `tweet`.
```python
results = vector_store.similarity_search(
    "LangChain provides abstractions to make working with LLMs easy",
    k=2,
    filter={"source": "tweet"},
)
```
View File

@@ -0,0 +1,109 @@
# Why langchain?
The goal of `langchain` the Python package and LangChain the company is to make it as easy as possible for developers to build applications that reason.
While LangChain originally started as a single open source package, it has evolved into a company and a whole ecosystem.
This page will talk about the LangChain ecosystem as a whole.
Most of the components within the LangChain ecosystem can be used by themselves - so if you feel particularly drawn to certain components but not others, that is totally fine! Pick and choose whichever components you like best.
## Features
There are several primary needs that LangChain aims to address:
1. **Standardized component interfaces:** The growing number of [models](/docs/integrations/chat/) and [related components](/docs/integrations/vectorstores/) for AI applications has resulted in a wide variety of different APIs that developers need to learn and use.
This diversity can make it challenging for developers to switch between providers or combine components when building applications.
LangChain exposes a standard interface for key components, making it easy to switch between providers.
2. **Orchestration:** As applications become more complex, combining multiple components and models, there's [a growing need to efficiently connect these elements into control flows](https://lilianweng.github.io/posts/2023-06-23-agent/) that can [accomplish diverse tasks](https://www.sequoiacap.com/article/generative-ais-act-o1/).
[Orchestration](https://en.wikipedia.org/wiki/Orchestration_(computing)) is crucial for building such applications.
3. **Observability and evaluation:** As applications become more complex, it becomes increasingly difficult to understand what is happening within them.
Furthermore, the pace of development can become rate-limited by the [paradox of choice](https://en.wikipedia.org/wiki/Paradox_of_choice):
for example, developers often wonder how to engineer their prompt or which LLM best balances accuracy, latency, and cost.
[Observability](https://en.wikipedia.org/wiki/Observability) and evaluations can help developers monitor their applications and rapidly answer these types of questions with confidence.
## Standardized component interfaces
LangChain provides common interfaces for components that are central to many AI applications.
As an example, all [chat models](/docs/concepts/chat_models/) implement the [BaseChatModel](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.chat_models.BaseChatModel.html) interface.
This provides a standard way to interact with chat models, supporting important but often provider-specific features like [tool calling](/docs/concepts/tool_calling/) and [structured outputs](/docs/concepts/structured_outputs/).
### Example: chat models
Many [model providers](/docs/concepts/chat_models/) support [tool calling](/docs/concepts/tool_calling/), a critical feature for many applications (e.g., [agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/)) that allows a developer to request model responses that match a particular schema.
The APIs for each provider differ.
LangChain's [chat model](/docs/concepts/chat_models/) interface provides a common way to bind [tools](/docs/concepts/tools) to a model in order to support [tool calling](/docs/concepts/tool_calling/):
```python
# Tool creation
tools = [my_tool]
# Tool binding
model_with_tools = model.bind_tools(tools)
```
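To see this end to end, here is a minimal sketch assuming `langchain-openai` is installed and `OPENAI_API_KEY` is set; the `multiply` tool and model name are illustrative:
```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

# Tool creation: the decorator derives the tool's schema from the signature and docstring
@tool
def multiply(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b

# Tool binding
model = ChatOpenAI(model="gpt-4o-mini")
model_with_tools = model.bind_tools([multiply])

response = model_with_tools.invoke("What is 6 times 7?")
print(response.tool_calls)  # e.g. [{"name": "multiply", "args": {"a": 6, "b": 7}, ...}]
```
Because the binding happens behind the shared chat model interface, the same two lines work across providers that support tool calling.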
Similarly, getting models to produce [structured outputs](/docs/concepts/structured_outputs/) is an extremely common use case.
Providers support different approaches for this, including [JSON mode or tool calling](https://platform.openai.com/docs/guides/structured-outputs), with different APIs.
LangChain's [chat model](/docs/concepts/chat_models/) interface provides a common way to produce structured outputs using the `with_structured_output()` method:
```python
# Define schema
schema = ...
# Bind schema to model
model_with_structure = model.with_structured_output(schema)
```
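As a minimal sketch of the same method with a concrete schema (again assuming `langchain-openai`; the `Joke` schema is illustrative):
```python
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

# Define schema
class Joke(BaseModel):
    """A joke to tell the user."""
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")

# Bind schema to model
model = ChatOpenAI(model="gpt-4o-mini")
model_with_structure = model.with_structured_output(Joke)

joke = model_with_structure.invoke("Tell me a joke about cats")
print(joke.setup, "-", joke.punchline)  # joke is a validated Joke instance
```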
### Example: retrievers
In the context of [RAG](/docs/concepts/rag/) and LLM application components, LangChain's [retriever](/docs/concepts/retrievers/) interface provides a standard way to connect to many different types of data services or databases (e.g., [vector stores](/docs/concepts/vectorstores) or databases).
The underlying implementation of the retriever depends on the type of data store or database you are connecting to, but all retrievers implement the [runnable interface](/docs/concepts/runnables/), meaning they can be invoked in a common manner.
```python
documents = my_retriever.invoke("What is the meaning of life?")
```
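A minimal sketch of building such a retriever from a vector store (assuming `langchain-openai`; the document content is illustrative):
```python
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings

vector_store = InMemoryVectorStore(embedding=OpenAIEmbeddings())
vector_store.add_documents([Document(page_content="The meaning of life is 42.")])

# Any vector store can be exposed through the common retriever interface
my_retriever = vector_store.as_retriever(search_kwargs={"k": 1})
documents = my_retriever.invoke("What is the meaning of life?")
print(documents[0].page_content)
```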
## Orchestration
While standardization for individual components is useful, we've increasingly seen that developers want to *combine* components into more complex applications.
This motivates the need for [orchestration](https://en.wikipedia.org/wiki/Orchestration_(computing)).
There are several common characteristics of LLM applications that this orchestration layer should support:
* **Complex control flow:** The application requires complex patterns such as cycles (e.g., a loop that reiterates until a condition is met).
* **[Persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/):** The application needs to maintain [short-term and / or long-term memory](https://langchain-ai.github.io/langgraph/concepts/memory/).
* **[Human-in-the-loop](https://langchain-ai.github.io/langgraph/concepts/human_in_the_loop/):** The application needs human interaction, e.g., pausing, reviewing, editing, approving certain steps.
The recommended way to do orchestration for these complex applications is [LangGraph](https://langchain-ai.github.io/langgraph/concepts/high_level/).
LangGraph is a library that gives developers a high degree of control by expressing the flow of the application as a set of nodes and edges.
LangGraph comes with built-in support for [persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/), [human-in-the-loop](https://langchain-ai.github.io/langgraph/concepts/human_in_the_loop/), [memory](https://langchain-ai.github.io/langgraph/concepts/memory/), and other features.
It's particularly well suited for building [agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/) or [multi-agent](https://langchain-ai.github.io/langgraph/concepts/multi_agent/) applications.
Importantly, individual LangChain components can be used within LangGraph nodes, but you can also use LangGraph **without** using LangChain components.
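As a minimal, illustrative sketch of the LangGraph programming model (assuming `langgraph` is installed; the single-node graph is deliberately trivial):
```python
from langchain_core.messages import AIMessage
from langgraph.graph import StateGraph, MessagesState, START, END

def respond(state: MessagesState):
    # Nodes are plain functions over the graph state; they can call chat models,
    # tools, or any other code
    return {"messages": [AIMessage(content="Hello!")]}

builder = StateGraph(MessagesState)
builder.add_node("respond", respond)
builder.add_edge(START, "respond")
builder.add_edge("respond", END)
graph = builder.compile()

result = graph.invoke({"messages": [("user", "Hi")]})
print(result["messages"][-1].content)  # "Hello!"
```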
:::info[Further reading]
Have a look at our free course, [Introduction to LangGraph](https://academy.langchain.com/courses/intro-to-langgraph), to learn more about how to use LangGraph to build complex applications.
:::
## Observability and evaluation
The pace of AI application development is often rate-limited by high-quality evaluations, because there is a paradox of choice:
Developers often wonder how to engineer their prompt or which LLM best balances accuracy, latency, and cost.
High quality tracing and evaluations can help you rapidly answer these types of questions with confidence.
[LangSmith](https://docs.smith.langchain.com/) is our platform that supports observability and evaluation for AI applications.
See our conceptual guides on [evaluations](https://docs.smith.langchain.com/concepts/evaluation) and [tracing](https://docs.smith.langchain.com/concepts/tracing) for more details.
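As a minimal sketch, enabling LangSmith tracing for existing LangChain code typically requires only environment variables (assuming you have a LangSmith API key):
```python
import os

# Illustrative only: with these set, LangChain runs are traced to LangSmith
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "<your-langsmith-api-key>"
```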
:::info[Further reading]
See our video playlist on [LangSmith tracing and evaluations](https://youtube.com/playlist?list=PLfaIDFEXuae0um8Fj0V4dHG37fGFU8Q5S&feature=shared) for more details.
:::
## Conclusion
LangChain offers standard interfaces for components that are central to many AI applications, which provides a few specific advantages:
- **Ease of swapping providers:** It allows you to swap out different component providers without having to change the underlying code.
- **Advanced features:** It provides common methods for more advanced features, such as [streaming](/docs/concepts/runnables/#streaming) and [tool calling](/docs/concepts/tool_calling/).
[LangGraph](https://langchain-ai.github.io/langgraph/concepts/high_level/) makes it possible to orchestrate complex applications (e.g., [agents](/docs/concepts/agents/)) and provides features like [persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/), [human-in-the-loop](https://langchain-ai.github.io/langgraph/concepts/human_in_the_loop/), and [memory](https://langchain-ai.github.io/langgraph/concepts/memory/).
[LangSmith](https://docs.smith.langchain.com/) makes it possible to iterate with confidence on your applications, by providing LLM-specific observability and a framework for testing and evaluating your application.

View File

@@ -92,8 +92,8 @@ To quote the Diataxis website:
Some examples include:
- [Retrieval conceptual docs](/docs/concepts/#retrieval)
- [Chat model conceptual docs](/docs/concepts/#chat-models)
- [Retrieval conceptual docs](/docs/concepts/retrieval)
- [Chat model conceptual docs](/docs/concepts/chat_models)
Here are some high-level tips on writing a good conceptual guide:

View File

@@ -75,7 +75,7 @@
"\n",
"To obtain scores from a vector store retriever, we wrap the underlying vector store's `.similarity_search_with_score` method in a short function that packages scores into the associated document's metadata.\n",
"\n",
"We add a `@chain` decorator to the function to create a [Runnable](/docs/concepts/#langchain-expression-language) that can be used similarly to a typical retriever."
"We add a `@chain` decorator to the function to create a [Runnable](/docs/concepts/lcel) that can be used similarly to a typical retriever."
]
},
{

View File

@@ -31,11 +31,11 @@
"## Concepts\n",
"\n",
"Concepts we will cover are:\n",
"- Using [language models](/docs/concepts/#chat-models), in particular their tool calling ability\n",
"- Creating a [Retriever](/docs/concepts/#retrievers) to expose specific information to our agent\n",
"- Using a Search [Tool](/docs/concepts/#tools) to look up things online\n",
"- [`Chat History`](/docs/concepts/#chat-history), which allows a chatbot to \"remember\" past interactions and take them into account when responding to follow-up questions. \n",
"- Debugging and tracing your application using [LangSmith](/docs/concepts/#langsmith)\n",
"- Using [language models](/docs/concepts/chat_models), in particular their tool calling ability\n",
"- Creating a [Retriever](/docs/concepts/retrievers) to expose specific information to our agent\n",
"- Using a Search [Tool](/docs/concepts/tools) to look up things online\n",
"- [`Chat History`](/docs/concepts/chat_history), which allows a chatbot to \"remember\" past interactions and take them into account when responding to follow-up questions. \n",
"- Debugging and tracing your application using [LangSmith](https://docs.smith.langchain.com/)\n",
"\n",
"## Setup\n",
"\n",
@@ -415,7 +415,7 @@
"source": [
"## Create the agent\n",
"\n",
"Now that we have defined the tools and the LLM, we can create the agent. We will be using a tool calling agent - for more information on this type of agent, as well as other options, see [this guide](/docs/concepts/#agent_types/).\n",
"Now that we have defined the tools and the LLM, we can create the agent. We will be using a tool calling agent - for more information on this type of agent, as well as other options, see [this guide](/docs/concepts/agents/).\n",
"\n",
"We can first choose the prompt we want to use to guide the agent.\n",
"\n",
@@ -457,7 +457,7 @@
"id": "f8014c9d",
"metadata": {},
"source": [
"Now, we can initialize the agent with the LLM, the prompt, and the tools. The agent is responsible for taking in input and deciding what actions to take. Crucially, the Agent does not execute those actions - that is done by the AgentExecutor (next step). For more information about how to think about these components, see our [conceptual guide](/docs/concepts/#agents).\n",
"Now, we can initialize the agent with the LLM, the prompt, and the tools. The agent is responsible for taking in input and deciding what actions to take. Crucially, the Agent does not execute those actions - that is done by the AgentExecutor (next step). For more information about how to think about these components, see our [conceptual guide](/docs/concepts/agents).\n",
"\n",
"Note that we are passing in the `model`, not `model_with_tools`. That is because `create_tool_calling_agent` will call `.bind_tools` for us under the hood."
]

View File

@@ -19,7 +19,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/lcel)\n",
"- [Chaining runnables](/docs/how_to/sequence/)\n",
"- [Calling runnables in parallel](/docs/how_to/parallel/)\n",
"- [Custom functions](/docs/how_to/functions/)\n",
@@ -29,7 +29,7 @@
"\n",
"An alternate way of [passing data through](/docs/how_to/passthrough) steps of a chain is to leave the current values of the chain state unchanged while assigning a new value under a given key. The [`RunnablePassthrough.assign()`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html#langchain_core.runnables.passthrough.RunnablePassthrough.assign) static method takes an input value and adds the extra arguments passed to the assign function.\n",
"\n",
"This is useful in the common [LangChain Expression Language](/docs/concepts/#langchain-expression-language) pattern of additively creating a dictionary to use as input to a later step.\n",
"This is useful in the common [LangChain Expression Language](/docs/concepts/lcel) pattern of additively creating a dictionary to use as input to a later step.\n",
"\n",
"Here's an example:"
]

View File

@@ -21,7 +21,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/lcel)\n",
"- [Chaining runnables](/docs/how_to/sequence/)\n",
"- [Tool calling](/docs/how_to/tool_calling)\n",
"\n",

View File

@@ -10,7 +10,7 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Callbacks](/docs/concepts/#callbacks)\n",
"- [Callbacks](/docs/concepts/callbacks)\n",
"- [Custom callback handlers](/docs/how_to/custom_callbacks)\n",
":::\n",
"\n",

View File

@@ -10,7 +10,7 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Callbacks](/docs/concepts/#callbacks)\n",
"- [Callbacks](/docs/concepts/callbacks)\n",
"- [Custom callback handlers](/docs/how_to/custom_callbacks)\n",
"- [Chaining runnables](/docs/how_to/sequence)\n",
"- [Attach runtime arguments to a Runnable](/docs/how_to/binding)\n",

View File

@@ -10,7 +10,7 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Callbacks](/docs/concepts/#callbacks)\n",
"- [Callbacks](/docs/concepts/callbacks)\n",
"- [Custom callback handlers](/docs/how_to/custom_callbacks)\n",
"\n",
":::\n",

View File

@@ -10,13 +10,13 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Callbacks](/docs/concepts/#callbacks)\n",
"- [Callbacks](/docs/concepts/callbacks)\n",
"- [Custom callback handlers](/docs/how_to/custom_callbacks)\n",
"- [Astream Events API](/docs/concepts/#astream_events) the `astream_events` method will surface custom callback events.\n",
"- [Astream Events API](/docs/concepts/streaming/#astream_events) the `astream_events` method will surface custom callback events.\n",
":::\n",
"\n",
"In some situations, you may want to dipsatch a custom callback event from within a [Runnable](/docs/concepts/#runnable-interface) so it can be surfaced\n",
"in a custom callback handler or via the [Astream Events API](/docs/concepts/#astream_events).\n",
"In some situations, you may want to dipsatch a custom callback event from within a [Runnable](/docs/concepts/runnables) so it can be surfaced\n",
"in a custom callback handler or via the [Astream Events API](/docs/concepts/streaming/#astream_events).\n",
"\n",
"For example, if you have a long running tool with multiple steps, you can dispatch custom events between the steps and use these custom events to monitor progress.\n",
"You could also surface these custom events to an end user of your application to show them how the current task is progressing.\n",
@@ -64,7 +64,7 @@
"source": [
"## Astream Events API\n",
"\n",
"The most useful way to consume custom events is via the [Astream Events API](/docs/concepts/#astream_events).\n",
"The most useful way to consume custom events is via the [Astream Events API](/docs/concepts/streaming/#astream_events).\n",
"\n",
"We can use the `async` `adispatch_custom_event` API to emit custom events in an async setting. \n",
"\n",

View File

@@ -10,7 +10,7 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Callbacks](/docs/concepts/#callbacks)\n",
"- [Callbacks](/docs/concepts/callbacks)\n",
"- [Custom callback handlers](/docs/how_to/custom_callbacks)\n",
"\n",
":::\n",

View File

@@ -10,8 +10,8 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [LLMs](/docs/concepts/#llms)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [LLMs](/docs/concepts/text_llms)\n",
"\n",
":::\n",
"\n",

View File

@@ -10,8 +10,8 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [LLMs](/docs/concepts/#llms)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [LLMs](/docs/concepts/text_llms)\n",
":::\n",
"\n",
"\n",

View File

@@ -10,7 +10,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"\n",
":::\n",
"\n",

View File

@@ -10,9 +10,9 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Chatbots](/docs/concepts/#messages)\n",
"- [Chatbots](/docs/concepts/messages)\n",
"- [Agents](/docs/tutorials/agents)\n",
"- [Chat history](/docs/concepts/#chat-history)\n",
"- [Chat history](/docs/concepts/chat_history)\n",
"\n",
":::\n",
"\n",

View File

@@ -21,7 +21,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/lcel)\n",
"- [Chaining runnables](/docs/how_to/sequence/)\n",
"- [Binding runtime arguments](/docs/how_to/binding/)\n",
"\n",

View File

@@ -42,7 +42,7 @@
"source": [
"LangChain [tools](/docs/concepts#tools) are interfaces that an agent, chain, or chat model can use to interact with the world. See [here](/docs/how_to/#tools) for how-to guides covering tool-calling, built-in tools, custom tools, and more information.\n",
"\n",
"LangChain tools-- instances of [BaseTool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.BaseTool.html)-- are [Runnables](/docs/concepts/#runnable-interface) with additional constraints that enable them to be invoked effectively by language models:\n",
"LangChain tools-- instances of [BaseTool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.BaseTool.html)-- are [Runnables](/docs/concepts/runnables) with additional constraints that enable them to be invoked effectively by language models:\n",
"\n",
"- Their inputs are constrained to be serializable, specifically strings and Python `dict` objects;\n",
"- They contain names and descriptions indicating how and when they should be used;\n",
@@ -259,9 +259,9 @@
"source": [
"## In agents\n",
"\n",
"Below we will incorporate LangChain Runnables as tools in an [agent](/docs/concepts/#agents) application. We will demonstrate with:\n",
"Below we will incorporate LangChain Runnables as tools in an [agent](/docs/concepts/agents) application. We will demonstrate with:\n",
"\n",
"- a document [retriever](/docs/concepts/#retrievers);\n",
"- a document [retriever](/docs/concepts/retrievers);\n",
"- a simple [RAG](/docs/tutorials/rag/) chain, allowing an agent to delegate relevant queries to it.\n",
"\n",
"We first instantiate a chat model that supports [tool calling](/docs/how_to/tool_calling/):\n",

View File

@@ -10,7 +10,7 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Callbacks](/docs/concepts/#callbacks)\n",
"- [Callbacks](/docs/concepts/callbacks)\n",
"\n",
":::\n",
"\n",

View File

@@ -10,7 +10,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"\n",
":::\n",
"\n",
@@ -28,7 +28,7 @@
"\n",
"Chat models take messages as inputs and return a message as output. \n",
"\n",
"LangChain has a few [built-in message types](/docs/concepts/#message-types):\n",
"LangChain has a few [built-in message types](/docs/concepts/messages):\n",
"\n",
"| Message Type | Description |\n",
"|-----------------------|-------------------------------------------------------------------------------------------------|\n",

View File

@@ -50,7 +50,7 @@
"\n",
"If you are looking for a simple string representation of text that is embedded in a PDF, the method below is appropriate. It will return a list of [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) objects-- one per page-- containing a single string of the page's text in the Document's `page_content` attribute. It will not parse text in images or scanned PDF pages. Under the hood it uses the [pypydf](https://pypdf.readthedocs.io/en/stable/) Python library.\n",
"\n",
"LangChain [document loaders](/docs/concepts/#document-loaders) implement `lazy_load` and its async variant, `alazy_load`, which return iterators of `Document` objects. We will use these below."
"LangChain [document loaders](/docs/concepts/document_loaders) implement `lazy_load` and its async variant, `alazy_load`, which return iterators of `Document` objects. We will use these below."
]
},
{
@@ -147,7 +147,7 @@
"\n",
"### Vector search over PDFs\n",
"\n",
"Once we have loaded PDFs into LangChain `Document` objects, we can index them (e.g., a RAG application) in the usual way. Below we use OpenAI embeddings, although any LangChain [embeddings](https://python.langchain.com/docs/concepts/#embedding-models) model will suffice."
"Once we have loaded PDFs into LangChain `Document` objects, we can index them (e.g., a RAG application) in the usual way. Below we use OpenAI embeddings, although any LangChain [embeddings](https://python.langchain.com/docs/concepts/embedding_models) model will suffice."
]
},
{
@@ -804,7 +804,7 @@
"\n",
"Many modern LLMs support inference over multimodal inputs (e.g., images). In some applications-- such as question-answering over PDFs with complex layouts, diagrams, or scans-- it may be advantageous to skip the PDF parsing, instead casting a PDF page to an image and passing it to a model directly. This allows a model to reason over the two dimensional content on the page, instead of a \"one-dimensional\" string representation.\n",
"\n",
"In principle we can use any LangChain [chat model](/docs/concepts/#chat-models) that supports multimodal inputs. A list of these models is documented [here](/docs/integrations/chat/). Below we use OpenAI's `gpt-4o-mini`.\n",
"In principle we can use any LangChain [chat model](/docs/concepts/chat_models) that supports multimodal inputs. A list of these models is documented [here](/docs/integrations/chat/). Below we use OpenAI's `gpt-4o-mini`.\n",
"\n",
"First we define a short utility function to convert a PDF page to a base64-encoded image:"
]

View File

@@ -389,7 +389,7 @@
"source": [
"### Vector search over page content\n",
"\n",
"Once we have loaded the page contents into LangChain `Document` objects, we can index them (e.g., for a RAG application) in the usual way. Below we use OpenAI [embeddings](/docs/concepts/#embedding-models), although any LangChain embeddings model will suffice."
"Once we have loaded the page contents into LangChain `Document` objects, we can index them (e.g., for a RAG application) in the usual way. Below we use OpenAI [embeddings](/docs/concepts/embedding_models), although any LangChain embeddings model will suffice."
]
},
{

View File

@@ -10,7 +10,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following:\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/lcel)\n",
"- [How to turn any function into a runnable](/docs/how_to/functions)\n",
"\n",
":::\n",

View File

@@ -11,9 +11,9 @@
"import Compatibility from \"@theme/Compatibility\";\n",
"\n",
"<Prerequisites titlesAndLinks={[\n",
" [\"Chat models\", \"/docs/concepts/#chat-models\"],\n",
" [\"Few-shot-prompting\", \"/docs/concepts/#few-shot-prompting\"],\n",
" [\"LangSmith\", \"/docs/concepts/#langsmith\"],\n",
" [\"Chat models\", \"/docs/concepts/chat_models\"],\n",
" [\"Few-shot-prompting\", \"/docs/concepts/few-shot-prompting\"],\n",
" [\"LangSmith\", \"https://docs.smith.langchain.com/\"],\n",
"]} />\n",
"\n",
"\n",

View File

@@ -20,10 +20,10 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Prompt templates](/docs/concepts/#prompt-templates)\n",
"- [Example selectors](/docs/concepts/#example-selectors)\n",
"- [LLMs](/docs/concepts/#llms)\n",
"- [Vectorstores](/docs/concepts/#vector-stores)\n",
"- [Prompt templates](/docs/concepts/prompt_templates)\n",
"- [Example selectors](/docs/concepts/example_selectors)\n",
"- [LLMs](/docs/concepts/text_llms)\n",
"- [Vectorstores](/docs/concepts/vectorstores)\n",
"\n",
":::\n",
"\n",

View File

@@ -20,10 +20,10 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Prompt templates](/docs/concepts/#prompt-templates)\n",
"- [Example selectors](/docs/concepts/#example-selectors)\n",
"- [Chat models](/docs/concepts/#chat-model)\n",
"- [Vectorstores](/docs/concepts/#vector-stores)\n",
"- [Prompt templates](/docs/concepts/prompt_templates)\n",
"- [Example selectors](/docs/concepts/example_selectors)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [Vectorstores](/docs/concepts/vectorstores)\n",
"\n",
":::\n",
"\n",
@@ -33,7 +33,7 @@
"\n",
"The goal of few-shot prompt templates are to dynamically select examples based on an input, and then format the examples in a final prompt to provide for the model.\n",
"\n",
"**Note:** The following code examples are for chat models only, since `FewShotChatMessagePromptTemplates` are designed to output formatted [chat messages](/docs/concepts/#message-types) rather than pure strings. For similar few-shot prompt examples for pure string templates compatible with completion models (LLMs), see the [few-shot prompt templates](/docs/how_to/few_shot_examples/) guide."
"**Note:** The following code examples are for chat models only, since `FewShotChatMessagePromptTemplates` are designed to output formatted [chat messages](/docs/concepts/messages) rather than pure strings. For similar few-shot prompt examples for pure string templates compatible with completion models (LLMs), see the [few-shot prompt templates](/docs/how_to/few_shot_examples/) guide."
]
},
{

View File

@@ -21,7 +21,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/lcel)\n",
"- [Chaining runnables](/docs/how_to/sequence/)\n",
"\n",
":::\n",

View File

@@ -27,7 +27,7 @@ This highlights functionality that is core to using LangChain.
## LangChain Expression Language (LCEL)
[LangChain Expression Language](/docs/concepts/#langchain-expression-language-lcel) is a way to create arbitrary custom chains. It is built on the [Runnable](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html) protocol.
[LangChain Expression Language](/docs/concepts/lcel) is a way to create arbitrary custom chains. It is built on the [Runnable](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html) protocol.
[**LCEL cheatsheet**](/docs/how_to/lcel_cheatsheet/): For a quick overview of how to use the main LCEL primitives.
@@ -53,7 +53,7 @@ These are the core building blocks you can use when building applications.
### Prompt templates
[Prompt Templates](/docs/concepts/#prompt-templates) are responsible for formatting user input into a format that can be passed to a language model.
[Prompt Templates](/docs/concepts/prompt_templates) are responsible for formatting user input into a format that can be passed to a language model.
- [How to: use few shot examples](/docs/how_to/few_shot_examples)
- [How to: use few shot examples in chat models](/docs/how_to/few_shot_examples_chat/)
@@ -62,7 +62,7 @@ These are the core building blocks you can use when building applications.
### Example selectors
[Example Selectors](/docs/concepts/#example-selectors) are responsible for selecting the correct few shot examples to pass to the prompt.
[Example Selectors](/docs/concepts/example_selectors) are responsible for selecting the correct few shot examples to pass to the prompt.
- [How to: use example selectors](/docs/how_to/example_selectors)
- [How to: select examples by length](/docs/how_to/example_selectors_length_based)
@@ -73,7 +73,7 @@ These are the core building blocks you can use when building applications.
### Chat models
[Chat Models](/docs/concepts/#chat-models) are newer forms of language models that take messages in and output a message.
[Chat Models](/docs/concepts/chat_models) are newer forms of language models that take messages in and output a message.
- [How to: do function/tool calling](/docs/how_to/tool_calling)
- [How to: get models to return structured output](/docs/how_to/structured_output)
@@ -94,7 +94,7 @@ These are the core building blocks you can use when building applications.
### Messages
[Messages](/docs/concepts/#messages) are the input and output of chat models. They have some `content` and a `role`, which describes the source of the message.
[Messages](/docs/concepts/messages) are the input and output of chat models. They have some `content` and a `role`, which describes the source of the message.
- [How to: trim messages](/docs/how_to/trim_messages/)
- [How to: filter messages](/docs/how_to/filter_messages/)
@@ -102,7 +102,7 @@ These are the core building blocks you can use when building applications.
### LLMs
What LangChain calls [LLMs](/docs/concepts/#llms) are older forms of language models that take a string in and output a string.
What LangChain calls [LLMs](/docs/concepts/text_llms) are older forms of language models that take a string in and output a string.
- [How to: cache model responses](/docs/how_to/llm_caching)
- [How to: create a custom LLM class](/docs/how_to/custom_llm)
@@ -112,7 +112,7 @@ What LangChain calls [LLMs](/docs/concepts/#llms) are older forms of language mo
### Output parsers
[Output Parsers](/docs/concepts/#output-parsers) are responsible for taking the output of an LLM and parsing it into a more structured format.
[Output Parsers](/docs/concepts/output_parsers) are responsible for taking the output of an LLM and parsing it into a more structured format.
- [How to: use output parsers to parse an LLM response into structured format](/docs/how_to/output_parser_structured)
- [How to: parse JSON output](/docs/how_to/output_parser_json)
@@ -124,7 +124,7 @@ What LangChain calls [LLMs](/docs/concepts/#llms) are older forms of language mo
### Document loaders
[Document Loaders](/docs/concepts/#document-loaders) are responsible for loading documents from a variety of sources.
[Document Loaders](/docs/concepts/document_loaders) are responsible for loading documents from a variety of sources.
- [How to: load PDF files](/docs/how_to/document_loader_pdf)
- [How to: load web pages](/docs/how_to/document_loader_web)
@@ -138,7 +138,7 @@ What LangChain calls [LLMs](/docs/concepts/#llms) are older forms of language mo
### Text splitters
[Text Splitters](/docs/concepts/#text-splitters) take a document and split into chunks that can be used for retrieval.
[Text Splitters](/docs/concepts/text_splitters) take a document and split into chunks that can be used for retrieval.
- [How to: recursively split text](/docs/how_to/recursive_text_splitter)
- [How to: split by HTML headers](/docs/how_to/HTML_header_metadata_splitter)
@@ -152,20 +152,20 @@ What LangChain calls [LLMs](/docs/concepts/#llms) are older forms of language mo
### Embedding models
[Embedding Models](/docs/concepts/#embedding-models) take a piece of text and create a numerical representation of it.
[Embedding Models](/docs/concepts/embedding_models) take a piece of text and create a numerical representation of it.
- [How to: embed text data](/docs/how_to/embed_text)
- [How to: cache embedding results](/docs/how_to/caching_embeddings)
### Vector stores
[Vector stores](/docs/concepts/#vector-stores) are databases that can efficiently store and retrieve embeddings.
[Vector stores](/docs/concepts/vectorstores) are databases that can efficiently store and retrieve embeddings.
- [How to: use a vector store to retrieve data](/docs/how_to/vectorstores)
### Retrievers
[Retrievers](/docs/concepts/#retrievers) are responsible for taking a query and returning relevant documents.
[Retrievers](/docs/concepts/retrievers) are responsible for taking a query and returning relevant documents.
- [How to: use a vector store to retrieve data](/docs/how_to/vectorstore_retriever)
- [How to: generate multiple queries to retrieve data for](/docs/how_to/MultiQueryRetriever)
@@ -188,7 +188,7 @@ Indexing is the process of keeping your vectorstore in-sync with the underlying
### Tools
LangChain [Tools](/docs/concepts/#tools) contain a description of the tool (to pass to the language model) as well as the implementation of the function to call. Refer [here](/docs/integrations/tools/) for a list of pre-built tools.
LangChain [Tools](/docs/concepts/tools) contain a description of the tool (to pass to the language model) as well as the implementation of the function to call. Refer [here](/docs/integrations/tools/) for a list of pre-built tools.
- [How to: create tools](/docs/how_to/custom_tools)
- [How to: use built-in tools and toolkits](/docs/how_to/tools_builtin)
@@ -225,7 +225,7 @@ For in depth how-to guides for agents, please check out [LangGraph](https://lang
### Callbacks
[Callbacks](/docs/concepts/#callbacks) allow you to hook into the various stages of your LLM application's execution.
[Callbacks](/docs/concepts/callbacks) allow you to hook into the various stages of your LLM application's execution.
- [How to: pass in callbacks at runtime](/docs/how_to/callbacks_runtime)
- [How to: attach callbacks to a module](/docs/how_to/callbacks_attach)

View File

@@ -10,12 +10,12 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/lcel)\n",
"- [Chaining runnables](/docs/how_to/sequence/)\n",
"\n",
":::\n",
"\n",
"Once you create a runnable with [LangChain Expression Language](/docs/concepts/#langchain-expression-language), you may often want to inspect it to get a better sense for what is going on. This notebook covers some methods for doing so.\n",
"Once you create a runnable with [LangChain Expression Language](/docs/concepts/lcel), you may often want to inspect it to get a better sense for what is going on. This notebook covers some methods for doing so.\n",
"\n",
"This guide shows some ways you can programmatically introspect the internal steps of chains. If you are instead interested in debugging issues in your chain, see [this section](/docs/how_to/debugging) instead.\n",
"\n",

View File

@@ -13,7 +13,7 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [LLMs](/docs/concepts/#llms)\n",
"- [LLMs](/docs/concepts/text_llms)\n",
":::\n",
"\n",
"## Using LangSmith\n",

View File

@@ -68,7 +68,7 @@
"\n",
"### Formatting prompts\n",
"\n",
"Some providers have [chat model](/docs/concepts/#chat-models) wrappers that takes care of formatting your input prompt for the specific local model you're using. However, if you are prompting local models with a [text-in/text-out LLM](/docs/concepts/#llms) wrapper, you may need to use a prompt tailed for your specific model.\n",
"Some providers have [chat model](/docs/concepts/chat_models) wrappers that takes care of formatting your input prompt for the specific local model you're using. However, if you are prompting local models with a [text-in/text-out LLM](/docs/concepts/text_llms) wrapper, you may need to use a prompt tailed for your specific model.\n",
"\n",
"This can [require the inclusion of special tokens](https://huggingface.co/blog/llama2#how-to-prompt-llama-2). [Here's an example for LLaMA 2](https://smith.langchain.com/hub/rlm/rag-prompt-llama).\n",
"\n",

View File

@@ -10,7 +10,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"\n",
":::\n",
"\n",

View File

@@ -9,7 +9,7 @@
"\n",
"Substantial performance degradations in [RAG](/docs/tutorials/rag) applications have been [documented](https://arxiv.org/abs/2307.03172) as the number of retrieved documents grows (e.g., beyond ten). In brief: models are liable to miss relevant information in the middle of long contexts.\n",
"\n",
"By contrast, queries against vector stores will typically return documents in descending order of relevance (e.g., as measured by cosine similarity of [embeddings](/docs/concepts/#embedding-models)).\n",
"By contrast, queries against vector stores will typically return documents in descending order of relevance (e.g., as measured by cosine similarity of [embeddings](/docs/concepts/embedding_models)).\n",
"\n",
"To mitigate the [\"lost in the middle\"](https://arxiv.org/abs/2307.03172) effect, you can re-order documents after retrieval such that the most relevant documents are positioned at extrema (e.g., the first and last pieces of context), and the least relevant documents are positioned in the middle. In some cases this can help surface the most relevant information to LLMs.\n",
"\n",

View File

@@ -25,8 +25,8 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chaining runnables](/docs/how_to/sequence/)\n",
"- [Prompt templates](/docs/concepts/#prompt-templates)\n",
"- [Chat Messages](/docs/concepts/#message-types)\n",
"- [Prompt templates](/docs/concepts/prompt_templates)\n",
"- [Chat Messages](/docs/concepts/messages)\n",
"- [LangGraph persistence](https://langchain-ai.github.io/langgraph/how-tos/persistence/)\n",
"\n",
":::\n",
@@ -85,7 +85,7 @@
"source": [
"## Example: message inputs\n",
"\n",
"Adding memory to a [chat model](/docs/concepts/#chat-models) provides a simple example. Chat models accept a list of messages as input and output a message. LangGraph includes a built-in `MessagesState` that we can use for this purpose.\n",
"Adding memory to a [chat model](/docs/concepts/chat_models) provides a simple example. Chat models accept a list of messages as input and output a message. LangGraph includes a built-in `MessagesState` that we can use for this purpose.\n",
"\n",
"Below, we:\n",
"1. Define the graph state to be a list of messages;\n",

View File

@@ -24,7 +24,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Agents](/docs/concepts/#agents)\n",
"- [Agents](/docs/concepts/agents)\n",
"- [LangGraph](https://langchain-ai.github.io/langgraph/)\n",
"- [Tool calling](/docs/how_to/tool_calling/)\n",
"\n",
@@ -298,7 +298,7 @@
"- A `SystemMessage`, which is added to the beginning of the list of messages.\n",
"- A `string`, which is converted to a `SystemMessage` and added to the beginning of the list of messages.\n",
"- A `Callable`, which should take in full graph state. The output is then passed to the language model.\n",
"- Or a [`Runnable`](/docs/concepts/#langchain-expression-language-lcel), which should take in full graph state. The output is then passed to the language model.\n",
"- Or a [`Runnable`](/docs/concepts/lcel), which should take in full graph state. The output is then passed to the language model.\n",
"\n",
"Here's how it looks in action:"
]

View File

@@ -162,7 +162,7 @@
"source": [
"## Tool calls\n",
"\n",
"Some multimodal models support [tool calling](/docs/concepts/#functiontool-calling) features as well. To call tools using such models, simply bind tools to them in the [usual way](/docs/how_to/tool_calling), and invoke the model using content blocks of the desired type (e.g., containing image data)."
"Some multimodal models support [tool calling](/docs/concepts/tool_calling) features as well. To call tools using such models, simply bind tools to them in the [usual way](/docs/how_to/tool_calling), and invoke the model using content blocks of the desired type (e.g., containing image data)."
]
},
{

View File

@@ -10,9 +10,9 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Output parsers](/docs/concepts/#output-parsers)\n",
"- [Prompt templates](/docs/concepts/#prompt-templates)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [Output parsers](/docs/concepts/output_parsers)\n",
"- [Prompt templates](/docs/concepts/prompt_templates)\n",
"- [Structured output](/docs/how_to/structured_output)\n",
"- [Chaining runnables together](/docs/how_to/sequence/)\n",
"\n",

View File

@@ -214,7 +214,7 @@
"id": "3ca23082-c602-4ee8-af8c-a185b1f42bd1",
"metadata": {},
"source": [
"While the PydanticOutputParser cannot:"
"While the `PydanticOutputParser` cannot:"
]
},
{

View File

@@ -10,15 +10,15 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Output parsers](/docs/concepts/#output-parsers)\n",
"- [Prompt templates](/docs/concepts/#prompt-templates)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [Output parsers](/docs/concepts/output_parsers)\n",
"- [Prompt templates](/docs/concepts/prompt_templates)\n",
"- [Structured output](/docs/how_to/structured_output)\n",
"- [Chaining runnables together](/docs/how_to/sequence/)\n",
"\n",
":::\n",
"\n",
"LLMs from different providers often have different strengths depending on the specific data they are trianed on. This also means that some may be \"better\" and more reliable at generating output in formats other than JSON.\n",
"LLMs from different providers often have different strengths depending on the specific data they are trained on. This also means that some may be \"better\" and more reliable at generating output in formats other than JSON.\n",
"\n",
"This guide shows you how to use the [`XMLOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html) to prompt models for XML output, then and parse that output into a usable format.\n",
"\n",

View File

@@ -10,15 +10,15 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Output parsers](/docs/concepts/#output-parsers)\n",
"- [Prompt templates](/docs/concepts/#prompt-templates)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [Output parsers](/docs/concepts/output_parsers)\n",
"- [Prompt templates](/docs/concepts/prompt_templates)\n",
"- [Structured output](/docs/how_to/structured_output)\n",
"- [Chaining runnables together](/docs/how_to/sequence/)\n",
"\n",
":::\n",
"\n",
"LLMs from different providers often have different strengths depending on the specific data they are trianed on. This also means that some may be \"better\" and more reliable at generating output in formats other than JSON.\n",
"LLMs from different providers often have different strengths depending on the specific data they are trained on. This also means that some may be \"better\" and more reliable at generating output in formats other than JSON.\n",
"\n",
"This output parser allows users to specify an arbitrary schema and query LLMs for outputs that conform to that schema, using YAML to format their response.\n",
"\n",

View File

@@ -21,7 +21,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/lcel)\n",
"- [Chaining runnables](/docs/how_to/sequence)\n",
"\n",
":::\n",

View File

@@ -21,7 +21,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/lcel)\n",
"- [Chaining runnables](/docs/how_to/sequence/)\n",
"- [Calling runnables in parallel](/docs/how_to/parallel/)\n",
"- [Custom functions](/docs/how_to/functions/)\n",
@@ -153,7 +153,7 @@
"\n",
"## Next steps\n",
"\n",
"Now you've learned how to pass data through your chains to help to help format the data flowing through your chains.\n",
"Now you've learned how to pass data through your chains to help format the data flowing through your chains.\n",
"\n",
"To learn more, see the other how-to guides on runnables in this section."
]

View File

@@ -20,7 +20,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Prompt templates](/docs/concepts/#prompt-templates)\n",
"- [Prompt templates](/docs/concepts/prompt_templates)\n",
"\n",
":::\n",
"\n",

View File

@@ -20,7 +20,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Prompt templates](/docs/concepts/#prompt-templates)\n",
"- [Prompt templates](/docs/concepts/prompt_templates)\n",
"\n",
":::\n",
"\n",

View File

@@ -161,7 +161,7 @@
"id": "15f8ad59-19de-42e3-85a8-3ba95ee0bd43",
"metadata": {},
"source": [
"For the retriever, we will use [WebBaseLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.web_base.WebBaseLoader.html) to load the content of a web page. Here we instantiate a `InMemoryVectorStore` vectorstore and then use its [.as_retriever](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.VectorStore.html#langchain_core.vectorstores.VectorStore.as_retriever) method to build a retriever that can be incorporated into [LCEL](/docs/concepts/#langchain-expression-language) chains."
"For the retriever, we will use [WebBaseLoader](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.web_base.WebBaseLoader.html) to load the content of a web page. Here we instantiate a `InMemoryVectorStore` vectorstore and then use its [.as_retriever](https://python.langchain.com/api_reference/core/vectorstores/langchain_core.vectorstores.VectorStore.html#langchain_core.vectorstores.VectorStore.as_retriever) method to build a retriever that can be incorporated into [LCEL](/docs/concepts/lcel) chains."
]
},
{

View File

@@ -326,7 +326,7 @@
"\n",
"Up to this point, we've simply propagated the documents returned from the retrieval step through to the final response. But this may not illustrate what subset of information the model relied on when generating its answer. Below, we show how to structure sources into the model response, allowing the model to report what specific context it relied on for its answer.\n",
"\n",
"Because the above LCEL implementation is composed of [Runnable](/docs/concepts/#runnable-interface) primitives, it is straightforward to extend. Below, we make a simple change:\n",
"Because the above LCEL implementation is composed of [Runnable](/docs/concepts/runnables) primitives, it is straightforward to extend. Below, we make a simple change:\n",
"\n",
"- We use the model's tool-calling features to generate [structured output](/docs/how_to/structured_output/), consisting of an answer and list of sources. The schema for the response is represented in the `AnswerWithSources` TypedDict, below.\n",
"- We remove the `StrOutputParser()`, as we expect `dict` output in this scenario."

View File

@@ -21,11 +21,11 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/lcel)\n",
"- [Chaining runnables](/docs/how_to/sequence/)\n",
"- [Configuring chain parameters at runtime](/docs/how_to/configure)\n",
"- [Prompt templates](/docs/concepts/#prompt-templates)\n",
"- [Chat Messages](/docs/concepts/#message-types)\n",
"- [Prompt templates](/docs/concepts/prompt_templates)\n",
"- [Chat Messages](/docs/concepts/messages)\n",
"\n",
":::\n",
"\n",

View File

@@ -22,14 +22,14 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
"- [Prompt templates](/docs/concepts/#prompt-templates)\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Output parser](/docs/concepts/#output-parsers)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/lcel)\n",
"- [Prompt templates](/docs/concepts/prompt_templates)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [Output parser](/docs/concepts/output_parsers)\n",
"\n",
":::\n",
"\n",
"One point about [LangChain Expression Language](/docs/concepts/#langchain-expression-language) is that any two runnables can be \"chained\" together into sequences. The output of the previous runnable's `.invoke()` call is passed as input to the next runnable. This can be done using the pipe operator (`|`), or the more explicit `.pipe()` method, which does the same thing.\n",
"One point about [LangChain Expression Language](/docs/concepts/lcel) is that any two runnables can be \"chained\" together into sequences. The output of the previous runnable's `.invoke()` call is passed as input to the next runnable. This can be done using the pipe operator (`|`), or the more explicit `.pipe()` method, which does the same thing.\n",
"\n",
"The resulting [`RunnableSequence`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.RunnableSequence.html) is itself a runnable, which means it can be invoked, streamed, or further chained just like any other runnable. Advantages of chaining runnables in this way are efficient streaming (the sequence will stream output as soon as it is available), and debugging and tracing with tools like [LangSmith](/docs/how_to/debugging).\n",
"\n",

View File

@@ -14,7 +14,7 @@
"\n",
"To save and load LangChain objects using this system, use the `dumpd`, `dumps`, `load`, and `loads` functions in the [load module](https://python.langchain.com/api_reference/core/load.html) of `langchain-core`. These functions support JSON and JSON-serializable objects.\n",
"\n",
"All LangChain objects that inherit from [Serializable](https://python.langchain.com/api_reference/core/load/langchain_core.load.serializable.Serializable.html) are JSON-serializable. Examples include [messages](https://python.langchain.com/api_reference//python/core_api_reference.html#module-langchain_core.messages), [document objects](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) (e.g., as returned from [retrievers](/docs/concepts/#retrievers)), and most [Runnables](/docs/concepts/#langchain-expression-language-lcel), such as chat models, retrievers, and [chains](/docs/how_to/sequence) implemented with the LangChain Expression Language.\n",
"All LangChain objects that inherit from [Serializable](https://python.langchain.com/api_reference/core/load/langchain_core.load.serializable.Serializable.html) are JSON-serializable. Examples include [messages](https://python.langchain.com/api_reference//python/core_api_reference.html#module-langchain_core.messages), [document objects](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) (e.g., as returned from [retrievers](/docs/concepts/retrievers)), and most [Runnables](/docs/concepts/lcel), such as chat models, retrievers, and [chains](/docs/how_to/sequence) implemented with the LangChain Expression Language.\n",
"\n",
"Below we walk through an example with a simple [LLM chain](/docs/tutorials/llm_chain).\n",
"\n",

View File

@@ -24,15 +24,15 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [LangChain Expression Language](/docs/concepts/#langchain-expression-language)\n",
"- [Output parsers](/docs/concepts/#output-parsers)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [LangChain Expression Language](/docs/concepts/lcel)\n",
"- [Output parsers](/docs/concepts/output_parsers)\n",
"\n",
":::\n",
"\n",
"Streaming is critical in making applications based on LLMs feel responsive to end-users.\n",
"\n",
"Important LangChain primitives like [chat models](/docs/concepts/#chat-models), [output parsers](/docs/concepts/#output-parsers), [prompts](/docs/concepts/#prompt-templates), [retrievers](/docs/concepts/#retrievers), and [agents](/docs/concepts/#agents) implement the LangChain [Runnable Interface](/docs/concepts#interface).\n",
"Important LangChain primitives like [chat models](/docs/concepts/chat_models), [output parsers](/docs/concepts/output_parsers), [prompts](/docs/concepts/prompt_templates), [retrievers](/docs/concepts/retrievers), and [agents](/docs/concepts/agents) implement the LangChain [Runnable Interface](/docs/concepts#interface).\n",
"\n",
"This interface provides two general approaches to stream content:\n",
"\n",
@@ -42,7 +42,7 @@
"Let's take a look at both approaches, and try to understand how to use them.\n",
"\n",
":::info\n",
"For a higher-level overview of streaming techniques in LangChain, see [this section of the conceptual guide](/docs/concepts/#streaming).\n",
"For a higher-level overview of streaming techniques in LangChain, see [this section of the conceptual guide](/docs/concepts/streaming).\n",
":::\n",
"\n",
"## Using Stream\n",
@@ -1510,7 +1510,7 @@
"\n",
"Now you've learned some ways to stream both final outputs and internal steps with LangChain.\n",
"\n",
"To learn more, check out the other how-to guides in this section, or the [conceptual guide on Langchain Expression Language](/docs/concepts/#langchain-expression-language/)."
"To learn more, check out the other how-to guides in this section, or the [conceptual guide on Langchain Expression Language](/docs/concepts/lcel/)."
]
}
],

View File

@@ -25,8 +25,8 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Function/tool calling](/docs/concepts/#functiontool-calling)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [Function/tool calling](/docs/concepts/tool_calling)\n",
":::\n",
"\n",
"It is often useful to have a model return output that matches a specific schema. One common use-case is extracting data from text to insert into a database or use with some other downstream system. This guide covers a few strategies for getting structured outputs from a model.\n",
@@ -776,7 +776,7 @@
"\n",
"### Custom Parsing\n",
"\n",
"You can also create a custom prompt and parser with [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language), using a plain function to parse the output from the model:"
"You can also create a custom prompt and parser with [LangChain Expression Language (LCEL)](/docs/concepts/lcel), using a plain function to parse the output from the model:"
]
},
{
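The notebook's actual parsing cell is not shown in this diff; a plain-function parser in that spirit might look like the following sketch (the `<json>` tag format is an assumption for illustration):

```python
import json
import re

from langchain_core.messages import AIMessage
from langchain_core.prompts import ChatPromptTemplate


def extract_json(message: AIMessage) -> dict:
    """Plain function acting as the parser: pull JSON from <json>...</json> tags."""
    match = re.search(r"<json>(.*?)</json>", message.content, re.DOTALL)
    if not match:
        raise ValueError(f"No JSON found in: {message.content}")
    return json.loads(match.group(1))


prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "Reply with a JSON object wrapped in <json></json> tags."),
        ("human", "{query}"),
    ]
)

# chain = prompt | llm | extract_json  # `llm` assumed to be any chat model
```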

View File

@@ -33,7 +33,7 @@
"\n",
"- LangGraph allows for individual steps (such as successive summarizations) to be streamed, allowing for greater control of execution;\n",
"- LangGraph's [checkpointing](https://langchain-ai.github.io/langgraph/how-tos/persistence/) supports error recovery, extending with human-in-the-loop workflows, and easier incorporation into conversational applications.\n",
"- Because it is assembled from modular components, it is also simple to extend or modify (e.g., to incorporate [tool calling](/docs/concepts/#functiontool-calling) or other behavior).\n",
"- Because it is assembled from modular components, it is also simple to extend or modify (e.g., to incorporate [tool calling](/docs/concepts/tool_calling) or other behavior).\n",
"\n",
"Below, we demonstrate how to summarize text via iterative refinement."
]

View File

@@ -117,7 +117,7 @@
"source": [
"## Invoke chain\n",
"\n",
"Because the chain is a [Runnable](/docs/concepts/#runnable-interface), it implements the usual methods for invocation:"
"Because the chain is a [Runnable](/docs/concepts/runnables), it implements the usual methods for invocation:"
]
},
{
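Those methods are uniform across runnables; a self-contained sketch with a trivial stand-in chain (any real chain behaves the same way):

```python
from langchain_core.runnables import RunnableLambda

# Trivial stand-in so the sketch runs on its own.
chain = RunnableLambda(lambda x: x["question"].upper())

chain.invoke({"question": "hello"})                  # single input
chain.batch([{"question": "a"}, {"question": "b"}])  # list of inputs in parallel
for chunk in chain.stream({"question": "hi"}):       # incremental output
    print(chunk)

# Async analogues: ainvoke, abatch, astream.
```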

View File

@@ -10,9 +10,9 @@
":::info Prerequisites\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [ToolMessage](/docs/concepts/#toolmessage)\n",
"- [Tools](/docs/concepts/#tools)\n",
"- [Function/tool calling](/docs/concepts/#functiontool-calling)\n",
"- [ToolMessage](/docs/concepts/messages/#toolmessage)\n",
"- [Tools](/docs/concepts/tools)\n",
"- [Function/tool calling](/docs/concepts/tool_calling)\n",
"\n",
":::\n",
"\n",

View File

@@ -23,13 +23,13 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Tool calling](/docs/concepts/#functiontool-calling)\n",
"- [Tools](/docs/concepts/#tools)\n",
"- [Output parsers](/docs/concepts/#output-parsers)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [Tool calling](/docs/concepts/tool_calling)\n",
"- [Tools](/docs/concepts/tools)\n",
"- [Output parsers](/docs/concepts/output_parsers)\n",
":::\n",
"\n",
"[Tool calling](/docs/concepts/#functiontool-calling) allows a chat model to respond to a given prompt by \"calling a tool\".\n",
"[Tool calling](/docs/concepts/tool_calling) allows a chat model to respond to a given prompt by \"calling a tool\".\n",
"\n",
"Remember, while the name \"tool calling\" implies that the model is directly performing some action, this is actually not the case! The model only generates the arguments to a tool, and actually running the tool (or not) is up to the user.\n",
"\n",

View File

@@ -9,8 +9,8 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [LangChain Tools](/docs/concepts/#tools)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [LangChain Tools](/docs/concepts/tools)\n",
"- [How to use a model to call tools](/docs/how_to/tool_calling)\n",
":::\n",
"\n",

View File

@@ -10,9 +10,9 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [LangChain Tools](/docs/concepts/#tools)\n",
"- [LangChain Tools](/docs/concepts/tools)\n",
"- [Custom tools](/docs/how_to/custom_tools)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language-lcel)\n",
"- [LangChain Expression Language (LCEL)](/docs/concepts/lcel)\n",
"- [Configuring runnable behavior](/docs/how_to/configure/)\n",
"\n",
":::\n",

View File

@@ -9,14 +9,14 @@
":::info Prerequisites\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [LangChain Tools](/docs/concepts/#tools)\n",
"- [Function/tool calling](/docs/concepts/#functiontool-calling)\n",
"- [LangChain Tools](/docs/concepts/tools)\n",
"- [Function/tool calling](/docs/concepts/tool_calling)\n",
"- [Using chat models to call tools](/docs/how_to/tool_calling)\n",
"- [Defining custom tools](/docs/how_to/custom_tools/)\n",
"\n",
":::\n",
"\n",
"Some models are capable of [**tool calling**](/docs/concepts/#functiontool-calling) - generating arguments that conform to a specific user-provided schema. This guide will demonstrate how to use those tool cals to actually call a function and properly pass the results back to the model.\n",
"Some models are capable of [**tool calling**](/docs/concepts/tool_calling) - generating arguments that conform to a specific user-provided schema. This guide will demonstrate how to use those tool cals to actually call a function and properly pass the results back to the model.\n",
"\n",
"![Diagram of a tool call invocation](/img/tool_invocation.png)\n",
"\n",

View File

@@ -10,8 +10,8 @@
"import Compatibility from \"@theme/Compatibility\";\n",
"\n",
"<Prerequisites titlesAndLinks={[\n",
" [\"Chat models\", \"/docs/concepts/#chat-models\"],\n",
" [\"LangChain Tools\", \"/docs/concepts/#tools\"],\n",
" [\"Chat models\", \"/docs/concepts/chat_models\"],\n",
" [\"LangChain Tools\", \"/docs/concepts/tools\"],\n",
" [\"How to create tools\", \"/docs/how_to/custom_tools\"],\n",
" [\"How to use a model to call tools\", \"/docs/how_to/tool_calling\"],\n",
"]} />\n",

View File

@@ -9,7 +9,7 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [LangChain Tools](/docs/concepts/#tools)\n",
"- [LangChain Tools](/docs/concepts/tools)\n",
"- [Custom tools](/docs/how_to/custom_tools)\n",
"- [Using stream events](/docs/how_to/streaming/#using-stream-events)\n",
"- [Accessing RunnableConfig within a custom tool](/docs/how_to/tool_configure/)\n",

View File

@@ -23,8 +23,8 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [LangChain Tools](/docs/concepts/#tools)\n",
"- [LangChain Toolkits](/docs/concepts/#tools)\n",
"- [LangChain Tools](/docs/concepts/tools)\n",
"- [LangChain Toolkits](/docs/concepts/tools)\n",
"\n",
":::\n",
"\n",

View File

@@ -10,8 +10,8 @@
":::info Prerequisites\n",
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [LangChain Tools](/docs/concepts/#tools)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [LangChain Tools](/docs/concepts/tools)\n",
"- [How to use a model to call tools](/docs/how_to/tool_calling)\n",
"\n",
":::\n",

View File

@@ -27,10 +27,10 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [LangChain Tools](/docs/concepts/#tools)\n",
"- [Function/tool calling](https://python.langchain.com/docs/concepts/#functiontool-calling)\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [LLMs](/docs/concepts/#llms)\n",
"- [LangChain Tools](/docs/concepts/tools)\n",
"- [Function/tool calling](https://python.langchain.com/docs/concepts/tool_calling)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [LLMs](/docs/concepts/text_llms)\n",
"\n",
":::\n",
"\n",

View File

@@ -11,10 +11,10 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"\n",
"- [Messages](/docs/concepts/#messages)\n",
"- [Chat models](/docs/concepts/#chat-models)\n",
"- [Messages](/docs/concepts/messages)\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [Chaining](/docs/how_to/sequence/)\n",
"- [Chat history](/docs/concepts/#chat-history)\n",
"- [Chat history](/docs/concepts/chat_history)\n",
"\n",
"The methods in this guide also require `langchain-core>=0.2.9`.\n",
"\n",
@@ -28,7 +28,7 @@
"If passing the trimmed chat history back into a chat model directly, the trimmed chat history should satisfy the following properties:\n",
"\n",
"1. The resulting chat history should be **valid**. Usually this means that the following properties should be satisfied:\n",
" - The chat history **starts** with either (1) a `HumanMessage` or (2) a [SystemMessage](/docs/concepts/#systemmessage) followed by a `HumanMessage`.\n",
" - The chat history **starts** with either (1) a `HumanMessage` or (2) a [SystemMessage](/docs/concepts/messages/#systemmessage) followed by a `HumanMessage`.\n",
" - The chat history **ends** with either a `HumanMessage` or a `ToolMessage`.\n",
" - A `ToolMessage` can only appear after an `AIMessage` that involved a tool call. \n",
" This can be achieved by setting `start_on=\"human\"` and `ends_on=(\"human\", \"tool\")`.\n",

View File

@@ -17,7 +17,7 @@
"source": [
"# ChatAnthropic\n",
"\n",
"This notebook provides a quick overview for getting started with Anthropic [chat models](/docs/concepts/#chat-models). For detailed documentation of all ChatAnthropic features and configurations head to the [API reference](https://python.langchain.com/api_reference/anthropic/chat_models/langchain_anthropic.chat_models.ChatAnthropic.html).\n",
"This notebook provides a quick overview for getting started with Anthropic [chat models](/docs/concepts/chat_models). For detailed documentation of all ChatAnthropic features and configurations head to the [API reference](https://python.langchain.com/api_reference/anthropic/chat_models/langchain_anthropic.chat_models.ChatAnthropic.html).\n",
"\n",
"Anthropic has several chat models. You can find information about their latest models and their costs, context windows, and supported input types in the [Anthropic docs](https://docs.anthropic.com/en/docs/models-overview).\n",
"\n",

View File

@@ -17,7 +17,7 @@
"source": [
"# AzureChatOpenAI\n",
"\n",
"This guide will help you get started with AzureOpenAI [chat models](/docs/concepts/#chat-models). For detailed documentation of all AzureChatOpenAI features and configurations head to the [API reference](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.azure.AzureChatOpenAI.html).\n",
"This guide will help you get started with AzureOpenAI [chat models](/docs/concepts/chat_models). For detailed documentation of all AzureChatOpenAI features and configurations head to the [API reference](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.azure.AzureChatOpenAI.html).\n",
"\n",
"Azure OpenAI has several chat models. You can find information about their latest models and their costs, context windows, and supported input types in the [Azure docs](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models).\n",
"\n",

View File

@@ -17,7 +17,7 @@
"source": [
"# ChatBedrock\n",
"\n",
"This doc will help you get started with AWS Bedrock [chat models](/docs/concepts/#chat-models). Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI. Using Amazon Bedrock, you can easily experiment with and evaluate top FMs for your use case, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources. Since Amazon Bedrock is serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with.\n",
"This doc will help you get started with AWS Bedrock [chat models](/docs/concepts/chat_models). Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI. Using Amazon Bedrock, you can easily experiment with and evaluate top FMs for your use case, privately customize them with your data using techniques such as fine-tuning and Retrieval Augmented Generation (RAG), and build agents that execute tasks using your enterprise systems and data sources. Since Amazon Bedrock is serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with.\n",
"\n",
"For more information on which models are accessible via Bedrock, head to the [AWS docs](https://docs.aws.amazon.com/bedrock/latest/userguide/models-features.html).\n",
"\n",

View File

@@ -17,7 +17,7 @@
"source": [
"# ChatCerebras\n",
"\n",
"This notebook provides a quick overview for getting started with Cerebras [chat models](/docs/concepts/#chat-models). For detailed documentation of all ChatCerebras features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_cerebras.chat_models.ChatCerebras.html).\n",
"This notebook provides a quick overview for getting started with Cerebras [chat models](/docs/concepts/chat_models). For detailed documentation of all ChatCerebras features and configurations head to the [API reference](https://api.python.langchain.com/en/latest/chat_models/langchain_cerebras.chat_models.ChatCerebras.html).\n",
"\n",
"At Cerebras, we've developed the world's largest and fastest AI processor, the Wafer-Scale Engine-3 (WSE-3). The Cerebras CS-3 system, powered by the WSE-3, represents a new class of AI supercomputer that sets the standard for generative AI training and inference with unparalleled performance and scalability.\n",
"\n",

View File

@@ -21,7 +21,7 @@
"\n",
"> [Databricks](https://www.databricks.com/) Lakehouse Platform unifies data, analytics, and AI on one platform. \n",
"\n",
"This notebook provides a quick overview for getting started with Databricks [chat models](/docs/concepts/#chat-models). For detailed documentation of all ChatDatabricks features and configurations head to the [API reference](https://python.langchain.com/api_reference/community/chat_models/langchain_community.chat_models.databricks.ChatDatabricks.html).\n",
"This notebook provides a quick overview for getting started with Databricks [chat models](/docs/concepts/chat_models). For detailed documentation of all ChatDatabricks features and configurations head to the [API reference](https://python.langchain.com/api_reference/community/chat_models/langchain_community.chat_models.databricks.ChatDatabricks.html).\n",
"\n",
"## Overview\n",
"\n",

View File

@@ -17,7 +17,7 @@
"source": [
"# ChatFireworks\n",
"\n",
"This doc help you get started with Fireworks AI [chat models](/docs/concepts/#chat-models). For detailed documentation of all ChatFireworks features and configurations head to the [API reference](https://python.langchain.com/api_reference/fireworks/chat_models/langchain_fireworks.chat_models.ChatFireworks.html).\n",
"This doc help you get started with Fireworks AI [chat models](/docs/concepts/chat_models). For detailed documentation of all ChatFireworks features and configurations head to the [API reference](https://python.langchain.com/api_reference/fireworks/chat_models/langchain_fireworks.chat_models.ChatFireworks.html).\n",
"\n",
"Fireworks AI is an AI inference platform to run and customize models. For a list of all models served by Fireworks see the [Fireworks docs](https://fireworks.ai/models).\n",
"\n",

Some files were not shown because too many files have changed in this diff.