From 74438f3ae85aa23bb1be3806ea02dc17463fb877 Mon Sep 17 00:00:00 2001 From: ccurme Date: Fri, 15 Nov 2024 09:44:11 -0500 Subject: [PATCH] docs: add links to concept guides in how-tos (#28118) --- docs/docs/how_to/HTML_header_metadata_splitter.ipynb | 2 +- docs/docs/how_to/HTML_section_aware_splitter.ipynb | 2 +- docs/docs/how_to/MultiQueryRetriever.ipynb | 4 ++-- docs/docs/how_to/add_scores_retriever.ipynb | 4 ++-- docs/docs/how_to/agent_executor.ipynb | 2 +- docs/docs/how_to/caching_embeddings.ipynb | 2 +- docs/docs/how_to/character_text_splitter.ipynb | 2 +- docs/docs/how_to/chat_model_caching.ipynb | 2 +- docs/docs/how_to/chat_models_universal_init.ipynb | 4 ++-- docs/docs/how_to/chat_token_usage_tracking.ipynb | 2 +- docs/docs/how_to/chatbots_retrieval.ipynb | 4 ++-- docs/docs/how_to/chatbots_tools.ipynb | 4 ++-- docs/docs/how_to/code_splitter.ipynb | 2 +- docs/docs/how_to/contextual_compression.ipynb | 4 ++-- docs/docs/how_to/custom_chat_model.ipynb | 6 +++--- docs/docs/how_to/custom_retriever.ipynb | 4 ++-- docs/docs/how_to/custom_tools.ipynb | 2 +- docs/docs/how_to/document_loader_custom.ipynb | 2 +- docs/docs/how_to/document_loader_pdf.ipynb | 4 ++-- docs/docs/how_to/document_loader_web.ipynb | 2 +- docs/docs/how_to/ensemble_retriever.ipynb | 2 +- docs/docs/how_to/example_selectors.ipynb | 2 +- docs/docs/how_to/example_selectors_langsmith.ipynb | 2 +- docs/docs/how_to/example_selectors_length_based.ipynb | 2 +- docs/docs/how_to/example_selectors_mmr.ipynb | 2 +- docs/docs/how_to/example_selectors_ngram.ipynb | 2 +- docs/docs/how_to/example_selectors_similarity.ipynb | 2 +- docs/docs/how_to/extraction_examples.ipynb | 2 +- docs/docs/how_to/extraction_parse.ipynb | 2 +- docs/docs/how_to/few_shot_examples.ipynb | 2 +- docs/docs/how_to/few_shot_examples_chat.ipynb | 2 +- docs/docs/how_to/filter_messages.ipynb | 2 +- docs/docs/how_to/graph_constructing.ipynb | 4 ++-- docs/docs/how_to/hybrid.ipynb | 2 +- docs/docs/how_to/indexing.ipynb | 2 +- docs/docs/how_to/lcel_cheatsheet.ipynb | 2 +- docs/docs/how_to/llm_caching.ipynb | 2 +- docs/docs/how_to/llm_token_usage_tracking.ipynb | 2 +- docs/docs/how_to/logprobs.ipynb | 3 ++- docs/docs/how_to/merge_message_runs.ipynb | 2 +- docs/docs/how_to/multi_vector.ipynb | 2 +- docs/docs/how_to/multimodal_inputs.ipynb | 4 ++-- docs/docs/how_to/multimodal_prompts.ipynb | 4 ++-- docs/docs/how_to/output_parser_custom.ipynb | 4 ++-- docs/docs/how_to/output_parser_fixing.ipynb | 2 +- docs/docs/how_to/output_parser_structured.ipynb | 2 +- docs/docs/how_to/output_parser_xml.ipynb | 2 +- docs/docs/how_to/parent_document_retriever.ipynb | 4 ++-- docs/docs/how_to/prompts_composition.ipynb | 2 +- docs/docs/how_to/prompts_partial.ipynb | 2 +- docs/docs/how_to/qa_chat_history_how_to.ipynb | 2 +- docs/docs/how_to/qa_citations.ipynb | 4 ++-- docs/docs/how_to/qa_per_user.ipynb | 4 ++-- docs/docs/how_to/qa_sources.ipynb | 2 +- docs/docs/how_to/qa_streaming.ipynb | 2 +- docs/docs/how_to/query_few_shot.ipynb | 2 +- docs/docs/how_to/query_multiple_retrievers.ipynb | 2 +- docs/docs/how_to/recursive_json_splitter.ipynb | 2 +- docs/docs/how_to/recursive_text_splitter.ipynb | 2 +- docs/docs/how_to/response_metadata.ipynb | 2 +- docs/docs/how_to/runnable_runtime_secrets.ipynb | 2 +- docs/docs/how_to/self_query.ipynb | 2 +- docs/docs/how_to/split_by_token.ipynb | 2 +- docs/docs/how_to/sql_prompting.ipynb | 2 +- docs/docs/how_to/structured_output.ipynb | 6 +++--- docs/docs/how_to/summarize_stuff.ipynb | 2 +- docs/docs/how_to/time_weighted_vectorstore.ipynb | 2 +- 
docs/docs/how_to/tool_artifacts.ipynb | 2 +- docs/docs/how_to/tool_choice.ipynb | 2 +- docs/docs/how_to/tool_configure.ipynb | 4 ++-- docs/docs/how_to/tool_runtime.ipynb | 2 +- docs/docs/how_to/tool_stream_events.ipynb | 2 +- docs/docs/how_to/tool_streaming.ipynb | 2 +- docs/docs/how_to/tools_as_openai_functions.ipynb | 2 +- docs/docs/how_to/tools_chain.ipynb | 4 ++-- docs/docs/how_to/tools_error.ipynb | 2 +- docs/docs/how_to/tools_few_shot.ipynb | 2 +- docs/docs/how_to/trim_messages.ipynb | 2 +- docs/docs/how_to/vectorstore_retriever.ipynb | 2 +- 79 files changed, 101 insertions(+), 100 deletions(-) diff --git a/docs/docs/how_to/HTML_header_metadata_splitter.ipynb b/docs/docs/how_to/HTML_header_metadata_splitter.ipynb index 2b336ed8844..266f5036679 100644 --- a/docs/docs/how_to/HTML_header_metadata_splitter.ipynb +++ b/docs/docs/how_to/HTML_header_metadata_splitter.ipynb @@ -13,7 +13,7 @@ "# How to split by HTML header \n", "## Description and motivation\n", "\n", - "[HTMLHeaderTextSplitter](https://python.langchain.com/api_reference/text_splitters/html/langchain_text_splitters.html.HTMLHeaderTextSplitter.html) is a \"structure-aware\" chunker that splits text at the HTML element level and adds metadata for each header \"relevant\" to any given chunk. It can return chunks element by element or combine elements with the same metadata, with the objectives of (a) keeping related text grouped (more or less) semantically and (b) preserving context-rich information encoded in document structures. It can be used with other text splitters as part of a chunking pipeline.\n", + "[HTMLHeaderTextSplitter](https://python.langchain.com/api_reference/text_splitters/html/langchain_text_splitters.html.HTMLHeaderTextSplitter.html) is a \"structure-aware\" [text splitter](/docs/concepts/text_splitters/) that splits text at the HTML element level and adds metadata for each header \"relevant\" to any given chunk. It can return chunks element by element or combine elements with the same metadata, with the objectives of (a) keeping related text grouped (more or less) semantically and (b) preserving context-rich information encoded in document structures. 
It can be used with other text splitters as part of a chunking pipeline.\n", "\n", "It is analogous to the [MarkdownHeaderTextSplitter](/docs/how_to/markdown_header_metadata_splitter) for markdown files.\n", "\n", diff --git a/docs/docs/how_to/HTML_section_aware_splitter.ipynb b/docs/docs/how_to/HTML_section_aware_splitter.ipynb index be4da3f2156..368a7a2bcc4 100644 --- a/docs/docs/how_to/HTML_section_aware_splitter.ipynb +++ b/docs/docs/how_to/HTML_section_aware_splitter.ipynb @@ -12,7 +12,7 @@ "source": [ "# How to split by HTML sections\n", "## Description and motivation\n", - "Similar in concept to the [HTMLHeaderTextSplitter](/docs/how_to/HTML_header_metadata_splitter), the `HTMLSectionSplitter` is a \"structure-aware\" chunker that splits text at the element level and adds metadata for each header \"relevant\" to any given chunk.\n", + "Similar in concept to the [HTMLHeaderTextSplitter](/docs/how_to/HTML_header_metadata_splitter), the `HTMLSectionSplitter` is a \"structure-aware\" [text splitter](/docs/concepts/text_splitters/) that splits text at the element level and adds metadata for each header \"relevant\" to any given chunk.\n", "\n", "It can return chunks element by element or combine elements with the same metadata, with the objectives of (a) keeping related text grouped (more or less) semantically and (b) preserving context-rich information encoded in document structures.\n", "\n", diff --git a/docs/docs/how_to/MultiQueryRetriever.ipynb b/docs/docs/how_to/MultiQueryRetriever.ipynb index d27cca7eaa2..06d2c0fdd57 100644 --- a/docs/docs/how_to/MultiQueryRetriever.ipynb +++ b/docs/docs/how_to/MultiQueryRetriever.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to use the MultiQueryRetriever\n", "\n", - "Distance-based vector database retrieval embeds (represents) queries in high-dimensional space and finds similar embedded documents based on a distance metric. But, retrieval may produce different results with subtle changes in query wording, or if the embeddings do not capture the semantics of the data well. Prompt engineering / tuning is sometimes done to manually address these problems, but can be tedious.\n", + "Distance-based [vector database](/docs/concepts/vectorstores/) retrieval [embeds](/docs/concepts/embedding_models/) (represents) queries in high-dimensional space and finds similar embedded documents based on a distance metric. But, retrieval may produce different results with subtle changes in query wording, or if the embeddings do not capture the semantics of the data well. Prompt engineering / tuning is sometimes done to manually address these problems, but can be tedious.\n", "\n", "The [MultiQueryRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.multi_query.MultiQueryRetriever.html) automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query. For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the `MultiQueryRetriever` can mitigate some of the limitations of the distance-based retrieval and get a richer set of results.\n", "\n", @@ -151,7 +151,7 @@ "id": "7e170263-facd-4065-bb68-d11fb9123a45", "metadata": {}, "source": [ - "Note that the underlying queries generated by the retriever are logged at the `INFO` level." 
+ "Note that the underlying queries generated by the [retriever](/docs/concepts/retrievers/) are logged at the `INFO` level." ] }, { diff --git a/docs/docs/how_to/add_scores_retriever.ipynb b/docs/docs/how_to/add_scores_retriever.ipynb index 65d56cbcf83..3bdc8b80552 100644 --- a/docs/docs/how_to/add_scores_retriever.ipynb +++ b/docs/docs/how_to/add_scores_retriever.ipynb @@ -7,11 +7,11 @@ "source": [ "# How to add scores to retriever results\n", "\n", - "Retrievers will return sequences of [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) objects, which by default include no information about the process that retrieved them (e.g., a similarity score against a query). Here we demonstrate how to add retrieval scores to the `.metadata` of documents:\n", + "[Retrievers](/docs/concepts/retrievers/) will return sequences of [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) objects, which by default include no information about the process that retrieved them (e.g., a similarity score against a query). Here we demonstrate how to add retrieval scores to the `.metadata` of documents:\n", "1. From [vectorstore retrievers](/docs/how_to/vectorstore_retriever);\n", "2. From higher-order LangChain retrievers, such as [SelfQueryRetriever](/docs/how_to/self_query) or [MultiVectorRetriever](/docs/how_to/multi_vector).\n", "\n", - "For (1), we will implement a short wrapper function around the corresponding vector store. For (2), we will update a method of the corresponding class.\n", + "For (1), we will implement a short wrapper function around the corresponding [vector store](/docs/concepts/vectorstores/). For (2), we will update a method of the corresponding class.\n", "\n", "## Create vector store\n", "\n", diff --git a/docs/docs/how_to/agent_executor.ipynb b/docs/docs/how_to/agent_executor.ipynb index 1c357632630..c52c126a066 100644 --- a/docs/docs/how_to/agent_executor.ipynb +++ b/docs/docs/how_to/agent_executor.ipynb @@ -22,7 +22,7 @@ ":::\n", "\n", "By themselves, language models can't take actions - they just output text.\n", - "A big use case for LangChain is creating **agents**.\n", + "A big use case for LangChain is creating **[agents](/docs/concepts/agents/)**.\n", "Agents are systems that use an LLM as a reasoning engine to determine which actions to take and what the inputs to those actions should be.\n", "The results of those actions can then be fed back into the agent and it determines whether more actions are needed, or whether it is okay to finish.\n", "\n", diff --git a/docs/docs/how_to/caching_embeddings.ipynb b/docs/docs/how_to/caching_embeddings.ipynb index 71868b812f2..01187ef9ba9 100644 --- a/docs/docs/how_to/caching_embeddings.ipynb +++ b/docs/docs/how_to/caching_embeddings.ipynb @@ -7,7 +7,7 @@ "source": [ "# Caching\n", "\n", - "Embeddings can be stored or temporarily cached to avoid needing to recompute them.\n", + "[Embeddings](/docs/concepts/embedding_models/) can be stored or temporarily cached to avoid needing to recompute them.\n", "\n", "Caching embeddings can be done using a `CacheBackedEmbeddings`. The cache backed embedder is a wrapper around an embedder that caches\n", "embeddings in a key-value store. 
The text is hashed and the hash is used as the key in the cache.\n", diff --git a/docs/docs/how_to/character_text_splitter.ipynb b/docs/docs/how_to/character_text_splitter.ipynb index ab82464c48f..4de1ca3bfca 100644 --- a/docs/docs/how_to/character_text_splitter.ipynb +++ b/docs/docs/how_to/character_text_splitter.ipynb @@ -21,7 +21,7 @@ "source": [ "# How to split by character\n", "\n", - "This is the simplest method. This splits based on a given character sequence, which defaults to `\"\\n\\n\"`. Chunk length is measured by number of characters.\n", + "This is the simplest method. This [splits](/docs/concepts/text_splitters/) based on a given character sequence, which defaults to `\"\\n\\n\"`. Chunk length is measured by number of characters.\n", "\n", "1. How the text is split: by single character separator.\n", "2. How the chunk size is measured: by number of characters.\n", diff --git a/docs/docs/how_to/chat_model_caching.ipynb b/docs/docs/how_to/chat_model_caching.ipynb index d9a7f384458..02a0b9c314c 100644 --- a/docs/docs/how_to/chat_model_caching.ipynb +++ b/docs/docs/how_to/chat_model_caching.ipynb @@ -15,7 +15,7 @@ "\n", ":::\n", "\n", - "LangChain provides an optional caching layer for chat models. This is useful for two main reasons:\n", + "LangChain provides an optional caching layer for [chat models](/docs/concepts/chat_models). This is useful for two main reasons:\n", "\n", "- It can save you money by reducing the number of API calls you make to the LLM provider, if you're often requesting the same completion multiple times. This is especially useful during app development.\n", "- It can speed up your application by reducing the number of API calls you make to the LLM provider.\n", diff --git a/docs/docs/how_to/chat_models_universal_init.ipynb b/docs/docs/how_to/chat_models_universal_init.ipynb index 0b14538b85a..19b8822fe57 100644 --- a/docs/docs/how_to/chat_models_universal_init.ipynb +++ b/docs/docs/how_to/chat_models_universal_init.ipynb @@ -7,13 +7,13 @@ "source": [ "# How to init any model in one line\n", "\n", - "Many LLM applications let end users specify what model provider and model they want the application to be powered by. This requires writing some logic to initialize different ChatModels based on some user configuration. The `init_chat_model()` helper method makes it easy to initialize a number of different model integrations without having to worry about import paths and class names.\n", + "Many LLM applications let end users specify what model provider and model they want the application to be powered by. This requires writing some logic to initialize different [chat models](/docs/concepts/chat_models/) based on some user configuration. The `init_chat_model()` helper method makes it easy to initialize a number of different model integrations without having to worry about import paths and class names.\n", "\n", ":::tip Supported models\n", "\n", "See the [init_chat_model()](https://python.langchain.com/api_reference/langchain/chat_models/langchain.chat_models.base.init_chat_model.html) API reference for a full list of supported integrations.\n", "\n", - "Make sure you have the integration packages installed for any model providers you want to support. E.g. you should have `langchain-openai` installed to init an OpenAI model.\n", + "Make sure you have the [integration packages](/docs/integrations/chat/) installed for any model providers you want to support. E.g. 
you should have `langchain-openai` installed to init an OpenAI model.\n", "\n", ":::" ] diff --git a/docs/docs/how_to/chat_token_usage_tracking.ipynb b/docs/docs/how_to/chat_token_usage_tracking.ipynb index cdcf943e843..a01ee9f01cb 100644 --- a/docs/docs/how_to/chat_token_usage_tracking.ipynb +++ b/docs/docs/how_to/chat_token_usage_tracking.ipynb @@ -14,7 +14,7 @@ "\n", ":::\n", "\n", - "Tracking token usage to calculate cost is an important part of putting your app in production. This guide goes over how to obtain this information from your LangChain model calls.\n", + "Tracking [token](/docs/concepts/tokens/) usage to calculate cost is an important part of putting your app in production. This guide goes over how to obtain this information from your LangChain model calls.\n", "\n", "This guide requires `langchain-anthropic` and `langchain-openai >= 0.1.9`." ] diff --git a/docs/docs/how_to/chatbots_retrieval.ipynb b/docs/docs/how_to/chatbots_retrieval.ipynb index 015e0d34456..d3874d17767 100644 --- a/docs/docs/how_to/chatbots_retrieval.ipynb +++ b/docs/docs/how_to/chatbots_retrieval.ipynb @@ -15,7 +15,7 @@ "source": [ "# How to add retrieval to chatbots\n", "\n", - "Retrieval is a common technique chatbots use to augment their responses with data outside a chat model's training data. This section will cover how to implement retrieval in the context of chatbots, but it's worth noting that retrieval is a very subtle and deep topic - we encourage you to explore [other parts of the documentation](/docs/how_to#qa-with-rag) that go into greater depth!\n", + "[Retrieval](/docs/concepts/retrieval/) is a common technique chatbots use to augment their responses with data outside a chat model's training data. This section will cover how to implement retrieval in the context of chatbots, but it's worth noting that retrieval is a very subtle and deep topic - we encourage you to explore [other parts of the documentation](/docs/how_to#qa-with-rag) that go into greater depth!\n", "\n", "## Setup\n", "\n", @@ -80,7 +80,7 @@ "source": [ "## Creating a retriever\n", "\n", - "We'll use [the LangSmith documentation](https://docs.smith.langchain.com/overview) as source material and store the content in a vectorstore for later retrieval. Note that this example will gloss over some of the specifics around parsing and storing a data source - you can see more [in-depth documentation on creating retrieval systems here](/docs/how_to#qa-with-rag).\n", + "We'll use [the LangSmith documentation](https://docs.smith.langchain.com/overview) as source material and store the content in a [vector store](/docs/concepts/vectorstores/) for later retrieval. 
Note that this example will gloss over some of the specifics around parsing and storing a data source - you can see more [in-depth documentation on creating retrieval systems here](/docs/how_to#qa-with-rag).\n", "\n", "Let's use a document loader to pull text from the docs:" ] diff --git a/docs/docs/how_to/chatbots_tools.ipynb b/docs/docs/how_to/chatbots_tools.ipynb index 0c9bbf5259e..f5f639d58d7 100644 --- a/docs/docs/how_to/chatbots_tools.ipynb +++ b/docs/docs/how_to/chatbots_tools.ipynb @@ -42,7 +42,7 @@ "metadata": {}, "outputs": [ { - "name": "stdin", + "name": "stdout", "output_type": "stream", "text": [ "OpenAI API Key: ········\n", @@ -78,7 +78,7 @@ "\n", "Our end goal is to create an agent that can respond conversationally to user questions while looking up information as needed.\n", "\n", - "First, let's initialize Tavily and an OpenAI chat model capable of tool calling:" + "First, let's initialize Tavily and an OpenAI [chat model](/docs/concepts/chat_models/) capable of tool calling:" ] }, { diff --git a/docs/docs/how_to/code_splitter.ipynb b/docs/docs/how_to/code_splitter.ipynb index 74755ebeeb0..6e7f4710288 100644 --- a/docs/docs/how_to/code_splitter.ipynb +++ b/docs/docs/how_to/code_splitter.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to split code\n", "\n", - "[RecursiveCharacterTextSplitter](https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html) includes pre-built lists of separators that are useful for splitting text in a specific programming language.\n", + "[RecursiveCharacterTextSplitter](https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html) includes pre-built lists of separators that are useful for [splitting text](/docs/concepts/text_splitters/) in a specific programming language.\n", "\n", "Supported languages are stored in the `langchain_text_splitters.Language` enum. They include:\n", "\n", diff --git a/docs/docs/how_to/contextual_compression.ipynb b/docs/docs/how_to/contextual_compression.ipynb index 5def4035eee..1009e9eb8b7 100644 --- a/docs/docs/how_to/contextual_compression.ipynb +++ b/docs/docs/how_to/contextual_compression.ipynb @@ -7,13 +7,13 @@ "source": [ "# How to do retrieval with contextual compression\n", "\n", - "One challenge with retrieval is that usually you don't know the specific queries your document storage system will face when you ingest data into the system. This means that the information most relevant to a query may be buried in a document with a lot of irrelevant text. Passing that full document through your application can lead to more expensive LLM calls and poorer responses.\n", + "One challenge with [retrieval](/docs/concepts/retrieval/) is that usually you don't know the specific queries your document storage system will face when you ingest data into the system. This means that the information most relevant to a query may be buried in a document with a lot of irrelevant text. Passing that full document through your application can lead to more expensive LLM calls and poorer responses.\n", "\n", "Contextual compression is meant to fix this. The idea is simple: instead of immediately returning retrieved documents as-is, you can compress them using the context of the given query, so that only the relevant information is returned. 
“Compressing” here refers to both compressing the contents of an individual document and filtering out documents wholesale.\n", "\n", "To use the Contextual Compression Retriever, you'll need:\n", "\n", - "- a base retriever\n", + "- a base [retriever](/docs/concepts/retrievers/)\n", "- a Document Compressor\n", "\n", "The Contextual Compression Retriever passes queries to the base retriever, takes the initial documents and passes them through the Document Compressor. The Document Compressor takes a list of documents and shortens it by reducing the contents of documents or dropping documents altogether.\n", diff --git a/docs/docs/how_to/custom_chat_model.ipynb b/docs/docs/how_to/custom_chat_model.ipynb index 4fc502ca171..6f8e68e01a3 100644 --- a/docs/docs/how_to/custom_chat_model.ipynb +++ b/docs/docs/how_to/custom_chat_model.ipynb @@ -14,15 +14,15 @@ "\n", ":::\n", "\n", - "In this guide, we'll learn how to create a custom chat model using LangChain abstractions.\n", + "In this guide, we'll learn how to create a custom [chat model](/docs/concepts/chat_models/) using LangChain abstractions.\n", "\n", "Wrapping your LLM with the standard [`BaseChatModel`](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.chat_models.BaseChatModel.html) interface allows you to use your LLM in existing LangChain programs with minimal code modifications!\n", "\n", - "As a bonus, your LLM will automatically become a LangChain `Runnable` and will benefit from some optimizations out of the box (e.g., batch via a threadpool), async support, the `astream_events` API, etc.\n", + "As a bonus, your LLM will automatically become a LangChain [Runnable](/docs/concepts/runnables/) and will benefit from some optimizations out of the box (e.g., batch via a threadpool), async support, the `astream_events` API, etc.\n", "\n", "## Inputs and outputs\n", "\n", - "First, we need to talk about **messages**, which are the inputs and outputs of chat models.\n", + "First, we need to talk about **[messages](/docs/concepts/messages/)**, which are the inputs and outputs of chat models.\n", "\n", "### Messages\n", "\n", diff --git a/docs/docs/how_to/custom_retriever.ipynb b/docs/docs/how_to/custom_retriever.ipynb index 31600dcf73f..31b6fb90a1c 100644 --- a/docs/docs/how_to/custom_retriever.ipynb +++ b/docs/docs/how_to/custom_retriever.ipynb @@ -19,9 +19,9 @@ "\n", "## Overview\n", "\n", - "Many LLM applications involve retrieving information from external data sources using a `Retriever`. \n", + "Many LLM applications involve retrieving information from external data sources using a [Retriever](/docs/concepts/retrievers/). 
\n", "\n", - "A retriever is responsible for retrieving a list of relevant `Documents` to a given user `query`.\n", + "A retriever is responsible for retrieving a list of relevant [Documents](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) to a given user `query`.\n", "\n", "The retrieved documents are often formatted into prompts that are fed into an LLM, allowing the LLM to use the information in the to generate an appropriate response (e.g., answering a user question based on a knowledge base).\n", "\n", diff --git a/docs/docs/how_to/custom_tools.ipynb b/docs/docs/how_to/custom_tools.ipynb index d73604f445c..8046b7b00e4 100644 --- a/docs/docs/how_to/custom_tools.ipynb +++ b/docs/docs/how_to/custom_tools.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to create tools\n", "\n", - "When constructing an agent, you will need to provide it with a list of `Tool`s that it can use. Besides the actual function that is called, the Tool consists of several components:\n", + "When constructing an [agent](/docs/concepts/agents/), you will need to provide it with a list of [Tools](/docs/concepts/tools/) that it can use. Besides the actual function that is called, the Tool consists of several components:\n", "\n", "| Attribute | Type | Description |\n", "|---------------|---------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n", diff --git a/docs/docs/how_to/document_loader_custom.ipynb b/docs/docs/how_to/document_loader_custom.ipynb index a14b5270405..8ebeae8fb63 100644 --- a/docs/docs/how_to/document_loader_custom.ipynb +++ b/docs/docs/how_to/document_loader_custom.ipynb @@ -26,7 +26,7 @@ "`Document` objects are often formatted into prompts that are fed into an LLM, allowing the LLM to use the information in the `Document` to generate a desired response (e.g., summarizing the document).\n", "`Documents` can be either used immediately or indexed into a vectorstore for future retrieval and use.\n", "\n", - "The main abstractions for Document Loading are:\n", + "The main abstractions for [Document Loading](/docs/concepts/document_loaders/) are:\n", "\n", "\n", "| Component | Description |\n", diff --git a/docs/docs/how_to/document_loader_pdf.ipynb b/docs/docs/how_to/document_loader_pdf.ipynb index f13edbc99db..766560c1db6 100644 --- a/docs/docs/how_to/document_loader_pdf.ipynb +++ b/docs/docs/how_to/document_loader_pdf.ipynb @@ -9,7 +9,7 @@ "\n", "[Portable Document Format (PDF)](https://en.wikipedia.org/wiki/PDF), standardized as ISO 32000, is a file format developed by Adobe in 1992 to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems.\n", "\n", - "This guide covers how to load `PDF` documents into the LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) format that we use downstream.\n", + "This guide covers how to [load](/docs/concepts/document_loaders/) `PDF` documents into the LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) format that we use downstream.\n", "\n", "Text in PDFs is typically represented via text boxes. They may also contain images. 
A PDF parser might do some combination of the following:\n", "\n", @@ -250,7 +250,7 @@ "metadata": {}, "outputs": [ { - "name": "stdin", + "name": "stdout", "output_type": "stream", "text": [ "Unstructured API Key: ········\n" diff --git a/docs/docs/how_to/document_loader_web.ipynb b/docs/docs/how_to/document_loader_web.ipynb index 04c4a3b7c68..9dc424babb2 100644 --- a/docs/docs/how_to/document_loader_web.ipynb +++ b/docs/docs/how_to/document_loader_web.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to load web pages\n", "\n", - "This guide covers how to load web pages into the LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) format that we use downstream. Web pages contain text, images, and other multimedia elements, and are typically represented with HTML. They may include links to other pages or resources.\n", + "This guide covers how to [load](/docs/concepts/document_loaders/) web pages into the LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) format that we use downstream. Web pages contain text, images, and other multimedia elements, and are typically represented with HTML. They may include links to other pages or resources.\n", "\n", "LangChain integrates with a host of parsers that are appropriate for web pages. The right parser will depend on your needs. Below we demonstrate two possibilities:\n", "\n", diff --git a/docs/docs/how_to/ensemble_retriever.ipynb b/docs/docs/how_to/ensemble_retriever.ipynb index 99098554f5e..ce518edfb0b 100644 --- a/docs/docs/how_to/ensemble_retriever.ipynb +++ b/docs/docs/how_to/ensemble_retriever.ipynb @@ -6,7 +6,7 @@ "source": [ "# How to combine results from multiple retrievers\n", "\n", - "The [EnsembleRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.ensemble.EnsembleRetriever.html) supports ensembling of results from multiple retrievers. It is initialized with a list of [BaseRetriever](https://python.langchain.com/api_reference/core/retrievers/langchain_core.retrievers.BaseRetriever.html) objects. EnsembleRetrievers rerank the results of the constituent retrievers based on the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm.\n", + "The [EnsembleRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.ensemble.EnsembleRetriever.html) supports ensembling of results from multiple [retrievers](/docs/concepts/retrievers/). It is initialized with a list of [BaseRetriever](https://python.langchain.com/api_reference/core/retrievers/langchain_core.retrievers.BaseRetriever.html) objects. EnsembleRetrievers rerank the results of the constituent retrievers based on the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm.\n", "\n", "By leveraging the strengths of different algorithms, the `EnsembleRetriever` can achieve better performance than any single algorithm. \n", "\n", diff --git a/docs/docs/how_to/example_selectors.ipynb b/docs/docs/how_to/example_selectors.ipynb index 1c90627d9c8..39f0bcadf14 100644 --- a/docs/docs/how_to/example_selectors.ipynb +++ b/docs/docs/how_to/example_selectors.ipynb @@ -17,7 +17,7 @@ "source": [ "# How to use example selectors\n", "\n", - "If you have a large number of examples, you may need to select which ones to include in the prompt. 
The Example Selector is the class responsible for doing so.\n", + "If you have a large number of examples, you may need to select which ones to include in the prompt. The [Example Selector](/docs/concepts/example_selectors/) is the class responsible for doing so.\n", "\n", "The base interface is defined as below:\n", "\n", diff --git a/docs/docs/how_to/example_selectors_langsmith.ipynb b/docs/docs/how_to/example_selectors_langsmith.ipynb index efc9e2db46d..c6a032948bc 100644 --- a/docs/docs/how_to/example_selectors_langsmith.ipynb +++ b/docs/docs/how_to/example_selectors_langsmith.ipynb @@ -23,7 +23,7 @@ "]} />\n", "\n", "\n", - "LangSmith datasets have built-in support for similarity search, making them a great tool for building and querying few-shot examples.\n", + "[LangSmith](https://docs.smith.langchain.com/) datasets have built-in support for similarity search, making them a great tool for building and querying few-shot examples.\n", "\n", "In this guide we'll see how to use an indexed LangSmith dataset as a few-shot example selector.\n", "\n", diff --git a/docs/docs/how_to/example_selectors_length_based.ipynb b/docs/docs/how_to/example_selectors_length_based.ipynb index 1074b3148e4..dcd897895b7 100644 --- a/docs/docs/how_to/example_selectors_length_based.ipynb +++ b/docs/docs/how_to/example_selectors_length_based.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to select examples by length\n", "\n", - "This example selector selects which examples to use based on length. This is useful when you are worried about constructing a prompt that will go over the length of the context window. For longer inputs, it will select fewer examples to include, while for shorter inputs it will select more." + "This [example selector](/docs/concepts/example_selectors/) selects which examples to use based on length. This is useful when you are worried about constructing a prompt that will go over the length of the context window. For longer inputs, it will select fewer examples to include, while for shorter inputs it will select more." ] }, { diff --git a/docs/docs/how_to/example_selectors_mmr.ipynb b/docs/docs/how_to/example_selectors_mmr.ipynb index b965a7dec77..9b0f96f181a 100644 --- a/docs/docs/how_to/example_selectors_mmr.ipynb +++ b/docs/docs/how_to/example_selectors_mmr.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to select examples by maximal marginal relevance (MMR)\n", "\n", - "The `MaxMarginalRelevanceExampleSelector` selects examples based on a combination of which examples are most similar to the inputs, while also optimizing for diversity. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs, and then iteratively adding them while penalizing them for closeness to already selected examples.\n" + "The `MaxMarginalRelevanceExampleSelector` selects [examples](/docs/concepts/example_selectors/) based on a combination of which examples are most similar to the inputs, while also optimizing for diversity. 
It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs, and then iteratively adding them while penalizing them for closeness to already selected examples.\n" ] }, { diff --git a/docs/docs/how_to/example_selectors_ngram.ipynb b/docs/docs/how_to/example_selectors_ngram.ipynb index fb464ef8e30..80578100caf 100644 --- a/docs/docs/how_to/example_selectors_ngram.ipynb +++ b/docs/docs/how_to/example_selectors_ngram.ipynb @@ -9,7 +9,7 @@ "\n", "The `NGramOverlapExampleSelector` selects and orders examples based on which examples are most similar to the input, according to an ngram overlap score. The ngram overlap score is a float between 0.0 and 1.0, inclusive. \n", "\n", - "The selector allows for a threshold score to be set. Examples with an ngram overlap score less than or equal to the threshold are excluded. The threshold is set to -1.0, by default, so will not exclude any examples, only reorder them. Setting the threshold to 0.0 will exclude examples that have no ngram overlaps with the input.\n" + "The [selector](/docs/concepts/example_selectors/) allows for a threshold score to be set. Examples with an ngram overlap score less than or equal to the threshold are excluded. The threshold is set to -1.0, by default, so will not exclude any examples, only reorder them. Setting the threshold to 0.0 will exclude examples that have no ngram overlaps with the input.\n" ] }, { diff --git a/docs/docs/how_to/example_selectors_similarity.ipynb b/docs/docs/how_to/example_selectors_similarity.ipynb index d6e692cfac2..6657cd54330 100644 --- a/docs/docs/how_to/example_selectors_similarity.ipynb +++ b/docs/docs/how_to/example_selectors_similarity.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to select examples by similarity\n", "\n", - "This object selects examples based on similarity to the inputs. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs.\n" + "This object selects [examples](/docs/concepts/example_selectors/) based on similarity to the inputs. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs.\n" ] }, { diff --git a/docs/docs/how_to/extraction_examples.ipynb b/docs/docs/how_to/extraction_examples.ipynb index 3de5417d5fb..ff565fb35a2 100644 --- a/docs/docs/how_to/extraction_examples.ipynb +++ b/docs/docs/how_to/extraction_examples.ipynb @@ -9,7 +9,7 @@ "\n", "The quality of extractions can often be improved by providing reference examples to the LLM.\n", "\n", - "Data extraction attempts to generate structured representations of information found in text and other unstructured or semi-structured formats. [Tool-calling](/docs/concepts/tool_calling) LLM features are often used in this context. This guide demonstrates how to build few-shot examples of tool calls to help steer the behavior of extraction and similar applications.\n", + "Data extraction attempts to generate [structured representations](/docs/concepts/structured_outputs/) of information found in text and other unstructured or semi-structured formats. [Tool-calling](/docs/concepts/tool_calling) LLM features are often used in this context. 
This guide demonstrates how to build few-shot examples of tool calls to help steer the behavior of extraction and similar applications.\n", "\n", ":::tip\n", "While this guide focuses how to use examples with a tool calling model, this technique is generally applicable, and will work\n", diff --git a/docs/docs/how_to/extraction_parse.ipynb b/docs/docs/how_to/extraction_parse.ipynb index 8d6c3a1a041..5ec9c86348d 100644 --- a/docs/docs/how_to/extraction_parse.ipynb +++ b/docs/docs/how_to/extraction_parse.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to use prompting alone (no tool calling) to do extraction\n", "\n", - "Tool calling features are not required for generating structured output from LLMs. LLMs that are able to follow prompt instructions well can be tasked with outputting information in a given format.\n", + "[Tool calling](/docs/concepts/tool_calling/) features are not required for generating structured output from LLMs. LLMs that are able to follow prompt instructions well can be tasked with outputting information in a given format.\n", "\n", "This approach relies on designing good prompts and then parsing the output of the LLMs to make them extract information well.\n", "\n", diff --git a/docs/docs/how_to/few_shot_examples.ipynb b/docs/docs/how_to/few_shot_examples.ipynb index 6c8d0926f03..6fab31f8ff1 100644 --- a/docs/docs/how_to/few_shot_examples.ipynb +++ b/docs/docs/how_to/few_shot_examples.ipynb @@ -27,7 +27,7 @@ "\n", ":::\n", "\n", - "In this guide, we'll learn how to create a simple prompt template that provides the model with example inputs and outputs when generating. Providing the LLM with a few such examples is called few-shotting, and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n", + "In this guide, we'll learn how to create a simple prompt template that provides the model with example inputs and outputs when generating. Providing the LLM with a few such examples is called [few-shotting](/docs/concepts/few_shot_prompting/), and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n", "\n", "A few-shot prompt template can be constructed from either a set of examples, or from an [Example Selector](https://python.langchain.com/api_reference/core/example_selectors/langchain_core.example_selectors.base.BaseExampleSelector.html) class responsible for choosing a subset of examples from the defined set.\n", "\n", diff --git a/docs/docs/how_to/few_shot_examples_chat.ipynb b/docs/docs/how_to/few_shot_examples_chat.ipynb index 51e41f65e40..3be13f6cbf5 100644 --- a/docs/docs/how_to/few_shot_examples_chat.ipynb +++ b/docs/docs/how_to/few_shot_examples_chat.ipynb @@ -27,7 +27,7 @@ "\n", ":::\n", "\n", - "This guide covers how to prompt a chat model with example inputs and outputs. Providing the model with a few such examples is called few-shotting, and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n", + "This guide covers how to prompt a chat model with example inputs and outputs. Providing the model with a few such examples is called [few-shotting](/docs/concepts/few_shot_prompting/), and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n", "\n", "There does not appear to be solid consensus on how best to do few-shot prompting, and the optimal prompt compilation will likely vary by model. 
Because of this, we provide few-shot prompt templates like the [FewShotChatMessagePromptTemplate](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.few_shot.FewShotChatMessagePromptTemplate.html?highlight=fewshot#langchain_core.prompts.few_shot.FewShotChatMessagePromptTemplate) as a flexible starting point, and you can modify or replace them as you see fit.\n", "\n", diff --git a/docs/docs/how_to/filter_messages.ipynb b/docs/docs/how_to/filter_messages.ipynb index 794ef630326..108ee908645 100644 --- a/docs/docs/how_to/filter_messages.ipynb +++ b/docs/docs/how_to/filter_messages.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to filter messages\n", "\n", - "In more complex chains and agents we might track state with a list of messages. This list can start to accumulate messages from multiple different models, speakers, sub-chains, etc., and we may only want to pass subsets of this full list of messages to each model call in the chain/agent.\n", + "In more complex chains and agents we might track state with a list of [messages](/docs/concepts/messages/). This list can start to accumulate messages from multiple different models, speakers, sub-chains, etc., and we may only want to pass subsets of this full list of messages to each model call in the chain/agent.\n", "\n", "The `filter_messages` utility makes it easy to filter messages by type, id, or name.\n", "\n", diff --git a/docs/docs/how_to/graph_constructing.ipynb b/docs/docs/how_to/graph_constructing.ipynb index 79b9e1463f4..5ca45d73645 100644 --- a/docs/docs/how_to/graph_constructing.ipynb +++ b/docs/docs/how_to/graph_constructing.ipynb @@ -15,7 +15,7 @@ "source": [ "# How to construct knowledge graphs\n", "\n", - "In this guide we'll go over the basic ways of constructing a knowledge graph based on unstructured text. The constructed graph can then be used as a knowledge base in a RAG application.\n", + "In this guide we'll go over the basic ways of constructing a knowledge graph based on unstructured text. The constructed graph can then be used as a knowledge base in a [RAG](/docs/concepts/rag/) application.\n", "\n", "## ⚠️ Security note ⚠️\n", "\n", @@ -68,7 +68,7 @@ "metadata": {}, "outputs": [ { - "name": "stdin", + "name": "stdout", "output_type": "stream", "text": [ " ········\n" ] diff --git a/docs/docs/how_to/hybrid.ipynb b/docs/docs/how_to/hybrid.ipynb index 5f45061f66e..55b13579cef 100644 --- a/docs/docs/how_to/hybrid.ipynb +++ b/docs/docs/how_to/hybrid.ipynb @@ -9,7 +9,7 @@ "source": [ "# Hybrid Search\n", "\n", - "The standard search in LangChain is done by vector similarity. However, a number of vectorstores implementations (Astra DB, ElasticSearch, Neo4J, AzureSearch, Qdrant...) also support more advanced search combining vector similarity search and other search techniques (full-text, BM25, and so on). This is generally referred to as \"Hybrid\" search.\n", + "The standard search in LangChain is done by vector similarity. However, a number of [vector store](/docs/integrations/vectorstores/) implementations (Astra DB, ElasticSearch, Neo4J, AzureSearch, Qdrant...) also support more advanced search combining vector similarity search and other search techniques (full-text, BM25, and so on). 
This is generally referred to as \"Hybrid\" search.\n", "\n", "**Step 1: Make sure the vectorstore you are using supports hybrid search**\n", "\n", diff --git a/docs/docs/how_to/indexing.ipynb b/docs/docs/how_to/indexing.ipynb index 904424a1ac0..e3e6ec8aef6 100644 --- a/docs/docs/how_to/indexing.ipynb +++ b/docs/docs/how_to/indexing.ipynb @@ -9,7 +9,7 @@ "\n", "Here, we will look at a basic indexing workflow using the LangChain indexing API. \n", "\n", - "The indexing API lets you load and keep in sync documents from any source into a vector store. Specifically, it helps:\n", + "The indexing API lets you load and keep in sync documents from any source into a [vector store](/docs/concepts/vectorstores/). Specifically, it helps:\n", "\n", "* Avoid writing duplicated content into the vector store\n", "* Avoid re-writing unchanged content\n", diff --git a/docs/docs/how_to/lcel_cheatsheet.ipynb b/docs/docs/how_to/lcel_cheatsheet.ipynb index fb67e0cd7cf..20e825fe27b 100644 --- a/docs/docs/how_to/lcel_cheatsheet.ipynb +++ b/docs/docs/how_to/lcel_cheatsheet.ipynb @@ -7,7 +7,7 @@ "source": [ "# LangChain Expression Language Cheatsheet\n", "\n", - "This is a quick reference for all the most important LCEL primitives. For more advanced usage see the [LCEL how-to guides](/docs/how_to/#langchain-expression-language-lcel) and the [full API reference](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html).\n", + "This is a quick reference for all the most important [LCEL](/docs/concepts/lcel/) primitives. For more advanced usage see the [LCEL how-to guides](/docs/how_to/#langchain-expression-language-lcel) and the [full API reference](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html).\n", "\n", "### Invoke a runnable\n", "#### [Runnable.invoke()](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.invoke) / [Runnable.ainvoke()](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.ainvoke)" diff --git a/docs/docs/how_to/llm_caching.ipynb b/docs/docs/how_to/llm_caching.ipynb index 1ed564b393e..6fc21369221 100644 --- a/docs/docs/how_to/llm_caching.ipynb +++ b/docs/docs/how_to/llm_caching.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to cache LLM responses\n", "\n", - "LangChain provides an optional caching layer for LLMs. This is useful for two reasons:\n", + "LangChain provides an optional [caching](/docs/concepts/chat_models/#caching) layer for LLMs. This is useful for two reasons:\n", "\n", "It can save you money by reducing the number of API calls you make to the LLM provider, if you're often requesting the same completion multiple times.\n", "It can speed up your application by reducing the number of API calls you make to the LLM provider.\n" diff --git a/docs/docs/how_to/llm_token_usage_tracking.ipynb b/docs/docs/how_to/llm_token_usage_tracking.ipynb index 2f1a8a92c1e..a4eb596b764 100644 --- a/docs/docs/how_to/llm_token_usage_tracking.ipynb +++ b/docs/docs/how_to/llm_token_usage_tracking.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to track token usage for LLMs\n", "\n", - "Tracking token usage to calculate cost is an important part of putting your app in production. 
This guide goes over how to obtain this information from your LangChain model calls.\n", + "Tracking [token](/docs/concepts/tokens/) usage to calculate cost is an important part of putting your app in production. This guide goes over how to obtain this information from your LangChain model calls.\n", "\n", ":::info Prerequisites\n", "\n", diff --git a/docs/docs/how_to/logprobs.ipynb b/docs/docs/how_to/logprobs.ipynb index 6033fd97823..47bfa013a7e 100644 --- a/docs/docs/how_to/logprobs.ipynb +++ b/docs/docs/how_to/logprobs.ipynb @@ -11,10 +11,11 @@ "\n", "This guide assumes familiarity with the following concepts:\n", "- [Chat models](/docs/concepts/chat_models)\n", + "- [Tokens](/docs/concepts/tokens)\n", "\n", ":::\n", "\n", - "Certain chat models can be configured to return token-level log probabilities representing the likelihood of a given token. This guide walks through how to get this information in LangChain." + "Certain [chat models](/docs/concepts/chat_models/) can be configured to return token-level log probabilities representing the likelihood of a given token. This guide walks through how to get this information in LangChain." ] }, { diff --git a/docs/docs/how_to/merge_message_runs.ipynb b/docs/docs/how_to/merge_message_runs.ipynb index e115eef2954..ff9aaee593f 100644 --- a/docs/docs/how_to/merge_message_runs.ipynb +++ b/docs/docs/how_to/merge_message_runs.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to merge consecutive messages of the same type\n", "\n", - "Certain models do not support passing in consecutive messages of the same type (a.k.a. \"runs\" of the same message type).\n", + "Certain models do not support passing in consecutive [messages](/docs/concepts/messages/) of the same type (a.k.a. \"runs\" of the same message type).\n", "\n", "The `merge_message_runs` utility makes it easy to merge consecutive messages of the same type.\n", "\n", diff --git a/docs/docs/how_to/multi_vector.ipynb b/docs/docs/how_to/multi_vector.ipynb index 69b4b0df0e0..a68086b14fa 100644 --- a/docs/docs/how_to/multi_vector.ipynb +++ b/docs/docs/how_to/multi_vector.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to retrieve using multiple vectors per document\n", "\n", - "It can often be useful to store multiple vectors per document. There are multiple use cases where this is beneficial. For example, we can embed multiple chunks of a document and associate those embeddings with the parent document, allowing retriever hits on the chunks to return the larger document.\n", + "It can often be useful to store multiple [vectors](/docs/concepts/vectorstores/) per document. There are multiple use cases where this is beneficial. For example, we can [embed](/docs/concepts/embedding_models/) multiple chunks of a document and associate those embeddings with the parent document, allowing [retriever](/docs/concepts/retrievers/) hits on the chunks to return the larger document.\n", "\n", "LangChain implements a base [MultiVectorRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.multi_vector.MultiVectorRetriever.html), which simplifies this process. Much of the complexity lies in how to create the multiple vectors per document. 
This notebook covers some of the common ways to create those vectors and use the `MultiVectorRetriever`.\n", "\n", diff --git a/docs/docs/how_to/multimodal_inputs.ipynb b/docs/docs/how_to/multimodal_inputs.ipynb index 6d0b0b736a4..f1eff275f60 100644 --- a/docs/docs/how_to/multimodal_inputs.ipynb +++ b/docs/docs/how_to/multimodal_inputs.ipynb @@ -7,11 +7,11 @@ "source": [ "# How to pass multimodal data directly to models\n", "\n", - "Here we demonstrate how to pass multimodal input directly to models. \n", + "Here we demonstrate how to pass [multimodal](/docs/concepts/multimodality/) input directly to models. \n", "We currently expect all input to be passed in the same format as [OpenAI expects](https://platform.openai.com/docs/guides/vision).\n", "For other model providers that support multimodal input, we have added logic inside the class to convert to the expected format.\n", "\n", - "In this example we will ask a model to describe an image." + "In this example we will ask a [model](/docs/concepts/chat_models/#multimodality) to describe an image." ] }, { diff --git a/docs/docs/how_to/multimodal_prompts.ipynb b/docs/docs/how_to/multimodal_prompts.ipynb index a9cfc618a4b..321e6efa8b0 100644 --- a/docs/docs/how_to/multimodal_prompts.ipynb +++ b/docs/docs/how_to/multimodal_prompts.ipynb @@ -7,9 +7,9 @@ "source": [ "# How to use multimodal prompts\n", "\n", - "Here we demonstrate how to use prompt templates to format multimodal inputs to models. \n", + "Here we demonstrate how to use prompt templates to format [multimodal](/docs/concepts/multimodality/) inputs to models. \n", "\n", - "In this example we will ask a model to describe an image." + "In this example we will ask a [model](/docs/concepts/chat_models/#multimodality) to describe an image." ] }, { diff --git a/docs/docs/how_to/output_parser_custom.ipynb b/docs/docs/how_to/output_parser_custom.ipynb index a8cca984b69..d77e1ff9c6a 100644 --- a/docs/docs/how_to/output_parser_custom.ipynb +++ b/docs/docs/how_to/output_parser_custom.ipynb @@ -7,11 +7,11 @@ "source": [ "# How to create a custom Output Parser\n", "\n", - "In some situations you may want to implement a custom parser to structure the model output into a custom format.\n", + "In some situations you may want to implement a custom [parser](/docs/concepts/output_parsers/) to structure the model output into a custom format.\n", "\n", "There are two ways to implement a custom parser:\n", "\n", - "1. Using `RunnableLambda` or `RunnableGenerator` in LCEL -- we strongly recommend this for most use cases\n", + "1. Using `RunnableLambda` or `RunnableGenerator` in [LCEL](/docs/concepts/lcel/) -- we strongly recommend this for most use cases\n", "2. By inheriting from one of the base classes for output parsing -- this is the hard way of doing things\n", "\n", "The differences between the two approaches are mostly superficial and are mainly in terms of which callbacks are triggered (e.g., `on_chain_start` vs. `on_parser_start`), and how a runnable lambda vs. a parser might be visualized in a tracing platform like LangSmith."
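To make the first approach above concrete, here is a minimal sketch of a custom parser wired into an LCEL chain. It is an illustration rather than the notebook's own code: the model name is a placeholder, and it assumes `langchain-openai` is installed with an OpenAI API key configured.

```python
from langchain_core.messages import AIMessage
from langchain_openai import ChatOpenAI


def parse(ai_message: AIMessage) -> str:
    """Toy parser: invert the case of the model's text output.

    Assumes the model returns plain string content.
    """
    return ai_message.content.swapcase()


model = ChatOpenAI(model="gpt-4o-mini")  # placeholder model name

# Piping a plain function into a chain coerces it into a RunnableLambda,
# so it participates in invoke/batch like any other runnable.
chain = model | parse

print(chain.invoke("hello"))
```

Note that a plain function like this only runs once the full output is available; preserving token-by-token streaming is what the `RunnableGenerator` variant mentioned above is for.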
diff --git a/docs/docs/how_to/output_parser_fixing.ipynb b/docs/docs/how_to/output_parser_fixing.ipynb index 922fcf7adf0..89692da147f 100644 --- a/docs/docs/how_to/output_parser_fixing.ipynb +++ b/docs/docs/how_to/output_parser_fixing.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to use the output-fixing parser\n", "\n", - "This output parser wraps another output parser, and in the event that the first one fails it calls out to another LLM to fix any errors.\n", + "This [output parser](/docs/concepts/output_parsers/) wraps another output parser, and in the event that the first one fails it calls out to another LLM to fix any errors.\n", "\n", "But we can do other things besides throw errors. Specifically, we can pass the misformatted output, along with the formatted instructions, to the model and ask it to fix it.\n", "\n", diff --git a/docs/docs/how_to/output_parser_structured.ipynb b/docs/docs/how_to/output_parser_structured.ipynb index 2cb69c7bbb4..f9dda3f95e4 100644 --- a/docs/docs/how_to/output_parser_structured.ipynb +++ b/docs/docs/how_to/output_parser_structured.ipynb @@ -19,7 +19,7 @@ "\n", "Language models output text. But there are times where you want to get more structured information than just text back. While some model providers support [built-in ways to return structured output](/docs/how_to/structured_output), not all do.\n", "\n", - "Output parsers are classes that help structure language model responses. There are two main methods an output parser must implement:\n", + "[Output parsers](/docs/concepts/output_parsers/) are classes that help structure language model responses. There are two main methods an output parser must implement:\n", "\n", "- \"Get format instructions\": A method which returns a string containing instructions for how the output of a language model should be formatted.\n", "- \"Parse\": A method which takes in a string (assumed to be the response from a language model) and parses it into some structure.\n", diff --git a/docs/docs/how_to/output_parser_xml.ipynb b/docs/docs/how_to/output_parser_xml.ipynb index d01b5990fed..dd7dac98903 100644 --- a/docs/docs/how_to/output_parser_xml.ipynb +++ b/docs/docs/how_to/output_parser_xml.ipynb @@ -20,7 +20,7 @@ "\n", "LLMs from different providers often have different strengths depending on the specific data they are trained on. This also means that some may be \"better\" and more reliable at generating output in formats other than JSON.\n", "\n", - "This guide shows you how to use the [`XMLOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html) to prompt models for XML output, and then parse that output into a usable format.\n", + "This guide shows you how to use the [`XMLOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html) to prompt models for XML output, and then [parse](/docs/concepts/output_parsers/) that output into a usable format.\n", "\n", ":::note\n", "Keep in mind that large language models are leaky abstractions! 
You'll have to use an LLM with sufficient capacity to generate well-formed XML.\n", diff --git a/docs/docs/how_to/parent_document_retriever.ipynb b/docs/docs/how_to/parent_document_retriever.ipynb index 38b06d64d1a..452c31ace39 100644 --- a/docs/docs/how_to/parent_document_retriever.ipynb +++ b/docs/docs/how_to/parent_document_retriever.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to use the Parent Document Retriever\n", "\n", - "When splitting documents for retrieval, there are often conflicting desires:\n", + "When splitting documents for [retrieval](/docs/concepts/retrieval/), there are often conflicting desires:\n", "\n", "1. You may want to have small documents, so that their embeddings can most\n", " accurately reflect their meaning. If too long, then the embeddings can\n", @@ -72,7 +72,7 @@ "source": [ "## Retrieving full documents\n", "\n", - "In this mode, we want to retrieve the full documents. Therefore, we only specify a child splitter." + "In this mode, we want to retrieve the full documents. Therefore, we only specify a child [splitter](/docs/concepts/text_splitters/)." ] }, { diff --git a/docs/docs/how_to/prompts_composition.ipynb b/docs/docs/how_to/prompts_composition.ipynb index bf8d0f5fb23..25b51867b2e 100644 --- a/docs/docs/how_to/prompts_composition.ipynb +++ b/docs/docs/how_to/prompts_composition.ipynb @@ -24,7 +24,7 @@ "\n", ":::\n", "\n", - "LangChain provides a user friendly interface for composing different parts of prompts together. You can do this with either string prompts or chat prompts. Constructing prompts this way allows for easy reuse of components." + "LangChain provides a user friendly interface for composing different parts of [prompts](/docs/concepts/prompt_templates/) together. You can do this with either string prompts or chat prompts. Constructing prompts this way allows for easy reuse of components." ] }, { diff --git a/docs/docs/how_to/prompts_partial.ipynb b/docs/docs/how_to/prompts_partial.ipynb index b32e2586c17..4abf02b5323 100644 --- a/docs/docs/how_to/prompts_partial.ipynb +++ b/docs/docs/how_to/prompts_partial.ipynb @@ -24,7 +24,7 @@ "\n", ":::\n", "\n", - "Like partially binding arguments to a function, it can make sense to \"partial\" a prompt template - e.g. pass in a subset of the required values, as to create a new prompt template which expects only the remaining subset of values.\n", + "Like partially binding arguments to a function, it can make sense to \"partial\" a [prompt template](/docs/concepts/prompt_templates/) - e.g. 
pass in a subset of the required values, so as to create a new prompt template which expects only the remaining subset of values.\n", "\n", "LangChain supports this in two ways:\n", "\n", diff --git a/docs/docs/how_to/qa_chat_history_how_to.ipynb b/docs/docs/how_to/qa_chat_history_how_to.ipynb index c757e5ef35c..0c82ac75f9a 100644 --- a/docs/docs/how_to/qa_chat_history_how_to.ipynb +++ b/docs/docs/how_to/qa_chat_history_how_to.ipynb @@ -19,7 +19,7 @@ ":::\n", "\n", "\n", - "In many Q&A applications we want to allow the user to have a back-and-forth conversation, meaning the application needs some sort of \"memory\" of past questions and answers, and some logic for incorporating those into its current thinking.\n", + "In many [Q&A applications](/docs/concepts/rag/) we want to allow the user to have a back-and-forth conversation, meaning the application needs some sort of \"memory\" of past questions and answers, and some logic for incorporating those into its current thinking.\n", "\n", "In this guide we focus on **adding logic for incorporating historical messages.**\n", "\n", diff --git a/docs/docs/how_to/qa_citations.ipynb b/docs/docs/how_to/qa_citations.ipynb index fda1b1f1daa..d2b61428771 100644 --- a/docs/docs/how_to/qa_citations.ipynb +++ b/docs/docs/how_to/qa_citations.ipynb @@ -19,7 +19,7 @@ "\n", "We generally suggest using the first item of the list that works for your use-case. That is, if your model supports tool-calling, try methods 1 or 2; otherwise, or if those fail, advance down the list.\n", "\n", - "Let's first create a simple RAG chain. To start we'll just retrieve from Wikipedia using the [WikipediaRetriever](https://python.langchain.com/api_reference/community/retrievers/langchain_community.retrievers.wikipedia.WikipediaRetriever.html)." + "Let's first create a simple [RAG](/docs/concepts/rag/) chain. To start we'll just retrieve from Wikipedia using the [WikipediaRetriever](https://python.langchain.com/api_reference/community/retrievers/langchain_community.retrievers.wikipedia.WikipediaRetriever.html)." ] }, { @@ -140,7 +140,7 @@ "id": "c89e2045-9244-43e6-bf3f-59af22658529", "metadata": {}, "source": [ - "Now that we've got a model, retriver and prompt, let's chain them all together. We'll need to add some logic for formatting our retrieved Documents to a string that can be passed to our prompt. Following the how-to guide on [adding citations](/docs/how_to/qa_citations) to a RAG application, we'll make it so our chain returns both the answer and the retrieved Documents." + "Now that we've got a [model](/docs/concepts/chat_models/), [retriever](/docs/concepts/retrievers/) and [prompt](/docs/concepts/prompt_templates/), let's chain them all together. We'll need to add some logic for formatting our retrieved Documents to a string that can be passed to our prompt. Following the how-to guide on [adding citations](/docs/how_to/qa_citations) to a RAG application, we'll make it so our chain returns both the answer and the retrieved Documents." ] }, { diff --git a/docs/docs/how_to/qa_per_user.ipynb b/docs/docs/how_to/qa_per_user.ipynb index 65d4371b912..08d0592f803 100644 --- a/docs/docs/how_to/qa_per_user.ipynb +++ b/docs/docs/how_to/qa_per_user.ipynb @@ -7,9 +7,9 @@ "source": [ "# How to do per-user retrieval\n", "\n", - "This guide demonstrates how to configure runtime properties of a retrieval chain. 
An example application is to limit the documents available to a retriever based on the user.\n", + "This guide demonstrates how to configure runtime properties of a retrieval chain. An example application is to limit the documents available to a [retriever](/docs/concepts/retrievers/) based on the user.\n", "\n", - "When building a retrieval app, you often have to build it with multiple users in mind. This means that you may be storing data not just for one user, but for many different users, and they should not be able to see eachother's data. This means that you need to be able to configure your retrieval chain to only retrieve certain information. This generally involves two steps.\n", + "When building a [retrieval app](/docs/concepts/rag/), you often have to build it with multiple users in mind. This means that you may be storing data not just for one user, but for many different users, and they should not be able to see each other's data. This means that you need to be able to configure your retrieval chain to only retrieve certain information. This generally involves two steps.\n", "\n", "**Step 1: Make sure the retriever you are using supports multiple users**\n", "\n", diff --git a/docs/docs/how_to/qa_sources.ipynb b/docs/docs/how_to/qa_sources.ipynb index c9d9ce7330f..eccf8d070e3 100644 --- a/docs/docs/how_to/qa_sources.ipynb +++ b/docs/docs/how_to/qa_sources.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to get your RAG application to return sources\n", "\n", - "Often in Q&A applications it's important to show users the sources that were used to generate the answer. The simplest way to do this is for the chain to return the Documents that were retrieved in each generation.\n", + "Often in [Q&A](/docs/concepts/rag/) applications it's important to show users the sources that were used to generate the answer. The simplest way to do this is for the chain to return the Documents that were retrieved in each generation.\n", "\n", "We'll work off of the Q&A app we built over the [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post by Lilian Weng in the [RAG tutorial](/docs/tutorials/rag).\n", "\n", diff --git a/docs/docs/how_to/qa_streaming.ipynb b/docs/docs/how_to/qa_streaming.ipynb index 18e19fbcb99..faca44cf0e3 100644 --- a/docs/docs/how_to/qa_streaming.ipynb +++ b/docs/docs/how_to/qa_streaming.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to stream results from your RAG application\n", "\n", - "This guide explains how to stream results from a RAG application. It covers streaming tokens from the final output as well as intermediate steps of a chain (e.g., from query re-writing).\n", + "This guide explains how to stream results from a [RAG](/docs/concepts/rag/) application. It covers streaming tokens from the final output as well as intermediate steps of a chain (e.g., from query re-writing).\n", "\n", "We'll work off of the Q&A app with sources we built over the [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post by Lilian Weng in the [RAG tutorial](/docs/tutorials/rag)."
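As a companion to the streaming RAG hunk above, a minimal end-to-end sketch might look like the following. The one-document in-memory corpus, `OpenAIEmbeddings`, and `gpt-4o-mini` are illustrative assumptions standing in for the index built in the RAG tutorial.

```python
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Tiny stand-in corpus; a real app would index the blog post from the tutorial.
vectorstore = InMemoryVectorStore.from_documents(
    [Document(page_content="Task decomposition breaks a large task into smaller steps.")],
    OpenAIEmbeddings(),
)
retriever = vectorstore.as_retriever()

prompt = ChatPromptTemplate.from_template(
    "Answer using only this context:\n\n{context}\n\nQuestion: {question}"
)
llm = ChatOpenAI(model="gpt-4o-mini")  # assumed model choice


def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Tokens of the final answer arrive incrementally.
for chunk in rag_chain.stream("What is task decomposition?"):
    print(chunk, end="", flush=True)
```

For intermediate steps such as the retrieved documents, the same chain can also be consumed with `astream_events()` rather than `stream()`.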
] diff --git a/docs/docs/how_to/query_few_shot.ipynb b/docs/docs/how_to/query_few_shot.ipynb index 2b087ae214c..8955c4490c9 100644 --- a/docs/docs/how_to/query_few_shot.ipynb +++ b/docs/docs/how_to/query_few_shot.ipynb @@ -17,7 +17,7 @@ "source": [ "# How to add examples to the prompt for query analysis\n", "\n", - "As our query analysis becomes more complex, the LLM may struggle to understand how exactly it should respond in certain scenarios. In order to improve performance here, we can add examples to the prompt to guide the LLM.\n", + "As our query analysis becomes more complex, the LLM may struggle to understand how exactly it should respond in certain scenarios. In order to improve performance here, we can [add examples](/docs/concepts/few_shot_prompting/) to the prompt to guide the LLM.\n", "\n", "Let's take a look at how we can add examples for the LangChain YouTube video query analyzer we built in the [Quickstart](/docs/tutorials/query_analysis)." ] diff --git a/docs/docs/how_to/query_multiple_retrievers.ipynb b/docs/docs/how_to/query_multiple_retrievers.ipynb index 9e6d7861dca..6cde2876b4d 100644 --- a/docs/docs/how_to/query_multiple_retrievers.ipynb +++ b/docs/docs/how_to/query_multiple_retrievers.ipynb @@ -17,7 +17,7 @@ "source": [ "# How to handle multiple retrievers when doing query analysis\n", "\n", - "Sometimes, a query analysis technique may allow for selection of which retriever to use. To use this, you will need to add some logic to select the retriever to do. We will show a simple example (using mock data) of how to do that." + "Sometimes, a query analysis technique may allow for selection of which [retriever](/docs/concepts/retrievers/) to use. To use this, you will need to add some logic to select which retriever to use. We will show a simple example (using mock data) of how to do that." ] }, { diff --git a/docs/docs/how_to/recursive_json_splitter.ipynb b/docs/docs/how_to/recursive_json_splitter.ipynb index 9936ed9ebcb..57e97af1bf6 100644 --- a/docs/docs/how_to/recursive_json_splitter.ipynb +++ b/docs/docs/how_to/recursive_json_splitter.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to split JSON data\n", "\n", - "This json splitter splits json data while allowing control over chunk sizes. It traverses json data depth first and builds smaller json chunks. It attempts to keep nested json objects whole but will split them if needed to keep chunks between a min_chunk_size and the max_chunk_size.\n", + "This json splitter [splits](/docs/concepts/text_splitters/) json data while allowing control over chunk sizes. It traverses json data depth first and builds smaller json chunks. It attempts to keep nested json objects whole but will split them if needed to keep chunks between a min_chunk_size and the max_chunk_size.\n", "\n", "If the value is not a nested json, but rather a very large string, the string will not be split. If you need a hard cap on the chunk size, consider composing this with a Recursive Text splitter on those chunks. There is an optional pre-processing step to split lists, by first converting them to json (dict) and then splitting them as such.\n", "\n", diff --git a/docs/docs/how_to/recursive_text_splitter.ipynb b/docs/docs/how_to/recursive_text_splitter.ipynb index ce77b89f9d1..166fa59e874 100644 --- a/docs/docs/how_to/recursive_text_splitter.ipynb +++ b/docs/docs/how_to/recursive_text_splitter.ipynb @@ -21,7 +21,7 @@ "source": [ "# How to recursively split text by characters\n", "\n", - "This text splitter is the recommended one for generic text. 
It is parameterized by a list of characters. It tries to split on them in order until the chunks are small enough. The default list is `[\"\\n\\n\", \"\\n\", \" \", \"\"]`. This has the effect of trying to keep all paragraphs (and then sentences, and then words) together as long as possible, as those would generically seem to be the strongest semantically related pieces of text.\n", + "This [text splitter](/docs/concepts/text_splitters/) is the recommended one for generic text. It is parameterized by a list of characters. It tries to split on them in order until the chunks are small enough. The default list is `[\"\\n\\n\", \"\\n\", \" \", \"\"]`. This has the effect of trying to keep all paragraphs (and then sentences, and then words) together as long as possible, as those would generically seem to be the strongest semantically related pieces of text.\n", "\n", "1. How the text is split: by list of characters.\n", "2. How the chunk size is measured: by number of characters.\n", diff --git a/docs/docs/how_to/response_metadata.ipynb b/docs/docs/how_to/response_metadata.ipynb index 773333d09dc..150cf6a89bc 100644 --- a/docs/docs/how_to/response_metadata.ipynb +++ b/docs/docs/how_to/response_metadata.ipynb @@ -7,7 +7,7 @@ "source": [ "# Response metadata\n", "\n", - "Many model providers include some metadata in their chat generation responses. This metadata can be accessed via the `AIMessage.response_metadata: Dict` attribute. Depending on the model provider and model configuration, this can contain information like [token counts](/docs/how_to/chat_token_usage_tracking), [logprobs](/docs/how_to/logprobs), and more.\n", + "Many model providers include some metadata in their chat generation [responses](/docs/concepts/messages/#aimessage). This metadata can be accessed via the `AIMessage.response_metadata: Dict` attribute. Depending on the model provider and model configuration, this can contain information like [token counts](/docs/how_to/chat_token_usage_tracking), [logprobs](/docs/how_to/logprobs), and more.\n", "\n", "Here's what the response metadata looks like for a few different providers:\n", "\n", diff --git a/docs/docs/how_to/runnable_runtime_secrets.ipynb b/docs/docs/how_to/runnable_runtime_secrets.ipynb index 819cdf1cbe6..5a6b5cfa1ef 100644 --- a/docs/docs/how_to/runnable_runtime_secrets.ipynb +++ b/docs/docs/how_to/runnable_runtime_secrets.ipynb @@ -11,7 +11,7 @@ "\n", ":::\n", "\n", - "We can pass in secrets to our runnables at runtime using the `RunnableConfig`. Specifically we can pass in secrets with a `__` prefix to the `configurable` field. This will ensure that these secrets aren't traced as part of the invocation:" + "We can pass in secrets to our [runnables](/docs/concepts/runnables/) at runtime using the `RunnableConfig`. Specifically we can pass in secrets with a `__` prefix to the `configurable` field. This will ensure that these secrets aren't traced as part of the invocation:" ] }, { diff --git a/docs/docs/how_to/self_query.ipynb b/docs/docs/how_to/self_query.ipynb index b07e6a89b15..06151f12d3d 100644 --- a/docs/docs/how_to/self_query.ipynb +++ b/docs/docs/how_to/self_query.ipynb @@ -13,7 +13,7 @@ "\n", ":::\n", "\n", - "A self-querying retriever is one that, as the name suggests, has the ability to query itself. Specifically, given any natural language query, the retriever uses a query-constructing LLM chain to write a structured query and then applies that structured query to its underlying VectorStore. 
This allows the retriever to not only use the user-input query for semantic similarity comparison with the contents of stored documents but to also extract filters from the user query on the metadata of stored documents and to execute those filters.\n", + "A self-querying [retriever](/docs/concepts/retrievers/) is one that, as the name suggests, has the ability to query itself. Specifically, given any natural language query, the retriever uses a query-constructing LLM chain to write a structured query and then applies that structured query to its underlying [vector store](/docs/concepts/vectorstores/). This allows the retriever to not only use the user-input query for semantic similarity comparison with the contents of stored documents but to also extract filters from the user query on the metadata of stored documents and to execute those filters.\n", "\n", "![](../../static/img/self_querying.jpg)\n", "\n", diff --git a/docs/docs/how_to/split_by_token.ipynb b/docs/docs/how_to/split_by_token.ipynb index 87aad35bc64..047f4777f34 100644 --- a/docs/docs/how_to/split_by_token.ipynb +++ b/docs/docs/how_to/split_by_token.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to split text by tokens \n", "\n", - "Language models have a token limit. You should not exceed the token limit. When you split your text into chunks it is therefore a good idea to count the number of tokens. There are many tokenizers. When you count tokens in your text you should use the same tokenizer as used in the language model. " + "Language models have a [token](/docs/concepts/tokens/) limit. You should not exceed the token limit. When you [split your text](/docs/concepts/text_splitters/) into chunks it is therefore a good idea to count the number of tokens. There are many tokenizers. When you count tokens in your text you should use the same tokenizer as used in the language model. " ] }, { diff --git a/docs/docs/how_to/sql_prompting.ipynb b/docs/docs/how_to/sql_prompting.ipynb index 0cd1a1c2626..831a7bca13a 100644 --- a/docs/docs/how_to/sql_prompting.ipynb +++ b/docs/docs/how_to/sql_prompting.ipynb @@ -12,7 +12,7 @@ "\n", "- How the dialect of the LangChain [SQLDatabase](https://python.langchain.com/api_reference/community/utilities/langchain_community.utilities.sql_database.SQLDatabase.html) impacts the prompt of the chain;\n", "- How to format schema information into the prompt using `SQLDatabase.get_context`;\n", - "- How to build and select few-shot examples to assist the model.\n", + "- How to build and select [few-shot examples](/docs/concepts/few_shot_prompting/) to assist the model.\n", "\n", "## Setup\n", "\n", diff --git a/docs/docs/how_to/structured_output.ipynb b/docs/docs/how_to/structured_output.ipynb index 04171ac72cb..4eaf09d3569 100644 --- a/docs/docs/how_to/structured_output.ipynb +++ b/docs/docs/how_to/structured_output.ipynb @@ -29,7 +29,7 @@ "- [Function/tool calling](/docs/concepts/tool_calling)\n", ":::\n", "\n", - "It is often useful to have a model return output that matches a specific schema. One common use-case is extracting data from text to insert into a database or use with some other downstream system. This guide covers a few strategies for getting structured outputs from a model.\n", + "It is often useful to have a model return output that matches a specific [schema](/docs/concepts/structured_outputs/). One common use-case is extracting data from text to insert into a database or use with some other downstream system. 
This guide covers a few strategies for getting structured outputs from a model.\n", "\n", "## The `.with_structured_output()` method\n", "\n", @@ -41,9 +41,9 @@ "\n", ":::\n", "\n", - "This is the easiest and most reliable way to get structured outputs. `with_structured_output()` is implemented for models that provide native APIs for structuring outputs, like tool/function calling or JSON mode, and makes use of these capabilities under the hood.\n", + "This is the easiest and most reliable way to get structured outputs. `with_structured_output()` is implemented for [models that provide native APIs for structuring outputs](/docs/integrations/chat/), like tool/function calling or JSON mode, and makes use of these capabilities under the hood.\n", "\n", - "This method takes a schema as input which specifies the names, types, and descriptions of the desired output attributes. The method returns a model-like Runnable, except that instead of outputting strings or Messages it outputs objects corresponding to the given schema. The schema can be specified as a TypedDict class, [JSON Schema](https://json-schema.org/) or a Pydantic class. If TypedDict or JSON Schema are used then a dictionary will be returned by the Runnable, and if a Pydantic class is used then a Pydantic object will be returned.\n", + "This method takes a schema as input which specifies the names, types, and descriptions of the desired output attributes. The method returns a model-like Runnable, except that instead of outputting strings or [messages](/docs/concepts/messages/) it outputs objects corresponding to the given schema. The schema can be specified as a TypedDict class, [JSON Schema](https://json-schema.org/) or a Pydantic class. If TypedDict or JSON Schema are used then a dictionary will be returned by the Runnable, and if a Pydantic class is used then a Pydantic object will be returned.\n", "\n", "As an example, let's get a model to generate a joke and separate the setup from the punchline:\n", "\n", diff --git a/docs/docs/how_to/summarize_stuff.ipynb b/docs/docs/how_to/summarize_stuff.ipynb index 86fbb86e108..73f8f4ffd73 100644 --- a/docs/docs/how_to/summarize_stuff.ipynb +++ b/docs/docs/how_to/summarize_stuff.ipynb @@ -30,7 +30,7 @@ "source": [ "## Load chat model\n", "\n", - "Let's first load a chat model:\n", + "Let's first load a [chat model](/docs/concepts/chat_models/):\n", "\n", "import ChatModelTabs from \"@theme/ChatModelTabs\";\n", "\n", diff --git a/docs/docs/how_to/time_weighted_vectorstore.ipynb b/docs/docs/how_to/time_weighted_vectorstore.ipynb index 91ba6e72f3b..d14b3757046 100644 --- a/docs/docs/how_to/time_weighted_vectorstore.ipynb +++ b/docs/docs/how_to/time_weighted_vectorstore.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to use a time-weighted vector store retriever\n", "\n", - "This retriever uses a combination of semantic similarity and a time decay.\n", + "This [retriever](/docs/concepts/retrievers/) uses a combination of semantic [similarity](/docs/concepts/embedding_models/#measure-similarity) and a time decay.\n", "\n", "The algorithm for scoring them is:\n", "\n", diff --git a/docs/docs/how_to/tool_artifacts.ipynb b/docs/docs/how_to/tool_artifacts.ipynb index a6fa076c0c1..f99978aedd8 100644 --- a/docs/docs/how_to/tool_artifacts.ipynb +++ b/docs/docs/how_to/tool_artifacts.ipynb @@ -16,7 +16,7 @@ "\n", ":::\n", "\n", - "Tools are utilities that can be called by a model, and whose outputs are designed to be fed back to a model. 
Sometimes, however, there are artifacts of a tool's execution that we want to make accessible to downstream components in our chain or agent, but that we don't want to expose to the model itself. For example if a tool returns a custom object, a dataframe or an image, we may want to pass some metadata about this output to the model without passing the actual output to the model. At the same time, we may want to be able to access this full output elsewhere, for example in downstream tools.\n", + "[Tools](/docs/concepts/tools/) are utilities that can be [called by a model](/docs/concepts/tool_calling/), and whose outputs are designed to be fed back to a model. Sometimes, however, there are artifacts of a tool's execution that we want to make accessible to downstream components in our chain or agent, but that we don't want to expose to the model itself. For example if a tool returns a custom object, a dataframe or an image, we may want to pass some metadata about this output to the model without passing the actual output to the model. At the same time, we may want to be able to access this full output elsewhere, for example in downstream tools.\n", "\n", "The Tool and [ToolMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolMessage.html) interfaces make it possible to distinguish between the parts of the tool output meant for the model (this is the ToolMessage.content) and those parts which are meant for use outside the model (ToolMessage.artifact).\n", "\n", diff --git a/docs/docs/how_to/tool_choice.ipynb b/docs/docs/how_to/tool_choice.ipynb index 0a8c6cdf05c..0a7ad581e0d 100644 --- a/docs/docs/how_to/tool_choice.ipynb +++ b/docs/docs/how_to/tool_choice.ipynb @@ -14,7 +14,7 @@ "- [How to use a model to call tools](/docs/how_to/tool_calling)\n", ":::\n", "\n", - "In order to force our LLM to select a specific tool, we can use the `tool_choice` parameter to ensure certain behavior. First, let's define our model and tools:" + "In order to force our LLM to select a specific [tool](/docs/concepts/tools/), we can use the `tool_choice` parameter to ensure certain behavior. First, let's define our model and tools:" ] }, { diff --git a/docs/docs/how_to/tool_configure.ipynb b/docs/docs/how_to/tool_configure.ipynb index 474d2f3c5fe..64305d26a1f 100644 --- a/docs/docs/how_to/tool_configure.ipynb +++ b/docs/docs/how_to/tool_configure.ipynb @@ -17,9 +17,9 @@ "\n", ":::\n", "\n", - "If you have a tool that call chat models, retrievers, or other runnables, you may want to access internal events from those runnables or configure them with additional properties. This guide shows you how to manually pass parameters properly so that you can do this using the `astream_events()` method.\n", + "If you have a [tool](/docs/concepts/tools/) that calls [chat models](/docs/concepts/chat_models/), [retrievers](/docs/concepts/retrievers/), or other [runnables](/docs/concepts/runnables/), you may want to access internal events from those runnables or configure them with additional properties. This guide shows you how to manually pass parameters properly so that you can do this using the `astream_events()` method.\n", "\n", - "Tools are runnables, and you can treat them the same way as any other runnable at the interface level - you can call `invoke()`, `batch()`, and `stream()` on them as normal. However, when writing custom tools, you may want to invoke other runnables like chat models or retrievers. 
In order to properly trace and configure those sub-invocations, you'll need to manually access and pass in the tool's current [`RunnableConfig`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.config.RunnableConfig.html) object. This guide show you some examples of how to do that.\n", + "Tools are [runnables](/docs/concepts/runnables/), and you can treat them the same way as any other runnable at the interface level - you can call `invoke()`, `batch()`, and `stream()` on them as normal. However, when writing custom tools, you may want to invoke other runnables like chat models or retrievers. In order to properly trace and configure those sub-invocations, you'll need to manually access and pass in the tool's current [`RunnableConfig`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.config.RunnableConfig.html) object. This guide shows you some examples of how to do that.\n", "\n", ":::caution Compatibility\n", "\n", diff --git a/docs/docs/how_to/tool_runtime.ipynb b/docs/docs/how_to/tool_runtime.ipynb index afc8c1c5e4d..3356aaa059f 100644 --- a/docs/docs/how_to/tool_runtime.ipynb +++ b/docs/docs/how_to/tool_runtime.ipynb @@ -21,7 +21,7 @@ " [\"langchain-core\", \"0.2.21\"],\n", "]} />\n", "\n", - "You may need to bind values to a tool that are only known at runtime. For example, the tool logic may require using the ID of the user who made the request.\n", + "You may need to bind values to a [tool](/docs/concepts/tools/) that are only known at runtime. For example, the tool logic may require using the ID of the user who made the request.\n", "\n", "Most of the time, such values should not be controlled by the LLM. In fact, allowing the LLM to control the user ID may lead to a security risk.\n", "\n", diff --git a/docs/docs/how_to/tool_stream_events.ipynb b/docs/docs/how_to/tool_stream_events.ipynb index 5b552362f46..16134a02921 100644 --- a/docs/docs/how_to/tool_stream_events.ipynb +++ b/docs/docs/how_to/tool_stream_events.ipynb @@ -16,7 +16,7 @@ "\n", ":::\n", "\n", - "If you have tools that call chat models, retrievers, or other runnables, you may want to access internal events from those runnables or configure them with additional properties. 
This guide shows you how to manually pass parameters properly so that you can do this using the `astream_events()` method.\n", "\n", ":::caution Compatibility\n", "\n", diff --git a/docs/docs/how_to/tool_streaming.ipynb b/docs/docs/how_to/tool_streaming.ipynb index b96868d80ff..fc85d3c31fa 100644 --- a/docs/docs/how_to/tool_streaming.ipynb +++ b/docs/docs/how_to/tool_streaming.ipynb @@ -6,7 +6,7 @@ "source": [ "# How to stream tool calls\n", "\n", - "When tools are called in a streaming context, \n", + "When [tools](/docs/concepts/tools/) are called in a streaming context, \n", "[message chunks](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html#langchain_core.messages.ai.AIMessageChunk) \n", "will be populated with [tool call chunk](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolCallChunk.html#langchain_core.messages.tool.ToolCallChunk) \n", "objects in a list via the `.tool_call_chunks` attribute. A `ToolCallChunk` includes \n", diff --git a/docs/docs/how_to/tools_as_openai_functions.ipynb b/docs/docs/how_to/tools_as_openai_functions.ipynb index d321afd0ddd..a0a1eea2aea 100644 --- a/docs/docs/how_to/tools_as_openai_functions.ipynb +++ b/docs/docs/how_to/tools_as_openai_functions.ipynb @@ -7,7 +7,7 @@ "source": [ "# How to convert tools to OpenAI Functions\n", "\n", - "This notebook goes over how to use LangChain tools as OpenAI functions." + "This notebook goes over how to use LangChain [tools](/docs/concepts/tools/) as OpenAI functions." ] }, { diff --git a/docs/docs/how_to/tools_chain.ipynb b/docs/docs/how_to/tools_chain.ipynb index b54fa2c885f..4497659789c 100644 --- a/docs/docs/how_to/tools_chain.ipynb +++ b/docs/docs/how_to/tools_chain.ipynb @@ -17,7 +17,7 @@ "source": [ "# How to use tools in a chain\n", "\n", - "In this guide, we will go over the basic ways to create Chains and Agents that call Tools. Tools can be just about anything — APIs, functions, databases, etc. Tools allow us to extend the capabilities of a model beyond just outputting text/messages. The key to using models with tools is correctly prompting a model and parsing its response so that it chooses the right tools and provides the right inputs for them." + "In this guide, we will go over the basic ways to create Chains and Agents that call [Tools](/docs/concepts/tools/). Tools can be just about anything — APIs, functions, databases, etc. Tools allow us to extend the capabilities of a model beyond just outputting text/messages. The key to using models with tools is correctly prompting a model and parsing its response so that it chooses the right tools and provides the right inputs for them." ] }, { @@ -143,7 +143,7 @@ "![chain](../../static/img/tool_chain.svg)\n", "\n", "### Tool/function calling\n", - "One of the most reliable ways to use tools with LLMs is with tool calling APIs (also sometimes called function calling). This only works with models that explicitly support tool calling. You can see which models support tool calling [here](/docs/integrations/chat/), and learn more about how to use tool calling in [this guide](/docs/how_to/function_calling).\n", + "One of the most reliable ways to use tools with LLMs is with [tool calling](/docs/concepts/tool_calling/) APIs (also sometimes called function calling). This only works with models that explicitly support tool calling. 
You can see which models support tool calling [here](/docs/integrations/chat/), and learn more about how to use tool calling in [this guide](/docs/how_to/function_calling).\n", "\n", "First we'll define our model and tools. We'll start with just a single tool, `multiply`.\n", "\n", diff --git a/docs/docs/how_to/tools_error.ipynb b/docs/docs/how_to/tools_error.ipynb index 7a881b89fbf..885c288e2af 100644 --- a/docs/docs/how_to/tools_error.ipynb +++ b/docs/docs/how_to/tools_error.ipynb @@ -16,7 +16,7 @@ "\n", ":::\n", "\n", - "Calling tools with an LLM is generally more reliable than pure prompting, but it isn't perfect. The model may try to call a tool that doesn't exist or fail to return arguments that match the requested schema. Strategies like keeping schemas simple, reducing the number of tools you pass at once, and having good names and descriptions can help mitigate this risk, but aren't foolproof.\n", + "[Calling tools](/docs/concepts/tool_calling/) with an LLM is generally more reliable than pure prompting, but it isn't perfect. The model may try to call a tool that doesn't exist or fail to return arguments that match the requested schema. Strategies like keeping schemas simple, reducing the number of tools you pass at once, and having good names and descriptions can help mitigate this risk, but aren't foolproof.\n", "\n", "This guide covers some ways to build error handling into your chains to mitigate these failure modes." ] diff --git a/docs/docs/how_to/tools_few_shot.ipynb b/docs/docs/how_to/tools_few_shot.ipynb index 0e3d6564874..b9032391a4a 100644 --- a/docs/docs/how_to/tools_few_shot.ipynb +++ b/docs/docs/how_to/tools_few_shot.ipynb @@ -6,7 +6,7 @@ "source": [ "# How to use few-shot prompting with tool calling\n", "\n", - "For more complex tool use it's very useful to add few-shot examples to the prompt. We can do this by adding `AIMessage`s with `ToolCall`s and corresponding `ToolMessage`s to our prompt.\n", + "For more complex tool use it's very useful to add [few-shot examples](/docs/concepts/few_shot_prompting/) to the prompt. We can do this by adding `AIMessage`s with `ToolCall`s and corresponding `ToolMessage`s to our prompt.\n", "\n", "First let's define our tools and model." ] diff --git a/docs/docs/how_to/trim_messages.ipynb b/docs/docs/how_to/trim_messages.ipynb index 97b725dd72c..505e7b90195 100644 --- a/docs/docs/how_to/trim_messages.ipynb +++ b/docs/docs/how_to/trim_messages.ipynb @@ -20,7 +20,7 @@ "\n", ":::\n", "\n", - "All models have finite context windows, meaning there's a limit to how many tokens they can take as input. If you have very long messages or a chain/agent that accumulates a long message is history, you'll need to manage the length of the messages you're passing in to the model.\n", + "All models have finite context windows, meaning there's a limit to how many [tokens](/docs/concepts/tokens/) they can take as input. 
If you have very long messages or a chain/agent that accumulates a long message history, you'll need to manage the length of the messages you're passing in to the model.\n", "\n", "[trim_messages](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.trim_messages.html) can be used to reduce the size of a chat history to a specified token count or specified message count.\n", "\n", diff --git a/docs/docs/how_to/vectorstore_retriever.ipynb b/docs/docs/how_to/vectorstore_retriever.ipynb index cc9958eb32a..f9651f2d952 100644 --- a/docs/docs/how_to/vectorstore_retriever.ipynb +++ b/docs/docs/how_to/vectorstore_retriever.ipynb @@ -17,7 +17,7 @@ "source": [ "# How to use a vectorstore as a retriever\n", "\n", - "A vector store retriever is a retriever that uses a vector store to retrieve documents. It is a lightweight wrapper around the vector store class to make it conform to the retriever interface.\n", + "A vector store retriever is a [retriever](/docs/concepts/retrievers/) that uses a [vector store](/docs/concepts/vectorstores/) to retrieve documents. It is a lightweight wrapper around the vector store class to make it conform to the retriever [interface](/docs/concepts/runnables/).\n", "It uses the search methods implemented by a vector store, like similarity search and MMR, to query the texts in the vector store.\n", "\n", "In this guide we will cover:\n",