docs: add links to concept guides in how-tos (#28118)

This commit is contained in:
ccurme 2024-11-15 09:44:11 -05:00 committed by GitHub
parent ef2dc9eae5
commit 74438f3ae8
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
79 changed files with 101 additions and 100 deletions

View File

@ -13,7 +13,7 @@
"# How to split by HTML header \n",
"## Description and motivation\n",
"\n",
"[HTMLHeaderTextSplitter](https://python.langchain.com/api_reference/text_splitters/html/langchain_text_splitters.html.HTMLHeaderTextSplitter.html) is a \"structure-aware\" chunker that splits text at the HTML element level and adds metadata for each header \"relevant\" to any given chunk. It can return chunks element by element or combine elements with the same metadata, with the objectives of (a) keeping related text grouped (more or less) semantically and (b) preserving context-rich information encoded in document structures. It can be used with other text splitters as part of a chunking pipeline.\n",
"[HTMLHeaderTextSplitter](https://python.langchain.com/api_reference/text_splitters/html/langchain_text_splitters.html.HTMLHeaderTextSplitter.html) is a \"structure-aware\" [text splitter](/docs/concepts/text_splitters/) that splits text at the HTML element level and adds metadata for each header \"relevant\" to any given chunk. It can return chunks element by element or combine elements with the same metadata, with the objectives of (a) keeping related text grouped (more or less) semantically and (b) preserving context-rich information encoded in document structures. It can be used with other text splitters as part of a chunking pipeline.\n",
"\n",
"It is analogous to the [MarkdownHeaderTextSplitter](/docs/how_to/markdown_header_metadata_splitter) for markdown files.\n",
"\n",

View File

@ -12,7 +12,7 @@
"source": [
"# How to split by HTML sections\n",
"## Description and motivation\n",
"Similar in concept to the [HTMLHeaderTextSplitter](/docs/how_to/HTML_header_metadata_splitter), the `HTMLSectionSplitter` is a \"structure-aware\" chunker that splits text at the element level and adds metadata for each header \"relevant\" to any given chunk.\n",
"Similar in concept to the [HTMLHeaderTextSplitter](/docs/how_to/HTML_header_metadata_splitter), the `HTMLSectionSplitter` is a \"structure-aware\" [text splitter](/docs/concepts/text_splitters/) that splits text at the element level and adds metadata for each header \"relevant\" to any given chunk.\n",
"\n",
"It can return chunks element by element or combine elements with the same metadata, with the objectives of (a) keeping related text grouped (more or less) semantically and (b) preserving context-rich information encoded in document structures.\n",
"\n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to use the MultiQueryRetriever\n",
"\n",
"Distance-based vector database retrieval embeds (represents) queries in high-dimensional space and finds similar embedded documents based on a distance metric. But, retrieval may produce different results with subtle changes in query wording, or if the embeddings do not capture the semantics of the data well. Prompt engineering / tuning is sometimes done to manually address these problems, but can be tedious.\n",
"Distance-based [vector database](/docs/concepts/vectorstores/) retrieval [embeds](/docs/concepts/embedding_models/) (represents) queries in high-dimensional space and finds similar embedded documents based on a distance metric. But, retrieval may produce different results with subtle changes in query wording, or if the embeddings do not capture the semantics of the data well. Prompt engineering / tuning is sometimes done to manually address these problems, but can be tedious.\n",
"\n",
"The [MultiQueryRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.multi_query.MultiQueryRetriever.html) automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query. For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the `MultiQueryRetriever` can mitigate some of the limitations of the distance-based retrieval and get a richer set of results.\n",
"\n",
@ -151,7 +151,7 @@
"id": "7e170263-facd-4065-bb68-d11fb9123a45",
"metadata": {},
"source": [
"Note that the underlying queries generated by the retriever are logged at the `INFO` level."
"Note that the underlying queries generated by the [retriever](/docs/concepts/retrievers/) are logged at the `INFO` level."
]
},
{

View File

@ -7,11 +7,11 @@
"source": [
"# How to add scores to retriever results\n",
"\n",
"Retrievers will return sequences of [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) objects, which by default include no information about the process that retrieved them (e.g., a similarity score against a query). Here we demonstrate how to add retrieval scores to the `.metadata` of documents:\n",
"[Retrievers](/docs/concepts/retrievers/) will return sequences of [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) objects, which by default include no information about the process that retrieved them (e.g., a similarity score against a query). Here we demonstrate how to add retrieval scores to the `.metadata` of documents:\n",
"1. From [vectorstore retrievers](/docs/how_to/vectorstore_retriever);\n",
"2. From higher-order LangChain retrievers, such as [SelfQueryRetriever](/docs/how_to/self_query) or [MultiVectorRetriever](/docs/how_to/multi_vector).\n",
"\n",
"For (1), we will implement a short wrapper function around the corresponding vector store. For (2), we will update a method of the corresponding class.\n",
"For (1), we will implement a short wrapper function around the corresponding [vector store](/docs/concepts/vectorstores/). For (2), we will update a method of the corresponding class.\n",
"\n",
"## Create vector store\n",
"\n",

View File

@ -22,7 +22,7 @@
":::\n",
"\n",
"By themselves, language models can't take actions - they just output text.\n",
"A big use case for LangChain is creating **agents**.\n",
"A big use case for LangChain is creating **[agents](/docs/concepts/agents/)**.\n",
"Agents are systems that use an LLM as a reasoning engine to determine which actions to take and what the inputs to those actions should be.\n",
"The results of those actions can then be fed back into the agent and it determines whether more actions are needed, or whether it is okay to finish.\n",
"\n",

View File

@ -7,7 +7,7 @@
"source": [
"# Caching\n",
"\n",
"Embeddings can be stored or temporarily cached to avoid needing to recompute them.\n",
"[Embeddings](/docs/concepts/embedding_models/) can be stored or temporarily cached to avoid needing to recompute them.\n",
"\n",
"Caching embeddings can be done using a `CacheBackedEmbeddings`. The cache backed embedder is a wrapper around an embedder that caches\n",
"embeddings in a key-value store. The text is hashed and the hash is used as the key in the cache.\n",

View File

@ -21,7 +21,7 @@
"source": [
"# How to split by character\n",
"\n",
"This is the simplest method. This splits based on a given character sequence, which defaults to `\"\\n\\n\"`. Chunk length is measured by number of characters.\n",
"This is the simplest method. This [splits](/docs/concepts/text_splitters/) based on a given character sequence, which defaults to `\"\\n\\n\"`. Chunk length is measured by number of characters.\n",
"\n",
"1. How the text is split: by single character separator.\n",
"2. How the chunk size is measured: by number of characters.\n",

View File

@ -15,7 +15,7 @@
"\n",
":::\n",
"\n",
"LangChain provides an optional caching layer for chat models. This is useful for two main reasons:\n",
"LangChain provides an optional caching layer for [chat models](/docs/concepts/chat_models). This is useful for two main reasons:\n",
"\n",
"- It can save you money by reducing the number of API calls you make to the LLM provider, if you're often requesting the same completion multiple times. This is especially useful during app development.\n",
"- It can speed up your application by reducing the number of API calls you make to the LLM provider.\n",

View File

@ -7,13 +7,13 @@
"source": [
"# How to init any model in one line\n",
"\n",
"Many LLM applications let end users specify what model provider and model they want the application to be powered by. This requires writing some logic to initialize different ChatModels based on some user configuration. The `init_chat_model()` helper method makes it easy to initialize a number of different model integrations without having to worry about import paths and class names.\n",
"Many LLM applications let end users specify what model provider and model they want the application to be powered by. This requires writing some logic to initialize different [chat models](/docs/concepts/chat_models/) based on some user configuration. The `init_chat_model()` helper method makes it easy to initialize a number of different model integrations without having to worry about import paths and class names.\n",
"\n",
":::tip Supported models\n",
"\n",
"See the [init_chat_model()](https://python.langchain.com/api_reference/langchain/chat_models/langchain.chat_models.base.init_chat_model.html) API reference for a full list of supported integrations.\n",
"\n",
"Make sure you have the integration packages installed for any model providers you want to support. E.g. you should have `langchain-openai` installed to init an OpenAI model.\n",
"Make sure you have the [integration packages](/docs/integrations/chat/) installed for any model providers you want to support. E.g. you should have `langchain-openai` installed to init an OpenAI model.\n",
"\n",
":::"
]
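A quick sketch of the helper described in this cell; the model names are examples and assume `langchain-openai` and `langchain-anthropic` are installed:

```python
from langchain.chat_models import init_chat_model

# Model name and provider come from user configuration; any supported pair works
gpt_4o = init_chat_model("gpt-4o-mini", model_provider="openai", temperature=0)
claude = init_chat_model("claude-3-5-sonnet-20240620", model_provider="anthropic", temperature=0)

print(gpt_4o.invoke("what's your name").content)
```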

View File

@ -14,7 +14,7 @@
"\n",
":::\n",
"\n",
"Tracking token usage to calculate cost is an important part of putting your app in production. This guide goes over how to obtain this information from your LangChain model calls.\n",
"Tracking [token](/docs/concepts/tokens/) usage to calculate cost is an important part of putting your app in production. This guide goes over how to obtain this information from your LangChain model calls.\n",
"\n",
"This guide requires `langchain-anthropic` and `langchain-openai >= 0.1.9`."
]
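A minimal sketch of reading token counts from a response, assuming an OpenAI chat model; the printed numbers are illustrative:

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")
msg = llm.invoke("Tell me a joke")

# usage_metadata is a standardized dict of input/output/total token counts
print(msg.usage_metadata)
# e.g. {'input_tokens': 11, 'output_tokens': 25, 'total_tokens': 36}
```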

View File

@ -15,7 +15,7 @@
"source": [
"# How to add retrieval to chatbots\n",
"\n",
"Retrieval is a common technique chatbots use to augment their responses with data outside a chat model's training data. This section will cover how to implement retrieval in the context of chatbots, but it's worth noting that retrieval is a very subtle and deep topic - we encourage you to explore [other parts of the documentation](/docs/how_to#qa-with-rag) that go into greater depth!\n",
"[Retrieval](/docs/concepts/retrieval/) is a common technique chatbots use to augment their responses with data outside a chat model's training data. This section will cover how to implement retrieval in the context of chatbots, but it's worth noting that retrieval is a very subtle and deep topic - we encourage you to explore [other parts of the documentation](/docs/how_to#qa-with-rag) that go into greater depth!\n",
"\n",
"## Setup\n",
"\n",
@ -80,7 +80,7 @@
"source": [
"## Creating a retriever\n",
"\n",
"We'll use [the LangSmith documentation](https://docs.smith.langchain.com/overview) as source material and store the content in a vectorstore for later retrieval. Note that this example will gloss over some of the specifics around parsing and storing a data source - you can see more [in-depth documentation on creating retrieval systems here](/docs/how_to#qa-with-rag).\n",
"We'll use [the LangSmith documentation](https://docs.smith.langchain.com/overview) as source material and store the content in a [vector store](/docs/concepts/vectorstores/) for later retrieval. Note that this example will gloss over some of the specifics around parsing and storing a data source - you can see more [in-depth documentation on creating retrieval systems here](/docs/how_to#qa-with-rag).\n",
"\n",
"Let's use a document loader to pull text from the docs:"
]

View File

@ -42,7 +42,7 @@
"metadata": {},
"outputs": [
{
"name": "stdin",
"name": "stdout",
"output_type": "stream",
"text": [
"OpenAI API Key: ········\n",
@ -78,7 +78,7 @@
"\n",
"Our end goal is to create an agent that can respond conversationally to user questions while looking up information as needed.\n",
"\n",
"First, let's initialize Tavily and an OpenAI chat model capable of tool calling:"
"First, let's initialize Tavily and an OpenAI [chat model](/docs/concepts/chat_models/) capable of tool calling:"
]
},
{

View File

@ -7,7 +7,7 @@
"source": [
"# How to split code\n",
"\n",
"[RecursiveCharacterTextSplitter](https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html) includes pre-built lists of separators that are useful for splitting text in a specific programming language.\n",
"[RecursiveCharacterTextSplitter](https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html) includes pre-built lists of separators that are useful for [splitting text](/docs/concepts/text_splitters/) in a specific programming language.\n",
"\n",
"Supported languages are stored in the `langchain_text_splitters.Language` enum. They include:\n",
"\n",

View File

@ -7,13 +7,13 @@
"source": [
"# How to do retrieval with contextual compression\n",
"\n",
"One challenge with retrieval is that usually you don't know the specific queries your document storage system will face when you ingest data into the system. This means that the information most relevant to a query may be buried in a document with a lot of irrelevant text. Passing that full document through your application can lead to more expensive LLM calls and poorer responses.\n",
"One challenge with [retrieval](/docs/concepts/retrieval/) is that usually you don't know the specific queries your document storage system will face when you ingest data into the system. This means that the information most relevant to a query may be buried in a document with a lot of irrelevant text. Passing that full document through your application can lead to more expensive LLM calls and poorer responses.\n",
"\n",
"Contextual compression is meant to fix this. The idea is simple: instead of immediately returning retrieved documents as-is, you can compress them using the context of the given query, so that only the relevant information is returned. “Compressing” here refers to both compressing the contents of an individual document and filtering out documents wholesale.\n",
"\n",
"To use the Contextual Compression Retriever, you'll need:\n",
"\n",
"- a base retriever\n",
"- a base [retriever](/docs/concepts/retrievers/)\n",
"- a Document Compressor\n",
"\n",
"The Contextual Compression Retriever passes queries to the base retriever, takes the initial documents and passes them through the Document Compressor. The Document Compressor takes a list of documents and shortens it by reducing the contents of documents or dropping documents altogether.\n",

View File

@ -14,15 +14,15 @@
"\n",
":::\n",
"\n",
"In this guide, we'll learn how to create a custom chat model using LangChain abstractions.\n",
"In this guide, we'll learn how to create a custom [chat model](/docs/concepts/chat_models/) using LangChain abstractions.\n",
"\n",
"Wrapping your LLM with the standard [`BaseChatModel`](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.chat_models.BaseChatModel.html) interface allow you to use your LLM in existing LangChain programs with minimal code modifications!\n",
"\n",
"As an bonus, your LLM will automatically become a LangChain `Runnable` and will benefit from some optimizations out of the box (e.g., batch via a threadpool), async support, the `astream_events` API, etc.\n",
"As an bonus, your LLM will automatically become a LangChain [Runnable](/docs/concepts/runnables/) and will benefit from some optimizations out of the box (e.g., batch via a threadpool), async support, the `astream_events` API, etc.\n",
"\n",
"## Inputs and outputs\n",
"\n",
"First, we need to talk about **messages**, which are the inputs and outputs of chat models.\n",
"First, we need to talk about **[messages](/docs/concepts/messages/)**, which are the inputs and outputs of chat models.\n",
"\n",
"### Messages\n",
"\n",

View File

@ -19,9 +19,9 @@
"\n",
"## Overview\n",
"\n",
"Many LLM applications involve retrieving information from external data sources using a `Retriever`. \n",
"Many LLM applications involve retrieving information from external data sources using a [Retriever](/docs/concepts/retrievers/). \n",
"\n",
"A retriever is responsible for retrieving a list of relevant `Documents` to a given user `query`.\n",
"A retriever is responsible for retrieving a list of relevant [Documents](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) to a given user `query`.\n",
"\n",
"The retrieved documents are often formatted into prompts that are fed into an LLM, allowing the LLM to use the information in the to generate an appropriate response (e.g., answering a user question based on a knowledge base).\n",
"\n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to create tools\n",
"\n",
"When constructing an agent, you will need to provide it with a list of `Tool`s that it can use. Besides the actual function that is called, the Tool consists of several components:\n",
"When constructing an [agent](/docs/concepts/agents/), you will need to provide it with a list of [Tools](/docs/concepts/tools/) that it can use. Besides the actual function that is called, the Tool consists of several components:\n",
"\n",
"| Attribute | Type | Description |\n",
"|---------------|---------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n",

View File

@ -26,7 +26,7 @@
"`Document` objects are often formatted into prompts that are fed into an LLM, allowing the LLM to use the information in the `Document` to generate a desired response (e.g., summarizing the document).\n",
"`Documents` can be either used immediately or indexed into a vectorstore for future retrieval and use.\n",
"\n",
"The main abstractions for Document Loading are:\n",
"The main abstractions for [Document Loading](/docs/concepts/document_loaders/) are:\n",
"\n",
"\n",
"| Component | Description |\n",

View File

@ -9,7 +9,7 @@
"\n",
"[Portable Document Format (PDF)](https://en.wikipedia.org/wiki/PDF), standardized as ISO 32000, is a file format developed by Adobe in 1992 to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems.\n",
"\n",
"This guide covers how to load `PDF` documents into the LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) format that we use downstream.\n",
"This guide covers how to [load](/docs/concepts/document_loaders/) `PDF` documents into the LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) format that we use downstream.\n",
"\n",
"Text in PDFs is typically represented via text boxes. They may also contain images. A PDF parser might do some combination of the following:\n",
"\n",
@ -250,7 +250,7 @@
"metadata": {},
"outputs": [
{
"name": "stdin",
"name": "stdout",
"output_type": "stream",
"text": [
"Unstructured API Key: ········\n"

View File

@ -7,7 +7,7 @@
"source": [
"# How to load web pages\n",
"\n",
"This guide covers how to load web pages into the LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) format that we use downstream. Web pages contain text, images, and other multimedia elements, and are typically represented with HTML. They may include links to other pages or resources.\n",
"This guide covers how to [load](/docs/concepts/document_loaders/) web pages into the LangChain [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) format that we use downstream. Web pages contain text, images, and other multimedia elements, and are typically represented with HTML. They may include links to other pages or resources.\n",
"\n",
"LangChain integrates with a host of parsers that are appropriate for web pages. The right parser will depend on your needs. Below we demonstrate two possibilities:\n",
"\n",

View File

@ -6,7 +6,7 @@
"source": [
"# How to combine results from multiple retrievers\n",
"\n",
"The [EnsembleRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.ensemble.EnsembleRetriever.html) supports ensembling of results from multiple retrievers. It is initialized with a list of [BaseRetriever](https://python.langchain.com/api_reference/core/retrievers/langchain_core.retrievers.BaseRetriever.html) objects. EnsembleRetrievers rerank the results of the constituent retrievers based on the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm.\n",
"The [EnsembleRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.ensemble.EnsembleRetriever.html) supports ensembling of results from multiple [retrievers](/docs/concepts/retrievers/). It is initialized with a list of [BaseRetriever](https://python.langchain.com/api_reference/core/retrievers/langchain_core.retrievers.BaseRetriever.html) objects. EnsembleRetrievers rerank the results of the constituent retrievers based on the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm.\n",
"\n",
"By leveraging the strengths of different algorithms, the `EnsembleRetriever` can achieve better performance than any single algorithm. \n",
"\n",

View File

@ -17,7 +17,7 @@
"source": [
"# How to use example selectors\n",
"\n",
"If you have a large number of examples, you may need to select which ones to include in the prompt. The Example Selector is the class responsible for doing so.\n",
"If you have a large number of examples, you may need to select which ones to include in the prompt. The [Example Selector](/docs/concepts/example_selectors/) is the class responsible for doing so.\n",
"\n",
"The base interface is defined as below:\n",
"\n",

View File

@ -23,7 +23,7 @@
"]} />\n",
"\n",
"\n",
"LangSmith datasets have built-in support for similarity search, making them a great tool for building and querying few-shot examples.\n",
"[LangSmith](https://docs.smith.langchain.com/) datasets have built-in support for similarity search, making them a great tool for building and querying few-shot examples.\n",
"\n",
"In this guide we'll see how to use an indexed LangSmith dataset as a few-shot example selector.\n",
"\n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to select examples by length\n",
"\n",
"This example selector selects which examples to use based on length. This is useful when you are worried about constructing a prompt that will go over the length of the context window. For longer inputs, it will select fewer examples to include, while for shorter inputs it will select more."
"This [example selector](/docs/concepts/example_selectors/) selects which examples to use based on length. This is useful when you are worried about constructing a prompt that will go over the length of the context window. For longer inputs, it will select fewer examples to include, while for shorter inputs it will select more."
]
},
{

View File

@ -7,7 +7,7 @@
"source": [
"# How to select examples by maximal marginal relevance (MMR)\n",
"\n",
"The `MaxMarginalRelevanceExampleSelector` selects examples based on a combination of which examples are most similar to the inputs, while also optimizing for diversity. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs, and then iteratively adding them while penalizing them for closeness to already selected examples.\n"
"The `MaxMarginalRelevanceExampleSelector` selects [examples](/docs/concepts/example_selectors/) based on a combination of which examples are most similar to the inputs, while also optimizing for diversity. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs, and then iteratively adding them while penalizing them for closeness to already selected examples.\n"
]
},
{

View File

@ -9,7 +9,7 @@
"\n",
"The `NGramOverlapExampleSelector` selects and orders examples based on which examples are most similar to the input, according to an ngram overlap score. The ngram overlap score is a float between 0.0 and 1.0, inclusive. \n",
"\n",
"The selector allows for a threshold score to be set. Examples with an ngram overlap score less than or equal to the threshold are excluded. The threshold is set to -1.0, by default, so will not exclude any examples, only reorder them. Setting the threshold to 0.0 will exclude examples that have no ngram overlaps with the input.\n"
"The [selector](/docs/concepts/example_selectors/) allows for a threshold score to be set. Examples with an ngram overlap score less than or equal to the threshold are excluded. The threshold is set to -1.0, by default, so will not exclude any examples, only reorder them. Setting the threshold to 0.0 will exclude examples that have no ngram overlaps with the input.\n"
]
},
{

View File

@ -7,7 +7,7 @@
"source": [
"# How to select examples by similarity\n",
"\n",
"This object selects examples based on similarity to the inputs. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs.\n"
"This object selects [examples](/docs/concepts/example_selectors/) based on similarity to the inputs. It does this by finding the examples with the embeddings that have the greatest cosine similarity with the inputs.\n"
]
},
{

View File

@ -9,7 +9,7 @@
"\n",
"The quality of extractions can often be improved by providing reference examples to the LLM.\n",
"\n",
"Data extraction attempts to generate structured representations of information found in text and other unstructured or semi-structured formats. [Tool-calling](/docs/concepts/tool_calling) LLM features are often used in this context. This guide demonstrates how to build few-shot examples of tool calls to help steer the behavior of extraction and similar applications.\n",
"Data extraction attempts to generate [structured representations](/docs/concepts/structured_outputs/) of information found in text and other unstructured or semi-structured formats. [Tool-calling](/docs/concepts/tool_calling) LLM features are often used in this context. This guide demonstrates how to build few-shot examples of tool calls to help steer the behavior of extraction and similar applications.\n",
"\n",
":::tip\n",
"While this guide focuses how to use examples with a tool calling model, this technique is generally applicable, and will work\n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to use prompting alone (no tool calling) to do extraction\n",
"\n",
"Tool calling features are not required for generating structured output from LLMs. LLMs that are able to follow prompt instructions well can be tasked with outputting information in a given format.\n",
"[Tool calling](/docs/concepts/tool_calling/) features are not required for generating structured output from LLMs. LLMs that are able to follow prompt instructions well can be tasked with outputting information in a given format.\n",
"\n",
"This approach relies on designing good prompts and then parsing the output of the LLMs to make them extract information well.\n",
"\n",

View File

@ -27,7 +27,7 @@
"\n",
":::\n",
"\n",
"In this guide, we'll learn how to create a simple prompt template that provides the model with example inputs and outputs when generating. Providing the LLM with a few such examples is called few-shotting, and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n",
"In this guide, we'll learn how to create a simple prompt template that provides the model with example inputs and outputs when generating. Providing the LLM with a few such examples is called [few-shotting](/docs/concepts/few_shot_prompting/), and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n",
"\n",
"A few-shot prompt template can be constructed from either a set of examples, or from an [Example Selector](https://python.langchain.com/api_reference/core/example_selectors/langchain_core.example_selectors.base.BaseExampleSelector.html) class responsible for choosing a subset of examples from the defined set.\n",
"\n",

View File

@ -27,7 +27,7 @@
"\n",
":::\n",
"\n",
"This guide covers how to prompt a chat model with example inputs and outputs. Providing the model with a few such examples is called few-shotting, and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n",
"This guide covers how to prompt a chat model with example inputs and outputs. Providing the model with a few such examples is called [few-shotting](/docs/concepts/few_shot_prompting/), and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n",
"\n",
"There does not appear to be solid consensus on how best to do few-shot prompting, and the optimal prompt compilation will likely vary by model. Because of this, we provide few-shot prompt templates like the [FewShotChatMessagePromptTemplate](https://python.langchain.com/api_reference/core/prompts/langchain_core.prompts.few_shot.FewShotChatMessagePromptTemplate.html?highlight=fewshot#langchain_core.prompts.few_shot.FewShotChatMessagePromptTemplate) as a flexible starting point, and you can modify or replace them as you see fit.\n",
"\n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to filter messages\n",
"\n",
"In more complex chains and agents we might track state with a list of messages. This list can start to accumulate messages from multiple different models, speakers, sub-chains, etc., and we may only want to pass subsets of this full list of messages to each model call in the chain/agent.\n",
"In more complex chains and agents we might track state with a list of [messages](/docs/concepts/messages/). This list can start to accumulate messages from multiple different models, speakers, sub-chains, etc., and we may only want to pass subsets of this full list of messages to each model call in the chain/agent.\n",
"\n",
"The `filter_messages` utility makes it easy to filter messages by type, id, or name.\n",
"\n",

View File

@ -15,7 +15,7 @@
"source": [
"# How to construct knowledge graphs\n",
"\n",
"In this guide we'll go over the basic ways of constructing a knowledge graph based on unstructured text. The constructured graph can then be used as knowledge base in a RAG application.\n",
"In this guide we'll go over the basic ways of constructing a knowledge graph based on unstructured text. The constructured graph can then be used as knowledge base in a [RAG](/docs/concepts/rag/) application.\n",
"\n",
"## ⚠️ Security note ⚠️\n",
"\n",
@ -68,7 +68,7 @@
"metadata": {},
"outputs": [
{
"name": "stdin",
"name": "stdout",
"output_type": "stream",
"text": [
" ········\n"

View File

@ -9,7 +9,7 @@
"source": [
"# Hybrid Search\n",
"\n",
"The standard search in LangChain is done by vector similarity. However, a number of vectorstores implementations (Astra DB, ElasticSearch, Neo4J, AzureSearch, Qdrant...) also support more advanced search combining vector similarity search and other search techniques (full-text, BM25, and so on). This is generally referred to as \"Hybrid\" search.\n",
"The standard search in LangChain is done by vector similarity. However, a number of [vector store](/docs/integrations/vectorstores/) implementations (Astra DB, ElasticSearch, Neo4J, AzureSearch, Qdrant...) also support more advanced search combining vector similarity search and other search techniques (full-text, BM25, and so on). This is generally referred to as \"Hybrid\" search.\n",
"\n",
"**Step 1: Make sure the vectorstore you are using supports hybrid search**\n",
"\n",

View File

@ -9,7 +9,7 @@
"\n",
"Here, we will look at a basic indexing workflow using the LangChain indexing API. \n",
"\n",
"The indexing API lets you load and keep in sync documents from any source into a vector store. Specifically, it helps:\n",
"The indexing API lets you load and keep in sync documents from any source into a [vector store](/docs/concepts/vectorstores/). Specifically, it helps:\n",
"\n",
"* Avoid writing duplicated content into the vector store\n",
"* Avoid re-writing unchanged content\n",

View File

@ -7,7 +7,7 @@
"source": [
"# LangChain Expression Language Cheatsheet\n",
"\n",
"This is a quick reference for all the most important LCEL primitives. For more advanced usage see the [LCEL how-to guides](/docs/how_to/#langchain-expression-language-lcel) and the [full API reference](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html).\n",
"This is a quick reference for all the most important [LCEL](/docs/concepts/lcel/) primitives. For more advanced usage see the [LCEL how-to guides](/docs/how_to/#langchain-expression-language-lcel) and the [full API reference](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html).\n",
"\n",
"### Invoke a runnable\n",
"#### [Runnable.invoke()](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.invoke) / [Runnable.ainvoke()](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.ainvoke)"

View File

@ -7,7 +7,7 @@
"source": [
"# How to cache LLM responses\n",
"\n",
"LangChain provides an optional caching layer for LLMs. This is useful for two reasons:\n",
"LangChain provides an optional [caching](/docs/concepts/chat_models/#caching) layer for LLMs. This is useful for two reasons:\n",
"\n",
"It can save you money by reducing the number of API calls you make to the LLM provider, if you're often requesting the same completion multiple times.\n",
"It can speed up your application by reducing the number of API calls you make to the LLM provider.\n"

View File

@ -7,7 +7,7 @@
"source": [
"# How to track token usage for LLMs\n",
"\n",
"Tracking token usage to calculate cost is an important part of putting your app in production. This guide goes over how to obtain this information from your LangChain model calls.\n",
"Tracking [token](/docs/concepts/tokens/) usage to calculate cost is an important part of putting your app in production. This guide goes over how to obtain this information from your LangChain model calls.\n",
"\n",
":::info Prerequisites\n",
"\n",

View File

@ -11,10 +11,11 @@
"\n",
"This guide assumes familiarity with the following concepts:\n",
"- [Chat models](/docs/concepts/chat_models)\n",
"- [Tokens](/docs/concepts/tokens)\n",
"\n",
":::\n",
"\n",
"Certain chat models can be configured to return token-level log probabilities representing the likelihood of a given token. This guide walks through how to get this information in LangChain."
"Certain [chat models](/docs/concepts/chat_models/) can be configured to return token-level log probabilities representing the likelihood of a given token. This guide walks through how to get this information in LangChain."
]
},
{
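A short sketch, assuming the OpenAI integration, where logprobs are requested by binding a model parameter:

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini").bind(logprobs=True)
msg = llm.invoke(("human", "how are you today"))

# Token-level log probabilities appear in the response metadata
print(msg.response_metadata["logprobs"]["content"][:3])
```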

View File

@ -7,7 +7,7 @@
"source": [
"# How to merge consecutive messages of the same type\n",
"\n",
"Certain models do not support passing in consecutive messages of the same type (a.k.a. \"runs\" of the same message type).\n",
"Certain models do not support passing in consecutive [messages](/docs/concepts/messages/) of the same type (a.k.a. \"runs\" of the same message type).\n",
"\n",
"The `merge_message_runs` utility makes it easy to merge consecutive messages of the same type.\n",
"\n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to retrieve using multiple vectors per document\n",
"\n",
"It can often be useful to store multiple vectors per document. There are multiple use cases where this is beneficial. For example, we can embed multiple chunks of a document and associate those embeddings with the parent document, allowing retriever hits on the chunks to return the larger document.\n",
"It can often be useful to store multiple [vectors](/docs/concepts/vectorstores/) per document. There are multiple use cases where this is beneficial. For example, we can [embed](/docs/concepts/embedding_models/) multiple chunks of a document and associate those embeddings with the parent document, allowing [retriever](/docs/concepts/retrievers/) hits on the chunks to return the larger document.\n",
"\n",
"LangChain implements a base [MultiVectorRetriever](https://python.langchain.com/api_reference/langchain/retrievers/langchain.retrievers.multi_vector.MultiVectorRetriever.html), which simplifies this process. Much of the complexity lies in how to create the multiple vectors per document. This notebook covers some of the common ways to create those vectors and use the `MultiVectorRetriever`.\n",
"\n",

View File

@ -7,11 +7,11 @@
"source": [
"# How to pass multimodal data directly to models\n",
"\n",
"Here we demonstrate how to pass multimodal input directly to models. \n",
"Here we demonstrate how to pass [multimodal](/docs/concepts/multimodality/) input directly to models. \n",
"We currently expect all input to be passed in the same format as [OpenAI expects](https://platform.openai.com/docs/guides/vision).\n",
"For other model providers that support multimodal input, we have added logic inside the class to convert to the expected format.\n",
"\n",
"In this example we will ask a model to describe an image."
"In this example we will ask a [model](/docs/concepts/chat_models/#multimodality) to describe an image."
]
},
{
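A small sketch of passing an image to a model in the OpenAI content-block format; the image URL and model are assumptions for the example:

```python
import base64

import httpx
from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI  # assumed provider

image_url = "https://example.com/boardwalk.jpg"  # hypothetical image URL
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")

model = ChatOpenAI(model="gpt-4o-mini")
message = HumanMessage(
    content=[
        {"type": "text", "text": "Describe the weather in this image"},
        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_data}"}},
    ]
)
response = model.invoke([message])
print(response.content)
```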

View File

@ -7,9 +7,9 @@
"source": [
"# How to use multimodal prompts\n",
"\n",
"Here we demonstrate how to use prompt templates to format multimodal inputs to models. \n",
"Here we demonstrate how to use prompt templates to format [multimodal](/docs/concepts/multimodality/) inputs to models. \n",
"\n",
"In this example we will ask a model to describe an image."
"In this example we will ask a [model](/docs/concepts/chat_models/#multimodality) to describe an image."
]
},
{

View File

@ -7,11 +7,11 @@
"source": [
"# How to create a custom Output Parser\n",
"\n",
"In some situations you may want to implement a custom parser to structure the model output into a custom format.\n",
"In some situations you may want to implement a custom [parser](/docs/concepts/output_parsers/) to structure the model output into a custom format.\n",
"\n",
"There are two ways to implement a custom parser:\n",
"\n",
"1. Using `RunnableLambda` or `RunnableGenerator` in LCEL -- we strongly recommend this for most use cases\n",
"1. Using `RunnableLambda` or `RunnableGenerator` in [LCEL](/docs/concepts/lcel/) -- we strongly recommend this for most use cases\n",
"2. By inherting from one of the base classes for out parsing -- this is the hard way of doing things\n",
"\n",
"The difference between the two approaches are mostly superficial and are mainly in terms of which callbacks are triggered (e.g., `on_chain_start` vs. `on_parser_start`), and how a runnable lambda vs. a parser might be visualized in a tracing platform like LangSmith."

View File

@ -7,7 +7,7 @@
"source": [
"# How to use the output-fixing parser\n",
"\n",
"This output parser wraps another output parser, and in the event that the first one fails it calls out to another LLM to fix any errors.\n",
"This [output parser](/docs/concepts/output_parsers/) wraps another output parser, and in the event that the first one fails it calls out to another LLM to fix any errors.\n",
"\n",
"But we can do other things besides throw errors. Specifically, we can pass the misformatted output, along with the formatted instructions, to the model and ask it to fix it.\n",
"\n",

View File

@ -19,7 +19,7 @@
"\n",
"Language models output text. But there are times where you want to get more structured information than just text back. While some model providers support [built-in ways to return structured output](/docs/how_to/structured_output), not all do.\n",
"\n",
"Output parsers are classes that help structure language model responses. There are two main methods an output parser must implement:\n",
"[Output parsers](/docs/concepts/output_parsers/) are classes that help structure language model responses. There are two main methods an output parser must implement:\n",
"\n",
"- \"Get format instructions\": A method which returns a string containing instructions for how the output of a language model should be formatted.\n",
"- \"Parse\": A method which takes in a string (assumed to be the response from a language model) and parses it into some structure.\n",

View File

@ -20,7 +20,7 @@
"\n",
"LLMs from different providers often have different strengths depending on the specific data they are trained on. This also means that some may be \"better\" and more reliable at generating output in formats other than JSON.\n",
"\n",
"This guide shows you how to use the [`XMLOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html) to prompt models for XML output, then and parse that output into a usable format.\n",
"This guide shows you how to use the [`XMLOutputParser`](https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html) to prompt models for XML output, then and [parse](/docs/concepts/output_parsers/) that output into a usable format.\n",
"\n",
":::note\n",
"Keep in mind that large language models are leaky abstractions! You'll have to use an LLM with sufficient capacity to generate well-formed XML.\n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to use the Parent Document Retriever\n",
"\n",
"When splitting documents for retrieval, there are often conflicting desires:\n",
"When splitting documents for [retrieval](/docs/concepts/retrieval/), there are often conflicting desires:\n",
"\n",
"1. You may want to have small documents, so that their embeddings can most\n",
" accurately reflect their meaning. If too long, then the embeddings can\n",
@ -72,7 +72,7 @@
"source": [
"## Retrieving full documents\n",
"\n",
"In this mode, we want to retrieve the full documents. Therefore, we only specify a child splitter."
"In this mode, we want to retrieve the full documents. Therefore, we only specify a child [splitter](/docs/concepts/text_splitters/)."
]
},
{

View File

@ -24,7 +24,7 @@
"\n",
":::\n",
"\n",
"LangChain provides a user friendly interface for composing different parts of prompts together. You can do this with either string prompts or chat prompts. Constructing prompts this way allows for easy reuse of components."
"LangChain provides a user friendly interface for composing different parts of [prompts](/docs/concepts/prompt_templates/) together. You can do this with either string prompts or chat prompts. Constructing prompts this way allows for easy reuse of components."
]
},
{

View File

@ -24,7 +24,7 @@
"\n",
":::\n",
"\n",
"Like partially binding arguments to a function, it can make sense to \"partial\" a prompt template - e.g. pass in a subset of the required values, as to create a new prompt template which expects only the remaining subset of values.\n",
"Like partially binding arguments to a function, it can make sense to \"partial\" a [prompt template](/docs/concepts/prompt_templates/) - e.g. pass in a subset of the required values, as to create a new prompt template which expects only the remaining subset of values.\n",
"\n",
"LangChain supports this in two ways:\n",
"\n",

View File

@ -19,7 +19,7 @@
":::\n",
"\n",
"\n",
"In many Q&A applications we want to allow the user to have a back-and-forth conversation, meaning the application needs some sort of \"memory\" of past questions and answers, and some logic for incorporating those into its current thinking.\n",
"In many [Q&A applications](/docs/concepts/rag/) we want to allow the user to have a back-and-forth conversation, meaning the application needs some sort of \"memory\" of past questions and answers, and some logic for incorporating those into its current thinking.\n",
"\n",
"In this guide we focus on **adding logic for incorporating historical messages.**\n",
"\n",

View File

@ -19,7 +19,7 @@
"\n",
"We generally suggest using the first item of the list that works for your use-case. That is, if your model supports tool-calling, try methods 1 or 2; otherwise, or if those fail, advance down the list.\n",
"\n",
"Let's first create a simple RAG chain. To start we'll just retrieve from Wikipedia using the [WikipediaRetriever](https://python.langchain.com/api_reference/community/retrievers/langchain_community.retrievers.wikipedia.WikipediaRetriever.html)."
"Let's first create a simple [RAG](/docs/concepts/rag/) chain. To start we'll just retrieve from Wikipedia using the [WikipediaRetriever](https://python.langchain.com/api_reference/community/retrievers/langchain_community.retrievers.wikipedia.WikipediaRetriever.html)."
]
},
{
@ -140,7 +140,7 @@
"id": "c89e2045-9244-43e6-bf3f-59af22658529",
"metadata": {},
"source": [
"Now that we've got a model, retriver and prompt, let's chain them all together. We'll need to add some logic for formatting our retrieved Documents to a string that can be passed to our prompt. Following the how-to guide on [adding citations](/docs/how_to/qa_citations) to a RAG application, we'll make it so our chain returns both the answer and the retrieved Documents."
"Now that we've got a [model](/docs/concepts/chat_models/), [retriver](/docs/concepts/retrievers/) and [prompt](/docs/concepts/prompt_templates/), let's chain them all together. We'll need to add some logic for formatting our retrieved Documents to a string that can be passed to our prompt. Following the how-to guide on [adding citations](/docs/how_to/qa_citations) to a RAG application, we'll make it so our chain returns both the answer and the retrieved Documents."
]
},
{

View File

@ -7,9 +7,9 @@
"source": [
"# How to do per-user retrieval\n",
"\n",
"This guide demonstrates how to configure runtime properties of a retrieval chain. An example application is to limit the documents available to a retriever based on the user.\n",
"This guide demonstrates how to configure runtime properties of a retrieval chain. An example application is to limit the documents available to a [retriever](/docs/concepts/retrievers/) based on the user.\n",
"\n",
"When building a retrieval app, you often have to build it with multiple users in mind. This means that you may be storing data not just for one user, but for many different users, and they should not be able to see eachother's data. This means that you need to be able to configure your retrieval chain to only retrieve certain information. This generally involves two steps.\n",
"When building a [retrieval app](/docs/concepts/rag/), you often have to build it with multiple users in mind. This means that you may be storing data not just for one user, but for many different users, and they should not be able to see eachother's data. This means that you need to be able to configure your retrieval chain to only retrieve certain information. This generally involves two steps.\n",
"\n",
"**Step 1: Make sure the retriever you are using supports multiple users**\n",
"\n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to get your RAG application to return sources\n",
"\n",
"Often in Q&A applications it's important to show users the sources that were used to generate the answer. The simplest way to do this is for the chain to return the Documents that were retrieved in each generation.\n",
"Often in [Q&A](/docs/concepts/rag/) applications it's important to show users the sources that were used to generate the answer. The simplest way to do this is for the chain to return the Documents that were retrieved in each generation.\n",
"\n",
"We'll work off of the Q&A app we built over the [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post by Lilian Weng in the [RAG tutorial](/docs/tutorials/rag).\n",
"\n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to stream results from your RAG application\n",
"\n",
"This guide explains how to stream results from a RAG application. It covers streaming tokens from the final output as well as intermediate steps of a chain (e.g., from query re-writing).\n",
"This guide explains how to stream results from a [RAG](/docs/concepts/rag/) application. It covers streaming tokens from the final output as well as intermediate steps of a chain (e.g., from query re-writing).\n",
"\n",
"We'll work off of the Q&A app with sources we built over the [LLM Powered Autonomous Agents](https://lilianweng.github.io/posts/2023-06-23-agent/) blog post by Lilian Weng in the [RAG tutorial](/docs/tutorials/rag)."
]

View File

@ -17,7 +17,7 @@
"source": [
"# How to add examples to the prompt for query analysis\n",
"\n",
"As our query analysis becomes more complex, the LLM may struggle to understand how exactly it should respond in certain scenarios. In order to improve performance here, we can add examples to the prompt to guide the LLM.\n",
"As our query analysis becomes more complex, the LLM may struggle to understand how exactly it should respond in certain scenarios. In order to improve performance here, we can [add examples](/docs/concepts/few_shot_prompting/) to the prompt to guide the LLM.\n",
"\n",
"Let's take a look at how we can add examples for the LangChain YouTube video query analyzer we built in the [Quickstart](/docs/tutorials/query_analysis)."
]

View File

@ -17,7 +17,7 @@
"source": [
"# How to handle multiple retrievers when doing query analysis\n",
"\n",
"Sometimes, a query analysis technique may allow for selection of which retriever to use. To use this, you will need to add some logic to select the retriever to do. We will show a simple example (using mock data) of how to do that."
"Sometimes, a query analysis technique may allow for selection of which [retriever](/docs/concepts/retrievers/) to use. To use this, you will need to add some logic to select the retriever to do. We will show a simple example (using mock data) of how to do that."
]
},
{

View File

@ -7,7 +7,7 @@
"source": [
"# How to split JSON data\n",
"\n",
"This json splitter splits json data while allowing control over chunk sizes. It traverses json data depth first and builds smaller json chunks. It attempts to keep nested json objects whole but will split them if needed to keep chunks between a min_chunk_size and the max_chunk_size.\n",
"This json splitter [splits](/docs/concepts/text_splitters/) json data while allowing control over chunk sizes. It traverses json data depth first and builds smaller json chunks. It attempts to keep nested json objects whole but will split them if needed to keep chunks between a min_chunk_size and the max_chunk_size.\n",
"\n",
"If the value is not a nested json, but rather a very large string the string will not be split. If you need a hard cap on the chunk size consider composing this with a Recursive Text splitter on those chunks. There is an optional pre-processing step to split lists, by first converting them to json (dict) and then splitting them as such.\n",
"\n",

View File

@ -21,7 +21,7 @@
"source": [
"# How to recursively split text by characters\n",
"\n",
"This text splitter is the recommended one for generic text. It is parameterized by a list of characters. It tries to split on them in order until the chunks are small enough. The default list is `[\"\\n\\n\", \"\\n\", \" \", \"\"]`. This has the effect of trying to keep all paragraphs (and then sentences, and then words) together as long as possible, as those would generically seem to be the strongest semantically related pieces of text.\n",
"This [text splitter](/docs/concepts/text_splitters/) is the recommended one for generic text. It is parameterized by a list of characters. It tries to split on them in order until the chunks are small enough. The default list is `[\"\\n\\n\", \"\\n\", \" \", \"\"]`. This has the effect of trying to keep all paragraphs (and then sentences, and then words) together as long as possible, as those would generically seem to be the strongest semantically related pieces of text.\n",
"\n",
"1. How the text is split: by list of characters.\n",
"2. How the chunk size is measured: by number of characters.\n",

View File

@ -7,7 +7,7 @@
"source": [
"# Response metadata\n",
"\n",
"Many model providers include some metadata in their chat generation responses. This metadata can be accessed via the `AIMessage.response_metadata: Dict` attribute. Depending on the model provider and model configuration, this can contain information like [token counts](/docs/how_to/chat_token_usage_tracking), [logprobs](/docs/how_to/logprobs), and more.\n",
"Many model providers include some metadata in their chat generation [responses](/docs/concepts/messages/#aimessage). This metadata can be accessed via the `AIMessage.response_metadata: Dict` attribute. Depending on the model provider and model configuration, this can contain information like [token counts](/docs/how_to/chat_token_usage_tracking), [logprobs](/docs/how_to/logprobs), and more.\n",
"\n",
"Here's what the response metadata looks like for a few different providers:\n",
"\n",

View File

@ -11,7 +11,7 @@
"\n",
":::\n",
"\n",
"We can pass in secrets to our runnables at runtime using the `RunnableConfig`. Specifically we can pass in secrets with a `__` prefix to the `configurable` field. This will ensure that these secrets aren't traced as part of the invocation:"
"We can pass in secrets to our [runnables](/docs/concepts/runnables/) at runtime using the `RunnableConfig`. Specifically we can pass in secrets with a `__` prefix to the `configurable` field. This will ensure that these secrets aren't traced as part of the invocation:"
]
},
{

View File

@ -13,7 +13,7 @@
"\n",
":::\n",
"\n",
"A self-querying retriever is one that, as the name suggests, has the ability to query itself. Specifically, given any natural language query, the retriever uses a query-constructing LLM chain to write a structured query and then applies that structured query to its underlying VectorStore. This allows the retriever to not only use the user-input query for semantic similarity comparison with the contents of stored documents but to also extract filters from the user query on the metadata of stored documents and to execute those filters.\n",
"A self-querying [retriever](/docs/concepts/retrievers/) is one that, as the name suggests, has the ability to query itself. Specifically, given any natural language query, the retriever uses a query-constructing LLM chain to write a structured query and then applies that structured query to its underlying [vector store](/docs/concepts/vectorstores/). This allows the retriever to not only use the user-input query for semantic similarity comparison with the contents of stored documents but to also extract filters from the user query on the metadata of stored documents and to execute those filters.\n",
"\n",
"![](../../static/img/self_querying.jpg)\n",
"\n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to split text by tokens \n",
"\n",
"Language models have a token limit. You should not exceed the token limit. When you split your text into chunks it is therefore a good idea to count the number of tokens. There are many tokenizers. When you count tokens in your text you should use the same tokenizer as used in the language model. "
"Language models have a [token](/docs/concepts/tokens/) limit. You should not exceed the token limit. When you [split your text](/docs/concepts/text_splitters/) into chunks it is therefore a good idea to count the number of tokens. There are many tokenizers. When you count tokens in your text you should use the same tokenizer as used in the language model. "
]
},
{
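For example, a character splitter can be paired with a tiktoken token counter (a sketch assuming `tiktoken` is installed and the target model uses an encoding such as `cl100k_base`):

```python
from langchain_text_splitters import CharacterTextSplitter

text_splitter = CharacterTextSplitter.from_tiktoken_encoder(
    encoding_name="cl100k_base",  # use the same encoding as your target model
    chunk_size=100,
    chunk_overlap=0,
)
chunks = text_splitter.split_text("Some long document text ...")
```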

View File

@ -12,7 +12,7 @@
"\n",
"- How the dialect of the LangChain [SQLDatabase](https://python.langchain.com/api_reference/community/utilities/langchain_community.utilities.sql_database.SQLDatabase.html) impacts the prompt of the chain;\n",
"- How to format schema information into the prompt using `SQLDatabase.get_context`;\n",
"- How to build and select few-shot examples to assist the model.\n",
"- How to build and select [few-shot examples](/docs/concepts/few_shot_prompting/) to assist the model.\n",
"\n",
"## Setup\n",
"\n",

View File

@ -29,7 +29,7 @@
"- [Function/tool calling](/docs/concepts/tool_calling)\n",
":::\n",
"\n",
"It is often useful to have a model return output that matches a specific schema. One common use-case is extracting data from text to insert into a database or use with some other downstream system. This guide covers a few strategies for getting structured outputs from a model.\n",
"It is often useful to have a model return output that matches a specific [schema](/docs/concepts/structured_outputs/). One common use-case is extracting data from text to insert into a database or use with some other downstream system. This guide covers a few strategies for getting structured outputs from a model.\n",
"\n",
"## The `.with_structured_output()` method\n",
"\n",
@ -41,9 +41,9 @@
"\n",
":::\n",
"\n",
"This is the easiest and most reliable way to get structured outputs. `with_structured_output()` is implemented for models that provide native APIs for structuring outputs, like tool/function calling or JSON mode, and makes use of these capabilities under the hood.\n",
"This is the easiest and most reliable way to get structured outputs. `with_structured_output()` is implemented for [models that provide native APIs for structuring outputs](/docs/integrations/chat/), like tool/function calling or JSON mode, and makes use of these capabilities under the hood.\n",
"\n",
"This method takes a schema as input which specifies the names, types, and descriptions of the desired output attributes. The method returns a model-like Runnable, except that instead of outputting strings or Messages it outputs objects corresponding to the given schema. The schema can be specified as a TypedDict class, [JSON Schema](https://json-schema.org/) or a Pydantic class. If TypedDict or JSON Schema are used then a dictionary will be returned by the Runnable, and if a Pydantic class is used then a Pydantic object will be returned.\n",
"This method takes a schema as input which specifies the names, types, and descriptions of the desired output attributes. The method returns a model-like Runnable, except that instead of outputting strings or [messages](/docs/concepts/messages/) it outputs objects corresponding to the given schema. The schema can be specified as a TypedDict class, [JSON Schema](https://json-schema.org/) or a Pydantic class. If TypedDict or JSON Schema are used then a dictionary will be returned by the Runnable, and if a Pydantic class is used then a Pydantic object will be returned.\n",
"\n",
"As an example, let's get a model to generate a joke and separate the setup from the punchline:\n",
"\n",

View File

@ -30,7 +30,7 @@
"source": [
"## Load chat model\n",
"\n",
"Let's first load a chat model:\n",
"Let's first load a [chat model](/docs/concepts/chat_models/):\n",
"\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to use a time-weighted vector store retriever\n",
"\n",
"This retriever uses a combination of semantic similarity and a time decay.\n",
"This [retriever](/docs/concepts/retrievers/) uses a combination of semantic [similarity](/docs/concepts/embedding_models/#measure-similarity) and a time decay.\n",
"\n",
"The algorithm for scoring them is:\n",
"\n",

View File

@ -16,7 +16,7 @@
"\n",
":::\n",
"\n",
"Tools are utilities that can be called by a model, and whose outputs are designed to be fed back to a model. Sometimes, however, there are artifacts of a tool's execution that we want to make accessible to downstream components in our chain or agent, but that we don't want to expose to the model itself. For example if a tool returns a custom object, a dataframe or an image, we may want to pass some metadata about this output to the model without passing the actual output to the model. At the same time, we may want to be able to access this full output elsewhere, for example in downstream tools.\n",
"[Tools](/docs/concepts/tools/) are utilities that can be [called by a model](/docs/concepts/tool_calling/), and whose outputs are designed to be fed back to a model. Sometimes, however, there are artifacts of a tool's execution that we want to make accessible to downstream components in our chain or agent, but that we don't want to expose to the model itself. For example if a tool returns a custom object, a dataframe or an image, we may want to pass some metadata about this output to the model without passing the actual output to the model. At the same time, we may want to be able to access this full output elsewhere, for example in downstream tools.\n",
"\n",
"The Tool and [ToolMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolMessage.html) interfaces make it possible to distinguish between the parts of the tool output meant for the model (this is the ToolMessage.content) and those parts which are meant for use outside the model (ToolMessage.artifact).\n",
"\n",

View File

@ -14,7 +14,7 @@
"- [How to use a model to call tools](/docs/how_to/tool_calling)\n",
":::\n",
"\n",
"In order to force our LLM to select a specific tool, we can use the `tool_choice` parameter to ensure certain behavior. First, let's define our model and tools:"
"In order to force our LLM to select a specific [tool](/docs/concepts/tools/), we can use the `tool_choice` parameter to ensure certain behavior. First, let's define our model and tools:"
]
},
{
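For example, with an OpenAI chat model (a sketch; the model name and tools are illustrative, and the accepted `tool_choice` values vary by provider):

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def add(a: int, b: int) -> int:
    """Add a and b."""
    return a + b


@tool
def multiply(a: int, b: int) -> int:
    """Multiply a and b."""
    return a * b


llm = ChatOpenAI(model="gpt-4o-mini")

# Force the model to call the "multiply" tool even for an addition question.
always_multiply = llm.bind_tools([add, multiply], tool_choice="multiply")
always_multiply.invoke("what is 2 + 4").tool_calls
```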

View File

@ -17,9 +17,9 @@
"\n",
":::\n",
"\n",
"If you have a tool that call chat models, retrievers, or other runnables, you may want to access internal events from those runnables or configure them with additional properties. This guide shows you how to manually pass parameters properly so that you can do this using the `astream_events()` method.\n",
"If you have a [tool](/docs/concepts/tools/) that calls [chat models](/docs/concepts/chat_models/), [retrievers](/docs/concepts/retrievers/), or other [runnables](/docs/concepts/runnables/), you may want to access internal events from those runnables or configure them with additional properties. This guide shows you how to manually pass parameters properly so that you can do this using the `astream_events()` method.\n",
"\n",
"Tools are runnables, and you can treat them the same way as any other runnable at the interface level - you can call `invoke()`, `batch()`, and `stream()` on them as normal. However, when writing custom tools, you may want to invoke other runnables like chat models or retrievers. In order to properly trace and configure those sub-invocations, you'll need to manually access and pass in the tool's current [`RunnableConfig`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.config.RunnableConfig.html) object. This guide show you some examples of how to do that.\n",
"Tools are [runnables](/docs/concepts/runnables/), and you can treat them the same way as any other runnable at the interface level - you can call `invoke()`, `batch()`, and `stream()` on them as normal. However, when writing custom tools, you may want to invoke other runnables like chat models or retrievers. In order to properly trace and configure those sub-invocations, you'll need to manually access and pass in the tool's current [`RunnableConfig`](https://python.langchain.com/api_reference/core/runnables/langchain_core.runnables.config.RunnableConfig.html) object. This guide show you some examples of how to do that.\n",
"\n",
":::caution Compatibility\n",
"\n",

View File

@ -21,7 +21,7 @@
" [\"langchain-core\", \"0.2.21\"],\n",
"]} />\n",
"\n",
"You may need to bind values to a tool that are only known at runtime. For example, the tool logic may require using the ID of the user who made the request.\n",
"You may need to bind values to a [tool](/docs/concepts/tools/) that are only known at runtime. For example, the tool logic may require using the ID of the user who made the request.\n",
"\n",
"Most of the time, such values should not be controlled by the LLM. In fact, allowing the LLM to control the user ID may lead to a security risk.\n",
"\n",

View File

@ -16,7 +16,7 @@
"\n",
":::\n",
"\n",
"If you have tools that call chat models, retrievers, or other runnables, you may want to access internal events from those runnables or configure them with additional properties. This guide shows you how to manually pass parameters properly so that you can do this using the `astream_events()` method.\n",
"If you have [tools](/docs/concepts/tools/) that call [chat models](/docs/concepts/chat_models/), [retrievers](/docs/concepts/retrievers/), or other [runnables](/docs/concepts/runnables/), you may want to access internal events from those runnables or configure them with additional properties. This guide shows you how to manually pass parameters properly so that you can do this using the `astream_events()` method.\n",
"\n",
":::caution Compatibility\n",
"\n",

View File

@ -6,7 +6,7 @@
"source": [
"# How to stream tool calls\n",
"\n",
"When tools are called in a streaming context, \n",
"When [tools](/docs/concepts/tools/) are called in a streaming context, \n",
"[message chunks](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessageChunk.html#langchain_core.messages.ai.AIMessageChunk) \n",
"will be populated with [tool call chunk](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.tool.ToolCallChunk.html#langchain_core.messages.tool.ToolCallChunk) \n",
"objects in a list via the `.tool_call_chunks` attribute. A `ToolCallChunk` includes \n",

View File

@ -7,7 +7,7 @@
"source": [
"# How to convert tools to OpenAI Functions\n",
"\n",
"This notebook goes over how to use LangChain tools as OpenAI functions."
"This notebook goes over how to use LangChain [tools](/docs/concepts/tools/) as OpenAI functions."
]
},
{
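A minimal sketch using the converter in `langchain_core` (the weather tool is illustrative):

```python
from langchain_core.tools import tool
from langchain_core.utils.function_calling import convert_to_openai_function


@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    return f"It is sunny in {city}."


# Produces a JSON-schema-style function definition in the format OpenAI's API accepts.
print(convert_to_openai_function(get_weather))
```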

View File

@ -17,7 +17,7 @@
"source": [
"# How to use tools in a chain\n",
"\n",
"In this guide, we will go over the basic ways to create Chains and Agents that call Tools. Tools can be just about anything — APIs, functions, databases, etc. Tools allow us to extend the capabilities of a model beyond just outputting text/messages. The key to using models with tools is correctly prompting a model and parsing its response so that it chooses the right tools and provides the right inputs for them."
"In this guide, we will go over the basic ways to create Chains and Agents that call [Tools](/docs/concepts/tools/). Tools can be just about anything — APIs, functions, databases, etc. Tools allow us to extend the capabilities of a model beyond just outputting text/messages. The key to using models with tools is correctly prompting a model and parsing its response so that it chooses the right tools and provides the right inputs for them."
]
},
{
@ -143,7 +143,7 @@
"![chain](../../static/img/tool_chain.svg)\n",
"\n",
"### Tool/function calling\n",
"One of the most reliable ways to use tools with LLMs is with tool calling APIs (also sometimes called function calling). This only works with models that explicitly support tool calling. You can see which models support tool calling [here](/docs/integrations/chat/), and learn more about how to use tool calling in [this guide](/docs/how_to/function_calling).\n",
"One of the most reliable ways to use tools with LLMs is with [tool calling](/docs/concepts/tool_calling/) APIs (also sometimes called function calling). This only works with models that explicitly support tool calling. You can see which models support tool calling [here](/docs/integrations/chat/), and learn more about how to use tool calling in [this guide](/docs/how_to/function_calling).\n",
"\n",
"First we'll define our model and tools. We'll start with just a single tool, `multiply`.\n",
"\n",

View File

@ -16,7 +16,7 @@
"\n",
":::\n",
"\n",
"Calling tools with an LLM is generally more reliable than pure prompting, but it isn't perfect. The model may try to call a tool that doesn't exist or fail to return arguments that match the requested schema. Strategies like keeping schemas simple, reducing the number of tools you pass at once, and having good names and descriptions can help mitigate this risk, but aren't foolproof.\n",
"[Calling tools](/docs/concepts/tool_calling/) with an LLM is generally more reliable than pure prompting, but it isn't perfect. The model may try to call a tool that doesn't exist or fail to return arguments that match the requested schema. Strategies like keeping schemas simple, reducing the number of tools you pass at once, and having good names and descriptions can help mitigate this risk, but aren't foolproof.\n",
"\n",
"This guide covers some ways to build error handling into your chains to mitigate these failure modes."
]
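One simple strategy (a sketch; the tool and error-message format are illustrative) is to wrap tool invocation in a try/except so failures become text the model or caller can react to:

```python
from langchain_core.tools import tool


@tool
def complex_tool(int_arg: int, float_arg: float, dict_arg: dict) -> float:
    """Do something complex with a complex tool."""
    return int_arg * float_arg


def try_except_tool(tool_args: dict) -> str:
    try:
        return str(complex_tool.invoke(tool_args))
    except Exception as e:
        return f"Calling tool with arguments {tool_args} raised {type(e).__name__}: {e}"


# Missing "dict_arg" triggers a validation error, returned as a readable string.
print(try_except_tool({"int_arg": 5, "float_arg": 2.1}))
```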

View File

@ -6,7 +6,7 @@
"source": [
"# How to use few-shot prompting with tool calling\n",
"\n",
"For more complex tool use it's very useful to add few-shot examples to the prompt. We can do this by adding `AIMessage`s with `ToolCall`s and corresponding `ToolMessage`s to our prompt.\n",
"For more complex tool use it's very useful to add [few-shot examples](/docs/concepts/few_shot_prompting/) to the prompt. We can do this by adding `AIMessage`s with `ToolCall`s and corresponding `ToolMessage`s to our prompt.\n",
"\n",
"First let's define our tools and model."
]
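The example messages end up looking roughly like this (a sketch; the `multiply` tool, numbers, and IDs are illustrative):

```python
from langchain_core.messages import AIMessage, HumanMessage, ToolMessage

examples = [
    HumanMessage("What is 317253 times 128472?", name="example_user"),
    AIMessage(
        "",
        name="example_assistant",
        tool_calls=[{"name": "multiply", "args": {"a": 317253, "b": 128472}, "id": "1"}],
    ),
    ToolMessage("40758127416", tool_call_id="1"),
    AIMessage("317253 times 128472 is 40758127416.", name="example_assistant"),
]
# These messages are placed in the prompt ahead of the real user input.
```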

View File

@ -20,7 +20,7 @@
"\n",
":::\n",
"\n",
"All models have finite context windows, meaning there's a limit to how many tokens they can take as input. If you have very long messages or a chain/agent that accumulates a long message is history, you'll need to manage the length of the messages you're passing in to the model.\n",
"All models have finite context windows, meaning there's a limit to how many [tokens](/docs/concepts/tokens/) they can take as input. If you have very long messages or a chain/agent that accumulates a long message history, you'll need to manage the length of the messages you're passing in to the model.\n",
"\n",
"[trim_messages](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.trim_messages.html) can be used to reduce the size of a chat history to a specified token count or specified message count.\n",
"\n",

View File

@ -17,7 +17,7 @@
"source": [
"# How to use a vectorstore as a retriever\n",
"\n",
"A vector store retriever is a retriever that uses a vector store to retrieve documents. It is a lightweight wrapper around the vector store class to make it conform to the retriever interface.\n",
"A vector store retriever is a [retriever](/docs/concepts/retrievers/) that uses a [vector store](/docs/concepts/vectorstores/) to retrieve documents. It is a lightweight wrapper around the vector store class to make it conform to the retriever [interface](/docs/concepts/runnables/).\n",
"It uses the search methods implemented by a vector store, like similarity search and MMR, to query the texts in the vector store.\n",
"\n",
"In this guide we will cover:\n",