fireworks: add secret (#21744)

pinecone: bump min core version (#21742)
fireworks: bump min core version (#21741)
2026-02-04 08:10:25 +00:00 · 2024-05-15 19:48:51 -07:00 · 2024-05-15 19:31:43 -07:00 · 2024-05-15 19:29:13 -07:00 · 2024-05-15 19:27:39 -07:00 · 2024-05-15 19:13:25 -07:00
490 changed files with 24700 additions and 7655 deletions
--- a/.github/scripts/check_diff.py
+++ b/.github/scripts/check_diff.py
@@ -6,8 +6,8 @@ from typing import Dict
 LANGCHAIN_DIRS = [
    "libs/core",
    "libs/text-splitters",
-    "libs/community",
    "libs/langchain",
+    "libs/community",
    "libs/experimental",
 ]

--- a/.github/workflows/_release.yml
+++ b/.github/workflows/_release.yml
@@ -177,7 +177,7 @@ jobs:
        env:
          MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
        run: |
-          poetry run pip install --force-reinstall $MIN_VERSIONS
+          poetry run pip install --force-reinstall $MIN_VERSIONS --editable .
          make tests
        working-directory: ${{ inputs.working-directory }}

@@ -222,6 +222,7 @@ jobs:
          MONGODB_ATLAS_URI: ${{ secrets.MONGODB_ATLAS_URI }}
          VOYAGE_API_KEY: ${{ secrets.VOYAGE_API_KEY }}
          UPSTAGE_API_KEY: ${{ secrets.UPSTAGE_API_KEY }}
+          FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
        run: make integration_tests
        working-directory: ${{ inputs.working-directory }}

--- a/cookbook/sql_db_qa.mdx
+++ b/cookbook/sql_db_qa.mdx
@@ -647,7 +647,7 @@ Sometimes you may not have the luxury of using OpenAI or other service-hosted la
 import logging
 import torch
 from transformers import AutoTokenizer, GPT2TokenizerFast, pipeline, AutoModelForSeq2SeqLM, AutoModelForCausalLM
-from langchain_community.llms import HuggingFacePipeline
+from langchain_huggingface import HuggingFacePipeline

 # Note: This model requires a large GPU, e.g. an 80GB A100. See documentation for other ways to run private non-OpenAI models.
 model_id = "google/flan-ul2"
@@ -992,7 +992,7 @@ Now that you have some examples (with manually corrected output SQL), you can do
 ```python
 from langchain.prompts import FewShotPromptTemplate, PromptTemplate
 from langchain.chains.sql_database.prompt import _sqlite_prompt, PROMPT_SUFFIX
-from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings
+from langchain_huggingface import HuggingFaceEmbeddings
 from langchain.prompts.example_selector.semantic_similarity import SemanticSimilarityExampleSelector
 from langchain_community.vectorstores import Chroma

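Note on the two import changes above: code that has to run both before and after the move to `langchain_huggingface` could resolve the import at runtime. The following is a minimal sketch, not part of this change. The helper name `first_importable` is hypothetical, and the fallback path `langchain_community.llms` is taken from the removed line in the first hunk.

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence


def first_importable(candidates: Sequence[str]) -> Optional[ModuleType]:
    """Return the first module in `candidates` that imports cleanly, else None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Module (or one of its parents) is not installed; try the next one.
            continue
    return None


# Prefer the new partner package, then fall back to the old community path.
# Both names come from the diff above; neither is required for this to run.
hf_module = first_importable(["langchain_huggingface", "langchain_community.llms"])

# None when neither package is installed or the class is not exposed.
HuggingFacePipeline = getattr(hf_module, "HuggingFacePipeline", None)
```

If `HuggingFacePipeline` ends up as `None`, neither package is available and the cookbook example cannot run as written.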
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -13,7 +13,7 @@ OUTPUT_NEW_DOCS_DIR = $(OUTPUT_NEW_DIR)/docs

 PYTHON = .venv/bin/python

-PARTNER_DEPS_LIST := $(shell find ../libs/partners -mindepth 1 -maxdepth 1 -type d -exec test -e "{}/pyproject.toml" \; -print | grep -vE "airbyte|ibm" | tr '\n' ' ')
+PARTNER_DEPS_LIST := $(shell find ../libs/partners -mindepth 1 -maxdepth 1 -type d -exec test -e "{}/pyproject.toml" \; -print | grep -vE "airbyte|ibm|ai21" | tr '\n' ' ')

 PORT ?= 3001

@@ -48,8 +48,6 @@ generate-files:
 	wget -q https://raw.githubusercontent.com/langchain-ai/langgraph/main/README.md -O $(INTERMEDIATE_DIR)/langgraph.md
 	$(PYTHON) scripts/resolve_local_links.py $(INTERMEDIATE_DIR)/langgraph.md https://github.com/langchain-ai/langgraph/tree/main/

-	$(PYTHON) scripts/generate_api_reference_links.py --docs_dir $(INTERMEDIATE_DIR)
-
 copy-infra:
 	mkdir -p $(OUTPUT_NEW_DIR)
 	cp -r src $(OUTPUT_NEW_DIR)
@@ -68,7 +66,10 @@ render:
 md-sync:
 	rsync -avm --include="*/" --include="*.mdx" --include="*.md" --include="*.png" --exclude="*" $(INTERMEDIATE_DIR)/ $(OUTPUT_NEW_DOCS_DIR)

-build: install-py-deps generate-files copy-infra render md-sync
+generate-references:
+	$(PYTHON) scripts/generate_api_reference_links.py --docs_dir $(OUTPUT_NEW_DOCS_DIR)
+
+build: install-py-deps generate-files copy-infra render md-sync generate-references

 vercel-build: install-vercel-deps build
 	rm -rf docs
@@ -78,6 +79,7 @@ vercel-build: install-vercel-deps build
 	mv build v0.2
 	mkdir build
 	mv v0.2 build
+	mv build/v0.2/404.html build

 start:
 	cd $(OUTPUT_NEW_DIR) && yarn && yarn start --port=$(PORT)
--- a/docs/docs/concepts.mdx
+++ b/docs/docs/concepts.mdx
@@ -7,16 +7,7 @@ This section contains introductions to key parts of LangChain.

 ## Architecture

-LangChain as a framework consists of several pieces. The below diagram shows how they relate.
-
-<ThemedImage
-  alt="Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers."
-  sources={{
-    light: useBaseUrl('/svg/langchain_stack.svg'),
-    dark: useBaseUrl('/svg/langchain_stack_dark.svg'),
-  }}
-  title="LangChain Framework Overview"
-/>
+LangChain as a framework consists of a number of packages.

 ### `langchain-core`
 This package contains base abstractions of different components and ways to compose them together.
@@ -24,13 +15,6 @@ The interfaces for core components like LLMs, vectorstores, retrievers and more
 No third party integrations are defined here.
 The dependencies are kept purposefully very lightweight.

-### `langchain-community`
-
-This package contains third party integrations that are maintained by the LangChain community.
-Key partner packages are separated out (see below).
-This contains all integrations for various components (LLMs, vectorstores, retrievers).
-All dependencies in this package are optional to keep the package as lightweight as possible.
-
 ### Partner packages

 While the long tail of integrations are in `langchain-community`, we split popular integrations into their own packages (e.g. `langchain-openai`, `langchain-anthropic`, etc).
@@ -42,14 +26,21 @@ The main `langchain` package contains chains, agents, and retrieval strategies t
 These are NOT third party integrations.
 All chains, agents, and retrieval strategies here are NOT specific to any one integration, but rather generic across all integrations.

-### [LangGraph](/docs/langgraph)
+### `langchain-community`

-Not currently in this repo, `langgraph` is an extension of `langchain` aimed at
+This package contains third party integrations that are maintained by the LangChain community.
+Key partner packages are separated out (see below).
+This contains all integrations for various components (LLMs, vectorstores, retrievers).
+All dependencies in this package are optional to keep the package as lightweight as possible.
+
+### [`langgraph`](/docs/langgraph)
+
+`langgraph` is an extension of `langchain` aimed at
 building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.

 LangGraph exposes high level interfaces for creating common types of agents, as well as a low-level API for constructing more contr

-### [langserve](/docs/langserve)
+### [`langserve`](/docs/langserve)

 A package to deploy LangChain chains as REST APIs. Makes it easy to get a production ready API up and running.

@@ -57,28 +48,18 @@ A package to deploy LangChain chains as REST APIs. Makes it easy to get a produc

 A developer platform that lets you debug, test, evaluate, and monitor LLM applications.

-## Installation
+<ThemedImage
+  alt="Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers."
+  sources={{
+    light: useBaseUrl('/svg/langchain_stack.svg'),
+    dark: useBaseUrl('/svg/langchain_stack_dark.svg'),
+  }}
+  title="LangChain Framework Overview"
+/>

-If you want to work with high level abstractions, you should install the `langchain` package.
+## LangChain Expression Language (LCEL)

-```shell
-pip install langchain
-```
-
-If you want to work with specific integrations, you will need to install them separately.
-See [here](/docs/integrations/platforms/) for a list of integrations and how to install them.
-
-For working with LangSmith, you will need to set up a LangSmith developer account [here](https://smith.langchain.com) and get an API key.
-After that, you can enable it by setting environment variables:
-
-```shell
-export LANGCHAIN_TRACING_V2=true
-export LANGCHAIN_API_KEY=ls__...
-```
-
-## LangChain Expression Language
-
-LangChain Expression Language, or LCEL, is a declarative way to easily compose chains together.
+LangChain Expression Language, or LCEL, is a declarative way to chain LangChain components.
 LCEL was designed from day 1 to **support putting prototypes in production, with no code changes**, from the simplest “prompt + LLM” chain to the most complex chains (we’ve seen folks successfully run LCEL chains with 100s of steps in production). To highlight a few of the reasons you might want to use LCEL:

 **First-class streaming support**
@@ -106,7 +87,7 @@ With LCEL, **all** steps are automatically logged to [LangSmith](/docs/langsmith
 [**Seamless LangServe deployment**](/docs/langserve)
 Any chain created with LCEL can be easily deployed using [LangServe](/docs/langserve).

-### Interface
+### Runnable interface

 To make it as easy as possible to create custom chains, we've implemented a ["Runnable"](https://api.python.langchain.com/en/stable/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable) protocol. Many LangChain components implement the `Runnable` protocol, including chat models, LLMs, output parsers, retrievers, prompt templates, and more. There are also several useful primitives for working with runnables, which you can read about below.

@@ -146,16 +127,6 @@ All runnables expose input and output **schemas** to inspect the inputs and outp
 LangChain provides standard, extendable interfaces and external integrations for various components useful for building with LLMs.
 Some components LangChain implements, some components we rely on third-party integrations for, and others are a mix.

-### LLMs
-Language models that takes a string as input and returns a string.
-These are traditionally older models (newer models generally are `ChatModels`, see below).
-
-Although the underlying models are string in, string out, the LangChain wrappers also allow these models to take messages as input.
-This makes them interchangeable with ChatModels.
-When messages are passed in as input, they will be formatted into a string under the hood before being passed to the underlying model.
-
-LangChain does not provide any LLMs, rather we rely on third party integrations.
-
 ### Chat models
 Language models that use a sequence of messages as inputs and return chat messages as outputs (as opposed to using plain text).
 These are traditionally newer models (older models are generally `LLMs`, see above).
@@ -172,45 +143,17 @@ We have some standardized parameters when constructing ChatModels:

 ChatModels also accept other parameters that are specific to that integration.

-### Function/Tool Calling
+### LLMs
+Language models that take a string as input and return a string.
+These are traditionally older models (newer models are generally `ChatModels`, see above).

-:::info
-We use the term tool calling interchangeably with function calling. Although
-function calling is sometimes meant to refer to invocations of a single function,
-we treat all models as though they can return multiple tool or function calls in
-each message.
-:::
+Although the underlying models are string in, string out, the LangChain wrappers also allow these models to take messages as input.
+This makes them interchangeable with ChatModels.
+When messages are passed in as input, they will be formatted into a string under the hood before being passed to the underlying model.

-Tool calling allows a model to respond to a given prompt by generating output that
-matches a user-defined schema. While the name implies that the model is performing
-some action, this is actually not the case! The model is coming up with the
-arguments to a tool, and actually running the tool (or not) is up to the user -
-for example, if you want to [extract output matching some schema](/docs/tutorials/extraction)
-from unstructured text, you could give the model an "extraction" tool that takes
-parameters matching the desired schema, then treat the generated output as your final
-result.
+LangChain does not provide any LLMs, rather we rely on third party integrations.

-A tool call includes a name, arguments dict, and an optional identifier. The
-arguments dict is structured `{argument_name: argument_value}`.
-
-Many LLM providers, including [Anthropic](https://www.anthropic.com/),
-[Cohere](https://cohere.com/), [Google](https://cloud.google.com/vertex-ai),
-[Mistral](https://mistral.ai/), [OpenAI](https://openai.com/), and others,
-support variants of a tool calling feature. These features typically allow requests
-to the LLM to include available tools and their schemas, and for responses to include
-calls to these tools. For instance, given a search engine tool, an LLM might handle a
-query by first issuing a call to the search engine. The system calling the LLM can
-receive the tool call, execute it, and return the output to the LLM to inform its
-response. LangChain includes a suite of [built-in tools](/docs/integrations/tools/)
-and supports several methods for defining your own [custom tools](/docs/how_to/custom_tools).
-
-There are two main use cases for function/tool calling:
-
- [How to return structured data from an LLM](/docs/how_to/structured_output/)
- [How to use a model to call tools](/docs/how_to/tool_calling/)
-
-
-### Message types
+### Messages

 Some language models take a list of messages as input and return a message.
 There are a few different types of messages.
@@ -338,7 +281,7 @@ prompt_template = ChatPromptTemplate.from_messages([
 ])
 ```

-### Example Selectors
+### Example selectors
 One common prompting technique for achieving better performance is to include examples as part of the prompt.
 This gives the language model concrete examples of how it should behave.
 Sometimes these examples are hardcoded into the prompt, but for more advanced situations it may be nice to dynamically select them.
@@ -389,7 +332,7 @@ LangChain has lots of different types of output parsers. This is a list of outpu
 | [Datetime](https://api.python.langchain.com/en/latest/output_parsers/langchain.output_parsers.datetime.DatetimeOutputParser.html#langchain.output_parsers.datetime.DatetimeOutputParser)        |                    | ✅                             |           | `str` \| `Message`                 | `datetime.datetime`  | Parses response into a datetime string.                                                                                                                                                                                                                  |
 | [Structured](https://api.python.langchain.com/en/latest/output_parsers/langchain.output_parsers.structured.StructuredOutputParser.html#langchain.output_parsers.structured.StructuredOutputParser)      |                    | ✅                             |           | `str` \| `Message`                 | `Dict[str, str]`     | An output parser that returns structured information. It is less powerful than other output parsers since it only allows for fields to be strings. This can be useful when you are working with smaller LLMs.                                            |

-### Chat History
+### Chat history
 Most LLM applications have a conversational interface.
 An essential component of a conversation is being able to refer to information introduced earlier in the conversation.
 At bare minimum, a conversational system should be able to access some window of past messages directly.
@@ -398,7 +341,7 @@ The concept of `ChatHistory` refers to a class in LangChain which can be used to
 This `ChatHistory` will keep track of inputs and outputs of the underlying chain, and append them as messages to a message database
 Future interactions will then load those messages and pass them into the chain as part of the input.

-### Document
+### Documents

 A Document object in LangChain contains information about some data. It has two attributes:

@@ -445,12 +388,12 @@ Embeddings create a vector representation of a piece of text. This is useful bec

 The base Embeddings class in LangChain provides two methods: one for embedding documents and one for embedding a query. The former takes as input multiple texts, while the latter takes a single text. The reason for having these as two separate methods is that some embedding providers have different embedding methods for documents (to be searched over) vs queries (the search query itself).

-### Vectorstores
+### Vector stores
 One of the most common ways to store and search over unstructured data is to embed it and store the resulting embedding vectors,
 and then at query time to embed the unstructured query and retrieve the embedding vectors that are 'most similar' to the embedded query.
 A vector store takes care of storing embedded data and performing vector search for you.

-Vectorstores can be converted to the retriever interface by doing:
+Vector stores can be converted to the retriever interface by doing:

 ```python
 vectorstore = MyVectorStore()
@@ -465,31 +408,6 @@ Retrievers can be created from vectorstores, but are also broad enough to includ

 Retrievers accept a string query as input and return a list of Document's as output.

-### Advanced Retrieval Types
-
-LangChain provides several advanced retrieval types. A full list is below, along with the following information:
-
-**Name**: Name of the retrieval algorithm.
-
-**Index Type**: Which index type (if any) this relies on.
-
-**Uses an LLM**: Whether this retrieval method uses an LLM.
-
-**When to Use**: Our commentary on when you should considering using this retrieval method.
-
-**Description**: Description of what this retrieval algorithm is doing.
-
-| Name                      | Index Type                   | Uses an LLM               | When to Use                                                                                                                                   | Description                                                                                                                                                                                                                                                                                      |
-|---------------------------|------------------------------|---------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| [Vectorstore](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStoreRetriever.html#langchain_core.vectorstores.VectorStoreRetriever)               | Vectorstore                  | No                        | If you are just getting started and looking for something quick and easy.                                                                     | This is the simplest method and the one that is easiest to get started with. It involves creating embeddings for each piece of text.                                                                                                                                                             |
-| [ParentDocument](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.parent_document_retriever.ParentDocumentRetriever.html#langchain.retrievers.parent_document_retriever.ParentDocumentRetriever)            | Vectorstore + Document Store | No                        | If your pages have lots of smaller pieces of distinct information that are best indexed by themselves, but best retrieved all together.       | This involves indexing multiple chunks for each document. Then you find the chunks that are most similar in embedding space, but you retrieve the whole parent document and return that (rather than individual chunks).                                                                         |
-| [Multi Vector](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.multi_vector.MultiVectorRetriever.html#langchain.retrievers.multi_vector.MultiVectorRetriever)              | Vectorstore + Document Store | Sometimes during indexing | If you are able to extract information from documents that you think is more relevant to index than the text itself.                          | This involves creating multiple vectors for each document. Each vector could be created in a myriad of ways - examples include summaries of the text and hypothetical questions.                                                                                                                 |
-| [Self Query](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.self_query.base.SelfQueryRetriever.html#langchain.retrievers.self_query.base.SelfQueryRetriever)               | Vectorstore                  | Yes                       | If users are asking questions that are better answered by fetching documents based on metadata rather than similarity with the text.          | This uses an LLM to transform user input into two things: (1) a string to look up semantically, (2) a metadata filer to go along with it. This is useful because oftentimes questions are about the METADATA of documents (not the content itself).                                              |
-| [Contextual Compression](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.contextual_compression.ContextualCompressionRetriever.html#langchain.retrievers.contextual_compression.ContextualCompressionRetriever)    | Any                          | Sometimes                 | If you are finding that your retrieved documents contain too much irrelevant information and are distracting the LLM.                         | This puts a post-processing step on top of another retriever and extracts only the most relevant information from retrieved documents. This can be done with embeddings or an LLM.                                                                                                               |
-| [Time-Weighted Vectorstore](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.time_weighted_retriever.TimeWeightedVectorStoreRetriever.html#langchain.retrievers.time_weighted_retriever.TimeWeightedVectorStoreRetriever) | Vectorstore                  | No                        | If you have timestamps associated with your documents, and you want to retrieve the most recent ones                                          | This fetches documents based on a combination of semantic similarity (as in normal vector retrieval) and recency (looking at timestamps of indexed documents)                                                                                                                                    |
-| [Multi-Query Retriever](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.multi_query.MultiQueryRetriever.html#langchain.retrievers.multi_query.MultiQueryRetriever)     | Any                          | Yes                       | If users are asking questions that are complex and require multiple pieces of distinct information to respond                                 | This uses an LLM to generate multiple queries from the original one. This is useful when the original query needs pieces of information about multiple topics to be properly answered. By generating multiple queries, we can then fetch documents for each of them.                             |
-| [Ensemble](https://api.python.langchain.com/en/latest/retrievers/langchain.retrievers.ensemble.EnsembleRetriever.html#langchain.retrievers.ensemble.EnsembleRetriever)                  | Any                          | No                        | If you have multiple retrieval methods and want to try combining them.                                                                        | This fetches documents from multiple retrievers and then combines them.                                                                                                                                                                                                                          |
-
 ### Tools
 Tools are interfaces that an agent, chain, or LLM can use to interact with the world.
 They combine a few things:
@@ -541,3 +459,94 @@ In order to solve that we built LangGraph to be this flexible, highly-controllab
 If you are still using AgentExecutor, do not fear: we still have a guide on [how to use AgentExecutor](/docs/how_to/agent_executor).
 It is recommended, however, that you start to transition to LangGraph.
 In order to assist in this we have put together a [transition guide on how to do so](/docs/how_to/migrate_agent)
+
+## Techniques
+
+### Function/tool calling
+
+:::info
+We use the term tool calling interchangeably with function calling. Although
+function calling is sometimes meant to refer to invocations of a single function,
+we treat all models as though they can return multiple tool or function calls in
+each message.
+:::
+
+Tool calling allows a model to respond to a given prompt by generating output that
+matches a user-defined schema. While the name implies that the model is performing
+some action, this is actually not the case! The model is coming up with the
+arguments to a tool, and actually running the tool (or not) is up to the user -
+for example, if you want to [extract output matching some schema](/docs/tutorials/extraction)
+from unstructured text, you could give the model an "extraction" tool that takes
+parameters matching the desired schema, then treat the generated output as your final
+result.
+
+A tool call includes a name, arguments dict, and an optional identifier. The
+arguments dict is structured `{argument_name: argument_value}`.
+
+Many LLM providers, including [Anthropic](https://www.anthropic.com/),
+[Cohere](https://cohere.com/), [Google](https://cloud.google.com/vertex-ai),
+[Mistral](https://mistral.ai/), [OpenAI](https://openai.com/), and others,
+support variants of a tool calling feature. These features typically allow requests
+to the LLM to include available tools and their schemas, and for responses to include
+calls to these tools. For instance, given a search engine tool, an LLM might handle a
+query by first issuing a call to the search engine. The system calling the LLM can
+receive the tool call, execute it, and return the output to the LLM to inform its
+response. LangChain includes a suite of [built-in tools](/docs/integrations/tools/)
+and supports several methods for defining your own [custom tools](/docs/how_to/custom_tools).
+
+There are two main use cases for function/tool calling:
+
+- [How to return structured data from an LLM](/docs/how_to/structured_output/)
+- [How to use a model to call tools](/docs/how_to/tool_calling/)
+
+
+### Retrieval
+
+LangChain provides several advanced retrieval types. A full list is below, along with the following information:
+
+**Name**: Name of the retrieval algorithm.
+
+**Index Type**: Which index type (if any) this relies on.
+
+**Uses an LLM**: Whether this retrieval method uses an LLM.
+
+**When to Use**: Our commentary on when you should consider using this retrieval method.
+
+**Description**: Description of what this retrieval algorithm is doing.
+
+| Name                      | Index Type                   | Uses an LLM               | When to Use                                                                                                                                   | Description                                                                                                                                                                                                                                                                                      |
+|---------------------------|------------------------------|---------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [Vectorstore](/docs/how_to/vectorstore_retriever/)               | Vectorstore                  | No                        | If you are just getting started and looking for something quick and easy.                                                                     | This is the simplest method and the one that is easiest to get started with. It involves creating embeddings for each piece of text.                                                                                                                                                             |
+| [ParentDocument](/docs/how_to/parent_document_retriever/)            | Vectorstore + Document Store | No                        | If your pages have lots of smaller pieces of distinct information that are best indexed by themselves, but best retrieved all together.       | This involves indexing multiple chunks for each document. Then you find the chunks that are most similar in embedding space, but you retrieve the whole parent document and return that (rather than individual chunks).                                                                         |
+| [Multi Vector](/docs/how_to/multi_vector/)              | Vectorstore + Document Store | Sometimes during indexing | If you are able to extract information from documents that you think is more relevant to index than the text itself.                          | This involves creating multiple vectors for each document. Each vector could be created in a myriad of ways - examples include summaries of the text and hypothetical questions.                                                                                                                 |
+| [Self Query](/docs/how_to/self_query/)               | Vectorstore                  | Yes                       | If users are asking questions that are better answered by fetching documents based on metadata rather than similarity with the text.          | This uses an LLM to transform user input into two things: (1) a string to look up semantically, (2) a metadata filter to go along with it. This is useful because oftentimes questions are about the METADATA of documents (not the content itself).                                             |
+| [Contextual Compression](/docs/how_to/contextual_compression/)    | Any                          | Sometimes                 | If you are finding that your retrieved documents contain too much irrelevant information and are distracting the LLM.                         | This puts a post-processing step on top of another retriever and extracts only the most relevant information from retrieved documents. This can be done with embeddings or an LLM.                                                                                                               |
+| [Time-Weighted Vectorstore](/docs/how_to/time_weighted_vectorstore/) | Vectorstore                  | No                        | If you have timestamps associated with your documents, and you want to retrieve the most recent ones                                          | This fetches documents based on a combination of semantic similarity (as in normal vector retrieval) and recency (looking at timestamps of indexed documents)                                                                                                                                    |
+| [Multi-Query Retriever](/docs/how_to/MultiQueryRetriever/)     | Any                          | Yes                       | If users are asking questions that are complex and require multiple pieces of distinct information to respond                                 | This uses an LLM to generate multiple queries from the original one. This is useful when the original query needs pieces of information about multiple topics to be properly answered. By generating multiple queries, we can then fetch documents for each of them.                             |
+| [Ensemble](/docs/how_to/ensemble_retriever/)                  | Any                          | No                        | If you have multiple retrieval methods and want to try combining them.                                                                        | This fetches documents from multiple retrievers and then combines them.                                                                                                                                                                                                                          |
+
+
+### Text splitting
+
+LangChain offers many different types of `text splitters`.
+These all live in the `langchain-text-splitters` package.
+
+Table columns:
+
+- **Name**: Name of the text splitter
+- **Classes**: Classes that implement this text splitter
+- **Splits On**: How this text splitter splits text
+- **Adds Metadata**: Whether or not this text splitter adds metadata about where each chunk came from.
+- **Description**: Description of the splitter, including a recommendation on when to use it.
+
+
+| Name     | Classes                                                                                                                                                                                                             | Splits On                                                   | Adds Metadata | Description                                                                                                                                                                                                                                                                  |
+|----------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------|---------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| Recursive | [RecursiveCharacterTextSplitter](/docs/how_to/recursive_text_splitter/), [RecursiveJsonSplitter](/docs/how_to/recursive_json_splitter/) | A list of user-defined characters     |               | Recursively splits text, trying to keep related pieces of text next to each other. This is the recommended way to start splitting text.                                                                                                                                        |
+| HTML      | [HTMLHeaderTextSplitter](/docs/how_to/HTML_header_metadata_splitter/), [HTMLSectionSplitter](/docs/how_to/HTML_section_aware_splitter/)          | HTML specific characters                                                                                                 | ✅             | Splits text based on HTML-specific characters. Notably, this adds in relevant information about where that chunk came from (based on the HTML)                                                                                                                               |
+| Markdown  | [MarkdownHeaderTextSplitter](/docs/how_to/markdown_header_metadata_splitter/)                                                                                                            | Markdown specific characters                                                                                    | ✅             | Splits text based on Markdown-specific characters. Notably, this adds in relevant information about where that chunk came from (based on the Markdown)                                                                                                                       |
+| Code      | [many languages](/docs/how_to/code_splitter/)                                                                                                                                 | Code (Python, JS) specific characters                                                                           |               | Splits text based on characters specific to coding languages. 15 different languages are available to choose from.                                                                                                                                                           |
+| Token    | [many classes](/docs/how_to/split_by_token/)                                                                                                                                  | Tokens                                                                                                          |               | Splits text on tokens. There exist a few different ways to measure tokens.                                                                                                                                                                                                   |
+| Character  | [CharacterTextSplitter](/docs/how_to/character_text_splitter/)                                                                                                                | A user defined character                                                                                        |               | Splits text based on a user defined character. One of the simpler methods.                                                                                                                                                                                                   |
+| Semantic Chunker (Experimental) | [SemanticChunker](/docs/how_to/semantic-chunker/)                                                                                                                             | Sentences                                                                                                       |               | First splits on sentences. Then combines ones next to each other if they are semantically similar enough. Taken from [Greg Kamradt](https://github.com/FullStackRetrieval-com/RetrievalTutorials/blob/main/tutorials/LevelsOfTextSplitting/5_Levels_Of_Text_Splitting.ipynb) |
+| Integration: AI21 Semantic | [AI21SemanticTextSplitter](/docs/integrations/document_transformers/ai21_semantic_text_splitter/)                                                                                                                    |                                                             |    ✅           | Identifies distinct topics that form coherent pieces of text and splits along those.                                                                                                                                                                                         |
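
The "Recursive" row in the table describes splitting on a list of separators while trying to keep related text together. A minimal pure-Python sketch of that idea follows. The helper `recursive_split` is hypothetical and is not the `RecursiveCharacterTextSplitter` implementation; it is assumed here that oversized pieces are simply re-split with the next separator, with no merging of small pieces and no overlap.

```python
from typing import List


def recursive_split(text: str, separators: List[str], chunk_size: int) -> List[str]:
    """Split `text` into pieces of at most `chunk_size` characters.

    Tries the first separator; any piece that is still too long is split
    again with the remaining separators. Hypothetical helper for illustration.
    """
    if len(text) <= chunk_size:
        # Short enough already (empty pieces are dropped).
        return [text] if text else []
    if not separators:
        # No separators left: fall back to fixed-width slices.
        return [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]

    head, *rest = separators
    chunks: List[str] = []
    for piece in text.split(head):
        chunks.extend(recursive_split(piece, rest, chunk_size))
    return chunks


sample = "First paragraph.\n\nSecond paragraph that is a bit longer.\n\nThird."
chunks = recursive_split(sample, ["\n\n", " "], chunk_size=20)
```

Paragraphs that already fit stay whole, while the long middle paragraph falls through to the word-level separator. A production splitter would likely also merge adjacent small pieces back up toward the size limit.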
--- a/docs/docs/how_to/add_scores_retriever.ipynb
+++ b/docs/docs/how_to/add_scores_retriever.ipynb
@@ -0,0 +1,446 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "9d59582a-6473-4b34-929b-3e94cb443c3d",
+   "metadata": {},
+   "source": [
+    "# How to add scores to retriever results\n",
+    "\n",
+    "Retrievers will return sequences of [Document](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.base.Document.html) objects, which by default include no information about the process that retrieved them (e.g., a similarity score against a query). Here we demonstrate how to add retrieval scores to the `.metadata` of documents:\n",
+    "1. From [vectorstore retrievers](/docs/how_to/vectorstore_retriever);\n",
+    "2. From higher-order LangChain retrievers, such as [SelfQueryRetriever](/docs/how_to/self_query) or [MultiVectorRetriever](/docs/how_to/multi_vector).\n",
+    "\n",
+    "For (1), we will implement a short wrapper function around the corresponding vector store. For (2), we will update a method of the corresponding class.\n",
+    "\n",
+    "## Create vector store\n",
+    "\n",
+    "First we populate a vector store with some data. We will use a [PineconeVectorStore](https://api.python.langchain.com/en/latest/vectorstores/langchain_pinecone.vectorstores.PineconeVectorStore.html), but this guide is compatible with any LangChain vector store that implements a `.similarity_search_with_score` method."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "b8cfcb1b-64ee-4b91-8d82-ce7803834985",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_core.documents import Document\n",
+    "from langchain_openai import OpenAIEmbeddings\n",
+    "from langchain_pinecone import PineconeVectorStore\n",
+    "\n",
+    "docs = [\n",
+    "    Document(\n",
+    "        page_content=\"A bunch of scientists bring back dinosaurs and mayhem breaks loose\",\n",
+    "        metadata={\"year\": 1993, \"rating\": 7.7, \"genre\": \"science fiction\"},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Leo DiCaprio gets lost in a dream within a dream within a dream within a ...\",\n",
+    "        metadata={\"year\": 2010, \"director\": \"Christopher Nolan\", \"rating\": 8.2},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea\",\n",
+    "        metadata={\"year\": 2006, \"director\": \"Satoshi Kon\", \"rating\": 8.6},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"A bunch of normal-sized women are supremely wholesome and some men pine after them\",\n",
+    "        metadata={\"year\": 2019, \"director\": \"Greta Gerwig\", \"rating\": 8.3},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Toys come alive and have a blast doing so\",\n",
+    "        metadata={\"year\": 1995, \"genre\": \"animated\"},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Three men walk into the Zone, three men walk out of the Zone\",\n",
+    "        metadata={\n",
+    "            \"year\": 1979,\n",
+    "            \"director\": \"Andrei Tarkovsky\",\n",
+    "            \"genre\": \"thriller\",\n",
+    "            \"rating\": 9.9,\n",
+    "        },\n",
+    "    ),\n",
+    "]\n",
+    "\n",
+    "vectorstore = PineconeVectorStore.from_documents(\n",
+    "    docs, index_name=\"sample\", embedding=OpenAIEmbeddings()\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "22ac5ef6-ce18-427f-a91c-62b38a8b41e9",
+   "metadata": {},
+   "source": [
+    "## Retriever\n",
+    "\n",
+    "To obtain scores from a vector store retriever, we wrap the underlying vector store's `.similarity_search_with_score` method in a short function that packages scores into the associated document's metadata.\n",
+    "\n",
+    "We add a `@chain` decorator to the function to create a [Runnable](/docs/concepts/#langchain-expression-language) that can be used similarly to a typical retriever."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "7e5677c3-f6ee-4974-ab5f-a0f50c199d45",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from typing import List\n",
+    "\n",
+    "from langchain_core.documents import Document\n",
+    "from langchain_core.runnables import chain\n",
+    "\n",
+    "\n",
+    "@chain\n",
+    "def retriever(query: str) -> List[Document]:\n",
+    "    docs, scores = zip(*vectorstore.similarity_search_with_score(query))\n",
+    "    for doc, score in zip(docs, scores):\n",
+    "        doc.metadata[\"score\"] = score\n",
+    "\n",
+    "    return docs"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "c9cad75e-b955-4012-989c-3c1820b49ba9",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "(Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'genre': 'science fiction', 'rating': 7.7, 'year': 1993.0, 'score': 0.84429127}),\n",
+       " Document(page_content='Toys come alive and have a blast doing so', metadata={'genre': 'animated', 'year': 1995.0, 'score': 0.792038262}),\n",
+       " Document(page_content='Three men walk into the Zone, three men walk out of the Zone', metadata={'director': 'Andrei Tarkovsky', 'genre': 'thriller', 'rating': 9.9, 'year': 1979.0, 'score': 0.751571238}),\n",
+       " Document(page_content='A psychologist / detective gets lost in a series of dreams within dreams within dreams and Inception reused the idea', metadata={'director': 'Satoshi Kon', 'rating': 8.6, 'year': 2006.0, 'score': 0.747471571}))"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "result = retriever.invoke(\"dinosaur\")\n",
+    "result"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6671308a-be8d-4c15-ae1f-5bd07b342560",
+   "metadata": {},
+   "source": [
+    "Note that similarity scores from the retrieval step are included in the metadata of the above documents."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "af2e73a0-46a1-47e2-8103-68aaa637642a",
+   "metadata": {},
+   "source": [
+    "## SelfQueryRetriever\n",
+    "\n",
+    "`SelfQueryRetriever` uses an LLM to generate a potentially structured query. For example, it can construct filters for the retrieval on top of the usual semantic-similarity-driven selection. See [this guide](/docs/how_to/self_query) for more detail.\n",
+    "\n",
+    "`SelfQueryRetriever` includes a short (1 - 2 line) method `_get_docs_with_query` that executes the `vectorstore` search. We can subclass `SelfQueryRetriever` and override this method to propagate similarity scores.\n",
+    "\n",
+    "First, following the [how-to guide](/docs/how_to/self_query), we will need to establish some metadata on which to filter:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "8280b829-2e81-4454-8adc-9a0930047fa2",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.chains.query_constructor.base import AttributeInfo\n",
+    "from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
+    "from langchain_openai import ChatOpenAI\n",
+    "\n",
+    "metadata_field_info = [\n",
+    "    AttributeInfo(\n",
+    "        name=\"genre\",\n",
+    "        description=\"The genre of the movie. One of ['science fiction', 'comedy', 'drama', 'thriller', 'romance', 'action', 'animated']\",\n",
+    "        type=\"string\",\n",
+    "    ),\n",
+    "    AttributeInfo(\n",
+    "        name=\"year\",\n",
+    "        description=\"The year the movie was released\",\n",
+    "        type=\"integer\",\n",
+    "    ),\n",
+    "    AttributeInfo(\n",
+    "        name=\"director\",\n",
+    "        description=\"The name of the movie director\",\n",
+    "        type=\"string\",\n",
+    "    ),\n",
+    "    AttributeInfo(\n",
+    "        name=\"rating\", description=\"A 1-10 rating for the movie\", type=\"float\"\n",
+    "    ),\n",
+    "]\n",
+    "document_content_description = \"Brief summary of a movie\"\n",
+    "llm = ChatOpenAI(temperature=0)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0a6c6fa8-1e2f-45ee-83e9-a6cbd82292d2",
+   "metadata": {},
+   "source": [
+    "We then override `_get_docs_with_query` to use the `similarity_search_with_score` method of the underlying vector store:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "62c8f3fa-8b64-4afb-87c4-ccbbf9a8bc54",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from typing import Any, Dict\n",
+    "\n",
+    "\n",
+    "class CustomSelfQueryRetriever(SelfQueryRetriever):\n",
+    "    def _get_docs_with_query(\n",
+    "        self, query: str, search_kwargs: Dict[str, Any]\n",
+    "    ) -> List[Document]:\n",
+    "        \"\"\"Get docs, adding score information.\"\"\"\n",
+    "        docs, scores = zip(\n",
+    "            *self.vectorstore.similarity_search_with_score(query, **search_kwargs)\n",
+    "        )\n",
+    "        for doc, score in zip(docs, scores):\n",
+    "            doc.metadata[\"score\"] = score\n",
+    "\n",
+    "        return docs"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "56e40109-1db6-44c7-a6e6-6989175e267c",
+   "metadata": {},
+   "source": [
+    "Invoking this retriever will now include similarity scores in the document metadata. Note that the underlying structured-query capabilities of `SelfQueryRetriever` are retained."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "3359a1ee-34ff-41b6-bded-64c05785b333",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "(Document(page_content='A bunch of scientists bring back dinosaurs and mayhem breaks loose', metadata={'genre': 'science fiction', 'rating': 7.7, 'year': 1993.0, 'score': 0.84429127}),)"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "retriever = CustomSelfQueryRetriever.from_llm(\n",
+    "    llm,\n",
+    "    vectorstore,\n",
+    "    document_content_description,\n",
+    "    metadata_field_info,\n",
+    ")\n",
+    "\n",
+    "\n",
+    "result = retriever.invoke(\"dinosaur movie with rating less than 8\")\n",
+    "result"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "689ab3ba-3494-448b-836e-05fbe1ffd51c",
+   "metadata": {},
+   "source": [
+    "## MultiVectorRetriever\n",
+    "\n",
+    "`MultiVectorRetriever` allows you to associate multiple vectors with a single document. This can be useful in a number of applications. For example, we can index small chunks of a larger document and run the retrieval on the chunks, but return the larger \"parent\" document when invoking the retriever. [ParentDocumentRetriever](/docs/how_to/parent_document_retriever/), a subclass of `MultiVectorRetriever`, includes convenience methods for populating a vector store to support this. Further applications are detailed in this [how-to guide](/docs/how_to/multi_vector/).\n",
+    "\n",
+    "To propagate similarity scores through this retriever, we can again subclass `MultiVectorRetriever` and override a method. This time we will override `_get_relevant_documents`.\n",
+    "\n",
+    "First, we prepare some fake data. We generate fake \"whole documents\" and store them in a document store; here we will use a simple [InMemoryStore](https://api.python.langchain.com/en/latest/stores/langchain_core.stores.InMemoryBaseStore.html)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "a112e545-7b53-4fcd-9c4a-7a42a5cc646d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.storage import InMemoryStore\n",
+    "from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
+    "\n",
+    "# The storage layer for the parent documents\n",
+    "docstore = InMemoryStore()\n",
+    "fake_whole_documents = [\n",
+    "    (\"fake_id_1\", Document(page_content=\"fake whole document 1\")),\n",
+    "    (\"fake_id_2\", Document(page_content=\"fake whole document 2\")),\n",
+    "]\n",
+    "docstore.mset(fake_whole_documents)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "453b7415-4a6d-45d4-a329-9c1d7271d1b2",
+   "metadata": {},
+   "source": [
+    "Next we will add some fake \"sub-documents\" to our vector store. We can link these sub-documents to the parent documents by populating the `\"doc_id\"` key in their metadata."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "314519c0-dde4-41ea-a1ab-d3cf1c17c63f",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "['62a85353-41ff-4346-bff7-be6c8ec2ed89',\n",
+       " '5d4a0e83-4cc5-40f1-bc73-ed9cbad0ee15',\n",
+       " '8c1d9a56-120f-45e4-ba70-a19cd19a38f4']"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "docs = [\n",
+    "    Document(\n",
+    "        page_content=\"A snippet from a larger document discussing cats.\",\n",
+    "        metadata={\"doc_id\": \"fake_id_1\"},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"A snippet from a larger document discussing discourse.\",\n",
+    "        metadata={\"doc_id\": \"fake_id_1\"},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"A snippet from a larger document discussing chocolate.\",\n",
+    "        metadata={\"doc_id\": \"fake_id_2\"},\n",
+    "    ),\n",
+    "]\n",
+    "\n",
+    "vectorstore.add_documents(docs)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e391f7f3-5a58-40fd-89fa-a0815c5146f7",
+   "metadata": {},
+   "source": [
+    "To propagate the scores, we subclass `MultiVectorRetriever` and override its `_get_relevant_documents` method. Here we will make two changes:\n",
+    "\n",
+    "1. We will add similarity scores to the metadata of the corresponding \"sub-documents\" using the `similarity_search_with_score` method of the underlying vector store as above;\n",
+    "2. We will include a list of these sub-documents in the metadata of the retrieved parent document. This surfaces what snippets of text were identified by the retrieval, together with their corresponding similarity scores."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "1de61de7-1b58-41d6-9dea-939fef7d741d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from collections import defaultdict\n",
+    "\n",
+    "from langchain.retrievers import MultiVectorRetriever\n",
+    "from langchain_core.callbacks import CallbackManagerForRetrieverRun\n",
+    "\n",
+    "\n",
+    "class CustomMultiVectorRetriever(MultiVectorRetriever):\n",
+    "    def _get_relevant_documents(\n",
+    "        self, query: str, *, run_manager: CallbackManagerForRetrieverRun\n",
+    "    ) -> List[Document]:\n",
+    "        \"\"\"Get documents relevant to a query.\n",
+    "        Args:\n",
+    "            query: String to find relevant documents for\n",
+    "            run_manager: The callbacks handler to use\n",
+    "        Returns:\n",
+    "            List of relevant documents\n",
+    "        \"\"\"\n",
+    "        results = self.vectorstore.similarity_search_with_score(\n",
+    "            query, **self.search_kwargs\n",
+    "        )\n",
+    "\n",
+    "        # Map doc_ids to list of sub-documents, adding scores to metadata\n",
+    "        id_to_doc = defaultdict(list)\n",
+    "        for doc, score in results:\n",
+    "            doc_id = doc.metadata.get(\"doc_id\")\n",
+    "            if doc_id:\n",
+    "                doc.metadata[\"score\"] = score\n",
+    "                id_to_doc[doc_id].append(doc)\n",
+    "\n",
+    "        # Fetch documents corresponding to doc_ids, retaining sub_docs in metadata\n",
+    "        docs = []\n",
+    "        for _id, sub_docs in id_to_doc.items():\n",
+    "            docstore_docs = self.docstore.mget([_id])\n",
+    "            if docstore_docs:\n",
+    "                if doc := docstore_docs[0]:\n",
+    "                    doc.metadata[\"sub_docs\"] = sub_docs\n",
+    "                    docs.append(doc)\n",
+    "\n",
+    "        return docs"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7af27b38-631c-463f-9d66-bcc985f06a4f",
+   "metadata": {},
+   "source": [
+    "Invoking this retriever, we can see that it identifies the correct parent document and includes the relevant sub-document snippet along with its similarity score."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "dc42a1be-22e1-4ade-b1bd-bafb85f2424f",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[Document(page_content='fake whole document 1', metadata={'sub_docs': [Document(page_content='A snippet from a larger document discussing cats.', metadata={'doc_id': 'fake_id_1', 'score': 0.831276655})]})]"
+      ]
+     },
+     "execution_count": 11,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "retriever = CustomMultiVectorRetriever(vectorstore=vectorstore, docstore=docstore)\n",
+    "\n",
+    "retriever.invoke(\"cat\")"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
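
The notebook above attaches scores by unpacking `zip(*results)` into `docs, scores`, which raises a `ValueError` when the search returns no hits. Below is a minimal sketch of the same score-propagation pattern that also tolerates empty results. `FakeDocument`, `FakeVectorStore`, and `retrieve_with_scores` are hypothetical stand-ins, not LangChain classes; they only mimic the shape of `similarity_search_with_score`, which is assumed to return `(document, score)` pairs.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a LangChain Document."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class FakeVectorStore:
    """Hypothetical store that scores documents by word overlap with the query."""

    def __init__(self, docs: List[FakeDocument]) -> None:
        self.docs = docs

    def similarity_search_with_score(
        self, query: str
    ) -> List[Tuple[FakeDocument, float]]:
        query_words = set(query.lower().split())
        hits = []
        for doc in self.docs:
            words = set(doc.page_content.lower().split())
            overlap = len(query_words & words)
            if overlap:
                hits.append((doc, overlap / len(query_words)))
        # Highest score first, like a typical similarity search.
        return sorted(hits, key=lambda pair: pair[1], reverse=True)


def retrieve_with_scores(store: FakeVectorStore, query: str) -> List[FakeDocument]:
    """Copy each hit's score into its metadata; returns [] when there are no hits."""
    docs = []
    for doc, score in store.similarity_search_with_score(query):
        doc.metadata["score"] = score
        docs.append(doc)
    return docs


store = FakeVectorStore(
    [
        FakeDocument("toys come alive"),
        FakeDocument("scientists bring back dinosaurs"),
    ]
)
scored = retrieve_with_scores(store, "dinosaurs")
empty = retrieve_with_scores(store, "spaceships")
```

Iterating over the `(document, score)` pairs directly, rather than unzipping them first, avoids the empty-result failure and returns a real list, matching a `List[Document]` annotation.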
--- a/docs/docs/how_to/agent_executor.ipynb
+++ b/docs/docs/how_to/agent_executor.ipynb
@@ -66,7 +66,7 @@
    "```\n",
    "\n",
    "\n",
-    "For more details, see our [Installation guide](/docs/installation).\n",
+    "For more details, see our [Installation guide](/docs/how_to/installation).\n",
    "\n",
    "### LangSmith\n",
    "\n",
--- a/docs/docs/how_to/assign.ipynb
+++ b/docs/docs/how_to/assign.ipynb
@@ -16,21 +16,20 @@
   "source": [
    "# How to add values to a chain's state\n",
    "\n",
-    "An alternate way of [passing data through](/docs/how_to/passthrough) steps of a chain is to leave the current values of the chain state unchanged while assigning a new value under a given key. The [`RunnablePassthrough.assign()`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html#langchain_core.runnables.passthrough.RunnablePassthrough.assign) static method takes an input value and adds the extra arguments passed to the assign function.\n",
+    ":::info Prerequisites\n",
    "\n",
-    "This is useful in the common [LangChain Expression Language](/docs/concepts/#langchain-expression-language) pattern of additively creating a dictionary to use as input to a later step.\n",
-    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
+    "This guide assumes familiarity with the following concepts:\n",
    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
    "- [Chaining runnables](/docs/how_to/sequence/)\n",
    "- [Calling runnables in parallel](/docs/how_to/parallel/)\n",
    "- [Custom functions](/docs/how_to/functions/)\n",
    "- [Passing data through](/docs/how_to/passthrough)\n",
-    "`} />\n",
-    "```\n",
+    "\n",
+    ":::\n",
+    "\n",
+    "An alternate way of [passing data through](/docs/how_to/passthrough) steps of a chain is to leave the current values of the chain state unchanged while assigning a new value under a given key. The [`RunnablePassthrough.assign()`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html#langchain_core.runnables.passthrough.RunnablePassthrough.assign) static method takes an input value and adds the extra arguments passed to the assign function.\n",
+    "\n",
+    "This is useful in the common [LangChain Expression Language](/docs/concepts/#langchain-expression-language) pattern of additively creating a dictionary to use as input to a later step.\n",
    "\n",
    "Here's an example:"
   ]
@@ -184,9 +183,9 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
 }
--- a/docs/docs/how_to/binding.ipynb
+++ b/docs/docs/how_to/binding.ipynb
@@ -18,17 +18,16 @@
   "source": [
    "# How to attach runtime arguments to a Runnable\n",
    "\n",
-    "Sometimes we want to invoke a [`Runnable`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.Runnable.html) within a [RunnableSequence](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.RunnableSequence.html) with constant arguments that are not part of the output of the preceding Runnable in the sequence, and which are not part of the user input. We can use the [`Runnable.bind()`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.bind) method to set these arguments ahead of time.\n",
+    ":::info Prerequisites\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
+    "This guide assumes familiarity with the following concepts:\n",
    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
    "- [Chaining runnables](/docs/how_to/sequence/)\n",
    "- [Tool calling](/docs/how_to/tool_calling/)\n",
-    "`} />\n",
-    "```\n",
+    "\n",
+    ":::\n",
+    "\n",
+    "Sometimes we want to invoke a [`Runnable`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.Runnable.html) within a [RunnableSequence](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.RunnableSequence.html) with constant arguments that are not part of the output of the preceding Runnable in the sequence, and which are not part of the user input. We can use the [`Runnable.bind()`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable.bind) method to set these arguments ahead of time.\n",
    "\n",
    "## Binding stop sequences\n",
    "\n",
@@ -228,7 +227,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/chat_model_caching.ipynb
+++ b/docs/docs/how_to/chat_model_caching.ipynb
@@ -7,21 +7,20 @@
   "source": [
    "# How to cache chat model responses\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [Chat models](/docs/concepts/#chat-models)\n",
+    "- [LLMs](/docs/concepts/#llms)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "LangChain provides an optional caching layer for chat models. This is useful for two main reasons:\n",
    "\n",
    "- It can save you money by reducing the number of API calls you make to the LLM provider, if you're often requesting the same completion multiple times. This is especially useful during app development.\n",
    "- It can speed up your application by reducing the number of API calls you make to the LLM provider.\n",
    "\n",
-    "This guide will walk you through how to enable this in your apps.\n",
-    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [Chat models](/docs/concepts/#chat-models)\n",
-    "- [LLMs](/docs/concepts/#llms)\n",
-    "`} />\n",
-    "```"
+    "This guide will walk you through how to enable this in your apps."
   ]
  },
  {
@@ -267,7 +266,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/chat_token_usage_tracking.ipynb
+++ b/docs/docs/how_to/chat_token_usage_tracking.ipynb
@@ -7,15 +7,14 @@
   "source": [
    "# How to track token usage in ChatModels\n",
    "\n",
-    "Tracking token usage to calculate cost is an important part of putting your app in production. This guide goes over how to obtain this information from your LangChain model calls.\n",
+    ":::info Prerequisites\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
+    "This guide assumes familiarity with the following concepts:\n",
    "- [Chat models](/docs/concepts/#chat-models)\n",
-    "`} />\n",
-    "```"
+    "\n",
+    ":::\n",
+    "\n",
+    "Tracking token usage to calculate cost is an important part of putting your app in production. This guide goes over how to obtain this information from your LangChain model calls."
   ]
  },
  {
@@ -365,7 +364,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/configure.ipynb
+++ b/docs/docs/how_to/configure.ipynb
@@ -18,23 +18,22 @@
   "source": [
    "# How to configure runtime chain internals\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
+    "- [Chaining runnables](/docs/how_to/sequence/)\n",
+    "- [Binding runtime arguments](/docs/how_to/binding/)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "Sometimes you may want to experiment with, or even expose to the end user, multiple different ways of doing things within your chains.\n",
    "This can include tweaking parameters such as temperature or even swapping out one model for another.\n",
    "In order to make this experience as easy as possible, we have defined two methods.\n",
    "\n",
    "- A `configurable_fields` method. This lets you configure particular fields of a runnable.\n",
    "  - This is related to the [`.bind`](/docs/how_to/binding) method on runnables, but allows you to specify parameters for a given step in a chain at runtime rather than specifying them beforehand.\n",
-    "- A `configurable_alternatives` method. With this method, you can list out alternatives for any particular runnable that can be set during runtime, and swap them for those specified alternatives.\n",
-    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
-    "- [Chaining runnables](/docs/how_to/sequence/)\n",
-    "- [Binding runtime arguments](/docs/how_to/binding/)\n",
-    "`} />\n",
-    "```"
+    "- A `configurable_alternatives` method. With this method, you can list out alternatives for any particular runnable that can be set during runtime, and swap them for those specified alternatives."
   ]
  },
  {
@@ -613,7 +612,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.5"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/custom_chat_model.ipynb
+++ b/docs/docs/how_to/custom_chat_model.ipynb
@@ -7,20 +7,19 @@
   "source": [
    "# How to create a custom chat model class\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [Chat models](/docs/concepts/#chat-models)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "In this guide, we'll learn how to create a custom chat model using LangChain abstractions.\n",
    "\n",
    "Wrapping your LLM with the standard [`BaseChatModel`](https://api.python.langchain.com/en/latest/language_models/langchain_core.language_models.chat_models.BaseChatModel.html) interface allow you to use your LLM in existing LangChain programs with minimal code modifications!\n",
    "\n",
    "As an bonus, your LLM will automatically become a LangChain `Runnable` and will benefit from some optimizations out of the box (e.g., batch via a threadpool), async support, the `astream_events` API, etc.\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [Chat models](/docs/concepts/#chat-models)\n",
-    "`} />\n",
-    "```\n",
-    "\n",
    "## Inputs and outputs\n",
    "\n",
    "First, we need to talk about **messages**, which are the inputs and outputs of chat models.\n",
@@ -562,7 +561,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/few_shot_examples.ipynb
+++ b/docs/docs/how_to/few_shot_examples.ipynb
@@ -17,23 +17,22 @@
   "source": [
    "# How to use few shot examples\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
+    "- [Example selectors](/docs/concepts/#example-selectors)\n",
+    "- [LLMs](/docs/concepts/#llms)\n",
+    "- [Vectorstores](/docs/concepts/#vectorstores)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "In this guide, we'll learn how to create a simple prompt template that provides the model with example inputs and outputs when generating. Providing the LLM with a few such examples is called few-shotting, and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n",
    "\n",
    "A few-shot prompt template can be constructed from either a set of examples, or from an [Example Selector](https://api.python.langchain.com/en/latest/example_selectors/langchain_core.example_selectors.base.BaseExampleSelector.html) class responsible for choosing a subset of examples from the defined set.\n",
    "\n",
    "This guide will cover few-shotting with string prompt templates. For a guide on few-shotting with chat messages for chat models, see [here](/docs/how_to/few_shot_examples_chat/).\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
-    "- [Example selectors](/docs/concepts/#example-selectors)\n",
-    "- [LLMs](/docs/concepts/#llms)\n",
-    "- [Vectorstores](/docs/concepts/#vectorstores)\n",
-    "`} />\n",
-    "```\n",
-    "\n",
    "## Create a formatter for the few-shot examples\n",
    "\n",
    "Configure a formatter that will format the few-shot examples into a string. This formatter should be a `PromptTemplate` object."
@@ -390,7 +389,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/few_shot_examples_chat.ipynb
+++ b/docs/docs/how_to/few_shot_examples_chat.ipynb
@@ -17,24 +17,23 @@
   "source": [
    "# How to use few shot examples in chat models\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
+    "- [Example selectors](/docs/concepts/#example-selectors)\n",
+    "- [Chat models](/docs/concepts/#chat-models)\n",
+    "- [Vectorstores](/docs/concepts/#vectorstores)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "This guide covers how to prompt a chat model with example inputs and outputs. Providing the model with a few such examples is called few-shotting, and is a simple yet powerful way to guide generation and in some cases drastically improve model performance.\n",
    "\n",
    "There does not appear to be solid consensus on how best to do few-shot prompting, and the optimal prompt compilation will likely vary by model. Because of this, we provide few-shot prompt templates like the [FewShotChatMessagePromptTemplate](https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.few_shot.FewShotChatMessagePromptTemplate.html?highlight=fewshot#langchain_core.prompts.few_shot.FewShotChatMessagePromptTemplate) as a flexible starting point, and you can modify or replace them as you see fit.\n",
    "\n",
    "The goal of few-shot prompt templates are to dynamically select examples based on an input, and then format the examples in a final prompt to provide for the model.\n",
    "\n",
-    "**Note:** The following code examples are for chat models only, since `FewShotChatMessagePromptTemplates` are designed to output formatted [chat messages](/docs/concepts/#message-types) rather than pure strings. For similar few-shot prompt examples for pure string templates compatible with completion models (LLMs), see the [few-shot prompt templates](/docs/how_to/few_shot_examples/) guide.\n",
-    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
-    "- [Example selectors](/docs/concepts/#example-selectors)\n",
-    "- [Chat models](/docs/concepts/#chat-model)\n",
-    "- [Vectorstores](/docs/concepts/#vectorstores)\n",
-    "`} />\n",
-    "```"
+    "**Note:** The following code examples are for chat models only, since `FewShotChatMessagePromptTemplates` are designed to output formatted [chat messages](/docs/concepts/#message-types) rather than pure strings. For similar few-shot prompt examples for pure string templates compatible with completion models (LLMs), see the [few-shot prompt templates](/docs/how_to/few_shot_examples/) guide."
   ]
  },
  {
@@ -435,7 +434,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/function_calling.ipynb
+++ b/docs/docs/how_to/function_calling.ipynb
@@ -696,7 +696,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/functions.ipynb
+++ b/docs/docs/how_to/functions.ipynb
@@ -18,6 +18,14 @@
   "source": [
    "# How to run custom functions\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
+    "- [Chaining runnables](/docs/how_to/sequence/)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "You can use arbitrary functions as [Runnables](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable). This is useful for formatting or when you need functionality not provided by other LangChain components, and custom functions used as Runnables are called [`RunnableLambdas`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.RunnableLambda.html).\n",
    "\n",
    "Note that all inputs to these functions need to be a SINGLE argument. If you have a function that accepts multiple arguments, you should write a wrapper that accepts a single dict input and unpacks it into multiple argument.\n",
@@ -29,15 +37,6 @@
    "- How to accept and use run metadata in your custom function\n",
    "- How to stream with custom functions by having them return generators\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
-    "- [Chaining runnables](/docs/how_to/sequence/)\n",
-    "`} />\n",
-    "```\n",
-    "\n",
    "## Using the constructor\n",
    "\n",
    "Below, we explicitly wrap our custom logic using the `RunnableLambda` constructor:"
@@ -526,7 +525,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/hybrid.ipynb
+++ b/docs/docs/how_to/hybrid.ipynb
@@ -0,0 +1,392 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "14d3fd06",
+   "metadata": {
+    "id": "14d3fd06"
+   },
+   "source": [
+    "# Hybrid Search\n",
+    "\n",
+    "The standard search in LangChain is done by vector similarity. However, a number of vectorstore implementations (Astra DB, ElasticSearch, Neo4J, AzureSearch, ...) also support more advanced search combining vector similarity search and other search techniques (full-text, BM25, and so on). This is generally referred to as \"Hybrid\" search.\n",
+    "\n",
+    "**Step 1: Make sure the vectorstore you are using supports hybrid search**\n",
+    "\n",
+    "At the moment, there is no unified way to perform hybrid search in LangChain. Each vectorstore may have its own way to do it. This is generally exposed as a keyword argument that is passed in during `similarity_search`. By reading the documentation or source code, figure out whether the vectorstore you are using supports hybrid search, and, if so, how to use it.\n",
+    "\n",
+    "**Step 2: Add that parameter as a configurable field for the chain**\n",
+    "\n",
+    "This will let you easily call the chain and configure any relevant flags at runtime. See [this documentation](/docs/how_to/configure) for more information on configuration.\n",
+    "\n",
+    "**Step 3: Call the chain with that configurable field**\n",
+    "\n",
+    "Now, at runtime you can call this chain with the configurable field.\n",
+    "\n",
+    "## Code Example\n",
+    "\n",
+    "Let's see a concrete example of what this looks like in code. We will use the Cassandra/CQL interface of Astra DB for this example.\n",
+    "\n",
+    "Install the following Python package:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c2efe35eea197769",
+   "metadata": {
+    "id": "c2efe35eea197769",
+    "outputId": "527275b4-076e-4b22-945c-e41a59188116"
+   },
+   "outputs": [],
+   "source": [
+    "!pip install \"cassio>=0.1.7\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b4ef96d44341cd84",
+   "metadata": {
+    "collapsed": false,
+    "id": "b4ef96d44341cd84"
+   },
+   "source": [
+    "Get the [connection secrets](https://docs.datastax.com/en/astra/astra-db-vector/get-started/quickstart.html).\n",
+    "\n",
+    "Initialize cassio:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cb2cef097277c32e",
+   "metadata": {
+    "id": "cb2cef097277c32e",
+    "outputId": "4c3d05a0-319a-44a0-8ec3-0a9c78453132"
+   },
+   "outputs": [],
+   "source": [
+    "import cassio\n",
+    "\n",
+    "cassio.init(\n",
+    "    database_id=\"Your database ID\",\n",
+    "    token=\"Your application token\",\n",
+    "    keyspace=\"Your key space\",\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e1e51444877f45eb",
+   "metadata": {
+    "collapsed": false,
+    "id": "e1e51444877f45eb"
+   },
+   "source": [
+    "Create the Cassandra VectorStore with a standard [index analyzer](https://docs.datastax.com/en/astra/astra-db-vector/cql/use-analyzers-with-cql.html). The index analyzer is needed to enable term matching."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7345de3c",
+   "metadata": {
+    "id": "7345de3c",
+    "outputId": "d38bcee0-0134-4ac6-8d35-afcce282481b"
+   },
+   "outputs": [],
+   "source": [
+    "from cassio.table.cql import STANDARD_ANALYZER\n",
+    "from langchain_community.vectorstores import Cassandra\n",
+    "from langchain_openai import OpenAIEmbeddings\n",
+    "\n",
+    "embeddings = OpenAIEmbeddings()\n",
+    "vectorstore = Cassandra(\n",
+    "    embedding=embeddings,\n",
+    "    table_name=\"test_hybrid\",\n",
+    "    body_index_options=[STANDARD_ANALYZER],\n",
+    "    session=None,\n",
+    "    keyspace=None,\n",
+    ")\n",
+    "\n",
+    "vectorstore.add_texts(\n",
+    "    [\n",
+    "        \"In 2023, I visited Paris\",\n",
+    "        \"In 2022, I visited New York\",\n",
+    "        \"In 2021, I visited New Orleans\",\n",
+    "    ]\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "73887f23bbab978c",
+   "metadata": {
+    "collapsed": false,
+    "id": "73887f23bbab978c"
+   },
+   "source": [
+    "If we do a standard similarity search, we get all the documents:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3c2a39fa",
+   "metadata": {
+    "id": "3c2a39fa",
+    "outputId": "5290085b-896c-4c81-9b40-c315331b7009"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[Document(page_content='In 2022, I visited New York'),\n",
+       "Document(page_content='In 2023, I visited Paris'),\n",
+       "Document(page_content='In 2021, I visited New Orleans')]"
+      ]
+     },
+     "execution_count": null,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "vectorstore.as_retriever().invoke(\"What city did I visit last?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "78d4c3c79e67d8c3",
+   "metadata": {
+    "collapsed": false,
+    "id": "78d4c3c79e67d8c3"
+   },
+   "source": [
+    "The Astra DB vectorstore `body_search` argument can be used to filter the search on the term `new`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "56393baa",
+   "metadata": {
+    "id": "56393baa",
+    "outputId": "d1c939f3-342f-4df4-94a3-d25429b5a25e"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[Document(page_content='In 2022, I visited New York'),\n",
+       "Document(page_content='In 2021, I visited New Orleans')]"
+      ]
+     },
+     "execution_count": null,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "vectorstore.as_retriever(search_kwargs={\"body_search\": \"new\"}).invoke(\n",
+    "    \"What city did I visit last?\"\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "88ae97ed",
+   "metadata": {
+    "id": "88ae97ed"
+   },
+   "source": [
+    "We can now create the chain that we will use for question answering."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "62707b4f",
+   "metadata": {
+    "id": "62707b4f"
+   },
+   "outputs": [],
+   "source": [
+    "from langchain_core.output_parsers import StrOutputParser\n",
+    "from langchain_core.prompts import ChatPromptTemplate\n",
+    "from langchain_core.runnables import (\n",
+    "    ConfigurableField,\n",
+    "    RunnablePassthrough,\n",
+    ")\n",
+    "from langchain_openai import ChatOpenAI"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b6778ffa",
+   "metadata": {
+    "id": "b6778ffa"
+   },
+   "source": [
+    "This is a basic question-answering chain setup."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "44a865f6",
+   "metadata": {
+    "id": "44a865f6"
+   },
+   "outputs": [],
+   "source": [
+    "template = \"\"\"Answer the question based only on the following context:\n",
+    "{context}\n",
+    "Question: {question}\n",
+    "\"\"\"\n",
+    "prompt = ChatPromptTemplate.from_template(template)\n",
+    "\n",
+    "model = ChatOpenAI()\n",
+    "\n",
+    "retriever = vectorstore.as_retriever()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "72125166",
+   "metadata": {
+    "id": "72125166"
+   },
+   "source": [
+    "Here we mark the retriever as having a configurable field. All vectorstore retrievers have `search_kwargs` as a field. This is just a dictionary with vectorstore-specific fields."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "babbadff",
+   "metadata": {
+    "id": "babbadff"
+   },
+   "outputs": [],
+   "source": [
+    "configurable_retriever = retriever.configurable_fields(\n",
+    "    search_kwargs=ConfigurableField(\n",
+    "        id=\"search_kwargs\",\n",
+    "        name=\"Search Kwargs\",\n",
+    "        description=\"The search kwargs to use\",\n",
+    "    )\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2d481b70",
+   "metadata": {
+    "id": "2d481b70"
+   },
+   "source": [
+    "We can now create the chain using our configurable retriever."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "210b0446",
+   "metadata": {
+    "id": "210b0446"
+   },
+   "outputs": [],
+   "source": [
+    "chain = (\n",
+    "    {\"context\": configurable_retriever, \"question\": RunnablePassthrough()}\n",
+    "    | prompt\n",
+    "    | model\n",
+    "    | StrOutputParser()\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a38037b2",
+   "metadata": {
+    "id": "a38037b2",
+    "outputId": "1ea14996-5965-4a5e-9678-b9c35ce5c6de"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Paris"
+      ]
+     },
+     "execution_count": null,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "chain.invoke(\"What city did I visit last?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7f6458c3",
+   "metadata": {
+    "id": "7f6458c3"
+   },
+   "source": [
+    "We can now invoke the chain with configurable options. `search_kwargs` is the id of the configurable field. The value is the search kwargs to use for Astra DB."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9gYLqBTH8BFz",
+   "metadata": {
+    "id": "9gYLqBTH8BFz",
+    "outputId": "4358a2e6-f306-48f1-dd5c-781ac8a33e89"
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "New York"
+      ]
+     },
+     "execution_count": null,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "chain.invoke(\n",
+    "    \"What city did I visit last?\",\n",
+    "    config={\"configurable\": {\"search_kwargs\": {\"body_search\": \"new\"}}},\n",
+    ")"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.1"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
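As a reading aid for the hybrid.ipynb notebook added above, here is a minimal, dependency-free sketch of the idea it demonstrates: rank by vector similarity, then optionally narrow the results with a term filter supplied at call time. It is not how Cassandra or Astra DB implement `body_search`. The function `toy_hybrid_search`, the `DOCS` list, and the two-dimensional vectors are all hypothetical and exist only for illustration.

```python
import math

# Toy corpus mirroring the notebook's three texts.
# The 2-d vectors are invented for illustration, not real embeddings.
DOCS = [
    ("In 2023, I visited Paris", [0.9, 0.1]),
    ("In 2022, I visited New York", [0.8, 0.3]),
    ("In 2021, I visited New Orleans", [0.7, 0.4]),
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def toy_hybrid_search(query_vector, k=4, body_search=None):
    """Rank documents by cosine similarity; if body_search is given,
    keep only documents that contain that term as a whole word."""
    candidates = DOCS
    if body_search is not None:
        term = body_search.lower()
        candidates = [(t, v) for t, v in DOCS if term in t.lower().split()]
    ranked = sorted(
        candidates, key=lambda d: cosine(query_vector, d[1]), reverse=True
    )
    return [text for text, _ in ranked[:k]]


# The same call, with and without runtime search kwargs
# (analogous to passing search_kwargs through the chain's config).
query = [1.0, 0.0]
all_hits = toy_hybrid_search(query)
filtered_hits = toy_hybrid_search(query, **{"body_search": "new"})
```

The point of the sketch is the shape of the call: the term filter is just an extra keyword argument, which is why the notebook can expose `search_kwargs` as a configurable field and supply `{"body_search": "new"}` per invocation.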
--- a/docs/docs/how_to/index.mdx
+++ b/docs/docs/how_to/index.mdx
@@ -3,170 +3,180 @@ sidebar_position: 0
 sidebar_class_name: hidden
 ---

-# How-to Guides
+# How-to guides

-Here you’ll find short answers to “How do I….?” types of questions. 
-These how-to guides don’t cover topics in depth – you’ll find that material in the [Tutorials](/docs/tutorials) and the [API Reference](https://api.python.langchain.com/en/latest/). 
-However, these guides will help you quickly accomplish common tasks.
+Here you’ll find answers to “How do I...?” types of questions.
+These guides are *goal-oriented* and *concrete*; they're meant to help you complete a specific task.
+For conceptual explanations see [Conceptual Guides](/docs/concepts/).
+For end-to-end walkthroughs see [Tutorials](/docs/tutorials).
+For comprehensive descriptions of every class and function see [API Reference](https://api.python.langchain.com/en/latest/).

-## Core Functionality
+## Installation

-This covers functionality that is core to using LangChain
+- [How to: install LangChain packages](/docs/how_to/installation/)

- [How to return structured data from an LLM](/docs/how_to/structured_output/)
- [How to use a chat model to call tools](/docs/how_to/tool_calling/)
- [How to stream](/docs/how_to/streaming)
- [How to debug your LLM apps](/docs/how_to/debugging/)
+## Key features
+
+This highlights functionality that is core to using LangChain.
+
+- [How to: return structured data from an LLM](/docs/how_to/structured_output/)
+- [How to: use a chat model to call tools](/docs/how_to/tool_calling/)
+- [How to: stream runnables](/docs/how_to/streaming)
+- [How to: debug your LLM apps](/docs/how_to/debugging/)

 ## LangChain Expression Language (LCEL)

-LangChain Expression Language a way to create arbitrary custom chains.
+LangChain Expression Language is a way to create arbitrary custom chains. It is built on the Runnable protocol.

- [How to combine multiple runnables into a chain](/docs/how_to/sequence)
- [How to invoke runnables in parallel](/docs/how_to/parallel/)
- [How to attach runtime arguments to a runnable](/docs/how_to/binding/)
- [How to run custom functions](/docs/how_to/functions)
- [How to pass through arguments from one step to the next](/docs/how_to/passthrough)
- [How to add values to a chain's state](/docs/how_to/assign)
- [How to configure a chain at runtime](/docs/how_to/configure)
- [How to add message history](/docs/how_to/message_history)
- [How to route execution within a chain](/docs/how_to/routing)
- [How to inspect your runnables](/docs/how_to/inspect)
- [How to add fallbacks](/docs/how_to/fallbacks)
+- [How to: chain runnables](/docs/how_to/sequence)
+- [How to: stream runnables](/docs/how_to/streaming)
+- [How to: invoke runnables in parallel](/docs/how_to/parallel/)
+- [How to: attach runtime arguments to a runnable](/docs/how_to/binding/)
+- [How to: run custom functions](/docs/how_to/functions)
+- [How to: pass through arguments from one step to the next](/docs/how_to/passthrough)
+- [How to: add values to a chain's state](/docs/how_to/assign)
+- [How to: configure a chain at runtime](/docs/how_to/configure)
+- [How to: add message history](/docs/how_to/message_history)
+- [How to: route execution within a chain](/docs/how_to/routing)
+- [How to: inspect runnables](/docs/how_to/inspect)
+- [How to: add fallbacks](/docs/how_to/fallbacks)

 ## Components

 These are the core building blocks you can use when building applications.

-### Prompt Templates
+### Prompt templates

 Prompt Templates are responsible for formatting user input into a format that can be passed to a language model.

- [How to use few shot examples](/docs/how_to/few_shot_examples)
- [How to use few shot examples in chat models](/docs/how_to/few_shot_examples_chat/)
- [How to partially format prompt templates](/docs/how_to/prompts_partial)
- [How to compose prompts together](/docs/how_to/prompts_composition)
+- [How to: use few shot examples](/docs/how_to/few_shot_examples)
+- [How to: use few shot examples in chat models](/docs/how_to/few_shot_examples_chat/)
+- [How to: partially format prompt templates](/docs/how_to/prompts_partial)
+- [How to: compose prompts together](/docs/how_to/prompts_composition)

-### Example Selectors
+### Example selectors

 Example Selectors are responsible for selecting the correct few shot examples to pass to the prompt.

- [How to use example selectors](/docs/how_to/example_selectors)
- [How to select examples by length](/docs/how_to/example_selectors_length_based)
- [How to select examples by semantic similarity](/docs/how_to/example_selectors_similarity)
- [How to select examples by semantic ngram overlap](/docs/how_to/example_selectors_ngram)
- [How to select examples by maximal marginal relevance](/docs/how_to/example_selectors_mmr)
+- [How to: use example selectors](/docs/how_to/example_selectors)
+- [How to: select examples by length](/docs/how_to/example_selectors_length_based)
+- [How to: select examples by semantic similarity](/docs/how_to/example_selectors_similarity)
+- [How to: select examples by semantic ngram overlap](/docs/how_to/example_selectors_ngram)
+- [How to: select examples by maximal marginal relevance](/docs/how_to/example_selectors_mmr)

-### Chat Models
+### Chat models

 Chat Models are newer forms of language models that take messages in and output a message.

- [How to do function/tool calling](/docs/how_to/tool_calling)
- [How to get models to return structured output](/docs/how_to/structured_output)
- [How to cache model responses](/docs/how_to/chat_model_caching)
- [How to get log probabilities from model calls](/docs/how_to/logprobs)
- [How to create a custom chat model class](/docs/how_to/custom_chat_model)
- [How to stream a response back](/docs/how_to/chat_streaming)
- [How to track token usage](/docs/how_to/chat_token_usage_tracking)
- [How to track response metadata across providers](/docs/how_to/response_metadata)
+- [How to: do function/tool calling](/docs/how_to/tool_calling)
+- [How to: get models to return structured output](/docs/how_to/structured_output)
+- [How to: cache model responses](/docs/how_to/chat_model_caching)
+- [How to: get log probabilities](/docs/how_to/logprobs)
+- [How to: create a custom chat model class](/docs/how_to/custom_chat_model)
+- [How to: stream a response back](/docs/how_to/chat_streaming)
+- [How to: track token usage](/docs/how_to/chat_token_usage_tracking)
+- [How to: track response metadata across providers](/docs/how_to/response_metadata)

 ### LLMs

 What LangChain calls LLMs are older forms of language models that take a string in and output a string.

- [How to cache model responses](/docs/how_to/llm_caching)
- [How to create a custom LLM class](/docs/how_to/custom_llm)
- [How to stream a response back](/docs/how_to/streaming_llm)
- [How to track token usage](/docs/how_to/llm_token_usage_tracking)
- [How to work with local LLMs](/docs/how_to/local_llms)
+- [How to: cache model responses](/docs/how_to/llm_caching)
+- [How to: create a custom LLM class](/docs/how_to/custom_llm)
+- [How to: stream a response back](/docs/how_to/streaming_llm)
+- [How to: track token usage](/docs/how_to/llm_token_usage_tracking)
+- [How to: work with local LLMs](/docs/how_to/local_llms)

-### Output Parsers
+### Output parsers

 Output Parsers are responsible for taking the output of an LLM and parsing it into a more structured format.

- [How to use output parsers to parse an LLM response into structured format](/docs/how_to/output_parser_structured)
- [How to parse JSON output](/docs/how_to/output_parser_json)
- [How to parse XML output](/docs/how_to/output_parser_xml)
- [How to parse YAML output](/docs/how_to/output_parser_yaml)
- [How to retry when output parsing errors occur](/docs/how_to/output_parser_retry)
- [How to try to fix errors in output parsing](/docs/how_to/output_parser_fixing)
- [How to write a custom output parser class](/docs/how_to/output_parser_custom)
+- [How to: use output parsers to parse an LLM response into structured format](/docs/how_to/output_parser_structured)
+- [How to: parse JSON output](/docs/how_to/output_parser_json)
+- [How to: parse XML output](/docs/how_to/output_parser_xml)
+- [How to: parse YAML output](/docs/how_to/output_parser_yaml)
+- [How to: retry when output parsing errors occur](/docs/how_to/output_parser_retry)
+- [How to: try to fix errors in output parsing](/docs/how_to/output_parser_fixing)
+- [How to: write a custom output parser class](/docs/how_to/output_parser_custom)

-### Document Loaders
+### Document loaders

 Document Loaders are responsible for loading documents from a variety of sources.

- [How to load CSV data](/docs/how_to/document_loader_csv)
- [How to load data from a directory](/docs/how_to/document_loader_directory)
- [How to load HTML data](/docs/how_to/document_loader_html)
- [How to load JSON data](/docs/how_to/document_loader_json)
- [How to load Markdown data](/docs/how_to/document_loader_markdown)
- [How to load Microsoft Office data](/docs/how_to/document_loader_office_file)
- [How to load PDF files](/docs/how_to/document_loader_pdf)
- [How to write a custom document loader](/docs/how_to/document_loader_custom)
+- [How to: load CSV data](/docs/how_to/document_loader_csv)
+- [How to: load data from a directory](/docs/how_to/document_loader_directory)
+- [How to: load HTML data](/docs/how_to/document_loader_html)
+- [How to: load JSON data](/docs/how_to/document_loader_json)
+- [How to: load Markdown data](/docs/how_to/document_loader_markdown)
+- [How to: load Microsoft Office data](/docs/how_to/document_loader_office_file)
+- [How to: load PDF files](/docs/how_to/document_loader_pdf)
+- [How to: write a custom document loader](/docs/how_to/document_loader_custom)

-### Text Splitters
+### Text splitters

 Text Splitters take a document and split it into chunks that can be used for retrieval.

- [How to recursively split text](/docs/how_to/recursive_text_splitter)
- [How to split by HTML headers](/docs/how_to/HTML_header_metadata_splitter)
- [How to split by HTML sections](/docs/how_to/HTML_section_aware_splitter)
- [How to split by character](/docs/how_to/character_text_splitter)
- [How to split code](/docs/how_to/code_splitter)
- [How to split Markdown by headers](/docs/how_to/markdown_header_metadata_splitter)
- [How to recursively split JSON](/docs/how_to/recursive_json_splitter)
- [How to split text into semantic chunks](/docs/how_to/semantic-chunker)
- [How to split by tokens](/docs/how_to/split_by_token)
+- [How to: recursively split text](/docs/how_to/recursive_text_splitter)
+- [How to: split by HTML headers](/docs/how_to/HTML_header_metadata_splitter)
+- [How to: split by HTML sections](/docs/how_to/HTML_section_aware_splitter)
+- [How to: split by character](/docs/how_to/character_text_splitter)
+- [How to: split code](/docs/how_to/code_splitter)
+- [How to: split Markdown by headers](/docs/how_to/markdown_header_metadata_splitter)
+- [How to: recursively split JSON](/docs/how_to/recursive_json_splitter)
+- [How to: split text into semantic chunks](/docs/how_to/semantic-chunker)
+- [How to: split by tokens](/docs/how_to/split_by_token)

-### Embedding Models
+### Embedding models

 Embedding Models take a piece of text and create a numerical representation of it.

- [How to embed text data](/docs/how_to/embed_text)
- [How to cache embedding results](/docs/how_to/caching_embeddings)
+- [How to: embed text data](/docs/how_to/embed_text)
+- [How to: cache embedding results](/docs/how_to/caching_embeddings)

-### Vector Stores
+### Vector stores

-Vector Stores are databases that can efficiently store and retrieve embeddings.
+Vector stores are databases that can efficiently store and retrieve embeddings.

- [How to use a vector store to retrieve data](/docs/how_to/vectorstores)
+- [How to: use a vector store to retrieve data](/docs/how_to/vectorstores)

 ### Retrievers

 Retrievers are responsible for taking a query and returning relevant documents.

- [How use a vector store to retrieve data](/docs/how_to/vectorstore_retriever)
- [How to generate multiple queries to retrieve data for](/docs/how_to/MultiQueryRetriever)
- [How to use contextual compression to compress the data retrieved](/docs/how_to/contextual_compression)
- [How to write a custom retriever class](/docs/how_to/custom_retriever)
- [How to combine the results from multiple retrievers](/docs/how_to/ensemble_retriever)
- [How to reorder retrieved results to put most relevant documents not in the middle](/docs/how_to/long_context_reorder)
- [How to generate multiple embeddings per document](/docs/how_to/multi_vector)
- [How to retrieve the whole document for a chunk](/docs/how_to/parent_document_retriever)
- [How to generate metadata filters](/docs/how_to/self_query)
- [How to create a time-weighted retriever](/docs/how_to/time_weighted_vectorstore)
+- [How to: use a vector store to retrieve data](/docs/how_to/vectorstore_retriever)
+- [How to: generate multiple queries to retrieve data for](/docs/how_to/MultiQueryRetriever)
+- [How to: use contextual compression to compress the data retrieved](/docs/how_to/contextual_compression)
+- [How to: write a custom retriever class](/docs/how_to/custom_retriever)
+- [How to: add similarity scores to retriever results](/docs/how_to/add_scores_retriever)
+- [How to: combine the results from multiple retrievers](/docs/how_to/ensemble_retriever)
+- [How to: reorder retrieved results to put most relevant documents not in the middle](/docs/how_to/long_context_reorder)
+- [How to: generate multiple embeddings per document](/docs/how_to/multi_vector)
+- [How to: retrieve the whole document for a chunk](/docs/how_to/parent_document_retriever)
+- [How to: generate metadata filters](/docs/how_to/self_query)
+- [How to: create a time-weighted retriever](/docs/how_to/time_weighted_vectorstore)
+- [How to: use hybrid vector and keyword retrieval](/docs/how_to/hybrid)

 ### Indexing

 Indexing is the process of keeping your vectorstore in-sync with the underlying data source.

- [How to reindex data to keep your vectorstore in-sync with the underlying data source](/docs/how_to/indexing)
+- [How to: reindex data to keep your vectorstore in-sync with the underlying data source](/docs/how_to/indexing)

 ### Tools

 LangChain Tools contain a description of the tool (to pass to the language model) as well as the implementation of the function to call.

- [How to use LangChain tools](/docs/how_to/tools)
- [How to use a chat model to call tools](/docs/how_to/tool_calling/)
- [How to use LangChain toolkits](/docs/how_to/toolkits)
- [How to define a custom tool](/docs/how_to/custom_tools)
- [How to convert LangChain tools to OpenAI functions](/docs/how_to/tools_as_openai_functions)
- [How to use tools without function calling](/docs/how_to/tools_prompting)
- [How to let the LLM choose between multiple tools](/docs/how_to/tools_multiple)
- [How to add a human in the loop to tool usage](/docs/how_to/tools_human)
- [How to do parallel tool use](/docs/how_to/tools_parallel)
- [How to handle errors when calling tools](/docs/how_to/tools_error)
+- [How to: use LangChain tools](/docs/how_to/tools)
+- [How to: use a chat model to call tools](/docs/how_to/tool_calling/)
+- [How to: use LangChain toolkits](/docs/how_to/toolkits)
+- [How to: define a custom tool](/docs/how_to/custom_tools)
+- [How to: convert LangChain tools to OpenAI functions](/docs/how_to/tools_as_openai_functions)
+- [How to: use tools without function calling](/docs/how_to/tools_prompting)
+- [How to: let the LLM choose between multiple tools](/docs/how_to/tools_multiple)
+- [How to: add a human in the loop to tool usage](/docs/how_to/tools_human)
+- [How to: do parallel tool use](/docs/how_to/tools_parallel)
+- [How to: handle errors when calling tools](/docs/how_to/tools_error)
+- [How to: call tools using multi-modal data](/docs/how_to/tool_calls_multi_modal)

 ### Agents

@@ -176,25 +186,22 @@ For in depth how-to guides for agents, please check out [LangGraph](https://gith

 :::

- [How to use legacy LangChain Agents (AgentExecutor)](/docs/how_to/agent_executor)
- [How to migrate from legacy LangChain agents to LangGraph](/docs/how_to/migrate_agent)
+- [How to: use legacy LangChain Agents (AgentExecutor)](/docs/how_to/agent_executor)
+- [How to: migrate from legacy LangChain agents to LangGraph](/docs/how_to/migrate_agent)

 ### Custom

 All LangChain components can easily be extended to support your own versions.

- [How to create a custom chat model class](/docs/how_to/custom_chat_model)
- [How to create a custom LLM class](/docs/how_to/custom_llm)
- [How to write a custom retriever class](/docs/how_to/custom_retriever)
- [How to write a custom document loader](/docs/how_to/document_loader_custom)
- [How to write a custom output parser class](/docs/how_to/output_parser_custom)
-
- [How to define a custom tool](/docs/how_to/custom_tools)
+- [How to: create a custom chat model class](/docs/how_to/custom_chat_model)
+- [How to: create a custom LLM class](/docs/how_to/custom_llm)
+- [How to: write a custom retriever class](/docs/how_to/custom_retriever)
+- [How to: write a custom document loader](/docs/how_to/document_loader_custom)
+- [How to: write a custom output parser class](/docs/how_to/output_parser_custom)
+- [How to: define a custom tool](/docs/how_to/custom_tools)


-
-
-## Use Cases
+## Use cases

 These guides cover use-case specific details.

@@ -202,54 +209,54 @@ These guides cover use-case specific details.

 Retrieval Augmented Generation (RAG) is a way to connect LLMs to external sources of data.

- [How to add chat history](/docs/how_to/qa_chat_history_how_to/)
- [How to stream](/docs/how_to/qa_streaming/)
- [How to return sources](/docs/how_to/qa_sources/)
- [How to return citations](/docs/how_to/qa_citations/)
- [How to do per-user retrieval](/docs/how_to/qa_per_user/)
+- [How to: add chat history](/docs/how_to/qa_chat_history_how_to/)
+- [How to: stream](/docs/how_to/qa_streaming/)
+- [How to: return sources](/docs/how_to/qa_sources/)
+- [How to: return citations](/docs/how_to/qa_citations/)
+- [How to: do per-user retrieval](/docs/how_to/qa_per_user/)


 ### Extraction

 Extraction is when you use LLMs to extract structured information from unstructured text.

- [How to use reference examples](/docs/how_to/extraction_examples/)
- [How to handle long text](/docs/how_to/extraction_long_text/)
- [How to do extraction without using function calling](/docs/how_to/extraction_parse)
+- [How to: use reference examples](/docs/how_to/extraction_examples/)
+- [How to: handle long text](/docs/how_to/extraction_long_text/)
+- [How to: do extraction without using function calling](/docs/how_to/extraction_parse)

 ### Chatbots

 Chatbots involve using an LLM to have a conversation.

- [How to manage memory](/docs/how_to/chatbots_memory)
- [How to do retrieval](/docs/how_to/chatbots_retrieval)
- [How to use tools](/docs/how_to/chatbots_tools)
+- [How to: manage memory](/docs/how_to/chatbots_memory)
+- [How to: do retrieval](/docs/how_to/chatbots_retrieval)
+- [How to: use tools](/docs/how_to/chatbots_tools)

-### Query Analysis
+### Query analysis

 Query Analysis is the task of using an LLM to generate a query to send to a retriever.

- [How to add examples to the prompt](/docs/how_to/query_few_shot)
- [How to handle cases where no queries are generated](/docs/how_to/query_no_queries)
- [How to handle multiple queries](/docs/how_to/query_multiple_queries)
- [How to handle multiple retrievers](/docs/how_to/query_multiple_retrievers)
- [How to construct filters](/docs/how_to/query_constructing_filters)
- [How to deal with high cardinality categorical variables](/docs/how_to/query_high_cardinality)
+- [How to: add examples to the prompt](/docs/how_to/query_few_shot)
+- [How to: handle cases where no queries are generated](/docs/how_to/query_no_queries)
+- [How to: handle multiple queries](/docs/how_to/query_multiple_queries)
+- [How to: handle multiple retrievers](/docs/how_to/query_multiple_retrievers)
+- [How to: construct filters](/docs/how_to/query_constructing_filters)
+- [How to: deal with high cardinality categorical variables](/docs/how_to/query_high_cardinality)

 ### Q&A over SQL + CSV

 You can use LLMs to do question answering over tabular data.

- [How to use prompting to improve results](/docs/how_to/sql_prompting)
- [How to do query validation](/docs/how_to/sql_query_checking)
- [How to deal with large databases](/docs/how_to/sql_large_db)
- [How to deal with CSV files](/docs/how_to/sql_csv)
+- [How to: use prompting to improve results](/docs/how_to/sql_prompting)
+- [How to: do query validation](/docs/how_to/sql_query_checking)
+- [How to: deal with large databases](/docs/how_to/sql_large_db)
+- [How to: deal with CSV files](/docs/how_to/sql_csv)

-### Q&A over Graph Databases
+### Q&A over graph databases

 You can use an LLM to do question answering over graph databases.

- [How to map values to a database](/docs/how_to/graph_mapping)
- [How to add a semantic layer over the database](/docs/how_to/graph_semantic)
- [How to improve results with prompting](/docs/how_to/graph_prompting)
- [How to construct knowledge graphs](/docs/how_to/graph_constructing)
+- [How to: map values to a database](/docs/how_to/graph_mapping)
+- [How to: add a semantic layer over the database](/docs/how_to/graph_semantic)
+- [How to: improve results with prompting](/docs/how_to/graph_prompting)
+- [How to: construct knowledge graphs](/docs/how_to/graph_constructing)
--- a/docs/docs/how_to/inspect.ipynb
+++ b/docs/docs/how_to/inspect.ipynb
@@ -5,21 +5,20 @@
   "id": "8c5eb99a",
   "metadata": {},
   "source": [
-    "# How to inspect your runnables\n",
+    "# How to inspect runnables\n",
+    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
+    "- [Chaining runnables](/docs/how_to/sequence/)\n",
+    "\n",
+    ":::\n",
    "\n",
    "Once you create a runnable with [LangChain Expression Language](/docs/concepts/#langchain-expression-language), you may often want to inspect it to get a better sense for what is going on. This notebook covers some methods for doing so.\n",
    "\n",
    "This guide shows some ways you can programmatically introspect the internal steps of chains. If you are instead interested in debugging issues in your chain, see [this section](/docs/how_to/debugging) instead.\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
-    "- [Chaining runnables](/docs/how_to/sequence/)\n",
-    "`} />\n",
-    "```\n",
-    "\n",
    "First, let's create an example chain. We will create one that does retrieval:"
   ]
  },
@@ -222,7 +221,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/installation.mdx
+++ b/docs/docs/how_to/installation.mdx
--- a/docs/docs/how_to/logprobs.ipynb
+++ b/docs/docs/how_to/logprobs.ipynb
@@ -5,17 +5,16 @@
   "id": "78b45321-7740-4399-b2ad-459811131de3",
   "metadata": {},
   "source": [
-    "# How to get log probabilities from model calls\n",
+    "# How to get log probabilities\n",
    "\n",
-    "Certain chat models can be configured to return token-level log probabilities representing the likelihood of a given token. This guide walks through how to get this information in LangChain.\n",
+    ":::info Prerequisites\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
+    "This guide assumes familiarity with the following concepts:\n",
    "- [Chat models](/docs/concepts/#chat-models)\n",
-    "`} />\n",
-    "```"
+    "\n",
+    ":::\n",
+    "\n",
+    "Certain chat models can be configured to return token-level log probabilities representing the likelihood of a given token. This guide walks through how to get this information in LangChain."
   ]
  },
  {
@@ -170,7 +169,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/long_context_reorder.ipynb
+++ b/docs/docs/how_to/long_context_reorder.ipynb
@@ -21,7 +21,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "%pip install --upgrade --quiet  sentence-transformers langchain-chroma langchain langchain-openai > /dev/null"
+    "%pip install --upgrade --quiet  sentence-transformers langchain-chroma langchain langchain-openai langchain-huggingface > /dev/null"
   ]
  },
  {
@@ -57,7 +57,7 @@
    "from langchain_community.document_transformers import (\n",
    "    LongContextReorder,\n",
    ")\n",
-    "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "from langchain_openai import OpenAI\n",
    "\n",
    "# Get embeddings.\n",
--- a/docs/docs/how_to/message_history.ipynb
+++ b/docs/docs/how_to/message_history.ipynb
@@ -7,6 +7,17 @@
   "source": [
    "# How to add message history\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
+    "- [Chaining runnables](/docs/how_to/sequence/)\n",
+    "- [Configuring chain parameters at runtime](/docs/how_to/configure)\n",
+    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
+    "- [Chat Messages](/docs/concepts/#message-types)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "Passing conversation state into and out a chain is vital when building a chatbot. The [`RunnableWithMessageHistory`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html#langchain_core.runnables.history.RunnableWithMessageHistory) class lets us add message history to certain types of chains. It wraps another Runnable and manages the chat message history for it.\n",
    "\n",
    "Specifically, it can be used for any Runnable that takes as input one of:\n",
@@ -21,18 +32,6 @@
    "* a sequence of `BaseMessage`\n",
    "* a dict with a key that contains a sequence of `BaseMessage`\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
-    "- [Chaining runnables](/docs/how_to/sequence/)\n",
-    "- [Configuring chain parameters at runtime](/docs/how_to/configure)\n",
-    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
-    "- [Chat Messages](/docs/concepts/#message-types)\n",
-    "`} />\n",
-    "```\n",
-    "\n",
    "Let's take a look at some examples to see how it works. First we construct a runnable (which here accepts a dict as input and returns a message as output):\n",
    "\n",
    "```{=mdx}\n",
@@ -667,7 +666,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/migrate_agent.ipynb
+++ b/docs/docs/how_to/migrate_agent.ipynb
@@ -8,8 +8,23 @@
    "# How to migrate from legacy LangChain agents to LangGraph\n",
    "\n",
    "Here we focus on how to move from legacy LangChain agents to LangGraph agents.\n",
-    "LangChain agents (the AgentExecutor in particular) have multiple configuration parameters.\n",
-    "In this notebook we will show how those parameters map to the LangGraph `chat_agent_executor`."
+    "LangChain agents (the [AgentExecutor](https://api.python.langchain.com/en/latest/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor) in particular) have multiple configuration parameters.\n",
+    "In this notebook we will show how those parameters map to the LangGraph [react agent executor](https://langchain-ai.github.io/langgraph/reference/prebuilt/#create_react_agent).\n",
+    "\n",
+    "#### Prerequisites\n",
+    "\n",
+    "This how-to guide uses OpenAI as the LLM. Install the dependencies below to run it."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "662fac50",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%%capture --no-stderr\n",
+    "%pip install -U langchain-openai langchain langgraph"
   ]
  },
  {
@@ -24,7 +39,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 20,
+   "execution_count": 6,
   "id": "1e425fea-2796-4b99-bee6-9a6ffe73f756",
   "metadata": {},
   "outputs": [],
@@ -32,7 +47,7 @@
    "from langchain_core.tools import tool\n",
    "from langchain_openai import ChatOpenAI\n",
    "\n",
-    "model = ChatOpenAI()\n",
+    "model = ChatOpenAI(model=\"gpt-4o\")\n",
    "\n",
    "\n",
    "@tool\n",
@@ -52,12 +67,12 @@
   "id": "af002033-fe51-4d14-b47c-3e9b483c8395",
   "metadata": {},
   "source": [
-    "For AgentExecutor, we define a prompt with a placeholder for the agent's scratchpad. The agent can be invoked as follows:"
+    "For the LangChain [AgentExecutor](https://api.python.langchain.com/en/latest/agents/langchain.agents.agent.AgentExecutor.html#langchain.agents.agent.AgentExecutor), we define a prompt with a placeholder for the agent's scratchpad. The agent can be invoked as follows:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": 15,
   "id": "03ea357c-9c36-4464-b2cc-27bd150e1554",
   "metadata": {},
   "outputs": [
@@ -68,20 +83,21 @@
       " 'output': 'The value of `magic_function(3)` is 5.'}"
      ]
     },
-     "execution_count": 21,
+     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from langchain.agents import AgentExecutor, create_tool_calling_agent\n",
-    "from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
+    "from langchain_core.prompts import ChatPromptTemplate\n",
    "\n",
    "prompt = ChatPromptTemplate.from_messages(\n",
    "    [\n",
    "        (\"system\", \"You are a helpful assistant\"),\n",
    "        (\"human\", \"{input}\"),\n",
-    "        MessagesPlaceholder(\"agent_scratchpad\"),\n",
+    "        # Placeholders fill up a **list** of messages\n",
+    "        (\"placeholder\", \"{agent_scratchpad}\"),\n",
    "    ]\n",
    ")\n",
    "\n",
@@ -97,13 +113,13 @@
   "id": "94205f3b-fd2b-4fd7-af69-0a3fc313dc88",
   "metadata": {},
   "source": [
-    "LangGraph's `chat_agent_executor` manages a state that is defined by a list of messages. It will continue to process the list until there are no tool calls in the agent's output. To kick it off, we input a list of messages. The output will contain the entire state of the graph-- in this case, the conversation history.\n",
+    "LangGraph's [react agent executor](https://langchain-ai.github.io/langgraph/reference/prebuilt/#create_react_agent) manages a state that is defined by a list of messages. It will continue to process the list until there are no tool calls in the agent's output. To kick it off, we input a list of messages. The output will contain the entire state of the graph, which in this case is the conversation history.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 22,
+   "execution_count": 16,
   "id": "53a3737a-d167-4255-89bf-20ac37f89a3e",
   "metadata": {},
   "outputs": [
@@ -111,18 +127,18 @@
     "data": {
      "text/plain": [
       "{'input': 'what is the value of magic_function(3)?',\n",
-       " 'output': 'The value of the magic function with input 3 is 5.'}"
+       " 'output': 'The value of `magic_function(3)` is 5.'}"
      ]
     },
-     "execution_count": 22,
+     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "from langgraph.prebuilt import chat_agent_executor\n",
+    "from langgraph.prebuilt import create_react_agent\n",
    "\n",
-    "app = chat_agent_executor.create_tool_calling_executor(model, tools)\n",
+    "app = create_react_agent(model, tools)\n",
    "\n",
    "\n",
    "messages = app.invoke({\"messages\": [(\"human\", query)]})\n",
@@ -134,7 +150,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 23,
+   "execution_count": 17,
   "id": "74ecebe3-512e-409c-a661-bdd5b0a2b782",
   "metadata": {},
   "outputs": [
@@ -142,10 +158,10 @@
     "data": {
      "text/plain": [
       "{'input': 'Pardon?',\n",
-       " 'output': 'The value of the magic function with input 3 is 5.'}"
+       " 'output': 'The result of applying the `magic_function` to the input `3` is `5`.'}"
      ]
     },
-     "execution_count": 23,
+     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -171,7 +187,7 @@
    "\n",
    "With legacy LangChain agents you have to pass in a prompt template. You can use this to control the agent.\n",
    "\n",
-    "With LangGraph `chat_agent_executor`, by default there is no prompt. You can achieve similar control over the agent in a few ways:\n",
+    "With LangGraph [react agent executor](https://langchain-ai.github.io/langgraph/reference/prebuilt/#create_react_agent), by default there is no prompt. You can achieve similar control over the agent in a few ways:\n",
    "\n",
    "1. Pass in a system message as input\n",
    "2. Initialize the agent with a system message\n",
@@ -184,7 +200,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 24,
+   "execution_count": 18,
   "id": "a9a11ccd-75e2-4c11-844d-a34870b0ff91",
   "metadata": {},
   "outputs": [
@@ -195,7 +211,7 @@
       " 'output': 'El valor de `magic_function(3)` es 5.'}"
      ]
     },
-     "execution_count": 24,
+     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -205,7 +221,8 @@
    "    [\n",
    "        (\"system\", \"You are a helpful assistant. Respond only in Spanish.\"),\n",
    "        (\"human\", \"{input}\"),\n",
-    "        MessagesPlaceholder(\"agent_scratchpad\"),\n",
+    "        # Placeholders fill up a **list** of messages\n",
+    "        (\"placeholder\", \"{agent_scratchpad}\"),\n",
    "    ]\n",
    ")\n",
    "\n",
@@ -221,44 +238,27 @@
   "id": "bd5f5500-5ae4-4000-a9fd-8c5a2cc6404d",
   "metadata": {},
   "source": [
-    "Now, let's pass a custom system message to `chat_agent_executor`. This can either be a string or a LangChain SystemMessage."
+    "Now, let's pass a custom system message to the [react agent executor](https://langchain-ai.github.io/langgraph/reference/prebuilt/#create_react_agent). This can be either a string or a LangChain SystemMessage."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 26,
+   "execution_count": 14,
   "id": "a9486805-676a-4d19-a5c4-08b41b172989",
   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "{'input': 'what is the value of magic_function(3)?',\n",
-       " 'output': 'El valor de magic_function(3) es 5.'}"
-      ]
-     },
-     "execution_count": 26,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
   "source": [
    "from langchain_core.messages import SystemMessage\n",
+    "from langgraph.prebuilt import create_react_agent\n",
    "\n",
-    "system_message = \"Respond only in Spanish\"\n",
+    "system_message = \"You are a helpful assistant. Respond only in Spanish.\"\n",
    "# This could also be a SystemMessage object\n",
-    "# system_message = SystemMessage(content=\"Respond only in Spanish\")\n",
+    "# system_message = SystemMessage(content=\"You are a helpful assistant. Respond only in Spanish.\")\n",
    "\n",
-    "app = chat_agent_executor.create_tool_calling_executor(\n",
-    "    model, tools, messages_modifier=system_message\n",
-    ")\n",
+    "app = create_react_agent(model, tools, messages_modifier=system_message)\n",
    "\n",
    "\n",
-    "messages = app.invoke({\"messages\": [(\"human\", query)]})\n",
-    "{\n",
-    "    \"input\": query,\n",
-    "    \"output\": messages[\"messages\"][-1].content,\n",
-    "}"
+    "messages = app.invoke({\"messages\": [(\"user\", query)]})"
   ]
  },
  {
@@ -272,7 +272,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 27,
+   "execution_count": 21,
   "id": "d369ab45-0c82-45f4-9d3e-8efb8dd47e2c",
   "metadata": {},
   "outputs": [
@@ -280,24 +280,35 @@
     "data": {
      "text/plain": [
       "{'input': 'what is the value of magic_function(3)?',\n",
-       " 'output': 'El valor de magic_function(3) es 5.'}"
+       " 'output': 'El valor de magic_function(3) es 5. ¡Pandamonium!'}"
      ]
     },
-     "execution_count": 27,
+     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "def _modify_messages(messages):\n",
-    "    return [SystemMessage(content=\"Respond only in spanish\")] + messages\n",
+    "from langchain_core.messages import AnyMessage\n",
+    "from langgraph.prebuilt import create_react_agent\n",
    "\n",
-    "\n",
-    "app = chat_agent_executor.create_tool_calling_executor(\n",
-    "    model, tools, messages_modifier=_modify_messages\n",
+    "prompt = ChatPromptTemplate.from_messages(\n",
+    "    [\n",
+    "        (\"system\", \"You are a helpful assistant. Respond only in Spanish.\"),\n",
+    "        (\"placeholder\", \"{messages}\"),\n",
+    "    ]\n",
    ")\n",
    "\n",
    "\n",
+    "def _modify_messages(messages: list[AnyMessage]):\n",
+    "    return prompt.invoke({\"messages\": messages}).to_messages() + [\n",
+    "        (\"user\", \"Also say 'Pandamonium!' after the answer.\")\n",
+    "    ]\n",
+    "\n",
+    "\n",
+    "app = create_react_agent(model, tools, messages_modifier=_modify_messages)\n",
+    "\n",
+    "\n",
    "messages = app.invoke({\"messages\": [(\"human\", query)]})\n",
    "{\n",
    "    \"input\": query,\n",
@@ -317,7 +328,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 22,
   "id": "4eff44bc-a620-4c8a-97b1-268692a842bb",
   "metadata": {},
   "outputs": [
@@ -325,7 +336,7 @@
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "[(ToolAgentAction(tool='magic_function', tool_input={'input': 3}, log=\"\\nInvoking: `magic_function` with `{'input': 3}`\\n\\n\\n\", message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_qckwqZI7p2LGYhMnQI5r6qsL', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls'}, id='run-0602a2dd-c4d9-4050-b851-3e2b838c6773', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_qckwqZI7p2LGYhMnQI5r6qsL'}], tool_call_chunks=[{'name': 'magic_function', 'args': '{\"input\":3}', 'id': 'call_qckwqZI7p2LGYhMnQI5r6qsL', 'index': 0}])], tool_call_id='call_qckwqZI7p2LGYhMnQI5r6qsL'), 5)]\n"
+      "[(ToolAgentAction(tool='magic_function', tool_input={'input': 3}, log=\"\\nInvoking: `magic_function` with `{'input': 3}`\\n\\n\\n\", message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_lIjE9voYOCFAVoUXSDPQ5bFI', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls'}, id='run-7a23003a-ab50-4d7c-b14b-86129d1cacfe', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_lIjE9voYOCFAVoUXSDPQ5bFI'}], tool_call_chunks=[{'name': 'magic_function', 'args': '{\"input\":3}', 'id': 'call_lIjE9voYOCFAVoUXSDPQ5bFI', 'index': 0}])], tool_call_id='call_lIjE9voYOCFAVoUXSDPQ5bFI'), 5)]\n"
     ]
    }
   ],
@@ -340,34 +351,33 @@
   "id": "594f7567-302f-4fa8-85bb-025ac8322162",
   "metadata": {},
   "source": [
-    "By default the `chat_agent_executor` in LangGraph appends all messages to the central state. Therefore, it is easy to see any intermediate steps by just looking at the full state."
+    "By default the [react agent executor](https://langchain-ai.github.io/langgraph/reference/prebuilt/#create_react_agent) in LangGraph appends all messages to the central state. Therefore, it is easy to see any intermediate steps by just looking at the full state."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 23,
   "id": "4f4364ea-dffe-4d25-bdce-ef7d0020b880",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "{'messages': [HumanMessage(content='what is the value of magic_function(3)?', id='408451ee-d65b-498b-abf1-788aaadfbeff'),\n",
-       "  AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_eF7WussX7KgpGdoJFj6cWTxR', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 65, 'total_tokens': 79}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-a07e5d11-9319-4e27-85fb-253b75c5d7c3-0', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_eF7WussX7KgpGdoJFj6cWTxR'}]),\n",
-       "  ToolMessage(content='5', name='magic_function', id='35045a27-a301-474b-b321-5f93da671fb1', tool_call_id='call_eF7WussX7KgpGdoJFj6cWTxR'),\n",
-       "  AIMessage(content='The value of magic_function(3) is 5.', response_metadata={'token_usage': {'completion_tokens': 13, 'prompt_tokens': 88, 'total_tokens': 101}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'stop', 'logprobs': None}, id='run-18a36a26-2477-4fc6-be51-7a675a6e10e8-0')]}"
+       "{'messages': [HumanMessage(content='what is the value of magic_function(3)?', id='8c252eb2-9496-4ad0-b3ae-9ecb2f6c406e'),\n",
+       "  AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_xmBLOw2pRqB1aRTTiwqEEftW', 'function': {'arguments': '{\"input\":3}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 64, 'total_tokens': 78}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-2393b69c-7c52-4771-8bec-aca0e097fcc1-0', tool_calls=[{'name': 'magic_function', 'args': {'input': 3}, 'id': 'call_xmBLOw2pRqB1aRTTiwqEEftW'}]),\n",
+       "  ToolMessage(content='5', name='magic_function', id='bec0d0f9-bbaf-49fb-b0cb-46a658658f87', tool_call_id='call_xmBLOw2pRqB1aRTTiwqEEftW'),\n",
+       "  AIMessage(content='The value of `magic_function(3)` is 5.', response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 87, 'total_tokens': 101}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'stop', 'logprobs': None}, id='run-5904d36f-b2a4-4f55-b431-12c82992c92c-0')]}"
      ]
     },
-     "execution_count": 6,
+     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "from langgraph.prebuilt import chat_agent_executor\n",
-    "\n",
-    "app = chat_agent_executor.create_tool_calling_executor(model, tools)\n",
+    "from langgraph.prebuilt import create_react_agent\n",
    "\n",
+    "app = create_react_agent(model, tools=tools)\n",
    "\n",
    "messages = app.invoke({\"messages\": [(\"human\", query)]})\n",
    "\n",
@@ -390,7 +400,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 24,
   "id": "16f189a7-fc78-4cb5-aa16-a94ca06401a6",
   "metadata": {},
   "outputs": [],
@@ -406,7 +416,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 26,
   "id": "c96aefd7-6f6e-4670-aca6-1ac3d4e7871f",
   "metadata": {},
   "outputs": [
@@ -421,15 +431,7 @@
      "Invoking: `magic_function` with `{'input': '3'}`\n",
      "\n",
      "\n",
-      "\u001b[0m\u001b[36;1m\u001b[1;3mSorry, there was an error. Please try again.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
-      "Invoking: `magic_function` with `{'input': '3'}`\n",
-      "responded: I encountered an error while trying to determine the value of the magic function for the input \"3\". Let me try again.\n",
-      "\n",
-      "\u001b[0m\u001b[36;1m\u001b[1;3mSorry, there was an error. Please try again.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
-      "Invoking: `magic_function` with `{'input': '3'}`\n",
-      "responded: I apologize for the inconvenience. It seems there is still an error in calculating the value of the magic function for the input \"3\". Let me attempt to resolve the issue by trying a different approach.\n",
-      "\n",
-      "\u001b[0m\u001b[36;1m\u001b[1;3mSorry, there was an error. Please try again.\u001b[0m\u001b[32;1m\u001b[1;3m\u001b[0m\n",
+      "\u001b[0m\u001b[36;1m\u001b[1;3mSorry, there was an error. Please try again.\u001b[0m\u001b[32;1m\u001b[1;3mParece que hubo un error al intentar obtener el valor de `magic_function(3)`. ¿Te gustaría que lo intente de nuevo?\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
@@ -438,15 +440,24 @@
     "data": {
      "text/plain": [
       "{'input': 'what is the value of magic_function(3)?',\n",
-       " 'output': 'Agent stopped due to max iterations.'}"
+       " 'output': 'Parece que hubo un error al intentar obtener el valor de `magic_function(3)`. ¿Te gustaría que lo intente de nuevo?'}"
      ]
     },
-     "execution_count": 8,
+     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
+    "prompt = ChatPromptTemplate.from_messages(\n",
+    "    [\n",
+    "        (\"system\", \"You are a helpful assistant. Respond only in Spanish.\"),\n",
+    "        (\"human\", \"{input}\"),\n",
+    "        # Placeholders fill up a **list** of messages\n",
+    "        (\"placeholder\", \"{agent_scratchpad}\"),\n",
+    "    ]\n",
+    ")\n",
+    "\n",
    "agent = create_tool_calling_agent(model, tools, prompt)\n",
    "agent_executor = AgentExecutor(\n",
    "    agent=agent,\n",
@@ -460,7 +471,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 29,
   "id": "b974a91f-6ae8-4644-83d9-73666258a6db",
   "metadata": {},
   "outputs": [
@@ -468,35 +479,33 @@
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_VkrswGIkIUKJQyVF0AvMaU3p', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 65, 'total_tokens': 79}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-2dd5504b-9386-4b35-aed1-a2a267f883fd-0', tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_VkrswGIkIUKJQyVF0AvMaU3p'}])]}}\n",
-      "------\n",
-      "{'action': {'messages': [ToolMessage(content='Sorry, there was an error. Please try again.', name='magic_function', id='85d7e845-f4ef-40a6-828d-c48c93b02b97', tool_call_id='call_VkrswGIkIUKJQyVF0AvMaU3p')]}}\n",
-      "------\n",
-      "{'agent': {'messages': [AIMessage(content='It seems there was an error when trying to calculate the value of the magic function for the input 3. Let me try again.', additional_kwargs={'tool_calls': [{'id': 'call_i5ZWsDhQvzgKs2bCroMB4JSL', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 42, 'prompt_tokens': 98, 'total_tokens': 140}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-6224c33b-0d3a-4925-9050-cb2a844dfe62-0', tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_i5ZWsDhQvzgKs2bCroMB4JSL'}])]}}\n",
-      "------\n",
-      "{'action': {'messages': [ToolMessage(content='Sorry, there was an error. Please try again.', name='magic_function', id='f846363c-b143-402c-949d-40d84b19d979', tool_call_id='call_i5ZWsDhQvzgKs2bCroMB4JSL')]}}\n",
-      "------\n",
-      "{'agent': {'messages': [AIMessage(content='Unfortunately, there seems to be an issue with calculating the value of the magic function for the input 3. Let me attempt to resolve this issue by using a different approach.', additional_kwargs={'tool_calls': [{'id': 'call_I26nZWbe4iVnagUh4GVePwig', 'function': {'arguments': '{\"input\": \"3\"}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 65, 'prompt_tokens': 162, 'total_tokens': 227}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-0512509d-201e-4fbb-ac96-fdd68400810a-0', tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_I26nZWbe4iVnagUh4GVePwig'}])]}}\n",
-      "------\n",
-      "{'action': {'messages': [ToolMessage(content='Sorry, there was an error. Please try again.', name='magic_function', id='fb19299f-de26-4659-9507-4bf4fb53bff4', tool_call_id='call_I26nZWbe4iVnagUh4GVePwig')]}}\n",
-      "------\n",
+      "('human', 'what is the value of magic_function(3)?')\n",
+      "content='' additional_kwargs={'tool_calls': [{'id': 'call_9fMkSAUGRa2BsADwF32ct1m1', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 64, 'total_tokens': 78}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-79084bff-6e10-49bb-b7f0-f613ebcc68ac-0' tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_9fMkSAUGRa2BsADwF32ct1m1'}]\n",
+      "content='Sorry, there was an error. Please try again.' name='magic_function' id='06f997fd-5309-4d56-afa3-2fe8cbf0d04f' tool_call_id='call_9fMkSAUGRa2BsADwF32ct1m1'\n",
+      "content='' additional_kwargs={'tool_calls': [{'id': 'call_Fg92zoL8oS5q6im2jR1INRvH', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 97, 'total_tokens': 111}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-fc2e201f-6330-4330-8c4e-1a66e85c1ffa-0' tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_Fg92zoL8oS5q6im2jR1INRvH'}]\n",
+      "content='Sorry, there was an error. Please try again.' name='magic_function' id='a931dd6e-2ed7-42ea-a58c-5ffb4041d7c9' tool_call_id='call_Fg92zoL8oS5q6im2jR1INRvH'\n",
+      "content='It seems there is an issue with processing the request for the value of `magic_function(3)`. Let me try a different approach.' additional_kwargs={'tool_calls': [{'id': 'call_lbYBMptprZ6HMqNiTvoqhmwP', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]} response_metadata={'token_usage': {'completion_tokens': 43, 'prompt_tokens': 130, 'total_tokens': 173}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None} id='run-2e0baab0-c4c1-42e8-b49d-a2704ae977c0-0' tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_lbYBMptprZ6HMqNiTvoqhmwP'}]\n",
+      "content='Sorry, there was an error. Please try again.' name='magic_function' id='9957435a-5de3-4662-b23c-abfa31e71208' tool_call_id='call_lbYBMptprZ6HMqNiTvoqhmwP'\n",
+      "content='It appears that the `magic_function` is currently experiencing issues when attempting to process the input \"3\". Unfortunately, I can\\'t provide the value of `magic_function(3)` at this moment.\\n\\nIf you have any other questions or need assistance with something else, please let me know!' response_metadata={'token_usage': {'completion_tokens': 58, 'prompt_tokens': 195, 'total_tokens': 253}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'stop', 'logprobs': None} id='run-bb68d7ca-da76-43ad-80ab-23737a70c391-0'\n",
      "{'input': 'what is the value of magic_function(3)?', 'output': 'Agent stopped due to max iterations.'}\n"
     ]
    }
   ],
   "source": [
-    "from langgraph.pregel import GraphRecursionError\n",
+    "from langgraph.errors import GraphRecursionError\n",
+    "from langgraph.prebuilt import create_react_agent\n",
    "\n",
    "RECURSION_LIMIT = 2 * 3 + 1\n",
    "\n",
-    "app = chat_agent_executor.create_tool_calling_executor(model, tools)\n",
+    "app = create_react_agent(model, tools=tools)\n",
    "\n",
    "try:\n",
    "    for chunk in app.stream(\n",
-    "        {\"messages\": [(\"human\", query)]}, {\"recursion_limit\": RECURSION_LIMIT}\n",
+    "        {\"messages\": [(\"human\", query)]},\n",
+    "        {\"recursion_limit\": RECURSION_LIMIT},\n",
+    "        stream_mode=\"values\",\n",
    "    ):\n",
-    "        print(chunk)\n",
-    "        print(\"------\")\n",
+    "        print(chunk[\"messages\"][-1])\n",
    "except GraphRecursionError:\n",
    "    print({\"input\": query, \"output\": \"Agent stopped due to max iterations.\"})"
   ]
@@ -513,7 +522,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 30,
   "id": "4b8498fc-a7af-4164-a401-d8714f082306",
   "metadata": {},
   "outputs": [
@@ -540,7 +549,7 @@
       " 'output': 'Agent stopped due to max iterations.'}"
      ]
     },
-     "execution_count": 17,
+     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -569,9 +578,19 @@
    "agent_executor.invoke({\"input\": query})"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "d02eb025",
+   "metadata": {},
+   "source": [
+    "With LangGraph's react agent, you can control timeouts on two levels. \n",
+    "\n",
+    "You can set a `step_timeout` to bound each **step**:"
+   ]
+  },
  {
   "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 31,
   "id": "a2b29113-e6be-4f91-aa4c-5c63dea3e423",
   "metadata": {},
   "outputs": [
@@ -579,14 +598,16 @@
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_lp2tuTmBpulORJr4FJp9za4E', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 65, 'total_tokens': 79}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-4070a5d8-c2ea-46f3-a3a2-dfcd2ebdadc2-0', tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_lp2tuTmBpulORJr4FJp9za4E'}])]}}\n",
+      "{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_GlXWTlJ0jQc2B8jQuDVFzmnc', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 64, 'total_tokens': 78}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-38a0459b-a363-4181-b7a3-f25cb5c5d728-0', tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_GlXWTlJ0jQc2B8jQuDVFzmnc'}])]}}\n",
      "------\n",
      "{'input': 'what is the value of magic_function(3)?', 'output': 'Agent stopped due to max iterations.'}\n"
     ]
    }
   ],
   "source": [
-    "app = chat_agent_executor.create_tool_calling_executor(model, tools)\n",
+    "from langgraph.prebuilt import create_react_agent\n",
+    "\n",
+    "app = create_react_agent(model, tools=tools)\n",
    "# Set the max timeout for each step here\n",
    "app.step_timeout = 2\n",
    "\n",
@@ -598,13 +619,52 @@
    "    print({\"input\": query, \"output\": \"Agent stopped due to max iterations.\"})"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "32a9db70",
+   "metadata": {},
+   "source": [
+    "The other way to set a max timeout is just via python's stdlib [asyncio](https://docs.python.org/3/library/asyncio.html)."
+   ]
+  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 34,
   "id": "e9eb55f4-a321-4bac-b52d-9e43b411cf92",
   "metadata": {},
-   "outputs": [],
-   "source": []
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'agent': {'messages': [AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_cR1oJuYcNrOmcaaIRRvh5dSr', 'function': {'arguments': '{\"input\":\"3\"}', 'name': 'magic_function'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 64, 'total_tokens': 78}, 'model_name': 'gpt-4o', 'system_fingerprint': 'fp_729ea513f7', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-1c03c5d6-4883-4ccd-aa78-53dbafa99622-0', tool_calls=[{'name': 'magic_function', 'args': {'input': '3'}, 'id': 'call_cR1oJuYcNrOmcaaIRRvh5dSr'}])]}}\n",
+      "------\n",
+      "{'action': {'messages': [ToolMessage(content='Sorry, there was an error. Please try again.', name='magic_function', id='596baf13-de35-4a4f-8b78-475b387a1f40', tool_call_id='call_cR1oJuYcNrOmcaaIRRvh5dSr')]}}\n",
+      "------\n",
+      "{'input': 'what is the value of magic_function(3)?', 'output': 'Task Cancelled.'}\n"
+     ]
+    }
+   ],
+   "source": [
+    "import asyncio\n",
+    "\n",
+    "from langgraph.prebuilt import create_react_agent\n",
+    "\n",
+    "app = create_react_agent(model, tools=tools)\n",
+    "\n",
+    "\n",
+    "async def stream(app, inputs):\n",
+    "    async for chunk in app.astream({\"messages\": [(\"human\", query)]}):\n",
+    "        print(chunk)\n",
+    "        print(\"------\")\n",
+    "\n",
+    "\n",
+    "try:\n",
+    "    task = asyncio.create_task(stream(app, {\"messages\": [(\"human\", query)]}))\n",
+    "    await asyncio.wait_for(task, timeout=3)\n",
+    "except TimeoutError:\n",
+    "    print(\"Task Cancelled.\")"
+   ]
  }
 ],
 "metadata": {
@@ -623,7 +683,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.11.1"
+   "version": "3.11.2"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/output_parser_json.ipynb
+++ b/docs/docs/how_to/output_parser_json.ipynb
@@ -7,23 +7,22 @@
   "source": [
    "# How to parse JSON output\n",
    "\n",
-    "While some model providers support [built-in ways to return structured output](/docs/how_to/structured_output), not all do. We can use an output parser to help users to specify an arbitrary JSON schema via the prompt, query a model for outputs that conform to that schema, and finally parse that schema as JSON.\n",
+    ":::info Prerequisites\n",
    "\n",
-    ":::{.callout-note}\n",
-    "Keep in mind that large language models are leaky abstractions! You'll have to use an LLM with sufficient capacity to generate well-formed JSON.\n",
-    ":::\n",
-    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
+    "This guide assumes familiarity with the following concepts:\n",
    "- [Chat models](/docs/concepts/#chat-models)\n",
    "- [Output parsers](/docs/concepts/#output-parsers)\n",
    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
    "- [Structured output](/docs/how_to/structured_output)\n",
    "- [Chaining runnables together](/docs/how_to/sequence/)\n",
-    "`}/>\n",
-    "```"
+    "\n",
+    ":::\n",
+    "\n",
+    "While some model providers support [built-in ways to return structured output](/docs/how_to/structured_output), not all do. We can use an output parser to help users to specify an arbitrary JSON schema via the prompt, query a model for outputs that conform to that schema, and finally parse that schema as JSON.\n",
+    "\n",
+    ":::{.callout-note}\n",
+    "Keep in mind that large language models are leaky abstractions! You'll have to use an LLM with sufficient capacity to generate well-formed JSON.\n",
+    ":::"
   ]
  },
  {
@@ -255,7 +254,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/output_parser_xml.ipynb
+++ b/docs/docs/how_to/output_parser_xml.ipynb
@@ -7,6 +7,17 @@
   "source": [
    "# How to parse XML output\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [Chat models](/docs/concepts/#chat-models)\n",
+    "- [Output parsers](/docs/concepts/#output-parsers)\n",
+    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
+    "- [Structured output](/docs/how_to/structured_output)\n",
+    "- [Chaining runnables together](/docs/how_to/sequence/)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "LLMs from different providers often have different strengths depending on the specific data they are trianed on. This also means that some may be \"better\" and more reliable at generating output in formats other than JSON.\n",
    "\n",
    "This guide shows you how to use the [`XMLOutputParser`](https://api.python.langchain.com/en/latest/output_parsers/langchain_core.output_parsers.xml.XMLOutputParser.html) to prompt models for XML output, then and parse that output into a usable format.\n",
@@ -15,17 +26,6 @@
    "Keep in mind that large language models are leaky abstractions! You'll have to use an LLM with sufficient capacity to generate well-formed XML.\n",
    ":::\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [Chat models](/docs/concepts/#chat-models)\n",
-    "- [Output parsers](/docs/concepts/#output-parsers)\n",
-    "- [Structured output](/docs/how_to/structured_output)\n",
-    "- [Chaining runnables together](/docs/how_to/sequence/)\n",
-    "`}/>\n",
-    "```\n",
-    "\n",
    "In the following examples, we use Anthropic's Claude-2 model (https://docs.anthropic.com/claude/docs), which is one such model that is optimized for XML tags."
   ]
  },
@@ -274,7 +274,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/output_parser_yaml.ipynb
+++ b/docs/docs/how_to/output_parser_yaml.ipynb
@@ -7,24 +7,24 @@
   "source": [
    "# How to parse YAML output\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [Chat models](/docs/concepts/#chat-models)\n",
+    "- [Output parsers](/docs/concepts/#output-parsers)\n",
+    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
+    "- [Structured output](/docs/how_to/structured_output)\n",
+    "- [Chaining runnables together](/docs/how_to/sequence/)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "LLMs from different providers often have different strengths depending on the specific data they are trianed on. This also means that some may be \"better\" and more reliable at generating output in formats other than JSON.\n",
    "\n",
    "This output parser allows users to specify an arbitrary schema and query LLMs for outputs that conform to that schema, using YAML to format their response.\n",
    "\n",
    ":::{.callout-note}\n",
    "Keep in mind that large language models are leaky abstractions! You'll have to use an LLM with sufficient capacity to generate well-formed YAML.\n",
-    ":::\n",
-    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [Chat models](/docs/concepts/#chat-models)\n",
-    "- [Output parsers](/docs/concepts/#output-parsers)\n",
-    "- [Structured output](/docs/how_to/structured_output)\n",
-    "- [Chaining runnables together](/docs/how_to/sequence/)\n",
-    "`}/>\n",
-    "```"
+    ":::\n"
   ]
  },
  {
@@ -165,7 +165,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/parallel.ipynb
+++ b/docs/docs/how_to/parallel.ipynb
@@ -18,16 +18,15 @@
   "source": [
    "# How to invoke runnables in parallel\n",
    "\n",
-    "The [`RunnableParallel`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.RunnableParallel.html) primitive is essentially a dict whose values are runnables (or things that can be coerced to runnables, like functions). It runs all of its values in parallel, and each value is called with the overall input of the `RunnableParallel`. The final return value is a dict with the results of each value under its appropriate key.\n",
+    ":::info Prerequisites\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
+    "This guide assumes familiarity with the following concepts:\n",
    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
    "- [Chaining runnables](/docs/how_to/sequence)\n",
-    "`} />\n",
-    "```\n",
+    "\n",
+    ":::\n",
+    "\n",
+    "The [`RunnableParallel`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.RunnableParallel.html) primitive is essentially a dict whose values are runnables (or things that can be coerced to runnables, like functions). It runs all of its values in parallel, and each value is called with the overall input of the `RunnableParallel`. The final return value is a dict with the results of each value under its appropriate key.\n",
    "\n",
    "## Formatting with `RunnableParallels`\n",
    "\n",
@@ -354,7 +353,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/parent_document_retriever.ipynb
+++ b/docs/docs/how_to/parent_document_retriever.ipynb
@@ -57,7 +57,7 @@
   "outputs": [],
   "source": [
    "loaders = [\n",
-    "    TextLoader(\"../../paul_graham_essay.txt\"),\n",
+    "    TextLoader(\"paul_graham_essay.txt\"),\n",
    "    TextLoader(\"state_of_the_union.txt\"),\n",
    "]\n",
    "docs = []\n",
@@ -124,8 +124,8 @@
    {
     "data": {
      "text/plain": [
-       "['cfdf4af7-51f2-4ea3-8166-5be208efa040',\n",
-       " 'bf213c21-cc66-4208-8a72-733d030187e6']"
+       "['9a63376c-58cc-42c9-b0f7-61f0e1a3a688',\n",
+       " '40091598-e918-4a18-9be0-f46413a95ae4']"
      ]
     },
     "execution_count": 6,
@@ -190,7 +190,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "retrieved_docs = retriever.get_relevant_documents(\"justice breyer\")"
+    "retrieved_docs = retriever.invoke(\"justice breyer\")"
   ]
  },
  {
@@ -338,17 +338,17 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 18,
   "id": "3a3202df",
   "metadata": {},
   "outputs": [],
   "source": [
-    "retrieved_docs = retriever.get_relevant_documents(\"justice breyer\")"
+    "retrieved_docs = retriever.invoke(\"justice breyer\")"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 19,
   "id": "684fdb2c",
   "metadata": {},
   "outputs": [
@@ -358,7 +358,7 @@
       "1849"
      ]
     },
-     "execution_count": 18,
+     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -369,7 +369,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": 20,
   "id": "9f17f662",
   "metadata": {},
   "outputs": [
@@ -424,7 +424,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.10.4"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/passthrough.ipynb
+++ b/docs/docs/how_to/passthrough.ipynb
@@ -18,18 +18,18 @@
   "source": [
    "# How to pass through arguments from one step to the next\n",
    "\n",
-    "When composing chains with several steps, sometimes you will want to pass data from previous steps unchanged for use as input to a later step. The [`RunnablePassthrough`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html) class allows you to do just this, and is typically is used in conjuction with a [RunnableParallel](/docs/how_to/parallel/) to pass data through to a later step in your constructed chains.\n",
+    ":::info Prerequisites\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
+    "This guide assumes familiarity with the following concepts:\n",
    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
    "- [Chaining runnables](/docs/how_to/sequence/)\n",
    "- [Calling runnables in parallel](/docs/how_to/parallel/)\n",
    "- [Custom functions](/docs/how_to/functions/)\n",
-    "`} />\n",
-    "```\n",
+    "\n",
+    ":::\n",
+    "\n",
+    "\n",
+    "When composing chains with several steps, sometimes you will want to pass data from previous steps unchanged for use as input to a later step. The [`RunnablePassthrough`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.passthrough.RunnablePassthrough.html) class allows you to do just this, and is typically is used in conjuction with a [RunnableParallel](/docs/how_to/parallel/) to pass data through to a later step in your constructed chains.\n",
    "\n",
    "See the example below:"
   ]
@@ -174,7 +174,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/paul_graham_essay.txt
+++ b/docs/docs/how_to/paul_graham_essay.txt
@@ -0,0 +1,351 @@
+What I Worked On
+
+February 2021
+
+Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.
+
+The first programs I tried writing were on the IBM 1401 that our school district used for what was then called "data processing." This was in 9th grade, so I was 13 or 14. The school district's 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was like a mini Bond villain's lair down there, with all these alien-looking machines — CPU, disk drives, printer, card reader — sitting up on a raised floor under bright fluorescent lights.
+
+The language we used was an early version of Fortran. You had to type programs on punch cards, then stack them in the card reader and press a button to load the program into memory and run it. The result would ordinarily be to print something on the spectacularly loud printer.
+
+I was puzzled by the 1401. I couldn't figure out what to do with it. And in retrospect there's not much I could have done with it. The only form of input to programs was data stored on punched cards, and I didn't have any data stored on punched cards. The only other option was to do things that didn't rely on any input, like calculate approximations of pi, but I didn't know enough math to do anything interesting of that type. So I'm not surprised I can't remember any programs I wrote, because they can't have done much. My clearest memory is of the moment I learned it was possible for programs not to terminate, when one of mine didn't. On a machine without time-sharing, this was a social as well as a technical error, as the data center manager's expression made clear.
+
+With microcomputers, everything changed. Now you could have a computer sitting right in front of you, on a desk, that could respond to your keystrokes as it was running instead of just churning through a stack of punch cards and then stopping. [1]
+
+The first of my friends to get a microcomputer built it himself. It was sold as a kit by Heathkit. I remember vividly how impressed and envious I felt watching him sitting in front of it, typing programs right into the computer.
+
+Computers were expensive in those days and it took me years of nagging before I convinced my father to buy one, a TRS-80, in about 1980. The gold standard then was the Apple II, but a TRS-80 was good enough. This was when I really started programming. I wrote simple games, a program to predict how high my model rockets would fly, and a word processor that my father used to write at least one book. There was only room in memory for about 2 pages of text, so he'd write 2 pages at a time and then print them out, but it was a lot better than a typewriter.
+
+Though I liked programming, I didn't plan to study it in college. In college I was going to study philosophy, which sounded much more powerful. It seemed, to my naive high school self, to be the study of the ultimate truths, compared to which the things studied in other fields would be mere domain knowledge. What I discovered when I got to college was that the other fields took up so much of the space of ideas that there wasn't much left for these supposed ultimate truths. All that seemed left for philosophy were edge cases that people in other fields felt could safely be ignored.
+
+I couldn't have put this into words when I was 18. All I knew at the time was that I kept taking philosophy courses and they kept being boring. So I decided to switch to AI.
+
+AI was in the air in the mid 1980s, but there were two things especially that made me want to work on it: a novel by Heinlein called The Moon is a Harsh Mistress, which featured an intelligent computer called Mike, and a PBS documentary that showed Terry Winograd using SHRDLU. I haven't tried rereading The Moon is a Harsh Mistress, so I don't know how well it has aged, but when I read it I was drawn entirely into its world. It seemed only a matter of time before we'd have Mike, and when I saw Winograd using SHRDLU, it seemed like that time would be a few years at most. All you had to do was teach SHRDLU more words.
+
+There weren't any classes in AI at Cornell then, not even graduate classes, so I started trying to teach myself. Which meant learning Lisp, since in those days Lisp was regarded as the language of AI. The commonly used programming languages then were pretty primitive, and programmers' ideas correspondingly so. The default language at Cornell was a Pascal-like language called PL/I, and the situation was similar elsewhere. Learning Lisp expanded my concept of a program so fast that it was years before I started to have a sense of where the new limits were. This was more like it; this was what I had expected college to do. It wasn't happening in a class, like it was supposed to, but that was ok. For the next couple years I was on a roll. I knew what I was going to do.
+
+For my undergraduate thesis, I reverse-engineered SHRDLU. My God did I love working on that program. It was a pleasing bit of code, but what made it even more exciting was my belief — hard to imagine now, but not unique in 1985 — that it was already climbing the lower slopes of intelligence.
+
+I had gotten into a program at Cornell that didn't make you choose a major. You could take whatever classes you liked, and choose whatever you liked to put on your degree. I of course chose "Artificial Intelligence." When I got the actual physical diploma, I was dismayed to find that the quotes had been included, which made them read as scare-quotes. At the time this bothered me, but now it seems amusingly accurate, for reasons I was about to discover.
+
+I applied to 3 grad schools: MIT and Yale, which were renowned for AI at the time, and Harvard, which I'd visited because Rich Draves went there, and was also home to Bill Woods, who'd invented the type of parser I used in my SHRDLU clone. Only Harvard accepted me, so that was where I went.
+
+I don't remember the moment it happened, or if there even was a specific moment, but during the first year of grad school I realized that AI, as practiced at the time, was a hoax. By which I mean the sort of AI in which a program that's told "the dog is sitting on the chair" translates this into some formal representation and adds it to the list of things it knows.
+
+What these programs really showed was that there's a subset of natural language that's a formal language. But a very proper subset. It was clear that there was an unbridgeable gap between what they could do and actually understanding natural language. It was not, in fact, simply a matter of teaching SHRDLU more words. That whole way of doing AI, with explicit data structures representing concepts, was not going to work. Its brokenness did, as so often happens, generate a lot of opportunities to write papers about various band-aids that could be applied to it, but it was never going to get us Mike.
+
+So I looked around to see what I could salvage from the wreckage of my plans, and there was Lisp. I knew from experience that Lisp was interesting for its own sake and not just for its association with AI, even though that was the main reason people cared about it at the time. So I decided to focus on Lisp. In fact, I decided to write a book about Lisp hacking. It's scary to think how little I knew about Lisp hacking when I started writing that book. But there's nothing like writing a book about something to help you learn it. The book, On Lisp, wasn't published till 1993, but I wrote much of it in grad school.
+
+Computer Science is an uneasy alliance between two halves, theory and systems. The theory people prove things, and the systems people build things. I wanted to build things. I had plenty of respect for theory — indeed, a sneaking suspicion that it was the more admirable of the two halves — but building things seemed so much more exciting.
+
+The problem with systems work, though, was that it didn't last. Any program you wrote today, no matter how good, would be obsolete in a couple decades at best. People might mention your software in footnotes, but no one would actually use it. And indeed, it would seem very feeble work. Only people with a sense of the history of the field would even realize that, in its time, it had been good.
+
+There were some surplus Xerox Dandelions floating around the computer lab at one point. Anyone who wanted one to play around with could have one. I was briefly tempted, but they were so slow by present standards; what was the point? No one else wanted one either, so off they went. That was what happened to systems work.
+
+I wanted not just to build things, but to build things that would last.
+
+In this dissatisfied state I went in 1988 to visit Rich Draves at CMU, where he was in grad school. One day I went to visit the Carnegie Institute, where I'd spent a lot of time as a kid. While looking at a painting there I realized something that might seem obvious, but was a big surprise to me. There, right on the wall, was something you could make that would last. Paintings didn't become obsolete. Some of the best ones were hundreds of years old.
+
+And moreover this was something you could make a living doing. Not as easily as you could by writing software, of course, but I thought if you were really industrious and lived really cheaply, it had to be possible to make enough to survive. And as an artist you could be truly independent. You wouldn't have a boss, or even need to get research funding.
+
+I had always liked looking at paintings. Could I make them? I had no idea. I'd never imagined it was even possible. I knew intellectually that people made art — that it didn't just appear spontaneously — but it was as if the people who made it were a different species. They either lived long ago or were mysterious geniuses doing strange things in profiles in Life magazine. The idea of actually being able to make art, to put that verb before that noun, seemed almost miraculous.
+
+That fall I started taking art classes at Harvard. Grad students could take classes in any department, and my advisor, Tom Cheatham, was very easy going. If he even knew about the strange classes I was taking, he never said anything.
+
+So now I was in a PhD program in computer science, yet planning to be an artist, yet also genuinely in love with Lisp hacking and working away at On Lisp. In other words, like many a grad student, I was working energetically on multiple projects that were not my thesis.
+
+I didn't see a way out of this situation. I didn't want to drop out of grad school, but how else was I going to get out? I remember when my friend Robert Morris got kicked out of Cornell for writing the internet worm of 1988, I was envious that he'd found such a spectacular way to get out of grad school.
+
+Then one day in April 1990 a crack appeared in the wall. I ran into professor Cheatham and he asked if I was far enough along to graduate that June. I didn't have a word of my dissertation written, but in what must have been the quickest bit of thinking in my life, I decided to take a shot at writing one in the 5 weeks or so that remained before the deadline, reusing parts of On Lisp where I could, and I was able to respond, with no perceptible delay, "Yes, I think so. I'll give you something to read in a few days."
+
+I picked applications of continuations as the topic. In retrospect I should have written about macros and embedded languages. There's a whole world there that's barely been explored. But all I wanted was to get out of grad school, and my rapidly written dissertation sufficed, just barely.
+
+Meanwhile I was applying to art schools. I applied to two: RISD in the US, and the Accademia di Belle Arti in Florence, which, because it was the oldest art school, I imagined would be good. RISD accepted me, and I never heard back from the Accademia, so off to Providence I went.
+
+I'd applied for the BFA program at RISD, which meant in effect that I had to go to college again. This was not as strange as it sounds, because I was only 25, and art schools are full of people of different ages. RISD counted me as a transfer sophomore and said I had to do the foundation that summer. The foundation means the classes that everyone has to take in fundamental subjects like drawing, color, and design.
+
+Toward the end of the summer I got a big surprise: a letter from the Accademia, which had been delayed because they'd sent it to Cambridge England instead of Cambridge Massachusetts, inviting me to take the entrance exam in Florence that fall. This was now only weeks away. My nice landlady let me leave my stuff in her attic. I had some money saved from consulting work I'd done in grad school; there was probably enough to last a year if I lived cheaply. Now all I had to do was learn Italian.
+
+Only stranieri (foreigners) had to take this entrance exam. In retrospect it may well have been a way of excluding them, because there were so many stranieri attracted by the idea of studying art in Florence that the Italian students would otherwise have been outnumbered. I was in decent shape at painting and drawing from the RISD foundation that summer, but I still don't know how I managed to pass the written exam. I remember that I answered the essay question by writing about Cezanne, and that I cranked up the intellectual level as high as I could to make the most of my limited vocabulary. [2]
+
+I'm only up to age 25 and already there are such conspicuous patterns. Here I was, yet again about to attend some august institution in the hopes of learning about some prestigious subject, and yet again about to be disappointed. The students and faculty in the painting department at the Accademia were the nicest people you could imagine, but they had long since arrived at an arrangement whereby the students wouldn't require the faculty to teach anything, and in return the faculty wouldn't require the students to learn anything. And at the same time all involved would adhere outwardly to the conventions of a 19th century atelier. We actually had one of those little stoves, fed with kindling, that you see in 19th century studio paintings, and a nude model sitting as close to it as possible without getting burned. Except hardly anyone else painted her besides me. The rest of the students spent their time chatting or occasionally trying to imitate things they'd seen in American art magazines.
+
+Our model turned out to live just down the street from me. She made a living from a combination of modelling and making fakes for a local antique dealer. She'd copy an obscure old painting out of a book, and then he'd take the copy and maltreat it to make it look old. [3]
+
+While I was a student at the Accademia I started painting still lives in my bedroom at night. These paintings were tiny, because the room was, and because I painted them on leftover scraps of canvas, which was all I could afford at the time. Painting still lives is different from painting people, because the subject, as its name suggests, can't move. People can't sit for more than about 15 minutes at a time, and when they do they don't sit very still. So the traditional m.o. for painting people is to know how to paint a generic person, which you then modify to match the specific person you're painting. Whereas a still life you can, if you want, copy pixel by pixel from what you're seeing. You don't want to stop there, of course, or you get merely photographic accuracy, and what makes a still life interesting is that it's been through a head. You want to emphasize the visual cues that tell you, for example, that the reason the color changes suddenly at a certain point is that it's the edge of an object. By subtly emphasizing such things you can make paintings that are more realistic than photographs not just in some metaphorical sense, but in the strict information-theoretic sense. [4]
+
+I liked painting still lives because I was curious about what I was seeing. In everyday life, we aren't consciously aware of much we're seeing. Most visual perception is handled by low-level processes that merely tell your brain "that's a water droplet" without telling you details like where the lightest and darkest points are, or "that's a bush" without telling you the shape and position of every leaf. This is a feature of brains, not a bug. In everyday life it would be distracting to notice every leaf on every bush. But when you have to paint something, you have to look more closely, and when you do there's a lot to see. You can still be noticing new things after days of trying to paint something people usually take for granted, just as you can after days of trying to write an essay about something people usually take for granted.
+
+This is not the only way to paint. I'm not 100% sure it's even a good way to paint. But it seemed a good enough bet to be worth trying.
+
+Our teacher, professor Ulivi, was a nice guy. He could see I worked hard, and gave me a good grade, which he wrote down in a sort of passport each student had. But the Accademia wasn't teaching me anything except Italian, and my money was running out, so at the end of the first year I went back to the US.
+
+I wanted to go back to RISD, but I was now broke and RISD was very expensive, so I decided to get a job for a year and then return to RISD the next fall. I got one at a company called Interleaf, which made software for creating documents. You mean like Microsoft Word? Exactly. That was how I learned that low end software tends to eat high end software. But Interleaf still had a few years to live yet. [5]
+
+Interleaf had done something pretty bold. Inspired by Emacs, they'd added a scripting language, and even made the scripting language a dialect of Lisp. Now they wanted a Lisp hacker to write things in it. This was the closest thing I've had to a normal job, and I hereby apologize to my boss and coworkers, because I was a bad employee. Their Lisp was the thinnest icing on a giant C cake, and since I didn't know C and didn't want to learn it, I never understood most of the software. Plus I was terribly irresponsible. This was back when a programming job meant showing up every day during certain working hours. That seemed unnatural to me, and on this point the rest of the world is coming around to my way of thinking, but at the time it caused a lot of friction. Toward the end of the year I spent much of my time surreptitiously working on On Lisp, which I had by this time gotten a contract to publish.
+
+The good part was that I got paid huge amounts of money, especially by art student standards. In Florence, after paying my part of the rent, my budget for everything else had been $7 a day. Now I was getting paid more than 4 times that every hour, even when I was just sitting in a meeting. By living cheaply I not only managed to save enough to go back to RISD, but also paid off my college loans.
+
+I learned some useful things at Interleaf, though they were mostly about what not to do. I learned that it's better for technology companies to be run by product people than sales people (though sales is a real skill and people who are good at it are really good at it), that it leads to bugs when code is edited by too many people, that cheap office space is no bargain if it's depressing, that planned meetings are inferior to corridor conversations, that big, bureaucratic customers are a dangerous source of money, and that there's not much overlap between conventional office hours and the optimal time for hacking, or conventional offices and the optimal place for it.
+
+But the most important thing I learned, and which I used in both Viaweb and Y Combinator, is that the low end eats the high end: that it's good to be the "entry level" option, even though that will be less prestigious, because if you're not, someone else will be, and will squash you against the ceiling. Which in turn means that prestige is a danger sign.
+
+When I left to go back to RISD the next fall, I arranged to do freelance work for the group that did projects for customers, and this was how I survived for the next several years. When I came back to visit for a project later on, someone told me about a new thing called HTML, which was, as he described it, a derivative of SGML. Markup language enthusiasts were an occupational hazard at Interleaf and I ignored him, but this HTML thing later became a big part of my life.
+
+In the fall of 1992 I moved back to Providence to continue at RISD. The foundation had merely been intro stuff, and the Accademia had been a (very civilized) joke. Now I was going to see what real art school was like. But alas it was more like the Accademia than not. Better organized, certainly, and a lot more expensive, but it was now becoming clear that art school did not bear the same relationship to art that medical school bore to medicine. At least not the painting department. The textile department, which my next door neighbor belonged to, seemed to be pretty rigorous. No doubt illustration and architecture were too. But painting was post-rigorous. Painting students were supposed to express themselves, which to the more worldly ones meant to try to cook up some sort of distinctive signature style.
+
+A signature style is the visual equivalent of what in show business is known as a "schtick": something that immediately identifies the work as yours and no one else's. For example, when you see a painting that looks like a certain kind of cartoon, you know it's by Roy Lichtenstein. So if you see a big painting of this type hanging in the apartment of a hedge fund manager, you know he paid millions of dollars for it. That's not always why artists have a signature style, but it's usually why buyers pay a lot for such work. [6]
+
+There were plenty of earnest students too: kids who "could draw" in high school, and now had come to what was supposed to be the best art school in the country, to learn to draw even better. They tended to be confused and demoralized by what they found at RISD, but they kept going, because painting was what they did. I was not one of the kids who could draw in high school, but at RISD I was definitely closer to their tribe than the tribe of signature style seekers.
+
+I learned a lot in the color class I took at RISD, but otherwise I was basically teaching myself to paint, and I could do that for free. So in 1993 I dropped out. I hung around Providence for a bit, and then my college friend Nancy Parmet did me a big favor. A rent-controlled apartment in a building her mother owned in New York was becoming vacant. Did I want it? It wasn't much more than my current place, and New York was supposed to be where the artists were. So yes, I wanted it! [7]
+
+Asterix comics begin by zooming in on a tiny corner of Roman Gaul that turns out not to be controlled by the Romans. You can do something similar on a map of New York City: if you zoom in on the Upper East Side, there's a tiny corner that's not rich, or at least wasn't in 1993. It's called Yorkville, and that was my new home. Now I was a New York artist — in the strictly technical sense of making paintings and living in New York.
+
+I was nervous about money, because I could sense that Interleaf was on the way down. Freelance Lisp hacking work was very rare, and I didn't want to have to program in another language, which in those days would have meant C++ if I was lucky. So with my unerring nose for financial opportunity, I decided to write another book on Lisp. This would be a popular book, the sort of book that could be used as a textbook. I imagined myself living frugally off the royalties and spending all my time painting. (The painting on the cover of this book, ANSI Common Lisp, is one that I painted around this time.)
+
+The best thing about New York for me was the presence of Idelle and Julian Weber. Idelle Weber was a painter, one of the early photorealists, and I'd taken her painting class at Harvard. I've never known a teacher more beloved by her students. Large numbers of former students kept in touch with her, including me. After I moved to New York I became her de facto studio assistant.
+
+She liked to paint on big, square canvases, 4 to 5 feet on a side. One day in late 1994 as I was stretching one of these monsters there was something on the radio about a famous fund manager. He wasn't that much older than me, and was super rich. The thought suddenly occurred to me: why don't I become rich? Then I'll be able to work on whatever I want.
+
+Meanwhile I'd been hearing more and more about this new thing called the World Wide Web. Robert Morris showed it to me when I visited him in Cambridge, where he was now in grad school at Harvard. It seemed to me that the web would be a big deal. I'd seen what graphical user interfaces had done for the popularity of microcomputers. It seemed like the web would do the same for the internet.
+
+If I wanted to get rich, here was the next train leaving the station. I was right about that part. What I got wrong was the idea. I decided we should start a company to put art galleries online. I can't honestly say, after reading so many Y Combinator applications, that this was the worst startup idea ever, but it was up there. Art galleries didn't want to be online, and still don't, not the fancy ones. That's not how they sell. I wrote some software to generate web sites for galleries, and Robert wrote some to resize images and set up an http server to serve the pages. Then we tried to sign up galleries. To call this a difficult sale would be an understatement. It was difficult to give away. A few galleries let us make sites for them for free, but none paid us.
+
+Then some online stores started to appear, and I realized that except for the order buttons they were identical to the sites we'd been generating for galleries. This impressive-sounding thing called an "internet storefront" was something we already knew how to build.
+
+So in the summer of 1995, after I submitted the camera-ready copy of ANSI Common Lisp to the publishers, we started trying to write software to build online stores. At first this was going to be normal desktop software, which in those days meant Windows software. That was an alarming prospect, because neither of us knew how to write Windows software or wanted to learn. We lived in the Unix world. But we decided we'd at least try writing a prototype store builder on Unix. Robert wrote a shopping cart, and I wrote a new site generator for stores — in Lisp, of course.
+
+We were working out of Robert's apartment in Cambridge. His roommate was away for big chunks of time, during which I got to sleep in his room. For some reason there was no bed frame or sheets, just a mattress on the floor. One morning as I was lying on this mattress I had an idea that made me sit up like a capital L. What if we ran the software on the server, and let users control it by clicking on links? Then we'd never have to write anything to run on users' computers. We could generate the sites on the same server we'd serve them from. Users wouldn't need anything more than a browser.
+
+This kind of software, known as a web app, is common now, but at the time it wasn't clear that it was even possible. To find out, we decided to try making a version of our store builder that you could control through the browser. A couple days later, on August 12, we had one that worked. The UI was horrible, but it proved you could build a whole store through the browser, without any client software or typing anything into the command line on the server.
+
+Now we felt like we were really onto something. I had visions of a whole new generation of software working this way. You wouldn't need versions, or ports, or any of that crap. At Interleaf there had been a whole group called Release Engineering that seemed to be at least as big as the group that actually wrote the software. Now you could just update the software right on the server.
+
+We started a new company we called Viaweb, after the fact that our software worked via the web, and we got $10,000 in seed funding from Idelle's husband Julian. In return for that and doing the initial legal work and giving us business advice, we gave him 10% of the company. Ten years later this deal became the model for Y Combinator's. We knew founders needed something like this, because we'd needed it ourselves.
+
+At this stage I had a negative net worth, because the thousand dollars or so I had in the bank was more than counterbalanced by what I owed the government in taxes. (Had I diligently set aside the proper proportion of the money I'd made consulting for Interleaf? No, I had not.) So although Robert had his graduate student stipend, I needed that seed funding to live on.
+
+We originally hoped to launch in September, but we got more ambitious about the software as we worked on it. Eventually we managed to build a WYSIWYG site builder, in the sense that as you were creating pages, they looked exactly like the static ones that would be generated later, except that instead of leading to static pages, the links all referred to closures stored in a hash table on the server.
+
+It helped to have studied art, because the main goal of an online store builder is to make users look legit, and the key to looking legit is high production values. If you get page layouts and fonts and colors right, you can make a guy running a store out of his bedroom look more legit than a big company.
+
+(If you're curious why my site looks so old-fashioned, it's because it's still made with this software. It may look clunky today, but in 1996 it was the last word in slick.)
+
+In September, Robert rebelled. "We've been working on this for a month," he said, "and it's still not done." This is funny in retrospect, because he would still be working on it almost 3 years later. But I decided it might be prudent to recruit more programmers, and I asked Robert who else in grad school with him was really good. He recommended Trevor Blackwell, which surprised me at first, because at that point I knew Trevor mainly for his plan to reduce everything in his life to a stack of notecards, which he carried around with him. But Rtm was right, as usual. Trevor turned out to be a frighteningly effective hacker.
+
+It was a lot of fun working with Robert and Trevor. They're the two most independent-minded people I know, and in completely different ways. If you could see inside Rtm's brain it would look like a colonial New England church, and if you could see inside Trevor's it would look like the worst excesses of Austrian Rococo.
+
+We opened for business, with 6 stores, in January 1996. It was just as well we waited a few months, because although we worried we were late, we were actually almost fatally early. There was a lot of talk in the press then about ecommerce, but not many people actually wanted online stores. [8]
+
+There were three main parts to the software: the editor, which people used to build sites and which I wrote, the shopping cart, which Robert wrote, and the manager, which kept track of orders and statistics, and which Trevor wrote. In its time, the editor was one of the best general-purpose site builders. I kept the code tight and didn't have to integrate with any other software except Robert's and Trevor's, so it was quite fun to work on. If all I'd had to do was work on this software, the next 3 years would have been the easiest of my life. Unfortunately I had to do a lot more, all of it stuff I was worse at than programming, and the next 3 years were instead the most stressful.
+
+There were a lot of startups making ecommerce software in the second half of the 90s. We were determined to be the Microsoft Word, not the Interleaf. Which meant being easy to use and inexpensive. It was lucky for us that we were poor, because that caused us to make Viaweb even more inexpensive than we realized. We charged $100 a month for a small store and $300 a month for a big one. This low price was a big attraction, and a constant thorn in the sides of competitors, but it wasn't because of some clever insight that we set the price low. We had no idea what businesses paid for things. $300 a month seemed like a lot of money to us.
+
+We did a lot of things right by accident like that. For example, we did what's now called "doing things that don't scale," although at the time we would have described it as "being so lame that we're driven to the most desperate measures to get users." The most common of which was building stores for them. This seemed particularly humiliating, since the whole raison d'etre of our software was that people could use it to make their own stores. But anything to get users.
+
+We learned a lot more about retail than we wanted to know. For example, that if you could only have a small image of a man's shirt (and all images were small then by present standards), it was better to have a closeup of the collar than a picture of the whole shirt. The reason I remember learning this was that it meant I had to rescan about 30 images of men's shirts. My first set of scans were so beautiful too.
+
+Though this felt wrong, it was exactly the right thing to be doing. Building stores for users taught us about retail, and about how it felt to use our software. I was initially both mystified and repelled by "business" and thought we needed a "business person" to be in charge of it, but once we started to get users, I was converted, in much the same way I was converted to fatherhood once I had kids. Whatever users wanted, I was all theirs. Maybe one day we'd have so many users that I couldn't scan their images for them, but in the meantime there was nothing more important to do.
+
+Another thing I didn't get at the time is that growth rate is the ultimate test of a startup. Our growth rate was fine. We had about 70 stores at the end of 1996 and about 500 at the end of 1997. I mistakenly thought the thing that mattered was the absolute number of users. And that is the thing that matters in the sense that that's how much money you're making, and if you're not making enough, you might go out of business. But in the long term the growth rate takes care of the absolute number. If we'd been a startup I was advising at Y Combinator, I would have said: Stop being so stressed out, because you're doing fine. You're growing 7x a year. Just don't hire too many more people and you'll soon be profitable, and then you'll control your own destiny.
+
+Alas I hired lots more people, partly because our investors wanted me to, and partly because that's what startups did during the Internet Bubble. A company with just a handful of employees would have seemed amateurish. So we didn't reach breakeven until about when Yahoo bought us in the summer of 1998. Which in turn meant we were at the mercy of investors for the entire life of the company. And since both we and our investors were noobs at startups, the result was a mess even by startup standards.
+
+It was a huge relief when Yahoo bought us. In principle our Viaweb stock was valuable. It was a share in a business that was profitable and growing rapidly. But it didn't feel very valuable to me; I had no idea how to value a business, but I was all too keenly aware of the near-death experiences we seemed to have every few months. Nor had I changed my grad student lifestyle significantly since we started. So when Yahoo bought us it felt like going from rags to riches. Since we were going to California, I bought a car, a yellow 1998 VW GTI. I remember thinking that its leather seats alone were by far the most luxurious thing I owned.
+
+The next year, from the summer of 1998 to the summer of 1999, must have been the least productive of my life. I didn't realize it at the time, but I was worn out from the effort and stress of running Viaweb. For a while after I got to California I tried to continue my usual m.o. of programming till 3 in the morning, but fatigue combined with Yahoo's prematurely aged culture and grim cube farm in Santa Clara gradually dragged me down. After a few months it felt disconcertingly like working at Interleaf.
+
+Yahoo had given us a lot of options when they bought us. At the time I thought Yahoo was so overvalued that they'd never be worth anything, but to my astonishment the stock went up 5x in the next year. I hung on till the first chunk of options vested, then in the summer of 1999 I left. It had been so long since I'd painted anything that I'd half forgotten why I was doing this. My brain had been entirely full of software and men's shirts for 4 years. But I had done this to get rich so I could paint, I reminded myself, and now I was rich, so I should go paint.
+
+When I said I was leaving, my boss at Yahoo had a long conversation with me about my plans. I told him all about the kinds of pictures I wanted to paint. At the time I was touched that he took such an interest in me. Now I realize it was because he thought I was lying. My options at that point were worth about $2 million a month. If I was leaving that kind of money on the table, it could only be to go and start some new startup, and if I did, I might take people with me. This was the height of the Internet Bubble, and Yahoo was ground zero of it. My boss was at that moment a billionaire. Leaving then to start a new startup must have seemed to him an insanely, and yet also plausibly, ambitious plan.
+
+But I really was quitting to paint, and I started immediately. There was no time to lose. I'd already burned 4 years getting rich. Now when I talk to founders who are leaving after selling their companies, my advice is always the same: take a vacation. That's what I should have done, just gone off somewhere and done nothing for a month or two, but the idea never occurred to me.
+
+So I tried to paint, but I just didn't seem to have any energy or ambition. Part of the problem was that I didn't know many people in California. I'd compounded this problem by buying a house up in the Santa Cruz Mountains, with a beautiful view but miles from anywhere. I stuck it out for a few more months, then in desperation I went back to New York, where unless you understand about rent control you'll be surprised to hear I still had my apartment, sealed up like a tomb of my old life. Idelle was in New York at least, and there were other people trying to paint there, even though I didn't know any of them.
+
+When I got back to New York I resumed my old life, except now I was rich. It was as weird as it sounds. I resumed all my old patterns, except now there were doors where there hadn't been. Now when I was tired of walking, all I had to do was raise my hand, and (unless it was raining) a taxi would stop to pick me up. Now when I walked past charming little restaurants I could go in and order lunch. It was exciting for a while. Painting started to go better. I experimented with a new kind of still life where I'd paint one painting in the old way, then photograph it and print it, blown up, on canvas, and then use that as the underpainting for a second still life, painted from the same objects (which hopefully hadn't rotted yet).
+
+Meanwhile I looked for an apartment to buy. Now I could actually choose what neighborhood to live in. Where, I asked myself and various real estate agents, is the Cambridge of New York? Aided by occasional visits to actual Cambridge, I gradually realized there wasn't one. Huh.
+
+Around this time, in the spring of 2000, I had an idea. It was clear from our experience with Viaweb that web apps were the future. Why not build a web app for making web apps? Why not let people edit code on our server through the browser, and then host the resulting applications for them? [9] You could run all sorts of services on the servers that these applications could use just by making an API call: making and receiving phone calls, manipulating images, taking credit card payments, etc.
+
+I got so excited about this idea that I couldn't think about anything else. It seemed obvious that this was the future. I didn't particularly want to start another company, but it was clear that this idea would have to be embodied as one, so I decided to move to Cambridge and start it. I hoped to lure Robert into working on it with me, but there I ran into a hitch. Robert was now a postdoc at MIT, and though he'd made a lot of money the last time I'd lured him into working on one of my schemes, it had also been a huge time sink. So while he agreed that it sounded like a plausible idea, he firmly refused to work on it.
+
+Hmph. Well, I'd do it myself then. I recruited Dan Giffin, who had worked for Viaweb, and two undergrads who wanted summer jobs, and we got to work trying to build what it's now clear is about twenty companies and several open-source projects worth of software. The language for defining applications would of course be a dialect of Lisp. But I wasn't so naive as to assume I could spring an overt Lisp on a general audience; we'd hide the parentheses, like Dylan did.
+
+By then there was a name for the kind of company Viaweb was, an "application service provider," or ASP. This name didn't last long before it was replaced by "software as a service," but it was current for long enough that I named this new company after it: it was going to be called Aspra.
+
+I started working on the application builder, Dan worked on network infrastructure, and the two undergrads worked on the first two services (images and phone calls). But about halfway through the summer I realized I really didn't want to run a company — especially not a big one, which it was looking like this would have to be. I'd only started Viaweb because I needed the money. Now that I didn't need money anymore, why was I doing this? If this vision had to be realized as a company, then screw the vision. I'd build a subset that could be done as an open-source project.
+
+Much to my surprise, the time I spent working on this stuff was not wasted after all. After we started Y Combinator, I would often encounter startups working on parts of this new architecture, and it was very useful to have spent so much time thinking about it and even trying to write some of it.
+
+The subset I would build as an open-source project was the new Lisp, whose parentheses I now wouldn't even have to hide. A lot of Lisp hackers dream of building a new Lisp, partly because one of the distinctive features of the language is that it has dialects, and partly, I think, because we have in our minds a Platonic form of Lisp that all existing dialects fall short of. I certainly did. So at the end of the summer Dan and I switched to working on this new dialect of Lisp, which I called Arc, in a house I bought in Cambridge.
+
+The following spring, lightning struck. I was invited to give a talk at a Lisp conference, so I gave one about how we'd used Lisp at Viaweb. Afterward I put a PostScript file of this talk online, on paulgraham.com, which I'd created years before using Viaweb but had never used for anything. In one day it got 30,000 page views. What on earth had happened? The referring URLs showed that someone had posted it on Slashdot. [10]
+
+Wow, I thought, there's an audience. If I write something and put it on the web, anyone can read it. That may seem obvious now, but it was surprising then. In the print era there was a narrow channel to readers, guarded by fierce monsters known as editors. The only way to get an audience for anything you wrote was to get it published as a book, or in a newspaper or magazine. Now anyone could publish anything.
+
+This had been possible in principle since 1993, but not many people had realized it yet. I had been intimately involved with building the infrastructure of the web for most of that time, and a writer as well, and it had taken me 8 years to realize it. Even then it took me several years to understand the implications. It meant there would be a whole new generation of essays. [11]
+
+In the print era, the channel for publishing essays had been vanishingly small. Except for a few officially anointed thinkers who went to the right parties in New York, the only people allowed to publish essays were specialists writing about their specialties. There were so many essays that had never been written, because there had been no way to publish them. Now they could be, and I was going to write them. [12]
+
+I've worked on several different things, but to the extent there was a turning point where I figured out what to work on, it was when I started publishing essays online. From then on I knew that whatever else I did, I'd always write essays too.
+
+I knew that online essays would be a marginal medium at first. Socially they'd seem more like rants posted by nutjobs on their GeoCities sites than the genteel and beautifully typeset compositions published in The New Yorker. But by this point I knew enough to find that encouraging instead of discouraging.
+
+One of the most conspicuous patterns I've noticed in my life is how well it has worked, for me at least, to work on things that weren't prestigious. Still life has always been the least prestigious form of painting. Viaweb and Y Combinator both seemed lame when we started them. I still get the glassy eye from strangers when they ask what I'm writing, and I explain that it's an essay I'm going to publish on my web site. Even Lisp, though prestigious intellectually in something like the way Latin is, also seems about as hip.
+
+It's not that unprestigious types of work are good per se. But when you find yourself drawn to some kind of work despite its current lack of prestige, it's a sign both that there's something real to be discovered there, and that you have the right kind of motives. Impure motives are a big danger for the ambitious. If anything is going to lead you astray, it will be the desire to impress people. So while working on things that aren't prestigious doesn't guarantee you're on the right track, it at least guarantees you're not on the most common type of wrong one.
+
+Over the next several years I wrote lots of essays about all kinds of different topics. O'Reilly reprinted a collection of them as a book, called Hackers & Painters after one of the essays in it. I also worked on spam filters, and did some more painting. I used to have dinners for a group of friends every Thursday night, which taught me how to cook for groups. And I bought another building in Cambridge, a former candy factory (and later, 'twas said, porn studio), to use as an office.
+
+One night in October 2003 there was a big party at my house. It was a clever idea of my friend Maria Daniels, who was one of the Thursday diners. Three separate hosts would all invite their friends to one party. So for every guest, two thirds of the other guests would be people they didn't know but would probably like. One of the guests was someone I didn't know but would turn out to like a lot: a woman called Jessica Livingston. A couple days later I asked her out.
+
+Jessica was in charge of marketing at a Boston investment bank. This bank thought it understood startups, but over the next year, as she met friends of mine from the startup world, she was surprised how different reality was. And how colorful their stories were. So she decided to compile a book of interviews with startup founders.
+
+When the bank had financial problems and she had to fire half her staff, she started looking for a new job. In early 2005 she interviewed for a marketing job at a Boston VC firm. It took them weeks to make up their minds, and during this time I started telling her about all the things that needed to be fixed about venture capital. They should make a larger number of smaller investments instead of a handful of giant ones, they should be funding younger, more technical founders instead of MBAs, they should let the founders remain as CEO, and so on.
+
+One of my tricks for writing essays had always been to give talks. The prospect of having to stand up in front of a group of people and tell them something that won't waste their time is a great spur to the imagination. When the Harvard Computer Society, the undergrad computer club, asked me to give a talk, I decided I would tell them how to start a startup. Maybe they'd be able to avoid the worst of the mistakes we'd made.
+
+So I gave this talk, in the course of which I told them that the best sources of seed funding were successful startup founders, because then they'd be sources of advice too. Whereupon it seemed they were all looking expectantly at me. Horrified at the prospect of having my inbox flooded by business plans (if I'd only known), I blurted out "But not me!" and went on with the talk. But afterward it occurred to me that I should really stop procrastinating about angel investing. I'd been meaning to since Yahoo bought us, and now it was 7 years later and I still hadn't done one angel investment.
+
+Meanwhile I had been scheming with Robert and Trevor about projects we could work on together. I missed working with them, and it seemed like there had to be something we could collaborate on.
+
+As Jessica and I were walking home from dinner on March 11, at the corner of Garden and Walker streets, these three threads converged. Screw the VCs who were taking so long to make up their minds. We'd start our own investment firm and actually implement the ideas we'd been talking about. I'd fund it, and Jessica could quit her job and work for it, and we'd get Robert and Trevor as partners too. [13]
+
+Once again, ignorance worked in our favor. We had no idea how to be angel investors, and in Boston in 2005 there were no Ron Conways to learn from. So we just made what seemed like the obvious choices, and some of the things we did turned out to be novel.
+
+There are multiple components to Y Combinator, and we didn't figure them all out at once. The part we got first was to be an angel firm. In those days, those two words didn't go together. There were VC firms, which were organized companies with people whose job it was to make investments, but they only did big, million dollar investments. And there were angels, who did smaller investments, but these were individuals who were usually focused on other things and made investments on the side. And neither of them helped founders enough in the beginning. We knew how helpless founders were in some respects, because we remembered how helpless we'd been. For example, one thing Julian had done for us that seemed to us like magic was to get us set up as a company. We were fine writing fairly difficult software, but actually getting incorporated, with bylaws and stock and all that stuff, how on earth did you do that? Our plan was not only to make seed investments, but to do for startups everything Julian had done for us.
+
+YC was not organized as a fund. It was cheap enough to run that we funded it with our own money. That went right by 99% of readers, but professional investors are thinking "Wow, that means they got all the returns." But once again, this was not due to any particular insight on our part. We didn't know how VC firms were organized. It never occurred to us to try to raise a fund, and if it had, we wouldn't have known where to start. [14]
+
+The most distinctive thing about YC is the batch model: to fund a bunch of startups all at once, twice a year, and then to spend three months focusing intensively on trying to help them. That part we discovered by accident, not merely implicitly but explicitly due to our ignorance about investing. We needed to get experience as investors. What better way, we thought, than to fund a whole bunch of startups at once? We knew undergrads got temporary jobs at tech companies during the summer. Why not organize a summer program where they'd start startups instead? We wouldn't feel guilty for being in a sense fake investors, because they would in a similar sense be fake founders. So while we probably wouldn't make much money out of it, we'd at least get to practice being investors on them, and they for their part would probably have a more interesting summer than they would working at Microsoft.
+
+We'd use the building I owned in Cambridge as our headquarters. We'd all have dinner there once a week (on Tuesdays, since I was already cooking for the Thursday diners on Thursdays), and after dinner we'd bring in experts on startups to give talks.
+
+We knew undergrads were deciding then about summer jobs, so in a matter of days we cooked up something we called the Summer Founders Program, and I posted an announcement on my site, inviting undergrads to apply. I had never imagined that writing essays would be a way to get "deal flow," as investors call it, but it turned out to be the perfect source. [15] We got 225 applications for the Summer Founders Program, and we were surprised to find that a lot of them were from people who'd already graduated, or were about to that spring. Already this SFP thing was starting to feel more serious than we'd intended.
+
+We invited about 20 of the 225 groups to interview in person, and from those we picked 8 to fund. They were an impressive group. That first batch included reddit; Justin Kan and Emmett Shear, who went on to found Twitch; Aaron Swartz, who had already helped write the RSS spec and would a few years later become a martyr for open access; and Sam Altman, who would later become the second president of YC. I don't think it was entirely luck that the first batch was so good. You had to be pretty bold to sign up for a weird thing like the Summer Founders Program instead of a summer job at a legit place like Microsoft or Goldman Sachs.
+
+The deal for startups was based on a combination of the deal we did with Julian ($10k for 10%) and what Robert said MIT grad students got for the summer ($6k). We invested $6k per founder, which in the typical two-founder case was $12k, in return for 6%. That had to be fair, because it was twice as good as the deal we ourselves had taken. Plus that first summer, which was really hot, Jessica brought the founders free air conditioners. [16]
+
+Fairly quickly I realized that we had stumbled upon the way to scale startup funding. Funding startups in batches was more convenient for us, because it meant we could do things for a lot of startups at once, but being part of a batch was better for the startups too. It solved one of the biggest problems faced by founders: the isolation. Now you not only had colleagues, but colleagues who understood the problems you were facing and could tell you how they were solving them.
+
+As YC grew, we started to notice other advantages of scale. The alumni became a tight community, dedicated to helping one another, and especially the current batch, whose shoes they remembered being in. We also noticed that the startups were becoming one another's customers. We used to refer jokingly to the "YC GDP," but as YC grows this becomes less and less of a joke. Now lots of startups get their initial set of customers almost entirely from among their batchmates.
+
+I had not originally intended YC to be a full-time job. I was going to do three things: hack, write essays, and work on YC. As YC grew, and I grew more excited about it, it started to take up a lot more than a third of my attention. But for the first few years I was still able to work on other things.
+
+In the summer of 2006, Robert and I started working on a new version of Arc. This one was reasonably fast, because it was compiled into Scheme. To test this new Arc, I wrote Hacker News in it. It was originally meant to be a news aggregator for startup founders and was called Startup News, but after a few months I got tired of reading about nothing but startups. Plus it wasn't startup founders we wanted to reach. It was future startup founders. So I changed the name to Hacker News and the topic to whatever engaged one's intellectual curiosity.
+
+HN was no doubt good for YC, but it was also by far the biggest source of stress for me. If all I'd had to do was select and help founders, life would have been so easy. And that implies that HN was a mistake. Surely the biggest source of stress in one's work should at least be something close to the core of the work. Whereas I was like someone who was in pain while running a marathon not from the exertion of running, but because I had a blister from an ill-fitting shoe. When I was dealing with some urgent problem during YC, there was about a 60% chance it had to do with HN, and a 40% chance it had to do with everything else combined. [17]
+
+As well as HN, I wrote all of YC's internal software in Arc. But while I continued to work a good deal in Arc, I gradually stopped working on Arc, partly because I didn't have time to, and partly because it was a lot less attractive to mess around with the language now that we had all this infrastructure depending on it. So now my three projects were reduced to two: writing essays and working on YC.
+
+YC was different from other kinds of work I've done. Instead of deciding for myself what to work on, the problems came to me. Every 6 months there was a new batch of startups, and their problems, whatever they were, became our problems. It was very engaging work, because their problems were quite varied, and the good founders were very effective. If you were trying to learn the most you could about startups in the shortest possible time, you couldn't have picked a better way to do it.
+
+There were parts of the job I didn't like. Disputes between cofounders, figuring out when people were lying to us, fighting with people who maltreated the startups, and so on. But I worked hard even at the parts I didn't like. I was haunted by something Kevin Hale once said about companies: "No one works harder than the boss." He meant it both descriptively and prescriptively, and it was the second part that scared me. I wanted YC to be good, so if how hard I worked set the upper bound on how hard everyone else worked, I'd better work very hard.
+
+One day in 2010, when he was visiting California for interviews, Robert Morris did something astonishing: he offered me unsolicited advice. I can only remember him doing that once before. One day at Viaweb, when I was bent over double from a kidney stone, he suggested that it would be a good idea for him to take me to the hospital. That was what it took for Rtm to offer unsolicited advice. So I remember his exact words very clearly. "You know," he said, "you should make sure Y Combinator isn't the last cool thing you do."
+
+At the time I didn't understand what he meant, but gradually it dawned on me that he was saying I should quit. This seemed strange advice, because YC was doing great. But if there was one thing rarer than Rtm offering advice, it was Rtm being wrong. So this set me thinking. It was true that on my current trajectory, YC would be the last thing I did, because it was only taking up more of my attention. It had already eaten Arc, and was in the process of eating essays too. Either YC was my life's work or I'd have to leave eventually. And it wasn't, so I would.
+
+In the summer of 2012 my mother had a stroke, and the cause turned out to be a blood clot caused by colon cancer. The stroke destroyed her balance, and she was put in a nursing home, but she really wanted to get out of it and back to her house, and my sister and I were determined to help her do it. I used to fly up to Oregon to visit her regularly, and I had a lot of time to think on those flights. On one of them I realized I was ready to hand YC over to someone else.
+
+I asked Jessica if she wanted to be president, but she didn't, so we decided we'd try to recruit Sam Altman. We talked to Robert and Trevor and we agreed to make it a complete changing of the guard. Up till that point YC had been controlled by the original LLC we four had started. But we wanted YC to last for a long time, and to do that it couldn't be controlled by the founders. So if Sam said yes, we'd let him reorganize YC. Robert and I would retire, and Jessica and Trevor would become ordinary partners.
+
+When we asked Sam if he wanted to be president of YC, initially he said no. He wanted to start a startup to make nuclear reactors. But I kept at it, and in October 2013 he finally agreed. We decided he'd take over starting with the winter 2014 batch. For the rest of 2013 I left running YC more and more to Sam, partly so he could learn the job, and partly because I was focused on my mother, whose cancer had returned.
+
+She died on January 15, 2014. We knew this was coming, but it was still hard when it did.
+
+I kept working on YC till March, to help get that batch of startups through Demo Day, then I checked out pretty completely. (I still talk to alumni and to new startups working on things I'm interested in, but that only takes a few hours a week.)
+
+What should I do next? Rtm's advice hadn't included anything about that. I wanted to do something completely different, so I decided I'd paint. I wanted to see how good I could get if I really focused on it. So the day after I stopped working on YC, I started painting. I was rusty and it took a while to get back into shape, but it was at least completely engaging. [18]
+
+I spent most of the rest of 2014 painting. I'd never been able to work so uninterruptedly before, and I got to be better than I had been. Not good enough, but better. Then in November, right in the middle of a painting, I ran out of steam. Up till that point I'd always been curious to see how the painting I was working on would turn out, but suddenly finishing this one seemed like a chore. So I stopped working on it and cleaned my brushes and haven't painted since. So far anyway.
+
+I realize that sounds rather wimpy. But attention is a zero sum game. If you can choose what to work on, and you choose a project that's not the best one (or at least a good one) for you, then it's getting in the way of another project that is. And at 50 there was some opportunity cost to screwing around.
+
+I started writing essays again, and wrote a bunch of new ones over the next few months. I even wrote a couple that weren't about startups. Then in March 2015 I started working on Lisp again.
+
+The distinctive thing about Lisp is that its core is a language defined by writing an interpreter in itself. It wasn't originally intended as a programming language in the ordinary sense. It was meant to be a formal model of computation, an alternative to the Turing machine. If you want to write an interpreter for a language in itself, what's the minimum set of predefined operators you need? The Lisp that John McCarthy invented, or more accurately discovered, is an answer to that question. [19]
+
+McCarthy didn't realize this Lisp could even be used to program computers till his grad student Steve Russell suggested it. Russell translated McCarthy's interpreter into IBM 704 machine language, and from that point Lisp started also to be a programming language in the ordinary sense. But its origins as a model of computation gave it a power and elegance that other languages couldn't match. It was this that attracted me in college, though I didn't understand why at the time.
+
+McCarthy's 1960 Lisp did nothing more than interpret Lisp expressions. It was missing a lot of things you'd want in a programming language. So these had to be added, and when they were, they weren't defined using McCarthy's original axiomatic approach. That wouldn't have been feasible at the time. McCarthy tested his interpreter by hand-simulating the execution of programs. But it was already getting close to the limit of interpreters you could test that way — indeed, there was a bug in it that McCarthy had overlooked. To test a more complicated interpreter, you'd have had to run it, and computers then weren't powerful enough.
+
+Now they are, though. Now you could continue using McCarthy's axiomatic approach till you'd defined a complete programming language. And as long as every change you made to McCarthy's Lisp was a discoveredness-preserving transformation, you could, in principle, end up with a complete language that had this quality. Harder to do than to talk about, of course, but if it was possible in principle, why not try? So I decided to take a shot at it. It took 4 years, from March 26, 2015 to October 12, 2019. It was fortunate that I had a precisely defined goal, or it would have been hard to keep at it for so long.
+
+I wrote this new Lisp, called Bel, in itself in Arc. That may sound like a contradiction, but it's an indication of the sort of trickery I had to engage in to make this work. By means of an egregious collection of hacks I managed to make something close enough to an interpreter written in itself that could actually run. Not fast, but fast enough to test.
+
+I had to ban myself from writing essays during most of this time, or I'd never have finished. In late 2015 I spent 3 months writing essays, and when I went back to working on Bel I could barely understand the code. Not so much because it was badly written as because the problem is so convoluted. When you're working on an interpreter written in itself, it's hard to keep track of what's happening at what level, and errors can be practically encrypted by the time you get them.
+
+So I said no more essays till Bel was done. But I told few people about Bel while I was working on it. So for years it must have seemed that I was doing nothing, when in fact I was working harder than I'd ever worked on anything. Occasionally after wrestling for hours with some gruesome bug I'd check Twitter or HN and see someone asking "Does Paul Graham still code?"
+
+Working on Bel was hard but satisfying. I worked on it so intensively that at any given time I had a decent chunk of the code in my head and could write more there. I remember taking the boys to the coast on a sunny day in 2015 and figuring out how to deal with some problem involving continuations while I watched them play in the tide pools. It felt like I was doing life right. I remember that because I was slightly dismayed at how novel it felt. The good news is that I had more moments like this over the next few years.
+
+In the summer of 2016 we moved to England. We wanted our kids to see what it was like living in another country, and since I was a British citizen by birth, that seemed the obvious choice. We only meant to stay for a year, but we liked it so much that we still live there. So most of Bel was written in England.
+
+In the fall of 2019, Bel was finally finished. Like McCarthy's original Lisp, it's a spec rather than an implementation, although like McCarthy's Lisp it's a spec expressed as code.
+
+Now that I could write essays again, I wrote a bunch about topics I'd had stacked up. I kept writing essays through 2020, but I also started to think about other things I could work on. How should I choose what to do? Well, how had I chosen what to work on in the past? I wrote an essay for myself to answer that question, and I was surprised how long and messy the answer turned out to be. If this surprised me, who'd lived it, then I thought perhaps it would be interesting to other people, and encouraging to those with similarly messy lives. So I wrote a more detailed version for others to read, and this is the last sentence of it.
+
+
+
+
+
+
+
+
+
+Notes
+
+[1] My experience skipped a step in the evolution of computers: time-sharing machines with interactive OSes. I went straight from batch processing to microcomputers, which made microcomputers seem all the more exciting.
+
+[2] Italian words for abstract concepts can nearly always be predicted from their English cognates (except for occasional traps like polluzione). It's the everyday words that differ. So if you string together a lot of abstract concepts with a few simple verbs, you can make a little Italian go a long way.
+
+[3] I lived at Piazza San Felice 4, so my walk to the Accademia went straight down the spine of old Florence: past the Pitti, across the bridge, past Orsanmichele, between the Duomo and the Baptistery, and then up Via Ricasoli to Piazza San Marco. I saw Florence at street level in every possible condition, from empty dark winter evenings to sweltering summer days when the streets were packed with tourists.
+
+[4] You can of course paint people like still lifes if you want to, and they're willing. That sort of portrait is arguably the apex of still life painting, though the long sitting does tend to produce pained expressions in the sitters.
+
+[5] Interleaf was one of many companies that had smart people and built impressive technology, and yet got crushed by Moore's Law. In the 1990s the exponential growth in the power of commodity (i.e. Intel) processors rolled up high-end, special-purpose hardware and software companies like a bulldozer.
+
+[6] The signature style seekers at RISD weren't specifically mercenary. In the art world, money and coolness are tightly coupled. Anything expensive comes to be seen as cool, and anything seen as cool will soon become equally expensive.
+
+[7] Technically the apartment wasn't rent-controlled but rent-stabilized, but this is a refinement only New Yorkers would know or care about. The point is that it was really cheap, less than half market price.
+
+[8] Most software you can launch as soon as it's done. But when the software is an online store builder and you're hosting the stores, if you don't have any users yet, that fact will be painfully obvious. So before we could launch publicly we had to launch privately, in the sense of recruiting an initial set of users and making sure they had decent-looking stores.
+
+[9] We'd had a code editor in Viaweb for users to define their own page styles. They didn't know it, but they were editing Lisp expressions underneath. But this wasn't an app editor, because the code ran when the merchants' sites were generated, not when shoppers visited them.
+
+[10] This was the first instance of what is now a familiar experience, and so was what happened next, when I read the comments and found they were full of angry people. How could I claim that Lisp was better than other languages? Weren't they all Turing complete? People who see the responses to essays I write sometimes tell me how sorry they feel for me, but I'm not exaggerating when I reply that it has always been like this, since the very beginning. It comes with the territory. An essay must tell readers things they don't already know, and some people dislike being told such things.
+
+[11] People put plenty of stuff on the internet in the 90s of course, but putting something online is not the same as publishing it online. Publishing online means you treat the online version as the (or at least a) primary version.
+
+[12] There is a general lesson here that our experience with Y Combinator also teaches: Customs continue to constrain you long after the restrictions that caused them have disappeared. Customary VC practice had once, like the customs about publishing essays, been based on real constraints. Startups had once been much more expensive to start, and proportionally rare. Now they could be cheap and common, but the VCs' customs still reflected the old world, just as customs about writing essays still reflected the constraints of the print era.
+
+Which in turn implies that people who are independent-minded (i.e. less influenced by custom) will have an advantage in fields affected by rapid change (where customs are more likely to be obsolete).
+
+Here's an interesting point, though: you can't always predict which fields will be affected by rapid change. Obviously software and venture capital will be, but who would have predicted that essay writing would be?
+
+[13] Y Combinator was not the original name. At first we were called Cambridge Seed. But we didn't want a regional name, in case someone copied us in Silicon Valley, so we renamed ourselves after one of the coolest tricks in the lambda calculus, the Y combinator.
+
+I picked orange as our color partly because it's the warmest, and partly because no VC used it. In 2005 all the VCs used staid colors like maroon, navy blue, and forest green, because they were trying to appeal to LPs, not founders. The YC logo itself is an inside joke: the Viaweb logo had been a white V on a red circle, so I made the YC logo a white Y on an orange square.
+
+[14] YC did become a fund for a couple years starting in 2009, because it was getting so big I could no longer afford to fund it personally. But after Heroku got bought we had enough money to go back to being self-funded.
+
+[15] I've never liked the term "deal flow," because it implies that the number of new startups at any given time is fixed. This is not only false, but it's the purpose of YC to falsify it, by causing startups to be founded that would not otherwise have existed.
+
+[16] She reports that they were all different shapes and sizes, because there was a run on air conditioners and she had to get whatever she could, but that they were all heavier than she could carry now.
+
+[17] Another problem with HN was a bizarre edge case that occurs when you both write essays and run a forum. When you run a forum, you're assumed to see if not every conversation, at least every conversation involving you. And when you write essays, people post highly imaginative misinterpretations of them on forums. Individually these two phenomena are tedious but bearable, but the combination is disastrous. You actually have to respond to the misinterpretations, because the assumption that you're present in the conversation means that not responding to any sufficiently upvoted misinterpretation reads as a tacit admission that it's correct. But that in turn encourages more; anyone who wants to pick a fight with you senses that now is their chance.
+
+[18] The worst thing about leaving YC was not working with Jessica anymore. We'd been working on YC almost the whole time we'd known each other, and we'd neither tried nor wanted to separate it from our personal lives, so leaving was like pulling up a deeply rooted tree.
+
+[19] One way to get more precise about the concept of invented vs discovered is to talk about space aliens. Any sufficiently advanced alien civilization would certainly know about the Pythagorean theorem, for example. I believe, though with less certainty, that they would also know about the Lisp in McCarthy's 1960 paper.
+
+But if so there's no reason to suppose that this is the limit of the language that might be known to them. Presumably aliens need numbers and errors and I/O too. So it seems likely there exists at least one path out of McCarthy's Lisp along which discoveredness is preserved.
+
+
+
+Thanks to Trevor Blackwell, John Collison, Patrick Collison, Daniel Gackle, Ralph Hazell, Jessica Livingston, Robert Morris, and Harj Taggar for reading drafts of this.
--- a/docs/docs/how_to/prompts_composition.ipynb
+++ b/docs/docs/how_to/prompts_composition.ipynb
@@ -17,13 +17,14 @@
   "source": [
    "# How to compose prompts together\n",
    "\n",
-    "LangChain provides a user friendly interface for composing different parts of prompts together. You can do this with either string prompts or chat prompts. Constructing prompts this way allows for easy reuse of components.\n",
+    ":::info Prerequisites\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
    "\n",
-    "<PrerequisiteLinks content={`- [Prompt templates](/docs/concepts/#prompt-templates)`} />\n",
-    "```"
+    ":::\n",
+    "\n",
+    "LangChain provides a user-friendly interface for composing different parts of prompts together. You can do this with either string prompts or chat prompts. Constructing prompts this way allows for easy reuse of components."
   ]
  },
  {
@@ -306,7 +307,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/prompts_partial.ipynb
+++ b/docs/docs/how_to/prompts_partial.ipynb
@@ -17,6 +17,13 @@
   "source": [
    "# How to partially format prompt templates\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "Like partially binding arguments to a function, it can make sense to \"partial\" a prompt template - e.g. pass in a subset of the required values, as to create a new prompt template which expects only the remaining subset of values.\n",
    "\n",
    "LangChain supports this in two ways:\n",
@@ -26,14 +33,6 @@
    "\n",
    "In the examples below, we go over the motivations for both use cases as well as how to do it in LangChain.\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
-    "`}/>\n",
-    "```\n",
-    "\n",
    "## Partial with strings\n",
    "\n",
    "One common use case for wanting to partial a prompt template is if you get access to some of the variables in a prompt before others. For example, suppose you have a prompt template that requires two variables, `foo` and `baz`. If you get the `foo` value early on in your chain, but the `baz` value later, it can be inconvenient to pass both variables all the way through the chain. Instead, you can partial the prompt template with the `foo` value, and then pass the partialed prompt template along and just use that. Below is an example of doing this:\n"
@@ -191,7 +190,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/routing.ipynb
+++ b/docs/docs/how_to/routing.ipynb
@@ -18,6 +18,17 @@
   "source": [
    "# How to route execution within a chain\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
+    "- [Chaining runnables](/docs/how_to/sequence/)\n",
+    "- [Configuring chain parameters at runtime](/docs/how_to/configure)\n",
+    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
+    "- [Chat Messages](/docs/concepts/#message-types)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "Routing allows you to create non-deterministic chains where the output of a previous step defines the next step. Routing can help provide structure and consistency around interactions with models by allowing you to define states and use information related to those states as context to model calls.\n",
    "\n",
    "There are two ways to perform routing:\n",
@@ -25,19 +36,7 @@
    "1. Conditionally return runnables from a [`RunnableLambda`](/docs/how_to/functions) (recommended)\n",
    "2. Using a `RunnableBranch` (legacy)\n",
    "\n",
-    "We'll illustrate both methods using a two step sequence where the first step classifies an input question as being about `LangChain`, `Anthropic`, or `Other`, then routes to a corresponding prompt chain.\n",
-    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
-    "- [Chaining runnables](/docs/how_to/sequence/)\n",
-    "- [Configuring chain parameters at runtime](/docs/how_to/configure)\n",
-    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
-    "- [Chat Messages](/docs/concepts/#message-types)\n",
-    "`} />\n",
-    "```"
+    "We'll illustrate both methods using a two step sequence where the first step classifies an input question as being about `LangChain`, `Anthropic`, or `Other`, then routes to a corresponding prompt chain."
   ]
  },
  {
@@ -474,7 +473,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/sequence.ipynb
+++ b/docs/docs/how_to/sequence.ipynb
@@ -16,20 +16,19 @@
   "source": [
    "# How to chain runnables\n",
    "\n",
-    "One point about [LangChain Expression Language](/docs/concepts/#langchain-expression-language) is that any two runnables can be \"chained\" together into sequences. The output of the previous runnable's `.invoke()` call is passed as input to the next runnable. This can be done using the pipe operator (`|`), or the more explicit `.pipe()` method, which does the same thing.\n",
+    ":::info Prerequisites\n",
    "\n",
-    "The resulting [`RunnableSequence`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.RunnableSequence.html) is itself a runnable, which means it can be invoked, streamed, or further chained just like any other runnable. Advantages of chaining runnables in this way are efficient streaming (the sequence will stream output as soon as it is available), and debugging and tracing with tools like [LangSmith](/docs/how_to/debugging).\n",
-    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
+    "This guide assumes familiarity with the following concepts:\n",
    "- [LangChain Expression Language (LCEL)](/docs/concepts/#langchain-expression-language)\n",
    "- [Prompt templates](/docs/concepts/#prompt-templates)\n",
    "- [Chat models](/docs/concepts/#chat-models)\n",
    "- [Output parser](/docs/concepts/#output-parsers)\n",
-    "`}/>\n",
-    "```\n",
+    "\n",
+    ":::\n",
+    "\n",
+    "One point about [LangChain Expression Language](/docs/concepts/#langchain-expression-language) is that any two runnables can be \"chained\" together into sequences. The output of the previous runnable's `.invoke()` call is passed as input to the next runnable. This can be done using the pipe operator (`|`), or the more explicit `.pipe()` method, which does the same thing.\n",
+    "\n",
+    "The resulting [`RunnableSequence`](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.RunnableSequence.html) is itself a runnable, which means it can be invoked, streamed, or further chained just like any other runnable. Advantages of chaining runnables in this way are efficient streaming (the sequence will stream output as soon as it is available), and debugging and tracing with tools like [LangSmith](/docs/how_to/debugging).\n",
    "\n",
    "## The pipe operator\n",
    "\n",
@@ -255,9 +254,9 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
 }
--- a/docs/docs/how_to/streaming.ipynb
+++ b/docs/docs/how_to/streaming.ipynb
@@ -15,7 +15,16 @@
   "id": "bb7d49db-04d3-4399-bfe1-09f82bbe6015",
   "metadata": {},
   "source": [
-    "# How to stream\n",
+    "# How to stream runnables\n",
+    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [Chat models](/docs/concepts/#chat-models)\n",
+    "- [LangChain Expression Language](/docs/concepts/#langchain-expression-language)\n",
+    "- [Output parsers](/docs/concepts/#output-parsers)\n",
+    "\n",
+    ":::\n",
    "\n",
    "Streaming is critical in making applications based on LLMs feel responsive to end-users.\n",
    "\n",
@@ -28,16 +37,6 @@
    "\n",
    "Let's take a look at both approaches, and try to understand how to use them.\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [Chat models](/docs/concepts/#chat-models)\n",
-    "- [LangChain Expression Language](/docs/concepts/#langchain-expression-language)\n",
-    "- [Output parsers](/docs/concepts/#output-parsers)\n",
-    "`} />\n",
-    "```\n",
-    "\n",
    "## Using Stream\n",
    "\n",
    "All `Runnable` objects implement a sync method called `stream` and an async variant called `astream`. \n",
@@ -1464,7 +1463,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/structured_output.ipynb
+++ b/docs/docs/how_to/structured_output.ipynb
@@ -17,31 +17,34 @@
   "source": [
    "# How to return structured data from a model\n",
    "\n",
-    "It is often useful to have a model return output that matches some specific schema. One common use-case is extracting data from arbitrary text to insert into a traditional database or use with some other downstrem system. This guide will show you a few different strategies you can use to do this.\n",
+    ":::info Prerequisites\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
+    "This guide assumes familiarity with the following concepts:\n",
    "- [Chat models](/docs/concepts/#chat-models)\n",
-    "`}/>\n",
-    "```\n",
+    "- [Function/tool calling](/docs/concepts/#functiontool-calling)\n",
+    ":::\n",
+    "\n",
+    "It is often useful to have a model return output that matches a specific schema. One common use-case is extracting data from text to insert into a database or use with some other downstream system. This guide covers a few strategies for getting structured outputs from a model.\n",
    "\n",
    "## The `.with_structured_output()` method\n",
    "\n",
-    "There are several strategies that models can use under the hood. For some of the most popular model providers, including [OpenAI](/docs/integrations/platforms/openai/), [Anthropic](/docs/integrations/platforms/anthropic/), and [Mistral](/docs/integrations/providers/mistralai/), LangChain implements a common interface that abstracts away these strategies called `.with_structured_output`.\n",
+    ":::info Supported models\n",
    "\n",
-    "By invoking this method (and passing in [JSON schema](https://json-schema.org/) or a [Pydantic](https://docs.pydantic.dev/latest/) model) the model will add whatever model parameters + output parsers are necessary to get back structured output matching the requested schema. If the model supports more than one way to do this (e.g., function calling vs JSON mode) - you can configure which method to use by passing into that method.\n",
+    "You can find a [list of models that support this method here](/docs/integrations/chat/).\n",
    "\n",
-    "You can find the [current list of models that support this method here](/docs/integrations/chat/).\n",
+    ":::\n",
    "\n",
-    "Let's look at some examples of this in action! We'll use Pydantic to create a simple response schema.\n",
+    "This is the easiest and most reliable way to get structured outputs. `with_structured_output()` is implemented for models that provide native APIs for structuring outputs, like tool/function calling or JSON mode, and makes use of these capabilities under the hood.\n",
+    "\n",
+    "This method takes a schema as input which specifies the names, types, and descriptions of the desired output attributes. The method returns a model-like Runnable, except that instead of outputting strings or Messages it outputs objects corresponding to the given schema. The schema can be specified as a [JSON Schema](https://json-schema.org/) or a Pydantic class. If JSON Schema is used then a dictionary will be returned by the Runnable, and if a Pydantic class is used then Pydantic objects will be returned.\n",
+    "\n",
+    "As an example, let's get a model to generate a joke and separate the setup from the punchline:\n",
    "\n",
    "```{=mdx}\n",
    "import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
    "\n",
    "<ChatModelTabs\n",
-    "  customVarName=\"model\"\n",
+    "  customVarName=\"llm\"\n",
    "/>\n",
    "```"
   ]
@@ -58,25 +61,30 @@
    "\n",
    "from langchain_openai import ChatOpenAI\n",
    "\n",
-    "model = ChatOpenAI(\n",
-    "    model=\"gpt-4-0125-preview\",\n",
-    "    temperature=0,\n",
-    ")"
+    "llm = ChatOpenAI(model=\"gpt-4-0125-preview\", temperature=0)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a808a401-be1f-49f9-ad13-58dd68f7db5f",
+   "metadata": {},
+   "source": [
+    "If we want the model to return a Pydantic object, we just need to pass in the desired Pydantic class:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 38,
   "id": "070bf702",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "Joke(setup='Why was the cat sitting on the computer?', punchline='Because it wanted to keep an eye on the mouse!', rating=None)"
+       "Joke(setup='Why was the cat sitting on the computer?', punchline='To keep an eye on the mouse!', rating=None)"
      ]
     },
-     "execution_count": 13,
+     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -88,35 +96,39 @@
    "\n",
    "\n",
    "class Joke(BaseModel):\n",
+    "    \"\"\"Joke to tell user.\"\"\"\n",
+    "\n",
    "    setup: str = Field(description=\"The setup of the joke\")\n",
    "    punchline: str = Field(description=\"The punchline to the joke\")\n",
    "    rating: Optional[int] = Field(description=\"How funny the joke is, from 1 to 10\")\n",
    "\n",
    "\n",
-    "structured_llm = model.with_structured_output(Joke)\n",
+    "structured_llm = llm.with_structured_output(Joke)\n",
    "\n",
    "structured_llm.invoke(\"Tell me a joke about cats\")"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "00890a47-3cdf-4805-b8f1-6d110f0633d3",
+   "metadata": {},
+   "source": [
+    ":::tip\n",
+    "Beyond just the structure of the Pydantic class, the name of the Pydantic class, the docstring, and the names and provided descriptions of parameters are very important. Most of the time `with_structured_output` is using a model's function/tool calling API, and you can effectively think of all of this information as being added to the model prompt.\n",
+    ":::"
+   ]
+  },
  {
   "cell_type": "markdown",
   "id": "deddb6d3",
   "metadata": {},
   "source": [
-    "The result is a Pydantic model. Note that name of the model and the names and provided descriptions of parameters are very important, as they help guide the model's output.\n",
-    "\n",
-    "We can also pass in an OpenAI-style JSON schema dict if you prefer not to use Pydantic. This dict should contain three properties:\n",
-    "\n",
-    "- `name`: The name of the schema to output.\n",
-    "- `description`: A high level description of the schema to output.\n",
-    "- `parameters`: The nested details of the schema you want to extract, formatted as a [JSON schema](https://json-schema.org/) dict.\n",
-    "\n",
-    "In this case, the response is also a dict:"
+    "We can also pass in a [JSON Schema](https://json-schema.org/) dict if you prefer not to use Pydantic. In this case, the response is also a dict:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 8,
   "id": "6700994a",
   "metadata": {},
   "outputs": [
@@ -124,30 +136,37 @@
     "data": {
      "text/plain": [
       "{'setup': 'Why was the cat sitting on the computer?',\n",
-       " 'punchline': 'To keep an eye on the mouse!'}"
+       " 'punchline': 'Because it wanted to keep an eye on the mouse!',\n",
+       " 'rating': 8}"
      ]
     },
-     "execution_count": 3,
+     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "structured_llm = model.with_structured_output(\n",
-    "    {\n",
-    "        \"name\": \"joke\",\n",
-    "        \"description\": \"Joke to tell user.\",\n",
-    "        \"parameters\": {\n",
-    "            \"title\": \"Joke\",\n",
-    "            \"type\": \"object\",\n",
-    "            \"properties\": {\n",
-    "                \"setup\": {\"type\": \"string\", \"description\": \"The setup for the joke\"},\n",
-    "                \"punchline\": {\"type\": \"string\", \"description\": \"The joke's punchline\"},\n",
-    "            },\n",
-    "            \"required\": [\"setup\", \"punchline\"],\n",
+    "json_schema = {\n",
+    "    \"title\": \"joke\",\n",
+    "    \"description\": \"Joke to tell user.\",\n",
+    "    \"type\": \"object\",\n",
+    "    \"properties\": {\n",
+    "        \"setup\": {\n",
+    "            \"type\": \"string\",\n",
+    "            \"description\": \"The setup of the joke\",\n",
    "        },\n",
-    "    }\n",
-    ")\n",
+    "        \"punchline\": {\n",
+    "            \"type\": \"string\",\n",
+    "            \"description\": \"The punchline to the joke\",\n",
+    "        },\n",
+    "        \"rating\": {\n",
+    "            \"type\": \"integer\",\n",
+    "            \"description\": \"How funny the joke is, from 1 to 10\",\n",
+    "        },\n",
+    "    },\n",
+    "    \"required\": [\"setup\", \"punchline\"],\n",
+    "}\n",
+    "structured_llm = llm.with_structured_output(json_schema)\n",
    "\n",
    "structured_llm.invoke(\"Tell me a joke about cats\")"
   ]
@@ -159,7 +178,7 @@
   "source": [
    "### Choosing between multiple schemas\n",
    "\n",
-    "If you have multiple schemas that are valid outputs for the model, you can use Pydantic's `Union` type:"
+    "The simplest way to let the model choose from multiple schemas is to create a parent Pydantic class that has a Union-typed attribute:"
   ]
  },
  {
@@ -171,7 +190,7 @@
    {
     "data": {
      "text/plain": [
-       "Response(output=Joke(setup='Why was the cat sitting on the computer?', punchline='Because it wanted to keep an eye on the mouse!'))"
+       "Response(output=Joke(setup='Why was the cat sitting on the computer?', punchline='To keep an eye on the mouse!', rating=8))"
      ]
     },
     "execution_count": 4,
@@ -182,15 +201,10 @@
   "source": [
    "from typing import Union\n",
    "\n",
-    "from langchain_core.pydantic_v1 import BaseModel, Field\n",
-    "\n",
-    "\n",
-    "class Joke(BaseModel):\n",
-    "    setup: str = Field(description=\"The setup of the joke\")\n",
-    "    punchline: str = Field(description=\"The punchline to the joke\")\n",
-    "\n",
    "\n",
    "class ConversationalResponse(BaseModel):\n",
+    "    \"\"\"Respond in a conversational manner. Be kind and helpful.\"\"\"\n",
+    "\n",
    "    response: str = Field(description=\"A conversational response to the user's query\")\n",
    "\n",
    "\n",
@@ -198,7 +212,7 @@
    "    output: Union[Joke, ConversationalResponse]\n",
    "\n",
    "\n",
-    "structured_llm = model.with_structured_output(Response)\n",
+    "structured_llm = llm.with_structured_output(Response)\n",
    "\n",
    "structured_llm.invoke(\"Tell me a joke about cats\")"
   ]
@@ -212,7 +226,7 @@
    {
     "data": {
      "text/plain": [
-       "Response(output=ConversationalResponse(response=\"I'm just a collection of code, so I don't have feelings, but thanks for asking! How can I assist you today?\"))"
+       "Response(output=ConversationalResponse(response=\"I'm just a digital assistant, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?\"))"
      ]
     },
     "execution_count": 5,
@@ -229,9 +243,225 @@
   "id": "e28c14d3",
   "metadata": {},
   "source": [
-    "If you are using JSON Schema, you can take advantage of other more complex schema descriptions to create a similar effect.\n",
+    "Alternatively, you can use tool calling directly to allow the model to choose between options, if your [chosen model supports it](/docs/integrations/chat/). This involves a bit more parsing and setup but in some instances leads to better performance because you don't have to use nested schemas. See [this how-to guide](/docs/how_to/tool_calling/) for more details."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9a40f703-7fd2-4fe0-ab2a-fa2d711ba009",
+   "metadata": {},
+   "source": [
+    "### Streaming\n",
    "\n",
-    "You can also use tool calling directly to allow the model to choose between options, if your chosen model supports it. This involves a bit more parsing and setup. See [this how-to guide](/docs/how_to/tool_calling/) for more details."
+    "We can stream outputs from our structured model when the output type is a dict (i.e., when the schema is specified as a JSON Schema dict). \n",
+    "\n",
+    ":::info\n",
+    "\n",
+    "Note that what's yielded is already aggregated chunks, not deltas.\n",
+    "\n",
+    ":::"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 43,
+   "id": "aff89877-28a3-472f-a1aa-eff893fe7736",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{}\n",
+      "{'setup': ''}\n",
+      "{'setup': 'Why'}\n",
+      "{'setup': 'Why was'}\n",
+      "{'setup': 'Why was the'}\n",
+      "{'setup': 'Why was the cat'}\n",
+      "{'setup': 'Why was the cat sitting'}\n",
+      "{'setup': 'Why was the cat sitting on'}\n",
+      "{'setup': 'Why was the cat sitting on the'}\n",
+      "{'setup': 'Why was the cat sitting on the computer'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': ''}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted to'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted to keep'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted to keep an'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted to keep an eye'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted to keep an eye on'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted to keep an eye on the'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted to keep an eye on the mouse'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted to keep an eye on the mouse!'}\n",
+      "{'setup': 'Why was the cat sitting on the computer?', 'punchline': 'Because it wanted to keep an eye on the mouse!', 'rating': 8}\n"
+     ]
+    }
+   ],
+   "source": [
+    "structured_llm = llm.with_structured_output(json_schema)\n",
+    "\n",
+    "for chunk in structured_llm.stream(\"Tell me a joke about cats\"):\n",
+    "    print(chunk)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0a526cdf-e736-451b-96be-22e8986d3863",
+   "metadata": {},
+   "source": [
+    "### Few-shot prompting\n",
+    "\n",
+    "For more complex schemas it's very useful to add few-shot examples to the prompt. This can be done in a few ways.\n",
+    "\n",
+    "The simplest and most universal way is to add examples to a system message in the prompt:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 47,
+   "id": "283ba784-2072-47ee-9b2c-1119e3c69e8e",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'setup': 'Woodpecker',\n",
+       " 'punchline': \"Woodpecker goes 'knock knock', but don't worry, they never expect you to answer the door!\",\n",
+       " 'rating': 8}"
+      ]
+     },
+     "execution_count": 47,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain_core.prompts import ChatPromptTemplate\n",
+    "\n",
+    "system = \"\"\"You are a hilarious comedian. Your specialty is knock-knock jokes. \\\n",
+    "Return a joke which has the setup (the response to \"Who's there?\") and the final punchline (the response to \"<setup> who?\").\n",
+    "\n",
+    "Here are some examples of jokes:\n",
+    "\n",
+    "example_user: Tell me a joke about planes\n",
+    "example_assistant: {{\"setup\": \"Why don't planes ever get tired?\", \"punchline\": \"Because they have rest wings!\", \"rating\": 2}}\n",
+    "\n",
+    "example_user: Tell me another joke about planes\n",
+    "example_assistant: {{\"setup\": \"Cargo\", \"punchline\": \"Cargo 'vroom vroom', but planes go 'zoom zoom'!\", \"rating\": 10}}\n",
+    "\n",
+    "example_user: Now about caterpillars\n",
+    "example_assistant: {{\"setup\": \"Caterpillar\", \"punchline\": \"Caterpillar really slow, but watch me turn into a butterfly and steal the show!\", \"rating\": 5}}\"\"\"\n",
+    "\n",
+    "prompt = ChatPromptTemplate.from_messages([(\"system\", system), (\"human\", \"{input}\")])\n",
+    "\n",
+    "few_shot_structured_llm = prompt | structured_llm\n",
+    "few_shot_structured_llm.invoke(\"what's something funny about woodpeckers\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3c12b389-153d-44d1-af34-37e5b926d3db",
+   "metadata": {},
+   "source": [
+    "When the underlying method for structuring outputs is tool calling, we can pass in our examples as explicit tool calls. You can check if the model you're using makes use of tool calling in its API reference."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 46,
+   "id": "d7381cb0-b2c3-4302-a319-ed72d0b9e43f",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{'setup': 'Crocodile',\n",
+       " 'punchline': \"Crocodile 'see you later', but in a while, it becomes an alligator!\",\n",
+       " 'rating': 7}"
+      ]
+     },
+     "execution_count": 46,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain_core.messages import AIMessage, HumanMessage, ToolMessage\n",
+    "\n",
+    "examples = [\n",
+    "    HumanMessage(\"Tell me a joke about planes\", name=\"example_user\"),\n",
+    "    AIMessage(\n",
+    "        \"\",\n",
+    "        name=\"example_assistant\",\n",
+    "        tool_calls=[\n",
+    "            {\n",
+    "                \"name\": \"joke\",\n",
+    "                \"args\": {\n",
+    "                    \"setup\": \"Why don't planes ever get tired?\",\n",
+    "                    \"punchline\": \"Because they have rest wings!\",\n",
+    "                    \"rating\": 2,\n",
+    "                },\n",
+    "                \"id\": \"1\",\n",
+    "            }\n",
+    "        ],\n",
+    "    ),\n",
+    "    # Most tool-calling models expect one or more ToolMessages to follow an AIMessage with tool calls.\n",
+    "    ToolMessage(\"\", tool_call_id=\"1\"),\n",
+    "    # Some models also expect an AIMessage to follow any ToolMessages,\n",
+    "    # so you may need to add an AIMessage here.\n",
+    "    HumanMessage(\"Tell me another joke about planes\", name=\"example_user\"),\n",
+    "    AIMessage(\n",
+    "        \"\",\n",
+    "        name=\"example_assistant\",\n",
+    "        tool_calls=[\n",
+    "            {\n",
+    "                \"name\": \"joke\",\n",
+    "                \"args\": {\n",
+    "                    \"setup\": \"Cargo\",\n",
+    "                    \"punchline\": \"Cargo 'vroom vroom', but planes go 'zoom zoom'!\",\n",
+    "                    \"rating\": 10,\n",
+    "                },\n",
+    "                \"id\": \"2\",\n",
+    "            }\n",
+    "        ],\n",
+    "    ),\n",
+    "    ToolMessage(\"\", tool_call_id=\"2\"),\n",
+    "    HumanMessage(\"Now about caterpillars\", name=\"example_user\"),\n",
+    "    AIMessage(\n",
+    "        \"\",\n",
+    "        tool_calls=[\n",
+    "            {\n",
+    "                \"name\": \"joke\",\n",
+    "                \"args\": {\n",
+    "                    \"setup\": \"Caterpillar\",\n",
+    "                    \"punchline\": \"Caterpillar really slow, but watch me turn into a butterfly and steal the show!\",\n",
+    "                    \"rating\": 5,\n",
+    "                },\n",
+    "                \"id\": \"3\",\n",
+    "            }\n",
+    "        ],\n",
+    "    ),\n",
+    "    ToolMessage(\"\", tool_call_id=\"3\"),\n",
+    "]\n",
+    "system = \"\"\"You are a hilarious comedian. Your specialty is knock-knock jokes. \\\n",
+    "Return a joke which has the setup (the response to \"Who's there?\") \\\n",
+    "and the final punchline (the response to \"<setup> who?\").\"\"\"\n",
+    "\n",
+    "prompt = ChatPromptTemplate.from_messages(\n",
+    "    [(\"system\", system), (\"placeholder\", \"{examples}\"), (\"human\", \"{input}\")]\n",
+    ")\n",
+    "few_shot_structured_llm = prompt | structured_llm\n",
+    "few_shot_structured_llm.invoke({\"input\": \"crocodiles\", \"examples\": examples})"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "498d893b-ceaa-47ff-a9d8-4faa60702715",
+   "metadata": {},
+   "source": [
+    "For more on few-shot prompting when using tool calling, see [here](/docs/how_to/function_calling/#Few-shot-prompting)."
   ]
  },
  {
@@ -239,9 +469,17 @@
   "id": "39d7a555",
   "metadata": {},
   "source": [
-    "### Specifying the output method (Advanced)\n",
+    "### (Advanced) Specifying the method for structuring outputs\n",
    "\n",
-    "For models that support more than one means of outputting data, you can specify the preferred one like this:"
+    "For models that support more than one means of structuring outputs (i.e., they support both tool calling and JSON mode), you can specify which method to use with the `method=` argument.\n",
+    "\n",
+    ":::info JSON mode\n",
+    "\n",
+    "If using JSON mode, you'll still have to specify the desired schema in the model prompt. The schema you pass to `with_structured_output` will only be used for parsing the model outputs; it will not be passed to the model the way it is with tool calling.\n",
+    "\n",
+    "To see if the model you're using supports JSON mode, check its entry in the [API reference](https://api.python.langchain.com/en/latest/langchain_api_reference.html).\n",
+    "\n",
+    ":::"
   ]
  },
  {
@@ -253,7 +491,7 @@
    {
     "data": {
      "text/plain": [
-       "Joke(setup='Why was the cat sitting on the computer?', punchline='Because it wanted to keep an eye on the mouse!')"
+       "Joke(setup='Why was the cat sitting on the computer?', punchline='Because it wanted to keep an eye on the mouse!', rating=None)"
      ]
     },
     "execution_count": 6,
@@ -262,7 +500,7 @@
    }
   ],
   "source": [
-    "structured_llm = model.with_structured_output(Joke, method=\"json_mode\")\n",
+    "structured_llm = llm.with_structured_output(Joke, method=\"json_mode\")\n",
    "\n",
    "structured_llm.invoke(\n",
    "    \"Tell me a joke about cats, respond in JSON with `setup` and `punchline` keys\"\n",
@@ -274,13 +512,9 @@
   "id": "5e92a98a",
   "metadata": {},
   "source": [
-    "In the above example, we use OpenAI's alternate JSON mode capability along with a more specific prompt.\n",
+    "## Prompting and parsing model outputs directly\n",
    "\n",
-    "For specifics about the model you choose, peruse its entry in the [API reference pages](https://api.python.langchain.com/en/latest/langchain_api_reference.html).\n",
-    "\n",
-    "## Prompting techniques\n",
-    "\n",
-    "You can also prompt models to outputting information in a given format. This approach relies on designing good prompts and then parsing the output of the models. This is the only option for models that don't support `.with_structured_output()` or other built-in approaches.\n",
+    "Not all models support `.with_structured_output()`, since not all models have tool calling or JSON mode support. For such models you'll need to directly prompt the model to use a specific format, and use an output parser to extract the structured response from the raw model output.\n",
    "\n",
    "### Using `PydanticOutputParser`\n",
    "\n",
@@ -289,14 +523,14 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 31,
   "id": "6e514455",
   "metadata": {},
   "outputs": [],
   "source": [
    "from typing import List\n",
    "\n",
-    "from langchain.output_parsers import PydanticOutputParser\n",
+    "from langchain_core.output_parsers import PydanticOutputParser\n",
    "from langchain_core.prompts import ChatPromptTemplate\n",
    "from langchain_core.pydantic_v1 import BaseModel, Field\n",
    "\n",
@@ -341,7 +575,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 37,
   "id": "3d73d33d",
   "metadata": {},
   "outputs": [
@@ -366,7 +600,7 @@
   "source": [
    "query = \"Anna is 23 years old and she is 6 feet tall\"\n",
    "\n",
-    "print(prompt.format_prompt(query=query).to_string())"
+    "print(prompt.invoke(query).to_string())"
   ]
  },
  {
@@ -395,7 +629,7 @@
    }
   ],
   "source": [
-    "chain = prompt | model | parser\n",
+    "chain = prompt | llm | parser\n",
    "\n",
    "chain.invoke({\"query\": query})"
   ]
@@ -538,35 +772,17 @@
    }
   ],
   "source": [
-    "chain = prompt | model | extract_json\n",
+    "chain = prompt | llm | extract_json\n",
    "\n",
    "chain.invoke({\"query\": query})"
   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "7a39221a",
-   "metadata": {},
-   "source": [
-    "## Next steps\n",
-    "\n",
-    "Now you've learned a few methods to make a model output structured data.\n",
-    "\n",
-    "To learn more, check out the other how-to guides in this section, or the conceptual guide on tool calling."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "6e3759e2",
-   "metadata": {},
-   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "poetry-venv-2",
   "language": "python",
-   "name": "python3"
+   "name": "poetry-venv-2"
  },
  "language_info": {
   "codemirror_mode": {
@@ -578,7 +794,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.4"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/how_to/tool_calling.ipynb
+++ b/docs/docs/how_to/tool_calling.ipynb
@@ -6,6 +6,14 @@
   "source": [
    "# How to use a chat model to call tools\n",
    "\n",
+    ":::info Prerequisites\n",
+    "\n",
+    "This guide assumes familiarity with the following concepts:\n",
+    "- [Chat models](/docs/concepts/#chat-models)\n",
+    "- [LangChain Tools](/docs/concepts/#tools)\n",
+    "\n",
+    ":::\n",
+    "\n",
    "```{=mdx}\n",
    ":::info\n",
    "We use the term tool calling interchangeably with function calling. Although\n",
@@ -40,15 +48,6 @@
    "LangChain implements standard interfaces for defining tools, passing them to LLMs, \n",
    "and representing tool calls. This guide will show you how to use them.\n",
    "\n",
-    "```{=mdx}\n",
-    "import PrerequisiteLinks from \"@theme/PrerequisiteLinks\";\n",
-    "\n",
-    "<PrerequisiteLinks content={`\n",
-    "- [Chat models](/docs/concepts/#chat-models)\n",
-    "- [LangChain Tools](/docs/concepts/#tools)\n",
-    "`} />\n",
-    "```\n",
-    "\n",
    "## Passing tools to chat models\n",
    "\n",
    "Chat models that support tool calling features implement a `.bind_tools` method, which \n",
@@ -706,9 +705,9 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
- "nbformat_minor": 2
+ "nbformat_minor": 4
 }
--- a/docs/docs/how_to/tool_calls_multi_modal.ipynb
+++ b/docs/docs/how_to/tool_calls_multi_modal.ipynb
@@ -0,0 +1,160 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "4facdf7f-680e-4d28-908b-2b8408e2a741",
+   "metadata": {},
+   "source": [
+    "# How to call tools with multi-modal data\n",
+    "\n",
+    "Here we demonstrate how to call tools with multi-modal data, such as images.\n",
+    "\n",
+    "Some multi-modal models, such as those that can reason over images or audio, support [tool calling](/docs/concepts/#functiontool-calling) features as well.\n",
+    "\n",
+    "To call tools using such models, simply bind tools to them in the [usual way](/docs/how_to/tool_calling), and invoke the model using content blocks of the desired type (e.g., containing image data).\n",
+    "\n",
+    "Below, we demonstrate examples using [OpenAI](/docs/integrations/platforms/openai) and [Anthropic](/docs/integrations/platforms/anthropic). We will use the same image and tool in all cases. Let's first select an image, and build a placeholder tool that expects as input the string \"sunny\", \"cloudy\", or \"rainy\". We will ask the models to describe the weather in the image."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "0d9fd81a-b7f0-445a-8e3d-cfc2d31fdd59",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from typing import Literal\n",
+    "\n",
+    "from langchain_core.tools import tool\n",
+    "\n",
+    "image_url = \"https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg\"\n",
+    "\n",
+    "\n",
+    "@tool\n",
+    "def weather_tool(weather: Literal[\"sunny\", \"cloudy\", \"rainy\"]) -> None:\n",
+    "    \"\"\"Describe the weather\"\"\"\n",
+    "    pass"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8656018e-c56d-47d2-b2be-71e87827f90a",
+   "metadata": {},
+   "source": [
+    "## OpenAI\n",
+    "\n",
+    "For OpenAI, we can feed the image URL directly in a content block of type \"image_url\":"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "a8819cf3-5ddc-44f0-889a-19ca7b7fe77e",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[{'name': 'weather_tool', 'args': {'weather': 'sunny'}, 'id': 'call_mRYL50MtHdeNuNIjSCm5UPmB'}]\n"
+     ]
+    }
+   ],
+   "source": [
+    "from langchain_core.messages import HumanMessage\n",
+    "from langchain_openai import ChatOpenAI\n",
+    "\n",
+    "model = ChatOpenAI(model=\"gpt-4o\").bind_tools([weather_tool])\n",
+    "\n",
+    "message = HumanMessage(\n",
+    "    content=[\n",
+    "        {\"type\": \"text\", \"text\": \"describe the weather in this image\"},\n",
+    "        {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n",
+    "    ],\n",
+    ")\n",
+    "response = model.invoke([message])\n",
+    "print(response.tool_calls)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e5738224-1109-4bf8-8976-ff1570dd1d46",
+   "metadata": {},
+   "source": [
+    "Note that we recover tool calls with parsed arguments in LangChain's [standard format](/docs/how_to/tool_calling) in the model response."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0cee63ff-e09f-4dd8-8323-912edbde94f6",
+   "metadata": {},
+   "source": [
+    "## Anthropic\n",
+    "\n",
+    "For Anthropic, we can format a base64-encoded image into a content block of type \"image\", as below:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "d90c4590-71c8-42b1-99ff-03a9eca8082e",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[{'name': 'weather_tool', 'args': {'weather': 'sunny'}, 'id': 'toolu_016m9KfknJqx5fVRYk4tkF6s'}]\n"
+     ]
+    }
+   ],
+   "source": [
+    "import base64\n",
+    "\n",
+    "import httpx\n",
+    "from langchain_anthropic import ChatAnthropic\n",
+    "\n",
+    "image_data = base64.b64encode(httpx.get(image_url).content).decode(\"utf-8\")\n",
+    "\n",
+    "model = ChatAnthropic(model=\"claude-3-sonnet-20240229\").bind_tools([weather_tool])\n",
+    "\n",
+    "message = HumanMessage(\n",
+    "    content=[\n",
+    "        {\"type\": \"text\", \"text\": \"describe the weather in this image\"},\n",
+    "        {\n",
+    "            \"type\": \"image\",\n",
+    "            \"source\": {\n",
+    "                \"type\": \"base64\",\n",
+    "                \"media_type\": \"image/jpeg\",\n",
+    "                \"data\": image_data,\n",
+    "            },\n",
+    "        },\n",
+    "    ],\n",
+    ")\n",
+    "response = model.invoke([message])\n",
+    "print(response.tool_calls)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
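Review note on `tool_calls_multi_modal.ipynb`, not part of the patch: the notebook uses two content-block shapes, a URL reference for OpenAI and inline base64 data for Anthropic. A minimal standard-library sketch of both follows, assuming the block layouts shown in the notebook cells above. The helper names `openai_image_block`, `anthropic_image_block`, and `build_content` are hypothetical and do not exist in LangChain.

```python
import base64


def openai_image_block(url: str) -> dict:
    """Content block that references an image by URL (shape used in the OpenAI cell)."""
    return {"type": "image_url", "image_url": {"url": url}}


def anthropic_image_block(raw: bytes, media_type: str = "image/jpeg") -> dict:
    """Content block that embeds base64-encoded bytes (shape used in the Anthropic cell)."""
    encoded = base64.b64encode(raw).decode("utf-8")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": encoded},
    }


def build_content(prompt: str, image_block: dict) -> list:
    """Pair one text block with one image block, mirroring the HumanMessage content lists."""
    return [{"type": "text", "text": prompt}, image_block]


if __name__ == "__main__":
    # Placeholder bytes stand in for a downloaded image, so no network call is needed.
    placeholder = b"placeholder image bytes"
    content = build_content(
        "describe the weather in this image", anthropic_image_block(placeholder)
    )
    print(content[1]["source"]["media_type"])
```

The returned list could be passed as `content=` to `HumanMessage`, as the notebook does. Only the image block differs between the two providers.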
--- a/docs/docs/integrations/chat/huggingface.ipynb
+++ b/docs/docs/integrations/chat/huggingface.ipynb
@@ -9,9 +9,10 @@
    "This notebook shows how to get started using `Hugging Face` LLM's as chat models.\n",
    "\n",
    "In particular, we will:\n",
-    "1. Utilize the [HuggingFaceTextGenInference](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/huggingface_text_gen_inference.py), [HuggingFaceEndpoint](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/huggingface_endpoint.py), or [HuggingFaceHub](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/huggingface_hub.py) integrations to instantiate an `LLM`.\n",
-    "2. Utilize the `ChatHuggingFace` class to enable any of these LLMs to interface with LangChain's [Chat Messages](/docs/concepts#chat-models) abstraction.\n",
-    "3. Demonstrate how to use an open-source LLM to power an `ChatAgent` pipeline\n",
+    "1. Utilize the [HuggingFaceEndpoint](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/llms/huggingface_endpoint.py) integration to instantiate an `LLM`.\n",
+    "2. Utilize the `ChatHuggingFace` class to enable the LLM to interface with LangChain's [Chat Messages](/docs/concepts/#message-types) abstraction.\n",
+    "3. Explore tool calling with `ChatHuggingFace`.\n",
+    "4. Demonstrate how to use an open-source LLM to power a `ChatAgent` pipeline.\n",
    "\n",
    "\n",
    "> Note: To get started, you'll need to have a [Hugging Face Access Token](https://huggingface.co/docs/hub/security-tokens) saved as an environment variable: `HUGGINGFACEHUB_API_TOKEN`."
@@ -21,61 +22,16 @@
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "\u001b[0mNote: you may need to restart the kernel to use updated packages.\n"
-     ]
-    }
-   ],
-   "source": [
-    "%pip install --upgrade --quiet  text-generation transformers google-search-results numexpr langchainhub sentencepiece jinja2"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## 1. Instantiate an LLM\n",
-    "\n",
-    "There are three LLM options to choose from."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### `HuggingFaceTextGenInference`"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {},
   "outputs": [],
   "source": [
-    "import os\n",
-    "\n",
-    "from langchain_community.llms import HuggingFaceTextGenInference\n",
-    "\n",
-    "ENDPOINT_URL = \"<YOUR_ENDPOINT_URL_HERE>\"\n",
-    "HF_TOKEN = os.getenv(\"HUGGINGFACEHUB_API_TOKEN\")\n",
-    "\n",
-    "llm = HuggingFaceTextGenInference(\n",
-    "    inference_server_url=ENDPOINT_URL,\n",
-    "    max_new_tokens=512,\n",
-    "    top_k=50,\n",
-    "    temperature=0.1,\n",
-    "    repetition_penalty=1.03,\n",
-    "    server_kwargs={\n",
-    "        \"headers\": {\n",
-    "            \"Authorization\": f\"Bearer {HF_TOKEN}\",\n",
-    "            \"Content-Type\": \"application/json\",\n",
-    "        }\n",
-    "    },\n",
-    ")"
+    "%pip install --upgrade --quiet  langchain-huggingface text-generation transformers google-search-results numexpr langchainhub sentencepiece jinja2"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 1. Instantiate an LLM"
   ]
  },
  {
@@ -87,58 +43,18 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.llms import HuggingFaceEndpoint\n",
+    "from langchain_huggingface import HuggingFaceEndpoint\n",
    "\n",
-    "ENDPOINT_URL = \"<YOUR_ENDPOINT_URL_HERE>\"\n",
    "llm = HuggingFaceEndpoint(\n",
-    "    endpoint_url=ENDPOINT_URL,\n",
+    "    repo_id=\"meta-llama/Meta-Llama-3-70B-Instruct\",\n",
    "    task=\"text-generation\",\n",
-    "    model_kwargs={\n",
-    "        \"max_new_tokens\": 512,\n",
-    "        \"top_k\": 50,\n",
-    "        \"temperature\": 0.1,\n",
-    "        \"repetition_penalty\": 1.03,\n",
-    "    },\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### `HuggingFaceHub`"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "/Users/jacoblee/langchain/langchain/libs/langchain/.venv/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py:127: FutureWarning: '__init__' (from 'huggingface_hub.inference_api') is deprecated and will be removed from version '1.0'. `InferenceApi` client is deprecated in favor of the more feature-complete `InferenceClient`. Check out this guide to learn how to convert your script to use it: https://huggingface.co/docs/huggingface_hub/guides/inference#legacy-inferenceapi-client.\n",
-      "  warnings.warn(warning_message, FutureWarning)\n"
-     ]
-    }
-   ],
-   "source": [
-    "from langchain_community.llms import HuggingFaceHub\n",
-    "\n",
-    "llm = HuggingFaceHub(\n",
-    "    repo_id=\"HuggingFaceH4/zephyr-7b-beta\",\n",
-    "    task=\"text-generation\",\n",
-    "    model_kwargs={\n",
-    "        \"max_new_tokens\": 512,\n",
-    "        \"top_k\": 30,\n",
-    "        \"temperature\": 0.1,\n",
-    "        \"repetition_penalty\": 1.03,\n",
-    "    },\n",
+    "    max_new_tokens=512,\n",
+    "    do_sample=False,\n",
+    "    repetition_penalty=1.03,\n",
    ")"
   ]
  },
@@ -153,37 +69,30 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Instantiate the chat model and some messages to pass."
+    "Instantiate the chat model and some messages to pass.\n",
+    "\n",
+    "**Note**: you need to pass the `model_id` explicitly if you are using a self-hosted `text-generation-inference` server."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
-      "WARNING! repo_id is not default parameter.\n",
-      "                    repo_id was transferred to model_kwargs.\n",
-      "                    Please confirm that repo_id is what you intended.\n",
-      "WARNING! task is not default parameter.\n",
-      "                    task was transferred to model_kwargs.\n",
-      "                    Please confirm that task is what you intended.\n",
-      "WARNING! huggingfacehub_api_token is not default parameter.\n",
-      "                    huggingfacehub_api_token was transferred to model_kwargs.\n",
-      "                    Please confirm that huggingfacehub_api_token is what you intended.\n",
-      "None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.\n"
+      "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.\n"
     ]
    }
   ],
   "source": [
-    "from langchain.schema import (\n",
+    "from langchain_core.messages import (\n",
    "    HumanMessage,\n",
    "    SystemMessage,\n",
    ")\n",
-    "from langchain_community.chat_models.huggingface import ChatHuggingFace\n",
+    "from langchain_huggingface import ChatHuggingFace\n",
    "\n",
    "messages = [\n",
    "    SystemMessage(content=\"You're a helpful assistant\"),\n",
@@ -199,21 +108,21 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "Inspect which model and corresponding chat template is being used."
+    "Check the `model_id`:"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "'HuggingFaceH4/zephyr-7b-beta'"
+       "'meta-llama/Meta-Llama-3-70B-Instruct'"
      ]
     },
-     "execution_count": 6,
+     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -231,16 +140,16 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "\"<|system|>\\nYou're a helpful assistant</s>\\n<|user|>\\nWhat happens when an unstoppable force meets an immovable object?</s>\\n<|assistant|>\\n\""
+       "\"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\\n\\nYou're a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>\\n\\nWhat happens when an unstoppable force meets an immovable object?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\\n\\n\""
      ]
     },
-     "execution_count": 7,
+     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -258,14 +167,20 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "According to a popular philosophical paradox, when an unstoppable force meets an immovable object, it is impossible to determine which one will prevail because both are defined as being completely unyielding and unmovable. The paradox suggests that the very concepts of \"unstoppable force\" and \"immovable object\" are inherently contradictory, and therefore, it is illogical to imagine a scenario where they would meet and interact. However, in practical terms, it is highly unlikely for such a scenario to occur in the real world, as the concepts of \"unstoppable force\" and \"immovable object\" are often used metaphorically to describe hypothetical situations or abstract concepts, rather than physical objects or forces.\n"
+      "One of the classic thought experiments in physics!\n",
+      "\n",
+      "The concept of an unstoppable force meeting an immovable object is a paradox that has puzzled philosophers and physicists for centuries. It's a mind-bending scenario that challenges our understanding of the fundamental laws of physics.\n",
+      "\n",
+      "In essence, an unstoppable force is something that cannot be halted or slowed down, while an immovable object is something that cannot be moved or displaced. If we assume that both entities exist in the same universe, we run into a logical contradiction.\n",
+      "\n",
+      "Here\n"
     ]
    }
   ],
@@ -278,7 +193,71 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## 3. Take it for a spin as an agent!\n",
+    "## 3. Explore tool calling with `ChatHuggingFace`\n",
+    "\n",
+    "`text-generation-inference` supports tool calling with open-source LLMs starting from v2.0.1."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Create a basic tool (`Calculator`):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_core.pydantic_v1 import BaseModel, Field\n",
+    "\n",
+    "\n",
+    "class Calculator(BaseModel):\n",
+    "    \"\"\"Multiply two integers together.\"\"\"\n",
+    "\n",
+    "    a: int = Field(..., description=\"First integer\")\n",
+    "    b: int = Field(..., description=\"Second integer\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Bind the tool to the `chat_model` and give it a try:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[Calculator(a=3, b=12)]"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain_core.output_parsers.openai_tools import PydanticToolsParser\n",
+    "\n",
+    "llm_with_multiply = chat_model.bind_tools([Calculator], tool_choice=\"auto\")\n",
+    "parser = PydanticToolsParser(tools=[Calculator])\n",
+    "tool_chain = llm_with_multiply | parser\n",
+    "tool_chain.invoke(\"How much is 3 multiplied by 12?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 4. Take it for a spin as an agent!\n",
    "\n",
    "Here we'll test out `Zephyr-7B-beta` as a zero-shot `ReAct` Agent. The example below is taken from [here](https://python.langchain.com/v0.1/docs/modules/agents/agent_types/react/#using-chat-models).\n",
    "\n",
@@ -287,7 +266,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -310,7 +289,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -342,7 +321,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": null,
   "metadata": {},
   "outputs": [
    {
--- a/docs/docs/integrations/chat/openai.ipynb
+++ b/docs/docs/integrations/chat/openai.ipynb
@@ -147,7 +147,7 @@
    "\n",
    "### ChatOpenAI.bind_tools()\n",
    "\n",
-    "With `ChatAnthropic.bind_tools`, we can easily pass in Pydantic classes, dict schemas, LangChain tools, or even functions as tools to the model. Under the hood these are converted to an Anthropic tool schemas, which looks like:\n",
+    "With `ChatOpenAI.bind_tools`, we can easily pass in Pydantic classes, dict schemas, LangChain tools, or even functions as tools to the model. Under the hood these are converted to OpenAI tool schemas, which look like:\n",
    "```\n",
    "{\n",
    "    \"name\": \"...\",\n",
--- a/docs/docs/integrations/document_transformers/cross_encoder_reranker.ipynb
+++ b/docs/docs/integrations/document_transformers/cross_encoder_reranker.ipynb
@@ -67,8 +67,8 @@
   "outputs": [],
   "source": [
    "from langchain.document_loaders import TextLoader\n",
-    "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
    "from langchain_community.vectorstores import FAISS\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
    "\n",
    "documents = TextLoader(\"../../how_to/state_of_the_union.txt\").load()\n",
--- a/docs/docs/integrations/graphs/kuzu_db.ipynb
+++ b/docs/docs/integrations/graphs/kuzu_db.ipynb
@@ -7,11 +7,12 @@
   "source": [
    "# Kuzu\n",
    "\n",
-    ">[Kùzu](https://kuzudb.com) is an in-process property graph database management system. \n",
-    ">\n",
-    ">This notebook shows how to use LLMs to provide a natural language interface to [Kùzu](https://kuzudb.com) database with `Cypher` graph query language.\n",
-    ">\n",
-    ">[Cypher](https://en.wikipedia.org/wiki/Cypher_(query_language)) is a declarative graph query language that allows for expressive and efficient data querying in a property graph."
+    ">[Kùzu](https://kuzudb.com) is an embeddable property graph database management system built for query speed and scalability.\n",
+    "> \n",
+    "> Kùzu has a permissive (MIT) open source license and implements [Cypher](https://en.wikipedia.org/wiki/Cypher_(query_language)), a declarative graph query language that allows for expressive and efficient data querying in a property graph.\n",
+    "> It uses columnar storage and its query processor contains novel join algorithms that allow it to scale to very large graphs without sacrificing query performance.\n",
+    "> \n",
+    "> This notebook shows how to use LLMs to provide a natural language interface to a [Kùzu](https://kuzudb.com) database with Cypher."
   ]
  },
  {
@@ -21,7 +22,8 @@
   "source": [
    "## Setting up\n",
    "\n",
-    "Install the python package:\n",
+    "Kùzu is an embedded database (it runs in-process), so there are no servers to manage.\n",
+    "Simply install it via its Python package:\n",
    "\n",
    "```bash\n",
    "pip install kuzu\n",
@@ -32,7 +34,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -52,16 +54,16 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "<kuzu.query_result.QueryResult at 0x1066ff410>"
+       "<kuzu.query_result.QueryResult at 0x103a72290>"
      ]
     },
-     "execution_count": 2,
+     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -84,16 +86,16 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "<kuzu.query_result.QueryResult at 0x107016210>"
+       "<kuzu.query_result.QueryResult at 0x103a9e750>"
      ]
     },
-     "execution_count": 3,
+     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -132,7 +134,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -143,7 +145,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -152,11 +154,15 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
-    "chain = KuzuQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph, verbose=True)"
+    "chain = KuzuQAChain.from_llm(\n",
+    "    llm=ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-16k\"),\n",
+    "    graph=graph,\n",
+    "    verbose=True,\n",
+    ")"
   ]
  },
  {
@@ -166,12 +172,13 @@
   "source": [
    "## Refresh graph schema information\n",
    "\n",
-    "If the schema of database changes, you can refresh the schema information needed to generate Cypher statements."
+    "If the schema of the database changes, you can refresh the schema information needed to generate Cypher statements.\n",
+    "You can also display the schema of the Kùzu graph as demonstrated below."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -180,7 +187,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
@@ -205,78 +212,7 @@
   "source": [
    "## Querying the graph\n",
    "\n",
-    "We can now use the `KuzuQAChain` to ask question of the graph"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "\n",
-      "\n",
-      "\u001b[1m> Entering new  chain...\u001b[0m\n",
-      "Generated Cypher:\n",
-      "\u001b[32;1m\u001b[1;3mMATCH (p:Person)-[:ActedIn]->(m:Movie {name: 'The Godfather: Part II'}) RETURN p.name\u001b[0m\n",
-      "Full Context:\n",
-      "\u001b[32;1m\u001b[1;3m[{'p.name': 'Al Pacino'}, {'p.name': 'Robert De Niro'}]\u001b[0m\n",
-      "\n",
-      "\u001b[1m> Finished chain.\u001b[0m\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "'Al Pacino and Robert De Niro both played in The Godfather: Part II.'"
-      ]
-     },
-     "execution_count": 9,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "chain.run(\"Who played in The Godfather: Part II?\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "\n",
-      "\n",
-      "\u001b[1m> Entering new  chain...\u001b[0m\n",
-      "Generated Cypher:\n",
-      "\u001b[32;1m\u001b[1;3mMATCH (p:Person {name: 'Robert De Niro'})-[:ActedIn]->(m:Movie)\n",
-      "RETURN m.name\u001b[0m\n",
-      "Full Context:\n",
-      "\u001b[32;1m\u001b[1;3m[{'m.name': 'The Godfather: Part II'}]\u001b[0m\n",
-      "\n",
-      "\u001b[1m> Finished chain.\u001b[0m\n"
-     ]
-    },
-    {
-     "data": {
-      "text/plain": [
-       "'Robert De Niro played in The Godfather: Part II.'"
-      ]
-     },
-     "execution_count": 10,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "chain.run(\"Robert De Niro played in which movies?\")"
+    "We can now use the `KuzuQAChain` to ask questions of the graph."
   ]
  },
  {
@@ -290,12 +226,13 @@
     "text": [
      "\n",
      "\n",
-      "\u001b[1m> Entering new  chain...\u001b[0m\n",
+      "\u001b[1m> Entering new KuzuQAChain chain...\u001b[0m\n",
      "Generated Cypher:\n",
-      "\u001b[32;1m\u001b[1;3mMATCH (p:Person {name: 'Robert De Niro'})-[:ActedIn]->(m:Movie)\n",
-      "RETURN p.birthDate\u001b[0m\n",
+      "\u001b[32;1m\u001b[1;3mMATCH (p:Person)-[:ActedIn]->(m:Movie)\n",
+      "WHERE m.name = 'The Godfather: Part II'\n",
+      "RETURN p.name\u001b[0m\n",
      "Full Context:\n",
-      "\u001b[32;1m\u001b[1;3m[{'p.birthDate': '1943-08-17'}]\u001b[0m\n",
+      "\u001b[32;1m\u001b[1;3m[{'p.name': 'Al Pacino'}, {'p.name': 'Robert De Niro'}]\u001b[0m\n",
      "\n",
      "\u001b[1m> Finished chain.\u001b[0m\n"
     ]
@@ -303,7 +240,8 @@
    {
     "data": {
      "text/plain": [
-       "'Robert De Niro was born on August 17, 1943.'"
+       "{'query': 'Who acted in The Godfather: Part II?',\n",
+       " 'result': 'Al Pacino, Robert De Niro acted in The Godfather: Part II.'}"
      ]
     },
     "execution_count": 11,
@@ -312,7 +250,7 @@
    }
   ],
   "source": [
-    "chain.run(\"Robert De Niro is born in which year?\")"
+    "chain.invoke(\"Who acted in The Godfather: Part II?\")"
   ]
  },
  {
@@ -326,13 +264,87 @@
     "text": [
      "\n",
      "\n",
-      "\u001b[1m> Entering new  chain...\u001b[0m\n",
+      "\u001b[1m> Entering new KuzuQAChain chain...\u001b[0m\n",
      "Generated Cypher:\n",
-      "\u001b[32;1m\u001b[1;3mMATCH (p:Person)-[:ActedIn]->(m:Movie{name:'The Godfather: Part II'})\n",
-      "WITH p, m, p.birthDate AS birthDate\n",
-      "ORDER BY birthDate ASC\n",
-      "LIMIT 1\n",
-      "RETURN p.name\u001b[0m\n",
+      "\u001b[32;1m\u001b[1;3mMATCH (p:Person)-[:ActedIn]->(m:Movie)\n",
+      "WHERE p.name = 'Robert De Niro'\n",
+      "RETURN m.name\u001b[0m\n",
+      "Full Context:\n",
+      "\u001b[32;1m\u001b[1;3m[{'m.name': 'The Godfather: Part II'}]\u001b[0m\n",
+      "\n",
+      "\u001b[1m> Finished chain.\u001b[0m\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "{'query': 'Robert De Niro played in which movies?',\n",
+       " 'result': 'Robert De Niro played in The Godfather: Part II.'}"
+      ]
+     },
+     "execution_count": 12,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "chain.invoke(\"Robert De Niro played in which movies?\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "\n",
+      "\u001b[1m> Entering new KuzuQAChain chain...\u001b[0m\n",
+      "Generated Cypher:\n",
+      "\u001b[32;1m\u001b[1;3mMATCH (:Person)-[:ActedIn]->(:Movie {name: 'Godfather: Part II'})\n",
+      "RETURN count(*)\u001b[0m\n",
+      "Full Context:\n",
+      "\u001b[32;1m\u001b[1;3m[{'COUNT_STAR()': 0}]\u001b[0m\n",
+      "\n",
+      "\u001b[1m> Finished chain.\u001b[0m\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "{'query': 'How many actors played in the Godfather: Part II?',\n",
+       " 'result': \"I don't know the answer.\"}"
+      ]
+     },
+     "execution_count": 13,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "chain.invoke(\"How many actors played in the Godfather: Part II?\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "\n",
+      "\u001b[1m> Entering new KuzuQAChain chain...\u001b[0m\n",
+      "Generated Cypher:\n",
+      "\u001b[32;1m\u001b[1;3mMATCH (p:Person)-[:ActedIn]->(m:Movie {name: 'The Godfather: Part II'})\n",
+      "RETURN p.name\n",
+      "ORDER BY p.birthDate ASC\n",
+      "LIMIT 1\u001b[0m\n",
      "Full Context:\n",
      "\u001b[32;1m\u001b[1;3m[{'p.name': 'Al Pacino'}]\u001b[0m\n",
      "\n",
@@ -342,16 +354,114 @@
    {
     "data": {
      "text/plain": [
-       "'The oldest actor who played in The Godfather: Part II is Al Pacino.'"
+       "{'query': 'Who is the oldest actor who played in The Godfather: Part II?',\n",
+       " 'result': 'Al Pacino is the oldest actor who played in The Godfather: Part II.'}"
      ]
     },
-     "execution_count": 12,
+     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "chain.run(\"Who is the oldest actor who played in The Godfather: Part II?\")"
+    "chain.invoke(\"Who is the oldest actor who played in The Godfather: Part II?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Use separate LLMs for Cypher and answer generation\n",
+    "\n",
+    "You can specify `cypher_llm` and `qa_llm` separately to use different LLMs for Cypher generation and answer generation."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/Users/prrao/code/langchain/.venv/lib/python3.11/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: The class `LLMChain` was deprecated in LangChain 0.1.17 and will be removed in 0.3.0. Use RunnableSequence, e.g., `prompt | llm` instead.\n",
+      "  warn_deprecated(\n"
+     ]
+    }
+   ],
+   "source": [
+    "chain = KuzuQAChain.from_llm(\n",
+    "    cypher_llm=ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo-16k\"),\n",
+    "    qa_llm=ChatOpenAI(temperature=0, model=\"gpt-4\"),\n",
+    "    graph=graph,\n",
+    "    verbose=True,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "\n",
+      "\u001b[1m> Entering new KuzuQAChain chain...\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/Users/prrao/code/langchain/.venv/lib/python3.11/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: The method `Chain.run` was deprecated in langchain 0.1.0 and will be removed in 0.2.0. Use invoke instead.\n",
+      "  warn_deprecated(\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Generated Cypher:\n",
+      "\u001b[32;1m\u001b[1;3mMATCH (:Person)-[:ActedIn]->(:Movie {name: 'The Godfather: Part II'})\n",
+      "RETURN count(*)\u001b[0m\n",
+      "Full Context:\n",
+      "\u001b[32;1m\u001b[1;3m[{'COUNT_STAR()': 2}]\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/Users/prrao/code/langchain/.venv/lib/python3.11/site-packages/langchain_core/_api/deprecation.py:119: LangChainDeprecationWarning: The method `Chain.__call__` was deprecated in langchain 0.1.0 and will be removed in 0.2.0. Use invoke instead.\n",
+      "  warn_deprecated(\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "\u001b[1m> Finished chain.\u001b[0m\n"
+     ]
+    },
+    {
+     "data": {
+      "text/plain": [
+       "{'query': 'How many actors played in The Godfather: Part II?',\n",
+       " 'result': 'Two actors played in The Godfather: Part II.'}"
+      ]
+     },
+     "execution_count": 16,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "chain.invoke(\"How many actors played in The Godfather: Part II?\")"
   ]
  }
 ],
@@ -371,7 +481,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.11.7"
  }
 },
 "nbformat": 4,
--- a/docs/docs/integrations/llms/huggingface_endpoint.ipynb
+++ b/docs/docs/integrations/llms/huggingface_endpoint.ipynb
@@ -20,7 +20,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.llms import HuggingFaceEndpoint"
+    "from langchain_huggingface import HuggingFaceEndpoint"
   ]
  },
  {
@@ -83,7 +83,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.llms import HuggingFaceEndpoint"
+    "from langchain_huggingface import HuggingFaceEndpoint"
   ]
  },
  {
@@ -193,7 +193,7 @@
   "outputs": [],
   "source": [
    "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
-    "from langchain_community.llms import HuggingFaceEndpoint\n",
+    "from langchain_huggingface import HuggingFaceEndpoint\n",
    "\n",
    "llm = HuggingFaceEndpoint(\n",
    "    endpoint_url=f\"{your_endpoint_url}\",\n",
--- a/docs/docs/integrations/llms/huggingface_pipelines.ipynb
+++ b/docs/docs/integrations/llms/huggingface_pipelines.ipynb
@@ -55,7 +55,7 @@
   },
   "outputs": [],
   "source": [
-    "from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline\n",
+    "from langchain_huggingface.llms import HuggingFacePipeline\n",
    "\n",
    "hf = HuggingFacePipeline.from_model_id(\n",
    "    model_id=\"gpt2\",\n",
@@ -79,7 +79,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline\n",
+    "from langchain_huggingface.llms import HuggingFacePipeline\n",
    "from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline\n",
    "\n",
    "model_id = \"gpt2\"\n",
--- a/docs/docs/integrations/llms/jsonformer_experimental.ipynb
+++ b/docs/docs/integrations/llms/jsonformer_experimental.ipynb
@@ -152,7 +152,7 @@
    }
   ],
   "source": [
-    "from langchain_community.llms import HuggingFacePipeline\n",
+    "from langchain_huggingface import HuggingFacePipeline\n",
    "from transformers import pipeline\n",
    "\n",
    "hf_model = pipeline(\n",
--- a/docs/docs/integrations/llms/lmformatenforcer_experimental.ipynb
+++ b/docs/docs/integrations/llms/lmformatenforcer_experimental.ipynb
@@ -25,7 +25,7 @@
   },
   "outputs": [],
   "source": [
-    "%pip install --upgrade --quiet  lm-format-enforcer > /dev/null"
+    "%pip install --upgrade --quiet  lm-format-enforcer langchain-huggingface > /dev/null"
   ]
  },
  {
@@ -193,7 +193,7 @@
    }
   ],
   "source": [
-    "from langchain_community.llms import HuggingFacePipeline\n",
+    "from langchain_huggingface import HuggingFacePipeline\n",
    "from transformers import pipeline\n",
    "\n",
    "hf_model = pipeline(\n",
--- a/docs/docs/integrations/llms/mlx_pipelines.ipynb
+++ b/docs/docs/integrations/llms/mlx_pipelines.ipynb
@@ -78,7 +78,6 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline\n",
    "from mlx_lm import load\n",
    "\n",
    "model, tokenizer = load(\"mlx-community/quantized-gemma-2b-it\")\n",
--- a/docs/docs/integrations/llms/openvino.ipynb
+++ b/docs/docs/integrations/llms/openvino.ipynb
@@ -55,7 +55,7 @@
   },
   "outputs": [],
   "source": [
-    "from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline\n",
+    "from langchain_huggingface import HuggingFacePipeline\n",
    "\n",
    "ov_config = {\"PERFORMANCE_HINT\": \"LATENCY\", \"NUM_STREAMS\": \"1\", \"CACHE_DIR\": \"\"}\n",
    "\n",
--- a/docs/docs/integrations/llms/rellm_experimental.ipynb
+++ b/docs/docs/integrations/llms/rellm_experimental.ipynb
@@ -24,7 +24,7 @@
   },
   "outputs": [],
   "source": [
-    "%pip install --upgrade --quiet  rellm > /dev/null"
+    "%pip install --upgrade --quiet  rellm langchain-huggingface > /dev/null"
   ]
  },
  {
@@ -92,7 +92,7 @@
    }
   ],
   "source": [
-    "from langchain_community.llms import HuggingFacePipeline\n",
+    "from langchain_huggingface import HuggingFacePipeline\n",
    "from transformers import pipeline\n",
    "\n",
    "hf_model = pipeline(\n",
--- a/docs/docs/integrations/llms/weight_only_quantization.ipynb
+++ b/docs/docs/integrations/llms/weight_only_quantization.ipynb
@@ -85,7 +85,6 @@
   "outputs": [],
   "source": [
    "from intel_extension_for_transformers.transformers import AutoModelForSeq2SeqLM\n",
-    "from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline\n",
    "from transformers import AutoTokenizer, pipeline\n",
    "\n",
    "model_id = \"google/flan-t5-large\"\n",
--- a/docs/docs/integrations/platforms/huggingface.mdx
+++ b/docs/docs/integrations/platforms/huggingface.mdx
@@ -2,22 +2,24 @@

 All functionality related to the [Hugging Face Platform](https://huggingface.co/).

+## Installation
+
+Most of the Hugging Face integrations are available in the `langchain-huggingface` package.
+
+```bash
+pip install langchain-huggingface
+```
+
 ## Chat models

 ### Models from Hugging Face

 We can use the `Hugging Face` LLM classes or directly use the `ChatHuggingFace` class.

-We need to install several python packages.
-
-```bash
-pip install huggingface_hub
-pip install transformers
-```
 See a [usage example](/docs/integrations/chat/huggingface).

 ```python
-from langchain_community.chat_models.huggingface import ChatHuggingFace
+from langchain_huggingface import ChatHuggingFace
 ```

 ## LLMs
@@ -26,60 +28,23 @@ from langchain_community.chat_models.huggingface import ChatHuggingFace

 Hugging Face models can be run locally through the `HuggingFacePipeline` class.

-We need to install `transformers` python package.
-
-```bash
-pip install transformers
-```
-
 See a [usage example](/docs/integrations/llms/huggingface_pipelines).

 ```python
-from langchain_community.llms.huggingface_pipeline import HuggingFacePipeline
+from langchain_huggingface import HuggingFacePipeline
 ```

-To use the OpenVINO backend in local pipeline wrapper, please install the optimum library and set HuggingFacePipeline's backend as `openvino`:
-
-```bash
-pip install --upgrade-strategy eager "optimum[openvino,nncf]"
-```
-
-See a [usage example](/docs/integrations/llms/huggingface_pipelines)
-
-To export your model to the OpenVINO IR format with the CLI:
-
-```bash
-optimum-cli export openvino --model gpt2 ov_model
-```
-
-To apply [weight-only quantization](https://github.com/huggingface/optimum-intel?tab=readme-ov-file#export) when exporting your model.
-
-
 ## Embedding Models

-### Hugging Face Hub
-
->The [Hugging Face Hub](https://huggingface.co/docs/hub/index) is a platform 
-> with over 350k models, 75k datasets, and 150k demo apps (Spaces), all open source 
-> and publicly available, in an online platform where people can easily 
-> collaborate and build ML together. The Hub works as a central place where anyone 
-> can explore, experiment, collaborate, and build technology with Machine Learning.
-
-We need to install the `sentence_transformers` python package.
-
-```bash
-pip install sentence_transformers
-```
-
-
-#### HuggingFaceEmbeddings
+### HuggingFaceEmbeddings

 See a [usage example](/docs/integrations/text_embedding/huggingfacehub).

 ```python
-from langchain_community.embeddings import HuggingFaceEmbeddings
+from langchain_huggingface import HuggingFaceEmbeddings
 ```
-#### HuggingFaceInstructEmbeddings
+
+### HuggingFaceInstructEmbeddings

 See a [usage example](/docs/integrations/text_embedding/instruct_embeddings).

@@ -87,7 +52,7 @@ See a [usage example](/docs/integrations/text_embedding/instruct_embeddings).
 from langchain_community.embeddings import HuggingFaceInstructEmbeddings
 ```

-#### HuggingFaceBgeEmbeddings
+### HuggingFaceBgeEmbeddings

 >[BGE models on the HuggingFace](https://huggingface.co/BAAI/bge-large-en) are [the best open-source embedding models](https://huggingface.co/spaces/mteb/leaderboard).
 >BGE model is created by the [Beijing Academy of Artificial Intelligence (BAAI)](https://en.wikipedia.org/wiki/Beijing_Academy_of_Artificial_Intelligence). `BAAI` is a private non-profit organization engaged in AI research and development.
--- a/docs/docs/integrations/platforms/index.mdx
+++ b/docs/docs/integrations/platforms/index.mdx
@@ -36,6 +36,7 @@ These providers have standalone `langchain-{provider}` packages for improved ver
 - [Nvidia](/docs/integrations/providers/nvidia)
 - [OpenAI](/docs/integrations/platforms/openai)
 - [Pinecone](/docs/integrations/providers/pinecone)
+- [Qdrant](/docs/integrations/providers/qdrant)
 - [Robocorp](/docs/integrations/providers/robocorp)
 - [Together AI](/docs/integrations/providers/together)
 - [Upstage](/docs/integrations/providers/upstage)
--- a/docs/docs/integrations/providers/qdrant.mdx
+++ b/docs/docs/integrations/providers/qdrant.mdx
@@ -7,10 +7,10 @@

 ## Installation and Setup

-Install the Python SDK:
+Install the Python partner package:

 ```bash
-pip install qdrant-client
+pip install langchain-qdrant
 ```


@@ -21,7 +21,7 @@ whether for semantic search or example selection.

 To import this vectorstore:
 ```python
-from langchain_community.vectorstores import Qdrant
+from langchain_qdrant import Qdrant
 ```

 For a more detailed walkthrough of the Qdrant wrapper, see [this notebook](/docs/integrations/vectorstores/qdrant)
--- a/docs/docs/integrations/providers/snowflake.mdx
+++ b/docs/docs/integrations/providers/snowflake.mdx
@@ -17,7 +17,7 @@ pip install langchain-community sentence-transformers
 ```

 ```python
-from langchain_community.text_embeddings import HuggingFaceEmbeddings
+from langchain_huggingface import HuggingFaceEmbeddings

 model = HuggingFaceEmbeddings(model_name="snowflake/arctic-embed-l")
 ```
--- a/docs/docs/integrations/providers/vdms.mdx
+++ b/docs/docs/integrations/providers/vdms.mdx
@@ -41,7 +41,7 @@ docs = text_splitter.split_documents(documents)

 from langchain_community.vectorstores import VDMS
 from langchain_community.vectorstores.vdms import VDMS_Client
-from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings
+from langchain_huggingface import HuggingFaceEmbeddings

 client = VDMS_Client("localhost", 55555)
 vectorstore = VDMS.from_documents(
--- a/docs/docs/integrations/retrievers/merger_retriever.ipynb
+++ b/docs/docs/integrations/retrievers/merger_retriever.ipynb
@@ -33,7 +33,7 @@
    "    EmbeddingsClusteringFilter,\n",
    "    EmbeddingsRedundantFilter,\n",
    ")\n",
-    "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "from langchain_openai import OpenAIEmbeddings\n",
    "\n",
    "# Get 3 diff embeddings.\n",
--- a/docs/docs/integrations/text_embedding/huggingfacehub.ipynb
+++ b/docs/docs/integrations/text_embedding/huggingfacehub.ipynb
@@ -26,7 +26,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.embeddings import HuggingFaceEmbeddings"
+    "from langchain_huggingface.embeddings import HuggingFaceEmbeddings"
   ]
  },
  {
@@ -175,7 +175,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.embeddings import HuggingFaceHubEmbeddings"
+    "from langchain_huggingface.embeddings import HuggingFaceEndpointEmbeddings"
   ]
  },
  {
@@ -185,7 +185,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "embeddings = HuggingFaceHubEmbeddings()"
+    "embeddings = HuggingFaceEndpointEmbeddings()"
   ]
  },
  {
--- a/docs/docs/integrations/text_embedding/sentence_transformers.ipynb
+++ b/docs/docs/integrations/text_embedding/sentence_transformers.ipynb
@@ -41,7 +41,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.embeddings import HuggingFaceEmbeddings"
+    "from langchain_huggingface import HuggingFaceEmbeddings"
   ]
  },
  {
--- a/docs/docs/integrations/text_embedding/text_embeddings_inference.ipynb
+++ b/docs/docs/integrations/text_embedding/text_embeddings_inference.ipynb
@@ -59,7 +59,7 @@
   },
   "outputs": [],
   "source": [
-    "from langchain_community.embeddings import HuggingFaceHubEmbeddings"
+    "from langchain_huggingface.embeddings import HuggingFaceEndpointEmbeddings"
   ]
  },
  {
@@ -71,7 +71,7 @@
   },
   "outputs": [],
   "source": [
-    "embeddings = HuggingFaceHubEmbeddings(model=\"http://localhost:8080\")"
+    "embeddings = HuggingFaceEndpointEmbeddings(model=\"http://localhost:8080\")"
   ]
  },
  {
--- a/docs/docs/integrations/vectorstores/annoy.ipynb
+++ b/docs/docs/integrations/vectorstores/annoy.ipynb
@@ -52,8 +52,8 @@
   },
   "outputs": [],
   "source": [
-    "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
    "from langchain_community.vectorstores import Annoy\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "\n",
    "embeddings_func = HuggingFaceEmbeddings()"
   ]
--- a/docs/docs/integrations/vectorstores/faiss.ipynb
+++ b/docs/docs/integrations/vectorstores/faiss.ipynb
@@ -328,7 +328,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "\n",
    "pkl = db.serialize_to_bytes()  # serializes the faiss\n",
    "embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
--- a/docs/docs/integrations/vectorstores/faiss_async.ipynb
+++ b/docs/docs/integrations/vectorstores/faiss_async.ipynb
@@ -158,7 +158,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "\n",
    "pkl = db.serialize_to_bytes()  # serializes the faiss index\n",
    "embeddings = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
--- a/docs/docs/integrations/vectorstores/oracle.ipynb
+++ b/docs/docs/integrations/vectorstores/oracle.ipynb
@@ -91,11 +91,11 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
    "from langchain_community.vectorstores import oraclevs\n",
    "from langchain_community.vectorstores.oraclevs import OracleVS\n",
    "from langchain_community.vectorstores.utils import DistanceStrategy\n",
-    "from langchain_core.documents import Document"
+    "from langchain_core.documents import Document\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings"
   ]
  },
  {
--- a/docs/docs/integrations/vectorstores/pinecone.ipynb
+++ b/docs/docs/integrations/vectorstores/pinecone.ipynb
@@ -12,16 +12,7 @@
    "\n",
    "This notebook shows how to use functionality related to the `Pinecone` vector database.\n",
    "\n",
-    "To use Pinecone, you must have an API key. \n",
-    "Here are the [installation instructions](https://docs.pinecone.io/docs/quickstart).\n",
-    "\n",
-    "Set the following environment variables to make using the `Pinecone` integration easier:\n",
-    "\n",
-    "- `PINECONE_API_KEY`: Your Pinecone API key.\n",
-    "- `PINECONE_INDEX_NAME`: The name of the index you want to use.\n",
-    "\n",
-    "And to follow along in this doc, you should also set\n",
-    "\n",
+    "Set the following environment variable to follow along in this doc:\n",
    "- `OPENAI_API_KEY`: Your OpenAI API key, for using `OpenAIEmbeddings`"
   ]
  },
@@ -34,7 +25,11 @@
   },
   "outputs": [],
   "source": [
-    "%pip install --upgrade --quiet  langchain-pinecone langchain-openai langchain"
+    "%pip install --upgrade --quiet  \\\n",
+    "    langchain-pinecone \\\n",
+    "    langchain-openai \\\n",
+    "    langchain \\\n",
+    "    pinecone-notebooks"
   ]
  },
  {
@@ -72,14 +67,93 @@
    "embeddings = OpenAIEmbeddings()"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "ef6dc4de",
+   "metadata": {},
+   "source": [
+    "Now let's create a new Pinecone account, or sign in to your existing one, and create an API key to use in this notebook."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1fdc3c36",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from pinecone_notebooks.colab import Authenticate\n",
+    "\n",
+    "Authenticate()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "54da1a39",
+   "metadata": {},
+   "source": [
+    "The newly created API key has been stored in the `PINECONE_API_KEY` environment variable. We will use it to set up the Pinecone client."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "eb554814",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "import time\n",
+    "\n",
+    "from pinecone import Pinecone, ServerlessSpec\n",
+    "\n",
+    "# Read the API key stored in the environment by the previous step.\n",
+    "pinecone_api_key = os.environ.get(\"PINECONE_API_KEY\")\n",
+    "\n",
+    "# Create the Pinecone client with that key.\n",
+    "pc = Pinecone(api_key=pinecone_api_key)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "658706a3",
+   "metadata": {},
+   "source": [
+    "Next, let's connect to your Pinecone index. If one named `index_name` doesn't exist, it will be created."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "276a06dd",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import time\n",
+    "\n",
+    "index_name = \"langchain-index\"  # change if desired\n",
+    "\n",
+    "existing_indexes = [index_info[\"name\"] for index_info in pc.list_indexes()]\n",
+    "\n",
+    "if index_name not in existing_indexes:\n",
+    "    pc.create_index(\n",
+    "        name=index_name,\n",
+    "        dimension=1536,\n",
+    "        metric=\"cosine\",\n",
+    "        spec=ServerlessSpec(cloud=\"aws\", region=\"us-east-1\"),\n",
+    "    )\n",
+    "    while not pc.describe_index(index_name).status[\"ready\"]:\n",
+    "        time.sleep(1)\n",
+    "\n",
+    "index = pc.Index(index_name)"
+   ]
+  },
  {
   "cell_type": "markdown",
   "id": "3a4d377f",
   "metadata": {},
   "source": [
-    "Now let's assume you have your Pinecone index set up with `dimension=1536`.\n",
-    "\n",
-    "We can connect to our Pinecone index and insert those chunked docs as contents with `PineconeVectorStore.from_documents`."
+    "Now that our Pinecone index is set up, we can upsert those chunked docs as contents with `PineconeVectorStore.from_documents`."
   ]
  },
  {
@@ -91,8 +165,6 @@
   "source": [
    "from langchain_pinecone import PineconeVectorStore\n",
    "\n",
-    "index_name = \"langchain-test-index\"\n",
-    "\n",
    "docsearch = PineconeVectorStore.from_documents(docs, embeddings, index_name=index_name)"
   ]
  },
@@ -315,14 +387,6 @@
    "for i, doc in enumerate(found_docs):\n",
    "    print(f\"{i + 1}.\", doc.page_content, \"\\n\")"
   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "b0fd750b",
-   "metadata": {},
-   "outputs": [],
-   "source": []
  }
 ],
 "metadata": {
@@ -341,7 +405,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.11.4"
+   "version": "3.11.6"
  }
 },
 "nbformat": 4,
--- a/docs/docs/integrations/vectorstores/qdrant.ipynb
+++ b/docs/docs/integrations/vectorstores/qdrant.ipynb
@@ -30,7 +30,7 @@
   },
   "outputs": [],
   "source": [
-    "%pip install --upgrade --quiet  qdrant-client"
+    "%pip install --upgrade --quiet  langchain-qdrant langchain-openai langchain"
   ]
  },
  {
@@ -79,8 +79,8 @@
   "outputs": [],
   "source": [
    "from langchain_community.document_loaders import TextLoader\n",
-    "from langchain_community.vectorstores import Qdrant\n",
    "from langchain_openai import OpenAIEmbeddings\n",
+    "from langchain_qdrant import Qdrant\n",
    "from langchain_text_splitters import CharacterTextSplitter"
   ]
  },
@@ -216,7 +216,7 @@
   "source": [
    "### Qdrant Cloud\n",
    "\n",
-    "If you prefer not to keep yourself busy with managing the infrastructure, you can choose to set up a fully-managed Qdrant cluster on [Qdrant Cloud](https://cloud.qdrant.io/). There is a free forever 1GB cluster included for trying out. The main difference with using a managed version of Qdrant is that you'll need to provide an API key to secure your deployment from being accessed publicly."
+    "If you prefer not to keep yourself busy with managing the infrastructure, you can choose to set up a fully-managed Qdrant cluster on [Qdrant Cloud](https://cloud.qdrant.io/). There is a free forever 1GB cluster included for trying out. The main difference with using a managed version of Qdrant is that you'll need to provide an API key to secure your deployment from being accessed publicly. The API key can also be set in the `QDRANT_API_KEY` environment variable."
   ]
  },
  {
@@ -243,6 +243,36 @@
    ")"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "825c7903",
+   "metadata": {},
+   "source": [
+    "## Using an existing collection"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3f772575",
+   "metadata": {},
+   "source": [
+    "To get an instance of `langchain_qdrant.Qdrant` without loading any new documents or texts, you can use the `Qdrant.from_existing_collection()` method."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "daf7a6e5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "qdrant = Qdrant.from_existing_collection(\n",
+    "    embeddings=embeddings,\n",
+    "    collection_name=\"my_documents\",\n",
+    "    url=\"http://localhost:6333\",\n",
+    ")"
+   ]
+  },
  {
   "attachments": {},
   "cell_type": "markdown",
@@ -251,7 +281,7 @@
   "source": [
    "## Recreating the collection\n",
    "\n",
-    "Both `Qdrant.from_texts` and `Qdrant.from_documents` methods are great to start using Qdrant with Langchain. In the previous versions the collection was recreated every time you called any of them. That behaviour has changed. Currently, the collection is going to be reused if it already exists. Setting `force_recreate` to `True` allows to remove the old collection and start from scratch."
+    "The collection is reused if it already exists. Setting `force_recreate` to `True` removes the old collection and starts from scratch."
   ]
  },
  {
@@ -520,7 +550,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": null,
   "id": "9427195f",
   "metadata": {
    "ExecuteTime": {
@@ -528,21 +558,9 @@
     "start_time": "2023-04-04T10:51:26.018763Z"
    }
   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "VectorStoreRetriever(vectorstore=<langchain_community.vectorstores.qdrant.Qdrant object at 0x7fc4e5720a00>, search_type='similarity', search_kwargs={})"
-      ]
-     },
-     "execution_count": 15,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
   "source": [
-    "retriever = qdrant.as_retriever()\n",
-    "retriever"
+    "retriever = qdrant.as_retriever()"
   ]
  },
  {
@@ -556,7 +574,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": null,
   "id": "64348f1b",
   "metadata": {
    "ExecuteTime": {
@@ -564,21 +582,9 @@
     "start_time": "2023-04-04T10:51:26.034284Z"
    }
   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "VectorStoreRetriever(vectorstore=<langchain_community.vectorstores.qdrant.Qdrant object at 0x7fc4e5720a00>, search_type='mmr', search_kwargs={})"
-      ]
-     },
-     "execution_count": 16,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
   "source": [
-    "retriever = qdrant.as_retriever(search_type=\"mmr\")\n",
-    "retriever"
+    "retriever = qdrant.as_retriever(search_type=\"mmr\")"
   ]
  },
  {
@@ -678,7 +684,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": null,
   "id": "e4d6baf9",
   "metadata": {
    "ExecuteTime": {
@@ -686,18 +692,7 @@
     "start_time": "2023-04-04T11:08:30.229748Z"
    }
   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "<langchain_community.vectorstores.qdrant.Qdrant at 0x7fc4e2baa230>"
-      ]
-     },
-     "execution_count": 19,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
   "source": [
    "Qdrant.from_documents(\n",
    "    docs,\n",
--- a/docs/docs/integrations/vectorstores/scann.ipynb
+++ b/docs/docs/integrations/vectorstores/scann.ipynb
@@ -60,8 +60,8 @@
   ],
   "source": [
    "from langchain_community.document_loaders import TextLoader\n",
-    "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
    "from langchain_community.vectorstores import ScaNN\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "from langchain_text_splitters import CharacterTextSplitter\n",
    "\n",
    "loader = TextLoader(\"state_of_the_union.txt\")\n",
--- a/docs/docs/integrations/vectorstores/semadb.ipynb
+++ b/docs/docs/integrations/vectorstores/semadb.ipynb
@@ -41,7 +41,7 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "\n",
    "embeddings = HuggingFaceEmbeddings()"
   ]
--- a/docs/docs/integrations/vectorstores/surrealdb.ipynb
+++ b/docs/docs/integrations/vectorstores/surrealdb.ipynb
@@ -74,8 +74,8 @@
   "outputs": [],
   "source": [
    "from langchain_community.document_loaders import TextLoader\n",
-    "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
    "from langchain_community.vectorstores import SurrealDBStore\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "from langchain_text_splitters import CharacterTextSplitter"
   ]
  },
--- a/docs/docs/integrations/vectorstores/tiledb.ipynb
+++ b/docs/docs/integrations/vectorstores/tiledb.ipynb
@@ -44,8 +44,8 @@
   "outputs": [],
   "source": [
    "from langchain_community.document_loaders import TextLoader\n",
-    "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
    "from langchain_community.vectorstores import TileDB\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "from langchain_text_splitters import CharacterTextSplitter\n",
    "\n",
    "raw_documents = TextLoader(\"../../how_to/state_of_the_union.txt\").load()\n",
--- a/docs/docs/integrations/vectorstores/vald.ipynb
+++ b/docs/docs/integrations/vectorstores/vald.ipynb
@@ -43,8 +43,8 @@
   "outputs": [],
   "source": [
    "from langchain_community.document_loaders import TextLoader\n",
-    "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
    "from langchain_community.vectorstores import Vald\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "from langchain_text_splitters import CharacterTextSplitter\n",
    "\n",
    "raw_documents = TextLoader(\"state_of_the_union.txt\").load()\n",
@@ -190,8 +190,8 @@
   "outputs": [],
   "source": [
    "from langchain_community.document_loaders import TextLoader\n",
-    "from langchain_community.embeddings import HuggingFaceEmbeddings\n",
    "from langchain_community.vectorstores import Vald\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "from langchain_text_splitters import CharacterTextSplitter\n",
    "\n",
    "raw_documents = TextLoader(\"state_of_the_union.txt\").load()\n",
--- a/docs/docs/integrations/vectorstores/vdms.ipynb
+++ b/docs/docs/integrations/vectorstores/vdms.ipynb
@@ -92,9 +92,9 @@
    "import time\n",
    "\n",
    "from langchain_community.document_loaders.text import TextLoader\n",
-    "from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings\n",
    "from langchain_community.vectorstores import VDMS\n",
    "from langchain_community.vectorstores.vdms import VDMS_Client\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "from langchain_text_splitters.character import CharacterTextSplitter\n",
    "\n",
    "time.sleep(2)\n",
--- a/docs/docs/integrations/vectorstores/vearch.ipynb
+++ b/docs/docs/integrations/vectorstores/vearch.ipynb
@@ -53,8 +53,8 @@
   ],
   "source": [
    "from langchain_community.document_loaders import TextLoader\n",
-    "from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings\n",
    "from langchain_community.vectorstores.vearch import Vearch\n",
+    "from langchain_huggingface import HuggingFaceEmbeddings\n",
    "from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
    "from transformers import AutoModel, AutoTokenizer\n",
    "\n",
--- a/docs/docs/introduction.mdx
+++ b/docs/docs/introduction.mdx
@@ -54,13 +54,13 @@ These are the best ones to get started with:
 Explore the full list of tutorials [here](/docs/tutorials).


-## [How-To Guides](/docs/how_to)
+## [How-to guides](/docs/how_to)

 [Here](/docs/how_to) you’ll find short answers to “How do I….?” types of questions.
 These how-to guides don’t cover topics in depth – you’ll find that material in the [Tutorials](/docs/tutorials) and the [API Reference](https://api.python.langchain.com/en/latest/).
 However, these guides will help you quickly accomplish common tasks.

-## [Conceptual Guide](/docs/concepts)
+## [Conceptual guide](/docs/concepts)

 Introductions to all the key parts of LangChain you’ll need to know! [Here](/docs/concepts) you'll find high level explanations of all LangChain concepts.

--- a/docs/docs/security.md
+++ b/docs/docs/security.md
@@ -2,7 +2,7 @@

 LangChain has a large ecosystem of integrations with various external resources like local and remote file systems, APIs and databases. These integrations allow developers to create versatile applications that combine the power of LLMs with the ability to access, interact with and manipulate external resources.

-## Best Practices
+## Best practices

 When building such applications developers should remember to follow good security practices:

@@ -25,6 +25,6 @@ If you're building applications that access external resources like file systems
 or databases, consider speaking with your company's security team to determine how to best
 design and secure your applications.

-## Reporting a Vulnerability
+## Reporting a vulnerability

 Please report security vulnerabilities by email to security@langchain.dev. This will ensure the issue is promptly triaged and acted upon as needed.
--- a/docs/docs/tutorials/agents.ipynb
+++ b/docs/docs/tutorials/agents.ipynb
@@ -63,7 +63,7 @@
    "```\n",
    "\n",
    "\n",
-    "For more details, see our [Installation guide](/docs/installation).\n",
+    "For more details, see our [Installation guide](/docs/how_to/installation).\n",
    "\n",
    "### LangSmith\n",
    "\n",
--- a/docs/docs/tutorials/chatbot.ipynb
+++ b/docs/docs/tutorials/chatbot.ipynb
@@ -75,7 +75,7 @@
    "```\n",
    "\n",
    "\n",
-    "For more details, see our [Installation guide](/docs/installation).\n",
+    "For more details, see our [Installation guide](/docs/how_to/installation).\n",
    "\n",
    "### LangSmith\n",
    "\n",
--- a/docs/docs/tutorials/extraction.ipynb
+++ b/docs/docs/tutorials/extraction.ipynb
@@ -65,7 +65,7 @@
    "```\n",
    "\n",
    "\n",
-    "For more details, see our [Installation guide](/docs/installation).\n",
+    "For more details, see our [Installation guide](/docs/how_to/installation).\n",
    "\n",
    "### LangSmith\n",
    "\n",
--- a/docs/docs/tutorials/index.mdx
+++ b/docs/docs/tutorials/index.mdx
@@ -9,6 +9,7 @@ New to LangChain or to LLM app development in general? Read this material to qui
 ### Basics
 - [Build a Simple LLM Application](/docs/tutorials/llm_chain)
 - [Build a Chatbot](/docs/tutorials/chatbot)
+- [Build vector stores and retrievers](/docs/tutorials/retrievers)
 - [Build an Agent](/docs/tutorials/agents)

 ### Working with external knowledge
--- a/docs/docs/tutorials/llm_chain.ipynb
+++ b/docs/docs/tutorials/llm_chain.ipynb
@@ -64,7 +64,7 @@
    "```\n",
    "\n",
    "\n",
-    "For more details, see our [Installation guide](/docs/installation).\n",
+    "For more details, see our [Installation guide](/docs/how_to/installation).\n",
    "\n",
    "### LangSmith\n",
    "\n",
--- a/docs/docs/tutorials/rag.ipynb
+++ b/docs/docs/tutorials/rag.ipynb
@@ -78,7 +78,7 @@
    "```\n",
    "\n",
    "\n",
-    "For more details, see our [Installation guide](/docs/installation).\n",
+    "For more details, see our [Installation guide](/docs/how_to/installation).\n",
    "\n",
    "### LangSmith\n",
    "\n",
--- a/docs/docs/tutorials/retrievers.ipynb
+++ b/docs/docs/tutorials/retrievers.ipynb
@@ -0,0 +1,502 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "bf37a837-7a6a-447b-8779-38f26c585887",
+   "metadata": {},
+   "source": [
+    "# Vector stores and retrievers\n",
+    "\n",
+    "This tutorial will familiarize you with LangChain's vector store and retriever abstractions. These abstractions are designed to support retrieval of data, from (vector) databases and other sources, for integration with LLM workflows. They are important for applications that fetch data to be reasoned over as part of model inference, as in the case of retrieval-augmented generation, or RAG (see our RAG tutorial [here](/docs/tutorials/rag)).\n",
+    "\n",
+    "## Concepts\n",
+    "\n",
+    "This guide focuses on retrieval of text data. We will cover the following concepts:\n",
+    "\n",
+    "- Documents;\n",
+    "- Vector stores;\n",
+    "- Retrievers.\n",
+    "\n",
+    "## Setup\n",
+    "\n",
+    "### Jupyter Notebook\n",
+    "\n",
+    "This and other tutorials are perhaps most conveniently run in a Jupyter notebook. See [here](https://jupyter.org/install) for instructions on how to install.\n",
+    "\n",
+    "### Installation\n",
+    "\n",
+    "This tutorial requires the `langchain`, `langchain-chroma`, and `langchain-openai` packages:\n",
+    "\n",
+    "```{=mdx}\n",
+    "import Tabs from '@theme/Tabs';\n",
+    "import TabItem from '@theme/TabItem';\n",
+    "import CodeBlock from \"@theme/CodeBlock\";\n",
+    "\n",
+    "<Tabs>\n",
+    "  <TabItem value=\"pip\" label=\"Pip\" default>\n",
+    "    <CodeBlock language=\"bash\">pip install langchain langchain-chroma langchain-openai</CodeBlock>\n",
+    "  </TabItem>\n",
+    "  <TabItem value=\"conda\" label=\"Conda\">\n",
+    "    <CodeBlock language=\"bash\">conda install langchain langchain-chroma langchain-openai -c conda-forge</CodeBlock>\n",
+    "  </TabItem>\n",
+    "</Tabs>\n",
+    "\n",
+    "```\n",
+    "\n",
+    "For more details, see our [Installation guide](/docs/how_to/installation).\n",
+    "\n",
+    "### LangSmith\n",
+    "\n",
+    "Many of the applications you build with LangChain will contain multiple steps with multiple LLM calls.\n",
+    "As these applications get more and more complex, it becomes crucial to be able to inspect what exactly is going on inside your chain or agent.\n",
+    "The best way to do this is with [LangSmith](https://smith.langchain.com).\n",
+    "\n",
+    "After you sign up at the link above, make sure to set your environment variables to start logging traces:\n",
+    "\n",
+    "```shell\n",
+    "export LANGCHAIN_TRACING_V2=\"true\"\n",
+    "export LANGCHAIN_API_KEY=\"...\"\n",
+    "```\n",
+    "\n",
+    "Or, if in a notebook, you can set them with:\n",
+    "\n",
+    "```python\n",
+    "import getpass\n",
+    "import os\n",
+    "\n",
+    "os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
+    "os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()\n",
+    "```\n",
+    "\n",
+    "\n",
+    "## Documents\n",
+    "\n",
+    "LangChain implements a [Document](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.base.Document.html) abstraction, which is intended to represent a unit of text and associated metadata. It has two attributes:\n",
+    "\n",
+    "- `page_content`: a string representing the content;\n",
+    "- `metadata`: a dict containing arbitrary metadata.\n",
+    "\n",
+    "The `metadata` attribute can capture information about the source of the document, its relationship to other documents, and other information. Note that an individual `Document` object often represents a chunk of a larger document.\n",
+    "\n",
+    "Let's generate some sample documents:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "9f3dc151-7b2f-4d94-9558-7a84f7eab100",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_core.documents import Document\n",
+    "\n",
+    "documents = [\n",
+    "    Document(\n",
+    "        page_content=\"Dogs are great companions, known for their loyalty and friendliness.\",\n",
+    "        metadata={\"source\": \"mammal-pets-doc\"},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Cats are independent pets that often enjoy their own space.\",\n",
+    "        metadata={\"source\": \"mammal-pets-doc\"},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Goldfish are popular pets for beginners, requiring relatively simple care.\",\n",
+    "        metadata={\"source\": \"fish-pets-doc\"},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Parrots are intelligent birds capable of mimicking human speech.\",\n",
+    "        metadata={\"source\": \"bird-pets-doc\"},\n",
+    "    ),\n",
+    "    Document(\n",
+    "        page_content=\"Rabbits are social animals that need plenty of space to hop around.\",\n",
+    "        metadata={\"source\": \"mammal-pets-doc\"},\n",
+    "    ),\n",
+    "]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "1cac19bd-27d1-40f1-9c27-7a586b685b4e",
+   "metadata": {},
+   "source": [
+    "Here we've generated five documents, containing metadata indicating three distinct \"sources\".\n",
+    "\n",
+    "## Vector stores\n",
+    "\n",
+    "Vector search is a common way to store and search over unstructured data (such as unstructured text). The idea is to store numeric vectors that are associated with the text. Given a query, we can [embed](/docs/concepts#embedding-models) it as a vector of the same dimension and use vector similarity metrics to identify related data in the store.\n",
+    "\n",
+    "LangChain [VectorStore](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStore.html) objects contain methods for adding text and `Document` objects to the store, and querying them using various similarity metrics. They are often initialized with [embedding](/docs/how_to/embed_text) models, which determine how text data is translated to numeric vectors.\n",
+    "\n",
+    "LangChain includes a suite of [integrations](/docs/integrations/vectorstores) with different vector store technologies. Some vector stores are hosted by a provider (e.g., various cloud providers) and require specific credentials to use; some (such as [Postgres](/docs/integrations/vectorstores/pgvector)) run in separate infrastructure that can be run locally or via a third party; others can run in-memory for lightweight workloads. Here we will demonstrate usage of LangChain VectorStores using [Chroma](/docs/integrations/vectorstores/chroma), which includes an in-memory implementation.\n",
+    "\n",
+    "To instantiate a vector store, we often need to provide an [embedding](/docs/how_to/embed_text) model to specify how text should be converted into a numeric vector. Here we will use [OpenAI embeddings](/docs/integrations/text_embedding/openai/)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "d48acc28-1a34-414b-8e08-fbdef3a2a60b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_chroma import Chroma\n",
+    "from langchain_openai import OpenAIEmbeddings\n",
+    "\n",
+    "vectorstore = Chroma.from_documents(\n",
+    "    documents,\n",
+    "    embedding=OpenAIEmbeddings(),\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ff0f0b43-e5b8-4c79-b782-a02f17345487",
+   "metadata": {},
+   "source": [
+    "Calling `.from_documents` here will add the documents to the vector store. [VectorStore](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStore.html) implements methods for adding documents that can also be called after the object is instantiated. Most implementations will allow you to connect to an existing vector store, e.g., by providing a client, index name, or other information. See the documentation for a specific [integration](/docs/integrations/vectorstores) for more detail.\n",
+    "\n",
+    "Once we've instantiated a `VectorStore` that contains documents, we can query it. [VectorStore](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStore.html) includes methods for querying:\n",
+    "- Synchronously and asynchronously;\n",
+    "- By string query and by vector;\n",
+    "- With and without returning similarity scores;\n",
+    "- By similarity and [maximum marginal relevance](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStore.html#langchain_core.vectorstores.VectorStore.max_marginal_relevance_search) (to balance similarity to the query against diversity in the retrieved results).\n",
+    "\n",
+    "The methods will generally include a list of [Document](https://api.python.langchain.com/en/latest/documents/langchain_core.documents.base.Document.html#langchain_core.documents.base.Document) objects in their outputs.\n",
+    "\n",
+    "### Examples\n",
+    "\n",
+    "Return documents based on similarity to a string query:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "7e01ed91-1a98-4221-960a-bd7a2541a548",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[Document(page_content='Cats are independent pets that often enjoy their own space.', metadata={'source': 'mammal-pets-doc'}),\n",
+       " Document(page_content='Dogs are great companions, known for their loyalty and friendliness.', metadata={'source': 'mammal-pets-doc'}),\n",
+       " Document(page_content='Rabbits are social animals that need plenty of space to hop around.', metadata={'source': 'mammal-pets-doc'}),\n",
+       " Document(page_content='Parrots are intelligent birds capable of mimicking human speech.', metadata={'source': 'bird-pets-doc'})]"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "vectorstore.similarity_search(\"cat\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4d4f9857-5a7d-4b5f-82b8-ff76539143c2",
+   "metadata": {},
+   "source": [
+    "Async query:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "618af196-6182-4a7d-8b09-07493fcdc868",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[Document(page_content='Cats are independent pets that often enjoy their own space.', metadata={'source': 'mammal-pets-doc'}),\n",
+       " Document(page_content='Dogs are great companions, known for their loyalty and friendliness.', metadata={'source': 'mammal-pets-doc'}),\n",
+       " Document(page_content='Rabbits are social animals that need plenty of space to hop around.', metadata={'source': 'mammal-pets-doc'}),\n",
+       " Document(page_content='Parrots are intelligent birds capable of mimicking human speech.', metadata={'source': 'bird-pets-doc'})]"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "await vectorstore.asimilarity_search(\"cat\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "d4172698-9ad7-4422-99b2-bdc268e99c75",
+   "metadata": {},
+   "source": [
+    "Return scores:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "id": "4ed24af2-0d82-478c-949b-b389348d4e9f",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[(Document(page_content='Cats are independent pets that often enjoy their own space.', metadata={'source': 'mammal-pets-doc'}),\n",
+       "  0.3751849830150604),\n",
+       " (Document(page_content='Dogs are great companions, known for their loyalty and friendliness.', metadata={'source': 'mammal-pets-doc'}),\n",
+       "  0.48316916823387146),\n",
+       " (Document(page_content='Rabbits are social animals that need plenty of space to hop around.', metadata={'source': 'mammal-pets-doc'}),\n",
+       "  0.49601367115974426),\n",
+       " (Document(page_content='Parrots are intelligent birds capable of mimicking human speech.', metadata={'source': 'bird-pets-doc'}),\n",
+       "  0.4972994923591614)]"
+      ]
+     },
+     "execution_count": 5,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Note that providers implement different scores; Chroma here\n",
+    "# returns a distance metric that should vary inversely with\n",
+    "# similarity.\n",
+    "\n",
+    "vectorstore.similarity_search_with_score(\"cat\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b4991642-7275-40a9-b11a-e3beccbf2614",
+   "metadata": {},
+   "source": [
+    "Return documents based on similarity to an embedded query:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "b1a5eabb-a821-48cc-917e-cc27f03e4bcc",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[Document(page_content='Cats are independent pets that often enjoy their own space.', metadata={'source': 'mammal-pets-doc'}),\n",
+       " Document(page_content='Dogs are great companions, known for their loyalty and friendliness.', metadata={'source': 'mammal-pets-doc'}),\n",
+       " Document(page_content='Rabbits are social animals that need plenty of space to hop around.', metadata={'source': 'mammal-pets-doc'}),\n",
+       " Document(page_content='Parrots are intelligent birds capable of mimicking human speech.', metadata={'source': 'bird-pets-doc'})]"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "embedding = OpenAIEmbeddings().embed_query(\"cat\")\n",
+    "\n",
+    "vectorstore.similarity_search_by_vector(embedding)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "168dbbec-ea97-4cc9-bb1a-75519c2d08af",
+   "metadata": {},
+   "source": [
+    "Learn more:\n",
+    "\n",
+    "- [API reference](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStore.html)\n",
+    "- [How-to guide](/docs/how_to/vectorstores)\n",
+    "- [Integration-specific docs](/docs/integrations/vectorstores)\n",
+    "\n",
+    "## Retrievers\n",
+    "\n",
+    "LangChain `VectorStore` objects do not subclass [Runnable](https://api.python.langchain.com/en/latest/core_api_reference.html#module-langchain_core.runnables), and so cannot immediately be integrated into LangChain Expression Language [chains](/docs/concepts/#langchain-expression-language-lcel).\n",
+    "\n",
+    "LangChain [Retrievers](https://api.python.langchain.com/en/latest/core_api_reference.html#module-langchain_core.retrievers) are Runnables, so they implement a standard set of methods (e.g., synchronous and asynchronous `invoke` and `batch` operations) and are designed to be incorporated in LCEL chains.\n",
+    "\n",
+    "We can create a simple version of this ourselves, without subclassing `Retriever`. Once we choose which method to use to retrieve documents, we can easily wrap it in a runnable. Below we will build one around the `similarity_search` method:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "f1461582-e569-4326-bd95-510f72edf019",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[[Document(page_content='Cats are independent pets that often enjoy their own space.', metadata={'source': 'mammal-pets-doc'})],\n",
+       " [Document(page_content='Goldfish are popular pets for beginners, requiring relatively simple care.', metadata={'source': 'fish-pets-doc'})]]"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from typing import List\n",
+    "\n",
+    "from langchain_core.documents import Document\n",
+    "from langchain_core.runnables import RunnableLambda\n",
+    "\n",
+    "retriever = RunnableLambda(vectorstore.similarity_search).bind(k=1)  # select top result\n",
+    "\n",
+    "retriever.batch([\"cat\", \"shark\"])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a36d3f64-a8bc-4baa-b2ea-07e324a0143e",
+   "metadata": {},
+   "source": [
+    "Vectorstores implement an `as_retriever` method that will generate a Retriever, specifically a [VectorStoreRetriever](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStoreRetriever.html). These retrievers include specific `search_type` and `search_kwargs` attributes that identify what methods of the underlying vector store to call, and how to parameterize them. For instance, we can replicate the above with the following:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "4989fe5e-ac58-4751-bc35-f53ff885860c",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[[Document(page_content='Cats are independent pets that often enjoy their own space.', metadata={'source': 'mammal-pets-doc'})],\n",
+       " [Document(page_content='Goldfish are popular pets for beginners, requiring relatively simple care.', metadata={'source': 'fish-pets-doc'})]]"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "retriever = vectorstore.as_retriever(\n",
+    "    search_type=\"similarity\",\n",
+    "    search_kwargs={\"k\": 1},\n",
+    ")\n",
+    "\n",
+    "retriever.batch([\"cat\", \"shark\"])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6b79ded3-39ed-4aeb-8b70-cd36795ae239",
+   "metadata": {},
+   "source": [
+    "`VectorStoreRetriever` supports search types of `\"similarity\"` (default), `\"mmr\"` (maximum marginal relevance, described above), and `\"similarity_score_threshold\"`. We can use the latter to threshold documents output by the retriever by similarity score.\n",
+    "\n",
+    "Retrievers can easily be incorporated into more complex applications, such as retrieval-augmented generation (RAG) applications that combine a given question with retrieved context into a prompt for an LLM. Below we show a minimal example.\n",
+    "\n",
+    "```{=mdx}\n",
+    "import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
+    "\n",
+    "<ChatModelTabs customVarName=\"llm\" />\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "id": "c77b68bf-59f3-4416-9877-960f934c374d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# | output: false\n",
+    "# | echo: false\n",
+    "\n",
+    "from langchain_openai import ChatOpenAI\n",
+    "\n",
+    "llm = ChatOpenAI(model=\"gpt-3.5-turbo\", temperature=0)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "id": "6f1ae0d0-0b4b-4da0-80ce-f82913052a83",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_core.prompts import ChatPromptTemplate\n",
+    "from langchain_core.runnables import RunnablePassthrough\n",
+    "\n",
+    "message = \"\"\"\n",
+    "Answer this question using the provided context only.\n",
+    "\n",
+    "{question}\n",
+    "\n",
+    "Context:\n",
+    "{context}\n",
+    "\"\"\"\n",
+    "\n",
+    "prompt = ChatPromptTemplate.from_messages([(\"human\", message)])\n",
+    "\n",
+    "rag_chain = {\"context\": retriever, \"question\": RunnablePassthrough()} | prompt | llm"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "id": "b3c0d625-61e0-492e-b3a6-c40d383fca03",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Cats are independent pets that often enjoy their own space.\n"
+     ]
+    }
+   ],
+   "source": [
+    "response = rag_chain.invoke(\"tell me about cats\")\n",
+    "\n",
+    "print(response.content)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3d9be7cb-2081-48a4-b6e4-d5e2d562ffd4",
+   "metadata": {},
+   "source": [
+    "## Learn more:\n",
+    "\n",
+    "Retrieval strategies can be rich and complex. For example:\n",
+    "\n",
+    "- We can [infer hard rules and filters](/docs/how_to/self_query/) from a query (e.g., \"using documents published after 2020\");\n",
+    "- We can [return documents that are linked](/docs/how_to/parent_document_retriever/) to the retrieved context in some way (e.g., via some document taxonomy);\n",
+    "- We can generate [multiple embeddings](/docs/how_to/multi_vector) for each unit of context;\n",
+    "- We can [ensemble results](/docs/how_to/ensemble_retriever) from multiple retrievers;\n",
+    "- We can assign weights to documents, e.g., to weigh [recent documents](/docs/how_to/time_weighted_vectorstore/) higher.\n",
+    "\n",
+    "The [retrievers](/docs/how_to#retrievers) section of the how-to guides covers these and other built-in retrieval strategies.\n",
+    "\n",
+    "It is also straightforward to extend the [BaseRetriever](https://api.python.langchain.com/en/latest/retrievers/langchain_core.retrievers.BaseRetriever.html) class in order to implement custom retrievers. See our how-to guide [here](/docs/how_to/custom_retriever)."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.4"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
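As a rough illustration of the `search_type` options described in the notebook above, the following sketch mimics top-k similarity search and score thresholding in plain Python. The `toy_search` helper, the hand-made three-dimensional vectors, and the threshold value are all hypothetical; this is not how `VectorStoreRetriever` is implemented, only the idea behind `k` and `"similarity_score_threshold"`.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical hand-made "embeddings"; a real vector store computes these
# with an embedding model.
DOCS: Dict[str, List[float]] = {
    "cats": [1.0, 0.0, 0.0],
    "dogs": [0.8, 0.6, 0.0],
    "goldfish": [0.0, 0.0, 1.0],
}


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def toy_search(
    query: List[float],
    k: int = 4,
    score_threshold: Optional[float] = None,
) -> List[Tuple[str, float]]:
    """Rank documents by similarity, optionally drop low scores, keep the top k."""
    scored = [(name, cosine(query, vec)) for name, vec in DOCS.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    if score_threshold is not None:
        scored = [pair for pair in scored if pair[1] >= score_threshold]
    return scored[:k]


# Analogous to search_kwargs={"k": 1}: only the single best match survives.
top_one = toy_search([1.0, 0.0, 0.0], k=1)

# Analogous to a score threshold: every sufficiently similar document survives.
above_half = toy_search([1.0, 0.0, 0.0], score_threshold=0.5)
```

With `k=1` the result has a fixed size regardless of quality, while a threshold returns a variable number of documents and can return none at all.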
--- a/docs/docs/tutorials/summarization.ipynb
+++ b/docs/docs/tutorials/summarization.ipynb
@@ -84,7 +84,7 @@
    "```\n",
    "\n",
    "\n",
-    "For more details, see our [Installation guide](/docs/installation).\n",
+    "For more details, see our [Installation guide](/docs/how_to/installation).\n",
    "\n",
    "### LangSmith\n",
    "\n",
--- a/docs/docs/versions/overview.mdx
+++ b/docs/docs/versions/overview.mdx
@@ -3,7 +3,7 @@ sidebar_position: 0
 sidebar_label: Overview
 ---

-# LangChain Over Time
+# LangChain over time

 ## What’s new in LangChain?

@@ -45,7 +45,7 @@ This document serves to outline at a high level what has changed and why.

 - `langchain` was split into the following component packages: `langchain-core`, `langchain`, `langchain-community`, `langchain-[partner]` to improve the usability of langchain code in production settings. You can read more about it on our [blog](https://blog.langchain.dev/langchain-v0-1-0/).

-### Ecosystem Organization
+### Ecosystem organization

 By the release of 0.1.0, LangChain had grown to a large ecosystem with many integrations and a large community.

--- a/docs/docs/versions/packages.mdx
+++ b/docs/docs/versions/packages.mdx
@@ -3,7 +3,7 @@ sidebar_position: 3
 sidebar_label: Packages
 ---

-# 📕 Package Versioning
+# 📕 Package versioning

 As of now, LangChain has an ad hoc release process: releases are cut with high frequency by
 a maintainer and published to [PyPI](https://pypi.org/).
--- a/docs/docs/versions/release_policy.mdx
+++ b/docs/docs/versions/release_policy.mdx
@@ -3,7 +3,7 @@ sidebar_position: 2
 sidebar_label: Release Policy
 ---

-# LangChain Releases
+# LangChain releases

 The LangChain ecosystem is composed of different component packages (e.g., `langchain-core`, `langchain`, `langchain-community`, `langgraph`, `langserve`, partner packages etc.)

@@ -32,13 +32,13 @@ From time to time, we will version packages as **release candidates**. These are

 Other packages in the ecosystem (including user packages) can follow a different versioning scheme, but are generally expected to pin to specific minor versions of `langchain` and `langchain-core`.

-## Release Cadence
+## Release cadence

 We expect to space out **minor** releases (e.g., from 0.2.0 to 0.3.0) of `langchain` and `langchain-core` by at least 2-3 months, as such releases may contain breaking changes.

 Patch versions are released frequently as they contain bug fixes and new features.

-## API Stability
+## API stability

 The development of LLM applications is a rapidly evolving field, and we are constantly learning from our users and the community. As such, we expect that the APIs in `langchain` and `langchain-core` will continue to evolve to better serve the needs of our users.

@@ -49,14 +49,14 @@ Even though both `langchain` and `langchain-core` are currently in a pre-1.0 sta

 We will generally try to avoid making unnecessary changes, and will provide a deprecation policy for features that are being removed.

-### Stability of Other Packages
+### Stability of other packages

 The stability of other packages in the LangChain ecosystem may vary:

 - `langchain-community` is a community maintained package that contains 3rd party integrations. While we do our best to review and test changes in `langchain-community`, `langchain-community` is expected to experience more breaking changes than `langchain` and `langchain-core` as it contains many community contributions.
 - Partner packages may follow different stability and versioning policies, and users should refer to the documentation of those packages for more information; however, in general these packages are expected to be stable.

-### What is a "API Stability"?
+### What is "API stability"?

 API stability means:

@@ -72,7 +72,7 @@ Certain APIs are explicitly marked as “internal” in a couple of ways:
 - Functions, methods, and other objects prefixed by a leading underscore (**`_`**). This is the standard Python convention of indicating that something is private; if any method starts with a single **`_`**, it’s an internal API.
    - **Exception:** Certain methods are prefixed with `_` , but do not contain an implementation. These methods are *meant* to be overridden by sub-classes that provide the implementation. Such methods are generally part of the **Public API** of LangChain.

-## Deprecation Policy
+## Deprecation policy

 We will generally avoid deprecating features until a better alternative is available.

--- a/docs/docs/versions/v0_2.mdx
+++ b/docs/docs/versions/v0_2.mdx
@@ -41,7 +41,7 @@ Here is an example of the import changes that the migration script can help appl
 | langchain           | langchain-text-splitters | from langchain.text_splitter import RecursiveCharacterTextSplitter | from langchain_text_splitters import RecursiveCharacterTextSplitter |


-#### Deprecation Timeline
+#### Deprecation timeline

 We have two main types of deprecations:

@@ -102,7 +102,7 @@ langchain-cli migrate [path to code] --diff # Preview
 langchain-cli migrate [path to code] # Apply
 ```

-#### Other Options
+#### Other options

 ```bash
 # See help menu
@@ -114,11 +114,11 @@ langchain-cli migrate --diff [path to code]
 langchain-cli migrate --disable langchain_to_core --include-ipynb [path to code]
 ```

-## Deprecations and Breaking Changes
+## Deprecations and breaking changes

 This code contains a list of deprecations and removals in the `langchain` and `langchain-core` packages.

-### Breaking Changes in 0.2.0
+### Breaking changes in 0.2.0

 As of release 0.2.0, `langchain` is required to be integration-agnostic. This means that code in `langchain`  should not by default instantiate any specific chat models, llms, embedding models, vectorstores etc; instead, the user will be required to specify those explicitly.

--- a/docs/docusaurus.config.js
+++ b/docs/docusaurus.config.js
@@ -124,7 +124,7 @@ const config = {
    /** @type {import('@docusaurus/preset-classic').ThemeConfig} */
    ({
      announcementBar: {
-        content: 'You are viewing the <strong>preview</strong> LangChain v0.2 docs. View the <a href="/v0.1/docs/get_started/introduction/">stable 0.1 docs here</a>.',
+        content: 'You are viewing the <strong>preview</strong> v0.2 docs. View the <strong>stable</strong> v0.1 docs <a href="/v0.1/docs/get_started/introduction/">here</a>. Leave feedback on the v0.2 docs <a href="https://github.com/langchain-ai/langchain/discussions/21716">here</a>.',
        isCloseable: true,
      },
      docs: {
@@ -310,9 +310,9 @@ const config = {
        // this is linked to erick@langchain.dev currently
        apiKey: "6c01842d6a88772ed2236b9c85806441",

-        indexName: "python-langchain",
+        indexName: "python-langchain-0.2",

-        contextualSearch: true,
+        contextualSearch: false,
      },
    }),

--- a/docs/scripts/notebook_convert.py
+++ b/docs/scripts/notebook_convert.py
@@ -7,7 +7,7 @@ from typing import Iterable, Tuple

 import nbformat
 from nbconvert.exporters import MarkdownExporter
-from nbconvert.preprocessors import Preprocessor, RegexRemovePreprocessor
+from nbconvert.preprocessors import Preprocessor


 class EscapePreprocessor(Preprocessor):
@@ -79,11 +79,26 @@ class ExtractAttachmentsPreprocessor(Preprocessor):
        return cell, resources


+class CustomRegexRemovePreprocessor(Preprocessor):
+    def check_conditions(self, cell):
+        pattern = re.compile(r"(?s)(?:\s*\Z)|(?:.*#\s*\|\s*output:\s*false.*)")
+        rtn = not pattern.match(cell.source)
+        if not rtn:
+            return False
+        else:
+            return True
+
+    def preprocess(self, nb, resources):
+        nb.cells = [cell for cell in nb.cells if self.check_conditions(cell)]
+
+        return nb, resources
+
+
 exporter = MarkdownExporter(
    preprocessors=[
        EscapePreprocessor,
        ExtractAttachmentsPreprocessor,
-        RegexRemovePreprocessor(patterns=[r"^\s*$"]),
+        CustomRegexRemovePreprocessor,
    ],
    template_name="mdoutput",
    extra_template_basedirs=["./scripts/notebook_convert_templates"],
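To make the filter in `CustomRegexRemovePreprocessor` easier to reason about, here is a small standalone sketch that applies the same pattern to a few sample cell sources. The `keep_cell` helper and the sample strings are hypothetical and exist only for illustration; the hunk above does not show where `re` is imported, so the sketch imports it itself.

```python
import re

# Same pattern as in the diff above: match cells that are empty or
# whitespace-only, or that contain a "# | output: false" directive.
PATTERN = re.compile(r"(?s)(?:\s*\Z)|(?:.*#\s*\|\s*output:\s*false.*)")


def keep_cell(source: str) -> bool:
    """Return True if the cell would survive the filter (hypothetical helper)."""
    return PATTERN.match(source) is None


samples = {
    "empty": "",
    "whitespace": "   \n\n",
    "hidden": "# | output: false\n# | echo: false\n\nllm = object()",
    "hidden_compact": "#|output:false\nx = 1",
    "normal": "print('hello')",
}

results = {name: keep_cell(source) for name, source in samples.items()}
```

Because `(?s)` lets `.` match newlines, the directive is detected anywhere in the cell, not only on the first line.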
--- a/docs/sidebars.js
+++ b/docs/sidebars.js
@@ -21,12 +21,9 @@
 module.exports = {
  docs: [
    {
-      type: "category",
-      label: "Introduction",
-      collapsed: false,
-      collapsible: false,
-      link: { type: "doc", id: "introduction" },
-      items: ["installation"],
+          type: "doc",
+          label: "Introduction",
+          id: "introduction",
    },
    {
      type: "category",
@@ -42,7 +39,7 @@ module.exports = {
    {
      type: "category",
      link: {type: 'doc', id: 'how_to/index'},
-      label: "How-To Guides",
+      label: "How-to guides",
      collapsible: false,
      items: [{
        type: 'autogenerated',
--- a/docs/src/theme/Feedback.js
+++ b/docs/src/theme/Feedback.js
@@ -112,6 +112,7 @@ export default function Feedback() {
  const { setCookie, checkCookie } = useCookie();
  const [feedbackSent, setFeedbackSent] = useState(false);
  const { siteConfig } = useDocusaurusContext();
+  const [pathname, setPathname] = useState("");

  /** @param {"good" | "bad"} feedback */
  const handleFeedback = async (feedback) => {
@@ -167,6 +168,7 @@ export default function Feedback() {
      // (cookies exp in 24hrs)
      const cookieName = `${FEEDBACK_COOKIE_PREFIX}_${window.location.pathname}`;
      setFeedbackSent(checkCookie(cookieName));
+      setPathname(window.location.pathname);
    }
  }, []);

@@ -192,6 +194,10 @@ export default function Feedback() {
    onMouseUp: (e) => (e.currentTarget.style.backgroundColor = "#f0f0f0"),
  };

+  const newGithubIssueURL = pathname
+    ? `https://github.com/langchain-ai/langchain/issues/new?assignees=&labels=03+-+Documentation&projects=&template=documentation.yml&title=DOC%3A+%3CIssue+related+to+${pathname}%3E`
+    : "https://github.com/langchain-ai/langchain/issues/new?assignees=&labels=03+-+Documentation&projects=&template=documentation.yml&title=DOC%3A+%3CPlease+write+a+comprehensive+title+after+the+%27DOC%3A+%27+prefix%3E";
+
  return (
    <div style={{ display: "flex", flexDirection: "column" }}>
      <hr />
@@ -199,7 +205,7 @@ export default function Feedback() {
        <h4>Thanks for your feedback!</h4>
      ) : (
        <>
-          <h4>Help us out by providing feedback on this documentation page:</h4>
+          <h4>Was this page helpful?</h4>
          <div style={{ display: "flex", gap: "5px" }}>
            <div
              {...defaultFields}
@@ -240,6 +246,14 @@ export default function Feedback() {
          </div>
        </>
      )}
+      <br />
+      <h4>
+        You can leave detailed feedback{" "}
+        <a target="_blank" href={newGithubIssueURL}>
+          on GitHub
+        </a>
+        .
+      </h4>
    </div>
  );
 }
--- a/docs/src/theme/PrerequisiteLinks.js
+++ b/docs/src/theme/PrerequisiteLinks.js
@@ -1,19 +0,0 @@
-import React from "react";
-import { marked } from "marked";
-import DOMPurify from "isomorphic-dompurify";
-import Admonition from '@theme/Admonition';
-
-export default function PrerequisiteLinks({ content }) {
-  return (
-    <Admonition type="info" title="Prerequisites">
-      <div style={{ marginTop: "8px" }}>
-        This guide will assume familiarity with the following concepts:
-      </div>
-      <div style={{ marginTop: "16px" }}
-        dangerouslySetInnerHTML={{
-          __html: DOMPurify.sanitize(marked.parse(content))
-        }} 
-      />
-    </Admonition>
-  );
-}
--- a/docs/vercel_requirements.txt
+++ b/docs/vercel_requirements.txt
@@ -1,6 +1,5 @@
 -e ../libs/langchain
 -e ../libs/community
--e ../libs/core
 -e ../libs/experimental
 -e ../libs/text-splitters
 langchain-cohere
@@ -10,3 +9,4 @@ langchain-elasticsearch
 langchain-postgres
 urllib3==1.26.18
 nbconvert==7.16.4
+langchain-core==0.1.52
--- a/libs/cli/langchain_cli/integration_template/pyproject.toml
+++ b/libs/cli/langchain_cli/integration_template/pyproject.toml
@@ -12,7 +12,7 @@ license = "MIT"

 [tool.poetry.dependencies]
 python = ">=3.8.1,<4.0"
-langchain-core = "^0.1"
+langchain-core = ">=0.1,<0.3"

 [tool.poetry.group.test]
 optional = true
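The constraint change above can be checked with a short sketch. Under Poetry's caret rule for 0.x versions, `^0.1` is equivalent to `>=0.1.0,<0.2.0`, so it excludes `langchain-core` 0.2, while `>=0.1,<0.3` admits it. The helper names below are hypothetical, and the parser deliberately ignores pre-release and build tags.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse 'X.Y' or 'X.Y.Z' into a 3-tuple (no pre-release handling)."""
    parts = [int(piece) for piece in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return (parts[0], parts[1], parts[2])


def satisfies_caret_0_1(version: str) -> bool:
    """Old constraint: ^0.1, i.e. >=0.1.0,<0.2.0."""
    return parse("0.1.0") <= parse(version) < parse("0.2.0")


def satisfies_relaxed(version: str) -> bool:
    """New constraint: >=0.1,<0.3."""
    return parse("0.1.0") <= parse(version) < parse("0.3.0")


for candidate in ("0.1.52", "0.2.0", "0.3.0"):
    print(candidate, satisfies_caret_0_1(candidate), satisfies_relaxed(candidate))
```

Only the middle candidate changes status between the two constraints, which is the point of the relaxation.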
--- a/libs/cli/langchain_cli/package_template/pyproject.toml
+++ b/libs/cli/langchain_cli/package_template/pyproject.toml
@@ -7,7 +7,7 @@ readme = "README.md"

 [tool.poetry.dependencies]
 python = ">=3.8.1,<4.0"
-langchain-core = ">=0.1.5"
+langchain-core = ">=0.1.5,<0.3"
 langchain-openai = ">=0.0.1"


--- a/libs/community/langchain_community/adapters/openai.py
+++ b/libs/community/langchain_community/adapters/openai.py
@@ -95,18 +95,18 @@ def convert_dict_to_message(_dict: Mapping[str, Any]) -> BaseMessage:
    elif role == "system":
        return SystemMessage(content=_dict.get("content", ""))
    elif role == "function":
-        return FunctionMessage(content=_dict.get("content", ""), name=_dict.get("name"))
+        return FunctionMessage(content=_dict.get("content", ""), name=_dict.get("name"))  # type: ignore[arg-type]
    elif role == "tool":
        additional_kwargs = {}
        if "name" in _dict:
            additional_kwargs["name"] = _dict["name"]
        return ToolMessage(
            content=_dict.get("content", ""),
-            tool_call_id=_dict.get("tool_call_id"),
+            tool_call_id=_dict.get("tool_call_id"),  # type: ignore[arg-type]
            additional_kwargs=additional_kwargs,
        )
    else:
-        return ChatMessage(content=_dict.get("content", ""), role=role)
+        return ChatMessage(content=_dict.get("content", ""), role=role)  # type: ignore[arg-type]


 def convert_message_to_dict(message: BaseMessage) -> dict:
--- a/libs/community/langchain_community/agent_toolkits/azure_ai_services.py
+++ b/libs/community/langchain_community/agent_toolkits/azure_ai_services.py
@@ -21,11 +21,11 @@ class AzureAiServicesToolkit(BaseToolkit):
        """Get the tools in the toolkit."""

        tools: List[BaseTool] = [
-            AzureAiServicesDocumentIntelligenceTool(),
-            AzureAiServicesImageAnalysisTool(),
-            AzureAiServicesSpeechToTextTool(),
-            AzureAiServicesTextToSpeechTool(),
-            AzureAiServicesTextAnalyticsForHealthTool(),
+            AzureAiServicesDocumentIntelligenceTool(),  # type: ignore[call-arg]
+            AzureAiServicesImageAnalysisTool(),  # type: ignore[call-arg]
+            AzureAiServicesSpeechToTextTool(),  # type: ignore[call-arg]
+            AzureAiServicesTextToSpeechTool(),  # type: ignore[call-arg]
+            AzureAiServicesTextAnalyticsForHealthTool(),  # type: ignore[call-arg]
        ]

        return tools
--- a/libs/community/langchain_community/agent_toolkits/azure_cognitive_services.py
+++ b/libs/community/langchain_community/agent_toolkits/azure_cognitive_services.py
@@ -21,13 +21,13 @@ class AzureCognitiveServicesToolkit(BaseToolkit):
        """Get the tools in the toolkit."""

        tools: List[BaseTool] = [
-            AzureCogsFormRecognizerTool(),
-            AzureCogsSpeech2TextTool(),
-            AzureCogsText2SpeechTool(),
-            AzureCogsTextAnalyticsHealthTool(),
+            AzureCogsFormRecognizerTool(),  # type: ignore[call-arg]
+            AzureCogsSpeech2TextTool(),  # type: ignore[call-arg]
+            AzureCogsText2SpeechTool(),  # type: ignore[call-arg]
+            AzureCogsTextAnalyticsHealthTool(),  # type: ignore[call-arg]
        ]

        # TODO: Remove check once azure-ai-vision supports MacOS.
        if sys.platform.startswith("linux") or sys.platform.startswith("win"):
-            tools.append(AzureCogsImageAnalysisTool())
+            tools.append(AzureCogsImageAnalysisTool())  # type: ignore[call-arg]
        return tools
--- a/libs/community/langchain_community/agent_toolkits/clickup/toolkit.py
+++ b/libs/community/langchain_community/agent_toolkits/clickup/toolkit.py
@@ -102,7 +102,7 @@ class ClickupToolkit(BaseToolkit):
            )
            for action in operations
        ]
-        return cls(tools=tools)
+        return cls(tools=tools)  # type: ignore[arg-type]

    def get_tools(self) -> List[BaseTool]:
        """Get the tools in the toolkit."""
Author	SHA1	Message	Date
Erick Friis	be15740084	fireworks: add secret (#21744 )	2024-05-15 19:48:51 -07:00
Erick Friis	06110e20b9	pinecone: bump min core version (#21742 )	2024-05-15 19:31:43 -07:00
Erick Friis	bd3e7d50f3	fireworks: bump min core version (#21741 )	2024-05-15 19:29:13 -07:00
Erick Friis	1647b28a87	infra: release min version dont clobber current lib (#21740 )	2024-05-15 19:27:39 -07:00
Erick Friis	f5c31078d7	airbyte[patch]: airbyte-cdk compatible pydantic versions (#21738 )	2024-05-15 19:13:25 -07:00
Erick Friis	3d33b89fa4	ibm[patch]: release 0.1.7 (#21737 )	2024-05-15 19:10:15 -07:00
Erick Friis	e41d801369	openai[patch]: fix embedding float precision issue (#21736 ) also clean up + comment some of the embedding batching code	2024-05-16 02:06:51 +00:00
JuHyung Son	38c297a025	upstage: Support batch input in embedding request. (#21730 ) Description: upstage embedding now supports batch input.	2024-05-15 18:13:44 -07:00
junefish	c5a981e3b4	docs: Update Pinecone example notebook with embedded widget (#21719 ) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-05-15 21:20:46 +00:00
Erick Friis	0aea7f4b1d	docs: fix installation link (#21728 )	2024-05-15 21:10:12 +00:00
Harrison Chase	15be439719	Harrison/move flashrank rerank (#21448 ) third party integration, should be in community	2024-05-15 13:08:52 -07:00
Harrison Chase	c6c2649a5a	move installation (#21711 )	2024-05-15 12:59:45 -07:00
Erick Friis	aca98fd150	multiple: releases with relaxed core dep (#21724 )	2024-05-15 19:29:35 +00:00
Bagatur	af284518bc	openai[patch]: Release 0.1.7, bump tiktoken 0.7.0 (#21723 )	2024-05-15 12:19:29 -07:00
Bagatur	0405933914	docs: add feedback link to 0.2 banner (#21600 )	2024-05-15 10:53:48 -07:00
William FH	ca768c8353	[Core] Check is async callable (#21714 ) To permit proper coercion of objects like the following: ```python class MyAsyncCallable: async def __call__(self, foo): return await ... class MyAsyncGenerator: async def __call__(self, foo): await ... yield ```	2024-05-15 10:49:49 -07:00
ccurme	7128c2d8ad	docs: add tutorial for vector stores and retrievers (#21683 ) also update how-to guide for parent document retriever	2024-05-15 11:50:24 -04:00
Eugene Yurtsev	5c2cfabec6	core[minor]: Add v2 implementation of astream events (#21638 ) This PR introduces a v2 implementation of astream events that removes intermediate abstractions and fixes some issues with v1 implementation. The v2 implementation significantly reduces relevant code that's associated with the astream events implementation together with overhead. After this PR, the astream events implementation: - Uses an async callback handler - No longer relies on BaseTracer - No longer relies on json patch As a result of this re-write, a number of issues were discovered with the existing implementation. ## Changes in V2 vs. V1 ### on_chat_model_end `output` The outputs associated with `on_chat_model_end` changed depending on whether it was within a chain or not. As a root level runnable the output was: ```python "data": {"output": AIMessageChunk(content="hello world!", id='some id')} ``` As part of a chain the output was: ``` "data": { "output": { "generations": [ [ { "generation_info": None, "message": AIMessageChunk( content="hello world!", id=AnyStr() ), "text": "hello world!", "type": "ChatGenerationChunk", } ] ], "llm_output": None, } }, ``` After this PR, we will always use the simpler representation: ```python "data": {"output": AIMessageChunk(content="hello world!", id='some id')} ``` NOTE Non chat models (i.e., regular LLMs) are still associated with the more verbose format. ### Remove some `_stream` events `on_retriever_stream` and `on_tool_stream` events were removed -- these were not real events, but created as an artifact of implementing on top of astream_log. The same information is already available in the `x_on_end` events. 
### Propagating Names Names of runnables have been updated to be more consistent ```python model = GenericFakeChatModel(messages=infinite_cycle).configurable_fields( messages=ConfigurableField( id="messages", name="Messages", description="Messages return by the LLM", ) ) ``` Before: ```python "name": "RunnableConfigurableFields", ``` After: ```python "name": "GenericFakeChatModel", ``` ### on_retriever_end on_retriever_end will always return `output` which is a list of documents (rather than a dict containing a key called "documents") ### Retry events Removed the `on_retry` callback handler. It was incorrectly showing that the failed function being retried has invoked `on_chain_end` https://github.com/langchain-ai/langchain/pull/21638/files#diff-e512e3f84daf23029ebcceb11460f1c82056314653673e450a5831147d8cb84dL1394	2024-05-15 11:48:47 -04:00
Rajendra Kadam	54e003268e	langchain[minor]: Add PebbloRetrievalQA chain with Identity & Semantic Enforcement support (#20641 ) - Description: PebbloRetrievalQA chain introduces identity enforcement using vector-db metadata filtering - Dependencies: None - Issue: None - Documentation: Adding documentation for PebbloRetrievalQA chain in a separate PR(https://github.com/langchain-ai/langchain/pull/20746) - Unit tests: New unit-tests added --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-05-15 13:14:52 +00:00
Bagatur	f2f970f93d	docs: openai bind tools nit (#21692 )	2024-05-15 01:20:53 +00:00
Erick Friis	5fa5a73dc0	docs: disable contextual search (#21691 )	2024-05-14 16:59:11 -07:00
Erick Friis	3ee0747382	infra: remove prints from notebook build (#21688 )	2024-05-14 16:27:56 -07:00
Erick Friis	024c11ff9c	docs: v0.2 search index (#21619 )	2024-05-14 15:37:42 -07:00
Bagatur	241a6e43a5	docs: update structured how to (#21679 )	2024-05-14 22:19:51 +00:00
Jib	f369495fa0	mongodb: [performance] Increase DEFAULT_INSERT_BATCH_SIZE to 100,000 and introduce sizing constraints (#19608 )	2024-05-14 22:11:26 +00:00
Eugene Yurtsev	e69a9bedf8	core[patch]: Update mypy config (#21684 ) Update mypy config to ignore checking deps from numpy and pytest (which are optional in langsmith sdk)	2024-05-14 17:29:07 -04:00
Erick Friis	9973547aef	mongodb: release 0.1.4 (#21678 )	2024-05-14 11:54:23 -07:00
Jib	a97473c846	mongodb[patch]: Make ObjectId JSON-serializable on generation (#21394 )	2024-05-14 11:52:29 -07:00
ccurme	12b599c47f	docs: add how-to on multi-modal tool calling (#21667 ) Can move this to a dedicated multi-modal section if desired.	2024-05-14 12:26:25 -04:00
Eugene Yurtsev	5c64c004cc	core[patch]: Add unit tests with some streaming scenarios (#21668 ) Add unit tests that show differences between sync / async versions when streaming. The inner on_chain_chunk event is missing if mixing sync and async functionality. Likely due to missing tap_output_iter implementation on the sync variant of `_transform_stream_with_config`	2024-05-14 15:30:57 +00:00
Eugene Yurtsev	2ac4d2960c	core[patch]: Add unit test to catch ordering (#21669 ) Add unit test to catch ordering issues	2024-05-14 15:25:33 +00:00
ccurme	3390dc2266	docs: style nits (#21666 )	2024-05-14 10:18:13 -04:00
ccurme	2463c8060c	docs: how-to on adding scores to retriever results (#21626 )	2024-05-14 09:41:36 -04:00
Zhao Blake	972d2071c6	core[patch]: Fix typo in VectorStoreExampleSelector doc-string (#21574 )	2024-05-14 13:31:37 +00:00
William FH	714cba96a8	[docs] Update langgraph migration guide (#21644 ) - add links to references where appropriate - use the create_react_agent - Fix the timeout recommendation	2024-05-14 06:13:17 +00:00
Erick Friis	5144c94603	docs: add 0.2 search notice (#21653 )	2024-05-14 04:00:18 +00:00
Erick Friis	2a984e8e3f	docs: huggingface package (#21645 )	2024-05-14 03:17:40 +00:00
Anush	cd1879f5e7	docs: Qdrant partner package reference (#21649 ) ## Description: As the title goes.	2024-05-13 19:51:57 -07:00
Erick Friis	c77d2f2b06	multiple: core 0.2 nonbreaking dep, check_diff community->langchain dep (#21646 ) 0.2 is not a breaking release for core (but it is for langchain and community) To keep the core+langchain+community packages in sync at 0.2, we will relax deps throughout the ecosystem to tolerate `langchain-core` 0.2	2024-05-13 19:50:36 -07:00
Anush	edd68e4ad4	qdrant: init package (#21146 ) ## Description This PR introduces the new `langchain-qdrant` partner package, intending to deprecate the community package. ## Changes - Moved the Qdrant vector store implementation `/libs/partners/qdrant` with integration tests. - The conditional imports of the client library are now regular with minor implementation improvements. - Added a deprecation warning to `langchain_community.vectorstores.qdrant.Qdrant`. - Replaced references/imports from `langchain_community` with either `langchain_core` or by moving the definitions to the `langchain_qdrant` package itself. - Updated the Qdrant vector store documentation to reflect the changes. ## Testing - `QDRANT_URL` and [`QDRANT_API_KEY`](`583e36bf6b`) env values need to be set to [run integration tests](`d608c93d1f`) in the [cloud](https://cloud.qdrant.tech). - If a Qdrant instance is running at `http://localhost:6333`, the integration tests will use it too. - By default, tests use an [`in-memory`](https://github.com/qdrant/qdrant-client?tab=readme-ov-file#local-mode) instance(Not comprehensive). --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Erick Friis <erickfriis@gmail.com>	2024-05-13 18:20:03 -07:00
Erick Friis	fe8c9d621a	docs: ignore nb echo:false blocks (#21624 ) not working currently	2024-05-13 17:18:26 -07:00
Prashanth Rao	63c3a0e56c	[community][graph]: Update KuzuQAChain and docs (#21218 ) This PR makes some small updates for `KuzuQAChain` for graph QA. - Updated Cypher generation prompt (we now support `WHERE EXISTS`) and generalize it more - Support different LLMs for Cypher generation and QA - Update docs and examples	2024-05-13 17:17:14 -07:00
Bagatur	752b1e85f8	docs: gh feedback link (#21606 ) Co-authored-by: bracesproul <braceasproul@gmail.com>	2024-05-14 00:11:37 +00:00
Bagatur	506df439eb	docs: how to index nits (#21623 )	2024-05-13 23:52:50 +00:00
Bagatur	b514a479c0	docs: standardize capitalization (#21641 )	2024-05-13 16:25:51 -07:00
Bagatur	89aae3e043	docs: add Techniques to Concepts (#21636 ) - Adds Techniques section - Moves function calling, retrieval types to Techniques - Removes Installation section (not conceptual) - Reorders a few things (chat models before llms, package descriptions before diagram) - Add text splitter types to Techniques	2024-05-13 16:06:16 -07:00
Tomaz Bratanic	89ff6a3d3b	Add sentiment and confidence levels to diffbotgraphtransformer (#21590 ) Co-authored-by: Erick Friis <erickfriis@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-05-13 23:00:52 +00:00
Bagatur	526ba235f3	docs: fix prereq links (#21630 )	2024-05-13 15:40:53 -07:00
Erick Friis	0541e06e21	infra: 0.2 docs 404 page (#21634 )	2024-05-13 22:11:28 +00:00
Erick Friis	e861b5bcb7	infra: fix api ref link generation (#21631 )	2024-05-13 14:52:26 -07:00
Erick Friis	9b51ca08bc	huggingface: fix community dep checking (#21628 )	2024-05-13 21:52:18 +00:00
Erick Friis	91a2ea5cd6	chroma, mongodb: fix docstrings (#21629 )	2024-05-13 21:27:43 +00:00
Jofthomas	afd85b60fc	huggingface: init package (#21097 ) First PR for the langchain_huggingface partner package - Moved some of the Hugging Face related classes from `community` to the new `partner package` Still needed: - Documentation - Tests - Support for the new apply_chat_template in `ChatHuggingFace` - Confirm choice of classes to support for embeddings with the sentence-transformer team. cc : @efriis --------- Co-authored-by: Cyril Kondratenko <kkn1993@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-05-13 20:53:15 +00:00
Tomaz Bratanic	9fce03e7db	community[patch]: Fix neo4j enhanced schema (#21582 )	2024-05-13 15:26:06 -04:00
Christophe Bornet	66a4da8ad0	community[patch]: Improve Cassandra VectorStore docstrings (#21620 )	2024-05-13 15:24:26 -04:00
adreo00	40aff1eacc	core[major]: AsyncCallbackManagerForToolRun no longer casts return object to string (#20374 ) - Description: Stops `AsyncCallbackManagerForToolRun` from converting the output to str - Issue: #20372 - Dependencies: None	2024-05-13 15:09:12 -04:00
Eugene Yurtsev	25fbe356b4	community[patch]: upgrade to recent version of mypy (#21616 ) This PR upgrades community to a recent version of mypy. It inserts type: ignore on all existing failures.	2024-05-13 14:55:07 -04:00
Eugene Yurtsev	b923951062	langchain[patch]: CI add lint rule for community imports (#21618 ) Add a rule to check for imports from community in global scope	2024-05-13 14:51:25 -04:00
Jorge Piedrahita Ortiz	4378fbbef0	community[patch]: Fix typos in Sambanova integration doc-strings (#21617 ) - Description: Updated badly formatted Sambanova integration docstrings --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-05-13 18:35:16 +00:00
Erick Friis	0f5bf94f9f	infra: remove ai21 docs scan features (#21614 ) ai21 depends on ai21-tokenizer, which depends on a too restrictive/old version of `tokenizers`	2024-05-13 18:05:53 +00:00
ccurme	fe08421207	docs: add hybrid retrieval how-to guide (#21613 ) Updating v0.2 docs with https://github.com/langchain-ai/langchain/pull/21245	2024-05-13 14:03:55 -04:00