This pull request aims to ensure that the `OpenAICallbackHandler` can
properly calculate the total cost for Azure OpenAI chat models. The
following changes have resolved this issue:
- The `model_name` has been added to the ChatResult `llm_output`. Without
this, the default `gpt-35-turbo` values were applied, causing the total
cost for Azure OpenAI's GPT-4 to be significantly inaccurate.
- A new parameter `model_version` has been added to `AzureChatOpenAI`.
Azure does not include the model version in the response. With the
addition of `model_name`, this is not a significant issue for GPT-4
models, but it's an issue for GPT-3.5-Turbo. Version 0301 (default) of
GPT-3.5-Turbo on Azure has a flat rate of 0.002 per 1k tokens for both
prompt and completion. However, version 0613 introduced a split in
pricing for prompt and completion tokens.
- The `OpenAICallbackHandler` implementation has been updated with the
proper model names, versions, and cost per 1k tokens.
Unit tests have been added to ensure the functionality works as
expected; the Azure ChatOpenAI notebook has been updated with examples.
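For illustration, a minimal sketch of how this might look (the deployment name, version value, and remaining Azure settings are placeholders for your own resources):
```python
from langchain.chat_models import AzureChatOpenAI
from langchain.callbacks import get_openai_callback

# A sketch assuming the new `model_version` parameter; other Azure
# connection settings are expected via environment variables.
llm = AzureChatOpenAI(
    deployment_name="gpt-35-turbo",
    model_version="0613",  # enables the split prompt/completion pricing
)
with get_openai_callback() as cb:
    llm.predict("Hello!")
    print(cb.total_cost)
```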
Maintainers: @hwchase17, @baskaryan
Twitter handle: @jjczopek
---------
Co-authored-by: Jerzy Czopek <jerzy.czopek@avanade.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
---------
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: Instructions for integration with Log10: an [open
source](https://github.com/log10-io/log10) proxiless LLM data management
and application development platform that lets you log, debug and tag
your Langchain calls
- Tag maintainer: @baskaryan
- Twitter handle: @log10io @coffeephoenix
Several examples showing the integration included
[here](https://github.com/log10-io/log10/tree/main/examples/logging) and
in the PR
Description: Adds Rockset as a chat history store
Dependencies: no changes
Tag maintainer: @hwchase17
This PR passes linting and testing.
I added a test for the integration and an example notebook showing its
use.
This PR adds 8 new loaders:
* `AirbyteCDKLoader`: This reader can wrap and run all Python-based
Airbyte source connectors.
* Separate loaders for the most commonly used APIs:
* `AirbyteGongLoader`
* `AirbyteHubspotLoader`
* `AirbyteSalesforceLoader`
* `AirbyteShopifyLoader`
* `AirbyteStripeLoader`
* `AirbyteTypeformLoader`
* `AirbyteZendeskSupportLoader`
## Documentation and getting started
I added the basic shape of the config to the notebooks. This increases
the maintenance effort a bit, but I think it's worth it to make sure
people can get started quickly with these important connectors. This is
also why I linked the spec and the documentation page in the readme as
these two contain all the information to configure a source correctly
(e.g. it won't suggest using oauth if that's avoidable even if the
connector supports it).
## Document generation
The "documents" produced by these loaders won't have a text part
(instead, all the record fields are put into the metadata). If a text is
required by the use case, the caller needs to do custom transformation
suitable for their use case.
## Incremental sync
All loaders support incremental syncs if the underlying streams support
it. By storing the `last_state` from the reader instance away and
passing it in when loading, it will only load updated records.
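A rough sketch of that incremental flow (the config keys and the `state` argument name are assumptions; check the linked spec for the real source config):
```python
from langchain.document_loaders.airbyte import AirbyteStripeLoader

# Placeholder config per the Stripe source spec
config = {"client_secret": "<stripe-key>", "start_date": "2023-01-01T00:00:00Z"}
loader = AirbyteStripeLoader(config=config, stream_name="invoices")
docs = loader.load()
state = loader.last_state  # persist this between runs

# Later run: pass the saved state so only updated records are loaded
incremental = AirbyteStripeLoader(
    config=config, stream_name="invoices", state=state
)
new_docs = incremental.load()
```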
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR defines an abstract interface for key value stores.
It provides 2 implementations:
1. Local File System
2. In memory -- used to facilitate testing
It also provides an encoder utility to help take care of serialization
from arbitrary data to data that can be stored by the given store.
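A minimal sketch of the interface using the in-memory implementation (the import path and method names follow the proposed API and are assumptions):
```python
from langchain.storage import InMemoryStore

store = InMemoryStore()
store.mset([("k1", b"v1"), ("k2", b"v2")])  # batch write
values = store.mget(["k1", "k2"])           # [b'v1', b'v2']
store.mdelete(["k1"])
keys = list(store.yield_keys())             # remaining keys
```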
Proposal for an internal API to deprecate LangChain code.
This PR is heavily based on:
https://github.com/matplotlib/matplotlib/blob/main/lib/matplotlib/_api/deprecation.py
This PR only includes deprecation functionality (no renaming etc.).
Additional functionality can be added on an as-needed basis (e.g.,
renaming parameters), but it's best to roll this out as an MVP to test
it out.
DeprecationWarnings are ignored by default. We can change the policy for
the deprecation warnings, but we'll need to make sure we're not creating
noise for users due to internal code invoking deprecated functionality.
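A sketch of the intended usage (the import path is internal and an assumption):
```python
from langchain._api import deprecated  # internal module; path may differ

@deprecated(since="0.0.250", alternative="shiny_new_function")
def old_function() -> None:
    """Do the thing the old way."""
```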
- Description: consistent timeout at 60s for all calls to Vectara API
- Tag maintainer: @rlancemartin, @eyurtsev
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: Improved query of BGE embeddings after talking with the
devs of BGE embeddings,
- Tag maintainer: @hwchase17,
- Twitter handle: @ManabChetia3
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- Description: added filter to query methods in VectorStoreIndexWrapper
for filtering by metadata (i.e. search_kwargs)
- Tag maintainer: @rlancemartin, @eyurtsev
Updated the doc snippet on this topic as well. It took me a long while
to figure out how to filter the vectorstore by filename, so this might
help someone else out.
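A sketch of the filtered query (the `retriever_kwargs` argument name follows this PR's description and is an assumption; `loader` is a placeholder):
```python
from langchain.indexes import VectorstoreIndexCreator

# `loader` is a hypothetical document loader
index = VectorstoreIndexCreator().from_loaders([loader])
answer = index.query(
    "What does the report conclude?",
    retriever_kwargs={"search_kwargs": {"filter": {"source": "report.pdf"}}},
)
```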
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: I have added an example showing how to pass a custom
template to ConversationalRetrievalChain. Instead of
CONDENSE_QUESTION_PROMPT we can pass any prompt in the argument
condense_question_prompt (see the sketch after this list). Look in Use
cases -> QA over Documents -> How to -> Store and reference chat history,
- Issue: #8864,
- Dependencies: NA,
- Tag maintainer: @hinthornw,
- Twitter handle:
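The sketch mentioned above (assuming an existing `llm` and `vectorstore`):
```python
from langchain.chains import ConversationalRetrievalChain
from langchain.prompts import PromptTemplate

# llm and vectorstore are assumed to exist already
condense_prompt = PromptTemplate.from_template(
    "Given the following conversation:\n{chat_history}\n"
    "Rephrase the follow up question as a standalone question: {question}"
)
qa = ConversationalRetrievalChain.from_llm(
    llm,
    retriever=vectorstore.as_retriever(),
    condense_question_prompt=condense_prompt,
)
```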
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This addresses some issues with introducing the Nebula LLM to LangChain
in this PR:
https://github.com/langchain-ai/langchain/pull/8876
This fixes the following:
- Removes `SYMBLAI` from variable names
- Fixes bug with `Bearer` for the API KEY
Thanks again in advance for your help!
cc: @hwchase17, @baskaryan
---------
Co-authored-by: dvonthenen <david.vonthenen@gmail.com>
### Description
Now, we can pass information like a JWT token using user_context:
```python
self.retriever = AmazonKendraRetriever(index_id=kendraIndexId, user_context={"Token": jwt_token})
```
- [x] `make lint`
- [x] `make format`
- [x] `make test`
Also tested by pip installing in my own project, and it allows access
through the token.
### Maintainers
@rlancemartin, @eyurtsev
### My twitter handle
[girlknowstech](https://twitter.com/girlknowstech)
Minor doc fix to awslambda tool notebook.
Add missing import for initialize_agent to awslambda agent example
Co-authored-by: Josh Hart <josharj@amazon.com>
- Description: The API doc passed to the LLM only included the content of
responses but did not include the content of requestBody, causing the
agent to be unable to construct the correct request parameters based on
the requestBody information. Adding two lines of code fixed the bug,
- Tag maintainer: @hinthornw
Description:
Fixed inaccurate import in integrations:providers:bedrock documentation
In the current version of the bedrock documentation, the page
https://python.langchain.com/docs/integrations/providers/bedrock
states that the import is `from langchain import Bedrock`.
This has been changed to `from langchain.llms.bedrock import Bedrock`,
as stated in https://python.langchain.com/docs/integrations/llms/bedrock
Issue: Not applicable
Dependencies: No dependencies required
Tag maintainer: @baskaryan
Twitter handle: Not applicable
Adds Ollama as an LLM. Ollama can run various open-source models locally,
e.g. Llama 2 and Vicuna, automatically configuring and GPU-optimizing
them.
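A minimal sketch (assumes a local Ollama server is running and the model has already been pulled):
```python
from langchain.llms import Ollama

llm = Ollama(model="llama2")
print(llm("Why is the sky blue?"))
```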
@rlancemartin @hwchase17
---------
Co-authored-by: Lance Martin <lance@langchain.dev>
## Description
I am excited to propose an integration with USearch, a lightweight
vector-search engine available for both Python and JavaScript, among
other languages.
## Dependencies
It introduces a new PyPi dependency - `usearch`. I am unsure if it must
be added to the Poetry file, as this would make the PR too clunky.
Please let me know.
## Profiles
- Maintainers: @ashvardanian @davvard
- Twitter handles: @ashvardanian @unum_cloud
---------
Co-authored-by: Davit Vardanyan <78792753+davvard@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- fix install command
- change example notebook to use Metaphor autoprompt by default
Update to #8528
Newlines and other special characters within markdown code blocks
returned as `action_input` should be handled correctly (in particular,
unescaped `"` => `\"` and `\n` => `\\n`) so they don't break JSON
parsing.
@baskaryan
When downloading a sitemap with a malformed URL (e.g.
"ttp://example.com/index.html", with the "h" omitted at the beginning of
the URL), this ensures that the sitemap download does not crash but
just emits a warning. (Maybe this should be optional with e.g. a
`skip_faulty_urls: bool = True` parameter, but this was the most
straightforward fix.)
@rlancemartin, @eyurtsev
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Added async parsing functions for RetryOutputParser,
RetryWithErrorOutputParser and OutputFixingParser (see the sketch below).
The async parse functions call the arun methods of the underlying LLMChains.
Fix for #7989
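A sketch of the async path (the `aparse_with_prompt` name mirrors the sync `parse_with_prompt` and is an assumption; `base_parser`, `llm`, `bad_output`, and `prompt_value` stand in for real objects):
```python
from langchain.output_parsers import RetryOutputParser

async def fix(bad_output, prompt_value, base_parser, llm):
    # async counterpart of parse_with_prompt, per this PR
    retry_parser = RetryOutputParser.from_llm(parser=base_parser, llm=llm)
    return await retry_parser.aparse_with_prompt(bad_output, prompt_value)
```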
---------
Co-authored-by: Benjamin May <benjamin.may94@gmail.com>
- Description: Adds the ChatAnyscale class with llama-2 7b, llama-2 13b,
and llama-2 70b on [Anyscale
Endpoints](https://app.endpoints.anyscale.com/)
- It inherits from ChatOpenAI and requires openai (probably unnecessary
but it made for a quick and easy implementation)
- Inspired by https://github.com/langchain-ai/langchain/pull/8434
(@kylehh and @baskaryan )
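A minimal usage sketch (assumes `ANYSCALE_API_KEY` is set in the environment):
```python
from langchain.chat_models import ChatAnyscale
from langchain.schema import HumanMessage

chat = ChatAnyscale(model_name="meta-llama/Llama-2-70b-chat-hf")
print(chat([HumanMessage(content="Say hi!")]).content)
```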
## Description
This PR adds Nebula to the available LLMs in LangChain.
Nebula is an LLM focused on conversation understanding and enables users
to extract conversation insights from video, audio, text, and chat-based
conversations. These conversations can occur between any mix of human or
AI participants.
Examples of some questions you could ask Nebula from a given
conversation are:
- What could be the customer’s pain points based on the conversation?
- What sales opportunities can be identified from this conversation?
- What best practices can be derived from this conversation for future
customer interactions?
You can read more about Nebula here:
https://symbl.ai/blog/extract-insights-symbl-ai-generative-ai-recall-ai-meetings/
#### Integration Test
An integration test is added, but it requires network access: since
Nebula is fully managed like OpenAI, network access is required to
exercise it.
#### Linting
- [x] make lint
- [x] make test (TODO: there seems to be a failure in another,
unrelated test??? Need to check on this.)
- [x] make format
### Dependencies
No new dependencies were introduced.
### Twitter handle
[@symbldotai](https://twitter.com/symbldotai)
[@dvonthenen](https://twitter.com/dvonthenen)
If you have any questions, please let me know.
cc: @hwchase17, @baskaryan
---------
Co-authored-by: dvonthenen <david.vonthenen@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
# What
- fix evaluation parse test
- Description: Fix evaluation parse test
- Issue: None
- Dependencies: None
- Tag maintainer: @baskaryan
- Twitter handle: @MLOpsJ
- Description: Fix/abstract add message
- Issue: None
- Dependencies: None
- Twitter handle: @MLOpsJ
Long-term, it would be better to use the lower-level batch() method(s),
but it may take me a bit longer to clean that up. This unblocks in the
meantime, though it may fail when the evaluated chain raises a
`NotImplementedError` for a corresponding async method.
This adds support for [Xata](https://xata.io) (data platform based on
Postgres) as a vector store. We have recently added [Xata to
Langchain.js](https://github.com/hwchase17/langchainjs/pull/2125) and
would love to have the equivalent in the Python project as well.
The PR includes integration tests and a Jupyter notebook as docs. Please
let me know if anything else would be needed or helpful.
I have added the xata python SDK as an optional dependency.
## To run the integration tests
You will need to create a DB in xata (see the docs), then run something
like:
```
OPENAI_API_KEY=sk-... XATA_API_KEY=xau_... XATA_DB_URL='https://....xata.sh/db/langchain' poetry run pytest tests/integration_tests/vectorstores/test_xata.py
```
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Philip Krauss <35487337+philkra@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
#7469
Since 1.29.0, the Vertex SDK supports a chat history provided to a codey
chat model.
Co-authored-by: Leonid Kuligin <kuligin@google.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Hello langchain maintainers,
this PR aims at integrating
[vllm](https://vllm.readthedocs.io/en/latest/#) into langchain. This PR
closes #8729.
This feature clearly depends on `vllm`, but I've seen other models
supported here depend on packages that are not included in the
pyproject.toml (e.g. `gpt4all`, `text-generation`), so I thought it was
the case for this one as well.
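A minimal sketch of the new LLM (the model name is just an example; weights are downloaded from the Hugging Face hub on first use):
```python
from langchain.llms import VLLM

llm = VLLM(model="mosaicml/mpt-7b", max_new_tokens=128, temperature=0.8)
print(llm("What is the capital of France?"))
```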
@hwchase17, @baskaryan
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
@hwchase17, @baskaryan
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
- Updated to use the newer, better function interaction
- The previous version had only one callback
- @hinthornw @hwchase17 Can you look into this?
- Shout out to @MultiON_AI @DivGarg9 on twitter
---------
Co-authored-by: Naman Garg <ngarg3@binghamton.edu>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Description: The lines I have changed look like they are incorrectly
escaped for regex. In Python 3.11, I receive a DeprecationWarning for
these lines. You don't see any warnings unless you explicitly run Python
with the `-W always::DeprecationWarning` flag. So, this is my attempt to
fix it.
Here are the warnings from log files:
```
/usr/local/lib/python3.11/site-packages/langchain/text_splitter.py:919: DeprecationWarning: invalid escape sequence '\s'
/usr/local/lib/python3.11/site-packages/langchain/text_splitter.py:918: DeprecationWarning: invalid escape sequence '\s'
/usr/local/lib/python3.11/site-packages/langchain/text_splitter.py:917: DeprecationWarning: invalid escape sequence '\s'
/usr/local/lib/python3.11/site-packages/langchain/text_splitter.py:916: DeprecationWarning: invalid escape sequence '\c'
/usr/local/lib/python3.11/site-packages/langchain/text_splitter.py:903: DeprecationWarning: invalid escape sequence '\*'
/usr/local/lib/python3.11/site-packages/langchain/text_splitter.py:804: DeprecationWarning: invalid escape sequence '\*'
/usr/local/lib/python3.11/site-packages/langchain/text_splitter.py:804: DeprecationWarning: invalid escape sequence '\*'
```
cc @baskaryan
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Description: This PR improves recursive_url_loader with options such as
limiting the crawl depth and customizable extractors (from the raw
webpage to the text of the Document object), so that users can use other
tools to extract the webpage (see the sketch below). This PR also
includes documentation and tests for the new loader.
Old PR closed due to project structure change. #7756
Because socket requests are not allowed, the old unit test was removed.
Issue: N/A
Dependencies: asyncio, aiohttp
Tag maintainer: @rlancemartin
Twitter handle: @ Zend_Nihility
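The sketch mentioned above (the URL and depth are placeholders):
```python
from bs4 import BeautifulSoup
from langchain.document_loaders import RecursiveUrlLoader

# Bounded crawl depth plus a custom extractor from raw HTML to text
loader = RecursiveUrlLoader(
    url="https://docs.python.org/3.9/",
    max_depth=2,
    extractor=lambda html: BeautifulSoup(html, "html.parser").text,
)
docs = loader.load()
```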
---------
Co-authored-by: Lance Martin <lance@langchain.dev>
- Description: docstore had two main methods, add and search; however,
dealing with docstore sometimes requires deleting an entry. So I have
added a simple delete method that deletes items from docstore.
Additionally, I have added the delete method to the FAISS vectorstore
for the very same reason.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- Balancing prioritization between keyword / AI search
- Show snippets of highlighted keywords when searching
- Improved keyword search
- Fixed bugs and issues
Shoutout to @calebpeffer for implementing and gathering feedback on it
cc: @dev2049 @rlancemartin @hwchase17
begining -> beginning
Fix Issue #7616 with a simpler approach to extract function names (use
`__name__` attribute)
@hwchase17
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Fixes for #8786 @agola11
- Description: The flow of callbacks was breaking before reaching the
last chain, as callbacks were missed between chains along nested paths.
This will help get a full trace and correlate parent-child relationships
in all nested chains.
- Issue: the issue #8786
- Dependencies: NA
- Tag maintainer: @agola11
- Twitter handle: Agarwal_Ankur
Description: When using a ReAct agent with tools and no matching tool is
found, InvalidTool gets called. Previously it just asked for a different
action, but I've found that listing the available actions improves the
chances of getting a valid action in the next round. I've added a unit
test for it as well.
@hinthornw
# What
- Add missing test for retrievers self_query
- Add missing import validation
- Description: Add missing test for retrievers self_query
- Issue: None
- Dependencies: None
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @MlopsJ
- Description: Two links were not working on the Question Answering Use
Cases documentation page, so I changed them to the nearest useful links,
- Issue: NA,
- Dependencies: NA,
- Tag maintainer: @baskaryan,
- Twitter handle: NA
- Description: we expose Kendra result item id and document id as
document metadata.
- Tag maintainer: @3coins @baskaryan
- Twitter handle: wilsonleao
**Why**
The result item id and document id might be used to keep track of the
retrieved resources.
Refactor for the extraction use case documentation
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
Added a couple of "integration tests" for these that I ran.
Main design point for feedback: at this point, would it just be better to
have separate arguments for each type? It's a little confusing what is or
isn't supported and what the intended usage is at this point, since I try
to wrap the function as a runnable or pack/unpack chains/LLMs.
```
run_on_dataset(
    ...
    llm_or_chain_factory=None,
    llm=None,
    chain=None,
    runnable=None,
    function=None,
):
    # raise error if none set
```
The downside with runnables and arbitrary function support is that you
get much less helpful validation and error messages, but I don't think we
should block you from this, at least.
* Documentation to favor creation without declaring input_variables
* Cut out obvious examples, but add more description in a few places
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Update API reference documentation. This PR will pick up a number of missing classes; it also applies selective formatting based on the class / object type.
Resolves an occasional JSON parsing error when some predictions are
passed through a `MultiPromptChain`.
Makes [this
modification](https://github.com/langchain-ai/langchain/issues/5163#issuecomment-1652220401)
to `multi_prompt_prompt.py`, which is much cleaner than appending an
entire example object (another community-reported solution).
@hwchase17, @baskaryan
cc: @SimasJan
- Description: Added a missing word and rearranged a sentence in the
documentation of Self Query Retrievers,
- Issue: NA,
- Dependencies: NA,
- Tag maintainer: @baskaryan,
- Twitter handle: NA
Thanks for your time.
llamacpp params (per their own code) are unstable, so instead of
adding/deleting them constantly, this adds a model_kwargs parameter that
allows for arbitrary additional kwargs
cc @jsjolund and @zacps re #8599 and #8704
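A sketch of the new escape hatch (the model path and kwargs key are placeholders; anything llama.cpp accepts can be forwarded):
```python
from langchain.llms import LlamaCpp

llm = LlamaCpp(
    model_path="/path/to/model.bin",
    model_kwargs={"rope_freq_scale": 0.5},  # placeholder llama.cpp option
)
```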
There is already a `loads()` function which takes a JSON string and
loads it using the Reviver.
But in the callbacks system, there is a `serialized` object that is
passed in, and that object is already a deserialized JSON-compatible
object. This allows you to call `load(serialized)` and bypass
intermediate JSON encoding.
I found one other place in the code that benefited from this
short-circuiting (string_run_evaluator.py), so I fixed that too.
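A minimal sketch of the two paths (`json_string` and `serialized` stand in for real inputs; the import path is an assumption):
```python
from langchain.load.load import load, loads  # module path may differ

revived_a = loads(json_string)  # existing path: JSON string -> object
revived_b = load(serialized)    # new path: already-parsed dict -> object
```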
Tagging @baskaryan for general/utility stuff.
---------
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Description: Add ScaNN vectorstore to langchain.
ScaNN is an open source, high-performance vector similarity library
optimized for AVX2-enabled CPUs.
https://github.com/google-research/google-research/tree/master/scann
- Dependencies: scann
Python notebook to illustrate the usage:
docs/extras/integrations/vectorstores/scann.ipynb
Integration test:
libs/langchain/tests/integration_tests/vectorstores/test_scann.py
@rlancemartin, @eyurtsev for review.
Thanks!
This PR updates _load_reduce_documents_chain to handle
`reduce_documents_chain` and `combine_documents_chain` config.
Please review @hwchase17, @baskaryan
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
# What
- This is to add a filter option to the sklearn vector store functions
- Description: Add filter to sklearn vector store functions.
- Issue: None
- Dependencies: None
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @MlopsJ
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
This is to add save_local and load_local to tfidf_vectorizer and docs in
tfidf_retriever to make the vectorizer reusable.
- Description: add save_local and load_local to tfidf_vectorizer and
docs in tfidf_retriever
- Issue: None
- Dependencies: None
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @MlopsJ
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Removing the score threshold parameter of FAISS's
`_similarity_search_with_relevance_scores`, as the thresholding is
implemented in the `similarity_search_with_relevance_scores` method,
which calls this method.
As this method is supposed to be a private method of faiss.py, it will
never receive the score threshold parameter, since it is popped in the
super method `similarity_search_with_relevance_scores`.
@baskaryan @hwchase17
Just a tiny change to use `list.append(...)` and `list.extend(...)`
instead of `list += [...]` so that no unnecessary temporary lists are
created.
Since it's a tiny miscellaneous thing, I guess @baskaryan is the
maintainer to tag?
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Simple retriever that applies an LLM between the user input and the
query passed to the retriever.
It can be used to pre-process the user input in any way.
The default prompt:
```
DEFAULT_QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an assistant tasked with taking a natural language query from a user
and converting it into a query for a vectorstore. In this process, you strip out
information that is not relevant for the retrieval task. Here is the user query: {question} """,
)
```
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- Description:
- Provides a new attribute in the AmazonKendraRetriever which processes
a ResultItem and returns a string that will be used as page_content;
- The excerpt metadata should not be changed; it will be kept as it was
retrieved. But it is cleaned when composing the page_content;
- Refactors the AmazonKendraRetriever to improve code reusability;
- Issue: #7787
- Tag maintainer: @3coins @baskaryan
- Twitter handle: wilsonleao
**Why?**
Some use cases need to adjust the page_content by dynamically combining
the ResultItem attributes depending on the context of the item.
#7854
Added the ability to use the `separator` as a regex or a simple
character.
Fixed a bug where `start_index` was incorrectly counting from -1.
Who can review?
@eyurtsev
@hwchase17
@mmz-001
When using AzureChatOpenAI, the openai_api_type defaults to "azure". The
utils' get_from_dict_or_env() function triggered by the root validator
does not look for user-provided values in the OPENAI_API_TYPE
environment variable, so other values like "azure_ad" are replaced with
"azure". This does not allow the use of token-based auth.
By removing the "default" value, this allows environment variables to be
pulled at runtime for the openai_api_type and thus enables the other
api_types which are expected to work.
This fixes #6650
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: updates to Vectara documentation with more details on how
to get started.
- Issue: NA
- Dependencies: NA
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @vectara, @ofermend
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This lets you pass callbacks when you create the summarize chain:
```
summarize = load_summarize_chain(llm, chain_type="map_reduce", callbacks=[my_callbacks])
summary = summarize(documents)
```
See #5572 for a similar surgical fix.
tagging @hwchase17 for callbacks work
This is another case, similar to #5572 and #7565 where the callbacks are
getting dropped during construction of the chains.
tagging @hwchase17 and @agola11 for callbacks propagation
Description: I have added two methods, a serializer and a deserializer.
There was a method called save_local, but it saves to the local disk. I
wanted the vectorstore in a format I could push to a SQL database's blob
field. I have used this while I was working on something.
@rlancemartin, @eyurtsev
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
It fails currently because the event loop is already running.
The `retry` decorator already infers an `AsyncRetrying` handler for
coroutines (see [tenacity
line](aa6f8f0a24/tenacity/__init__.py (L535))).
However, before_sleep always gets called synchronously (see [tenacity
line](aa6f8f0a24/tenacity/__init__.py (L338))).
Instead, check for a running loop and use it if one exists. Of course,
this runs an async method synchronously, which is not _nice_. Given
how important LLMs are, it may make sense to have a task list or
something, but I'd want to chat with @nfcampos on where that would live.
This PR also fixes the unit tests to check that the handler is called and
to make sure the async test is run (it looks like it had just been
skipped). It would have failed prior to the proposed fixes but passes
now.
- Description: added a document loader for a list of RSS feeds or OPML.
It iterates through the list and uses NewsURLLoader to load each
article.
- Issue: N/A
- Dependencies: feedparser, listparser
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @ruze
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Solves #8644
This embedding model outputs identical random embedding vectors when the
input texts are identical.
Useful in unit tests.
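A sketch of the intended unit-test usage (the class name and `size` parameter follow this PR and are assumptions):
```python
from langchain.embeddings import DeterministicFakeEmbedding  # name may differ

emb = DeterministicFakeEmbedding(size=256)
assert emb.embed_query("hello") == emb.embed_query("hello")  # deterministic
assert emb.embed_query("hello") != emb.embed_query("world")
```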
@baskaryan
### Description
Fixes a grammar issue I noticed when reading through the documentation.
### Maintainers
@baskaryan
Co-authored-by: mmillerick <mmillerick@blend.com>
## Description:
1) The map reduce example in the docs is missing an important import
statement. Figured other people would benefit from being able to copy 🍝
the code.
2) The RefineDocumentsChain example is also broken.
## Issue:
None
## Dependencies:
None. One liner.
## Tag maintainer:
@baskaryan
## Twitter handle:
I mean, it's a one line fix lol. But @will_thompson_k is my twitter
handle.
This small PR introduces new parameters into Qdrant (`on_disk`), fixes
some tests, and makes the error message clearer.
Tagging: @baskaryan, @rlancemartin, @eyurtsev
- Description: run the poetry dependencies
- Issue: #7329
- Tag maintainer: @rlancemartin
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Description
OpenSearch supports authentication using both master credentials
(username and password) and IAM. With master credentials, users will not
pass the `service` argument in `http_auth`, and the existing code will
break. To fix this, I have updated the condition to check whether the
`service` attribute is present in `http_auth` before accessing it.
### Maintainers
@baskaryan @navneet1v
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Description - Integrates Fireworks within Langchain LLMs to allow users
to use Fireworks models with Langchain, mainly for summarization.
Issue - Not applicable
Dependencies - None
Tag maintainer - @rlancemartin
---------
Co-authored-by: Raj Janardhan <rajjanardhan@Rajs-Laptop.attlocal.net>
The existing implementation requires that you install the
`firebase-admin` package, and prevents you from using an existing
Firestore client instance if available.
This adds optional `firestore_client` param to
`FirestoreChatMessageHistory`, so users can just use their existing
client/settings. If not passed, existing logic executes to initialize a
`firestore_client`.
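A sketch of the new option (the parameter name follows this PR):
```python
import firebase_admin
from firebase_admin import firestore
from langchain.memory import FirestoreChatMessageHistory

firebase_admin.initialize_app()
client = firestore.client()
history = FirestoreChatMessageHistory(
    collection_name="chat_history",
    session_id="session-1",
    user_id="user-1",
    firestore_client=client,  # new optional param per this PR
)
```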
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Add a StreamlitChatMessageHistory class that stores chat messages in
[Streamlit's Session
State](https://docs.streamlit.io/library/api-reference/session-state).
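A minimal sketch inside a Streamlit app (the key name is arbitrary):
```python
import streamlit as st
from langchain.memory import StreamlitChatMessageHistory

# Messages survive reruns via st.session_state under the given key
history = StreamlitChatMessageHistory(key="chat_messages")
if not history.messages:
    history.add_ai_message("How can I help you?")
for msg in history.messages:
    st.write(f"{msg.type}: {msg.content}")
```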
Note: The integration test uses a currently-experimental Streamlit
testing framework to simulate the execution of a Streamlit app. Marking
this PR as draft until I confirm with the Streamlit team that we're
comfortable supporting it.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Summary
Updates the `unstructured` install instructions. For
`unstructured>=0.9.0`, dependencies are broken out by document type and
the base `unstructured` package includes fewer dependencies. `pip
install "unstructured[local-inference]"` has been replaced by `pip
install "unstructured[all-docs]"`, though the `local-inference` extra is
still supported for the time being.
### Reviewers
- @rlancemartin
- @eyurtsev
- @hwchase17
- Description: added memgraph_graph.py, which defines the MemgraphGraph
class, subclassing the existing Neo4jGraph class. This lets you
query the Memgraph graph database using natural language. It leverages
the Neo4j drivers and the Bolt protocol.
- Dependencies: since it is a subclass of Neo4jGraph, it depends on it
and the GraphCypherQA Chain implementations. It depends on the Neo4j
drivers being present, and on having a running Memgraph instance to
connect to.
- Tag maintainer: @baskaryan
- Twitter handle: @villageideate
- example usage can be seen in this repo
https://github.com/brettdbrewer/MemgraphGraph/
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Description
This PR implements a callback handler for SageMaker Experiments which is
similar to that of mlflow.
* When creating the callback handler, it takes the experiment's run
object as an argument. All the callback outputs are then logged to the
run object.
* The output of each callback action (e.g., `on_llm_start`) is saved to
S3 bucket as json file.
* Optionally, you can also log additional information such as the LLM
hyper-parameters to the same run object.
* Once the callback object is no longer needed, you will need to call the
`flush_tracker()` method. This makes sure that any intermediate files
are deleted.
* A separate notebook example is provided to show how the callback is
used.
@3coins @agola11
---------
Co-authored-by: Tesfagabir Meharizghi <mehariz@amazon.com>
Description: Made Chroma constructor more robust when client_settings is
provided. Otherwise, existing embeddings will not be loaded correctly
from Chroma.
Issue: #7804
Dependencies: None
Tag maintainer: @rlancemartin, @eyurtsev
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Description:
This PR adds support for loading documents from Huawei OBS (Object
Storage Service) in Langchain. OBS is a cloud-based object storage
service provided by Huawei Cloud. With this enhancement, Langchain users
can now easily access and load documents stored in Huawei OBS directly
into the system.
Key Changes:
- Added a new document loader module specifically for Huawei OBS
integration.
- Implemented the necessary logic to authenticate and connect to Huawei
OBS using access credentials.
- Enabled the loading of individual documents from a specified bucket
and object key in Huawei OBS.
- Provided the option to specify custom authentication information or
obtain security tokens from Huawei Cloud ECS for easy access.
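A rough sketch of the intended usage (the loader name, endpoint, and config keys are assumptions; see the steps below for the actual setup):
```python
from langchain.document_loaders import OBSFileLoader  # name may differ

loader = OBSFileLoader(
    bucket="my-bucket",
    key="docs/report.txt",
    endpoint="https://obs.ap-southeast-1.myhuaweicloud.com",
    config={"ak": "<access-key>", "sk": "<secret-key>"},  # placeholders
)
docs = loader.load()
```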
How to Test:
1. Ensure the required package "esdk-obs-python" is installed.
2. Configure the endpoint, access key, secret key, and bucket details
for Huawei OBS in the Langchain settings.
3. Load documents from Huawei OBS using the updated document loader
module.
4. Verify that documents are successfully retrieved and loaded into
Langchain for further processing.
Please review this PR and let us know if any further improvements are
needed. Your feedback is highly appreciated!
@rlancemartin, @eyurtsev
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- allow overriding run_type in on_chain_start
From my understanding, the `check_repeated_memory_variable` validator
will raise an error if any of the variables in the `memories` list are
repeated. However, the `load_memory_variables` method does not check for
repeated variables, which means a `CombinedMemory` instance can return a
dictionary of memory variables that contains duplicates. This code
checks for repeated variables in the `data` dictionary returned by the
`load_memory_variables` method of each sub-memory and raises an error if
a repeated variable is found.
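A minimal, illustrative sketch of that check (not the exact vendored implementation):
```python
# Illustrative sketch of the duplicate-variable check described above.
def load_memory_variables(self, inputs: dict) -> dict:
    memory_data: dict = {}
    for memory in self.memories:
        data = memory.load_memory_variables(inputs)
        for key, value in data.items():
            if key in memory_data:
                raise ValueError(
                    f"The variable {key} is repeated in the CombinedMemory."
                )
            memory_data[key] = value
    return memory_data
```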
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- Description: Adds an optional buffer arg to the memory's
from_messages() method. If provided, the existing memory will be loaded
instead of regenerating a summary from the loaded messages.
Why? If we have past messages to load from, it is likely we also have an
existing summary. This is particularly helpful in cases where the chat
is ephemeral and/or backed by a serverless setup, where the chat history
is not stored but the updated chat history is passed back and forth
between a backend and frontend.
E.g., take a stateless QA backend implementation that loads messages on
every request and generates a response. Without this addition, each
time the messages are loaded via from_messages, the summaries are
recomputed even though they may have just been computed during the
previous response. With this, the previously computed summary can be
passed in, avoiding:
1) spending extra $$$ on tokens, and
2) increased response time from regenerating a previously generated
summary.
Tag maintainer: @hwchase17
Twitter handle: https://twitter.com/ShantanuNair
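A hedged usage sketch of the new argument; passing `buffer` through `from_messages` is the addition described above, while the rest follows standard LangChain usage:
```python
# Hedged sketch: rehydrate ConversationSummaryMemory from stored messages
# while reusing a previously computed summary via the new `buffer` arg.
from langchain.chat_models import ChatOpenAI
from langchain.memory import ChatMessageHistory, ConversationSummaryMemory

history = ChatMessageHistory()
history.add_user_message("hi")
history.add_ai_message("hello there!")

memory = ConversationSummaryMemory.from_messages(
    llm=ChatOpenAI(temperature=0),
    chat_memory=history,
    buffer="The human greets the AI, and the AI responds warmly.",
)
```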
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- Description: updated BabyAGI examples to append the iteration to the
result id, fixing an error when storing data to the vectorstore.
- Issue: #7445
- Dependencies: no
- Tag maintainer: @eyurtsev
This fix worked for me locally. Happy to take some feedback and iterate
on a better solution. I was considering appending a uuid instead but
didn't want to overcomplicate the example.
…call, it needs retry
Co-authored-by: yangdihang <yangdihang@bytedance.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Works just like the GenericLoader but concurrently for those who choose
to optimize their workflow.
@rlancemartin @eyurtsev
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Description: Using Azure Cognitive Search as a VectorStore, calling the
`add_texts` method throws an error if no metadata property is specified.
The `additional_fields` field is set inside an `if` statement and then
used later outside of it. This PR moves the declaration of
`additional_fields` so that it is always defined where it is used.
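An illustrative pattern of the fix (hypothetical variable names, not the vendored code):
```python
# Declare additional_fields unconditionally so the later usage is always
# safe, even when no metadata property is specified.
metadata = None  # may or may not be provided by the caller

additional_fields = {}
if metadata is not None:
    additional_fields = dict(metadata)

print(additional_fields)  # always defined; empty when no metadata given
```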
Issue: https://github.com/langchain-ai/langchain/issues/8544
Tagging @rlancemartin, @eyurtsev as this is related to Vector stores.
`make format`, `make lint`, `make spellcheck`, and `make test` have been
run
- Description: Follow up of #8478
- Issue: #8477
- Dependencies: None
- Tag maintainer: @baskaryan
- Twitter handle: [@BharatR123](twitter.com/BharatR123)
The links were still broken after #8478 and sadly the issue was not
caught by either the Vercel app build or `make docs_linkcheck`.
- Description: This pull request (PR) includes two minor changes:
1. Updated the default prompt for the SQL Query Checker: the current
prompt does not clearly specify the final response the LLM (Language
Model) should provide when checking the query if `use_query_checker` is
enabled in SQLDatabaseChain. As a result, the LLM adds extra words like
"Here is your updated query" to the response, which causes a syntax
error when executing the SQL command in SQLDatabaseChain, since these
additional words end up included in the SQL query.
2. Moved the query's execution into a separate method on SQLDatabase:
this change gives users more flexibility to obtain the result of an SQL
query in the original form returned by sqlalchemy. In the previous
implementation, the run method returned the results as a string. With a
distinct method for execution, users can now receive the results in
their original format, which is helpful in various scenarios; for
example, while developing a tool, I found it advantageous to obtain
results in the original format rather than the string currently
returned by the run method (see the hedged sketch below).
- Tag maintainer: @hinthornw
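A hedged sketch of the resulting split; the new execution method's name and signature are assumptions based on this description, and the database path and table are illustrative:
```python
# Hedged sketch: run() keeps returning a formatted string, while a
# separate execution method (assumed here to be `_execute`) returns rows
# in sqlalchemy's original form.
from langchain.utilities import SQLDatabase

db = SQLDatabase.from_uri("sqlite:///example.db")
as_string = db.run("SELECT id, name FROM users LIMIT 3")
raw_rows = db._execute("SELECT id, name FROM users LIMIT 3")
```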
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR makes minor improvements to our python notebook, and adds
support for `Rockset` workspaces in our vectorstore client.
@rlancemartin, @eyurtsev
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
In this pull request, GitLoader has been updated to handle multiple load
calls, provided the same repository is being cloned. Previously, calling
`load` multiple times would raise an error if a clone URL was provided.
Additionally, a check has been added to raise a ValueError when
attempting to clone a different repository into an existing path.
New tests have also been introduced to verify the correct behavior of
the GitLoader class when `load` is called multiple times.
Lastly, the GitPython package, a dependency for the GitLoader class, has
been added to the project dependencies (pyproject.toml and poetry.lock).
**Issue:**
None
**Dependencies:**
GitPython
**Tag maintainer:**
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
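A hedged usage sketch of the new behavior (paths and URLs are illustrative):
```python
# Calling load() twice on the same clone URL now works; pointing an
# existing repo_path at a different repository raises a ValueError.
from langchain.document_loaders import GitLoader

loader = GitLoader(
    clone_url="https://github.com/langchain-ai/langchain",
    repo_path="/tmp/langchain-clone",
    branch="master",
)
docs_first = loader.load()
docs_second = loader.load()  # second call no longer raises
```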
## Description
This PR handles modifying the Chroma DB integration's documentation.
It modifies the **Docker container** example to fix the instructions
mentioned in the documentation.
In the current documentation, the below `client.reset()` line causes a
runtime error:
```py
...
client = chromadb.HttpClient(settings=Settings(allow_reset=True))
client.reset() # resets the database
collection = client.create_collection("my_collection")
...
```
`Exception: {"error":"ValueError('Resetting is not allowed by this
configuration')"}`
This is due to the Chroma DB server also needing the `allow_reset` flag
set to `true`. This is fixed by adding `ALLOW_RESET=TRUE` to the
environment variables in the `docker-compose` file before spinning up
the container.
## Issue
This fixes the runtime error that occurs when running the docker
container example code
## Tag Maintainer
@rlancemartin, @eyurtsev
## Description
The imports for `NeptuneOpenCypherQAChain` are failing. This PR adds the
chain class to the `__init__.py` file to fix this issue.
## Maintainers
@dev2049
@krlawrence
Docs for from_documents() were outdated as seen in
https://github.com/langchain-ai/langchain/issues/8457.
Fixes #8457
### Description
In the LangChain documentation and comments, I noticed that `pip
install faiss` was mentioned instead of `pip install faiss-gpu`, and
running `pip install faiss` results in an error. I've gone ahead and
updated the documentation and `faiss.ipynb`. This change will ensure
ease of use for end users trying to install `faiss-gpu`.
### Issue:
Documentation / comments related.
### Dependencies:
No dependencies were changed; only the files with the wrong reference
were updated.
### Tag maintainer:
@rlancemartin, @eyurtsev (Thank You for your contributions 😄 )
# What
- add test to ensure values in time weighted retriever are updated
- Issue: None
- Dependencies: None
- Tag maintainer: @rlancemartin, @eyurtsev
- Twitter handle: @MlopsJ
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- Make _arun optional
- Pass run_manager to inner chains in tools that have them
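A hedged sketch of a custom tool under these changes: only `_run` is required now, and `run_manager` can be forwarded to inner chains. The tool itself is illustrative:
```python
# Hedged sketch: a sync-only tool; _arun is no longer required.
from typing import Optional

from langchain.callbacks.manager import CallbackManagerForToolRun
from langchain.tools import BaseTool

class EchoTool(BaseTool):
    name = "echo"
    description = "Echoes the input back to the caller."

    def _run(
        self, query: str, run_manager: Optional[CallbackManagerForToolRun] = None
    ) -> str:
        # An inner chain could receive callbacks=run_manager.get_child() here.
        return query
```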
---------
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
- Install langchain
- Set Pinecone API key and environment as env vars
- Create Pinecone index if it doesn't already exist
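A hedged sketch of those steps using the pinecone client of this era; the index name and dimension are illustrative:
```python
# Hedged sketch: initialize pinecone from env vars and create the index
# only if it doesn't already exist.
import os

import pinecone

pinecone.init(
    api_key=os.environ["PINECONE_API_KEY"],
    environment=os.environ["PINECONE_ENVIRONMENT"],
)
if "langchain-demo" not in pinecone.list_indexes():
    pinecone.create_index("langchain-demo", dimension=1536)
```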
---
- Description: Fix a couple of minor issues I came across when running
this notebook.
- Dependencies: none,
- Tag maintainer: @rlancemartin @eyurtsev,
- Twitter handle: @zackproser (certainly not necessary!)
**Description:**
Add support for Meilisearch vector store.
Resolves #7603
- No external dependencies added
- A notebook has been added
@rlancemartin
https://twitter.com/meilisearch
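A hedged usage sketch; the connection details are assumed to come from a locally running Meilisearch instance or constructor parameters, and the texts are illustrative:
```python
# Hedged sketch for the new Meilisearch vector store; constructor and
# connection details are assumptions based on this description.
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Meilisearch

vector_store = Meilisearch.from_texts(
    texts=["LangChain integrates with Meilisearch."],
    embedding=OpenAIEmbeddings(),
)
docs = vector_store.similarity_search("What integrates with Meilisearch?")
```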
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: The contribution guidelines using devcontainers refer to
the main repo and not the forked repo. We should create our changes in
our own forked repo, not on langchain/main.
- Issue: Just documentation
- Dependencies: N/A,
- Tag maintainer: @baskaryan
- Twitter handle: @levalencia
# PromptTemplate
* Update documentation to highlight the classmethod for instantiating a
prompt template.
* Expand kwargs in the classmethod to make parameters easier to discover
This PR got reverted here:
https://github.com/langchain-ai/langchain/pull/8395/files
* Expands support for a variety of message formats in the
`from_messages` classmethod. Ideally, we could deprecate the other
on-ramps to reduce the number of classmethods users need to know about.
* Expand documentation with code examples.
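A short, hedged sketch of the expanded formats, assuming tuple shorthand is among them:
```python
# Hedged sketch: from_messages accepting (role, template) tuples.
from langchain.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant named {name}."),
        ("human", "{question}"),
    ]
)
messages = prompt.format_messages(name="Bob", question="What is LangChain?")
```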
- Description: Minimax is a great AI startup from China; recently they
released their latest model and chat API, and the API is widely used
in China. As a result, I'd like to add the Minimax LLM model to
LangChain.
- Tag maintainer: @hwchase17, @baskaryan
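A hedged usage sketch; the credential parameter names follow common LangChain conventions and are assumptions here:
```python
# Hedged sketch of the Minimax LLM integration; credentials may also be
# read from environment variables depending on the implementation.
from langchain.llms import Minimax

llm = Minimax(minimax_api_key="your-api-key", minimax_group_id="your-group-id")
print(llm("Tell me a joke."))
```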
---------
Co-authored-by: the <tao.he@hulu.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Micro convenience PR to avoid a warning regarding the missing `client`
parameter, which is always set during initialization.
@baskaryan
Co-authored-by: Bagatur <baskaryan@gmail.com>
- [Xorbits
Inference(Xinference)](https://github.com/xorbitsai/inference) is a
powerful and versatile library designed to serve language, speech
recognition, and multimodal models. Xinference supports a variety of
GGML-compatible models including chatglm, whisper, and vicuna, and
utilizes heterogeneous hardware and a distributed architecture for
seamless cross-device and cross-server model deployment.
- This PR integrates Xinference models and Xinference embeddings into
LangChain.
- Dependencies: To install the dependencies for this integration, run
`pip install "xinference[all]"`
- Example Usage:
To start a local instance of Xinference, run `xinference`.
To deploy Xinference in a distributed cluster, first start an Xinference
supervisor using `xinference-supervisor`:
`xinference-supervisor -H "${supervisor_host}"`
Then, start the Xinference workers using `xinference-worker` on each
server you want to run them on.
`xinference-worker -e "http://${supervisor_host}:9997"`
To use Xinference with LangChain, you also need to launch a model. You
can use the command line interface (CLI) to do so. For example: `xinference
launch -n vicuna-v1.3 -f ggmlv3 -q q4_0`. This launches a model named
vicuna-v1.3 with `model_format="ggmlv3"` and `quantization="q4_0"`. A
model UID is returned for you to use.
Now you can use Xinference with LangChain:
```python
from langchain.llms import Xinference

llm = Xinference(
    server_url="http://0.0.0.0:9997",  # suppose the supervisor_host is "0.0.0.0"
    model_uid=model_uid,  # model UID returned from launching a model
)
llm(
    prompt="Q: where can we visit in the capital of France? A:",
    generate_config={"max_tokens": 1024},
)
```
You can also use the RESTful client to launch a model:
```python
from xinference.client import RESTfulClient
client = RESTfulClient("http://0.0.0.0:9997")
model_uid = client.launch_model(model_name="vicuna-v1.3", model_size_in_billions=7, quantization="q4_0")
```
The following code block demonstrates how to use Xinference embeddings
with LangChain:
```python
from langchain.embeddings import XinferenceEmbeddings
xinference = XinferenceEmbeddings(
server_url="http://0.0.0.0:9997",
    model_uid=model_uid,
)
```
```python
query_result = xinference.embed_query("This is a test query")
```
```python
doc_result = xinference.embed_documents(["text A", "text B"])
```
Xinference is still under rapid development. Feel free to [join our
Slack
community](https://xorbitsio.slack.com/join/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA)
to get the latest updates!
- Request for review: @hwchase17, @baskaryan
- Twitter handle: https://twitter.com/Xorbitsio
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Added a new tool to the Github toolkit called **Create Pull Request.**
Now we can make our own langchain contributor in langchain 😁
In order to have somewhere to pull from, I also added a new env var,
"GITHUB_BASE_BRANCH." This will allow the existing env var,
"GITHUB_BRANCH," to be a working branch for the bot (so that it doesn't
have to always commit on the main/master). For example, if you want the
bot to work in a branch called `bot_dev` and your repo base is `main`,
you would set up the vars like:
```
GITHUB_BASE_BRANCH = "main"
GITHUB_BRANCH = "bot_dev"
```
Maintainer responsibilities:
- Agents / Tools / Toolkits: @hinthornw
# PromptTemplate
* Update documentation to highlight the classmethod for instantiating a
prompt template.
* Expand kwargs in the classmethod to make parameters easier to discover
In this PR:
- Removed restricted model loading logic for Petals-Bloom
- Removed petals imports (DistributedBloomForCausalLM,
BloomTokenizerFast)
- Instead imported more generalized versions of loader
(AutoDistributedModelForCausalLM, AutoTokenizer)
- Updated the Petals example notebook to allow for a successful
installation of Petals on Apple Silicon Macs
- Tag maintainer: @hwchase17, @baskaryan
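A hedged sketch of the generalized loading described above; the model name is illustrative and the petals API may differ by version:
```python
# Hedged sketch: load any Petals-supported model via the generalized
# classes instead of the Bloom-specific ones.
from petals import AutoDistributedModelForCausalLM
from transformers import AutoTokenizer

model_name = "bigscience/bloom-petals"  # illustrative model id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)
```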
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Description:
This PR enables the OpenAPI chain to work with valid OpenAPI
specifications that are missing `description` and `summary` properties
on path and operation nodes.
Since both the `description` and `summary` properties are declared
optional, we cannot be sure they are defined. This PR resolves the
problem by providing an empty (`''`) description as a fallback.
The previous behavior of the OpenAPI chain was that the underlying LLM
(OpenAI) threw an exception, since `None` is not of type string:
```
openai.error.InvalidRequestError: None is not of type 'string' - 'functions.0.description'
```
Using this PR, the OpenAPI chain will also succeed with OpenAPI specs
lacking `description` and `summary` properties on path and operation
nodes.
Thanks for your amazing work!
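An illustrative sketch of the fallback (hypothetical variable names, not the vendored code):
```python
# Fall back to an empty string so the OpenAI functions payload never
# receives None for functions.N.description.
summary = None      # optional per the OpenAPI specification
description = None  # optional per the OpenAPI specification

function_description = description or summary or ""
print(repr(function_description))  # '' instead of None
```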
Tag maintainer: @baskaryan
---------
Co-authored-by: Lars Gersmann <lars.gersmann@cm4all.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
1. Upgrade the AwaDB from v0.3.7 to v0.3.9
2. Change the default embedding to AwaEmbedding
---------
Co-authored-by: ljeagle <awadb.vincent@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- Description: Adds the AwaEmbeddings class for embeddings, which
provides users with a convenient way to do fine-tuning, as well as
support for potential multimodal needs
- Tag maintainer: @baskaryan
Create `Awa.ipynb`: an example notebook for AwaEmbeddings class
Modify `embeddings/__init__.py`: Import the class
Create `embeddings/awa.py`: The embedding class
Create `embeddings/test_awa.py`: The test file.
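A hedged usage sketch for the new class; it assumes the awadb package is installed, and the inputs are illustrative:
```python
# Hedged sketch of AwaEmbeddings usage.
from langchain.embeddings import AwaEmbeddings

embedding = AwaEmbeddings()
query_vector = embedding.embed_query("This is a test query")
doc_vectors = embedding.embed_documents(["text A", "text B"])
```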
---------
Co-authored-by: taozhiwang <taozhiwa@gmail.com>
Full set of params are missing from Vertex* LLMs when `dict()` method is
called.
```
>>> from langchain.chat_models.vertexai import ChatVertexAI
>>> from langchain.llms.vertexai import VertexAI
>>> chat_llm = ChatVertexAI()
>>> llm = VertexAI()
>>> chat_llm.dict()
{'_type': 'vertexai'}
>>> llm.dict()
{'_type': 'vertexai'}
```
This PR just uses the same mechanism used elsewhere to expose the full
params.
Since `_identifying_params()` is on the `_VertexAICommon` class, it
should cover the chat and non-chat cases.
Spelling error fix
You may use the button above, or follow these steps to open this repo in a Codespace.
For more info, check out the [GitHub documentation](https://docs.github.com/en/free-pro-team@latest/github/developing-online-with-codespaces/creating-a-codespace#creating-a-codespace).
## VS Code Dev Containers
[](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
Note: this link opens the main repo, not your local cloned repo; you can use the same link with your username and cloned repo name substituted in:
If you already have VS Code and Docker installed, you can use the button above to get started. This will cause VS Code to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.
You can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:
2. Open a locally cloned copy of the code:
- Fork and Clone this repository to your local filesystem.
- Press <kbd>F1</kbd> and select the **Dev Containers: Open Folder in Container...** command.
- Select the cloned copy of this folder, wait for the container to start, and try things out!
Alternatively, if you are just interested in using the query generation part of the SQL chain, you can check out [`create_sql_query_chain`](https://github.com/langchain-ai/langchain/blob/master/docs/extras/use_cases/tabular/sql_query.ipynb)
[](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)
[](https://codespaces.new/hwchase17/langchain)
[](https://star-history.com/#hwchase17/langchain)
Looking for the JS/TS version? Check out [LangChain.js](https://github.com/hwchase17/langchainjs).
**Production Support:** As you move your LangChains into production, we'd love to offer more hands-on support.
Fill out [this form](https://airtable.com/appwQzlErAS2qiP0L/shrGtGaVBVAz7NcV2) to share more about what you're building, and our team will get in touch.
## 🚨Breaking Changes for select chains (SQLDatabase) on 7/28
One of the key concerns with using LLMs is that they may generate harmful or unethical text. This is an area of active research in the field. Here we present some built-in chains inspired by this research, which are intended to make the outputs of LLMs safer.
- [Moderation chain](/docs/use_cases/safety/moderation): Explicitly check if any output text is harmful and flag it.
- [Constitutional chain](/docs/use_cases/safety/constitutional_chain): Prompt the model with a set of principles which should guide its behavior.
Certain OpenAI models (like gpt-3.5-turbo-0613 and gpt-4-0613) have been fine-tuned to detect when a function should be called and respond with the inputs that should be passed to the function.
In an API call, you can describe functions and have the model intelligently choose to output a JSON object containing arguments to call those functions.
The goal of the OpenAI Function APIs is to more reliably return valid and useful function calls than a generic text completion or chat API.
This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects the prompt to use for a given input. Specifically we show how to use the `MultiPromptChain` to create a question-answering chain that selects the prompt which is most relevant for a given question, and then answers the question using that prompt.
import Example from "@snippets/modules/chains/additional/multi_prompt_router.mdx"
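A hedged sketch of a `MultiPromptChain` setup; the prompt info entries are illustrative:
```python
# Hedged sketch: route an input to the most relevant prompt.
from langchain.chains.router import MultiPromptChain
from langchain.llms import OpenAI

prompt_infos = [
    {
        "name": "physics",
        "description": "Good for answering physics questions",
        "prompt_template": "You are a physics professor. Answer: {input}",
    },
    {
        "name": "history",
        "description": "Good for answering history questions",
        "prompt_template": "You are a historian. Answer: {input}",
    },
]
chain = MultiPromptChain.from_prompts(OpenAI(), prompt_infos=prompt_infos)
print(chain.run("What is Newton's first law?"))
```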
The next step after calling a language model is to make a series of calls to a language model. This is particularly useful when you want to take the output from one call and use it as the input to another.
A summarization chain can be used to summarize multiple documents. One way is to input multiple smaller documents, after they have been divided into chunks, and operate over them with a MapReduceDocumentsChain. You can also choose to use a StuffDocumentsChain or a RefineDocumentsChain for the summarization step instead.
import Example from "@snippets/modules/chains/popular/summarize.mdx"
Most LLM applications have a conversational interface. An essential component of a conversation is being able to refer to information introduced earlier in the conversation.
At bare minimum, a conversational system should be able to access some window of past messages directly.
A more complex system will need to have a world model that it is constantly updating, which allows it to do things like maintain information about entities and their relationships.
:::info
Head to [Integrations](/docs/integrations/memory/) for documentation on built-in memory integrations with 3rd-party tools.
:::
We call this ability to store information about past interactions "memory".
LangChain provides a lot of utilities for adding memory to a system.
These utilities can be used by themselves or incorporated seamlessly into a chain.
By default, Chains and Agents are stateless,
meaning that they treat each incoming query independently (like the underlying LLMs and chat models themselves).
In some applications, like chatbots, it is essential
to remember previous interactions, both in the short and long-term.
The **Memory** class does exactly that.
A memory system needs to support two basic actions: reading and writing.
Recall that every chain defines some core execution logic that expects certain inputs.
Some of these inputs come directly from the user, but some of these inputs can come from memory.
A chain will interact with its memory system twice in a given run.
1. AFTER receiving the initial user inputs but BEFORE executing the core logic, a chain will READ from its memory system and augment the user inputs.
2. AFTER executing the core logic but BEFORE returning the answer, a chain will WRITE the inputs and outputs of the current run to memory, so that they can be referred to in future runs.
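A minimal, runnable sketch of this READ/WRITE cycle, using ConversationBufferMemory and a hypothetical stand-in for a chain's core logic:
```python
from langchain.memory import ConversationBufferMemory

def run_core_logic(inputs: dict) -> dict:
    # Hypothetical stand-in for a chain's core execution logic.
    return {"output": f"You said: {inputs['input']}"}

memory = ConversationBufferMemory()
inputs = {"input": "hi there"}
augmented = {**inputs, **memory.load_memory_variables(inputs)}  # 1. READ
outputs = run_core_logic(augmented)
memory.save_context(inputs, outputs)  # 2. WRITE for future runs
```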
LangChain provides memory components in two forms.
First, LangChain provides helper utilities for managing and manipulating previous chat messages.
These are designed to be modular and useful regardless of how they are used.
Secondly, LangChain provides easy ways to incorporate these utilities into chains.

## Building memory into a system
The two core design decisions in any memory system are:
- How state is stored
- How state is queried
### Storing: List of chat messages
Underlying any memory is a history of all chat interactions.
Even if these are not all used directly, they need to be stored in some form.
One of the key parts of the LangChain memory module is a series of integrations for storing these chat messages,
from in-memory lists to persistent databases.
- [Chat message storage](/docs/modules/memory/chat_messages/): How to work with Chat Messages, and the various integrations offered
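For example, a short sketch of the simplest in-memory message store:
```python
from langchain.memory import ChatMessageHistory

history = ChatMessageHistory()
history.add_user_message("hi!")
history.add_ai_message("hello, how can I help?")
print(history.messages)  # a list of HumanMessage/AIMessage objects
```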
### Querying: Data structures and algorithms on top of chat messages
Keeping a list of chat messages is fairly straightforward.
What is less straightforward is the set of data structures and algorithms built on top of chat messages that serve the most useful view of those messages.
A very simple memory system might just return the most recent messages each run. A slightly more complex memory system might return a succinct summary of the past K messages.
An even more sophisticated system might extract entities from stored messages and only return information about entities referenced in the current run.
Each application can have different requirements for how memory is queried. The memory module should make it easy to both get started with simple memory systems and write your own custom systems if needed.
- [Memory types](/docs/modules/memory/types/): The various data structures and algorithms that make up the memory types LangChain supports
## Get started
Memory involves keeping a concept of state around throughout a user's interactions with a language model. A user's interactions with a language model are captured in the concept of ChatMessages, so this boils down to ingesting, capturing, transforming and extracting knowledge from a sequence of chat messages. There are many different ways to do this, each of which exists as its own memory type.
In general, for each type of memory there are two ways to understand its use: the standalone functions, which extract information from a sequence of messages, and the way this type of memory can be used in a chain.
Memory can return multiple pieces of information (for example, the most recent N messages and a summary of all previous messages). The returned information can either be a string or a list of messages.
Let's take a look at what Memory actually looks like in LangChain.
Here we'll cover the basics of interacting with an arbitrary memory class.
import GetStarted from "@snippets/modules/memory/get_started.mdx"
<GetStarted/>
## Next steps
And that's it for getting started!
Please see the other sections for walkthroughs of more advanced topics,
This differs from most of the other Memory classes in that it doesn't explicitly track the order of interactions.
In this case, the "docs" are previous conversation snippets. This can be useful to refer to relevant pieces of information that the AI was told earlier in the conversation.
import Example from "@snippets/modules/memory/how_to/vectorstore_retriever_memory.mdx"
import Example from "@snippets/modules/memory/types/vectorstore_retriever_memory.mdx"
Language models take text as input - that text is commonly referred to as a prompt.
Typically this is not simply a hardcoded string but rather a combination of a template, some examples, and user input.
LangChain provides several classes and functions to make constructing and working with prompts easy.
Prompt templates are pre-defined recipes for generating prompts for language models.
## What is a prompt template?
A template may include instructions, few shot examples, and specific context and
questions appropriate for a given task.
A prompt template refers to a reproducible way to generate a prompt. It contains a text string ("the template") that can take in a set of parameters from the end user and generate a prompt.
LangChain provides tooling to create and work with prompt templates.
A prompt template can contain:
- instructions to the language model,
- a set of few shot examples to help the language model generate a better response,
- a question to the language model.
LangChain strives to create model agnostic templates to make it easy to reuse
existing templates across different language models.
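A minimal sketch of such a template:
```python
from langchain.prompts import PromptTemplate

template = PromptTemplate.from_template(
    "Answer the question based on the context.\n"
    "Context: {context}\n"
    "Question: {question}\n"
    "Answer:"
)
prompt = template.format(context="LangChain is a framework.", question="What is LangChain?")
```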
import GetStarted from "@snippets/modules/model_io/prompts/prompt_templates/get_started.mdx"
The ConversationalRetrievalQA chain builds on RetrievalQAChain to provide a chat history component.
It first combines the chat history (either explicitly passed in or retrieved from the provided memory) and the question into a standalone question, then looks up relevant documents from the retriever, and finally passes those documents and the question to a question answering chain to return a response.
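A hedged sketch of that flow; the vector store, texts, and models are illustrative, and an OpenAI API key plus the faiss package are assumed:
```python
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

vectorstore = FAISS.from_texts(
    ["The author said LangChain makes LLM apps composable."],
    OpenAIEmbeddings(),
)
qa = ConversationalRetrievalChain.from_llm(
    ChatOpenAI(temperature=0), retriever=vectorstore.as_retriever()
)
result = qa({"question": "What did the author say?", "chat_history": []})
```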
This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects which Retrieval system to use. Specifically we show how to use the `MultiRetrievalQAChain` to create a question-answering chain that selects the retrieval QA chain which is most relevant for a given question, and then answers the question using it.
Here we walk through how to use LangChain for question answering over a list of documents. Under the hood we'll be using our [Document chains](/docs/modules/chains/document/).
Below are links to video tutorials and courses on LangChain. For written guides on common use cases for LangChain, check out the [use cases guides](/docs/use_cases).
⛓ icon marks a new addition [last update 2023-07-05]
"In an effort to make it as easy as possible to create custom chains, we've implemented a [\"Runnable\"](https://api.python.langchain.com/en/latest/schema/langchain.schema.runnable.Runnable.html#langchain.schema.runnable.Runnable) protocol that most components implement. This is a standard interface with a few different methods, which makes it easy to define custom chains as well as making it possible to invoke them in a standard way. The standard interface exposed includes:\n",
"\n",
"- `stream`: stream back chunks of the response\n",
"- `invoke`: call the chain on an input\n",
"- `batch`: call the chain on a list of inputs\n",
"\n",
"These also have corresponding async methods:\n",
"\n",
"- `astream`: stream back chunks of the response async\n",
"- `ainvoke`: call the chain on an input async\n",
"- `abatch`: call the chain on a list of inputs async\n",
"\n",
"The type of the input varies by component:\n",
"\n",
"| Component | Input Type |\n",
"| --- | --- |\n",
"|Prompt|Dictionary|\n",
"|Retriever|Single string|\n",
"|Model| Single string, list of chat messages or a PromptValue|\n",
"\n",
"The output type also varies by component:\n",
"\n",
"| Component | Output Type |\n",
"| --- | --- |\n",
"| LLM | String |\n",
"| ChatModel | ChatMessage |\n",
"| Prompt | PromptValue |\n",
"| Retriever | List of documents |\n",
"\n",
"Let's take a look at these methods! To do so, we'll create a super simple PromptTemplate + ChatModel chain."
"Constructing your language model application will likely involved choosing between many different options of prompts, models, and even chains to use. When doing so, you will want to compare these different options on different inputs in an easy, flexible, and intuitive way. \n",
"This notebook shows how to use an experimental wrapper around Anthropic that gives it the same API as OpenAI Functions."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "378be79b",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/harrisonchase/.pyenv/versions/3.9.1/envs/langchain/lib/python3.9/site-packages/deeplake/util/check_latest_version.py:32: UserWarning: A newer version of deeplake (3.6.14) is available. It's recommended that you update to the latest version using `pip install -U deeplake`.\n",
" content=\"You are a helpful AI that shares everything you know.\"\n",
" ),\n",
" HumanMessage(\n",
" content=\"Tell me technical facts about yourself. Are you a transformer model? How many billions of parameters do you have?\"\n",
" ),\n",
"]\n",
"\n",
"async def get_msgs():\n",
" tasks = [\n",
" chat.apredict_messages(messages)\n",
" for chat in chats.values()\n",
" ]\n",
" responses = await asyncio.gather(*tasks)\n",
" return dict(zip(chats.keys(), responses))"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "b2ced871-869a-4ca6-a2ec-6bfececdf7da",
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "bc605fa5-9501-470d-a6c9-cd868d2145ef",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\tmeta-llama/Llama-2-70b-chat-hf\n",
"\n",
"Greetings! I'm just an AI, I don't have a personal identity like humans do, but I'm here to help you with any questions you have.\n",
"\n",
"I'm a large language model, which means I'm trained on a large corpus of text data to generate language outputs that are coherent and natural-sounding. My architecture is based on a transformer model, which is a type of neural network that's particularly well-suited for natural language processing tasks.\n",
"\n",
"As for my parameters, I have a few billion parameters, but I don't have access to the exact number as it's not relevant to my functioning. My training data includes a vast amount of text from various sources, including books, articles, and websites, which I use to learn patterns and relationships in language.\n",
"\n",
"I'm designed to be a helpful tool for a variety of tasks, such as answering questions, providing information, and generating text. I'm constantly learning and improving my abilities through machine learning algorithms and feedback from users like you.\n",
"\n",
"I hope this helps! Is there anything else you'd like to know about me or my capabilities?\n",
"\n",
"---\n",
"\n",
"\tmeta-llama/Llama-2-7b-chat-hf\n",
"\n",
"Ah, a fellow tech enthusiast! *adjusts glasses* I'm glad to share some technical details about myself. 🤓\n",
"Indeed, I'm a transformer model, specifically a BERT-like language model trained on a large corpus of text data. My architecture is based on the transformer framework, which is a type of neural network designed for natural language processing tasks. 🏠\n",
"As for the number of parameters, I have approximately 340 million. *winks* That's a pretty hefty number, if I do say so myself! These parameters allow me to learn and represent complex patterns in language, such as syntax, semantics, and more. 🤔\n",
"But don't ask me to do math in my head – I'm a language model, not a calculating machine! 😅 My strengths lie in understanding and generating human-like text, so feel free to chat with me anytime you'd like. 💬\n",
"Now, do you have any more technical questions for me? Or would you like to engage in a nice chat? 😊\n",
"\n",
"---\n",
"\n",
"\tmeta-llama/Llama-2-13b-chat-hf\n",
"\n",
"Hello! As a friendly and helpful AI, I'd be happy to share some technical facts about myself.\n",
"\n",
"I am a transformer-based language model, specifically a variant of the BERT (Bidirectional Encoder Representations from Transformers) architecture. BERT was developed by Google in 2018 and has since become one of the most popular and widely-used AI language models.\n",
"\n",
"Here are some technical details about my capabilities:\n",
"\n",
"1. Parameters: I have approximately 340 million parameters, which are the numbers that I use to learn and represent language. This is a relatively large number of parameters compared to some other languages models, but it allows me to learn and understand complex language patterns and relationships.\n",
"2. Training: I was trained on a large corpus of text data, including books, articles, and other sources of written content. This training allows me to learn about the structure and conventions of language, as well as the relationships between words and phrases.\n",
"3. Architectures: My architecture is based on the transformer model, which is a type of neural network that is particularly well-suited for natural language processing tasks. The transformer model uses self-attention mechanisms to allow the model to \"attend\" to different parts of the input text, allowing it to capture long-range dependencies and contextual relationships.\n",
"4. Precision: I am capable of generating text with high precision and accuracy, meaning that I can produce text that is close to human-level quality in terms of grammar, syntax, and coherence.\n",
"5. Generative capabilities: In addition to being able to generate text based on prompts and questions, I am also capable of generating text based on a given topic or theme. This allows me to create longer, more coherent pieces of text that are organized around a specific idea or concept.\n",
"\n",
"Overall, I am a powerful and versatile language model that is capable of a wide range of natural language processing tasks. I am constantly learning and improving, and I am here to help answer any questions you may have!\n",
"Azure OpenAI responses contain `model` property, which is name of the model used to generate the response. However unlike native OpenAI responses, it does not contain the version of the model, which is set on the deplyoment in Azure. This makes it tricky to know which version of the model was used to generate the response, which as result can lead to e.g. wrong total cost calculation with `OpenAICallbackHandler`.\n",
"\n",
"To solve this problem, you can pass `model_version` parameter to `AzureChatOpenAI` class, which will be added to the model name in the llm output. This way you can easily distinguish between different versions of the model."
"DEPLOYMENT_NAME = \"gpt-35-turbo\" # in Azure, this deployment has version 0613 - input and output tokens are counted separately"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "aceddb72",
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total Cost (USD): $0.000054\n"
]
}
],
"source": [
"model = AzureChatOpenAI(\n",
" openai_api_base=BASE_URL,\n",
" openai_api_version=\"2023-05-15\",\n",
" deployment_name=DEPLOYMENT_NAME,\n",
" openai_api_key=API_KEY,\n",
" openai_api_type=\"azure\",\n",
")\n",
"with get_openai_callback() as cb:\n",
" model(\n",
" [\n",
" HumanMessage(\n",
" content=\"Translate this sentence from English to French. I love programming.\"\n",
" )\n",
" ]\n",
" )\n",
" print(f\"Total Cost (USD): ${format(cb.total_cost, '.6f')}\") # without specifying the model version, flat-rate 0.002 USD per 1k input and output tokens is used\n"
]
},
{
"cell_type": "markdown",
"id": "2e61eefd",
"metadata": {},
"source": [
"We can provide the model version to `AzureChatOpenAI` constructor. It will get appended to the model name returned by Azure OpenAI and cost will be counted correctly."
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "8d5e54e9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total Cost (USD): $0.000044\n"
]
}
],
"source": [
"model0613 = AzureChatOpenAI(\n",
" openai_api_base=BASE_URL,\n",
" openai_api_version=\"2023-05-15\",\n",
" deployment_name=DEPLOYMENT_NAME,\n",
" openai_api_key=API_KEY,\n",
" openai_api_type=\"azure\",\n",
" model_version=\"0613\"\n",
")\n",
"with get_openai_callback() as cb:\n",
" model0613(\n",
" [\n",
" HumanMessage(\n",
" content=\"Translate this sentence from English to French. I love programming.\"\n",
"[AzureML](https://azure.microsoft.com/en-us/products/machine-learning/) is a platform used to build, train, and deploy machine learning models. Users can explore the types of models to deploy in the Model Catalog, which provides Azure Foundation Models and OpenAI Models. Azure Foundation Models include various open-source models and popular Hugging Face models. Users can also import models of their liking into AzureML.\n",
"\n",
"This notebook goes over how to use a chat model hosted on an `AzureML online endpoint`"
"To use the wrapper, you must [deploy a model on AzureML](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-use-foundation-models?view=azureml-api-2#deploying-foundation-models-to-endpoints-for-inferencing) and obtain the following parameters:\n",
"\n",
"* `endpoint_api_key`: The API key provided by the endpoint\n",
"* `endpoint_url`: The REST endpoint url provided by the endpoint"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Content Formatter\n",
"\n",
"The `content_formatter` parameter is a handler class for transforming the request and response of an AzureML endpoint to match with required schema. Since there are a wide range of models in the model catalog, each of which may process data differently from one another, a `ContentFormatterBase` class is provided to allow users to transform data to their liking. The following content formatters are provided:\n",
"\n",
"* `LLamaContentFormatter`: Formats request and response data for LLaMa2-chat"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' The Collatz Conjecture is one of the most famous unsolved problems in mathematics, and it has been the subject of much study and research for many years. While it is impossible to predict with certainty whether the conjecture will ever be solved, there are several reasons why it is considered a challenging and important problem:\\n\\n1. Simple yet elusive: The Collatz Conjecture is a deceptively simple statement that has proven to be extraordinarily difficult to prove or disprove. Despite its simplicity, the conjecture has eluded some of the brightest minds in mathematics, and it remains one of the most famous open problems in the field.\\n2. Wide-ranging implications: The Collatz Conjecture has far-reaching implications for many areas of mathematics, including number theory, algebra, and analysis. A solution to the conjecture could have significant impacts on these fields and potentially lead to new insights and discoveries.\\n3. Computational evidence: While the conjecture remains unproven, extensive computational evidence supports its validity. In fact, no counterexample to the conjecture has been found for any starting value up to 2^64 (a number', additional_kwargs={}, example=False)"
"Note: This is seperate from the Google PaLM integration. Google has chosen to offer an enterprise version of PaLM through GCP, and this supports the models made available through there. \n",
"\n",
"PaLM API on Vertex AI is a Preview offering, subject to the Pre-GA Offerings Terms of the [GCP Service Specific Terms](https://cloud.google.com/terms/service-terms).\n",
"\n",
"Pre-GA products and features may have limited support, and changes to pre-GA products and features may not be compatible with other pre-GA versions. For more information, see the [launch stage descriptions](https://cloud.google.com/products#product-launch-stages). Further, by using PaLM API on Vertex AI, you agree to the Generative AI Preview [terms and conditions](https://cloud.google.com/trustedtester/aitos) (Preview Terms).\n",
"\n",
"For PaLM API on Vertex AI, you can process personal data as outlined in the Cloud Data Processing Addendum, subject to applicable restrictions and obligations in the Agreement (as defined in the Preview Terms).\n",
"By default, Google Cloud [does not use](https://cloud.google.com/vertex-ai/docs/generative-ai/data-governance#foundation_model_development) Customer Data to train its foundation models as part of Google Cloud`s AI/ML Privacy Commitment. More details about how Google processes data can also be found in [Google's Customer Data Processing Addendum (CDPA)](https://cloud.google.com/terms/data-processing-addendum).\n",
"\n",
"To use Vertex AI PaLM you must have the `google-cloud-aiplatform` Python package installed and either:\n",
"- Have credentials configured for your environment (gcloud, workload identity, etc...)\n",
">[Airbyte](https://github.com/airbytehq/airbyte) is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.\n",
"\n",
"A lot of source connectors are implemented using the [Airbyte CDK](https://docs.airbyte.com/connector-development/cdk-python/). This loader allows to run any of these connectors and return the data as documents."
]
},
{
"cell_type": "markdown",
"id": "3b06fbde",
"metadata": {},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"id": "e3e9dc79",
"metadata": {},
"source": [
"First, you need to install the `airbyte-cdk` python package."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d35e4e0",
"metadata": {},
"outputs": [],
"source": [
"#!pip install airbyte-cdk"
]
},
{
"cell_type": "markdown",
"id": "085aa658",
"metadata": {},
"source": [
"Then, either install an existing connector from the [Airbyte Github repository](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors) or create your own connector using the [Airbyte CDK](https://docs.airbyte.io/connector-development/connector-development).\n",
"\n",
"For example, to install the Github connector, run"
"Some sources are also published as regular packages on PyPI"
]
},
{
"cell_type": "markdown",
"id": "ae855210",
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "markdown",
"id": "02208f52",
"metadata": {},
"source": [
"Now you can create an `AirbyteCDKLoader` based on the imported source. It takes a `config` object that's passed to the connector. You also have to pick the stream you want to retrieve records from by name (`stream_name`). Check the connectors documentation page and spec definition for more information on the config object and available streams. For the Github connectors these are:\n",
"As `load` returns a list, it will block until all documents are loaded. To have better control over this process, you can also you the `lazy_load` method which returns an iterator instead:"
]
},
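{
"cell_type": "markdown",
"metadata": {},
"source": [
"For illustration, a minimal sketch of constructing the loader for the Github connector. The `SourceGithub` import and the exact config fields are assumptions; check the connector's spec definition for the authoritative schema:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.document_loaders.airbyte import AirbyteCDKLoader\n",
"from source_github.source import SourceGithub  # any CDK-based source class works\n",
"\n",
"config = {\n",
"    # see the Github connector spec for all required fields\n",
"    \"credentials\": {\"api_url\": \"api.github.com\", \"personal_access_token\": \"<token>\"},\n",
"    \"repository\": \"<owner>/<repo>\",\n",
"    \"start_date\": \"2020-10-20T00:00:00Z\",\n",
"}\n",
"\n",
"issues_loader = AirbyteCDKLoader(\n",
"    source_class=SourceGithub, config=config, stream_name=\"issues\"\n",
")"
]
},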
{
"cell_type": "code",
"execution_count": null,
"id": "1782db09",
"metadata": {},
"outputs": [],
"source": [
"docs_iterator = issues_loader.lazy_load()"
]
},
{
"cell_type": "markdown",
"id": "3a124086",
"metadata": {},
"source": [
"Keep in mind that by default the page content is empty and the metadata object contains all the information from the record. To create documents in a different, pass in a record_handler function when creating the loader:"
"Some streams allow incremental loading, this means the source keeps track of synced records and won't load them again. This is useful for sources that have a high volume of data and are updated frequently.\n",
"\n",
"To take advantage of this, store the `last_state` property of the loader and pass it in when creating the loader again. This will ensure that only new records are loaded."
]
},
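{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal record_handler sketch (assuming records expose their fields via a `data` attribute, as in the Airbyte CDK):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.docstore.document import Document\n",
"\n",
"\n",
"def handle_record(record, id):\n",
"    # build the page content from a chosen field instead of leaving it empty\n",
"    return Document(page_content=record.data[\"title\"], metadata=record.data)\n",
"\n",
"\n",
"issues_loader = AirbyteCDKLoader(\n",
"    source_class=SourceGithub,\n",
"    config=config,\n",
"    stream_name=\"issues\",\n",
"    record_handler=handle_record,\n",
")"
]
},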
{
"cell_type": "code",
"execution_count": null,
"id": "7061e735",
"metadata": {},
"outputs": [],
"source": [
"last_state = issues_loader.last_state # store safely\n",
">[Airbyte](https://github.com/airbytehq/airbyte) is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.\n",
"\n",
"This loader exposes the Gong connector as a document loader, allowing you to load various Gong objects as documents."
]
},
{
"cell_type": "markdown",
"id": "6847a40c",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"id": "3b06fbde",
"metadata": {},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"id": "e3e9dc79",
"metadata": {},
"source": [
"First, you need to install the `airbyte-source-gong` python package."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d35e4e0",
"metadata": {},
"outputs": [],
"source": [
"#!pip install airbyte-source-gong"
]
},
{
"cell_type": "markdown",
"id": "ae855210",
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "markdown",
"id": "02208f52",
"metadata": {},
"source": [
"Check out the [Airbyte documentation page](https://docs.airbyte.com/integrations/sources/gong/) for details about how to configure the reader.\n",
"The JSON schema the config object should adhere to can be found on Github: [https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-gong/source_gong/spec.yaml](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-gong/source_gong/spec.yaml).\n",
" \"start_date\": \"<date from which to start retrieving records from in ISO format, e.g. 2020-10-20T00:00:00Z>\",\n",
"}\n",
"```\n",
"\n",
"By default all fields are stored as metadata in the documents and the text is set to an empty string. Construct the text of the document by transforming the documents returned by the reader."
"loader = AirbyteGongLoader(config=config, stream_name=\"calls\") # check the documentation linked above for a list of all streams"
]
},
{
"cell_type": "markdown",
"id": "2cea23fc",
"metadata": {},
"source": [
"Now you can load documents the usual way"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dae75cdb",
"metadata": {},
"outputs": [],
"source": [
"docs = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "4a93dc2a",
"metadata": {},
"source": [
"As `load` returns a list, it will block until all documents are loaded. To have better control over this process, you can also you the `lazy_load` method which returns an iterator instead:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1782db09",
"metadata": {},
"outputs": [],
"source": [
"docs_iterator = loader.lazy_load()"
]
},
{
"cell_type": "markdown",
"id": "3a124086",
"metadata": {},
"source": [
"Keep in mind that by default the page content is empty and the metadata object contains all the information from the record. To process documents, create a class inheriting from the base loader and implement the `_handle_records` method yourself:"
"Some streams allow incremental loading, this means the source keeps track of synced records and won't load them again. This is useful for sources that have a high volume of data and are updated frequently.\n",
"\n",
"To take advantage of this, store the `last_state` property of the loader and pass it in when creating the loader again. This will ensure that only new records are loaded."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7061e735",
"metadata": {},
"outputs": [],
"source": [
"last_state = loader.last_state # store safely\n",
">[Airbyte](https://github.com/airbytehq/airbyte) is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.\n",
"\n",
"This loader exposes the Hubspot connector as a document loader, allowing you to load various Hubspot objects as documents."
]
},
{
"cell_type": "markdown",
"id": "6847a40c",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"id": "3b06fbde",
"metadata": {},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"id": "e3e9dc79",
"metadata": {},
"source": [
"First, you need to install the `airbyte-source-hubspot` python package."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d35e4e0",
"metadata": {},
"outputs": [],
"source": [
"#!pip install airbyte-source-hubspot"
]
},
{
"cell_type": "markdown",
"id": "ae855210",
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "markdown",
"id": "02208f52",
"metadata": {},
"source": [
"Check out the [Airbyte documentation page](https://docs.airbyte.com/integrations/sources/hubspot/) for details about how to configure the reader.\n",
"The JSON schema the config object should adhere to can be found on Github: [https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-hubspot/source_hubspot/spec.yaml](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-hubspot/source_hubspot/spec.yaml).\n",
"\n",
"The general shape looks like this:\n",
"```python\n",
"{\n",
" \"start_date\": \"<date from which to start retrieving records from in ISO format, e.g. 2020-10-20T00:00:00Z>\",\n",
" \"access_token\": \"<access token of your private app>\"\n",
" }\n",
"}\n",
"```\n",
"\n",
"By default all fields are stored as metadata in the documents and the text is set to an empty string. Construct the text of the document by transforming the documents returned by the reader."
"loader = AirbyteHubspotLoader(config=config, stream_name=\"products\") # check the documentation linked above for a list of all streams"
]
},
{
"cell_type": "markdown",
"id": "2cea23fc",
"metadata": {},
"source": [
"Now you can load documents the usual way"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dae75cdb",
"metadata": {},
"outputs": [],
"source": [
"docs = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "4a93dc2a",
"metadata": {},
"source": [
"As `load` returns a list, it will block until all documents are loaded. To have better control over this process, you can also you the `lazy_load` method which returns an iterator instead:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1782db09",
"metadata": {},
"outputs": [],
"source": [
"docs_iterator = loader.lazy_load()"
]
},
{
"cell_type": "markdown",
"id": "3a124086",
"metadata": {},
"source": [
"Keep in mind that by default the page content is empty and the metadata object contains all the information from the record. To process documents, create a class inheriting from the base loader and implement the `_handle_records` method yourself:"
"Some streams allow incremental loading, this means the source keeps track of synced records and won't load them again. This is useful for sources that have a high volume of data and are updated frequently.\n",
"\n",
"To take advantage of this, store the `last_state` property of the loader and pass it in when creating the loader again. This will ensure that only new records are loaded."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7061e735",
"metadata": {},
"outputs": [],
"source": [
"last_state = loader.last_state # store safely\n",
">[Airbyte](https://github.com/airbytehq/airbyte) is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.\n",
"\n",
"This loader exposes the Salesforce connector as a document loader, allowing you to load various Salesforce objects as documents."
]
},
{
"cell_type": "markdown",
"id": "6847a40c",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"id": "3b06fbde",
"metadata": {},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"id": "e3e9dc79",
"metadata": {},
"source": [
"First, you need to install the `airbyte-source-salesforce` python package."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d35e4e0",
"metadata": {},
"outputs": [],
"source": [
"#!pip install airbyte-source-salesforce"
]
},
{
"cell_type": "markdown",
"id": "ae855210",
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "markdown",
"id": "02208f52",
"metadata": {},
"source": [
"Check out the [Airbyte documentation page](https://docs.airbyte.com/integrations/sources/salesforce/) for details about how to configure the reader.\n",
"The JSON schema the config object should adhere to can be found on Github: [https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-salesforce/source_salesforce/spec.yaml](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-salesforce/source_salesforce/spec.yaml).\n",
" \"start_date\": \"<date from which to start retrieving records from in ISO format, e.g. 2020-10-20T00:00:00Z>\",\n",
" \"is_sandbox\": False, # set to True if you're using a sandbox environment\n",
" \"streams_criteria\": [ # Array of filters for salesforce objects that should be loadable\n",
" {\"criteria\": \"exacts\", \"value\": \"Account\"}, # Exact name of salesforce object\n",
" {\"criteria\": \"starts with\", \"value\": \"Asset\"}, # Prefix of the name\n",
" # Other allowed criteria: ends with, contains, starts not with, ends not with, not contains, not exacts\n",
" ],\n",
"}\n",
"```\n",
"\n",
"By default all fields are stored as metadata in the documents and the text is set to an empty string. Construct the text of the document by transforming the documents returned by the reader."
"loader = AirbyteSalesforceLoader(config=config, stream_name=\"asset\") # check the documentation linked above for a list of all streams"
]
},
{
"cell_type": "markdown",
"id": "2cea23fc",
"metadata": {},
"source": [
"Now you can load documents the usual way"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dae75cdb",
"metadata": {},
"outputs": [],
"source": [
"docs = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "4a93dc2a",
"metadata": {},
"source": [
"As `load` returns a list, it will block until all documents are loaded. To have better control over this process, you can also you the `lazy_load` method which returns an iterator instead:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1782db09",
"metadata": {},
"outputs": [],
"source": [
"docs_iterator = loader.lazy_load()"
]
},
{
"cell_type": "markdown",
"id": "3a124086",
"metadata": {},
"source": [
"Keep in mind that by default the page content is empty and the metadata object contains all the information from the record. To create documents in a different, pass in a record_handler function when creating the loader:"
"Some streams allow incremental loading, this means the source keeps track of synced records and won't load them again. This is useful for sources that have a high volume of data and are updated frequently.\n",
"\n",
"To take advantage of this, store the `last_state` property of the loader and pass it in when creating the loader again. This will ensure that only new records are loaded."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7061e735",
"metadata": {},
"outputs": [],
"source": [
"last_state = loader.last_state # store safely\n",
">[Airbyte](https://github.com/airbytehq/airbyte) is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.\n",
"\n",
"This loader exposes the Shopify connector as a document loader, allowing you to load various Shopify objects as documents."
]
},
{
"cell_type": "markdown",
"id": "6847a40c",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"id": "3b06fbde",
"metadata": {},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"id": "e3e9dc79",
"metadata": {},
"source": [
"First, you need to install the `airbyte-source-shopify` python package."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d35e4e0",
"metadata": {},
"outputs": [],
"source": [
"#!pip install airbyte-source-shopify"
]
},
{
"cell_type": "markdown",
"id": "ae855210",
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "markdown",
"id": "02208f52",
"metadata": {},
"source": [
"Check out the [Airbyte documentation page](https://docs.airbyte.com/integrations/sources/shopify/) for details about how to configure the reader.\n",
"The JSON schema the config object should adhere to can be found on Github: [https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-shopify/source_shopify/spec.json](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-shopify/source_shopify/spec.json).\n",
"\n",
"The general shape looks like this:\n",
"```python\n",
"{\n",
" \"start_date\": \"<date from which to start retrieving records from in ISO format, e.g. 2020-10-20T00:00:00Z>\",\n",
" \"shop\": \"<name of the shop you want to retrieve documents from>\",\n",
" \"credentials\": {\n",
" \"auth_method\": \"api_password\",\n",
" \"api_password\": \"<your api password>\"\n",
" }\n",
"}\n",
"```\n",
"\n",
"By default all fields are stored as metadata in the documents and the text is set to an empty string. Construct the text of the document by transforming the documents returned by the reader."
"loader = AirbyteShopifyLoader(config=config, stream_name=\"orders\") # check the documentation linked above for a list of all streams"
]
},
{
"cell_type": "markdown",
"id": "2cea23fc",
"metadata": {},
"source": [
"Now you can load documents the usual way"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dae75cdb",
"metadata": {},
"outputs": [],
"source": [
"docs = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "4a93dc2a",
"metadata": {},
"source": [
"As `load` returns a list, it will block until all documents are loaded. To have better control over this process, you can also you the `lazy_load` method which returns an iterator instead:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1782db09",
"metadata": {},
"outputs": [],
"source": [
"docs_iterator = loader.lazy_load()"
]
},
{
"cell_type": "markdown",
"id": "3a124086",
"metadata": {},
"source": [
"Keep in mind that by default the page content is empty and the metadata object contains all the information from the record. To create documents in a different, pass in a record_handler function when creating the loader:"
"Some streams allow incremental loading, this means the source keeps track of synced records and won't load them again. This is useful for sources that have a high volume of data and are updated frequently.\n",
"\n",
"To take advantage of this, store the `last_state` property of the loader and pass it in when creating the loader again. This will ensure that only new records are loaded."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7061e735",
"metadata": {},
"outputs": [],
"source": [
"last_state = loader.last_state # store safely\n",
">[Airbyte](https://github.com/airbytehq/airbyte) is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.\n",
"\n",
"This loader exposes the Stripe connector as a document loader, allowing you to load various Stripe objects as documents."
]
},
{
"cell_type": "markdown",
"id": "6847a40c",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"id": "3b06fbde",
"metadata": {},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"id": "e3e9dc79",
"metadata": {},
"source": [
"First, you need to install the `airbyte-source-stripe` python package."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d35e4e0",
"metadata": {},
"outputs": [],
"source": [
"#!pip install airbyte-source-stripe"
]
},
{
"cell_type": "markdown",
"id": "ae855210",
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "markdown",
"id": "02208f52",
"metadata": {},
"source": [
"Check out the [Airbyte documentation page](https://docs.airbyte.com/integrations/sources/stripe/) for details about how to configure the reader.\n",
"The JSON schema the config object should adhere to can be found on Github: [https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-stripe/source_stripe/spec.yaml](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-stripe/source_stripe/spec.yaml).\n",
"\n",
"The general shape looks like this:\n",
"```python\n",
"{\n",
" \"client_secret\": \"<secret key>\",\n",
" \"account_id\": \"<account id>\",\n",
" \"start_date\": \"<date from which to start retrieving records from in ISO format, e.g. 2020-10-20T00:00:00Z>\",\n",
"}\n",
"```\n",
"\n",
"By default all fields are stored as metadata in the documents and the text is set to an empty string. Construct the text of the document by transforming the documents returned by the reader."
"loader = AirbyteStripeLoader(config=config, stream_name=\"invoices\") # check the documentation linked above for a list of all streams"
]
},
{
"cell_type": "markdown",
"id": "2cea23fc",
"metadata": {},
"source": [
"Now you can load documents the usual way"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dae75cdb",
"metadata": {},
"outputs": [],
"source": [
"docs = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "4a93dc2a",
"metadata": {},
"source": [
"As `load` returns a list, it will block until all documents are loaded. To have better control over this process, you can also you the `lazy_load` method which returns an iterator instead:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1782db09",
"metadata": {},
"outputs": [],
"source": [
"docs_iterator = loader.lazy_load()"
]
},
{
"cell_type": "markdown",
"id": "3a124086",
"metadata": {},
"source": [
"Keep in mind that by default the page content is empty and the metadata object contains all the information from the record. To create documents in a different, pass in a record_handler function when creating the loader:"
"Some streams allow incremental loading, this means the source keeps track of synced records and won't load them again. This is useful for sources that have a high volume of data and are updated frequently.\n",
"\n",
"To take advantage of this, store the `last_state` property of the loader and pass it in when creating the loader again. This will ensure that only new records are loaded."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7061e735",
"metadata": {},
"outputs": [],
"source": [
"last_state = loader.last_state # store safely\n",
">[Airbyte](https://github.com/airbytehq/airbyte) is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.\n",
"\n",
"This loader exposes the Typeform connector as a document loader, allowing you to load various Typeform objects as documents."
]
},
{
"cell_type": "markdown",
"id": "6847a40c",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"id": "3b06fbde",
"metadata": {},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"id": "e3e9dc79",
"metadata": {},
"source": [
"First, you need to install the `airbyte-source-typeform` python package."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d35e4e0",
"metadata": {},
"outputs": [],
"source": [
"#!pip install airbyte-source-typeform"
]
},
{
"cell_type": "markdown",
"id": "ae855210",
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "markdown",
"id": "02208f52",
"metadata": {},
"source": [
"Check out the [Airbyte documentation page](https://docs.airbyte.com/integrations/sources/typeform/) for details about how to configure the reader.\n",
"The JSON schema the config object should adhere to can be found on Github: [https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-typeform/source_typeform/spec.json](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-typeform/source_typeform/spec.json).\n",
"\n",
"The general shape looks like this:\n",
"```python\n",
"{\n",
" \"credentials\": {\n",
" \"auth_type\": \"Private Token\",\n",
" \"access_token\": \"<your auth token>\"\n",
" },\n",
" \"start_date\": \"<date from which to start retrieving records from in ISO format, e.g. 2020-10-20T00:00:00Z>\",\n",
" \"form_ids\": [\"<id of form to load records for>\"] # if omitted, records from all forms will be loaded\n",
"}\n",
"```\n",
"\n",
"By default all fields are stored as metadata in the documents and the text is set to an empty string. Construct the text of the document by transforming the documents returned by the reader."
"loader = AirbyteTypeformLoader(config=config, stream_name=\"forms\") # check the documentation linked above for a list of all streams"
]
},
{
"cell_type": "markdown",
"id": "2cea23fc",
"metadata": {},
"source": [
"Now you can load documents the usual way"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dae75cdb",
"metadata": {},
"outputs": [],
"source": [
"docs = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "4a93dc2a",
"metadata": {},
"source": [
"As `load` returns a list, it will block until all documents are loaded. To have better control over this process, you can also you the `lazy_load` method which returns an iterator instead:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1782db09",
"metadata": {},
"outputs": [],
"source": [
"docs_iterator = loader.lazy_load()"
]
},
{
"cell_type": "markdown",
"id": "3a124086",
"metadata": {},
"source": [
"Keep in mind that by default the page content is empty and the metadata object contains all the information from the record. To create documents in a different, pass in a record_handler function when creating the loader:"
"Some streams allow incremental loading, this means the source keeps track of synced records and won't load them again. This is useful for sources that have a high volume of data and are updated frequently.\n",
"\n",
"To take advantage of this, store the `last_state` property of the loader and pass it in when creating the loader again. This will ensure that only new records are loaded."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7061e735",
"metadata": {},
"outputs": [],
"source": [
"last_state = loader.last_state # store safely\n",
">[Airbyte](https://github.com/airbytehq/airbyte) is a data integration platform for ELT pipelines from APIs, databases & files to warehouses & lakes. It has the largest catalog of ELT connectors to data warehouses and databases.\n",
"\n",
"This loader exposes the Zendesk Support connector as a document loader, allowing you to load various objects as documents."
]
},
{
"cell_type": "markdown",
"id": "6847a40c",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"id": "3b06fbde",
"metadata": {},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"id": "e3e9dc79",
"metadata": {},
"source": [
"First, you need to install the `airbyte-source-zendesk-support` python package."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4d35e4e0",
"metadata": {},
"outputs": [],
"source": [
"#!pip install airbyte-source-zendesk-support"
]
},
{
"cell_type": "markdown",
"id": "ae855210",
"metadata": {},
"source": [
"## Example"
]
},
{
"cell_type": "markdown",
"id": "02208f52",
"metadata": {},
"source": [
"Check out the [Airbyte documentation page](https://docs.airbyte.com/integrations/sources/zendesk-support/) for details about how to configure the reader.\n",
"The JSON schema the config object should adhere to can be found on Github: [https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-zendesk-support/source_zendesk_support/spec.json](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-zendesk-support/source_zendesk_support/spec.json).\n",
" \"start_date\": \"<date from which to start retrieving records from in ISO format, e.g. 2020-10-20T00:00:00Z>\",\n",
" \"credentials\": {\n",
" \"credentials\": \"api_token\",\n",
" \"email\": \"<your email>\",\n",
" \"api_token\": \"<your api token>\"\n",
" }\n",
"}\n",
"```\n",
"\n",
"By default all fields are stored as metadata in the documents and the text is set to an empty string. Construct the text of the document by transforming the documents returned by the reader."
"loader = AirbyteZendeskSupportLoader(config=config, stream_name=\"tickets\") # check the documentation linked above for a list of all streams"
]
},
{
"cell_type": "markdown",
"id": "2cea23fc",
"metadata": {},
"source": [
"Now you can load documents the usual way"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dae75cdb",
"metadata": {},
"outputs": [],
"source": [
"docs = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "4a93dc2a",
"metadata": {},
"source": [
"As `load` returns a list, it will block until all documents are loaded. To have better control over this process, you can also you the `lazy_load` method which returns an iterator instead:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1782db09",
"metadata": {},
"outputs": [],
"source": [
"docs_iterator = loader.lazy_load()"
]
},
{
"cell_type": "markdown",
"id": "3a124086",
"metadata": {},
"source": [
"Keep in mind that by default the page content is empty and the metadata object contains all the information from the record. To create documents in a different, pass in a record_handler function when creating the loader:"
"Some streams allow incremental loading, this means the source keeps track of synced records and won't load them again. This is useful for sources that have a high volume of data and are updated frequently.\n",
"\n",
"To take advantage of this, store the `last_state` property of the loader and pass it in when creating the loader again. This will ensure that only new records are loaded."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7061e735",
"metadata": {},
"outputs": [],
"source": [
"last_state = loader.last_state # store safely\n",
"If your langchain is deployed on Huawei Cloud ECS and [Agency is set up](https://support.huaweicloud.com/intl/en-us/usermanual-ecs/ecs_03_0166.html#section7), the loader can directly get the security token from ECS without needing access key and secret key. "
"If your bucket's bucket policy allows anonymous access (anonymous users have `listBucket` and `GetObject` permissions), you can directly load the objects without configuring the `config` parameter."
"## Each Loader with Separate Authentication Information\n",
"If you don't need to reuse OBS connections between different loaders, you can directly configure the `config`. The loader will use the config information to initialize its own OBS client."
"If your langchain is deployed on Huawei Cloud ECS and [Agency is set up](https://support.huaweicloud.com/intl/en-us/usermanual-ecs/ecs_03_0166.html#section7), the loader can directly get the security token from ECS without needing access key and secret key. "
"If the object you want to access allows anonymous user access (anonymous users have `GetObject` permission), you can directly load the object without configuring the `config` parameter."
"First article: page_content='In testimony to the congressional committee examining the 6 January riot, Mrs Powell said she did not review all of the many claims of election fraud she made, telling them that \"no reasonable person\" would view her claims as fact. Neither she nor her representatives have commented.' metadata={'title': 'Donald Trump indictment: What do we know about the six co-conspirators?', 'link': 'https://www.bbc.com/news/world-us-canada-66388172', 'authors': [], 'language': 'en', 'description': 'Six people accused of helping Mr Trump undermine the election have been described by prosecutors.', 'publish_date': None}\n",
"\n",
"Second article: page_content='Ms Williams added: \"If there\\'s anything that I can do in my power to ensure that dancers or singers or whoever decides to work with her don\\'t have to go through that same experience, I\\'m going to do that.\"' metadata={'title': \"Lizzo dancers Arianna Davis and Crystal Williams: 'No one speaks out, they are scared'\", 'link': 'https://www.bbc.com/news/entertainment-arts-66384971', 'authors': [], 'language': 'en', 'description': 'The US pop star is being sued for sexual harassment and fat-shaming but has yet to comment.', 'publish_date': None}\n"
]
}
],
"source": [
"loader = NewsURLLoader(urls=urls)\n",
"data = loader.load()\n",
"print(\"First article: \", data[0])\n",
"print(\"\\nSecond article: \", data[1])"
]
},
{
"cell_type": "markdown",
"source": [
"Use nlp=True to run nlp analysis and generate keywords + summary"
],
"metadata": {
"collapsed": false
},
"id": "98ac26c488315bff"
},
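{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of the call that produces the output below, assuming `NewsURLLoader` accepts an `nlp` flag as described above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"loader = NewsURLLoader(urls=urls, nlp=True)\n",
"data = loader.load()\n",
"print(\"First article: \", data[0])\n",
"print(\"\\nSecond article: \", data[1])"
]
},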
{
"cell_type": "code",
"execution_count": 4,
"id": "b68a26b3",
"metadata": {
"ExecuteTime": {
"end_time": "2023-08-02T21:18:19.585758200Z",
"start_time": "2023-08-02T21:18:19.227074500Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"First article: page_content='In testimony to the congressional committee examining the 6 January riot, Mrs Powell said she did not review all of the many claims of election fraud she made, telling them that \"no reasonable person\" would view her claims as fact. Neither she nor her representatives have commented.' metadata={'title': 'Donald Trump indictment: What do we know about the six co-conspirators?', 'link': 'https://www.bbc.com/news/world-us-canada-66388172', 'authors': [], 'language': 'en', 'description': 'Six people accused of helping Mr Trump undermine the election have been described by prosecutors.', 'publish_date': None, 'keywords': ['powell', 'know', 'donald', 'trump', 'review', 'indictment', 'telling', 'view', 'reasonable', 'person', 'testimony', 'coconspirators', 'riot', 'representatives', 'claims'], 'summary': 'In testimony to the congressional committee examining the 6 January riot, Mrs Powell said she did not review all of the many claims of election fraud she made, telling them that \"no reasonable person\" would view her claims as fact.\\nNeither she nor her representatives have commented.'}\n",
"\n",
"Second article: page_content='Ms Williams added: \"If there\\'s anything that I can do in my power to ensure that dancers or singers or whoever decides to work with her don\\'t have to go through that same experience, I\\'m going to do that.\"' metadata={'title': \"Lizzo dancers Arianna Davis and Crystal Williams: 'No one speaks out, they are scared'\", 'link': 'https://www.bbc.com/news/entertainment-arts-66384971', 'authors': [], 'language': 'en', 'description': 'The US pop star is being sued for sexual harassment and fat-shaming but has yet to comment.', 'publish_date': None, 'keywords': ['davis', 'lizzo', 'singers', 'experience', 'crystal', 'ensure', 'arianna', 'theres', 'williams', 'power', 'going', 'dancers', 'im', 'speaks', 'work', 'ms', 'scared'], 'summary': 'Ms Williams added: \"If there\\'s anything that I can do in my power to ensure that dancers or singers or whoever decides to work with her don\\'t have to go through that same experience, I\\'m going to do that.\"'}\n"
"text/plain": "'In testimony to the congressional committee examining the 6 January riot, Mrs Powell said she did not review all of the many claims of election fraud she made, telling them that \"no reasonable person\" would view her claims as fact.\\nNeither she nor her representatives have commented.'"
"[Nuclia](https://nuclia.com) automatically indexes your unstructured data from any internal and external source, providing optimized search results and generative answers. It can handle video and audio transcription, image content extraction, and document parsing.\n",
"\n",
"The Nuclia Understanding API supports the processing of unstructured data, including text, web pages, documents, and audio/video contents. It extracts all texts wherever they are (using speech-to-text or OCR when needed), it also extracts metadata, embedded files (like images in a PDF), and web links. If machine learning is enabled, it identifies entities, provides a summary of the content and generates embeddings for all the sentences.\n",
"\n",
"To use the Nuclia Understanding API, you need to have a Nuclia account. You can create one for free at [https://nuclia.cloud](https://nuclia.cloud), and then [create a NUA key](https://docs.nuclia.dev/docs/docs/using/understanding/intro)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#!pip install --upgrade protobuf\n",
"#!pip install nucliadb-protos"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"NUCLIA_ZONE\"] = \"<YOUR_ZONE>\" # e.g. europe-1\n",
"You can now call the `load` the document in a loop until you get the document."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"\n",
"pending = True\n",
"while pending:\n",
" time.sleep(15)\n",
" docs = loader.load()\n",
" if len(docs) > 0:\n",
" print(docs[0].page_content)\n",
" print(docs[0].metadata)\n",
" pending = False\n",
" else:\n",
" print(\"waiting...\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Retrieved information\n",
"\n",
"Nuclia returns the following information:\n",
"\n",
"- file metadata\n",
"- extracted text\n",
"- nested text (like text in an embedded image)\n",
"- paragraphs and sentences splitting (defined by the position of their first and last characters, plus start time and end time for a video or audio file)\n",
"- links\n",
"- a thumbnail\n",
"- embedded files\n",
"\n",
"Note:\n",
"\n",
" Generated files (thumbnail, extracted embedded files, etc.) are provided as a token. You can download them with the [`/processing/download` endpoint](https://docs.nuclia.dev/docs/api#operation/Download_binary_file_processing_download_get).\n",
"\n",
" Also at any level, if an attribute exceeds a certain size, it will be put in a downloadable file and will be replaced in the document by a file pointer. This will consist of `{\"file\": {\"uri\": \"JWT_TOKEN\"}}`. The rule is that if the size of the message is greater than 1000000 characters, the biggest parts will be moved to downloadable files. First, the compression process will target vectors. If that is not enough, it will target large field metadata, and finally it will target extracted text.\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "langchain",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.5"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}