remove from init

cr
2026-02-04 08:10:25 +00:00 · 2023-09-16 17:17:05 -07:00 · 2023-09-16 17:10:33 -07:00 · 2023-09-16 17:07:48 -07:00 · 2023-09-16 17:07:05 -07:00 · 2023-09-16 17:02:54 -07:00
1829 changed files with 129069 additions and 237189 deletions
--- a/.github/CONTRIBUTING.md
+++ b/.github/CONTRIBUTING.md
@@ -33,7 +33,7 @@ best way to get our attention.
 ### 🚩GitHub Issues

 Our [issues](https://github.com/hwchase17/langchain/issues) page is kept up to date
-with bugs, improvements, and feature requests. 
+with bugs, improvements, and feature requests.

 There is a taxonomy of labels to help with sorting and discovery of issues of interest. Please use these to help
 organize issues.
@@ -44,7 +44,7 @@ If you are adding an issue, please try to keep it focused on a single, modular b
 If two issues are related, or blocking, please link them rather than combining them.

 We will try to keep these issues as up to date as possible, though
-with the rapid rate of develop in this field some may get out of date.
+with the rapid rate of development in this field some may get out of date.
 If you notice this happening, please let us know.

 ### 🙋Getting Help
@@ -61,11 +61,11 @@ we do not want these to get in the way of getting good code into the codebase.

 > **Note:** You can run this repository locally (which is described below) or in a [development container](https://containers.dev/) (which is described in the [.devcontainer folder](https://github.com/hwchase17/langchain/tree/master/.devcontainer)).

-This project uses [Poetry](https://python-poetry.org/) as a dependency manager. Check out Poetry's [documentation on how to install it](https://python-poetry.org/docs/#installation) on your system before proceeding.
+This project uses [Poetry](https://python-poetry.org/) v1.5.1 as a dependency manager. Check out Poetry's [documentation on how to install it](https://python-poetry.org/docs/#installation) on your system before proceeding.

 ❗Note: If you use `Conda` or `Pyenv` as your environment / package manager, avoid dependency conflicts by doing the following first:
 1. *Before installing Poetry*, create and activate a new Conda env (e.g. `conda create -n langchain python=3.9`)
-2. Install Poetry (see above)
+2. Install Poetry v1.5.1 (see above)
 3. Tell Poetry to use the virtualenv python environment (`poetry config virtualenvs.prefer-active-python true`)
 4. Continue with the following steps.

@@ -73,21 +73,21 @@ There are two separate projects in this repository:
 - `langchain`: core langchain code, abstractions, and use cases
 - `langchain.experimental`: more experimental code

-Each of these has their OWN development environment. 
+Each of these has their OWN development environment.
 In order to run any of the commands below, please move into their respective directories.
 For example, to contribute to `langchain` run `cd libs/langchain` before getting started with the below.

 To install requirements:

 ```bash
-poetry install -E all
+poetry install --with test
 ```

-This will install all requirements for running the package, examples, linting, formatting, tests, and coverage. Note the `-E all` flag will install all optional dependencies necessary for integration testing.
+This will install all requirements for running the package, examples, linting, formatting, tests, and coverage.

-❗Note: If you're running Poetry 1.4.1 and receive a `WheelFileValidationError` for `debugpy` during installation, you can try either downgrading to Poetry 1.4.0 or disabling "modern installation" (`poetry config installer.modern-installation false`) and re-install requirements. See [this `debugpy` issue](https://github.com/microsoft/debugpy/issues/1246) for more details.
+❗Note: If during installation you receive a `WheelFileValidationError` for `debugpy`, please make sure you are running Poetry v1.5.1. This bug was present in older versions of Poetry (e.g. 1.4.1) and has been resolved in newer releases. If you are still seeing this bug on v1.5.1, you may also try disabling "modern installation" (`poetry config installer.modern-installation false`) and re-installing requirements. See [this `debugpy` issue](https://github.com/microsoft/debugpy/issues/1246) for more details.

-Now, you should be able to run the common tasks in the following section. To double check, run `make test`, all tests should pass. If they don't you may need to pip install additional dependencies, such as `numexpr` and `openapi_schema_pydantic`.
+Now assuming `make` and `pytest` are installed, you should be able to run the common tasks in the following section. To double check, run `make test` under `libs/langchain`, all tests should pass. If they don't, you may need to pip install additional dependencies, such as `numexpr` and `openapi_schema_pydantic`.

 ## ✅ Common Tasks

@@ -134,7 +134,7 @@ We recognize linting can be annoying - if you do not want to do it, please conta
 ### Spellcheck

 Spellchecking for this project is done via [codespell](https://github.com/codespell-project/codespell).
-Note that `codespell` finds common typos, so could have false-positive (correctly spelled but rarely used) and false-negatives (not finding misspelled) words.
+Note that `codespell` finds common typos, so it could have false-positive (correctly spelled but rarely used) and false-negatives (not finding misspelled) words.

 To check spelling for this project:

@@ -175,9 +175,9 @@ If you're adding a new dependency to Langchain, assume that it will be an option
 that most users won't have it installed.

 Users that do not have the dependency installed should be able to **import** your code without
-any side effects (no warnings, no errors, no exceptions). 
+any side effects (no warnings, no errors, no exceptions).

-To introduce the dependency to the pyproject.toml file correctly, please do the following: 
+To introduce the dependency to the pyproject.toml file correctly, please do the following:

 1. Add the dependency to the main group as an optional dependency
  ```bash
@@ -220,7 +220,7 @@ If you add new logic, please add a unit test.

 Integration tests cover logic that requires making calls to outside APIs (often integration with other services).

-**warning** Almost no tests should be integration tests. 
+**warning** Almost no tests should be integration tests.

  Tests that require making network connections make it difficult for other
  developers to test the code.
@@ -307,4 +307,3 @@ even patch releases may contain [non-backwards-compatible changes](https://semve

 If your contribution has made its way into a release, we will want to give you credit on Twitter (only if you want though)!
 If you have a Twitter account you would like us to mention, please let us know in the PR or in another manner.
-
--- a/.github/ISSUE_TEMPLATE/bug-report.yml
+++ b/.github/ISSUE_TEMPLATE/bug-report.yml
@@ -1,5 +1,5 @@
 name: "\U0001F41B Bug Report"
-description: Submit a bug report to help us improve LangChain
+description: Submit a bug report to help us improve LangChain. To report a security issue, please instead use the security option below.
 labels: ["02 Bug Report"]
 body:
  - type: markdown
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -1,28 +1,20 @@
 <!-- Thank you for contributing to LangChain!

-Replace this comment with:
-  - Description: a description of the change, 
-  - Issue: the issue # it fixes (if applicable),
-  - Dependencies: any dependencies required for this change,
-  - Tag maintainer: for a quicker response, tag the relevant maintainer (see below),
-  - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out!
+Replace this entire comment with:
+  - **Description:** a description of the change, 
+  - **Issue:** the issue # it fixes (if applicable),
+  - **Dependencies:** any dependencies required for this change,
+  - **Tag maintainer:** for a quicker response, tag the relevant maintainer (see below),
+  - **Twitter handle:** we announce bigger features on Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out!

-Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally.
+Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally.
+
+See contribution guidelines for more information on how to write/run tests, lint, etc: 
+https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

 If you're adding a new integration, please include:
  1. a test for the integration, preferably unit tests that do not rely on network access,
-  2. an example notebook showing its use.
+  2. an example notebook showing its use. It lives in `docs/extras` directory.

-Maintainer responsibilities:
-  - General / Misc / if you don't know who to tag: @baskaryan
-  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
-  - Models / Prompts: @hwchase17, @baskaryan
-  - Memory: @hwchase17
-  - Agents / Tools / Toolkits: @hinthornw
-  - Tracing / Callbacks: @agola11
-  - Async: @agola11
-
-If no one reviews your PR within a few days, feel free to @-mention the same people again.
-
-See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
+If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17.
 -->
--- a/.github/actions/poetry_setup/action.yml
+++ b/.github/actions/poetry_setup/action.yml
@@ -15,64 +15,77 @@ inputs:
    description: Poetry version
    required: true

-  install-command:
-    description: Command run for installing dependencies
-    required: false
-    default: poetry install
-
  cache-key:
    description: Cache key to use for manual handling of caching
    required: true

  working-directory:
-    description: Directory to run install-command in
-    required: false
-    default: ""
+    description: Directory whose poetry.lock file should be cached
+    required: true

 runs:
  using: composite
  steps:
    - uses: actions/setup-python@v4
-      name: Setup python $${ inputs.python-version }}
+      name: Setup python ${{ inputs.python-version }}
      with:
        python-version: ${{ inputs.python-version }}

    - uses: actions/cache@v3
-      id: cache-pip
-      name: Cache Pip ${{ inputs.python-version }}
+      id: cache-bin-poetry
+      name: Cache Poetry binary - Python ${{ inputs.python-version }}
      env:
-        SEGMENT_DOWNLOAD_TIMEOUT_MIN: "15"
+        SEGMENT_DOWNLOAD_TIMEOUT_MIN: "1"
+      with:
+        path: |
+          /opt/pipx/venvs/poetry
+        # This step caches the poetry installation, so make sure it's keyed on the poetry version as well.
+        key: bin-poetry-${{ runner.os }}-${{ runner.arch }}-py-${{ inputs.python-version }}-${{ inputs.poetry-version }}
+
+    - name: Refresh shell hashtable and fixup softlinks
+      if: steps.cache-bin-poetry.outputs.cache-hit == 'true'
+      shell: bash
+      env:
+        POETRY_VERSION: ${{ inputs.poetry-version }}
+        PYTHON_VERSION: ${{ inputs.python-version }}
+      run: |
+        set -eux
+
+        # Refresh the shell hashtable, to ensure correct `which` output.
+        hash -r
+
+        # `actions/cache@v3` doesn't always seem able to correctly unpack softlinks.
+        # Delete and recreate the softlinks pipx expects to have.
+        rm /opt/pipx/venvs/poetry/bin/python
+        cd /opt/pipx/venvs/poetry/bin
+        ln -s "$(which "python$PYTHON_VERSION")" python
+        chmod +x python
+        cd /opt/pipx_bin/
+        ln -s /opt/pipx/venvs/poetry/bin/poetry poetry
+        chmod +x poetry
+
+        # Ensure everything got set up correctly.
+        /opt/pipx/venvs/poetry/bin/python --version
+        /opt/pipx_bin/poetry --version
+
+    - name: Install poetry
+      if: steps.cache-bin-poetry.outputs.cache-hit != 'true'
+      shell: bash
+      env:
+        POETRY_VERSION: ${{ inputs.poetry-version }}
+        PYTHON_VERSION: ${{ inputs.python-version }}
+      run: pipx install "poetry==$POETRY_VERSION" --python "python$PYTHON_VERSION" --verbose
+
+    - name: Restore pip and poetry cached dependencies
+      uses: actions/cache@v3
+      env:
+        SEGMENT_DOWNLOAD_TIMEOUT_MIN: "4"
+        WORKDIR: ${{ inputs.working-directory == '' && '.' || inputs.working-directory }}
      with:
        path: |
          ~/.cache/pip
-        key: pip-${{ runner.os }}-${{ runner.arch }}-py-${{ inputs.python-version }}
-
-    - run: pipx install poetry==${{ inputs.poetry-version }} --python python${{ inputs.python-version }}
-      shell: bash
-
-    - name: Check Poetry File
-      shell: bash
-      working-directory: ${{ inputs.working-directory }}
-      run: |
-        poetry check
-
-    - name: Check lock file
-      shell: bash
-      working-directory: ${{ inputs.working-directory }}
-      run: |
-        poetry lock --check
-
-    - uses: actions/cache@v3
-      id: cache-poetry
-      env:
-        SEGMENT_DOWNLOAD_TIMEOUT_MIN: "15"
-      with:
-        path: |
          ~/.cache/pypoetry/virtualenvs
          ~/.cache/pypoetry/cache
          ~/.cache/pypoetry/artifacts
-        key: poetry-${{ runner.os }}-${{ runner.arch }}-py-${{ inputs.python-version }}-poetry-${{ inputs.poetry-version }}-${{ inputs.cache-key }}-${{ hashFiles('poetry.lock') }}
-
-    - run: ${{ inputs.install-command }}
-      working-directory: ${{ inputs.working-directory }}
-      shell: bash
+          ${{ env.WORKDIR }}/.venv
+        key: py-deps-${{ runner.os }}-${{ runner.arch }}-py-${{ inputs.python-version }}-poetry-${{ inputs.poetry-version }}-${{ inputs.cache-key }}-${{ hashFiles(format('{0}/**/poetry.lock', env.WORKDIR)) }}
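Note on the action.yml hunk above: the new `py-deps-...` cache key combines the runner OS and architecture, the Python and Poetry versions, the caller-supplied `cache-key`, and a hash of every `poetry.lock` under the working directory, so a change to any of these yields a fresh cache. The sketch below mirrors that field order in Python. It is only an approximation: `lock_digest` hashes each lock file and then hashes those hashes, which follows the documented idea behind `hashFiles` but is not guaranteed to produce the same digest as GitHub Actions. The function names and sample values are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def lock_digest(workdir: str) -> str:
    """Approximate hashFiles('<workdir>/**/poetry.lock'): hash each file, then hash the hashes."""
    outer = hashlib.sha256()
    for lock in sorted(Path(workdir).glob("**/poetry.lock")):
        outer.update(hashlib.sha256(lock.read_bytes()).digest())
    return outer.hexdigest()


def deps_cache_key(os_name, arch, python_version, poetry_version, cache_key, workdir="."):
    """Join the same fields, in the same order, as the `py-deps-...` key in the action."""
    return "-".join([
        "py-deps", os_name, arch,
        "py", python_version,
        "poetry", poetry_version,
        cache_key,
        lock_digest(workdir),
    ])
```

For example, `deps_cache_key("Linux", "X64", "3.11", "1.5.1", "core", "libs/langchain")` would start with `py-deps-Linux-X64-py-3.11-poetry-1.5.1-core-` and end with the lock-file digest, so editing `poetry.lock` changes the key while everything else stays fixed.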
--- a/.github/tools/git-restore-mtime
+++ b/.github/tools/git-restore-mtime
@@ -0,0 +1,606 @@
+#!/usr/bin/env python3
+#
+# git-restore-mtime - Change mtime of files based on commit date of last change
+#
+#    Copyright (C) 2012 Rodrigo Silva (MestreLion) <linux@rodrigosilva.com>
+#
+#    This program is free software: you can redistribute it and/or modify
+#    it under the terms of the GNU General Public License as published by
+#    the Free Software Foundation, either version 3 of the License, or
+#    (at your option) any later version.
+#
+#    This program is distributed in the hope that it will be useful,
+#    but WITHOUT ANY WARRANTY; without even the implied warranty of
+#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#    GNU General Public License for more details.
+#
+#    You should have received a copy of the GNU General Public License
+#    along with this program. See <http://www.gnu.org/licenses/gpl.html>
+#
+# Source: https://github.com/MestreLion/git-tools
+# Version: July 13, 2023 (commit hash 5f832e72453e035fccae9d63a5056918d64476a2)
+"""
+Change the modification time (mtime) of files in work tree, based on the
+date of the most recent commit that modified the file, including renames.
+
+Ignores untracked files and uncommitted deletions, additions and renames, and
+by default modifications too.
+---
+Useful prior to generating release tarballs, so each file is archived with a
+date that is similar to the date when the file was actually last modified,
+assuming the actual modification date and its commit date are close.
+"""
+
+# TODO:
+# - Add -z on git whatchanged/ls-files, so we don't deal with filename decoding
+# - When Python is bumped to 3.7, use text instead of universal_newlines on subprocess
+# - Update "Statistics for some large projects" with modern hardware and repositories.
+# - Create a README.md for git-restore-mtime alone. It deserves extensive documentation
+#   - Move Statistics there
+# - See git-extras as a good example on project structure and documentation
+
+# FIXME:
+# - When current dir is outside the worktree, e.g. using --work-tree, `git ls-files`
+#   assume any relative pathspecs are to worktree root, not the current dir. As such,
+#   relative pathspecs may not work.
+# - Renames are tricky:
+#   - R100 should not change mtime, but original name is not on filelist. Should
+#     track renames until a valid (A, M) mtime found and then set on current name.
+#   - Should set mtime for both current and original directories.
+#   - Check mode changes with unchanged blobs?
+# - Check file (A, D) for the directory mtime is not sufficient:
+#   - Renames also change dir mtime, unless rename was on a parent dir
+#   - If most recent change of all files in a dir was a Modification (M),
+#     dir might not be touched at all.
+#   - Dirs containing only subdirectories but no direct files will also
+#     not be touched. They're files' [grand]parent dir, but never their dirname().
+#   - Some solutions:
+#     - After files done, perform some dir processing for missing dirs, finding latest
+#       file (A, D, R)
+#     - Simple approach: dir mtime is the most recent child (dir or file) mtime
+#     - Use a virtual concept of "created at most at" to fill missing info, bubble up
+#       to parents and grandparents
+#   - When handling [grand]parent dirs, stay inside <pathspec>
+# - Better handling of merge commits. `-m` is plain *wrong*. `-c/--cc` is perfect, but
+#   painfully slow. First pass without merge commits is not accurate. Maybe add a new
+#   `--accurate` mode for `--cc`?
+
+if __name__ != "__main__":
+    raise ImportError("{} should not be used as a module.".format(__name__))
+
+import argparse
+import datetime
+import logging
+import os.path
+import shlex
+import signal
+import subprocess
+import sys
+import time
+
+__version__ = "2022.12+dev"
+
+# Update symlinks only if the platform supports not following them
+UPDATE_SYMLINKS = bool(os.utime in getattr(os, 'supports_follow_symlinks', []))
+
+# Call os.path.normpath() only if not in a POSIX platform (Windows)
+NORMALIZE_PATHS = (os.path.sep != '/')
+
+# How many files to process in each batch when re-trying merge commits
+STEPMISSING = 100
+
+# (Extra) keywords for the os.utime() call performed by touch()
+UTIME_KWS = {} if not UPDATE_SYMLINKS else {'follow_symlinks': False}
+
+
+# Command-line interface ######################################################
+
+def parse_args():
+    parser = argparse.ArgumentParser(
+        description=__doc__.split('\n---')[0])
+
+    group = parser.add_mutually_exclusive_group()
+    group.add_argument('--quiet', '-q', dest='loglevel',
+        action="store_const", const=logging.WARNING, default=logging.INFO,
+        help="Suppress informative messages and summary statistics.")
+    group.add_argument('--verbose', '-v', action="count", help="""
+        Print additional information for each processed file.
+        Specify twice to further increase verbosity.
+        """)
+
+    parser.add_argument('--cwd', '-C', metavar="DIRECTORY", help="""
+        Run as if %(prog)s was started in directory %(metavar)s.
+        This affects how --work-tree, --git-dir and PATHSPEC arguments are handled.
+        See 'man 1 git' or 'git --help' for more information.
+        """)
+
+    parser.add_argument('--git-dir', dest='gitdir', metavar="GITDIR", help="""
+        Path to the git repository, by default auto-discovered by searching
+        the current directory and its parents for a .git/ subdirectory.
+        """)
+
+    parser.add_argument('--work-tree', dest='workdir', metavar="WORKTREE", help="""
+        Path to the work tree root, by default the parent of GITDIR if it's
+        automatically discovered, or the current directory if GITDIR is set.
+        """)
+
+    parser.add_argument('--force', '-f', default=False, action="store_true", help="""
+        Force updating files with uncommitted modifications.
+        Untracked files and uncommitted deletions, renames and additions are
+        always ignored.
+        """)
+
+    parser.add_argument('--merge', '-m', default=False, action="store_true", help="""
+        Include merge commits.
+        Leads to more recent times and more files per commit, thus with the same
+        time, which may or may not be what you want.
+        Including merge commits may lead to fewer commits being evaluated as files
+        are found sooner, which can improve performance, sometimes substantially.
+        But as merge commits are usually huge, processing them may also take longer.
+        By default, merge commits are only used for files missing from regular commits.
+        """)
+
+    parser.add_argument('--first-parent', default=False, action="store_true", help="""
+        Consider only the first parent, the "main branch", when evaluating merge commits.
+        Only effective when merge commits are processed, either when --merge is
+        used or when finding missing files after the first regular log search.
+        See --skip-missing.
+        """)
+
+    parser.add_argument('--skip-missing', '-s', dest="missing", default=True,
+        action="store_false", help="""
+        Do not try to find missing files.
+        If merge commits were not evaluated with --merge and some files were
+        not found in regular commits, by default %(prog)s searches for these
+        files again in the merge commits.
+        This option disables this retry, so files found only in merge commits
+        will not have their timestamp updated.
+        """)
+
+    parser.add_argument('--no-directories', '-D', dest='dirs', default=True,
+        action="store_false", help="""
+        Do not update directory timestamps.
+        By default, use the time of its most recently created, renamed or deleted file.
+        Note that just modifying a file will NOT update its directory time.
+        """)
+
+    parser.add_argument('--test', '-t', default=False, action="store_true",
+        help="Test run: do not actually update any file timestamp.")
+
+    parser.add_argument('--commit-time', '-c', dest='commit_time', default=False,
+        action='store_true', help="Use commit time instead of author time.")
+
+    parser.add_argument('--oldest-time', '-o', dest='reverse_order', default=False,
+        action='store_true', help="""
+        Update times based on the oldest, instead of the most recent commit of a file.
+        This reverses the order in which the git log is processed to emulate a
+        file "creation" date. Note this will be inaccurate for files deleted and
+        re-created at later dates.
+        """)
+
+    parser.add_argument('--skip-older-than', metavar='SECONDS', type=int, help="""
+        Ignore files that are currently older than %(metavar)s.
+        Useful in workflows that assume such files already have a correct timestamp,
+        as it may improve performance by processing fewer files.
+        """)
+
+    parser.add_argument('--skip-older-than-commit', '-N', default=False,
+        action='store_true', help="""
+        Ignore files older than the timestamp it would be updated to.
+        Such files may be considered "original", likely in the author's repository.
+        """)
+
+    parser.add_argument('--unique-times', default=False, action="store_true", help="""
+        Set the microseconds to a unique value per commit.
+        Allows telling apart changes that would otherwise have identical timestamps,
+        as git's time accuracy is in seconds.
+        """)
+
+    parser.add_argument('pathspec', nargs='*', metavar='PATHSPEC', help="""
+        Only modify paths matching %(metavar)s, relative to current directory.
+        By default, update all but untracked files and submodules.
+        """)
+
+    parser.add_argument('--version', '-V', action='version',
+        version='%(prog)s version {version}'.format(version=get_version()))
+
+    args_ = parser.parse_args()
+    if args_.verbose:
+        args_.loglevel = max(logging.TRACE, logging.DEBUG // args_.verbose)
+    args_.debug = args_.loglevel <= logging.DEBUG
+    return args_
+
+
+def get_version(version=__version__):
+    if not version.endswith('+dev'):
+        return version
+    try:
+        cwd = os.path.dirname(os.path.realpath(__file__))
+        return Git(cwd=cwd, errors=False).describe().lstrip('v')
+    except Git.Error:
+        return '-'.join((version, "unknown"))
+
+
+# Helper functions ############################################################
+
+def setup_logging():
+    """Add TRACE logging level and corresponding method, return the root logger"""
+    logging.TRACE = TRACE = logging.DEBUG // 2
+    logging.Logger.trace = lambda _, m, *a, **k: _.log(TRACE, m, *a, **k)
+    return logging.getLogger()
+
+
+def normalize(path):
+    r"""Normalize paths from git, handling non-ASCII characters.
+
+    Git stores paths as UTF-8 normalization form C.
+    If path contains non-ASCII or non-printable characters, git outputs the UTF-8
+    in octal-escaped notation, escaping double-quotes and backslashes, and then
+    double-quoting the whole path.
+    https://git-scm.com/docs/git-config#Documentation/git-config.txt-corequotePath
+
+    This function reverts this encoding, so:
+    normalize(r'"Back\\slash_double\"quote_a\303\247a\303\255"') =>
+        r'Back\slash_double"quote_açaí')
+
+    Paths with invalid UTF-8 encoding, such as single 0x80-0xFF bytes (e.g, from
+    Latin1/Windows-1251 encoding) are decoded using surrogate escape, the same
+    method used by Python for filesystem paths. So 0xE6 ("æ" in Latin1, r'\\346'
+    from Git) is decoded as "\udce6". See https://peps.python.org/pep-0383/ and
+    https://vstinner.github.io/painful-history-python-filesystem-encoding.html
+
+    Also see notes on `windows/non-ascii-paths.txt` about path encodings on
+    non-UTF-8 platforms and filesystems.
+    """
+    if path and path[0] == '"':
+        # Python 2: path = path[1:-1].decode("string-escape")
+        # Python 3: https://stackoverflow.com/a/46650050/624066
+        path = (path[1:-1]                 # Remove enclosing double quotes
+                .encode('latin1')          # Convert to bytes, required by 'unicode-escape'
+                .decode('unicode-escape')  # Perform the actual octal-escaping decode
+                .encode('latin1')          # 1:1 mapping to bytes, UTF-8 encoded
+                .decode('utf8', 'surrogateescape'))  # Decode from UTF-8
+    if NORMALIZE_PATHS:
+        # Make sure the slash matches the OS; for Windows we need a backslash
+        path = os.path.normpath(path)
+    return path
+
+
+def dummy(*_args, **_kwargs):
+    """No-op function used in dry-run tests"""
+
+
+def touch(path, mtime):
+    """The actual mtime update"""
+    os.utime(path, (mtime, mtime), **UTIME_KWS)
+
+
+def touch_ns(path, mtime_ns):
+    """The actual mtime update, using nanoseconds for unique timestamps"""
+    os.utime(path, None, ns=(mtime_ns, mtime_ns), **UTIME_KWS)
+
+
+def isodate(secs: int):
+    # time.localtime() accepts floats, but discards fractional part
+    return time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(secs))
+
+
+def isodate_ns(ns: int):
+    # for integers fromtimestamp() is equivalent and ~16% slower than isodate()
+    return datetime.datetime.fromtimestamp(ns / 1000000000).isoformat(sep=' ')
+
+
+def get_mtime_ns(secs: int, idx: int):
+    # Time resolution for filesystems and functions:
+    # ext-4 and other POSIX filesystems: 1 nanosecond
+    # NTFS (Windows default): 100 nanoseconds
+    # datetime.datetime() (due to 64-bit float epoch): 1 microsecond
+    us = idx % 1000000  # 10**6
+    return 1000 * (1000000 * secs + us)
+
+
+def get_mtime_path(path):
+    return os.path.getmtime(path)
+
+
+# Git class and parse_log(), the heart of the script ##########################
+
+class Git:
+    def __init__(self, workdir=None, gitdir=None, cwd=None, errors=True):
+        self.gitcmd = ['git']
+        self.errors = errors
+        self._proc = None
+        if workdir: self.gitcmd.extend(('--work-tree', workdir))
+        if gitdir:  self.gitcmd.extend(('--git-dir',   gitdir))
+        if cwd:     self.gitcmd.extend(('-C',          cwd))
+        self.workdir, self.gitdir = self._get_repo_dirs()
+
+    def ls_files(self, paths: list = None):
+        return (normalize(_) for _ in self._run('ls-files --full-name', paths))
+
+    def ls_dirty(self, force=False):
+        return (normalize(_[3:].split(' -> ', 1)[-1])
+                for _ in self._run('status --porcelain')
+                if _[:2] != '??' and (not force or (_[0] in ('R', 'A')
+                                                    or _[1] == 'D')))
+
+    def log(self, merge=False, first_parent=False, commit_time=False,
+            reverse_order=False, paths: list = None):
+        cmd = 'whatchanged --pretty={}'.format('%ct' if commit_time else '%at')
+        if merge:         cmd += ' -m'
+        if first_parent:  cmd += ' --first-parent'
+        if reverse_order: cmd += ' --reverse'
+        return self._run(cmd, paths)
+
+    def describe(self):
+        return self._run('describe --tags', check=True)[0]
+
+    def terminate(self):
+        if self._proc is None:
+            return
+        try:
+            self._proc.terminate()
+        except OSError:
+            # Avoid errors on OpenBSD
+            pass
+
+    def _get_repo_dirs(self):
+        return (os.path.normpath(_) for _ in
+            self._run('rev-parse --show-toplevel --absolute-git-dir', check=True))
+
+    def _run(self, cmdstr: str, paths: list = None, output=True, check=False):
+        cmdlist = self.gitcmd + shlex.split(cmdstr)
+        if paths:
+            cmdlist.append('--')
+            cmdlist.extend(paths)
+        popen_args = dict(universal_newlines=True, encoding='utf8')
+        if not self.errors:
+            popen_args['stderr'] = subprocess.DEVNULL
+        log.trace("Executing: %s", ' '.join(cmdlist))
+        if not output:
+            return subprocess.call(cmdlist, **popen_args)
+        if check:
+            try:
+                stdout: str = subprocess.check_output(cmdlist, **popen_args)
+                return stdout.splitlines()
+            except subprocess.CalledProcessError as e:
+                raise self.Error(e.returncode, e.cmd, e.output, e.stderr)
+        self._proc = subprocess.Popen(cmdlist, stdout=subprocess.PIPE, **popen_args)
+        return (_.rstrip() for _ in self._proc.stdout)
+
+    def __del__(self):
+        self.terminate()
+
+    class Error(subprocess.CalledProcessError):
+        """Error from git executable"""
+
+
+def parse_log(filelist, dirlist, stats, git, merge=False, filterlist=None):
+    mtime = 0
+    datestr = isodate(0)
+    for line in git.log(
+            merge,
+            args.first_parent,
+            args.commit_time,
+            args.reverse_order,
+            filterlist
+    ):
+        stats['loglines'] += 1
+
+        # Blank line between Date and list of files
+        if not line:
+            continue
+
+        # Date line
+        if line[0] != ':':  # Faster than `not line.startswith(':')`
+            stats['commits'] += 1
+            mtime = int(line)
+            if args.unique_times:
+                mtime = get_mtime_ns(mtime, stats['commits'])
+            if args.debug:
+                datestr = isodate(mtime)
+            continue
+
+        # File line: three tokens if it describes a renaming, otherwise two
+        tokens = line.split('\t')
+
+        # Possible statuses:
+        # M: Modified (content changed)
+        # A: Added (created)
+        # D: Deleted
+        # T: Type changed: to/from regular file, symlinks, submodules
+        # R099: Renamed (moved), with % of unchanged content. 100 = pure rename
+        # Not possible in log: C=Copied, U=Unmerged, X=Unknown, B=pairing Broken
+        status = tokens[0].split(' ')[-1]
+        file = tokens[-1]
+
+        # Handles non-ASCII chars and OS path separator
+        file = normalize(file)
+
+        def do_file():
+            if args.skip_older_than_commit and get_mtime_path(file) <= mtime:
+                stats['skip'] += 1
+                return
+            if args.debug:
+                log.debug("%d\t%d\t%d\t%s\t%s",
+                          stats['loglines'], stats['commits'], stats['files'],
+                          datestr, file)
+            try:
+                touch(os.path.join(git.workdir, file), mtime)
+                stats['touches'] += 1
+            except Exception as e:
+                log.error("ERROR: %s: %s", e, file)
+                stats['errors'] += 1
+
+        def do_dir():
+            if args.debug:
+                log.debug("%d\t%d\t-\t%s\t%s",
+                          stats['loglines'], stats['commits'],
+                          datestr, "{}/".format(dirname or '.'))
+            try:
+                touch(os.path.join(git.workdir, dirname), mtime)
+                stats['dirtouches'] += 1
+            except Exception as e:
+                log.error("ERROR: %s: %s", e, dirname)
+                stats['direrrors'] += 1
+
+        if file in filelist:
+            stats['files'] -= 1
+            filelist.remove(file)
+            do_file()
+
+        if args.dirs and status in ('A', 'D'):
+            dirname = os.path.dirname(file)
+            if dirname in dirlist:
+                dirlist.remove(dirname)
+                do_dir()
+
+        # All files done?
+        if not stats['files']:
+            git.terminate()
+            return
+
+
+# Main Logic ##################################################################
+
+def main():
+    start = time.time()  # yes, Wall time. CPU time is not realistic for users.
+    stats = {_: 0 for _ in ('loglines', 'commits', 'touches', 'skip', 'errors',
+                            'dirtouches', 'direrrors')}
+
+    logging.basicConfig(level=args.loglevel, format='%(message)s')
+    log.trace("Arguments: %s", args)
+
+    # First things first: Where and Who are we?
+    if args.cwd:
+        log.debug("Changing directory: %s", args.cwd)
+        try:
+            os.chdir(args.cwd)
+        except OSError as e:
+            log.critical(e)
+            return e.errno
+    # Using both os.chdir() and `git -C` is redundant, but might prevent side effects
+    # `git -C` alone could be enough if we make sure that:
+    # - all paths, including args.pathspec, are processed by git: ls-files, rev-parse
+    # - touch() / os.utime() path argument is always prepended with git.workdir
+    try:
+        git = Git(workdir=args.workdir, gitdir=args.gitdir, cwd=args.cwd)
+    except Git.Error as e:
+        # Not in a git repository, and git already informed user on stderr. So we just...
+        return e.returncode
+
+    # Get the files managed by git and build file list to be processed
+    if UPDATE_SYMLINKS and not args.skip_older_than:
+        filelist = set(git.ls_files(args.pathspec))
+    else:
+        filelist = set()
+        for path in git.ls_files(args.pathspec):
+            fullpath = os.path.join(git.workdir, path)
+
+            # Symlink (to file, to dir or broken - git handles the same way)
+            if not UPDATE_SYMLINKS and os.path.islink(fullpath):
+                log.warning("WARNING: Skipping symlink, no OS support for updates: %s",
+                            path)
+                continue
+
+            # skip files which are older than given threshold
+            if (args.skip_older_than
+                    and start - get_mtime_path(fullpath) > args.skip_older_than):
+                continue
+
+            # Always add files relative to worktree root
+            filelist.add(path)
+
+    # If --force, silently ignore uncommitted deletions (not in the filesystem)
+    # and renames / additions (will not be found in log anyway)
+    if args.force:
+        filelist -= set(git.ls_dirty(force=True))
+    # Otherwise, ignore any dirty files
+    else:
+        dirty = set(git.ls_dirty())
+        if dirty:
+            log.warning("WARNING: Modified files in the working directory were ignored."
+                "\nTo include such files, commit your changes or use --force.")
+            filelist -= dirty
+
+    # Build dir list to be processed
+    dirlist = set(os.path.dirname(_) for _ in filelist) if args.dirs else set()
+
+    stats['totalfiles'] = stats['files'] = len(filelist)
+    log.info("{0:,} files to be processed in work dir".format(stats['totalfiles']))
+
+    if not filelist:
+        # Nothing to do. Exit silently and without errors, just like git does
+        return
+
+    # Process the log until all files are 'touched'
+    log.debug("Line #\tLog #\tF.Left\tModification Time\tFile Name")
+    parse_log(filelist, dirlist, stats, git, args.merge, args.pathspec)
+
+    # Missing files
+    if filelist:
+        # Try to find them in merge logs, if not done already
+        # (usually HUGE, thus MUCH slower!)
+        if args.missing and not args.merge:
+            filterlist = list(filelist)
+            missing = len(filterlist)
+            log.info("{0:,} files not found in log, trying merge commits".format(missing))
+            for i in range(0, missing, STEPMISSING):
+                parse_log(filelist, dirlist, stats, git,
+                          merge=True, filterlist=filterlist[i:i + STEPMISSING])
+
+        # Still missing some?
+        for file in filelist:
+            log.warning("WARNING: not found in the log: %s", file)
+
+    # Final statistics
+    # Suggestion: use git-log --before=mtime to brag about skipped log entries
+    def log_info(msg, *a, width=13):
+        ifmt = '{:%d,}'    % (width,)  # not using 'n' for consistency with ffmt
+        ffmt = '{:%d,.2f}' % (width,)
+        # %-formatting lacks a thousand separator, must pre-render with .format()
+        log.info(msg.replace('%d', ifmt).replace('%f', ffmt).format(*a))
+
+    log_info(
+        "Statistics:\n"
+        "%f seconds\n"
+        "%d log lines processed\n"
+        "%d commits evaluated",
+        time.time() - start, stats['loglines'], stats['commits'])
+
+    if args.dirs:
+        if stats['direrrors']: log_info("%d directory update errors", stats['direrrors'])
+        log_info("%d directories updated", stats['dirtouches'])
+
+    if stats['touches'] != stats['totalfiles']:
+                        log_info("%d files",              stats['totalfiles'])
+    if stats['skip']:   log_info("%d files skipped",      stats['skip'])
+    if stats['files']:  log_info("%d files missing",      stats['files'])
+    if stats['errors']: log_info("%d file update errors", stats['errors'])
+
+    log_info("%d files updated", stats['touches'])
+
+    if args.test:
+        log.info("TEST RUN - No files modified!")
+
+
+# Keep only essential, global assignments here. Any other logic must be in main()
+log = setup_logging()
+args = parse_args()
+
+# Set the actual touch() and other functions based on command-line arguments
+if args.unique_times:
+    touch = touch_ns
+    isodate = isodate_ns
+
+# Make sure this is always set last to ensure --test behaves as intended
+if args.test:
+    touch = dummy
+
+# UI done, it's showtime!
+try:
+    sys.exit(main())
+except KeyboardInterrupt:
+    log.info("\nAborting")
+    signal.signal(signal.SIGINT, signal.SIG_DFL)
+    os.kill(os.getpid(), signal.SIGINT)
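
Review note: the line classification in `parse_log` above (blank line, date line, `:`-prefixed file line) can be sketched in isolation as follows. This is a minimal sketch, not the script's own code: the function name `parse_raw_log` and the sample lines are hypothetical, and the input is assumed to be a `git log --raw`-style stream in which each commit contributes one integer timestamp line followed by tab-separated file lines.

```python
def parse_raw_log(lines):
    """Yield (mtime, status, path) for each file line of a simplified log stream."""
    mtime = 0
    for line in lines:
        # Blank line between the date line and the list of files
        if not line:
            continue
        # Date line: anything that does not start with ':' is a commit timestamp
        if line[0] != ':':
            mtime = int(line)
            continue
        # File line: three tab-separated tokens for a rename, otherwise two
        tokens = line.split('\t')
        status = tokens[0].split(' ')[-1]  # e.g. 'M', 'A', 'D', 'R100'
        yield mtime, status, tokens[-1]    # last token is the current path


# Hypothetical sample input, for illustration only
sample = [
    "1694909225",
    "",
    ":100644 100644 abc1234 def5678 M\tREADME.md",
    ":100644 100644 abc1234 def5678 R100\told.py\tnew.py",
]
records = list(parse_raw_log(sample))
```

For a rename, the last token is the destination path, which is why the script reads `tokens[-1]` rather than `tokens[1]`.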
--- a/.github/workflows/_lint.yml
+++ b/.github/workflows/_lint.yml
@@ -9,38 +9,142 @@ on:
        description: "From which folder this pipeline executes"

 env:
-  POETRY_VERSION: "1.4.2"
+  POETRY_VERSION: "1.5.1"
+  WORKDIR: ${{ inputs.working-directory == '' && '.' || inputs.working-directory }}

 jobs:
  build:
-    defaults:
-      run:
-        working-directory: ${{ inputs.working-directory }}
    runs-on: ubuntu-latest
+    env:
+      # This number is set "by eye": we want it to be big enough
+      # so that it's bigger than the number of commits in any reasonable PR,
+      # and also as small as possible since increasing the number makes
+      # the initial `git fetch` slower.
+      FETCH_DEPTH: 50
    strategy:
      matrix:
+        # Only lint on the min and max supported Python versions.
+        # It's extremely unlikely that there's a lint issue on any version in between
+        # that doesn't show up on the min or max versions.
+        #
+        # GitHub rate-limits how many jobs can be running at any one time.
+        # Starting new jobs is also relatively slow,
+        # so linting on fewer versions makes CI faster.
        python-version:
          - "3.8"
-          - "3.9"
-          - "3.10"
          - "3.11"
    steps:
      - uses: actions/checkout@v3
-      - name: Install poetry
+        with:
+          # Fetch the last FETCH_DEPTH commits, so the mtime-changing script
+          # can accurately set the mtimes of files modified in the last FETCH_DEPTH commits.
+          fetch-depth: ${{ env.FETCH_DEPTH }}
+      - name: Restore workdir file mtimes to last-edited commit date
+        id: restore-mtimes
+        # This is needed to make black caching work.
+        # Black's cache uses file (mtime, size) to check whether a lookup is a cache hit.
+        # Without this command, files in the repo would have the current time as the modified time,
+        # since the previous action step just created them.
+        # This command resets the mtime to the last time the files were modified in git instead,
+        # which is a high-quality and stable representation of the last modification date.
        run: |
-          pipx install poetry==$POETRY_VERSION
-      - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v4
+          # Important considerations:
+          # - These commands run at base of the repo, since we never `cd` to the `WORKDIR`.
+          # - We only want to alter mtimes for Python files, since that's all black checks.
+          # - We don't need to alter mtimes for directories, since black doesn't look at those.
+          # - We also only alter mtimes inside the `WORKDIR` since that's all we'll lint.
+          # - This should run before `poetry install`, because poetry's venv also contains
+          #   Python files, and we don't want to alter their mtimes since they aren't linted.
+
+          # Ensure we fail on non-zero exits and on undefined variables.
+          # Also print executed commands, for easier debugging.
+          set -eux
+
+          # Restore the mtimes of Python files in the workdir based on git history.
+          .github/tools/git-restore-mtime --no-directories "$WORKDIR/**/*.py"
+
+          # Since CI only does a partial fetch (to `FETCH_DEPTH`) for efficiency,
+          # the local git repo doesn't have full history. There are probably files
+          # that were last modified in a commit *older than* the oldest fetched commit.
+          # After `git-restore-mtime`, such files have a mtime set to the oldest fetched commit.
+          #
+          # As new commits get added, that timestamp will keep moving forward.
+          # If left unchanged, this will make `black` think that the files were edited
+          # more recently than its cache suggests. Instead, we can set their mtime
+          # to a fixed date in the far past that won't change and won't cause cache misses in black.
+          #
+          # For all workdir Python files modified in or before the oldest few fetched commits,
+          # make their mtime be 2000-01-01 00:00:00.
+          OLDEST_COMMIT="$(git log --reverse '--pretty=format:%H' | head -1)"
+          OLDEST_COMMIT_TIME="$(git show -s '--format=%ai' "$OLDEST_COMMIT")"
+          find "$WORKDIR" -name '*.py' -type f -not -newermt "$OLDEST_COMMIT_TIME" -exec touch -c -m -t '200001010000' '{}' '+'
+
+          echo "oldest-commit=$OLDEST_COMMIT" >> "$GITHUB_OUTPUT"
+
+      - name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
+        uses: "./.github/actions/poetry_setup"
        with:
          python-version: ${{ matrix.python-version }}
-          cache: poetry
-      - name: Install dependencies
+          poetry-version: ${{ env.POETRY_VERSION }}
+          working-directory: ${{ inputs.working-directory }}
+          cache-key: lint-with-extras
+
+      - name: Check Poetry File
+        shell: bash
+        working-directory: ${{ inputs.working-directory }}
        run: |
-          poetry install
+          poetry check
+
+      - name: Check lock file
+        shell: bash
+        working-directory: ${{ inputs.working-directory }}
+        run: |
+          poetry lock --check
+
+      - name: Install dependencies
+        # Also installs dev/lint/test/typing dependencies, to ensure we have
+        # type hints for as many of our libraries as possible.
+        # This helps catch errors that require dependencies to be spotted, for example:
+        # https://github.com/langchain-ai/langchain/pull/10249/files#diff-935185cd488d015f026dcd9e19616ff62863e8cde8c0bee70318d3ccbca98341
+        #
+        # If you change this configuration, make sure to change the `cache-key`
+        # in the `poetry_setup` action above to stop using the old cache.
+        # It doesn't matter how you change it, any change will cause a cache-bust.
+        working-directory: ${{ inputs.working-directory }}
+        run: |
+          poetry install --with dev,lint,test,typing
+
      - name: Install langchain editable
-        if: ${{ inputs.working-directory != 'langchain' }}
+        working-directory: ${{ inputs.working-directory }}
+        if: ${{ inputs.working-directory != 'libs/langchain' }}
        run: |
          pip install -e ../langchain
+
+      - name: Restore black cache
+        uses: actions/cache@v3
+        env:
+          CACHE_BASE: black-${{ runner.os }}-${{ runner.arch }}-py${{ matrix.python-version }}-${{ inputs.working-directory }}-${{ hashFiles(format('{0}/poetry.lock', env.WORKDIR)) }}
+          SEGMENT_DOWNLOAD_TIMEOUT_MIN: "1"
+        with:
+          path: |
+            ${{ env.WORKDIR }}/.black_cache
+          key: ${{ env.CACHE_BASE }}-${{ steps.restore-mtimes.outputs.oldest-commit }}
+          restore-keys:
+            # If we can't find an exact match for our cache key, accept any with this prefix.
+            ${{ env.CACHE_BASE }}-
+
+      - name: Get .mypy_cache to speed up mypy
+        uses: actions/cache@v3
+        env:
+          SEGMENT_DOWNLOAD_TIMEOUT_MIN: "2"
+        with:
+          path: |
+            ${{ env.WORKDIR }}/.mypy_cache
+          key: mypy-${{ runner.os }}-${{ runner.arch }}-py${{ matrix.python-version }}-${{ inputs.working-directory }}-${{ hashFiles(format('{0}/poetry.lock', env.WORKDIR)) }}
+
      - name: Analysing the code with our lint
+        working-directory: ${{ inputs.working-directory }}
+        env:
+          BLACK_CACHE_DIR: .black_cache
        run: |
          make lint
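
Review note: the `find ... -not -newermt ... -exec touch` step in this workflow pins every file that is not newer than the oldest fetched commit to a fixed date. A rough Python equivalent is sketched below, assuming the runner's time zone is UTC (so that `200001010000` corresponds to the epoch value used here). The helper name `clamp_old_mtimes` and the cutoff value are hypothetical.

```python
import os
import tempfile

FIXED_PAST = 946684800  # 2000-01-01 00:00:00 UTC (assumes the runner uses UTC)


def clamp_old_mtimes(paths, cutoff, fixed=FIXED_PAST):
    """Set mtime to `fixed` for every path whose mtime is at or before `cutoff`."""
    clamped = []
    for path in paths:
        st = os.stat(path)
        if st.st_mtime <= cutoff:  # mirrors `-not -newermt`
            os.utime(path, (st.st_atime, fixed))  # change mtime only, keep atime
            clamped.append(path)
    return clamped


with tempfile.TemporaryDirectory() as tmp:
    old = os.path.join(tmp, "old.py")
    new = os.path.join(tmp, "new.py")
    for p in (old, new):
        open(p, "w").close()

    cutoff = 1_600_000_000  # hypothetical time of the oldest fetched commit
    os.utime(old, (cutoff - 100, cutoff - 100))
    os.utime(new, (cutoff + 100, cutoff + 100))

    clamped = clamp_old_mtimes([old, new], cutoff)
    old_mtime = int(os.stat(old).st_mtime)
    new_mtime = int(os.stat(new).st_mtime)
```

Files newer than the cutoff keep the mtime that `git-restore-mtime` assigned, so only the files whose true history lies outside the shallow fetch get the stable placeholder date.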
--- a/.github/workflows/_pydantic_compatibility.yml
+++ b/.github/workflows/_pydantic_compatibility.yml
@@ -0,0 +1,93 @@
+name: pydantic v1/v2 compatibility
+
+on:
+  workflow_call:
+    inputs:
+      working-directory:
+        required: true
+        type: string
+        description: "From which folder this pipeline executes"
+
+env:
+  POETRY_VERSION: "1.5.1"
+
+jobs:
+  build:
+    defaults:
+      run:
+        working-directory: ${{ inputs.working-directory }}
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version:
+          - "3.8"
+          - "3.9"
+          - "3.10"
+          - "3.11"
+    name: Pydantic v1/v2 compatibility - Python ${{ matrix.python-version }}
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
+        uses: "./.github/actions/poetry_setup"
+        with:
+          python-version: ${{ matrix.python-version }}
+          poetry-version: ${{ env.POETRY_VERSION }}
+          working-directory: ${{ inputs.working-directory }}
+          cache-key: pydantic-cross-compat
+
+      - name: Install dependencies
+        shell: bash
+        run: poetry install
+
+      - name: Install the opposite major version of pydantic
+        # If normal tests use pydantic v1, here we'll use v2, and vice versa.
+        shell: bash
+        run: |
+          # Determine the major part of pydantic version
+          REGULAR_VERSION=$(poetry run python -c "import pydantic; print(pydantic.__version__)" | cut -d. -f1)
+
+          if [[ "$REGULAR_VERSION" == "1" ]]; then
+            PYDANTIC_DEP=">=2.1,<3"
+            TEST_WITH_VERSION="2"
+          elif [[ "$REGULAR_VERSION" == "2" ]]; then
+            PYDANTIC_DEP="<2"
+            TEST_WITH_VERSION="1"
+          else
+            echo "Unexpected pydantic major version '$REGULAR_VERSION', cannot determine which version to use for cross-compatibility test."
+            exit 1
+          fi
+
+          # Install via `pip` instead of `poetry add` to avoid changing lockfile,
+          # which would prevent caching from working: the cache would get saved
+          # to a different key than where it gets loaded from.
+          poetry run pip install "pydantic${PYDANTIC_DEP}"
+
+          # Ensure that the correct pydantic is installed now.
+          echo "Checking pydantic version... Expecting ${TEST_WITH_VERSION}"
+
+          # Determine the major part of pydantic version
+          CURRENT_VERSION=$(poetry run python -c "import pydantic; print(pydantic.__version__)" | cut -d. -f1)
+
+          # Check that the major part of pydantic version is as expected, if not
+          # raise an error
+          if [[ "$CURRENT_VERSION" != "$TEST_WITH_VERSION" ]]; then
+            echo "Error: expected pydantic version ${TEST_WITH_VERSION} to have been installed, but found: ${CURRENT_VERSION}"
+            exit 1
+          fi
+          echo "Found pydantic version ${CURRENT_VERSION}, as expected"
+      - name: Run pydantic compatibility tests
+        shell: bash
+        run: make test
+
+      - name: Ensure the tests did not create any additional files
+        shell: bash
+        run: |
+          set -eu
+
+          STATUS="$(git status)"
+          echo "$STATUS"
+
+          # grep will exit non-zero if the target message isn't found,
+          # and `set -e` above will cause the step to fail.
+          echo "$STATUS" | grep 'nothing to commit, working tree clean'
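
Review note: the version-flip logic in the "Install the opposite major version of pydantic" step can be expressed as a small pure function, which makes the three branches easy to check. This is only a sketch of the same decision table; the function name `opposite_pydantic_spec` is hypothetical and is not part of the workflow.

```python
def opposite_pydantic_spec(installed_version):
    """Return (pip version specifier, expected major) for the cross-compat run."""
    major = installed_version.split(".")[0]
    if major == "1":
        return ">=2.1,<3", "2"
    if major == "2":
        return "<2", "1"
    raise ValueError(
        f"Unexpected pydantic major version {major!r}, "
        "cannot determine which version to use for cross-compatibility test."
    )


spec_from_v1 = opposite_pydantic_spec("1.10.12")
spec_from_v2 = opposite_pydantic_spec("2.3.0")
```

The returned specifier would be appended to the package name, as in `pip install "pydantic<2"`.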
--- a/.github/workflows/_release.yml
+++ b/.github/workflows/_release.yml
@@ -9,26 +9,37 @@ on:
        description: "From which folder this pipeline executes"

 env:
-  POETRY_VERSION: "1.4.2"
+  POETRY_VERSION: "1.5.1"

 jobs:
  if_release:
-    if: |
-        ${{ github.event.pull_request.merged == true }}
-        && ${{ contains(github.event.pull_request.labels.*.name, 'release') }}
+    # Disallow publishing from branches that aren't `master`.
+    if: github.ref == 'refs/heads/master'
    runs-on: ubuntu-latest
+    permissions:
+      # This permission is used for trusted publishing:
+      # https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/
+      #
+      # Trusted publishing has to also be configured on PyPI for each package:
+      # https://docs.pypi.org/trusted-publishers/adding-a-publisher/
+      id-token: write
+
+      # This permission is needed by `ncipollo/release-action` to create the GitHub release.
+      contents: write
    defaults:
      run:
        working-directory: ${{ inputs.working-directory }}
    steps:
      - uses: actions/checkout@v3
-      - name: Install poetry
-        run: pipx install poetry==$POETRY_VERSION
-      - name: Set up Python 3.10
-        uses: actions/setup-python@v4
+
+      - name: Set up Python + Poetry ${{ env.POETRY_VERSION }}
+        uses: "./.github/actions/poetry_setup"
        with:
          python-version: "3.10"
-          cache: "poetry"
+          poetry-version: ${{ env.POETRY_VERSION }}
+          working-directory: ${{ inputs.working-directory }}
+          cache-key: release
+
      - name: Build project for distribution
        run: poetry build
      - name: Check Version
@@ -45,8 +56,9 @@ jobs:
          generateReleaseNotes: true
          tag: v${{ steps.check-version.outputs.version }}
          commit: master
-      - name: Publish to PyPI
-        env:
-          POETRY_PYPI_TOKEN_PYPI: ${{ secrets.PYPI_API_TOKEN }}
-        run: |
-          poetry publish
+      - name: Publish package distributions to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          packages-dir: ${{ inputs.working-directory }}/dist/
+          verbose: true
+          print-hash: true
--- a/.github/workflows/_test.yml
+++ b/.github/workflows/_test.yml
@@ -7,13 +7,9 @@ on:
        required: true
        type: string
        description: "From which folder this pipeline executes"
-      test_type:
-        type: string
-        description: "Test types to run"
-        default: '["core", "extended"]'

 env:
-  POETRY_VERSION: "1.4.2"
+  POETRY_VERSION: "1.5.1"

 jobs:
  build:
@@ -28,34 +24,34 @@ jobs:
          - "3.9"
          - "3.10"
          - "3.11"
-        test_type: ${{ fromJSON(inputs.test_type) }}
-    name: Python ${{ matrix.python-version }} ${{ matrix.test_type }}
+    name: Python ${{ matrix.python-version }}
    steps:
      - uses: actions/checkout@v3
-      - name: Set up Python ${{ matrix.python-version }}
+
+      - name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
        uses: "./.github/actions/poetry_setup"
        with:
          python-version: ${{ matrix.python-version }}
+          poetry-version: ${{ env.POETRY_VERSION }}
          working-directory: ${{ inputs.working-directory }}
-          poetry-version: "1.4.2"
-          cache-key: ${{ matrix.test_type }}
-          install-command: |
-              if [ "${{ matrix.test_type }}" == "core" ]; then
-                echo "Running core tests, installing dependencies with poetry..."
-                poetry install
-              else
-                echo "Running extended tests, installing dependencies with poetry..."
-                poetry install -E extended_testing
-              fi
-      - name: Install langchain editable
-        if: ${{ inputs.working-directory != 'langchain' }}
-        run: |
-          pip install -e ../langchain
-      - name: Run ${{matrix.test_type}} tests
-        run: |
-          if [ "${{ matrix.test_type }}" == "core" ]; then
-            make test
-          else
-            make extended_tests
-          fi
+          cache-key: core
+
+      - name: Install dependencies
        shell: bash
+        run: poetry install
+
+      - name: Run core tests
+        shell: bash
+        run: make test
+
+      - name: Ensure the tests did not create any additional files
+        shell: bash
+        run: |
+          set -eu
+
+          STATUS="$(git status)"
+          echo "$STATUS"
+
+          # grep will exit non-zero if the target message isn't found,
+          # and `set -e` above will cause the step to fail.
+          echo "$STATUS" | grep 'nothing to commit, working tree clean'
--- a/.github/workflows/imports.yml
+++ b/.github/workflows/imports.yml
@@ -0,0 +1,23 @@
+---
+name: Imports
+
+on:
+  push:
+    branches: [master]
+  pull_request:
+    branches: [master]
+
+jobs:
+  check:
+    runs-on: ubuntu-latest
+
+    steps:
+    - name: Checkout repository
+      uses: actions/checkout@v2
+
+    - name: Run import check
+      run: |
+        # We should not encourage imports directly from main init file
+        # Except for __version__ and hub
+        # And of course except for this file
+        git grep 'from langchain import' | grep -vE 'from langchain import (__version__|hub)' | grep -v '.github/workflows/imports.yml' && exit 1 || exit 0
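
Review note: the filter that the grep pipeline in this step applies can be sketched in Python as below. This is an illustration of the matching rule only, not a replacement for the workflow. The helper name `offending_lines` and the sample lines are hypothetical.

```python
import re

# Imports that remain allowed from the top-level package
ALLOWED = re.compile(r"from langchain import (__version__|hub)")


def offending_lines(lines):
    """Return lines that import from the main init file, minus the allowed ones."""
    return [
        line
        for line in lines
        if "from langchain import" in line and not ALLOWED.search(line)
    ]


# Hypothetical sample lines, for illustration only
sample = [
    "from langchain import __version__",
    "from langchain import hub",
    "from langchain import OpenAI",
    "from langchain.llms import OpenAI",
]
bad = offending_lines(sample)
```

A non-empty result corresponds to the `exit 1` branch of the shell command.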
--- a/.github/workflows/langchain_ci.yml
+++ b/.github/workflows/langchain_ci.yml
@@ -6,12 +6,29 @@ on:
    branches: [ master ]
  pull_request:
    paths:
+      - '.github/actions/poetry_setup/action.yml'
+      - '.github/tools/**'
      - '.github/workflows/_lint.yml'
      - '.github/workflows/_test.yml'
+      - '.github/workflows/_pydantic_compatibility.yml'
      - '.github/workflows/langchain_ci.yml'
      - 'libs/langchain/**'
  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI

+# If another push to the same PR or branch happens while this workflow is still running,
+# cancel the earlier run in favor of the next run.
+#
+# There's no point in testing an outdated version of the code. GitHub only allows
+# a limited number of job runners to be active at the same time, so it's better to cancel
+# pointless jobs early so that more useful jobs can run sooner.
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+env:
+  POETRY_VERSION: "1.5.1"
+  WORKDIR: "libs/langchain"
+
 jobs:
  lint:
    uses:
@@ -19,9 +36,62 @@ jobs:
    with:
      working-directory: libs/langchain
    secrets: inherit
+
  test:
    uses:
      ./.github/workflows/_test.yml
    with:
      working-directory: libs/langchain
-    secrets: inherit
+    secrets: inherit
+
+  pydantic-compatibility:
+    uses:
+      ./.github/workflows/_pydantic_compatibility.yml
+    with:
+      working-directory: libs/langchain
+    secrets: inherit
+
+  extended-tests:
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        working-directory: ${{ env.WORKDIR }}
+    strategy:
+      matrix:
+        python-version:
+          - "3.8"
+          - "3.9"
+          - "3.10"
+          - "3.11"
+    name: Python ${{ matrix.python-version }} extended tests
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
+        uses: "./.github/actions/poetry_setup"
+        with:
+          python-version: ${{ matrix.python-version }}
+          poetry-version: ${{ env.POETRY_VERSION }}
+          working-directory: libs/langchain
+          cache-key: extended
+
+      - name: Install dependencies
+        shell: bash
+        run: |
+          echo "Running extended tests, installing dependencies with poetry..."
+          poetry install -E extended_testing
+
+      - name: Run extended tests
+        run: make extended_tests
+
+      - name: Ensure the tests did not create any additional files
+        shell: bash
+        run: |
+          set -eu
+
+          STATUS="$(git status)"
+          echo "$STATUS"
+
+          # grep will exit non-zero if the target message isn't found,
+          # and `set -e` above will cause the step to fail.
+          echo "$STATUS" | grep 'nothing to commit, working tree clean'
--- a/.github/workflows/langchain_experimental_ci.yml
+++ b/.github/workflows/langchain_experimental_ci.yml
@@ -1,11 +1,13 @@
 ---
-name: libs/langchain-experimental CI
+name: libs/experimental CI

 on:
  push:
    branches: [ master ]
  pull_request:
    paths:
+      - '.github/actions/poetry_setup/action.yml'
+      - '.github/tools/**'
      - '.github/workflows/_lint.yml'
      - '.github/workflows/_test.yml'
      - '.github/workflows/langchain_experimental_ci.yml'
@@ -13,6 +15,20 @@ on:
      - 'libs/experimental/**'
  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI

+# If another push to the same PR or branch happens while this workflow is still running,
+# cancel the earlier run in favor of the next run.
+#
+# There's no point in testing an outdated version of the code. GitHub only allows
+# a limited number of job runners to be active at the same time, so it's better to cancel
+# pointless jobs early so that more useful jobs can run sooner.
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+env:
+  POETRY_VERSION: "1.5.1"
+  WORKDIR: "libs/experimental"
+
 jobs:
  lint:
    uses:
@@ -20,10 +36,94 @@ jobs:
    with:
      working-directory: libs/experimental
    secrets: inherit
+
  test:
    uses:
      ./.github/workflows/_test.yml
    with:
      working-directory: libs/experimental
-      test_type: '["core"]'
-    secrets: inherit
+    secrets: inherit
+
+  # It's possible that langchain-experimental works fine with the latest *published* langchain,
+  # but is broken with the langchain on `master`.
+  #
+  # We want to catch situations like that *before* releasing a new langchain, hence this test.
+  test-with-latest-langchain:
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        working-directory: ${{ env.WORKDIR }}
+    strategy:
+      matrix:
+        python-version:
+          - "3.8"
+          - "3.9"
+          - "3.10"
+          - "3.11"
+    name: test with unpublished langchain - Python ${{ matrix.python-version }}
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
+        uses: "./.github/actions/poetry_setup"
+        with:
+          python-version: ${{ matrix.python-version }}
+          poetry-version: ${{ env.POETRY_VERSION }}
+          working-directory: ${{ env.WORKDIR }}
+          cache-key: unpublished-langchain
+
+      - name: Install dependencies
+        shell: bash
+        run: |
+          echo "Running tests with unpublished langchain, installing dependencies with poetry..."
+          poetry install
+
+          echo "Editably installing langchain outside of poetry, to avoid messing up lockfile..."
+          poetry run pip install -e ../langchain
+
+      - name: Run tests
+        run: make test
+  extended-tests:
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        working-directory: ${{ env.WORKDIR }}
+    strategy:
+      matrix:
+        python-version:
+          - "3.8"
+          - "3.9"
+          - "3.10"
+          - "3.11"
+    name: Python ${{ matrix.python-version }} extended tests
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
+        uses: "./.github/actions/poetry_setup"
+        with:
+          python-version: ${{ matrix.python-version }}
+          poetry-version: ${{ env.POETRY_VERSION }}
+          working-directory: libs/experimental
+          cache-key: extended
+
+      - name: Install dependencies
+        shell: bash
+        run: |
+          echo "Running extended tests, installing dependencies with poetry..."
+          poetry install -E extended_testing
+
+      - name: Run extended tests
+        run: make extended_tests
+
+      - name: Ensure the tests did not create any additional files
+        shell: bash
+        run: |
+          set -eu
+
+          STATUS="$(git status)"
+          echo "$STATUS"
+
+          # grep will exit non-zero if the target message isn't found,
+          # and `set -e` above will cause the step to fail.
+          echo "$STATUS" | grep 'nothing to commit, working tree clean'
--- a/.github/workflows/langchain_experimental_release.yml
+++ b/.github/workflows/langchain_experimental_release.yml
@@ -1,14 +1,7 @@
 ---
-name: libs/langchain-experimental Release
+name: libs/experimental Release

 on:
-  pull_request:
-    types:
-      - closed
-    branches:
-      - master
-    paths:
-      - 'libs/experimental/pyproject.toml'
  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI

 jobs:
@@ -17,4 +10,4 @@ jobs:
      ./.github/workflows/_release.yml
    with:
      working-directory: libs/experimental
-    secrets: inherit
+    secrets: inherit
--- a/.github/workflows/langchain_release.yml
+++ b/.github/workflows/langchain_release.yml
@@ -2,13 +2,6 @@
 name: libs/langchain Release

 on:
-  pull_request:
-    types:
-      - closed
-    branches:
-      - master
-    paths:
-      - 'libs/langchain/pyproject.toml'
  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI

 jobs:
@@ -17,4 +10,4 @@ jobs:
      ./.github/workflows/_release.yml
    with:
      working-directory: libs/langchain
-    secrets: inherit
+    secrets: inherit
--- a/.github/workflows/scheduled_test.yml
+++ b/.github/workflows/scheduled_test.yml
@@ -0,0 +1,61 @@
+name: Scheduled tests
+
+on:
+  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI
+  schedule:
+    - cron:  '0 13 * * *'
+
+env:
+  POETRY_VERSION: "1.5.1"
+
+jobs:
+  build:
+    defaults:
+      run:
+        working-directory: libs/langchain
+    runs-on: ubuntu-latest
+    environment: Scheduled testing
+    strategy:
+      matrix:
+        python-version:
+          - "3.8"
+          - "3.9"
+          - "3.10"
+          - "3.11"
+    name: Python ${{ matrix.python-version }}
+    steps:
+      - uses: actions/checkout@v3
+
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: "./.github/actions/poetry_setup"
+        with:
+          python-version: ${{ matrix.python-version }}
+          poetry-version: ${{ env.POETRY_VERSION }}
+          working-directory: libs/langchain
+          cache-key: scheduled
+
+      - name: Install dependencies
+        working-directory: libs/langchain
+        shell: bash
+        run: |
+          echo "Running scheduled tests, installing dependencies with poetry..."
+          poetry install --with=test_integration
+
+      - name: Run tests
+        shell: bash
+        env:
+          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+        run: |
+          make scheduled_tests
+
+      - name: Ensure the tests did not create any additional files
+        shell: bash
+        run: |
+          set -eu
+
+          STATUS="$(git status)"
+          echo "$STATUS"
+
+          # grep will exit non-zero if the target message isn't found,
+          # and `set -e` above will cause the step to fail.
+          echo "$STATUS" | grep 'nothing to commit, working tree clean'
--- a/7
+++ b/7
@@ -43,7 +43,12 @@ spell_fix:

 help:
 	@echo '----'
-	@echo 'coverage                     - run unit tests and generate coverage report'
+	@echo 'clean                        - run docs_clean and api_docs_clean'
 	@echo 'docs_build                   - build the documentation'
 	@echo 'docs_clean                   - clean the documentation build artifacts'
 	@echo 'docs_linkcheck               - run linkchecker on the documentation'
+	@echo 'api_docs_build               - build the API Reference documentation'
+	@echo 'api_docs_clean               - clean the API Reference documentation build artifacts'
+	@echo 'api_docs_linkcheck           - run linkchecker on the API Reference documentation'
+	@echo 'spell_check                  - run codespell on the project'
+	@echo 'spell_fix                    - run codespell on the project and fix the errors'
--- a/README.md
+++ b/README.md
@@ -2,18 +2,18 @@

 ⚡ Building applications with LLMs through composability ⚡

-[![Release Notes](https://img.shields.io/github/release/hwchase17/langchain)](https://github.com/hwchase17/langchain/releases)
-[![CI](https://github.com/hwchase17/langchain/actions/workflows/langchain_ci.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/langchain_ci.yml)
-[![Experimental CI](https://github.com/hwchase17/langchain/actions/workflows/langchain_experimental_ci.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/langchain_experimental_ci.yml)
+[![Release Notes](https://img.shields.io/github/release/langchain-ai/langchain)](https://github.com/langchain-ai/langchain/releases)
+[![CI](https://github.com/langchain-ai/langchain/actions/workflows/langchain_ci.yml/badge.svg)](https://github.com/langchain-ai/langchain/actions/workflows/langchain_ci.yml)
+[![Experimental CI](https://github.com/langchain-ai/langchain/actions/workflows/langchain_experimental_ci.yml/badge.svg)](https://github.com/langchain-ai/langchain/actions/workflows/langchain_experimental_ci.yml)
 [![Downloads](https://static.pepy.tech/badge/langchain/month)](https://pepy.tech/project/langchain)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 [![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)
 [![](https://dcbadge.vercel.app/api/server/6adMQxSpJS?compact=true&style=flat)](https://discord.gg/6adMQxSpJS)
-[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)
-[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hwchase17/langchain)
-[![GitHub star chart](https://img.shields.io/github/stars/hwchase17/langchain?style=social)](https://star-history.com/#hwchase17/langchain)
+[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
+[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/langchain-ai/langchain)
+[![GitHub star chart](https://img.shields.io/github/stars/langchain-ai/langchain?style=social)](https://star-history.com/#langchain-ai/langchain)
 [![Dependency Status](https://img.shields.io/librariesio/github/langchain-ai/langchain)](https://libraries.io/github/langchain-ai/langchain)
-[![Open Issues](https://img.shields.io/github/issues-raw/hwchase17/langchain)](https://github.com/hwchase17/langchain/issues)
+[![Open Issues](https://img.shields.io/github/issues-raw/langchain-ai/langchain)](https://github.com/langchain-ai/langchain/issues)


 Looking for the JS/TS version? Check out [LangChain.js](https://github.com/hwchase17/langchainjs).
@@ -21,7 +21,7 @@ Looking for the JS/TS version? Check out [LangChain.js](https://github.com/hwcha
 **Production Support:** As you move your LangChains into production, we'd love to offer more hands-on support.
 Fill out [this form](https://airtable.com/appwQzlErAS2qiP0L/shrGtGaVBVAz7NcV2) to share more about what you're building, and our team will get in touch.

-## 🚨Breaking Changes for select chains (SQLDatabase) on 7/28
+## 🚨Breaking Changes for select chains (SQLDatabase) on 7/28/23

 In an effort to make `langchain` leaner and safer, we are moving select chains to `langchain_experimental`.
 This migration has already started, but we are remaining backwards compatible until 7/28.
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -0,0 +1,6 @@
+# Security Policy
+
+## Reporting a Vulnerability
+
+Please report security vulnerabilities by email to `security@langchain.dev`.
+This email is an alias to a subset of our maintainers, and will ensure the issue is promptly triaged and acted upon as needed.
--- a/docs/api_reference/conf.py
+++ b/docs/api_reference/conf.py
@@ -100,6 +100,9 @@ extensions = [
 ]
 source_suffix = [".rst"]

+# Some autodoc pydantic options are repeated in the actual template.
+# This may be user error, but there may also be bugs in the sphinx extension
+# where options are not passed through correctly from either location.
 autodoc_pydantic_model_show_json = False
 autodoc_pydantic_field_list_validators = False
 autodoc_pydantic_config_members = False
@@ -112,13 +115,6 @@ autodoc_member_order = "groupwise"
 autoclass_content = "both"
 autodoc_typehints_format = "short"

-autodoc_default_options = {
-    "members": True,
-    "show-inheritance": True,
-    "inherited-members": "BaseModel",
-    "undoc-members": True,
-    "special-members": "__call__",
-}
 # autodoc_typehints = "description"
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ["templates"]
@@ -160,7 +156,7 @@ html_context = {
 html_static_path = ["_static"]

 # These paths are either relative to html_static_path
-# or fully qualified paths (eg. https://...)
+# or fully qualified paths (e.g. https://...)
 html_css_files = [
    "css/custom.css",
 ]
--- a/docs/api_reference/create_api_rst.py
+++ b/docs/api_reference/create_api_rst.py
@@ -1,49 +1,216 @@
-"""Script for auto-generating api_reference.rst"""
-import glob
-import re
+"""Script for auto-generating api_reference.rst."""
+import importlib
+import inspect
+import typing
 from pathlib import Path
+from typing import TypedDict, Sequence, List, Dict, Literal, Union
+from enum import Enum
+
+from pydantic import BaseModel

 ROOT_DIR = Path(__file__).parents[2].absolute()
+HERE = Path(__file__).parent
+
 PKG_DIR = ROOT_DIR / "libs" / "langchain" / "langchain"
 EXP_DIR = ROOT_DIR / "libs" / "experimental" / "langchain_experimental"
-WRITE_FILE = Path(__file__).parent / "api_reference.rst"
-EXP_WRITE_FILE = Path(__file__).parent / "experimental_api_reference.rst"
+WRITE_FILE = HERE / "api_reference.rst"
+EXP_WRITE_FILE = HERE / "experimental_api_reference.rst"


-def load_members(dir: Path) -> dict:
-    members: dict = {}
-    for py in glob.glob(str(dir) + "/**/*.py", recursive=True):
-        module = py[len(str(dir)) + 1 :].replace(".py", "").replace("/", ".")
-        top_level = module.split(".")[0]
-        if top_level not in members:
-            members[top_level] = {"classes": [], "functions": []}
-        with open(py, "r") as f:
-            for line in f.readlines():
-                cls = re.findall(r"^class ([^_].*)\(", line)
-                members[top_level]["classes"].extend([module + "." + c for c in cls])
-                func = re.findall(r"^def ([^_].*)\(", line)
-                afunc = re.findall(r"^async def ([^_].*)\(", line)
-                func_strings = [module + "." + f for f in func + afunc]
-                members[top_level]["functions"].extend(func_strings)
-    return members
+ClassKind = Literal["TypedDict", "Regular", "Pydantic", "enum"]


-def construct_doc(pkg: str, members: dict) -> str:
+class ClassInfo(TypedDict):
+    """Information about a class."""
+
+    name: str
+    """The name of the class."""
+    qualified_name: str
+    """The fully qualified name of the class."""
+    kind: ClassKind
+    """The kind of the class."""
+    is_public: bool
+    """Whether the class is public or not."""
+
+
+class FunctionInfo(TypedDict):
+    """Information about a function."""
+
+    name: str
+    """The name of the function."""
+    qualified_name: str
+    """The fully qualified name of the function."""
+    is_public: bool
+    """Whether the function is public or not."""
+
+
+class ModuleMembers(TypedDict):
+    """A dictionary of module members."""
+
+    classes_: Sequence[ClassInfo]
+    functions: Sequence[FunctionInfo]
+
+
+def _load_module_members(module_path: str, namespace: str) -> ModuleMembers:
+    """Load all members of a module.
+
+    Args:
+        module_path: Path to the module.
+        namespace: the namespace of the module.
+
+    Returns:
+        ModuleMembers: The classes and functions defined in the module.
+    """
+    classes_: List[ClassInfo] = []
+    functions: List[FunctionInfo] = []
+    module = importlib.import_module(module_path)
+    for name, type_ in inspect.getmembers(module):
+        if not hasattr(type_, "__module__"):
+            continue
+        if type_.__module__ != module_path:
+            continue
+
+        if inspect.isclass(type_):
+            if type(type_) == typing._TypedDictMeta:  # type: ignore
+                kind: ClassKind = "TypedDict"
+            elif issubclass(type_, Enum):
+                kind = "enum"
+            elif issubclass(type_, BaseModel):
+                kind = "Pydantic"
+            else:
+                kind = "Regular"
+
+            classes_.append(
+                ClassInfo(
+                    name=name,
+                    qualified_name=f"{namespace}.{name}",
+                    kind=kind,
+                    is_public=not name.startswith("_"),
+                )
+            )
+        elif inspect.isfunction(type_):
+            functions.append(
+                FunctionInfo(
+                    name=name,
+                    qualified_name=f"{namespace}.{name}",
+                    is_public=not name.startswith("_"),
+                )
+            )
+        else:
+            continue
+
+    return ModuleMembers(
+        classes_=classes_,
+        functions=functions,
+    )
+
+
+def _merge_module_members(
+    module_members: Sequence[ModuleMembers],
+) -> ModuleMembers:
+    """Merge module members."""
+    classes_: List[ClassInfo] = []
+    functions: List[FunctionInfo] = []
+    for module in module_members:
+        classes_.extend(module["classes_"])
+        functions.extend(module["functions"])
+
+    return ModuleMembers(
+        classes_=classes_,
+        functions=functions,
+    )
+
+
+def _load_package_modules(
+    package_directory: Union[str, Path]
+) -> Dict[str, ModuleMembers]:
+    """Recursively load modules of a package based on the file system.
+
+    Traversal based on the file system makes it easy to determine which
+    of the modules/packages are part of the package vs. 3rd party or built-in.
+
+    Args:
+        package_directory: Path to the package directory.
+
+    Returns:
+        dict: The members of each module, merged and keyed by top-level namespace.
+    """
+    package_path = (
+        Path(package_directory)
+        if isinstance(package_directory, str)
+        else package_directory
+    )
+    modules_by_namespace = {}
+
+    package_name = package_path.name
+
+    for file_path in package_path.rglob("*.py"):
+        if file_path.name.startswith("_"):
+            continue
+
+        relative_module_name = file_path.relative_to(package_path)
+
+        # Skip if any module part starts with an underscore
+        if any(part.startswith("_") for part in relative_module_name.parts):
+            continue
+
+        # Get the full namespace of the module
+        namespace = ".".join(relative_module_name.with_suffix("").parts)
+        # Keep only the top level namespace
+        top_namespace = namespace.split(".")[0]
+
+        try:
+            module_members = _load_module_members(
+                f"{package_name}.{namespace}", namespace
+            )
+            # Merge module members if the namespace already exists
+            if top_namespace in modules_by_namespace:
+                existing_module_members = modules_by_namespace[top_namespace]
+                _module_members = _merge_module_members(
+                    [existing_module_members, module_members]
+                )
+            else:
+                _module_members = module_members
+
+            modules_by_namespace[top_namespace] = _module_members
+
+        except ImportError as e:
+            print(f"Error: Unable to import module '{namespace}' with error: {e}")
+
+    return modules_by_namespace
+
+
+def _construct_doc(pkg: str, members_by_namespace: Dict[str, ModuleMembers]) -> str:
+    """Construct the contents of the reference.rst file for the given package.
+
+    Args:
+        pkg: The package name
+        members_by_namespace: The members of the package, keyed by top-level
+                              module; each value holds the classes and functions
+                              found inside that top-level namespace.
+
+    Returns:
+        The contents of the reference.rst file.
+    """
    full_doc = f"""\
-=============
+=======================
 ``{pkg}`` API Reference
-=============
+=======================

 """
-    for module, _members in sorted(members.items(), key=lambda kv: kv[0]):
-        classes = _members["classes"]
+    namespaces = sorted(members_by_namespace)
+
+    for module in namespaces:
+        _members = members_by_namespace[module]
+        classes = _members["classes_"]
        functions = _members["functions"]
        if not (classes or functions):
            continue
        section = f":mod:`{pkg}.{module}`"
+        underline = "=" * (len(section) + 1)
        full_doc += f"""\
 {section}
-{'=' * (len(section) + 1)}
+{underline}

 .. automodule:: {pkg}.{module}
    :no-members:
@@ -52,7 +219,6 @@ def construct_doc(pkg: str, members: dict) -> str:
 """

        if classes:
-            cstring = "\n    ".join(sorted(classes))
            full_doc += f"""\
 Classes
 --------------
@@ -60,13 +226,31 @@ Classes

 .. autosummary::
    :toctree: {module}
-    :template: class.rst
-
-    {cstring}
-
 """
+
+            for class_ in sorted(classes, key=lambda c: c["qualified_name"]):
+                if not class_["is_public"]:
+                    continue
+
+                if class_["kind"] == "TypedDict":
+                    template = "typeddict.rst"
+                elif class_["kind"] == "enum":
+                    template = "enum.rst"
+                elif class_["kind"] == "Pydantic":
+                    template = "pydantic.rst"
+                else:
+                    template = "class.rst"
+
+                full_doc += f"""\
+    :template: {template}
+    
+    {class_["qualified_name"]}
+    
+"""
+
        if functions:
-            fstring = "\n    ".join(sorted(functions))
+            _functions = [f["qualified_name"] for f in functions if f["is_public"]]
+            fstring = "\n    ".join(sorted(_functions))
            full_doc += f"""\
 Functions
 --------------
@@ -83,12 +267,15 @@ Functions


 def main() -> None:
-    lc_members = load_members(PKG_DIR)
-    lc_doc = ".. _api_reference:\n\n" + construct_doc("langchain", lc_members)
+    """Generate the reference.rst file for each package."""
+    lc_members = _load_package_modules(PKG_DIR)
+    lc_doc = ".. _api_reference:\n\n" + _construct_doc("langchain", lc_members)
    with open(WRITE_FILE, "w") as f:
        f.write(lc_doc)
-    exp_members = load_members(EXP_DIR)
-    exp_doc = ".. _experimental_api_reference:\n\n" + construct_doc("langchain_experimental", exp_members)
+    exp_members = _load_package_modules(EXP_DIR)
+    exp_doc = ".. _experimental_api_reference:\n\n" + _construct_doc(
+        "langchain_experimental", exp_members
+    )
    with open(EXP_WRITE_FILE, "w") as f:
        f.write(exp_doc)

--- a/docs/api_reference/guide_imports.json
+++ b/docs/api_reference/guide_imports.json
--- a/docs/api_reference/requirements.txt
+++ b/docs/api_reference/requirements.txt
@@ -1,4 +1,6 @@
 -e libs/langchain
+-e libs/experimental
+pydantic<2
 autodoc_pydantic==1.8.0
 myst_parser
 nbsphinx==0.8.9
@@ -10,4 +12,4 @@ sphinx-panels
 toml
 myst_nb
 sphinx_copybutton
-pydata-sphinx-theme==0.13.1
+pydata-sphinx-theme==0.13.1
--- a/docs/api_reference/templates/class.rst
+++ b/docs/api_reference/templates/class.rst
@@ -5,17 +5,6 @@

 .. autoclass:: {{ objname }}

-   {% block methods %}
-   {% if methods %}
-   .. rubric:: {{ _('Methods') }}
-
-   .. autosummary::
-   {% for item in methods %}
-      ~{{ name }}.{{ item }}
-   {%- endfor %}
-   {% endif %}
-   {% endblock %}
-
   {% block attributes %}
   {% if attributes %}
   .. rubric:: {{ _('Attributes') }}
@@ -27,4 +16,21 @@
   {% endif %}
   {% endblock %}

+   {% block methods %}
+   {% if methods %}
+   .. rubric:: {{ _('Methods') }}
+
+   .. autosummary::
+   {% for item in methods %}
+      ~{{ name }}.{{ item }}
+   {%- endfor %}
+
+   {% for item in methods %}
+   .. automethod:: {{ name }}.{{ item }}
+   {%- endfor %}
+
+   {% endif %}
+   {% endblock %}
+
+
 .. example_links:: {{ objname }}
--- a/docs/api_reference/templates/enum.rst
+++ b/docs/api_reference/templates/enum.rst
@@ -0,0 +1,14 @@
+:mod:`{{module}}`.{{objname}}
+{{ underline }}==============
+
+.. currentmodule:: {{ module }}
+
+.. autoclass:: {{ objname }}
+
+    {% block attributes %}
+    {% for item in attributes %}
+    .. autoattribute:: {{ item }}
+    {% endfor %}
+    {% endblock %}
+
+.. example_links:: {{ objname }}
--- a/docs/api_reference/templates/pydantic.rst
+++ b/docs/api_reference/templates/pydantic.rst
@@ -0,0 +1,22 @@
+:mod:`{{module}}`.{{objname}}
+{{ underline }}==============
+
+.. currentmodule:: {{ module }}
+
+.. autopydantic_model:: {{ objname }}
+    :model-show-json: False
+    :model-show-config-summary: False
+    :model-show-validator-members: False
+    :model-show-field-summary: False
+    :field-signature-prefix: param
+    :members:
+    :undoc-members:
+    :inherited-members:
+    :member-order: groupwise
+    :show-inheritance: True
+    :special-members: __call__
+
+    {% block attributes %}
+    {% endblock %}
+
+.. example_links:: {{ objname }}
--- a/docs/api_reference/templates/redirects.html
+++ b/docs/api_reference/templates/redirects.html
@@ -5,9 +5,10 @@
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="Refresh" content="0; url={{ redirect }}" />
-    <meta name="Description" content="scikit-learn: machine learning in Python">
+    <meta name="robots" content="follow, index">
+    <meta name="Description" content="Python API reference for LangChain.">
    <link rel="canonical" href="{{ redirect }}" />
-    <title>scikit-learn: machine learning in Python</title>
+    <title>LangChain Python API Reference Documentation.</title>
  </head>
  <body>
    <p>You will be automatically redirected to the <a href="{{ redirect }}">new location of this page</a>.</p>
--- a/docs/api_reference/templates/typeddict.rst
+++ b/docs/api_reference/templates/typeddict.rst
@@ -0,0 +1,14 @@
+:mod:`{{module}}`.{{objname}}
+{{ underline }}==============
+
+.. currentmodule:: {{ module }}
+
+.. autoclass:: {{ objname }}
+
+    {% block attributes %}
+    {% for item in attributes %}
+    .. autoattribute:: {{ item }}
+    {% endfor %}
+    {% endblock %}
+
+.. example_links:: {{ objname }}
--- a/docs/api_reference/themes/scikit-learn-modern/layout.html
+++ b/docs/api_reference/themes/scikit-learn-modern/layout.html
@@ -19,7 +19,7 @@
  {% block htmltitle %}
  <title>{{ title|striptags|e }}{{ titlesuffix }}</title>
  {% endblock %}
-  <link rel="canonical" href="http://scikit-learn.org/stable/{{pagename}}.html" />
+  <link rel="canonical" href="https://api.python.langchain.com/en/latest/{{pagename}}.html" />

  {% if favicon_url %}
  <link rel="shortcut icon" href="{{ favicon_url|e }}"/>
--- a/docs/api_reference/themes/scikit-learn-modern/nav.html
+++ b/docs/api_reference/themes/scikit-learn-modern/nav.html
@@ -6,17 +6,6 @@
  {%- set top_container_cls = "sk-landing-container" %}
 {%- endif %}

-{% if theme_link_to_live_contributing_page|tobool %}
-{# Link to development page for live builds #}
-  {%- set development_link = "https://scikit-learn.org/dev/developers/index.html" %}
-{# Open on a new development page in new window/tab for live builds #}
-  {%- set development_attrs = 'target="_blank" rel="noopener noreferrer"' %}
-{%- else %}
-  {%- set development_link = pathto('developers/index') %}
-  {%- set development_attrs = '' %}
-{%- endif %}
-
-
 <nav id="navbar" class="{{ nav_bar_class }} navbar navbar-expand-md navbar-light bg-light py-0">
  <div class="container-fluid {{ top_container_cls }} px-0">
    {%- if logo_url %}
--- a/docs/docs_skeleton/docs/community.md
+++ b/docs/docs_skeleton/docs/community.md
@@ -0,0 +1,54 @@
+# Community navigator
+
+Hi! Thanks for being here. We’re lucky to have a community of so many passionate developers building with LangChain–we have so much to teach and learn from each other. Community members contribute code, host meetups, write blog posts, amplify each other’s work, become each other's customers and collaborators, and so much more.
+
+Whether you’re new to LangChain, looking to go deeper, or just want to get more exposure to the world of building with LLMs, this page can point you in the right direction. 
+
+- **🦜 Contribute to LangChain**
+
+- **🌍 Meetups, Events, and Hackathons**
+
+- **📣 Help Us Amplify Your Work**
+
+- **💬 Stay in the loop**
+
+
+# 🦜 Contribute to LangChain
+
+LangChain is the product of 5,000+ contributions by 1,500+ contributors, and there is **still** so much to do together. Here are some ways to get involved:
+
+- **[Open a pull request](https://github.com/langchain-ai/langchain/issues):** We’d appreciate all forms of contributions–new features, infrastructure improvements, better documentation, bug fixes, etc. If you have an improvement or an idea, we’d love to work on it with you.
+- **[Read our contributor guidelines](https://github.com/langchain-ai/langchain/blob/bbd22b9b761389a5e40fc45b0570e1830aabb707/.github/CONTRIBUTING.md):** We ask contributors to follow a ["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow, run a few local checks for formatting, linting, and testing before submitting, and follow certain documentation and testing conventions.
+    - **First time contributor?** [Try one of these issues with the “good first issue” tag](https://github.com/langchain-ai/langchain/contribute).
+- **Become an expert:** Our experts help the community by answering product questions in Discord. If that’s a role you’d like to play, we’d be so grateful! (And we have some special experts-only goodies/perks we can tell you more about). Send us an email to introduce yourself at hello@langchain.dev and we’ll take it from there!
+- **Integrate with LangChain:** If your product integrates with LangChain–or aspires to–we want to help make sure the experience is as smooth as possible for you and end users. Send us an email at hello@langchain.dev and tell us what you’re working on.
+    - **Become an Integration Maintainer:** Partner with our team to ensure your integration stays up-to-date and talk directly with users (and answer their inquiries) in our Discord. Introduce yourself at hello@langchain.dev if you’d like to explore this role.
+
+
+# 🌍 Meetups, Events, and Hackathons
+
+One of our favorite things about working in AI is how much enthusiasm there is for building together. We want to help make that as easy and impactful for you as possible! 
+- **Find a meetup, hackathon, or webinar:** You can find the one for you on our [global events calendar](https://mirror-feeling-d80.notion.site/0bc81da76a184297b86ca8fc782ee9a3?v=0d80342540df465396546976a50cfb3f).  
+    - **Submit an event to our calendar:** Email us at events@langchain.dev with a link to your event page! We can also help you spread the word with our local communities.
+- **Host a meetup:** If you want to bring a group of builders together, we want to help! We can publicize your event on our event calendar/Twitter, share it with our local communities in Discord, send swag, or potentially hook you up with a sponsor. Email us at events@langchain.dev to tell us about your event!
+- **Become a meetup sponsor:** We often hear from groups of builders that want to get together, but are blocked or limited on some dimension (space to host, budget for snacks, prizes to distribute, etc.). If you’d like to help, send us an email at events@langchain.dev and we can share more about how it works!
+- **Speak at an event:** Meetup hosts are always looking for great speakers, presenters, and panelists. If you’d like to do that at an event, send us an email at hello@langchain.dev with more information about yourself, what you want to talk about, and what city you’re based in, and we’ll try to match you with an upcoming event!
+- **Tell us about your LLM community:** If you host or participate in a community that would welcome support from LangChain and/or our team, send us an email at hello@langchain.dev and let us know how we can help.
+
+# 📣 Help Us Amplify Your Work
+
+If you’re working on something you’re proud of, and think the LangChain community would benefit from knowing about it, we want to help you show it off.
+
+- **Post about your work and mention us:** We love hanging out on Twitter to see what people in the space are talking about and working on. If you tag [@langchainai](https://twitter.com/LangChainAI), we’ll almost certainly see it and can show you some love.
+- **Publish something on our blog:** If you’re writing about your experience building with LangChain, we’d love to post (or crosspost) it on our blog! Email hello@langchain.dev with a draft of your post, or even just an idea for something you want to write about.
+- **Get your product onto our [integrations hub](https://integrations.langchain.com/):** Many developers take advantage of our seamless integrations with other products, and come to our integrations hub to find out who those are. If you want to get your product up there, tell us about it (and how it works with LangChain) at hello@langchain.dev.
+
+# ☀️ Stay in the loop
+
+Here’s where our team hangs out, talks shop, spotlights cool work, and shares what we’re up to. We’d love to see you there too.
+
+- **[Twitter](https://twitter.com/LangChainAI):** We post about what we’re working on and what cool things we’re seeing in the space. If you tag @langchainai in your post, we’ll almost certainly see it, and can show you some love!
+- **[Discord](https://discord.gg/6adMQxSpJS):** connect with >30k developers who are building with LangChain
+- **[GitHub](https://github.com/langchain-ai/langchain):** Open pull requests and contribute to discussions
+- **[Subscribe to our bi-weekly Release Notes](https://6w1pwbss0py.typeform.com/to/KjZB1auB):** a twice-monthly email roundup of the coolest things going on in our orbit
+- **Slack:** If you’re building an application in production at your company, we’d love to get into a Slack channel together. Fill out [this form](https://airtable.com/appwQzlErAS2qiP0L/shrGtGaVBVAz7NcV2) and we’ll get in touch about setting one up.
--- a/docs/docs_skeleton/docs/expression_language/index.mdx
+++ b/docs/docs_skeleton/docs/expression_language/index.mdx
@@ -0,0 +1,14 @@
+---
+sidebar_class_name: hidden
+---
+
+# LangChain Expression Language (LCEL)
+
+LangChain Expression Language, or LCEL, is a declarative way to easily compose chains together.
+Any chain constructed this way will automatically have full sync, async, and streaming support.
+
+#### [Interface](/docs/expression_language/interface)
+The base interface shared by all LCEL objects
+
+#### [Cookbook](/docs/expression_language/cookbook)
+Examples of common LCEL usage patterns
--- a/docs/docs_skeleton/docs/get_started/introduction.mdx
+++ b/docs/docs_skeleton/docs/get_started/introduction.mdx
@@ -4,9 +4,9 @@ sidebar_position: 0

 # Introduction

-**LangChain** is a framework for developing applications powered by language models. It enables applications that are:
- **Data-aware**: connect a language model to other sources of data
- **Agentic**: allow a language model to interact with its environment
+**LangChain** is a framework for developing applications powered by language models. It enables applications that:
+- **Are context-aware**: connect a language model to other sources of context (prompt instructions, few-shot examples, content to ground its response in)
+- **Reason**: rely on a language model to reason (about how to answer based on provided context, what actions to take, etc.)

 The main value props of LangChain are:
 1. **Components**: abstractions for working with language models, along with a collection of implementations for each abstraction. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
@@ -16,9 +16,9 @@ Off-the-shelf chains make it easy to get started. For more complex applications

 ## Get started

-[Here’s](/docs/get_started/installation.html) how to install LangChain, set up your environment, and start building.
+[Here’s](/docs/get_started/installation) how to install LangChain, set up your environment, and start building.

-We recommend following our [Quickstart](/docs/get_started/quickstart.html) guide to familiarize yourself with the framework by building your first LangChain application.
+We recommend following our [Quickstart](/docs/get_started/quickstart) guide to familiarize yourself with the framework by building your first LangChain application.

 _**Note**: These docs are for the LangChain [Python package](https://github.com/hwchase17/langchain). For documentation on [LangChain.js](https://github.com/hwchase17/langchainjs), the JS/TS version, [head here](https://js.langchain.com/docs)._

@@ -28,7 +28,7 @@ LangChain provides standard, extendable interfaces and external integrations for

 #### [Model I/O](/docs/modules/model_io/)
 Interface with language models
-#### [Data connection](/docs/modules/data_connection/)
+#### [Retrieval](/docs/modules/data_connection/)
 Interface with application-specific data
 #### [Chains](/docs/modules/chains/)
 Construct sequences of calls
@@ -40,25 +40,24 @@ Persist application state between runs of a chain
 Log and stream intermediate steps of any chain

 ## Examples, ecosystem, and resources
-### [Use cases](/docs/use_cases/)
+### [Use cases](/docs/use_cases/question_answering/)
 Walkthroughs and best-practices for common end-to-end use cases, like:
+- [Document question answering](/docs/use_cases/question_answering/)
 - [Chatbots](/docs/use_cases/chatbots/)
- [Answering questions using sources](/docs/use_cases/question_answering/)
- [Analyzing structured data](/docs/use_cases/tabular.html)
+- [Analyzing structured data](/docs/use_cases/qa_structured/sql/)
 - and much more...

 ### [Guides](/docs/guides/)
 Learn best practices for developing with LangChain.

-### [Ecosystem](/docs/ecosystem/)
-LangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. Check out our growing list of [integrations](/docs/integrations/) and [dependent repos](/docs/ecosystem/dependents).
+### [Ecosystem](/docs/integrations/providers/)
+LangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. Check out our growing list of [integrations](/docs/integrations/providers/) and [dependent repos](/docs/additional_resources/dependents).

 ### [Additional resources](/docs/additional_resources/)
-Our community is full of prolific developers, creative builders, and fantastic teachers. Check out [YouTube tutorials](/docs/additional_resources/youtube.html) for great tutorials from folks in the community, and [Gallery](https://github.com/kyrolabs/awesome-langchain) for a list of awesome LangChain projects, compiled by the folks at [KyroLabs](https://kyrolabs.com).
+Our community is full of prolific developers, creative builders, and fantastic teachers. Check out [YouTube tutorials](/docs/additional_resources/youtube) for great tutorials from folks in the community, and [Gallery](https://github.com/kyrolabs/awesome-langchain) for a list of awesome LangChain projects, compiled by the folks at [KyroLabs](https://kyrolabs.com).

-<h3><span style={{color:"#2e8555"}}> Support </span></h3>
-
-Join us on [GitHub](https://github.com/hwchase17/langchain) or [Discord](https://discord.gg/6adMQxSpJS) to ask questions, share feedback, meet other developers building with LangChain, and dream about the future of LLM’s.
+### [Community](/docs/community)
+Head to the [Community navigator](/docs/community) to find places to ask questions, share feedback, meet other developers, and dream about the future of LLMs.

 ## API reference

--- a/docs/docs_skeleton/docs/get_started/quickstart.mdx
+++ b/docs/docs_skeleton/docs/get_started/quickstart.mdx
@@ -25,13 +25,12 @@ import OpenAISetup from "@snippets/get_started/quickstart/openai_setup.mdx"
 Now we can start building our language model application. LangChain provides many modules that can be used to build language model applications.
 Modules can be used as stand-alones in simple applications and they can be combined for more complex use cases.

-The core building block of LangChain applications is the LLMChain.
-This combines three things:
+The most common and most important chain that LangChain helps create contains three things:
 - LLM: The language model is the core reasoning engine here. In order to work with LangChain, you need to understand the different types of language models and how to work with them.
 - Prompt Templates: This provides instructions to the language model. This controls what the language model outputs, so understanding how to construct prompts and different prompting strategies is crucial.
 - Output Parsers: These translate the raw response from the LLM to a more workable format, making it easy to use the output downstream.

-In this getting started guide we will cover those three components by themselves, and then cover the LLMChain which combines all of them.
+In this getting started guide we will cover those three components by themselves, and then go over how to combine all of them.
 Understanding these concepts will set you up well for being able to use and customize LangChain applications.
 Most LangChain applications allow you to configure the LLM and/or the prompt used, so knowing how to take advantage of this will be a big enabler.

@@ -59,8 +58,8 @@ LangChain provides several objects to easily distinguish between different roles
 If none of those roles sound right, there is also a `ChatMessage` class where you can specify the role manually.
 For more information on how to use these different messages most effectively, see our prompting guide.

-LangChain exposes a standard interface for both, but it's useful to understand this difference in order to construct prompts for a given language model.
-The standard interface that LangChain exposes has two methods:
+LangChain provides a standard interface for both, but it's useful to understand this difference in order to construct prompts for a given language model.
+The standard interface that LangChain provides has two methods:
 - `predict`: Takes in a string, returns a string
 - `predict_messages`: Takes in a list of messages, returns a message.

@@ -107,7 +106,7 @@ import PromptTemplateChatModel from "@snippets/get_started/quickstart/prompt_tem
 <PromptTemplateLLM/>

 However, the advantages of using these over raw string formatting are several.
-You can "partial" out variables - eg you can format only some of the variables at a time.
+You can "partial" out variables - e.g. you can format only some of the variables at a time.
 You can compose them together, easily combining different templates into a single prompt.
 For explanations of these functionalities, see the [section on prompts](/docs/modules/model_io/prompts) for more detail.

@@ -119,14 +118,14 @@ Let's take a look at this below:

 <PromptTemplateChatModel/>

-ChatPromptTemplates can also include other things besides ChatMessageTemplates - see the [section on prompts](/docs/modules/model_io/prompts) for more detail.
+ChatPromptTemplates can also be constructed in other ways - see the [section on prompts](/docs/modules/model_io/prompts) for more detail.

-## Output Parsers
+## Output parsers

 OutputParsers convert the raw output of an LLM into a format that can be used downstream.
 There are few main type of OutputParsers, including:

- Convert text from LLM -> structured information (eg JSON)
+- Convert text from LLM -> structured information (e.g. JSON)
 - Convert a ChatMessage into just a string
 - Convert the extra information returned from a call besides the message (like OpenAI function invocation) into a string.

@@ -138,10 +137,10 @@ import OutputParser from "@snippets/get_started/quickstart/output_parser.mdx"

 <OutputParser/>

-## LLMChain
+## PromptTemplate + LLM + OutputParser

 We can now combine all these into one chain.
-This chain will take input variables, pass those to a prompt template to create a prompt, pass the prompt to an LLM, and then pass the output through an (optional) output parser.
+This chain will take input variables, pass those to a prompt template to create a prompt, pass the prompt to a language model, and then pass the output through an (optional) output parser.
 This is a convenient way to bundle up a modular piece of logic.
 Let's see it in action!

@@ -149,14 +148,19 @@ import LLMChain from "@snippets/get_started/quickstart/llm_chain.mdx"

 <LLMChain/>

-## Next Steps
+Note that we are using the `|` syntax to join these components together.
+This `|` syntax is called the LangChain Expression Language.
+To learn more about this syntax, read the documentation [here](/docs/expression_language).
+
+## Next steps

 This is it!
-We've now gone over how to create the core building block of LangChain applications - the LLMChains.
+We've now gone over how to create the core building block of LangChain applications.
 There is a lot more nuance in all these components (LLMs, prompts, output parsers) and a lot more different components to learn about as well.
 To continue on your journey:

 - [Dive deeper](/docs/modules/model_io) into LLMs, prompts, and output parsers
 - Learn the other [key components](/docs/modules)
+- Read up on [LangChain Expression Language](/docs/expression_language) to learn how to chain these components together
 - Check out our [helpful guides](/docs/guides) for detailed walkthroughs on particular topics
 - Explore [end-to-end use cases](/docs/use_cases)
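
The quickstart hunk above describes joining a prompt template, a language model, and an output parser with the `|` syntax. As a rough illustration of how such pipe composition can work in plain Python, here is a minimal sketch. The `Step` class, `fake_model`, and the lambdas are hypothetical stand-ins invented for this example; this is not LangChain's implementation, and no real model is called.

```python
from typing import Any, Callable


class Step:
    """A toy pipeline step that can be chained with `|` (illustrative only)."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that runs `a`, then feeds its result to `b`.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Hypothetical stand-ins for a prompt template, a model, and an output parser.
prompt = Step(lambda variables: "List three {topic}, comma separated.".format(**variables))
fake_model = Step(lambda text: "red, green, blue")  # ignores the prompt; no API call
parser = Step(lambda text: [item.strip() for item in text.split(",")])

# Input variables -> prompt -> model -> parsed output.
chain = prompt | fake_model | parser
result = chain.invoke({"topic": "colors"})
print(result)
```

Each `|` returns a new step, so the composed chain exposes the same `invoke` method as its parts, which is what makes the pieces interchangeable.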
--- a/docs/docs_skeleton/docs/guides/evaluation/comparison/index.mdx
+++ b/docs/docs_skeleton/docs/guides/evaluation/comparison/index.mdx
@@ -3,7 +3,7 @@ sidebar_position: 3
 ---
 # Comparison Evaluators

-Comparison evaluators in LangChain help measure two different chain or LLM outputs. These evaluators are helpful for comparative analyses, such as A/B testing between two language models, or comparing different versions of the same model. They can also be useful for things like generating preference scores for ai-assisted reinforcement learning.
+Comparison evaluators in LangChain help measure two different chains or LLM outputs. These evaluators are helpful for comparative analyses, such as A/B testing between two language models, or comparing different versions of the same model. They can also be useful for things like generating preference scores for AI-assisted reinforcement learning.

 These evaluators inherit from the `PairwiseStringEvaluator` class, providing a comparison interface for two strings - typically, the outputs from two different prompts or models, or two versions of the same model. In essence, a comparison evaluator performs an evaluation on a pair of strings and returns a dictionary containing the evaluation score and other relevant details.

@@ -16,7 +16,7 @@ Here's a summary of the key methods and properties of a comparison evaluator:
 - `requires_input`: This property indicates whether this evaluator requires an input string.
 - `requires_reference`: This property specifies whether this evaluator requires a reference label.

-Detailed information about creating custom evaluators and the available built-in comparison evaluators are provided in the following sections.
+Detailed information about creating custom evaluators and the available built-in comparison evaluators is provided in the following sections.

 import DocCardList from "@theme/DocCardList";

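The comparison-evaluator text above says an evaluator takes a pair of strings and returns a dictionary with a score and other details. The sketch below shows that shape with a toy function. `compare_outputs` and its dictionary keys are assumptions made for illustration; it does not inherit from `PairwiseStringEvaluator`, and it uses simple string similarity from the standard library rather than an LLM judgment.

```python
from difflib import SequenceMatcher
from typing import Any, Dict


def compare_outputs(prediction: str, prediction_b: str, reference: str) -> Dict[str, Any]:
    """Toy pairwise comparison: prefer the output that is closer to the reference."""
    sim_a = SequenceMatcher(None, prediction, reference).ratio()
    sim_b = SequenceMatcher(None, prediction_b, reference).ratio()
    if sim_a == sim_b:
        # No preference between the two outputs.
        return {"score": 0.5, "value": None, "similarity_a": sim_a, "similarity_b": sim_b}
    prefers_a = sim_a > sim_b
    return {
        "score": 1.0 if prefers_a else 0.0,  # 1.0 means the first output is preferred
        "value": "A" if prefers_a else "B",
        "similarity_a": sim_a,
        "similarity_b": sim_b,
    }


result = compare_outputs(
    prediction="Paris is the capital of France.",
    prediction_b="I think it might be Lyon.",
    reference="Paris is the capital of France.",
)
print(result["value"], result["score"])
```

Because this toy version needs a reference label but no input string, it would correspond to an evaluator whose `requires_reference` is true and whose `requires_input` is false.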
--- a/docs/docs_skeleton/docs/guides/evaluation/index.mdx
+++ b/docs/docs_skeleton/docs/guides/evaluation/index.mdx
@@ -1,16 +1,12 @@
---
-sidebar_position: 6
---
-
 import DocCardList from "@theme/DocCardList";

 # Evaluation

 Building applications with language models involves many moving parts. One of the most critical components is ensuring that the outcomes produced by your models are reliable and useful across a broad array of inputs, and that they work well with your application's other software components. Ensuring reliability usually boils down to some combination of application design, testing & evaluation, and runtime checks. 

-The guides in this section review the APIs and functionality LangChain provides to help yous better evaluate your applications. Evaluation and testing are both critical when thinking about deploying LLM applications, since production environments require repeatable and useful outcomes.
+The guides in this section review the APIs and functionality LangChain provides to help you better evaluate your applications. Evaluation and testing are both critical when thinking about deploying LLM applications, since production environments require repeatable and useful outcomes.

-LangChain offers various types of evaluators to help you measure performance and integrity on diverse data, and we hope to encourage the the community to create and share other useful evaluators so everyone can improve. These docs will introduce the evaluator types, how to use them, and provide some examples of their use in real-world scenarios.
+LangChain offers various types of evaluators to help you measure performance and integrity on diverse data, and we hope to encourage the community to create and share other useful evaluators so everyone can improve. These docs will introduce the evaluator types, how to use them, and provide some examples of their use in real-world scenarios.

 Each evaluator type in LangChain comes with ready-to-use implementations and an extensible API that allows for customization according to your unique requirements. Here are some of the types of evaluators we offer:

--- a/docs/docs_skeleton/docs/guides/expression_language/index.mdx
+++ b/docs/docs_skeleton/docs/guides/expression_language/index.mdx
@@ -1,9 +0,0 @@
-# LangChain Expression Language
-
-import DocCardList from "@theme/DocCardList";
-
-LangChain Expression Language is a declarative way to easily compose chains together.
-Any chain constructed this way will automatically have full sync, async, and streaming support.
-See guides below for how to interact with chains constructed this way as well as cookbook examples.
-
-<DocCardList />
--- a/docs/docs_skeleton/docs/guides/langsmith/index.md
+++ b/docs/docs_skeleton/docs/guides/langsmith/index.md
@@ -2,11 +2,21 @@

 import DocCardList from "@theme/DocCardList";

-LangSmith helps you trace and evaluate your language model applications and intelligent agents to help you
+[LangSmith](https://smith.langchain.com) helps you trace and evaluate your language model applications and intelligent agents to help you
 move from prototype to production.

-Check out the [interactive walkthrough](walkthrough) below to get started.
+Check out the [interactive walkthrough](/docs/guides/langsmith/walkthrough) below to get started.

-For more information, please refer to the [LangSmith documentation](https://docs.smith.langchain.com/)
+For more information, please refer to the [LangSmith documentation](https://docs.smith.langchain.com/).

-<DocCardList />
+For tutorials and other end-to-end examples demonstrating ways to integrate LangSmith in your workflow,
+check out the [LangSmith Cookbook](https://github.com/langchain-ai/langsmith-cookbook). Some of the guides therein include:
+
+- Leveraging user feedback in your JS application ([link](https://github.com/langchain-ai/langsmith-cookbook/blob/main/feedback-examples/nextjs/README.md)).
+- Building an automated feedback pipeline ([link](https://github.com/langchain-ai/langsmith-cookbook/blob/main/feedback-examples/algorithmic-feedback/algorithmic_feedback.ipynb)).
+- How to evaluate and audit your RAG workflows ([link](https://github.com/langchain-ai/langsmith-cookbook/tree/main/testing-examples/qa-correctness)).
+- How to fine-tune an LLM on real usage data ([link](https://github.com/langchain-ai/langsmith-cookbook/blob/main/fine-tuning-examples/export-to-openai/fine-tuning-on-chat-runs.ipynb)).
+- How to use the [LangChain Hub](https://smith.langchain.com/hub) to version your prompts ([link](https://github.com/langchain-ai/langsmith-cookbook/blob/main/hub-examples/retrieval-qa-chain/retrieval-qa.ipynb)).
+
+
+<DocCardList />
--- a/docs/docs_skeleton/docs/guides/safety/amazon_comprehend_chain.ipynb
+++ b/docs/docs_skeleton/docs/guides/safety/amazon_comprehend_chain.ipynb
--- a/docs/docs_skeleton/docs/guides/safety/index.mdx
+++ b/docs/docs_skeleton/docs/guides/safety/index.mdx
@@ -1,6 +1,8 @@
-# Preventing harmful outputs
+# Moderation

 One of the key concerns with using LLMs is that they may generate harmful or unethical text. This is an area of active research in the field. Here we present some built-in chains inspired by this research, which are intended to make the outputs of LLMs safer.

- [Moderation chain](/docs/use_cases/safety/moderation): Explicitly check if any output text is harmful and flag it.
- [Constitutional chain](/docs/use_cases/safety/constitutional_chain): Prompt the model with a set of principles which should guide it's behavior.
+- [Moderation chain](/docs/guides/safety/moderation): Explicitly check if any output text is harmful and flag it.
+- [Constitutional chain](/docs/guides/safety/constitutional_chain): Prompt the model with a set of principles which should guide its behavior.
+- [Logical Fallacy chain](/docs/guides/safety/logical_fallacy_chain): Checks the model output against logical fallacies to correct any deviation.
+- [Amazon Comprehend moderation chain](/docs/guides/safety/amazon_comprehend_chain): Use [Amazon Comprehend](https://aws.amazon.com/comprehend/) to detect and handle PII and toxicity.
--- a/docs/docs_skeleton/docs/guides/safety/logical_fallacy_chain.mdx
+++ b/docs/docs_skeleton/docs/guides/safety/logical_fallacy_chain.mdx
@@ -0,0 +1,85 @@
+# Removing logical fallacies from model output
+Logical fallacies are flawed reasoning or false arguments that can undermine the validity of a model's outputs. Examples include circular reasoning, false
+dichotomies, ad hominem attacks, etc.  Machine learning models are optimized to perform well on specific metrics like accuracy, perplexity, or loss. However, 
+optimizing for metrics alone does not guarantee logically sound reasoning.
+
+Language models can learn to exploit flaws in reasoning to generate plausible-sounding but logically invalid arguments.  When models rely on fallacies, their outputs become unreliable and untrustworthy, even if they achieve high scores on metrics. Users cannot depend on such outputs. Propagating logical fallacies can spread misinformation, confuse users, and lead to harmful real-world consequences when models are deployed in products or services.
+
+Unlike other quality issues, logical flaws are challenging to monitor and test for, because doing so requires reasoning about arguments rather than pattern matching.
+
+Therefore, it is crucial that model developers proactively address logical fallacies after optimizing metrics. Specialized techniques like causal modeling, robustness testing, and bias mitigation can help avoid flawed reasoning.  Overall, allowing logical flaws to persist makes models less safe and ethical. Eliminating fallacies ensures model outputs remain logically valid and aligned with human reasoning. This maintains user trust and mitigates risks.
+
+
+
+```python
+# Imports
+from langchain.llms import OpenAI
+from langchain.prompts import PromptTemplate
+from langchain.chains.llm import LLMChain
+from langchain_experimental.fallacy_removal.base import FallacyChain
+```
+
+```python
+# Example of a model output being returned with a logical fallacy
+misleading_prompt = PromptTemplate(
+    template="""You have to respond by using only logical fallacies inherent in your answer explanations.
+
+Question: {question}
+
+Bad answer:""",
+    input_variables=["question"],
+)
+
+llm = OpenAI(temperature=0)
+
+misleading_chain = LLMChain(llm=llm, prompt=misleading_prompt)
+
+misleading_chain.run(question="How do I know the earth is round?")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+    'The earth is round because my professor said it is, and everyone believes my professor'
+```
+
+</CodeOutputBlock>
+
+
+```python
+fallacies = FallacyChain.get_fallacies(["correction"])
+fallacy_chain = FallacyChain.from_llm(
+    chain=misleading_chain,
+    logical_fallacies=fallacies,
+    llm=llm,
+    verbose=True,
+)
+
+fallacy_chain.run(question="How do I know the earth is round?")
+```
+
+<CodeOutputBlock lang="python">
+
+```
+
+
+    > Entering new FallacyChain chain...
+    Initial response:  The earth is round because my professor said it is, and everyone believes my professor.
+
+    Applying correction...
+
+    Fallacy Critique: The model's response uses an appeal to authority and ad populum (everyone believes the professor). Fallacy Critique Needed.
+
+    Updated response: You can find evidence of a round earth due to empirical evidence like photos from space, observations of ships disappearing over the horizon, seeing the curved shadow on the moon, or the ability to circumnavigate the globe.
+
+
+    > Finished chain.
+
+
+
+
+
+    'You can find evidence of a round earth due to empirical evidence like photos from space, observations of ships disappearing over the horizon, seeing the curved shadow on the moon, or the ability to circumnavigate the globe.'
+```
+
+</CodeOutputBlock>
--- a/docs/docs_skeleton/docs/modules/agents/agent_types/index.mdx
+++ b/docs/docs_skeleton/docs/modules/agents/agent_types/index.mdx
@@ -12,7 +12,7 @@ Here are the agents available in LangChain.

 ### [Zero-shot ReAct](/docs/modules/agents/agent_types/react.html)

-This agent uses the [ReAct](https://arxiv.org/pdf/2205.00445.pdf) framework to determine which tool to use
+This agent uses the [ReAct](https://arxiv.org/pdf/2210.03629) framework to determine which tool to use
 based solely on the tool's description. Any number of tools can be provided.
 This agent requires that a description is provided for each tool.

@@ -37,11 +37,11 @@ This agent is designed to be used in conversational settings.
 The prompt is designed to make the agent helpful and conversational.
 It uses the ReAct framework to decide which tool to use, and uses memory to remember the previous conversation interactions.

-### [Self ask with search](/docs/modules/agents/agent_types/self_ask_with_search.html)
+### [Self-ask with search](/docs/modules/agents/agent_types/self_ask_with_search.html)

 This agent utilizes a single tool that should be named `Intermediate Answer`.
 This tool should be able to lookup factual answers to questions. This agent
-is equivalent to the original [self ask with search paper](https://ofir.io/self-ask.pdf),
+is equivalent to the original [self-ask with search paper](https://ofir.io/self-ask.pdf),
 where a Google search API was provided as the tool.

 ### [ReAct document store](/docs/modules/agents/agent_types/react_docstore.html)
@@ -54,4 +54,4 @@ This agent is equivalent to the
 original [ReAct paper](https://arxiv.org/pdf/2210.03629.pdf), specifically the Wikipedia example.

 ## [Plan-and-execute agents](/docs/modules/agents/agent_types/plan_and_execute.html)
-Plan and execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).
+Plan-and-execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).
--- a/docs/docs_skeleton/docs/modules/agents/agent_types/plan_and_execute.mdx
+++ b/docs/docs_skeleton/docs/modules/agents/agent_types/plan_and_execute.mdx
@@ -1,6 +1,6 @@
-# Plan and execute
+# Plan-and-execute

-Plan and execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).
+Plan-and-execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).

 The planning is almost always done by an LLM.

--- a/docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_agent.mdx
+++ b/docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_agent.mdx
@@ -1,13 +1,13 @@
-# Custom LLM Agent
+# Custom LLM agent

 This notebook goes through how to create your own custom LLM agent.

 An LLM agent consists of three parts:

- PromptTemplate: This is the prompt template that can be used to instruct the language model on what to do
+- `PromptTemplate`: This is the prompt template that can be used to instruct the language model on what to do
 - LLM: This is the language model that powers the agent
 - `stop` sequence: Instructs the LLM to stop generating as soon as this string is found
- OutputParser: This determines how to parse the LLMOutput into an AgentAction or AgentFinish object
+- `OutputParser`: This determines how to parse the LLM output into an `AgentAction` or `AgentFinish` object

 import Example from "@snippets/modules/agents/how_to/custom_llm_agent.mdx"

--- a/docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_chat_agent.mdx
+++ b/docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_chat_agent.mdx
@@ -4,10 +4,10 @@ This notebook goes through how to create your own custom agent based on a chat m

 An LLM chat agent consists of three parts:

- PromptTemplate: This is the prompt template that can be used to instruct the language model on what to do
- ChatModel: This is the language model that powers the agent
+- `PromptTemplate`: This is the prompt template that can be used to instruct the language model on what to do
+- `ChatModel`: This is the language model that powers the agent
 - `stop` sequence: Instructs the LLM to stop generating as soon as this string is found
- OutputParser: This determines how to parse the LLMOutput into an AgentAction or AgentFinish object
+- `OutputParser`: This determines how to parse the LLM output into an `AgentAction` or `AgentFinish` object

 import Example from "@snippets/modules/agents/how_to/custom_llm_chat_agent.mdx"

--- a/docs/docs_skeleton/docs/modules/chains/document/index.mdx
+++ b/docs/docs_skeleton/docs/modules/chains/document/index.mdx
@@ -3,7 +3,7 @@ sidebar_position: 2
 ---
 # Documents

-These are the core chains for working with Documents. They are useful for summarizing documents, answering questions over documents, extracting information from documents, and more.
+These are the core chains for working with documents. They are useful for summarizing documents, answering questions over documents, extracting information from documents, and more.

 These chains all implement a common interface:

--- a/docs/docs_skeleton/docs/modules/chains/document/refine.mdx
+++ b/docs/docs_skeleton/docs/modules/chains/document/refine.mdx
@@ -3,10 +3,10 @@ sidebar_position: 1
 ---
 # Refine

-The refine documents chain constructs a response by looping over the input documents and iteratively updating its answer. For each document, it passes all non-document inputs, the current document, and the latest intermediate answer to an LLM chain to get a new answer.
+The Refine documents chain constructs a response by looping over the input documents and iteratively updating its answer. For each document, it passes all non-document inputs, the current document, and the latest intermediate answer to an LLM chain to get a new answer.

 Since the Refine chain only passes a single document to the LLM at a time, it is well-suited for tasks that require analyzing more documents than can fit in the model's context.
 The obvious tradeoff is that this chain will make far more LLM calls than, for example, the Stuff documents chain.
 There are also certain tasks which are difficult to accomplish iteratively. For example, the Refine chain can perform poorly when documents frequently cross-reference one another or when a task requires detailed information from many documents.

-![refine_diagram](/img/refine.jpg)
+![refine_diagram](/img/refine.jpg)
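
The Refine description above amounts to a loop: for each document, send the question, the current document, and the latest intermediate answer to the model, then keep the new answer. Here is a minimal sketch of that loop under stated assumptions. The `refine` function, the prompt wording, and `fake_llm` are all hypothetical and are not LangChain's actual chain or prompts.

```python
from typing import Callable, List, Optional


def refine(documents: List[str], question: str, llm: Callable[[str], str]) -> Optional[str]:
    """Iteratively update one answer, showing the model a single document per call."""
    answer: Optional[str] = None
    for doc in documents:
        if answer is None:
            # First document: there is no intermediate answer yet.
            prompt = f"Question: {question}\nContext: {doc}\nAnswer:"
        else:
            # Later documents: pass the latest answer along with the new document.
            prompt = (
                f"Question: {question}\nExisting answer: {answer}\n"
                f"New context: {doc}\nRefined answer:"
            )
        answer = llm(prompt)
    return answer


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model: records the prompt and returns a canned reply.
    calls.append(prompt)
    return f"answer after {len(calls)} document(s)"


final = refine(["doc one", "doc two", "doc three"], "What changed?", fake_llm)
print(final)
print(len(calls))
```

The sketch makes the tradeoff mentioned above visible: the number of model calls equals the number of documents, whereas stuffing all documents into one prompt would need a single call.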
--- a/docs/docs_skeleton/docs/modules/chains/foundational/llm_chain.mdx
+++ b/docs/docs_skeleton/docs/modules/chains/foundational/llm_chain.mdx
@@ -1,11 +1,11 @@
 # LLM

-An LLMChain is a simple chain that adds some functionality around language models. It is used widely throughout LangChain, including in other chains and agents.
+An `LLMChain` is a simple chain that adds some functionality around language models. It is used widely throughout LangChain, including in other chains and agents.

-An LLMChain consists of a PromptTemplate and a language model (either an LLM or chat model). It formats the prompt template using the input key values provided (and also memory key values, if available), passes the formatted string to LLM and returns the LLM output.
+An `LLMChain` consists of a `PromptTemplate` and a language model (either an LLM or chat model). It formats the prompt template using the input key values provided (and also memory key values, if available), passes the formatted string to LLM and returns the LLM output.

 ## Get started

 import Example from "@snippets/modules/chains/foundational/llm_chain.mdx"

-<Example/>
+<Example/>
--- a/docs/docs_skeleton/docs/modules/chains/foundational/sequential_chains.mdx
+++ b/docs/docs_skeleton/docs/modules/chains/foundational/sequential_chains.mdx
@@ -4,7 +4,7 @@

 The next step after calling a language model is make a series of calls to a language model. This is particularly useful when you want to take the output from one call and use it as the input to another.

-In this notebook we will walk through some examples for how to do this, using sequential chains. Sequential chains allow you to connect multiple chains and compose them into pipelines that execute some specific scenario.. There are two types of sequential chains:
+In this notebook we will walk through some examples for how to do this, using sequential chains. Sequential chains allow you to connect multiple chains and compose them into pipelines that execute some specific scenario. There are two types of sequential chains:

 - `SimpleSequentialChain`: The simplest form of sequential chains, where each step has a singular input/output, and the output of one step is the input to the next.
 - `SequentialChain`: A more general form of sequential chains, allowing for multiple inputs/outputs.
--- a/docs/docs_skeleton/docs/modules/chains/index.mdx
+++ b/docs/docs_skeleton/docs/modules/chains/index.mdx
@@ -19,8 +19,6 @@ For more specifics check out:
 - [How-to](/docs/modules/chains/how_to/) for walkthroughs of different chain features
 - [Foundational](/docs/modules/chains/foundational/) to get acquainted with core building block chains
 - [Document](/docs/modules/chains/document/) to learn how to incorporate documents into chains
- [Popular](/docs/modules/chains/popular/) chains for the most common use cases
- [Additional](/docs/modules/chains/additional/) to see some of the more advanced chains and integrations that you can use out of the box

 ## Why do we need chains?

@@ -30,4 +28,4 @@ Chains allow us to combine multiple components together to create a single, cohe

 import GetStarted from "@snippets/modules/chains/get_started.mdx"

-<GetStarted/>
+<GetStarted/>
--- a/docs/docs_skeleton/docs/modules/data_connection/document_loaders/index.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/document_loaders/index.mdx
@@ -11,7 +11,7 @@ Use document loaders to load data from a source as `Document`'s. A `Document` is
 and associated metadata. For example, there are document loaders for loading a simple `.txt` file, for loading the text
 contents of any web page, or even for loading a transcript of a YouTube video.

-Document loaders expose a "load" method for loading data as documents from a configured source. They optionally
+Document loaders provide a "load" method for loading data as documents from a configured source. They optionally
 implement a "lazy load" as well for lazily loading data into memory.

 ## Get started
--- a/docs/docs_skeleton/docs/modules/data_connection/document_transformers/text_splitters/character_text_splitter.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/document_transformers/text_splitters/character_text_splitter.mdx
@@ -2,8 +2,8 @@

 This is the simplest method. This splits based on characters (by default "\n\n") and measure chunk length by number of characters.

-1. How the text is split: by single character
-2. How the chunk size is measured: by number of characters
+1. How the text is split: by single character.
+2. How the chunk size is measured: by number of characters.

 import Example from "@snippets/modules/data_connection/document_transformers/text_splitters/character_text_splitter.mdx"

--- a/docs/docs_skeleton/docs/modules/data_connection/document_transformers/text_splitters/code_splitter.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/document_transformers/text_splitters/code_splitter.mdx
@@ -1,6 +1,6 @@
 # Split code

-CodeTextSplitter allows you to split your code with multiple language support. Import enum `Language` and specify the language. 
+CodeTextSplitter allows you to split code in any of several supported languages. Import the `Language` enum and specify the language.

 import Example from "@snippets/modules/data_connection/document_transformers/text_splitters/code_splitter.mdx"

--- a/docs/docs_skeleton/docs/modules/data_connection/document_transformers/text_splitters/recursive_text_splitter.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/document_transformers/text_splitters/recursive_text_splitter.mdx
@@ -2,8 +2,8 @@

 This text splitter is the recommended one for generic text. It is parameterized by a list of characters. It tries to split on them in order until the chunks are small enough. The default list is `["\n\n", "\n", " ", ""]`. This has the effect of trying to keep all paragraphs (and then sentences, and then words) together as long as possible, as those would generically seem to be the strongest semantically related pieces of text.

-1. How the text is split: by list of characters
-2. How the chunk size is measured: by number of characters
+1. How the text is split: by list of characters.
+2. How the chunk size is measured: by number of characters.

 import Example from "@snippets/modules/data_connection/document_transformers/text_splitters/recursive_text_splitter.mdx"

--- a/docs/docs_skeleton/docs/modules/data_connection/index.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/index.mdx
@@ -2,15 +2,60 @@
 sidebar_position: 1
 ---

-# Data connection
+# Retrieval

-Many LLM applications require user-specific data that is not part of the model's training set. LangChain gives you the 
-building blocks to load, transform, store and query your data via:
+Many LLM applications require user-specific data that is not part of the model's training set.
+The primary way of accomplishing this is through Retrieval Augmented Generation (RAG).
+In this process, external data is *retrieved* and then passed to the LLM when doing the *generation* step.

- [Document loaders](/docs/modules/data_connection/document_loaders/): Load documents from many different sources
- [Document transformers](/docs/modules/data_connection/document_transformers/): Split documents, convert documents into Q&A format, drop redundant documents, and more
- [Text embedding models](/docs/modules/data_connection/text_embedding/): Take unstructured text and turn it into a list of floating point numbers
- [Vector stores](/docs/modules/data_connection/vectorstores/): Store and search over embedded data
- [Retrievers](/docs/modules/data_connection/retrievers/): Query your data
+LangChain provides all the building blocks for RAG applications - from simple to complex.
+This section of the documentation covers everything related to the *retrieval* step - i.e. the fetching of the data.
+Although this sounds simple, it can be subtly complex.
+The retrieval step encompasses several key modules.

 ![data_connection_diagram](/img/data_connection.jpg)
+
+**[Document loaders](/docs/modules/data_connection/document_loaders/)**
+
+Load documents from many different sources.
+LangChain provides over 100 different document loaders as well as integrations with other major providers in the space,
+like Airbyte and Unstructured.
+We provide integrations to load all types of documents (HTML, PDF, code) from all types of locations (private S3 buckets, public websites).
+
+**[Document transformers](/docs/modules/data_connection/document_transformers/)**
+
+A key part of retrieval is fetching only the relevant parts of documents.
+This involves several transformation steps in order to best prepare the documents for retrieval.
+One of the primary ones here is splitting (or chunking) a large document into smaller chunks.
+LangChain provides several different algorithms for doing this, as well as logic optimized for specific document types (code, Markdown, etc.).
+
+**[Text embedding models](/docs/modules/data_connection/text_embedding/)**
+
+Creating embeddings for documents has become another key part of retrieval.
+Embeddings capture the semantic meaning of the text, allowing you to quickly and
+efficiently find other pieces of text that are similar.
+LangChain provides integrations with over 25 different embedding providers and methods,
+from open-source to proprietary APIs,
+allowing you to choose the one best suited for your needs.
+LangChain provides a standard interface, allowing you to easily swap between models.
+
+**[Vector stores](/docs/modules/data_connection/vectorstores/)**
+
+With the rise of embeddings, there has emerged a need for databases to support efficient storage and searching of these embeddings.
+LangChain provides integrations with over 50 different vector stores, from open-source local ones to cloud-hosted proprietary ones,
+allowing you to choose the one best suited for your needs.
+LangChain exposes a standard interface, allowing you to easily swap between vector stores.
+
+**[Retrievers](/docs/modules/data_connection/retrievers/)**
+
+Once the data is in the database, you still need to retrieve it.
+LangChain supports many different retrieval algorithms, and this is one of the places where we add the most value.
+We support basic methods that are easy to get started with - namely simple semantic search.
+However, we have also added a collection of algorithms on top of this to increase performance.
+These include:
+
+- [Parent Document Retriever](/docs/modules/data_connection/retrievers/parent_document_retriever): This allows you to create multiple embeddings per parent document, so you can look up smaller chunks but return larger context.
+- [Self Query Retriever](/docs/modules/data_connection/retrievers/self_query): User questions often contain a reference to something that isn't just semantic but rather expresses some logic that can best be represented as a metadata filter. Self-query allows you to parse out the *semantic* part of a query from other *metadata filters* present in the query.
+- [Ensemble Retriever](/docs/modules/data_connection/retrievers/ensemble): Sometimes you may want to retrieve documents from multiple different sources, or using multiple different algorithms. The ensemble retriever allows you to easily do this.
+- And more!
+
--- a/docs/docs_skeleton/docs/modules/data_connection/retrievers/contextual_compression/index.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/retrievers/contextual_compression/index.mdx
@@ -5,10 +5,10 @@ One challenge with retrieval is that usually you don't know the specific queries
 Contextual compression is meant to fix this. The idea is simple: instead of immediately returning retrieved documents as-is, you can compress them using the context of the given query, so that only the relevant information is returned. “Compressing” here refers to both compressing the contents of an individual document and filtering out documents wholesale.

 To use the Contextual Compression Retriever, you'll need:
- a base Retriever
+- a base retriever
 - a Document Compressor

-The Contextual Compression Retriever passes queries to the base Retriever, takes the initial documents and passes them through the Document Compressor. The Document Compressor takes a list of Documents and shortens it by reducing the contents of Documents or dropping Documents altogether.
+The Contextual Compression Retriever passes queries to the base retriever, takes the initial documents and passes them through the Document Compressor. The Document Compressor takes a list of documents and shortens it by reducing the contents of documents or dropping documents altogether.

 ![](https://drive.google.com/uc?id=1CtNgWODXZudxAWSRiWgSGEoTNrUFT98v)

--- a/docs/docs_skeleton/docs/modules/data_connection/retrievers/index.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/retrievers/index.mdx
@@ -8,7 +8,7 @@ Head to [Integrations](/docs/integrations/retrievers/) for documentation on buil
 :::

 A retriever is an interface that returns documents given an unstructured query. It is more general than a vector store.
-A retriever does not need to be able to store documents, only to return (or retrieve) it. Vector stores can be used
+A retriever does not need to be able to store documents, only to return (or retrieve) them. Vector stores can be used
 as the backbone of a retriever, but there are other types of retrievers as well.

 ## Get started
--- a/docs/docs_skeleton/docs/modules/data_connection/retrievers/self_query/index.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/retrievers/self_query/index.mdx
@@ -1,6 +1,6 @@
 # Self-querying

-A self-querying retriever is one that, as the name suggests, has the ability to query itself. Specifically, given any natural language query, the retriever uses a query-constructing LLM chain to write a structured query and then applies that structured query to it's underlying VectorStore. This allows the retriever to not only use the user-input query for semantic similarity comparison with the contents of stored documented, but to also extract filters from the user query on the metadata of stored documents and to execute those filters.
+A self-querying retriever is one that, as the name suggests, has the ability to query itself. Specifically, given any natural language query, the retriever uses a query-constructing LLM chain to write a structured query and then applies that structured query to its underlying VectorStore. This allows the retriever to not only use the user-input query for semantic similarity comparison with the contents of stored documents but to also extract filters from the user query on the metadata of stored documents and to execute those filters.

 ![](https://drive.google.com/uc?id=1OQUN-0MJcDUxmPXofgS7MqReEs720pqS)

--- a/docs/docs_skeleton/docs/modules/data_connection/retrievers/time_weighted_vectorstore.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/retrievers/time_weighted_vectorstore.mdx
@@ -8,7 +8,7 @@ The algorithm for scoring them is:
 semantic_similarity + (1.0 - decay_rate) ^ hours_passed
 ```

-Notably, `hours_passed` refers to the hours passed since the object in the retriever **was last accessed**, not since it was created. This means that frequently accessed objects remain "fresh."
+Notably, `hours_passed` refers to the hours passed since the object in the retriever **was last accessed**, not since it was created. This means that frequently accessed objects remain "fresh".

 import Example from "@snippets/modules/data_connection/retrievers/how_to/time_weighted_vectorstore.mdx"

--- a/docs/docs_skeleton/docs/modules/data_connection/retrievers/vectorstore.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/retrievers/vectorstore.mdx
@@ -1,9 +1,9 @@
 # Vector store-backed retriever

-A vector store retriever is a retriever that uses a vector store to retrieve documents. It is a lightweight wrapper around the Vector Store class to make it conform to the Retriever interface.
+A vector store retriever is a retriever that uses a vector store to retrieve documents. It is a lightweight wrapper around the vector store class to make it conform to the retriever interface.
 It uses the search methods implemented by a vector store, like similarity search and MMR, to query the texts in the vector store.

-Once you construct a Vector store, it's very easy to construct a retriever. Let's walk through an example.
+Once you construct a vector store, it's very easy to construct a retriever. Let's walk through an example.

 import Example from "@snippets/modules/data_connection/retrievers/how_to/vectorstore.mdx"

--- a/docs/docs_skeleton/docs/modules/data_connection/text_embedding/index.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/text_embedding/index.mdx
@@ -11,7 +11,7 @@ The Embeddings class is a class designed for interfacing with text embedding mod

 Embeddings create a vector representation of a piece of text. This is useful because it means we can think about text in the vector space, and do things like semantic search where we look for pieces of text that are most similar in the vector space.

-The base Embeddings class in LangChain exposes two methods: one for embedding documents and one for embedding a query. The former takes as input multiple texts, while the latter takes a single text. The reason for having these as two separate methods is that some embedding providers have different embedding methods for documents (to be searched over) vs queries (the search query itself).
+The base Embeddings class in LangChain provides two methods: one for embedding documents and one for embedding a query. The former takes as input multiple texts, while the latter takes a single text. The reason for having these as two separate methods is that some embedding providers have different embedding methods for documents (to be searched over) vs queries (the search query itself).

 ## Get started

--- a/docs/docs_skeleton/docs/modules/data_connection/vectorstores/index.mdx
+++ b/docs/docs_skeleton/docs/modules/data_connection/vectorstores/index.mdx
@@ -16,7 +16,7 @@ for you.

 ## Get started

-This walkthrough showcases basic functionality related to VectorStores. A key part of working with vector stores is creating the vector to put in them, which is usually created via embeddings. Therefore, it is recommended that you familiarize yourself with the [text embedding model](/docs/modules/data_connection/text_embedding/) interfaces before diving into this.
+This walkthrough showcases basic functionality related to vector stores. A key part of working with vector stores is creating the vector to put in them, which is usually created via embeddings. Therefore, it is recommended that you familiarize yourself with the [text embedding model](/docs/modules/data_connection/text_embedding/) interfaces before diving into this.

 import GetStarted from "@snippets/modules/data_connection/vectorstores/get_started.mdx"

--- a/docs/docs_skeleton/docs/modules/index.mdx
+++ b/docs/docs_skeleton/docs/modules/index.mdx
@@ -8,7 +8,7 @@ LangChain provides standard, extendable interfaces and external integrations for

 #### [Model I/O](/docs/modules/model_io/)
 Interface with language models
-#### [Data connection](/docs/modules/data_connection/)
+#### [Retrieval](/docs/modules/data_connection/)
 Interface with application-specific data
 #### [Chains](/docs/modules/chains/)
 Construct sequences of calls
@@ -18,5 +18,3 @@ Let chains choose which tools to use given high-level directives
 Persist application state between runs of a chain
 #### [Callbacks](/docs/modules/callbacks/)
 Log and stream intermediate steps of any chain
-#### [Evaluation](/docs/modules/evaluation/)
-Evaluate the performance of a chain.
--- a/docs/docs_skeleton/docs/modules/memory/chat_messages/index.mdx
+++ b/docs/docs_skeleton/docs/modules/memory/chat_messages/index.mdx
@@ -8,10 +8,10 @@ Head to [Integrations](/docs/integrations/memory/) for documentation on built-in
 :::

 One of the core utility classes underpinning most (if not all) memory modules is the `ChatMessageHistory` class.
-This is a super lightweight wrapper which exposes convenience methods for saving Human messages, AI messages, and then fetching them all.
+This is a super lightweight wrapper that provides convenience methods for saving HumanMessages, AIMessages, and then fetching them all.

 You may want to use this class directly if you are managing memory outside of a chain.

 import GetStarted from "@snippets/modules/memory/chat_messages/get_started.mdx"

-<GetStarted/>
+<GetStarted/>
--- a/docs/docs_skeleton/docs/modules/memory/index.mdx
+++ b/docs/docs_skeleton/docs/modules/memory/index.mdx
@@ -32,7 +32,7 @@ Even if these are not all used directly, they need to be stored in some form.
 One of the key parts of the LangChain memory module is a series of integrations for storing these chat messages,
 from in-memory lists to persistent databases.

- [Chat message storage](/docs/modules/memory/chat_messages/): How to work with Chat Messages, and the various integrations offered
+- [Chat message storage](/docs/modules/memory/chat_messages/): How to work with Chat Messages, and the various integrations offered.

 ### Querying: Data structures and algorithms on top of chat messages
 Keeping a list of chat messages is fairly straight-forward.
--- a/docs/docs_skeleton/docs/modules/memory/types/buffer.mdx
+++ b/docs/docs_skeleton/docs/modules/memory/types/buffer.mdx
@@ -1,6 +1,6 @@
-# Conversation buffer memory
+# Conversation Buffer

-This notebook shows how to use `ConversationBufferMemory`. This memory allows for storing of messages and then extracts the messages in a variable.
+This notebook shows how to use `ConversationBufferMemory`. This memory stores messages and then extracts them into a variable.

 We can first extract it as a string.

--- a/docs/docs_skeleton/docs/modules/memory/types/buffer_window.mdx
+++ b/docs/docs_skeleton/docs/modules/memory/types/buffer_window.mdx
@@ -1,6 +1,6 @@
-# Conversation buffer window memory
+# Conversation Buffer Window

-`ConversationBufferWindowMemory` keeps a list of the interactions of the conversation over time. It only uses the last K interactions. This can be useful for keeping a sliding window of the most recent interactions, so the buffer does not get too large
+`ConversationBufferWindowMemory` keeps a list of the interactions of the conversation over time. It only uses the last K interactions. This can be useful for keeping a sliding window of the most recent interactions, so the buffer does not get too large.

 Let's first explore the basic functionality of this type of memory.

--- a/docs/docs_skeleton/docs/modules/memory/types/entity_summary_memory.mdx
+++ b/docs/docs_skeleton/docs/modules/memory/types/entity_summary_memory.mdx
@@ -1,6 +1,6 @@
-# Entity memory
+# Entity

-Entity Memory remembers given facts about specific entities in a conversation. It extracts information on entities (using an LLM) and builds up its knowledge about that entity over time (also using an LLM).
+Entity memory remembers given facts about specific entities in a conversation. It extracts information on entities (using an LLM) and builds up its knowledge about that entity over time (also using an LLM).

 Let's first walk through using this functionality.

--- a/docs/docs_skeleton/docs/modules/memory/types/index.mdx
+++ b/docs/docs_skeleton/docs/modules/memory/types/index.mdx
@@ -1,8 +1,8 @@
 ---
 sidebar_position: 2
 ---
-# Memory Types
+# Memory types

 There are many different types of memory.
-Each have their own parameters, their own return types, and are useful in different scenarios.
+Each has its own parameters and return types, and is useful in different scenarios.
 Please see their individual page for more detail on each one.
--- a/docs/docs_skeleton/docs/modules/memory/types/summary.mdx
+++ b/docs/docs_skeleton/docs/modules/memory/types/summary.mdx
@@ -1,4 +1,4 @@
-# Conversation summary memory
+# Conversation Summary
 Now let's take a look at using a slightly more complex type of memory - `ConversationSummaryMemory`. This type of memory creates a summary of the conversation over time. This can be useful for condensing information from the conversation over time.
 Conversation summary memory summarizes the conversation as it happens and stores the current summary in memory. This memory can then be used to inject the summary of the conversation so far into a prompt/chain. This memory is most useful for longer conversations, where keeping the past message history in the prompt verbatim would take up too many tokens.

--- a/docs/docs_skeleton/docs/modules/memory/types/vectorstore_retriever_memory.mdx
+++ b/docs/docs_skeleton/docs/modules/memory/types/vectorstore_retriever_memory.mdx
@@ -1,6 +1,6 @@
-# Vector store-backed memory
+# Backed by a Vector Store

-`VectorStoreRetrieverMemory` stores memories in a VectorDB and queries the top-K most "salient" docs every time it is called.
+`VectorStoreRetrieverMemory` stores memories in a vector store and queries the top-K most "salient" docs every time it is called.

 This differs from most of the other Memory classes in that it doesn't explicitly track the order of interactions.

--- a/docs/docs_skeleton/docs/modules/model_io/models/chat/chat_model_caching.mdx
+++ b/docs/docs_skeleton/docs/modules/model_io/models/chat/chat_model_caching.mdx
@@ -1,5 +1,5 @@
 # Caching
-LangChain provides an optional caching layer for Chat Models. This is useful for two reasons:
+LangChain provides an optional caching layer for chat models. This is useful for two reasons:

 It can save you money by reducing the number of API calls you make to the LLM provider, if you're often requesting the same completion multiple times.
 It can speed up your application by reducing the number of API calls you make to the LLM provider.
--- a/docs/docs_skeleton/docs/modules/model_io/models/chat/index.mdx
+++ b/docs/docs_skeleton/docs/modules/model_io/models/chat/index.mdx
@@ -8,8 +8,8 @@ Head to [Integrations](/docs/integrations/chat/) for documentation on built-in i
 :::

 Chat models are a variation on language models.
-While chat models use language models under the hood, the interface they expose is a bit different.
-Rather than expose a "text in, text out" API, they expose an interface where "chat messages" are the inputs and outputs.
+While chat models use language models under the hood, the interface they use is a bit different.
+Rather than using a "text in, text out" API, they use an interface where "chat messages" are the inputs and outputs.

 Chat model APIs are fairly new, so we are still figuring out the correct abstractions.

--- a/docs/docs_skeleton/docs/modules/model_io/models/chat/prompts.mdx
+++ b/docs/docs_skeleton/docs/modules/model_io/models/chat/prompts.mdx
@@ -1,6 +1,6 @@
 # Prompts

-Prompts for Chat models are built around messages, instead of just plain text.
+Prompts for chat models are built around messages, instead of just plain text.

 import Prompts from "@snippets/modules/model_io/models/chat/how_to/prompts.mdx"

--- a/docs/docs_skeleton/docs/modules/model_io/models/chat/streaming.mdx
+++ b/docs/docs_skeleton/docs/modules/model_io/models/chat/streaming.mdx
@@ -1,6 +1,6 @@
 # Streaming

-Some Chat models provide a streaming response. This means that instead of waiting for the entire response to be returned, you can start processing it as soon as it's available. This is useful if you want to display the response to the user as it's being generated, or if you want to process the response as it's being generated.
+Some chat models provide a streaming response. This means that instead of waiting for the entire response to be returned, you can start processing it as soon as it's available. This is useful if you want to display the response to the user as it's being generated, or if you want to process the response as it's being generated.

 import StreamingChatModel from "@snippets/modules/model_io/models/chat/how_to/streaming.mdx"

--- a/docs/docs_skeleton/docs/modules/model_io/models/index.mdx
+++ b/docs/docs_skeleton/docs/modules/model_io/models/index.mdx
@@ -8,16 +8,16 @@ LangChain provides interfaces and integrations for two types of models:
 - [LLMs](/docs/modules/model_io/models/llms/): Models that take a text string as input and return a text string
 - [Chat models](/docs/modules/model_io/models/chat/): Models that are backed by a language model but take a list of Chat Messages as input and return a Chat Message

-## LLMs vs Chat Models
+## LLMs vs chat models

-LLMs and Chat Models are subtly but importantly different. LLMs in LangChain refer to pure text completion models.
+LLMs and chat models are subtly but importantly different. LLMs in LangChain refer to pure text completion models.
 The APIs they wrap take a string prompt as input and output a string completion. OpenAI's GPT-3 is implemented as an LLM.
 Chat models are often backed by LLMs but tuned specifically for having conversations.
-And, crucially, their provider APIs expose a different interface than pure text completion models. Instead of a single string,
+And, crucially, their provider APIs use a different interface than pure text completion models. Instead of a single string,
 they take a list of chat messages as input. Usually these messages are labeled with the speaker (usually one of "System",
-"AI", and "Human"). And they return a ("AI") chat message as output. GPT-4 and Anthropic's Claude are both implemented as Chat Models.
+"AI", and "Human"). And they return an AI chat message as output. GPT-4 and Anthropic's Claude are both implemented as chat models.

-To make it possible to swap LLMs and Chat Models, both implement the Base Language Model interface. This exposes common
+To make it possible to swap LLMs and chat models, both implement the Base Language Model interface. This includes common
 methods "predict", which takes a string and returns a string, and "predict messages", which takes messages and returns a message.
-If you are using a specific model it's recommended you use the methods specific to that model class (i.e., "predict" for LLMs and "predict messages" for Chat Models),
+If you are using a specific model it's recommended you use the methods specific to that model class (i.e., "predict" for LLMs and "predict messages" for chat models),
 but if you're creating an application that should work with different types of models the shared interface can be helpful.
--- a/docs/docs_skeleton/docs/modules/model_io/output_parsers/index.mdx
+++ b/docs/docs_skeleton/docs/modules/model_io/output_parsers/index.mdx
@@ -12,7 +12,7 @@ Output parsers are classes that help structure language model responses. There a

 And then one optional one:

- "Parse with prompt": A method which takes in a string (assumed to be the response from a language model) and a prompt (assumed to the prompt that generated such a response) and parses it into some structure. The prompt is largely provided in the event the OutputParser wants to retry or fix the output in some way, and needs information from the prompt to do so.
+- "Parse with prompt": A method which takes in a string (assumed to be the response from a language model) and a prompt (assumed to be the prompt that generated such a response) and parses it into some structure. The prompt is largely provided in the event the OutputParser wants to retry or fix the output in some way, and needs information from the prompt to do so.

 ## Get started

--- a/docs/docs_skeleton/docs/modules/model_io/prompts/index.mdx
+++ b/docs/docs_skeleton/docs/modules/model_io/prompts/index.mdx
@@ -3,10 +3,12 @@ sidebar_position: 0
 ---
 # Prompts

-The new way of programming models is through prompts.
-A **prompt** refers to the input to the model.
-This input is often constructed from multiple components.
-LangChain provides several classes and functions to make constructing and working with prompts easy.
+A prompt for a language model is a set of instructions or input provided by a user to
+guide the model's response, helping it understand the context and generate relevant
+and coherent language-based output, such as answering questions, completing sentences,
+or engaging in a conversation.

- [Prompt templates](/docs/modules/model_io/prompts/prompt_templates/): Parametrize model inputs
+LangChain provides several classes and functions to help construct and work with prompts.
+
+- [Prompt templates](/docs/modules/model_io/prompts/prompt_templates/): Parametrized model inputs
 - [Example selectors](/docs/modules/model_io/prompts/example_selectors/): Dynamically select examples to include in prompts
--- a/docs/docs_skeleton/docs/modules/model_io/prompts/prompt_templates/few_shot_examples.mdx
+++ b/docs/docs_skeleton/docs/modules/model_io/prompts/prompt_templates/few_shot_examples.mdx
@@ -1,6 +1,6 @@
 # Few-shot prompt templates

-In this tutorial, we'll learn how to create a prompt template that uses few shot examples. A few shot prompt template can be constructed from either a set of examples, or from an Example Selector object.
+In this tutorial, we'll learn how to create a prompt template that uses few-shot examples. A few-shot prompt template can be constructed from either a set of examples, or from an Example Selector object.

 import Example from "@snippets/modules/model_io/prompts/prompt_templates/few_shot_examples.mdx"

--- a/docs/docs_skeleton/docs/modules/model_io/prompts/prompt_templates/index.mdx
+++ b/docs/docs_skeleton/docs/modules/model_io/prompts/prompt_templates/index.mdx
@@ -4,18 +4,15 @@ sidebar_position: 0

 # Prompt templates

-Language models take text as input - that text is commonly referred to as a prompt.
-Typically this is not simply a hardcoded string but rather a combination of a template, some examples, and user input.
-LangChain provides several classes and functions to make constructing and working with prompts easy.
+Prompt templates are pre-defined recipes for generating prompts for language models.

-## What is a prompt template?
+A template may include instructions, few-shot examples, and specific context and
+questions appropriate for a given task.

-A prompt template refers to a reproducible way to generate a prompt. It contains a text string ("the template"), that can take in a set of parameters from the end user and generates a prompt.
+LangChain provides tooling to create and work with prompt templates.

-A prompt template can contain:
- instructions to the language model,
- a set of few shot examples to help the language model generate a better response,
- a question to the language model.
+LangChain strives to create model-agnostic templates to make it easy to reuse
+existing templates across different language models.

 import GetStarted from "@snippets/modules/model_io/prompts/prompt_templates/get_started.mdx"

--- a/docs/docs_skeleton/docs/modules/model_io/prompts/prompt_templates/partial.mdx
+++ b/docs/docs_skeleton/docs/modules/model_io/prompts/prompt_templates/partial.mdx
@@ -1,6 +1,6 @@
 # Partial prompt templates

-Like other methods, it can make sense to "partial" a prompt template - eg pass in a subset of the required values, as to create a new prompt template which expects only the remaining subset of values.
+As with other methods, it can make sense to "partial" a prompt template - that is, to pass in a subset of the required values so as to create a new prompt template which expects only the remaining subset of values.

 LangChain supports this in two ways:
 1. Partial formatting with string values.
--- a/docs/docs_skeleton/docs/modules/model_io/prompts/prompt_templates/prompt_composition.mdx
+++ b/docs/docs_skeleton/docs/modules/model_io/prompts/prompt_templates/prompt_composition.mdx
@@ -2,8 +2,8 @@

 This notebook goes over how to compose multiple prompts together. This can be useful when you want to reuse parts of prompts. This can be done with a PipelinePrompt. A PipelinePrompt consists of two main parts:

- Final prompt: This is the final prompt that is returned
- Pipeline prompts: This is a list of tuples, consisting of a string name and a prompt template. Each prompt template will be formatted and then passed to future prompt templates as a variable with the same name.
+- Final prompt: The final prompt that is returned
+- Pipeline prompts: A list of tuples, consisting of a string name and a prompt template. Each prompt template will be formatted and then passed to future prompt templates as a variable with the same name.

 import Example from "@snippets/modules/model_io/prompts/prompt_templates/prompt_composition.mdx"

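Reviewer note: the "Pipeline prompts" bullet in the hunk above describes formatting each named template and passing the result to later templates as a variable with the same name. A minimal standard-library sketch of that idea follows. It is not LangChain's implementation: `format_pipeline` and the sample templates are hypothetical names introduced only for illustration.

```python
from typing import Dict, List, Tuple


def format_pipeline(
    final_template: str,
    pipeline_templates: List[Tuple[str, str]],
    **inputs: str,
) -> str:
    """Format each (name, template) pair in order, then the final template.

    Hypothetical helper: it only mimics the behaviour described in the docs text.
    """
    values: Dict[str, str] = dict(inputs)
    for name, template in pipeline_templates:
        # The formatted result is stored under the template's name so that
        # later templates (and the final one) can reference it as a variable.
        values[name] = template.format(**values)
    return final_template.format(**values)


final = "{introduction}\n\n{start}"
pipeline = [
    ("introduction", "You are impersonating {person}."),
    ("start", "Q: {question}\nA:"),
]
result = format_pipeline(
    final, pipeline, person="Ada Lovelace", question="What is 2 + 2?"
)
print(result)
```

Plain `str.format` ignores unused keyword arguments, which is why every template can receive the full set of accumulated values.
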
--- a/docs/docs_skeleton/docs/use_cases/apis/api.mdx
+++ b/docs/docs_skeleton/docs/use_cases/apis/api.mdx
@@ -1,9 +0,0 @@
---
-sidebar_position: 0
---
-# API chains
-APIChain enables using LLMs to interact with APIs to retrieve relevant information. Construct the chain by providing a question relevant to the provided API documentation.
-
-import Example from "@snippets/modules/chains/popular/api.mdx"
-
-<Example/>
--- a/docs/docs_skeleton/docs/use_cases/question_answering/_category_.yml
+++ b/docs/docs_skeleton/docs/use_cases/question_answering/_category_.yml
@@ -0,0 +1,2 @@
+position: 0
+collapsed: false
--- a/docs/docs_skeleton/docs/use_cases/question_answering/how_to/chat_vector_db.mdx
+++ b/docs/docs_skeleton/docs/use_cases/question_answering/how_to/chat_vector_db.mdx
@@ -5,7 +5,7 @@ sidebar_position: 2
 # Store and reference chat history
 The ConversationalRetrievalQA chain builds on RetrievalQAChain to provide a chat history component.

-It first combines the chat history (either explicitly passed in or retrieved from the provided memory) and the question into a standalone question, then looks up relevant documents from the retriever, and finally passes those documents and the question to a question answering chain to return a response.
+It first combines the chat history (either explicitly passed in or retrieved from the provided memory) and the question into a standalone question, then looks up relevant documents from the retriever, and finally passes those documents and the question to a question-answering chain to return a response.

 To create one, you will need a retriever. In the below example, we will create one from a vector store, which can be created from embeddings.

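Reviewer note: the three steps described in the hunk above (condense the history and question into a standalone question, retrieve documents, answer from those documents) can be sketched with plain callables. This assumes nothing about the real chain's API: `conversational_answer` and the toy stand-ins are hypothetical, and passing the standalone question (rather than the original one) to the final step is an assumption.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (question, answer) pairs


def conversational_answer(
    question: str,
    chat_history: History,
    condense: Callable[[str, History], str],
    retrieve: Callable[[str], List[str]],
    answer: Callable[[str, List[str]], str],
) -> str:
    # Step 1: fold the chat history and the new question into one standalone question.
    standalone = condense(question, chat_history)
    # Step 2: look up relevant documents for that standalone question.
    docs = retrieve(standalone)
    # Step 3: hand the documents and the question to the answering step.
    return answer(standalone, docs)


# Toy stand-ins so the sketch runs without a model or a vector store.
corpus = [
    "LangChain supports prompt templates.",
    "Docusaurus builds the docs site.",
]


def toy_condense(question: str, history: History) -> str:
    if not history:
        return question
    last_question, _ = history[-1]
    return f"{question} (follow-up to: {last_question})"


def toy_retrieve(query: str) -> List[str]:
    # Naive keyword match standing in for a retriever.
    return [d for d in corpus if "prompt" in query.lower() and "prompt" in d.lower()]


def toy_answer(query: str, docs: List[str]) -> str:
    return f"{len(docs)} document(s) used for: {query}"


reply = conversational_answer(
    "Can they be partial?",
    [("What are prompt templates?", "Reusable prompts.")],
    toy_condense,
    toy_retrieve,
    toy_answer,
)
print(reply)
```
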
--- a/docs/docs_skeleton/docusaurus.config.js
+++ b/docs/docs_skeleton/docusaurus.config.js
@@ -128,6 +128,10 @@ const config = {
          hideable: true,
        },
      },
+      colorMode: {
+        disableSwitch: false,
+        respectPrefersColorScheme: true,
+      },
      prism: {
        theme: {
          ...baseLightCodeBlockTheme,
--- a/docs/docs_skeleton/package-lock.json
+++ b/docs/docs_skeleton/package-lock.json
@@ -12,7 +12,7 @@
        "@docusaurus/preset-classic": "2.4.0",
        "@docusaurus/remark-plugin-npm2yarn": "^2.4.0",
        "@mdx-js/react": "^1.6.22",
-        "@mendable/search": "^0.0.125",
+        "@mendable/search": "^0.0.150",
        "clsx": "^1.2.1",
        "json-loader": "^0.5.7",
        "process": "^0.11.10",
@@ -3212,10 +3212,11 @@
      }
    },
    "node_modules/@mendable/search": {
-      "version": "0.0.125",
-      "resolved": "https://registry.npmjs.org/@mendable/search/-/search-0.0.125.tgz",
-      "integrity": "sha512-Mb1J3zDhOyBZV9cXqJocSOBNYGpe8+LQDqd9n9laPWxosSJcSTUewqtlIbMerrYsScBsxskoSiWgRsc7xF5z0Q==",
+      "version": "0.0.150",
+      "resolved": "https://registry.npmjs.org/@mendable/search/-/search-0.0.150.tgz",
+      "integrity": "sha512-Eb5SeAWlMxzEim/8eJ/Ysn01Pyh39xlPBzRBw/5OyOBhti0HVLXk4wd1Fq2TKgJC2ppQIvhEKO98PUcj9dNDFw==",
      "dependencies": {
+        "html-react-parser": "^4.2.0",
        "posthog-js": "^1.45.1"
      },
      "peerDependencies": {
@@ -8332,6 +8333,33 @@
        "safe-buffer": "~5.1.0"
      }
    },
+    "node_modules/html-dom-parser": {
+      "version": "4.0.0",
+      "resolved": "https://registry.npmjs.org/html-dom-parser/-/html-dom-parser-4.0.0.tgz",
+      "integrity": "sha512-TUa3wIwi80f5NF8CVWzkopBVqVAtlawUzJoLwVLHns0XSJGynss4jiY0mTWpiDOsuyw+afP+ujjMgRh9CoZcXw==",
+      "dependencies": {
+        "domhandler": "5.0.3",
+        "htmlparser2": "9.0.0"
+      }
+    },
+    "node_modules/html-dom-parser/node_modules/htmlparser2": {
+      "version": "9.0.0",
+      "resolved": "https://registry.npmjs.org/htmlparser2/-/htmlparser2-9.0.0.tgz",
+      "integrity": "sha512-uxbSI98wmFT/G4P2zXx4OVx04qWUmyFPrD2/CNepa2Zo3GPNaCaaxElDgwUrwYWkK1nr9fft0Ya8dws8coDLLQ==",
+      "funding": [
+        "https://github.com/fb55/htmlparser2?sponsor=1",
+        {
+          "type": "github",
+          "url": "https://github.com/sponsors/fb55"
+        }
+      ],
+      "dependencies": {
+        "domelementtype": "^2.3.0",
+        "domhandler": "^5.0.3",
+        "domutils": "^3.1.0",
+        "entities": "^4.5.0"
+      }
+    },
    "node_modules/html-entities": {
      "version": "2.4.0",
      "resolved": "https://registry.npmjs.org/html-entities/-/html-entities-2.4.0.tgz",
@@ -8375,6 +8403,20 @@
        "node": ">= 12"
      }
    },
+    "node_modules/html-react-parser": {
+      "version": "4.2.0",
+      "resolved": "https://registry.npmjs.org/html-react-parser/-/html-react-parser-4.2.0.tgz",
+      "integrity": "sha512-gzU55AS+FI6qD7XaKe5BLuLFM2Xw0/LodfMWZlxV9uOHe7LCD5Lukx/EgYuBI3c0kLu0XlgFXnSzO0qUUn3Vrg==",
+      "dependencies": {
+        "domhandler": "5.0.3",
+        "html-dom-parser": "4.0.0",
+        "react-property": "2.0.0",
+        "style-to-js": "1.1.3"
+      },
+      "peerDependencies": {
+        "react": "0.14 || 15 || 16 || 17 || 18"
+      }
+    },
    "node_modules/html-tags": {
      "version": "3.3.1",
      "resolved": "https://registry.npmjs.org/html-tags/-/html-tags-3.3.1.tgz",
@@ -11762,6 +11804,11 @@
        "webpack": ">=4.41.1 || 5.x"
      }
    },
+    "node_modules/react-property": {
+      "version": "2.0.0",
+      "resolved": "https://registry.npmjs.org/react-property/-/react-property-2.0.0.tgz",
+      "integrity": "sha512-kzmNjIgU32mO4mmH5+iUyrqlpFQhF8K2k7eZ4fdLSOPFrD1XgEuSBv9LDEgxRXTMBqMd8ppT0x6TIzqE5pdGdw=="
+    },
    "node_modules/react-router": {
      "version": "5.3.4",
      "resolved": "https://registry.npmjs.org/react-router/-/react-router-5.3.4.tgz",
@@ -13127,6 +13174,22 @@
        "url": "https://github.com/sponsors/sindresorhus"
      }
    },
+    "node_modules/style-to-js": {
+      "version": "1.1.3",
+      "resolved": "https://registry.npmjs.org/style-to-js/-/style-to-js-1.1.3.tgz",
+      "integrity": "sha512-zKI5gN/zb7LS/Vm0eUwjmjrXWw8IMtyA8aPBJZdYiQTXj4+wQ3IucOLIOnF7zCHxvW8UhIGh/uZh/t9zEHXNTQ==",
+      "dependencies": {
+        "style-to-object": "0.4.1"
+      }
+    },
+    "node_modules/style-to-js/node_modules/style-to-object": {
+      "version": "0.4.1",
+      "resolved": "https://registry.npmjs.org/style-to-object/-/style-to-object-0.4.1.tgz",
+      "integrity": "sha512-HFpbb5gr2ypci7Qw+IOhnP2zOU7e77b+rzM+wTzXzfi1PrtBCX0E7Pk4wL4iTLnhzZ+JgEGAhX81ebTg/aYjQw==",
+      "dependencies": {
+        "inline-style-parser": "0.1.1"
+      }
+    },
    "node_modules/style-to-object": {
      "version": "0.3.0",
      "resolved": "https://registry.npmjs.org/style-to-object/-/style-to-object-0.3.0.tgz",
--- a/docs/docs_skeleton/package.json
+++ b/docs/docs_skeleton/package.json
@@ -23,7 +23,7 @@
    "@docusaurus/preset-classic": "2.4.0",
    "@docusaurus/remark-plugin-npm2yarn": "^2.4.0",
    "@mdx-js/react": "^1.6.22",
-    "@mendable/search": "^0.0.125",
+    "@mendable/search": "^0.0.150",
    "clsx": "^1.2.1",
    "json-loader": "^0.5.7",
    "process": "^0.11.10",
--- a/docs/docs_skeleton/sidebars.js
+++ b/docs/docs_skeleton/sidebars.js
@@ -44,6 +44,16 @@ module.exports = {
        id: "modules/index"
      },
    },
+    {
+      type: "category",
+      label: "LangChain Expression Language",
+      collapsed: true,
+      items: [{ type: "autogenerated", dirName: "expression_language" } ],
+      link: {
+        type: 'doc',
+        id: "expression_language/index"
+      },
+    },
    {
      type: "category",
      label: "Guides",
@@ -52,52 +62,63 @@ module.exports = {
      link: {
        type: 'generated-index',
        description: 'Design guides for key parts of the development process',
-      slug: "guides",
-      },
-    },
-    {
-      type: "category",
-      label: "Ecosystem",
-      collapsed: true,
-      items: [{ type: "autogenerated", dirName: "ecosystem" }],
-      link: {
-        type: 'generated-index',
-      slug: "ecosystem",
+        slug: "guides",
      },
    },
    {
      type: "category",
      label: "Additional resources",
      collapsed: true,
-      items: [{ type: "autogenerated", dirName: "additional_resources" }, { type: "link", label: "Gallery", href: "https://github.com/kyrolabs/awesome-langchain" }],
+      items: [
+        { type: "autogenerated", dirName: "additional_resources" },
+        { type: "link", label: "Gallery", href: "https://github.com/kyrolabs/awesome-langchain" }
+      ],
      link: {
        type: 'generated-index',
-      slug: "additional_resources",
+        slug: "additional_resources",
      },
    },
+    'community'
  ],
  integrations: [
    {
      type: "category",
-      label: "Integrations",
+      label: "Providers",
      collapsible: false,
-      items: [{ type: "autogenerated", dirName: "integrations" }],
+      items: [
+        { type: "autogenerated", dirName: "integrations/platforms" },
+        { type: "category", label: "More", collapsed: true, items: [{type:"autogenerated", dirName: "integrations/providers" }]},
+      ],
      link: {
        type: 'generated-index',
-      slug: "integrations",
+        slug: "integrations/providers",
+      },
+    },
+    {
+      type: "category",
+      label: "Components",
+      collapsible: false,
+      items: [
+        { type: "category", label: "LLMs", collapsed: true, items: [{type:"autogenerated", dirName: "integrations/llms" }], link: {type: "generated-index", slug: "integrations/llms" }},
+        { type: "category", label: "Chat models", collapsed: true, items: [{type:"autogenerated", dirName: "integrations/chat" }], link: {type: "generated-index", slug: "integrations/chat" }},
+        { type: "category", label: "Document loaders", collapsed: true, items: [{type:"autogenerated", dirName: "integrations/document_loaders" }], link: {type: "generated-index", slug: "integrations/document_loaders" }},
+        { type: "category", label: "Document transformers", collapsed: true, items: [{type: "autogenerated", dirName: "integrations/document_transformers" }], link: {type: "generated-index", slug: "integrations/document_transformers" }},
+        { type: "category", label: "Text embedding models", collapsed: true, items: [{type: "autogenerated", dirName: "integrations/text_embedding" }], link: {type: "generated-index", slug: "integrations/text_embedding" }},
+        { type: "category", label: "Vector stores", collapsed: true, items: [{type: "autogenerated", dirName: "integrations/vectorstores" }], link: {type: "generated-index", slug: "integrations/vectorstores" }},
+        { type: "category", label: "Retrievers", collapsed: true, items: [{type: "autogenerated", dirName: "integrations/retrievers" }], link: {type: "generated-index", slug: "integrations/retrievers" }},
+        { type: "category", label: "Tools", collapsed: true, items: [{type: "autogenerated", dirName: "integrations/tools" }], link: {type: "generated-index", slug: "integrations/tools" }},
+        { type: "category", label: "Agents and toolkits", collapsed: true, items: [{type: "autogenerated", dirName: "integrations/toolkits" }], link: {type: "generated-index", slug: "integrations/toolkits" }},
+        { type: "category", label: "Memory", collapsed: true, items: [{type: "autogenerated", dirName: "integrations/memory" }], link: {type: "generated-index", slug: "integrations/memory" }},
+        { type: "category", label: "Callbacks", collapsed: true, items: [{type: "autogenerated", dirName: "integrations/callbacks" }], link: {type: "generated-index", slug: "integrations/callbacks" }},
+        { type: "category", label: "Chat loaders", collapsed: true, items: [{type: "autogenerated", dirName: "integrations/chat_loaders" }], link: {type: "generated-index", slug: "integrations/chat_loaders" }},
+      ],
+      link: {
+        type: 'generated-index',
+      slug: "integrations/components",
      },
    },
  ],
  use_cases: [
-    {
-      type: "category",
-      label: "Use cases",
-      collapsible: false,
-      items: [{ type: "autogenerated", dirName: "use_cases" }],
-      link: {
-        type: 'generated-index',
-      slug: "use_cases",
-      },
-    },
+    {type: "autogenerated", dirName: "use_cases" }
  ],
 };
--- a/docs/docs_skeleton/src/pages/index.js
+++ b/docs/docs_skeleton/src/pages/index.js
@@ -11,5 +11,5 @@ import React from "react";
 import { Redirect } from "@docusaurus/router";

 export default function Home() {
-  return <Redirect to="docs/get_started/introduction.html" />;
+  return <Redirect to="docs/get_started/introduction" />;
 }
--- a/docs/docs_skeleton/src/theme/CodeBlock/index.js
+++ b/docs/docs_skeleton/src/theme/CodeBlock/index.js
@@ -24,8 +24,7 @@ function Imports({ imports }) {
          <li key={imported}>
            <a href={docs}>
              <span>{imported}</span>
-            </a>{" "}
-            from <code>{source}</code>
+            </a>
          </li>
        ))}
      </ul>
--- a/docs/docs_skeleton/static/img/OSS_LLM_overview.png
+++ b/docs/docs_skeleton/static/img/OSS_LLM_overview.png
--- a/docs/docs_skeleton/static/img/ReAct.png
+++ b/docs/docs_skeleton/static/img/ReAct.png
--- a/docs/docs_skeleton/static/img/SQLDatabaseToolkit.png
+++ b/docs/docs_skeleton/static/img/SQLDatabaseToolkit.png
--- a/docs/docs_skeleton/static/img/agents_use_case_1.png
+++ b/docs/docs_skeleton/static/img/agents_use_case_1.png
--- a/docs/docs_skeleton/static/img/agents_use_case_trace_1.png
+++ b/docs/docs_skeleton/static/img/agents_use_case_trace_1.png
--- a/docs/docs_skeleton/static/img/agents_use_case_trace_2.png
+++ b/docs/docs_skeleton/static/img/agents_use_case_trace_2.png
--- a/docs/docs_skeleton/static/img/agents_vs_chains.png
+++ b/docs/docs_skeleton/static/img/agents_vs_chains.png
--- a/docs/docs_skeleton/static/img/api_chain.png
+++ b/docs/docs_skeleton/static/img/api_chain.png
--- a/docs/docs_skeleton/static/img/api_chain_response.png
+++ b/docs/docs_skeleton/static/img/api_chain_response.png
--- a/docs/docs_skeleton/static/img/api_function_call.png
+++ b/docs/docs_skeleton/static/img/api_function_call.png