Merge branch 'master' into erick/skip-release-check-cli

another if
remove if
2026-02-06 09:10:27 +00:00 · 2023-11-14 15:16:26 -08:00 · 2023-11-08 08:13:28 -08:00 · 2023-11-08 08:10:35 -08:00 · 2023-11-08 08:08:59 -08:00
2226 changed files with 59735 additions and 141555 deletions
--- a/.github/CONTRIBUTING.md
+++ b/.github/CONTRIBUTING.md
@@ -23,7 +23,7 @@ It's essential that we maintain great documentation and testing. If you:
  - Update any affected example notebooks and documentation. These live in `docs`.
  - Update unit and integration tests when relevant.
 - Add a feature
-  - Add a demo notebook in `docs/docs/`.
+  - Add a demo notebook in `docs/modules`.
  - Add unit and integration tests.

 We are a small, progress-oriented team. If there's something you'd like to add or change, opening a pull request is the
@@ -72,10 +72,9 @@ tell Poetry to use the virtualenv python environment (`poetry config virtualenvs

 ### Core vs. Experimental

-This repository contains three separate projects:
+This repository contains two separate projects:
 - `langchain`: core langchain code, abstractions, and use cases.
-- `langchain_core`: contain interfaces for key abstractions as well as logic for combining them in chains (LCEL).
-- `langchain_experimental`: see the [Experimental README](https://github.com/langchain-ai/langchain/tree/master/libs/experimental/README.md) for more information.
+- `langchain.experimental`: see the [Experimental README](https://github.com/langchain-ai/langchain/tree/master/libs/experimental/README.md) for more information.

 Each of these has its own development environment. Docs are run from the top-level makefile, but development
 is split across separate test & release flows.
@@ -129,24 +128,6 @@ make docker_tests

 There are also [integration tests and code-coverage](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/tests/README.md) available.

-### Only develop langchain_core or langchain_experimental
-
-If you are only developing `langchain_core` or `langchain_experimental`, you can simply install the dependencies for the respective projects and run tests:
-
-```bash
-cd libs/core
-poetry install --with test
-make test
-```
-
-Or:
-
-```bash
-cd libs/experimental
-poetry install --with test
-make test
-```
-
 ### Formatting and Linting

 Run these locally before submitting a PR; the CI system will check also.
@@ -233,10 +214,6 @@ ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogy

 Langchain relies heavily on optional dependencies to keep the Langchain package lightweight.

-You only need to add a new dependency if a **unit test** relies on the package.
-If your package is only required for **integration tests**, then you can skip these
-steps and leave all pyproject.toml and poetry.lock files alone.
-
 If you're adding a new dependency to Langchain, assume that it will be an optional dependency, and
 that most users won't have it installed.

--- a/.github/scripts/check_diff.py
+++ b/.github/scripts/check_diff.py
@@ -1,45 +0,0 @@
-import json
-import sys
-
-ALL_DIRS = {
-    "libs/core",
-    "libs/langchain",
-    "libs/experimental",
-}
-
-if __name__ == "__main__":
-    files = sys.argv[1:]
-    dirs_to_run = set()
-
-    for file in files:
-        if any(
-            file.startswith(dir_)
-            for dir_ in (
-                ".github/workflows",
-                ".github/tools",
-                ".github/actions",
-                "libs/core",
-                ".github/scripts/check_diff.py",
-            )
-        ):
-            dirs_to_run = ALL_DIRS
-            break
-        elif "libs/community" in file:
-            dirs_to_run.update(
-                ("libs/community", "libs/langchain", "libs/experimental")
-            )
-        elif "libs/partners" in file:
-            partner_dir = file.split("/")[2]
-            dirs_to_run.update(
-                (f"libs/partners/{partner_dir}", "libs/langchain", "libs/experimental")
-            )
-        elif "libs/langchain" in file:
-            dirs_to_run.update(("libs/langchain", "libs/experimental"))
-        elif "libs/experimental" in file:
-            dirs_to_run.add("libs/experimental")
-        elif file.startswith("libs/"):
-            dirs_to_run = ALL_DIRS
-            break
-        else:
-            pass
-    print(json.dumps(list(dirs_to_run)))
--- a/.github/workflows/_compile_integration_test.yml
+++ b/.github/workflows/_compile_integration_test.yml
@@ -38,7 +38,7 @@ jobs:

      - name: Install integration dependencies
        shell: bash
-        run: poetry install --with=test_integration,test
+        run: poetry install --with=test_integration

      - name: Check integration tests compile
        shell: bash
--- a/.github/workflows/_lint.yml
+++ b/.github/workflows/_lint.yml
@@ -68,7 +68,7 @@ jobs:
        # It doesn't matter how you change it, any change will cause a cache-bust.
        working-directory: ${{ inputs.working-directory }}
        run: |
-          poetry install --with lint,typing
+          poetry install --with dev,lint,test,typing

      - name: Install langchain editable
        working-directory: ${{ inputs.working-directory }}
@@ -76,7 +76,7 @@ jobs:
        env:
          LANGCHAIN_LOCATION: ${{ inputs.langchain-location }}
        run: |
-          poetry run pip install -e "$LANGCHAIN_LOCATION"
+          pip install -e "$LANGCHAIN_LOCATION"

      - name: Get .mypy_cache to speed up mypy
        uses: actions/cache@v3
@@ -90,31 +90,4 @@ jobs:
      - name: Analysing the code with our lint
        working-directory: ${{ inputs.working-directory }}
        run: |
-          make lint_package
-
-      - name: Install test dependencies
-        # Also installs dev/lint/test/typing dependencies, to ensure we have
-        # type hints for as many of our libraries as possible.
-        # This helps catch errors that require dependencies to be spotted, for example:
-        # https://github.com/langchain-ai/langchain/pull/10249/files#diff-935185cd488d015f026dcd9e19616ff62863e8cde8c0bee70318d3ccbca98341
-        #
-        # If you change this configuration, make sure to change the `cache-key`
-        # in the `poetry_setup` action above to stop using the old cache.
-        # It doesn't matter how you change it, any change will cause a cache-bust.
-        working-directory: ${{ inputs.working-directory }}
-        run: |
-          poetry install --with test
-
-      - name: Get .mypy_cache to speed up mypy
-        uses: actions/cache@v3
-        env:
-          SEGMENT_DOWNLOAD_TIMEOUT_MIN: "2"
-        with:
-          path: |
-            ${{ env.WORKDIR }}/.mypy_cache
-          key: mypy-test-${{ runner.os }}-${{ runner.arch }}-py${{ matrix.python-version }}-${{ inputs.working-directory }}-${{ hashFiles(format('{0}/poetry.lock', env.WORKDIR)) }}
-
-      - name: Analysing the code with our lint
-        working-directory: ${{ inputs.working-directory }}
-        run: |
-          make lint_tests
+          make lint
--- a/.github/workflows/_pydantic_compatibility.yml
+++ b/.github/workflows/_pydantic_compatibility.yml
@@ -1,4 +1,4 @@
-name: dependencies
+name: pydantic v1/v2 compatibility

 on:
  workflow_call:
@@ -7,10 +7,6 @@ on:
        required: true
        type: string
        description: "From which folder this pipeline executes"
-      langchain-location:
-        required: false
-        type: string
-        description: "Relative path to the langchain library folder"

 env:
  POETRY_VERSION: "1.6.1"
@@ -28,7 +24,7 @@ jobs:
          - "3.9"
          - "3.10"
          - "3.11"
-    name: dependencies - Python ${{ matrix.python-version }}
+    name: Pydantic v1/v2 compatibility - Python ${{ matrix.python-version }}
    steps:
      - uses: actions/checkout@v4

@@ -44,22 +40,6 @@ jobs:
        shell: bash
        run: poetry install

-      - name: Check imports with base dependencies
-        shell: bash
-        run: poetry run make check_imports
-
-      - name: Install test dependencies
-        shell: bash
-        run: poetry install --with test
-
-      - name: Install langchain editable
-        working-directory: ${{ inputs.working-directory }}
-        if: ${{ inputs.langchain-location }}
-        env:
-          LANGCHAIN_LOCATION: ${{ inputs.langchain-location }}
-        run: |
-          poetry run pip install -e "$LANGCHAIN_LOCATION"
-
      - name: Install the opposite major version of pydantic
        # If normal tests use pydantic v1, here we'll use v2, and vice versa.
        shell: bash
--- a/.github/workflows/_release.yml
+++ b/.github/workflows/_release.yml
@@ -14,7 +14,7 @@ env:

 jobs:
  build:
-    if: github.ref == 'refs/heads/master'
+    # if: github.ref == 'refs/heads/master'
    runs-on: ubuntu-latest

    outputs:
@@ -70,58 +70,59 @@ jobs:
      working-directory: ${{ inputs.working-directory }}
    secrets: inherit

-  pre-release-checks:
-    needs:
-      - build
-      - test-pypi-publish
-    runs-on: ubuntu-latest
-    steps:
-      # We explicitly *don't* set up caching here. This ensures our tests are
-      # maximally sensitive to catching breakage.
-      #
-      # For example, here's a way that caching can cause a falsely-passing test:
-      # - Make the langchain package manifest no longer list a dependency package
-      #   as a requirement. This means it won't be installed by `pip install`,
-      #   and attempting to use it would cause a crash.
-      # - That dependency used to be required, so it may have been cached.
-      #   When restoring the venv packages from cache, that dependency gets included.
-      # - Tests pass, because the dependency is present even though it wasn't specified.
-      # - The package is published, and it breaks on the missing dependency when
-      #   used in the real world.
-      - uses: actions/setup-python@v4
-        with:
-          python-version: ${{ env.PYTHON_VERSION }}
+  # pre-release-checks:
+  #   needs:
+  #     - build
+  #     - test-pypi-publish
+  #   runs-on: ubuntu-latest
+  #   steps:
+  #     # We explicitly *don't* set up caching here. This ensures our tests are
+  #     # maximally sensitive to catching breakage.
+  #     #
+  #     # For example, here's a way that caching can cause a falsely-passing test:
+  #     # - Make the langchain package manifest no longer list a dependency package
+  #     #   as a requirement. This means it won't be installed by `pip install`,
+  #     #   and attempting to use it would cause a crash.
+  #     # - That dependency used to be required, so it may have been cached.
+  #     #   When restoring the venv packages from cache, that dependency gets included.
+  #     # - Tests pass, because the dependency is present even though it wasn't specified.
+  #     # - The package is published, and it breaks on the missing dependency when
+  #     #   used in the real world.
+  #     - uses: actions/setup-python@v4
+  #       with:
+  #         python-version: ${{ env.PYTHON_VERSION }}

-      - name: Test published package
-        shell: bash
-        env:
-          PKG_NAME: ${{ needs.build.outputs.pkg-name }}
-          VERSION: ${{ needs.build.outputs.version }}
-        # Here we use:
-        # - The default regular PyPI index as the *primary* index, meaning 
-        #   that it takes priority (https://pypi.org/simple)
-        # - The test PyPI index as an extra index, so that any dependencies that
-        #   are not found on test PyPI can be resolved and installed anyway.
-        #   (https://test.pypi.org/simple). This will include the PKG_NAME==VERSION
-        #   package because VERSION will not have been uploaded to regular PyPI yet.
-        #
-        # TODO: add more in-depth pre-publish tests after testing that importing works
-        run: |
-          pip install \
-            --extra-index-url https://test.pypi.org/simple/ \
-            "$PKG_NAME==$VERSION"
+  #     - name: Test published package
+  #       shell: bash
+  #       env:
+  #         PKG_NAME: ${{ needs.build.outputs.pkg-name }}
+  #         VERSION: ${{ needs.build.outputs.version }}
+  #       # Here we specify:
+  #       # - The test PyPI index as the *primary* index, meaning that it takes priority.
+  #       # - The regular PyPI index as an extra index, so that any dependencies that
+  #       #   are not found on test PyPI can be resolved and installed anyway.
+  #       #
+  #       # Without the former, we might install the wrong langchain release.
+  #       # Without the latter, we might not be able to install langchain's dependencies.
+  #       #
+  #       # TODO: add more in-depth pre-publish tests after testing that importing works
+  #       run: |
+  #         pip install \
+  #           --index-url https://test.pypi.org/simple/ \
+  #           --extra-index-url https://pypi.org/simple/ \
+  #           "$PKG_NAME==$VERSION"

-          # Replace all dashes in the package name with underscores,
-          # since that's how Python imports packages with dashes in the name.
-          IMPORT_NAME="$(echo "$PKG_NAME" | sed s/-/_/g)"
+  #         # Replace all dashes in the package name with underscores,
+  #         # since that's how Python imports packages with dashes in the name.
+  #         IMPORT_NAME="$(echo "$PKG_NAME" | sed s/-/_/g)"

-          python -c "import $IMPORT_NAME; print(dir($IMPORT_NAME))"
+  #         python -c "import $IMPORT_NAME; print(dir($IMPORT_NAME))"

  publish:
    needs:
      - build
      - test-pypi-publish
-      - pre-release-checks
+      # - pre-release-checks
    runs-on: ubuntu-latest
    permissions:
      # This permission is used for trusted publishing:
@@ -162,7 +163,7 @@ jobs:
    needs:
      - build
      - test-pypi-publish
-      - pre-release-checks
+      # - pre-release-checks
      - publish
    runs-on: ubuntu-latest
    permissions:
--- a/.github/workflows/_test.yml
+++ b/.github/workflows/_test.yml
@@ -7,10 +7,6 @@ on:
        required: true
        type: string
        description: "From which folder this pipeline executes"
-      langchain-location:
-        required: false
-        type: string
-        description: "Relative path to the langchain library folder"

 env:
  POETRY_VERSION: "1.6.1"
@@ -42,20 +38,11 @@ jobs:

      - name: Install dependencies
        shell: bash
-        run: poetry install --with test
-
-      - name: Install langchain editable
-        working-directory: ${{ inputs.working-directory }}
-        if: ${{ inputs.langchain-location }}
-        env:
-          LANGCHAIN_LOCATION: ${{ inputs.langchain-location }}
-        run: |
-          poetry run pip install -e "$LANGCHAIN_LOCATION"
+        run: poetry install

      - name: Run core tests
        shell: bash
-        run: |
-          make test
+        run: make test

      - name: Ensure the tests did not create any additional files
        shell: bash
--- a/.github/workflows/_test_release.yml
+++ b/.github/workflows/_test_release.yml
@@ -14,7 +14,7 @@ env:

 jobs:
  build:
-    if: github.ref == 'refs/heads/master'
+    # if: github.ref == 'refs/heads/master'
    runs-on: ubuntu-latest

    outputs:
--- a/.github/workflows/check_diffs.yml
+++ b/.github/workflows/check_diffs.yml
@@ -1,47 +0,0 @@
----
-name: Check library diffs
-
-on:
-  push:
-    branches: [master]
-  pull_request:
-    paths:
-      - ".github/actions/**"
-      - ".github/tools/**"
-      - ".github/workflows/**"
-      - "libs/**"
-
-# If another push to the same PR or branch happens while this workflow is still running,
-# cancel the earlier run in favor of the next run.
-#
-# There's no point in testing an outdated version of the code. GitHub only allows
-# a limited number of job runners to be active at the same time, so it's better to cancel
-# pointless jobs early so that more useful jobs can run sooner.
-concurrency:
-  group: ${{ github.workflow }}-${{ github.ref }}
-  cancel-in-progress: true
-
-jobs:
-  build:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-      - uses: actions/setup-python@v4
-        with:
-          python-version: '3.10'
-      - id: files
-        uses: Ana06/get-changed-files@v2.2.0
-      - id: set-matrix
-        run: echo "dirs-to-run=$(python .github/scripts/check_diff.py ${{ steps.files.outputs.all }})" >> $GITHUB_OUTPUT
-    outputs:
-      dirs-to-run: ${{ steps.set-matrix.outputs.dirs-to-run }}
-  ci:
-    needs: [ build ]
-    strategy:
-      matrix:
-        working-directory: ${{ fromJson(needs.build.outputs.dirs-to-run) }}
-    uses: ./.github/workflows/_all_ci.yml
-    with:
-      working-directory: ${{ matrix.working-directory }}
-
-
--- a/.github/workflows/langchain_ci.yml
+++ b/.github/workflows/langchain_ci.yml
@@ -1,24 +1,20 @@
 ---
-name: langchain CI
+name: libs/langchain CI

 on:
-  workflow_call:
-    inputs:
-      working-directory:
-        required: true
-        type: string
-        description: "From which folder this pipeline executes"
-  workflow_dispatch:
-    inputs:
-      working-directory:
-        required: true
-        type: choice
-        default: 'libs/langchain'
-        options:
-        - libs/langchain
-        - libs/core
-        - libs/experimental
-
+  push:
+    branches: [ master ]
+  pull_request:
+    paths:
+      - '.github/actions/poetry_setup/action.yml'
+      - '.github/tools/**'
+      - '.github/workflows/_lint.yml'
+      - '.github/workflows/_test.yml'
+      - '.github/workflows/_pydantic_compatibility.yml'
+      - '.github/workflows/langchain_ci.yml'
+      - 'libs/*'
+      - 'libs/langchain/**'
+  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI

 # If another push to the same PR or branch happens while this workflow is still running,
 # cancel the earlier run in favor of the next run.
@@ -27,39 +23,47 @@ on:
 # a limited number of job runners to be active at the same time, so it's better to cancel
 # pointless jobs early so that more useful jobs can run sooner.
 concurrency:
-  group: ${{ github.workflow }}-${{ github.ref }}-${{ inputs.working-directory }}
+  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

 env:
  POETRY_VERSION: "1.6.1"
+  WORKDIR: "libs/langchain"

 jobs:
  lint:
-    uses: ./.github/workflows/_lint.yml
+    uses:
+      ./.github/workflows/_lint.yml
    with:
-      working-directory: ${{ inputs.working-directory }}
+      working-directory: libs/langchain
    secrets: inherit

  test:
-    uses: ./.github/workflows/_test.yml
+    uses:
+      ./.github/workflows/_test.yml
    with:
-      working-directory: ${{ inputs.working-directory }}
+      working-directory: libs/langchain
    secrets: inherit

  compile-integration-tests:
-    uses: ./.github/workflows/_compile_integration_test.yml
+    uses:
+      ./.github/workflows/_compile_integration_test.yml
    with:
-      working-directory: ${{ inputs.working-directory }}
+      working-directory: libs/langchain
    secrets: inherit

-  dependencies:
-    uses: ./.github/workflows/_dependencies.yml
+  pydantic-compatibility:
+    uses:
+      ./.github/workflows/_pydantic_compatibility.yml
    with:
-      working-directory: ${{ inputs.working-directory }}
+      working-directory: libs/langchain
    secrets: inherit

  extended-tests:
    runs-on: ubuntu-latest
+    defaults:
+      run:
+        working-directory: ${{ env.WORKDIR }}
    strategy:
      matrix:
        python-version:
@@ -68,9 +72,6 @@ jobs:
          - "3.10"
          - "3.11"
    name: Python ${{ matrix.python-version }} extended tests
-    defaults:
-      run:
-        working-directory: ${{ inputs.working-directory }}
    steps:
      - uses: actions/checkout@v4

@@ -79,14 +80,14 @@ jobs:
        with:
          python-version: ${{ matrix.python-version }}
          poetry-version: ${{ env.POETRY_VERSION }}
-          working-directory: ${{ inputs.working-directory }}
+          working-directory: libs/langchain
          cache-key: extended

      - name: Install dependencies
        shell: bash
        run: |
          echo "Running extended tests, installing dependencies with poetry..."
-          poetry install -E extended_testing --with test
+          poetry install -E extended_testing

      - name: Run extended tests
        run: make extended_tests
--- a/.github/workflows/langchain_cli_ci.yml
+++ b/.github/workflows/langchain_cli_ci.yml
@@ -0,0 +1,47 @@
+---
+name: libs/cli CI
+
+on:
+  push:
+    branches: [ master ]
+  pull_request:
+    paths:
+      - '.github/actions/poetry_setup/action.yml'
+      - '.github/tools/**'
+      - '.github/workflows/_lint.yml'
+      - '.github/workflows/_test.yml'
+      - '.github/workflows/_pydantic_compatibility.yml'
+      - '.github/workflows/langchain_cli_ci.yml'
+      - 'libs/cli/**'
+      - 'libs/*'
+  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI
+
+# If another push to the same PR or branch happens while this workflow is still running,
+# cancel the earlier run in favor of the next run.
+#
+# There's no point in testing an outdated version of the code. GitHub only allows
+# a limited number of job runners to be active at the same time, so it's better to cancel
+# pointless jobs early so that more useful jobs can run sooner.
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+env:
+  POETRY_VERSION: "1.6.1"
+  WORKDIR: "libs/cli"
+
+jobs:
+  lint:
+    uses:
+      ./.github/workflows/_lint.yml
+    with:
+      working-directory: libs/cli
+      langchain-location: ../langchain
+    secrets: inherit
+
+  test:
+    uses:
+      ./.github/workflows/_test.yml
+    with:
+      working-directory: libs/cli
+    secrets: inherit
--- a/.github/workflows/langchain_community_release.yml
+++ b/.github/workflows/langchain_community_release.yml
@@ -1,13 +0,0 @@
----
-name: libs/community Release
-
-on:
-  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI
-
-jobs:
-  release:
-    uses:
-      ./.github/workflows/_release.yml
-    with:
-      working-directory: libs/community
-    secrets: inherit
--- a/.github/workflows/langchain_core_release.yml
+++ b/.github/workflows/langchain_core_release.yml
@@ -1,13 +0,0 @@
----
-name: libs/core Release
-
-on:
-  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI
-
-jobs:
-  release:
-    uses:
-      ./.github/workflows/_release.yml
-    with:
-      working-directory: libs/core
-    secrets: inherit
--- a/.github/workflows/langchain_experimental_ci.yml
+++ b/.github/workflows/langchain_experimental_ci.yml
@@ -0,0 +1,137 @@
+---
+name: libs/experimental CI
+
+on:
+  push:
+    branches: [ master ]
+  pull_request:
+    paths:
+      - '.github/actions/poetry_setup/action.yml'
+      - '.github/tools/**'
+      - '.github/workflows/_lint.yml'
+      - '.github/workflows/_test.yml'
+      - '.github/workflows/langchain_experimental_ci.yml'
+      - 'libs/*'
+      - 'libs/experimental/**'
+  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI
+
+# If another push to the same PR or branch happens while this workflow is still running,
+# cancel the earlier run in favor of the next run.
+#
+# There's no point in testing an outdated version of the code. GitHub only allows
+# a limited number of job runners to be active at the same time, so it's better to cancel
+# pointless jobs early so that more useful jobs can run sooner.
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+env:
+  POETRY_VERSION: "1.6.1"
+  WORKDIR: "libs/experimental"
+
+jobs:
+  lint:
+    uses:
+      ./.github/workflows/_lint.yml
+    with:
+      working-directory: libs/experimental
+      langchain-location: ../langchain
+    secrets: inherit
+
+  test:
+    uses:
+      ./.github/workflows/_test.yml
+    with:
+      working-directory: libs/experimental
+    secrets: inherit
+
+  compile-integration-tests:
+    uses:
+      ./.github/workflows/_compile_integration_test.yml
+    with:
+      working-directory: libs/experimental
+    secrets: inherit
+
+  # It's possible that langchain-experimental works fine with the latest *published* langchain,
+  # but is broken with the langchain on `master`.
+  #
+  # We want to catch situations like that *before* releasing a new langchain, hence this test.
+  test-with-latest-langchain:
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        working-directory: ${{ env.WORKDIR }}
+    strategy:
+      matrix:
+        python-version:
+          - "3.8"
+          - "3.9"
+          - "3.10"
+          - "3.11"
+    name: test with unpublished langchain - Python ${{ matrix.python-version }}
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
+        uses: "./.github/actions/poetry_setup"
+        with:
+          python-version: ${{ matrix.python-version }}
+          poetry-version: ${{ env.POETRY_VERSION }}
+          working-directory: ${{ env.WORKDIR }}
+          cache-key: unpublished-langchain
+
+      - name: Install dependencies
+        shell: bash
+        run: |
+          echo "Running tests with unpublished langchain, installing dependencies with poetry..."
+          poetry install
+
+          echo "Editably installing langchain outside of poetry, to avoid messing up lockfile..."
+          poetry run pip install -e ../langchain
+
+      - name: Run tests
+        run: make test
+  extended-tests:
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        working-directory: ${{ env.WORKDIR }}
+    strategy:
+      matrix:
+        python-version:
+          - "3.8"
+          - "3.9"
+          - "3.10"
+          - "3.11"
+    name: Python ${{ matrix.python-version }} extended tests
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
+        uses: "./.github/actions/poetry_setup"
+        with:
+          python-version: ${{ matrix.python-version }}
+          poetry-version: ${{ env.POETRY_VERSION }}
+          working-directory: libs/experimental
+          cache-key: extended
+
+      - name: Install dependencies
+        shell: bash
+        run: |
+          echo "Running extended tests, installing dependencies with poetry..."
+          poetry install -E extended_testing
+
+      - name: Run extended tests
+        run: make extended_tests
+
+      - name: Ensure the tests did not create any additional files
+        shell: bash
+        run: |
+          set -eu
+
+          STATUS="$(git status)"
+          echo "$STATUS"
+
+          # grep will exit non-zero if the target message isn't found,
+          # and `set -e` above will cause the step to fail.
+          echo "$STATUS" | grep 'nothing to commit, working tree clean'
--- a/.github/workflows/scheduled_test.yml
+++ b/.github/workflows/scheduled_test.yml
@@ -52,7 +52,13 @@ jobs:
        shell: bash
        run: |
          echo "Running scheduled tests, installing dependencies with poetry..."
-          poetry install --with=test_integration,test
+          poetry install --with=test_integration
+          poetry run pip install google-cloud-aiplatform
+          poetry run pip install "boto3>=1.28.57"
+          if [[ ${{ matrix.python-version }} != "3.8" ]]
+          then
+            poetry run pip install fireworks-ai
+          fi

      - name: Run tests
        shell: bash
@@ -62,9 +68,7 @@ jobs:
          AZURE_OPENAI_API_VERSION: ${{ secrets.AZURE_OPENAI_API_VERSION }}
          AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
          AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
-          AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_CHAT_DEPLOYMENT_NAME }}
-          AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
-          AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
+          AZURE_OPENAI_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_DEPLOYMENT_NAME }}
          FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
        run: |
          make scheduled_tests
--- a/.github/workflows/templates_ci.yml
+++ b/.github/workflows/templates_ci.yml
@@ -33,4 +33,5 @@ jobs:
      ./.github/workflows/_lint.yml
    with:
      working-directory: templates
+      langchain-location: ../libs/langchain
    secrets: inherit
--- a/.gitignore
+++ b/.gitignore
@@ -167,7 +167,8 @@ docs/node_modules/
 docs/.docusaurus/
 docs/.cache-loader/
 docs/_dist
-docs/api_reference/*api_reference.rst
+docs/api_reference/api_reference.rst
+docs/api_reference/experimental_api_reference.rst
 docs/api_reference/_build
 docs/api_reference/*/
 !docs/api_reference/_static/
--- a/12
+++ b/12
@@ -1,6 +1,6 @@
-MIT License
+The MIT License

-Copyright (c) LangChain, Inc.
+Copyright (c) Harrison Chase

 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
@@ -9,13 +9,13 @@ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 copies of the Software, and to permit persons to whom the Software is
 furnished to do so, subject to the following conditions:

-The above copyright notice and this permission notice shall be included in all
-copies or substantial portions of the Software.
+The above copyright notice and this permission notice shall be included in
+all copies or substantial portions of the Software.

 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
-OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
-SOFTWARE.
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+THE SOFTWARE.
--- a/3
+++ b/3
@@ -41,10 +41,9 @@ spell_fix:
 # LINTING AND FORMATTING
 ######################

-lint lint_package lint_tests:
+lint:
 	poetry run ruff docs templates cookbook
 	poetry run ruff format docs templates cookbook --diff
-	poetry run ruff --select I docs templates cookbook

 format format_diff:
 	poetry run ruff format docs templates cookbook
--- a/README.md
+++ b/README.md
@@ -30,7 +30,7 @@ pip install langchain

 With conda:
 ```bash
-conda install langchain -c conda-forge
+pip install langsmith && conda install langchain -c conda-forge
 ```

 ## 🤔 What is LangChain?
@@ -104,7 +104,3 @@ Please see [here](https://python.langchain.com) for full documentation, which in
 As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.

 For detailed information on how to contribute, see [here](.github/CONTRIBUTING.md).
-
-## 🌟 Contributors
-
-[![langchain contributors](https://contrib.rocks/image?repo=langchain-ai/langchain&max=2000)](https://github.com/langchain-ai/langchain/graphs/contributors)
--- a/cookbook/Multi_modal_RAG.ipynb
+++ b/cookbook/Multi_modal_RAG.ipynb
--- a/cookbook/README.md
+++ b/cookbook/README.md
@@ -8,7 +8,6 @@ Notebook | Description
 [Semi_Structured_RAG.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_Structured_RAG.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data, including text and tables, using unstructured for parsing, multi-vector retriever for storing, and lcel for implementing chains.
 [Semi_structured_and_multi_moda...](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_structured_and_multi_modal_RAG.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data and images, using unstructured for parsing, multi-vector retriever for storage and retrieval, and lcel for implementing chains.
 [Semi_structured_multi_modal_RA...](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data and images, using various tools and methods such as unstructured for parsing, multi-vector retriever for storing, lcel for implementing chains, and open source language models like llama2, llava, and gpt4all.
-[analyze_document.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/analyze_document.ipynb) | Analyze a single long document.
 [autogpt/autogpt.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/autogpt/autogpt.ipynb) | Implement autogpt, a language model, with langchain primitives such as llms, prompttemplates, vectorstores, embeddings, and tools.
 [autogpt/marathon_times.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/autogpt/marathon_times.ipynb) | Implement autogpt for finding winning marathon times.
 [baby_agi.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/baby_agi.ipynb) | Implement babyagi, an ai agent that can generate and execute tasks based on a given objective, with the flexibility to swap out specific vectorstores/model providers.
@@ -45,7 +44,6 @@ Notebook | Description
 [plan_and_execute_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/plan_and_execute_agent.ipynb) | Create plan-and-execute agents that accomplish objectives by planning tasks with a language model (llm) and executing them with a separate agent.
 [press_releases.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/press_releases.ipynb) | Retrieve and query company press release data powered by [Kay.ai](https://kay.ai).
 [program_aided_language_model.i...](https://github.com/langchain-ai/langchain/tree/master/cookbook/program_aided_language_model.ipynb) | Implement program-aided language models as described in the provided research paper.
-[qa_citations.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/qa_citations.ipynb) | Different ways to get a model to cite its sources.
 [retrieval_in_sql.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/retrieval_in_sql.ipynb) | Perform retrieval-augmented-generation (rag) on a PostgreSQL database using pgvector.
 [sales_agent_with_context.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/sales_agent_with_context.ipynb) | Implement a context-aware ai sales agent, salesgpt, that can have natural sales conversations, interact with other systems, and use a product knowledge base to discuss a company's offerings.
 [self_query_hotel_search.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/self_query_hotel_search.ipynb) | Build a hotel room search feature with self-querying retrieval, using a specific hotel recommendation dataset.
--- a/cookbook/analyze_document.ipynb
+++ b/cookbook/analyze_document.ipynb
@@ -1,105 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "f69d4a4c-137d-47e9-bea1-786afce9c1c0",
-   "metadata": {},
-   "source": [
-    "# Analyze a single long document\n",
-    "\n",
-    "The AnalyzeDocumentChain takes in a single document, splits it up, and then runs it through a CombineDocumentsChain."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "id": "2a0707ce-6d2d-471b-bc33-64da32a7b3f0",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "with open(\"../docs/docs/modules/state_of_the_union.txt\") as f:\n",
-    "    state_of_the_union = f.read()"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "id": "ca14d161-2d5b-4a6c-a296-77d8ce4b28cd",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain.chains import AnalyzeDocumentChain\n",
-    "from langchain.chat_models import ChatOpenAI\n",
-    "\n",
-    "llm = ChatOpenAI(model=\"gpt-3.5-turbo\", temperature=0)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "id": "9f97406c-85a9-45fb-99ce-9138c0ba3731",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain.chains.question_answering import load_qa_chain\n",
-    "\n",
-    "qa_chain = load_qa_chain(llm, chain_type=\"map_reduce\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "id": "0871a753-f5bb-4b4f-a394-f87f2691f659",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "qa_document_chain = AnalyzeDocumentChain(combine_docs_chain=qa_chain)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "id": "e6f86428-3c2c-46a0-a57c-e22826fdbf91",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'The President said, \"Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service.\"'"
-      ]
-     },
-     "execution_count": 10,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "qa_document_chain.run(\n",
-    "    input_document=state_of_the_union,\n",
-    "    question=\"what did the president say about justice breyer?\",\n",
-    ")"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.9.1"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
--- a/cookbook/code-analysis-deeplake.ipynb
+++ b/cookbook/code-analysis-deeplake.ipynb
@@ -648,7 +648,7 @@
    {
     "data": {
      "text/plain": [
-       "OpenAIEmbeddings(client=<class 'openai.api_resources.embedding.Embedding'>, model='text-embedding-ada-002', deployment='text-embedding-ada-002', openai_api_version='', openai_api_base='', openai_api_type='', openai_proxy='', embedding_ctx_length=8191, openai_api_key='', openai_organization='', allowed_special=set(), disallowed_special='all', chunk_size=1000, max_retries=6, request_timeout=None, headers=None, tiktoken_model_name=None, show_progress_bar=False, model_kwargs={})"
+       "OpenAIEmbeddings(client=<class 'openai.api_resources.embedding.Embedding'>, model='text-embedding-ada-002', deployment='text-embedding-ada-002', openai_api_version='', openai_api_base='', openai_api_type='', openai_proxy='', embedding_ctx_length=8191, openai_api_key='sk-[REDACTED]', openai_organization='', allowed_special=set(), disallowed_special='all', chunk_size=1000, max_retries=6, request_timeout=None, headers=None, tiktoken_model_name=None, show_progress_bar=False, model_kwargs={})"
      ]
     },
     "execution_count": 13,
--- a/cookbook/docugami_xml_kg_rag.ipynb
+++ b/cookbook/docugami_xml_kg_rag.ipynb
--- a/cookbook/llm_bash.ipynb
+++ b/cookbook/llm_bash.ipynb
@@ -69,8 +69,8 @@
   "metadata": {},
   "outputs": [],
   "source": [
+    "from langchain.chains.llm_bash.prompt import BashOutputParser\n",
    "from langchain.prompts.prompt import PromptTemplate\n",
-    "from langchain_experimental.llm_bash.prompt import BashOutputParser\n",
    "\n",
    "_PROMPT_TEMPLATE = \"\"\"If someone asks you to perform a task, your job is to come up with a series of bash commands that will perform the task. There is no need to put \"#!/bin/bash\" in your answer. Make sure to reason step by step, using this format:\n",
    "Question: \"copy the files in the directory named 'target' into a new directory at the same level as target called 'myNewDirectory'\"\n",
--- a/cookbook/multi_modal_output_agent.ipynb
+++ b/cookbook/multi_modal_output_agent.ipynb
@@ -31,7 +31,7 @@
   "source": [
    "import re\n",
    "\n",
-    "from IPython.display import Image, display\n",
+    "from IPython.display import Image\n",
    "from steamship import Block, Steamship"
   ]
  },
@@ -180,7 +180,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.11.3"
  }
 },
 "nbformat": 4,
--- a/cookbook/qianfan_baidu_elasticesearch_RAG.ipynb
+++ b/cookbook/qianfan_baidu_elasticesearch_RAG.ipynb
@@ -37,8 +37,7 @@
   "source": [
    "#!pip install qianfan\n",
    "#!pip install bce-python-sdk\n",
-    "#!pip install elasticsearch == 7.11.0\n",
-    "#!pip install sentence-transformers"
+    "#!pip install elasticsearch == 7.11.0"
   ]
  },
  {
@@ -55,10 +54,8 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "import sentence_transformers\n",
    "from baidubce.auth.bce_credentials import BceCredentials\n",
    "from baidubce.bce_client_configuration import BceClientConfiguration\n",
-    "from langchain.chains.retrieval_qa import RetrievalQA\n",
    "from langchain.document_loaders.baiducloud_bos_directory import BaiduBOSDirectoryLoader\n",
    "from langchain.embeddings.huggingface import HuggingFaceEmbeddings\n",
    "from langchain.llms.baidu_qianfan_endpoint import QianfanLLMEndpoint\n",
@@ -164,22 +161,15 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.9.17"
  },
+  "orig_nbformat": 4,
  "vscode": {
   "interpreter": {
    "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49"
@@ -187,5 +177,5 @@
  }
 },
 "nbformat": 4,
- "nbformat_minor": 4
+ "nbformat_minor": 2
 }
--- a/cookbook/retrieval_in_sql.ipynb
+++ b/cookbook/retrieval_in_sql.ipynb
@@ -133,7 +133,7 @@
    "from tqdm import tqdm\n",
    "\n",
    "for i in tqdm(range(len(title_embeddings))):\n",
-    "    title = song_titles[i].replace(\"'\", \"''\")\n",
+    "    title = titles[i].replace(\"'\", \"''\")\n",
    "    embedding = title_embeddings[i]\n",
    "    sql_command = (\n",
    "        f'UPDATE \"Track\" SET \"embeddings\" = ARRAY{embedding} WHERE \"Name\" ='\n",
@@ -681,9 +681,9 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.8.18"
  }
 },
 "nbformat": 4,
- "nbformat_minor": 4
+ "nbformat_minor": 2
 }
--- a/cookbook/wikibase_agent.ipynb
+++ b/cookbook/wikibase_agent.ipynb
@@ -187,7 +187,7 @@
    "    for key in path:\n",
    "        try:\n",
    "            current = current[key]\n",
-    "        except KeyError:\n",
+    "        except:\n",
    "            return None\n",
    "    return current\n",
    "\n",
--- a/docs/.local_build.sh
+++ b/docs/.local_build.sh
@@ -9,15 +9,13 @@ SCRIPT_DIR="$(cd "$(dirname "$0")"; pwd)"
 cd "${SCRIPT_DIR}"

 mkdir -p ../_dist
-rsync -ruv --exclude node_modules --exclude api_reference --exclude .venv --exclude .docusaurus . ../_dist
+cp -r . ../_dist
 cd ../_dist
 poetry run python scripts/model_feat_table.py
+poetry run nbdoc_build --srcdir docs
 cp ../cookbook/README.md src/pages/cookbook.mdx
 cp ../.github/CONTRIBUTING.md docs/contributing.md
-mkdir -p docs/templates
-cp ../templates/docs/INDEX.md docs/templates/index.md
 wget https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md -O docs/langserve.md
-
-yarn
-
-quarto preview docs
+poetry run python scripts/generate_api_reference_links.py
+yarn install
+yarn start
--- a/docs/api_reference/_static/css/custom.css
+++ b/docs/api_reference/_static/css/custom.css
@@ -15,11 +15,3 @@ pre {
 #my-component-root *, #headlessui-portal-root * {
  z-index: 10000;
 }
-
-table.longtable code {
-  white-space: normal;
-}
-
-table.longtable td {
-  max-width: 600px;
-}
--- a/docs/api_reference/create_api_rst.py
+++ b/docs/api_reference/create_api_rst.py
@@ -13,10 +13,8 @@ HERE = Path(__file__).parent

 PKG_DIR = ROOT_DIR / "libs" / "langchain" / "langchain"
 EXP_DIR = ROOT_DIR / "libs" / "experimental" / "langchain_experimental"
-CORE_DIR = ROOT_DIR / "libs" / "core" / "langchain_core"
 WRITE_FILE = HERE / "api_reference.rst"
 EXP_WRITE_FILE = HERE / "experimental_api_reference.rst"
-CORE_WRITE_FILE = HERE / "core_api_reference.rst"


 ClassKind = Literal["TypedDict", "Regular", "Pydantic", "enum"]
@@ -196,13 +194,11 @@ def _load_package_modules(
    return modules_by_namespace


-def _construct_doc(
-    package_namespace: str, members_by_namespace: Dict[str, ModuleMembers]
-) -> str:
+def _construct_doc(pkg: str, members_by_namespace: Dict[str, ModuleMembers]) -> str:
    """Construct the contents of the reference.rst file for the given package.

    Args:
-        package_namespace: The package top level namespace
+        pkg: The package name
        members_by_namespace: The members of the package, dict organized by top level
                              module contains a list of classes and functions
                              inside of the top level namespace.
@@ -212,7 +208,7 @@ def _construct_doc(
    """
    full_doc = f"""\
 =======================
-``{package_namespace}`` API Reference
+``{pkg}`` API Reference
 =======================

 """
@@ -224,13 +220,13 @@ def _construct_doc(
        functions = _members["functions"]
        if not (classes or functions):
            continue
-        section = f":mod:`{package_namespace}.{module}`"
+        section = f":mod:`{pkg}.{module}`"
        underline = "=" * (len(section) + 1)
        full_doc += f"""\
 {section}
 {underline}

-.. automodule:: {package_namespace}.{module}
+.. automodule:: {pkg}.{module}
    :no-members:
    :no-inherited-members:

@@ -240,7 +236,7 @@ def _construct_doc(
            full_doc += f"""\
 Classes
 --------------
-.. currentmodule:: {package_namespace}
+.. currentmodule:: {pkg}

 .. autosummary::
    :toctree: {module}
@@ -272,7 +268,7 @@ Classes
            full_doc += f"""\
 Functions
 --------------
-.. currentmodule:: {package_namespace}
+.. currentmodule:: {pkg}

 .. autosummary::
    :toctree: {module}
@@ -284,57 +280,46 @@ Functions
    return full_doc


-def _build_rst_file(package_name: str = "langchain") -> None:
-    """Create a rst file for building of documentation.
-
-    Args:
-        package_name: Can be either "langchain" or "core" or "experimental".
-    """
-    package_members = _load_package_modules(_package_dir(package_name))
-    with open(_out_file_path(package_name), "w") as f:
-        f.write(
-            _doc_first_line(package_name)
-            + _construct_doc(package_namespace[package_name], package_members)
-        )
+def _document_langchain_experimental() -> None:
+    """Document the langchain_experimental package."""
+    # Generate experimental_api_reference.rst
+    exp_members = _load_package_modules(EXP_DIR)
+    exp_doc = ".. _experimental_api_reference:\n\n" + _construct_doc(
+        "langchain_experimental", exp_members
+    )
+    with open(EXP_WRITE_FILE, "w") as f:
+        f.write(exp_doc)


-package_namespace = {
-    "langchain": "langchain",
-    "experimental": "langchain_experimental",
-    "core": "langchain_core",
-}
+def _document_langchain_core() -> None:
+    """Document the main langchain package."""
+    # load top level module members
+    lc_members = _load_package_modules(PKG_DIR)

+    # Add additional packages
+    tools = _load_package_modules(PKG_DIR, "tools")
+    agents = _load_package_modules(PKG_DIR, "agents")
+    schema = _load_package_modules(PKG_DIR, "schema")

-def _package_dir(package_name: str = "langchain") -> Path:
-    """Return the path to the directory containing the documentation."""
-    return ROOT_DIR / "libs" / package_name / package_namespace[package_name]
+    lc_members.update(
+        {
+            "agents.output_parsers": agents["output_parsers"],
+            "agents.format_scratchpad": agents["format_scratchpad"],
+            "tools.render": tools["render"],
+            "schema.runnable": schema["runnable"],
+        }
+    )

+    lc_doc = ".. _api_reference:\n\n" + _construct_doc("langchain", lc_members)

-def _out_file_path(package_name: str = "langchain") -> Path:
-    """Return the path to the file containing the documentation."""
-    name_prefix = {
-        "langchain": "",
-        "experimental": "experimental_",
-        "core": "core_",
-    }
-    return HERE / f"{name_prefix[package_name]}api_reference.rst"
-
-
-def _doc_first_line(package_name: str = "langchain") -> str:
-    """Return the path to the file containing the documentation."""
-    prefix = {
-        "langchain": "",
-        "experimental": "experimental",
-        "core": "core",
-    }
-    return f".. {prefix[package_name]}_api_reference:\n\n"
+    with open(WRITE_FILE, "w") as f:
+        f.write(lc_doc)


 def main() -> None:
-    """Generate the api_reference.rst file for each package."""
-    _build_rst_file(package_name="core")
-    _build_rst_file(package_name="langchain")
-    _build_rst_file(package_name="experimental")
+    """Generate the reference.rst file for each package."""
+    _document_langchain_core()
+    _document_langchain_experimental()


 if __name__ == "__main__":
--- a/docs/api_reference/requirements.txt
+++ b/docs/api_reference/requirements.txt
@@ -1,6 +1,5 @@
 -e libs/langchain
 -e libs/experimental
--e libs/core
 pydantic<2
 autodoc_pydantic==1.8.0
 myst_parser
--- a/docs/api_reference/themes/scikit-learn-modern/nav.html
+++ b/docs/api_reference/themes/scikit-learn-modern/nav.html
@@ -34,9 +34,6 @@
        <li class="nav-item">
          <a class="sk-nav-link nav-link" href="{{ pathto('api_reference') }}">API</a>
        </li>
-        <li class="nav-item">
-          <a class="sk-nav-link nav-link" href="{{ pathto('core_api_reference') }}">Core</a>
-        </li>
        <li class="nav-item">
          <a class="sk-nav-link nav-link" href="{{ pathto('experimental_api_reference') }}">Experimental</a>
        </li>
--- a/docs/docs/additional_resources/dependents.mdx
+++ b/docs/docs/additional_resources/dependents.mdx
--- a/docs/docs/expression_language/cookbook/index.mdx
+++ b/docs/docs/expression_language/cookbook/index.mdx
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 3
+sidebar_position: 2
 ---

 # Cookbook
--- a/docs/docs/expression_language/cookbook/multiple_chains.ipynb
+++ b/docs/docs/expression_language/cookbook/multiple_chains.ipynb
@@ -146,7 +146,7 @@
   "source": [
    "### Branching and Merging\n",
    "\n",
-    "You may want the output of one component to be processed by 2 or more other components. [RunnableParallels](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.RunnableParallel.html#langchain_core.runnables.base.RunnableParallel) let you split or fork the chain so multiple components can process the input in parallel. Later, other components can join or merge the results to synthesize a final response. This type of chain creates a computation graph that looks like the following:\n",
+    "You may want the output of one component to be processed by 2 or more other components. [RunnableMaps](https://api.python.langchain.com/en/latest/schema/langchain.schema.runnable.base.RunnableMap.html) let you split or fork the chain so multiple components can process the input in parallel. Later, other components can join or merge the results to synthesize a final response. This type of chain creates a computation graph that looks like the following:\n",
    "\n",
    "```text\n",
    "     Input\n",
--- a/docs/docs/expression_language/cookbook/prompt_llm_parser.ipynb
+++ b/docs/docs/expression_language/cookbook/prompt_llm_parser.ipynb
@@ -317,7 +317,7 @@
   "source": [
    "## Simplifying input\n",
    "\n",
-    "To make invocation even simpler, we can add a `RunnableParallel` to take care of creating the prompt input dict for us:"
+    "To make invocation even simpler, we can add a `RunnableMap` to take care of creating the prompt input dict for us:"
   ]
  },
  {
@@ -327,9 +327,9 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain.schema.runnable import RunnableParallel, RunnablePassthrough\n",
+    "from langchain.schema.runnable import RunnableMap, RunnablePassthrough\n",
    "\n",
-    "map_ = RunnableParallel(foo=RunnablePassthrough())\n",
+    "map_ = RunnableMap(foo=RunnablePassthrough())\n",
    "chain = (\n",
    "    map_\n",
    "    | prompt\n",
--- a/docs/docs/expression_language/cookbook/prompt_size.ipynb
+++ b/docs/docs/expression_language/cookbook/prompt_size.ipynb
--- a/docs/docs/expression_language/cookbook/retrieval.ipynb
+++ b/docs/docs/expression_language/cookbook/retrieval.ipynb
@@ -31,7 +31,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 10,
   "id": "33be32af",
   "metadata": {},
   "outputs": [],
@@ -48,7 +48,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 6,
   "id": "bfc47ec1",
   "metadata": {},
   "outputs": [],
@@ -70,7 +70,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 4,
   "id": "eae31755",
   "metadata": {},
   "outputs": [],
@@ -85,7 +85,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 18,
   "id": "f3040b0c",
   "metadata": {},
   "outputs": [
@@ -95,7 +95,7 @@
       "'Harrison worked at Kensho.'"
      ]
     },
-     "execution_count": 4,
+     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -106,7 +106,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 6,
   "id": "e1d20c7c",
   "metadata": {},
   "outputs": [],
@@ -134,7 +134,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 7,
   "id": "7ee8b2d4",
   "metadata": {},
   "outputs": [
@@ -144,7 +144,7 @@
       "'Harrison ha lavorato a Kensho.'"
      ]
     },
-     "execution_count": 6,
+     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -165,20 +165,18 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 21,
+   "execution_count": 8,
   "id": "3f30c348",
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.schema import format_document\n",
-    "from langchain.schema.messages import get_buffer_string\n",
-    "from langchain.schema.runnable import RunnableParallel\n",
-    "from langchain_core.messages import AIMessage, HumanMessage"
+    "from langchain.schema.runnable import RunnableMap"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": 9,
   "id": "64ab1dbf",
   "metadata": {},
   "outputs": [],
@@ -196,7 +194,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 10,
   "id": "7d628c97",
   "metadata": {},
   "outputs": [],
@@ -211,7 +209,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 11,
   "id": "f60a5d0f",
   "metadata": {},
   "outputs": [],
@@ -228,14 +226,33 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": 12,
+   "id": "7d007db6",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from typing import List, Tuple\n",
+    "\n",
+    "\n",
+    "def _format_chat_history(chat_history: List[Tuple]) -> str:\n",
+    "    buffer = \"\"\n",
+    "    for dialogue_turn in chat_history:\n",
+    "        human = \"Human: \" + dialogue_turn[0]\n",
+    "        ai = \"Assistant: \" + dialogue_turn[1]\n",
+    "        buffer += \"\\n\" + \"\\n\".join([human, ai])\n",
+    "    return buffer"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
   "id": "5c32cc89",
   "metadata": {},
   "outputs": [],
   "source": [
-    "_inputs = RunnableParallel(\n",
+    "_inputs = RunnableMap(\n",
    "    standalone_question=RunnablePassthrough.assign(\n",
-    "        chat_history=lambda x: get_buffer_string(x[\"chat_history\"])\n",
+    "        chat_history=lambda x: _format_chat_history(x[\"chat_history\"])\n",
    "    )\n",
    "    | CONDENSE_QUESTION_PROMPT\n",
    "    | ChatOpenAI(temperature=0)\n",
@@ -250,17 +267,17 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 14,
   "id": "135c8205",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "AIMessage(content='Harrison was employed at Kensho.')"
+       "AIMessage(content='Harrison was employed at Kensho.', additional_kwargs={}, example=False)"
      ]
     },
-     "execution_count": 12,
+     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -276,17 +293,17 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 22,
+   "execution_count": 15,
   "id": "424e7e7a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "AIMessage(content='Harrison worked at Kensho.')"
+       "AIMessage(content='Harrison worked at Kensho.', additional_kwargs={}, example=False)"
      ]
     },
-     "execution_count": 22,
+     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -295,10 +312,7 @@
    "conversational_qa_chain.invoke(\n",
    "    {\n",
    "        \"question\": \"where did he work?\",\n",
-    "        \"chat_history\": [\n",
-    "            HumanMessage(content=\"Who wrote this notebook?\"),\n",
-    "            AIMessage(content=\"Harrison\"),\n",
-    "        ],\n",
+    "        \"chat_history\": [(\"Who wrote this notebook?\", \"Harrison\")],\n",
    "    }\n",
    ")"
   ]
@@ -315,7 +329,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 14,
+   "execution_count": 16,
   "id": "e31dd17c",
   "metadata": {},
   "outputs": [],
@@ -327,7 +341,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 15,
+   "execution_count": 17,
   "id": "d4bffe94",
   "metadata": {},
   "outputs": [],
@@ -339,7 +353,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 16,
+   "execution_count": 18,
   "id": "733be985",
   "metadata": {},
   "outputs": [],
@@ -353,7 +367,7 @@
    "standalone_question = {\n",
    "    \"standalone_question\": {\n",
    "        \"question\": lambda x: x[\"question\"],\n",
-    "        \"chat_history\": lambda x: get_buffer_string(x[\"chat_history\"]),\n",
+    "        \"chat_history\": lambda x: _format_chat_history(x[\"chat_history\"]),\n",
    "    }\n",
    "    | CONDENSE_QUESTION_PROMPT\n",
    "    | ChatOpenAI(temperature=0)\n",
@@ -380,18 +394,18 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 17,
+   "execution_count": 19,
   "id": "806e390c",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "{'answer': AIMessage(content='Harrison was employed at Kensho.'),\n",
-       " 'docs': [Document(page_content='harrison worked at kensho')]}"
+       "{'answer': AIMessage(content='Harrison was employed at Kensho.', additional_kwargs={}, example=False),\n",
+       " 'docs': [Document(page_content='harrison worked at kensho', metadata={})]}"
      ]
     },
-     "execution_count": 17,
+     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -404,7 +418,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 18,
+   "execution_count": 20,
   "id": "977399fd",
   "metadata": {},
   "outputs": [],
@@ -417,18 +431,18 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 19,
+   "execution_count": 21,
   "id": "f94f7de4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "{'history': [HumanMessage(content='where did harrison work?'),\n",
-       "  AIMessage(content='Harrison was employed at Kensho.')]}"
+       "{'history': [HumanMessage(content='where did harrison work?', additional_kwargs={}, example=False),\n",
+       "  AIMessage(content='Harrison was employed at Kensho.', additional_kwargs={}, example=False)]}"
      ]
     },
-     "execution_count": 19,
+     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -436,38 +450,6 @@
   "source": [
    "memory.load_memory_variables({})"
   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 20,
-   "id": "88f2b7cd",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "{'answer': AIMessage(content='Harrison actually worked at Kensho.'),\n",
-       " 'docs': [Document(page_content='harrison worked at kensho')]}"
-      ]
-     },
-     "execution_count": 20,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "inputs = {\"question\": \"but where did he really work?\"}\n",
-    "result = final_chain.invoke(inputs)\n",
-    "result"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "207a2782",
-   "metadata": {},
-   "outputs": [],
-   "source": []
  }
 ],
 "metadata": {
@@ -486,7 +468,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.1"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/expression_language/get_started.ipynb
+++ b/docs/docs/expression_language/get_started.ipynb
@@ -1,493 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "raw",
-   "id": "366a0e68-fd67-4fe5-a292-5c33733339ea",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_position: 0\n",
-    "title: Get started\n",
-    "---"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "befa7fd1",
-   "metadata": {},
-   "source": [
-    "LCEL makes it easy to build complex chains from basic components, and supports out of the box functionality such as streaming, parallelism, and logging."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "9a9acd2e",
-   "metadata": {},
-   "source": [
-    "## Basic example: prompt + model + output parser\n",
-    "\n",
-    "The most basic and common use case is chaining a prompt template and a model together. To see how this works, let's create a chain that takes a topic and generates a joke:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "id": "466b65b3",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "\"Why did the ice cream go to therapy?\\n\\nBecause it had too many toppings and couldn't find its cone-fidence!\""
-      ]
-     },
-     "execution_count": 7,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from langchain.chat_models import ChatOpenAI\n",
-    "from langchain.prompts import ChatPromptTemplate\n",
-    "from langchain.schema.output_parser import StrOutputParser\n",
-    "\n",
-    "prompt = ChatPromptTemplate.from_template(\"tell me a short joke about {topic}\")\n",
-    "model = ChatOpenAI()\n",
-    "output_parser = StrOutputParser()\n",
-    "\n",
-    "chain = prompt | model | output_parser\n",
-    "\n",
-    "chain.invoke({\"topic\": \"ice cream\"})"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "81c502c5-85ee-4f36-aaf4-d6e350b7792f",
-   "metadata": {},
-   "source": [
-    "Notice this line of this code, where we piece together then different components into a single chain using LCEL:\n",
-    "\n",
-    "```\n",
-    "chain = prompt | model | output_parser\n",
-    "```\n",
-    "\n",
-    "The `|` symbol is similar to a [unix pipe operator](https://en.wikipedia.org/wiki/Pipeline_(Unix)), which chains together the different components feeds the output from one component as input into the next component. \n",
-    "\n",
-    "In this chain the user input is passed to the prompt template, then the prompt template output is passed to the model, then the model output is passed to the output parser. Let's take a look at each component individually to really understand what's going on. "
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "aa1b77fa",
-   "metadata": {},
-   "source": [
-    "### 1. Prompt\n",
-    "\n",
-    "`prompt` is a `BasePromptTemplate`, which means it takes in a dictionary of template variables and produces a `PromptValue`. A `PromptValue` is a wrapper around a completed prompt that can be passed to either an `LLM` (which takes a string as input) or `ChatModel` (which takes a sequence of messages as input). It can work with either language model type because it defines logic both for producing `BaseMessage`s and for producing a string."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "id": "b8656990",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "ChatPromptValue(messages=[HumanMessage(content='tell me a short joke about ice cream')])"
-      ]
-     },
-     "execution_count": 8,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "prompt_value = prompt.invoke({\"topic\": \"ice cream\"})\n",
-    "prompt_value"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "id": "e6034488",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "[HumanMessage(content='tell me a short joke about ice cream')]"
-      ]
-     },
-     "execution_count": 9,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "prompt_value.to_messages()"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "id": "60565463",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'Human: tell me a short joke about ice cream'"
-      ]
-     },
-     "execution_count": 10,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "prompt_value.to_string()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "577f0f76",
-   "metadata": {},
-   "source": [
-    "### 2. Model\n",
-    "\n",
-    "The `PromptValue` is then passed to `model`. In this case our `model` is a `ChatModel`, meaning it will output a `BaseMessage`."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "id": "33cf5f72",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "AIMessage(content=\"Why did the ice cream go to therapy? \\n\\nBecause it had too many toppings and couldn't find its cone-fidence!\")"
-      ]
-     },
-     "execution_count": 11,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "message = model.invoke(prompt_value)\n",
-    "message"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "327e7db8",
-   "metadata": {},
-   "source": [
-    "If our `model` were an `LLM`, it would output a string."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 12,
-   "id": "8feb05da",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'\\n\\nRobot: Why did the ice cream go to therapy? Because it had a rocky road.'"
-      ]
-     },
-     "execution_count": 12,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from langchain.llms import OpenAI\n",
-    "\n",
-    "llm = OpenAI(model=\"gpt-3.5-turbo-instruct\")\n",
-    "llm.invoke(prompt_value)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "91847478",
-   "metadata": {},
-   "source": [
-    "### 3. Output parser\n",
-    "\n",
-    "Lastly, we pass our `model` output to the `output_parser`, which is a `BaseOutputParser`, meaning it takes either a string or a \n",
-    "`BaseMessage` as input. The `StrOutputParser` specifically simply converts any input into a string."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 13,
-   "id": "533e59a8",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "\"Why did the ice cream go to therapy? \\n\\nBecause it had too many toppings and couldn't find its cone-fidence!\""
-      ]
-     },
-     "execution_count": 13,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "output_parser.invoke(message)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "9851e842",
-   "metadata": {},
-   "source": [
-    "### 4. Entire Pipeline\n",
-    "\n",
-    "To follow the steps along:\n",
-    "\n",
-    "1. We pass in user input on the desired topic as `{\"topic\": \"ice cream\"}`\n",
-    "2. The `prompt` component takes the user input and uses the `topic` to construct the prompt, which it outputs as a `PromptValue`.\n",
-    "3. The `model` component takes the generated prompt and passes it to the OpenAI chat model for evaluation. The generated output from the model is an `AIMessage` object, which is a kind of `BaseMessage`.\n",
-    "4. Finally, the `output_parser` component takes in that message and transforms it into a Python string, which is returned from the `invoke` method.\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "c4873109",
-   "metadata": {},
-   "source": [
-    "```mermaid\n",
-    "graph LR\n",
-    "    A(Input: topic=ice cream) --> |Dict| B(PromptTemplate)\n",
-    "    B -->|PromptValue| C(ChatModel)    \n",
-    "    C -->|AIMessage| D(StrOutputParser)\n",
-    "    D --> |String| F(Result)\n",
-    "```\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "fe63534d",
-   "metadata": {},
-   "source": [
-    ":::info\n",
-    "\n",
-    "If you’re curious about the output of any component, you can always test out a smaller version of the chain, such as `prompt` or `prompt | model`, to see the intermediate results:\n",
-    "\n",
-    ":::"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "11089b6f-23f8-474f-97ec-8cae8d0ca6d4",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "input = {\"topic\": \"ice cream\"}\n",
-    "\n",
-    "prompt.invoke(input)\n",
-    "# > ChatPromptValue(messages=[HumanMessage(content='tell me a short joke about ice cream')])\n",
-    "\n",
-    "(prompt | model).invoke(input)\n",
-    "# > AIMessage(content=\"Why did the ice cream go to therapy?\\nBecause it had too many toppings and couldn't cone-trol itself!\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "cc7d3b9d-e400-4c9b-9188-f29dac73e6bb",
-   "metadata": {},
-   "source": [
-    "## RAG Search Example\n",
-    "\n",
-    "For our next example, we want to run a retrieval-augmented generation chain to add some context when responding to questions. "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "662426e8-4316-41dc-8312-9b58edc7e0c9",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Requires:\n",
-    "# pip install langchain docarray\n",
-    "\n",
-    "from langchain.chat_models import ChatOpenAI\n",
-    "from langchain.embeddings import OpenAIEmbeddings\n",
-    "from langchain.prompts import ChatPromptTemplate\n",
-    "from langchain.schema.output_parser import StrOutputParser\n",
-    "from langchain.schema.runnable import RunnableParallel, RunnablePassthrough\n",
-    "from langchain.vectorstores import DocArrayInMemorySearch\n",
-    "\n",
-    "vectorstore = DocArrayInMemorySearch.from_texts(\n",
-    "    [\"harrison worked at kensho\", \"bears like to eat honey\"],\n",
-    "    embedding=OpenAIEmbeddings(),\n",
-    ")\n",
-    "retriever = vectorstore.as_retriever()\n",
-    "\n",
-    "template = \"\"\"Answer the question based only on the following context:\n",
-    "{context}\n",
-    "\n",
-    "Question: {question}\n",
-    "\"\"\"\n",
-    "prompt = ChatPromptTemplate.from_template(template)\n",
-    "model = ChatOpenAI()\n",
-    "output_parser = StrOutputParser()\n",
-    "\n",
-    "setup_and_retrieval = RunnableParallel(\n",
-    "    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
-    ")\n",
-    "chain = setup_and_retrieval | prompt | model | output_parser\n",
-    "\n",
-    "chain.invoke(\"where did harrison work?\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "f0999140-6001-423b-970b-adf1dfdb4dec",
-   "metadata": {},
-   "source": [
-    "In this case, the composed chain is: "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "5b88e9bb-f04a-4a56-87ec-19a0e6350763",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "chain = setup_and_retrieval | prompt | model | output_parser"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "6e929e15-40a5-4569-8969-384f636cab87",
-   "metadata": {},
-   "source": [
-    "To explain this, first note that the prompt template above takes in `context` and `question` as values to be substituted in the prompt. Before building the prompt template, we want to retrieve documents relevant to the search and include them as part of the context. \n",
-    "\n",
-    "As a preliminary step, we’ve set up the retriever using an in-memory store, which can retrieve documents based on a query. This is a runnable component as well, so it can be chained together with other components, but you can also try running it separately:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "a7319ef6-613b-4638-ad7d-4a2183702c1d",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "retriever.invoke(\"where did harrison work?\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "e6833844-f1c4-444c-a3d2-31b3c6b31d46",
-   "metadata": {},
-   "source": [
-    "We then use `RunnableParallel` to prepare the inputs the prompt expects: the retriever performs the document search to fill `context`, and `RunnablePassthrough` passes the user’s question along unchanged as `question`:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "dcbca26b-d6b9-4c24-806c-1ec8fdaab4ed",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "setup_and_retrieval = RunnableParallel(\n",
-    "    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "68c721c1-048b-4a64-9d78-df54fe465992",
-   "metadata": {},
-   "source": [
-    "To review, the complete chain is:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "1d5115a7-7b8e-458b-b936-26cc87ee81c4",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "setup_and_retrieval = RunnableParallel(\n",
-    "    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
-    ")\n",
-    "chain = setup_and_retrieval | prompt | model | output_parser"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "5c6f5f74-b387-48a0-bedd-1fae202cd10a",
-   "metadata": {},
-   "source": [
-    "With the flow being:\n",
-    "\n",
-    "1. The first step creates a `RunnableParallel` object with two entries. The first entry, `context`, will include the document results fetched by the retriever. The second entry, `question`, will contain the user’s original question. To pass on the question, we use `RunnablePassthrough` to copy this entry. \n",
-    "2. Feed the dictionary from the step above to the `prompt` component. It takes the user input (`question`) and the retrieved documents (`context`) to construct a prompt and output a `PromptValue`.\n",
-    "3. The `model` component takes the generated prompt and passes it to the OpenAI chat model for evaluation. The generated output from the model is an `AIMessage` object.\n",
-    "4. Finally, the `output_parser` component takes in that message and transforms it into a Python string, which is returned from the `invoke` method.\n",
-    "\n",
-    "```mermaid\n",
-    "graph LR\n",
-    "    A(Question) --> B(RunnableParallel)\n",
-    "    B -->|Question| C(Retriever)\n",
-    "    B -->|Question| D(RunnablePassthrough)\n",
-    "    C -->|context=retrieved docs| E(PromptTemplate)\n",
-    "    D -->|question=Question| E\n",
-    "    E -->|PromptValue| F(ChatModel)    \n",
-    "    F -->|AIMessage| G(StrOutputParser)\n",
-    "    G --> |String| H(Result)\n",
-    "```\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "8c2438df-164e-4bbe-b5f4-461695e45b0f",
-   "metadata": {},
-   "source": [
-    "## Next steps\n",
-    "\n",
-    "We recommend reading our [Why use LCEL](/docs/expression_language/why) section next to see a side-by-side comparison of the code needed to produce common functionality with and without LCEL."
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.9.1"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
--- a/docs/docs/expression_language/how_to/configure.ipynb
+++ b/docs/docs/expression_language/how_to/configure.ipynb
@@ -43,7 +43,6 @@
   "source": [
    "from langchain.chat_models import ChatOpenAI\n",
    "from langchain.prompts import PromptTemplate\n",
-    "from langchain.schema.runnable import ConfigurableField\n",
    "\n",
    "model = ChatOpenAI(temperature=0).configurable_fields(\n",
    "    temperature=ConfigurableField(\n",
@@ -595,7 +594,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.11.5"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/expression_language/how_to/fallbacks.ipynb
+++ b/docs/docs/expression_language/how_to/fallbacks.ipynb
@@ -26,7 +26,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 2,
   "id": "d3e893bf",
   "metadata": {},
   "outputs": [],
@@ -44,24 +44,19 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 3,
   "id": "dfdd8bf5",
   "metadata": {},
   "outputs": [],
   "source": [
    "from unittest.mock import patch\n",
    "\n",
-    "import httpx\n",
-    "from openai import RateLimitError\n",
-    "\n",
-    "request = httpx.Request(\"GET\", \"/\")\n",
-    "response = httpx.Response(200, request=request)\n",
-    "error = RateLimitError(\"rate limit\", response=response, body=\"\")"
+    "from openai.error import RateLimitError"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 5,
   "id": "e6fdffc1",
   "metadata": {},
   "outputs": [],
@@ -74,7 +69,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 27,
   "id": "584461ab",
   "metadata": {},
   "outputs": [
@@ -88,10 +83,10 @@
   ],
   "source": [
    "# Let's use just the OpenAI LLm first, to show that we run into an error\n",
-    "with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
+    "with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
    "    try:\n",
    "        print(openai_llm.invoke(\"Why did the chicken cross the road?\"))\n",
-    "    except RateLimitError:\n",
+    "    except:\n",
    "        print(\"Hit error\")"
   ]
  },
@@ -111,10 +106,10 @@
   ],
   "source": [
    "# Now let's try with fallbacks to Anthropic\n",
-    "with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
+    "with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
    "    try:\n",
    "        print(llm.invoke(\"Why did the chicken cross the road?\"))\n",
-    "    except RateLimitError:\n",
+    "    except:\n",
    "        print(\"Hit error\")"
   ]
  },
@@ -153,10 +148,10 @@
    "    ]\n",
    ")\n",
    "chain = prompt | llm\n",
-    "with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
+    "with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
    "    try:\n",
    "        print(chain.invoke({\"animal\": \"kangaroo\"}))\n",
-    "    except RateLimitError:\n",
+    "    except:\n",
    "        print(\"Hit error\")"
   ]
  },
@@ -190,10 +185,10 @@
    ")\n",
    "\n",
    "chain = prompt | llm\n",
-    "with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
+    "with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
    "    try:\n",
    "        print(chain.invoke({\"animal\": \"kangaroo\"}))\n",
-    "    except RateLimitError:\n",
+    "    except:\n",
    "        print(\"Hit error\")"
   ]
  },
@@ -291,7 +286,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.10.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/expression_language/how_to/functions.ipynb
+++ b/docs/docs/expression_language/how_to/functions.ipynb
@@ -1,16 +1,5 @@
 {
 "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_position: 2\n",
-    "title: \"RunnableLambda: Run Custom Functions\"\n",
-    "keywords: [RunnableLambda, LCEL]\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "id": "fbc4bf6e",
@@ -18,14 +7,14 @@
   "source": [
    "# Run custom functions\n",
    "\n",
-    "You can use arbitrary functions in the pipeline.\n",
+    "You can use arbitrary functions in the pipeline\n",
    "\n",
    "Note that all inputs to these functions need to be a SINGLE argument. If you have a function that accepts multiple arguments, you should write a wrapper that accepts a single input and unpacks it into multiple argument."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 4,
   "id": "6bb221b3",
   "metadata": {},
   "outputs": [],
@@ -67,17 +56,17 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 5,
   "id": "5488ec85",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "AIMessage(content='3 + 9 equals 12.')"
+       "AIMessage(content='3 + 9 equals 12.', additional_kwargs={}, example=False)"
      ]
     },
-     "execution_count": 2,
+     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -93,12 +82,12 @@
   "source": [
    "## Accepting a Runnable Config\n",
    "\n",
-    "Runnable lambdas can optionally accept a [RunnableConfig](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.config.RunnableConfig.html#langchain_core.runnables.config.RunnableConfig), which they can use to pass callbacks, tags, and other configuration information to nested runs."
+    "Runnable lambdas can optionally accept a [RunnableConfig](https://api.python.langchain.com/en/latest/schema/langchain.schema.runnable.config.RunnableConfig.html?highlight=runnableconfig#langchain.schema.runnable.config.RunnableConfig), which they can use to pass callbacks, tags, and other configuration information to nested runs."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 9,
   "id": "80b3b5f6-5d58-44b9-807e-cce9a46bf49f",
   "metadata": {},
   "outputs": [],
@@ -109,7 +98,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 10,
   "id": "ff0daf0c-49dd-4d21-9772-e5fa133c5f36",
   "metadata": {},
   "outputs": [],
@@ -136,7 +125,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 12,
   "id": "1a5e709e-9d75-48c7-bb9c-503251990505",
   "metadata": {},
   "outputs": [
@@ -144,7 +133,6 @@
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      "{'foo': 'bar'}\n",
      "Tokens Used: 65\n",
      "\tPrompt Tokens: 56\n",
      "\tCompletion Tokens: 9\n",
@@ -157,10 +145,9 @@
    "from langchain.callbacks import get_openai_callback\n",
    "\n",
    "with get_openai_callback() as cb:\n",
-    "    output = RunnableLambda(parse_or_fix).invoke(\n",
+    "    RunnableLambda(parse_or_fix).invoke(\n",
    "        \"{foo: bar}\", {\"tags\": [\"my-tag\"], \"callbacks\": [cb]}\n",
    "    )\n",
-    "    print(output)\n",
    "    print(cb)"
   ]
  },
@@ -189,7 +176,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.11.5"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/expression_language/how_to/generators.ipynb
+++ b/docs/docs/expression_language/how_to/generators.ipynb
@@ -17,13 +17,6 @@
    "Let's implement a custom output parser for comma-separated lists."
   ]
  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Sync version"
-   ]
-  },
  {
   "cell_type": "code",
   "execution_count": 1,
@@ -64,7 +57,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
@@ -73,7 +66,7 @@
       "'lion, tiger, wolf, gorilla, panda'"
      ]
     },
-     "execution_count": 3,
+     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -159,81 +152,12 @@
    "list_chain.invoke({\"animal\": \"bear\"})"
   ]
  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Async version"
-   ]
-  },
  {
   "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": null,
   "metadata": {},
   "outputs": [],
-   "source": [
-    "from typing import AsyncIterator\n",
-    "\n",
-    "\n",
-    "async def asplit_into_list(\n",
-    "    input: AsyncIterator[str]\n",
-    ") -> AsyncIterator[List[str]]:  # async def\n",
-    "    buffer = \"\"\n",
-    "    async for (\n",
-    "        chunk\n",
-    "    ) in input:  # `input` is a `async_generator` object, so use `async for`\n",
-    "        buffer += chunk\n",
-    "        while \",\" in buffer:\n",
-    "            comma_index = buffer.index(\",\")\n",
-    "            yield [buffer[:comma_index].strip()]\n",
-    "            buffer = buffer[comma_index + 1 :]\n",
-    "    yield [buffer.strip()]\n",
-    "\n",
-    "\n",
-    "list_chain = str_chain | asplit_into_list"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "['lion']\n",
-      "['tiger']\n",
-      "['wolf']\n",
-      "['gorilla']\n",
-      "['panda']\n"
-     ]
-    }
-   ],
-   "source": [
-    "async for chunk in list_chain.astream({\"animal\": \"bear\"}):\n",
-    "    print(chunk, flush=True)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "['lion', 'tiger', 'wolf', 'gorilla', 'panda']"
-      ]
-     },
-     "execution_count": 10,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "await list_chain.ainvoke({\"animal\": \"bear\"})"
-   ]
+   "source": []
  }
 ],
 "metadata": {
@@ -252,7 +176,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.11.5"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/expression_language/how_to/index.mdx
+++ b/docs/docs/expression_language/how_to/index.mdx
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 2
+sidebar_position: 1
 ---

 # How to
--- a/docs/docs/expression_language/how_to/map.ipynb
+++ b/docs/docs/expression_language/how_to/map.ipynb
@@ -1,192 +1,29 @@
 {
 "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "e2596041-9b76-4e74-836f-e6235086bbf0",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_position: 0\n",
-    "title: \"RunnableParallel: Manipulating data\"\n",
-    "keywords: [RunnableParallel, RunnableMap, LCEL]\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "id": "b022ab74-794d-4c54-ad47-ff9549ddb9d2",
   "metadata": {},
   "source": [
-    "# Manipulating inputs & output\n",
-    "\n",
-    "RunnableParallel can be useful for manipulating the output of one Runnable to match the input format of the next Runnable in a sequence.\n",
-    "\n",
-    "Here the input to prompt is expected to be a map with keys \"context\" and \"question\". The user input is just the question. So we need to get the context using our retriever and pass the user input through under the \"question\" key.\n",
-    "\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "id": "267d1460-53c1-4fdb-b2c3-b6a1eb7fccff",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'Harrison worked at Kensho.'"
-      ]
-     },
-     "execution_count": 3,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from langchain.chat_models import ChatOpenAI\n",
-    "from langchain.embeddings import OpenAIEmbeddings\n",
-    "from langchain.prompts import ChatPromptTemplate\n",
-    "from langchain.schema.output_parser import StrOutputParser\n",
-    "from langchain.schema.runnable import RunnablePassthrough\n",
-    "from langchain.vectorstores import FAISS\n",
-    "\n",
-    "vectorstore = FAISS.from_texts(\n",
-    "    [\"harrison worked at kensho\"], embedding=OpenAIEmbeddings()\n",
-    ")\n",
-    "retriever = vectorstore.as_retriever()\n",
-    "template = \"\"\"Answer the question based only on the following context:\n",
-    "{context}\n",
-    "\n",
-    "Question: {question}\n",
-    "\"\"\"\n",
-    "prompt = ChatPromptTemplate.from_template(template)\n",
-    "model = ChatOpenAI()\n",
-    "\n",
-    "retrieval_chain = (\n",
-    "    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
-    "    | prompt\n",
-    "    | model\n",
-    "    | StrOutputParser()\n",
-    ")\n",
-    "\n",
-    "retrieval_chain.invoke(\"where did harrison work?\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "392cd4c4-e7ed-4ab8-934d-f7a4eca55ee1",
-   "metadata": {},
-   "source": [
-    "::: {.callout-tip}\n",
-    "Note that when composing a RunnableParallel with another Runnable we don't even need to wrap our dictionary in the RunnableParallel class — the type conversion is handled for us. In the context of a chain, these are equivalent:\n",
-    ":::\n",
-    "\n",
-    "```\n",
-    "{\"context\": retriever, \"question\": RunnablePassthrough()}\n",
-    "```\n",
-    "\n",
-    "```\n",
-    "RunnableParallel({\"context\": retriever, \"question\": RunnablePassthrough()})\n",
-    "```\n",
-    "\n",
-    "```\n",
-    "RunnableParallel(context=retriever, question=RunnablePassthrough())\n",
-    "```\n",
-    "\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "7c1b8baa-3a80-44f0-bb79-d22f79815d3d",
-   "metadata": {},
-   "source": [
-    "## Using itemgetter as shorthand\n",
-    "\n",
-    "Note that you can use Python's `itemgetter` as shorthand to extract data from the map when combining with `RunnableParallel`. You can find more information about itemgetter in the [Python Documentation](https://docs.python.org/3/library/operator.html#operator.itemgetter). \n",
-    "\n",
-    "In the example below, we use itemgetter to extract specific keys from the map:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "id": "84fc49e1-2daf-4700-ae33-a0a6ed47d5f6",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'Harrison ha lavorato a Kensho.'"
-      ]
-     },
-     "execution_count": 6,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from operator import itemgetter\n",
-    "\n",
-    "from langchain.chat_models import ChatOpenAI\n",
-    "from langchain.embeddings import OpenAIEmbeddings\n",
-    "from langchain.prompts import ChatPromptTemplate\n",
-    "from langchain.schema.output_parser import StrOutputParser\n",
-    "from langchain.schema.runnable import RunnablePassthrough\n",
-    "from langchain.vectorstores import FAISS\n",
-    "\n",
-    "vectorstore = FAISS.from_texts(\n",
-    "    [\"harrison worked at kensho\"], embedding=OpenAIEmbeddings()\n",
-    ")\n",
-    "retriever = vectorstore.as_retriever()\n",
-    "\n",
-    "template = \"\"\"Answer the question based only on the following context:\n",
-    "{context}\n",
-    "\n",
-    "Question: {question}\n",
-    "\n",
-    "Answer in the following language: {language}\n",
-    "\"\"\"\n",
-    "prompt = ChatPromptTemplate.from_template(template)\n",
-    "\n",
-    "chain = (\n",
-    "    {\n",
-    "        \"context\": itemgetter(\"question\") | retriever,\n",
-    "        \"question\": itemgetter(\"question\"),\n",
-    "        \"language\": itemgetter(\"language\"),\n",
-    "    }\n",
-    "    | prompt\n",
-    "    | model\n",
-    "    | StrOutputParser()\n",
-    ")\n",
-    "\n",
-    "chain.invoke({\"question\": \"where did harrison work\", \"language\": \"italian\"})"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "bc2f9847-39aa-4fe4-9049-3a8969bc4bce",
-   "metadata": {},
-   "source": [
-    "## Parallelize steps\n",
+    "# Parallelize steps\n",
    "\n",
    "RunnableParallel (aka. RunnableMap) makes it easy to execute multiple Runnables in parallel, and to return the output of these Runnables as a map."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
-   "id": "31f18442-f837-463f-bef4-8729368f5f8b",
+   "execution_count": 2,
+   "id": "7e1873d6-d4b6-43ac-96a1-edcf178201e0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "{'joke': AIMessage(content=\"Why don't bears wear shoes?\\n\\nBecause they have bear feet!\"),\n",
-       " 'poem': AIMessage(content=\"In the wild's embrace, bear roams free,\\nStrength and grace, a majestic decree.\")}"
+       "{'joke': AIMessage(content=\"Why don't bears wear shoes? \\n\\nBecause they have bear feet!\", additional_kwargs={}, example=False),\n",
+       " 'poem': AIMessage(content=\"In woodland depths, bear prowls with might,\\nSilent strength, nature's sovereign, day and night.\", additional_kwargs={}, example=False)}"
      ]
     },
-     "execution_count": 1,
+     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
@@ -207,6 +44,69 @@
    "map_chain.invoke({\"topic\": \"bear\"})"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "df867ae9-1cec-4c9e-9fef-21969b206af5",
+   "metadata": {},
+   "source": [
+    "## Manipulating outputs/inputs\n",
+    "Maps can be useful for manipulating the output of one Runnable to match the input format of the next Runnable in a sequence."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "267d1460-53c1-4fdb-b2c3-b6a1eb7fccff",
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'Harrison worked at Kensho.'"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "from langchain.embeddings import OpenAIEmbeddings\n",
+    "from langchain.schema.output_parser import StrOutputParser\n",
+    "from langchain.schema.runnable import RunnablePassthrough\n",
+    "from langchain.vectorstores import FAISS\n",
+    "\n",
+    "vectorstore = FAISS.from_texts(\n",
+    "    [\"harrison worked at kensho\"], embedding=OpenAIEmbeddings()\n",
+    ")\n",
+    "retriever = vectorstore.as_retriever()\n",
+    "template = \"\"\"Answer the question based only on the following context:\n",
+    "{context}\n",
+    "\n",
+    "Question: {question}\n",
+    "\"\"\"\n",
+    "prompt = ChatPromptTemplate.from_template(template)\n",
+    "\n",
+    "retrieval_chain = (\n",
+    "    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
+    "    | prompt\n",
+    "    | model\n",
+    "    | StrOutputParser()\n",
+    ")\n",
+    "\n",
+    "retrieval_chain.invoke(\"where did harrison work?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "392cd4c4-e7ed-4ab8-934d-f7a4eca55ee1",
+   "metadata": {},
+   "source": [
+    "Here the input to prompt is expected to be a map with keys \"context\" and \"question\". The user input is just the question, so we need to fetch the context with our retriever and pass the user input through under the \"question\" key.\n",
+    "\n",
+    "Note that when composing a RunnableMap with another Runnable we don't even need to wrap our dictionary in the RunnableMap class; the type conversion is handled for us."
+   ]
+  },
  {
   "cell_type": "markdown",
   "id": "833da249-c0d4-4e5b-b3f8-cab549f0f7e1",
@@ -214,7 +114,7 @@
   "source": [
    "## Parallelism\n",
    "\n",
-    "RunnableParallel are also useful for running independent processes in parallel, since each Runnable in the map is executed in parallel. For example, we can see our earlier `joke_chain`, `poem_chain` and `map_chain` all have about the same runtime, even though `map_chain` executes both of the other two."
+    "RunnableMaps are also useful for running independent processes in parallel, since each Runnable in the map is executed in parallel. For example, we can see our earlier `joke_chain`, `poem_chain` and `map_chain` all have about the same runtime, even though `map_chain` executes both of the other two."
   ]
  },
  {
@@ -294,7 +194,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.11.6"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/expression_language/how_to/message_history.ipynb
+++ b/docs/docs/expression_language/how_to/message_history.ipynb
@@ -1,402 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "6a4becbd-238e-4c1d-a02d-08e61fbc3763",
-   "metadata": {},
-   "source": [
-    "# Add message history (memory)\n",
-    "\n",
-    "The `RunnableWithMessageHistory` let's us add message history to certain types of chains.\n",
-    "\n",
-    "Specifically, it can be used for any Runnable that takes as input one of\n",
-    "* a sequence of `BaseMessage`\n",
-    "* a dict with a key that takes a sequence of `BaseMessage`\n",
-    "* a dict with a key that takes the latest message(s) as a string or sequence of `BaseMessage`, and a separate key that takes historical messages\n",
-    "\n",
-    "And returns as output one of\n",
-    "* a string that can be treated as the contents of an `AIMessage`\n",
-    "* a sequence of `BaseMessage`\n",
-    "* a dict with a key that contains a sequence of `BaseMessage`\n",
-    "\n",
-    "Let's take a look at some examples to see how it works."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "6bca45e5-35d9-4603-9ca9-6ac0ce0e35cd",
-   "metadata": {},
-   "source": [
-    "## Setup\n",
-    "\n",
-    "We'll use Redis to store our chat message histories and Anthropic's claude-2 model so we'll need to install the following dependencies:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "477d04b3-c2b6-4ba5-962f-492c0d625cd5",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "!pip install -U langchain redis anthropic"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "93776323-d6b8-4912-bb6a-867c5e655f46",
-   "metadata": {},
-   "source": [
-    "Set your [Anthropic API  key](https://console.anthropic.com/):"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "c7f56f69-d2f1-4a21-990c-b5551eb012fa",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import getpass\n",
-    "import os\n",
-    "\n",
-    "os.environ[\"ANTHROPIC_API_KEY\"] = getpass.getpass()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "6a0ec9e0-7b1c-4c6f-b570-e61d520b47c6",
-   "metadata": {},
-   "source": [
-    "Start a local Redis Stack server if we don't have an existing Redis deployment to connect to:\n",
-    "```bash\n",
-    "docker run -d -p 6379:6379 -p 8001:8001 redis/redis-stack:latest\n",
-    "```"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "id": "cd6a250e-17fe-4368-a39d-1fe6b2cbde68",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "REDIS_URL = \"redis://localhost:6379/0\""
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "36f43b87-655c-4f64-aa7b-bd8c1955d8e5",
-   "metadata": {},
-   "source": [
-    "### [LangSmith](/docs/langsmith)\n",
-    "\n",
-    "LangSmith is especially useful for something like message history injection, where it can be hard to otherwise understand what the inputs are to various parts of the chain.\n",
-    "\n",
-    "Note that LangSmith is not needed, but it is helpful.\n",
-    "If you do want to use LangSmith, after you sign up at the link above, make sure to uncoment the below and set your environment variables to start logging traces:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "id": "2afc1556-8da1-4499-ba11-983b66c58b18",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
-    "# os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "1a5a632e-ba9e-4488-b586-640ad5494f62",
-   "metadata": {},
-   "source": [
-    "## Example: Dict input, message output\n",
-    "\n",
-    "Let's create a simple chain that takes a dict as input and returns a BaseMessage.\n",
-    "\n",
-    "In this case the `\"question\"` key in the input represents our input message, and the `\"history\"` key is where our historical messages will be injected."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "id": "2a150d6f-8878-4950-8634-a608c5faad56",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from typing import Optional\n",
-    "\n",
-    "from langchain.chat_models import ChatAnthropic\n",
-    "from langchain.memory.chat_message_histories import RedisChatMessageHistory\n",
-    "from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder\n",
-    "from langchain.schema.chat_history import BaseChatMessageHistory\n",
-    "from langchain.schema.runnable.history import RunnableWithMessageHistory"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "id": "3185edba-4eb6-4b32-80c6-577c0d19af97",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "prompt = ChatPromptTemplate.from_messages(\n",
-    "    [\n",
-    "        (\"system\", \"You're an assistant who's good at {ability}\"),\n",
-    "        MessagesPlaceholder(variable_name=\"history\"),\n",
-    "        (\"human\", \"{question}\"),\n",
-    "    ]\n",
-    ")\n",
-    "\n",
-    "chain = prompt | ChatAnthropic(model=\"claude-2\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "f9d81796-ce61-484c-89e2-6c567d5e54ef",
-   "metadata": {},
-   "source": [
-    "### Adding message history\n",
-    "\n",
-    "To add message history to our original chain we wrap it in the `RunnableWithMessageHistory` class.\n",
-    "\n",
-    "Crucially, we also need to  define a method that takes a session_id string and based on it returns a `BaseChatMessageHistory`. Given the same input, this method should return an equivalent output.\n",
-    "\n",
-    "In this case we'll also want to specify `input_messages_key` (the key to be treated as the latest input message) and `history_messages_key` (the key to add historical messages to)."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "id": "ca7c64d8-e138-4ef8-9734-f82076c47d80",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "chain_with_history = RunnableWithMessageHistory(\n",
-    "    chain,\n",
-    "    lambda session_id: RedisChatMessageHistory(session_id, url=REDIS_URL),\n",
-    "    input_messages_key=\"question\",\n",
-    "    history_messages_key=\"history\",\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "37eefdec-9901-4650-b64c-d3c097ed5f4d",
-   "metadata": {},
-   "source": [
-    "## Invoking with config\n",
-    "\n",
-    "Whenever we call our chain with message history, we need to include a config that contains the `session_id`\n",
-    "```python\n",
-    "config={\"configurable\": {\"session_id\": \"<SESSION_ID>\"}}\n",
-    "```\n",
-    "\n",
-    "Given the same configuration, our chain should be pulling from the same chat message history."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "id": "a85bcc22-ca4c-4ad5-9440-f94be7318f3e",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "AIMessage(content=' Cosine is one of the basic trigonometric functions in mathematics. It is defined as the ratio of the adjacent side to the hypotenuse in a right triangle.\\n\\nSome key properties and facts about cosine:\\n\\n- It is denoted by cos(θ), where θ is the angle in a right triangle. \\n\\n- The cosine of an acute angle is always positive. For angles greater than 90 degrees, cosine can be negative.\\n\\n- Cosine is one of the three main trig functions along with sine and tangent.\\n\\n- The cosine of 0 degrees is 1. As the angle increases towards 90 degrees, the cosine value decreases towards 0.\\n\\n- The range of values for cosine is -1 to 1.\\n\\n- The cosine function maps angles in a circle to the x-coordinate on the unit circle.\\n\\n- Cosine is used to find adjacent side lengths in right triangles, and has many other applications in mathematics, physics, engineering and more.\\n\\n- Key cosine identities include: cos(A+B) = cosAcosB − sinAsinB and cos(2A) = cos^2(A) − sin^2(A)\\n\\nSo in summary, cosine is a fundamental trig')"
-      ]
-     },
-     "execution_count": 7,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "chain_with_history.invoke(\n",
-    "    {\"ability\": \"math\", \"question\": \"What does cosine mean?\"},\n",
-    "    config={\"configurable\": {\"session_id\": \"foobar\"}},\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "id": "ab29abd3-751f-41ce-a1b0-53f6b565e79d",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "AIMessage(content=' The inverse of the cosine function is called the arccosine or inverse cosine, often denoted as cos-1(x) or arccos(x).\\n\\nThe key properties and facts about arccosine:\\n\\n- It is defined as the angle θ between 0 and π radians whose cosine is x. So arccos(x) = θ such that cos(θ) = x.\\n\\n- The range of arccosine is 0 to π radians (0 to 180 degrees).\\n\\n- The domain of arccosine is -1 to 1. \\n\\n- arccos(cos(θ)) = θ for values of θ from 0 to π radians.\\n\\n- arccos(x) is the angle in a right triangle whose adjacent side is x and hypotenuse is 1.\\n\\n- arccos(0) = 90 degrees. As x increases from 0 to 1, arccos(x) decreases from 90 to 0 degrees.\\n\\n- arccos(1) = 0 degrees. arccos(-1) = 180 degrees.\\n\\n- The graph of y = arccos(x) is part of the unit circle, restricted to x')"
-      ]
-     },
-     "execution_count": 8,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "chain_with_history.invoke(\n",
-    "    {\"ability\": \"math\", \"question\": \"What's its inverse\"},\n",
-    "    config={\"configurable\": {\"session_id\": \"foobar\"}},\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "da3d1feb-b4bb-4624-961c-7db2e1180df7",
-   "metadata": {},
-   "source": [
-    ":::tip\n",
-    "\n",
-    "[Langsmith trace](https://smith.langchain.com/public/863a003b-7ca8-4b24-be9e-d63ec13c106e/r)\n",
-    "\n",
-    ":::"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "61d5115e-64a1-4ad5-b676-8afd4ef6093e",
-   "metadata": {},
-   "source": [
-    "Looking at the Langsmith trace for the second call, we can see that when constructing the prompt, a \"history\" variable has been injected which is a list of two messages (our first input and first output)."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "028cf151-6cd5-4533-b3cf-c8d735554647",
-   "metadata": {},
-   "source": [
-    "## Example: messages input, dict output"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 14,
-   "id": "0bb446b5-6251-45fe-a92a-4c6171473c53",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "{'output_message': AIMessage(content=' Here is a summary of Simone de Beauvoir\\'s views on free will:\\n\\n- De Beauvoir was an existentialist philosopher and believed strongly in the concept of free will. She rejected the idea that human nature or instincts determine behavior.\\n\\n- Instead, de Beauvoir argued that human beings define their own essence or nature through their actions and choices. As she famously wrote, \"One is not born, but rather becomes, a woman.\"\\n\\n- De Beauvoir believed that while individuals are situated in certain cultural contexts and social conditions, they still have agency and the ability to transcend these situations. Freedom comes from choosing one\\'s attitude toward these constraints.\\n\\n- She emphasized the radical freedom and responsibility of the individual. We are \"condemned to be free\" because we cannot escape making choices and taking responsibility for our choices. \\n\\n- De Beauvoir felt that many people evade their freedom and responsibility by adopting rigid mindsets, ideologies, or conforming uncritically to social roles.\\n\\n- She advocated for the recognition of ambiguity in the human condition and warned against the quest for absolute rules that deny freedom and responsibility. Authentic living involves embracing ambiguity.\\n\\nIn summary, de Beauvoir promoted an existential ethics')}"
-      ]
-     },
-     "execution_count": 14,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from langchain.schema.messages import HumanMessage\n",
-    "from langchain.schema.runnable import RunnableParallel\n",
-    "\n",
-    "chain = RunnableParallel({\"output_message\": ChatAnthropic(model=\"claude-2\")})\n",
-    "chain_with_history = RunnableWithMessageHistory(\n",
-    "    chain,\n",
-    "    lambda session_id: RedisChatMessageHistory(session_id, url=REDIS_URL),\n",
-    "    output_messages_key=\"output_message\",\n",
-    ")\n",
-    "\n",
-    "chain_with_history.invoke(\n",
-    "    [HumanMessage(content=\"What did Simone de Beauvoir believe about free will\")],\n",
-    "    config={\"configurable\": {\"session_id\": \"baz\"}},\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 16,
-   "id": "601ce3ff-aea8-424d-8e54-fd614256af4f",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "{'output_message': AIMessage(content=\" There are many similarities between Simone de Beauvoir's views on free will and those of Jean-Paul Sartre, though some key differences emerge as well:\\n\\nSimilarities with Sartre:\\n\\n- Both were existentialist thinkers who rejected determinism and emphasized human freedom and responsibility.\\n\\n- They agreed that existence precedes essence - there is no predefined human nature that determines who we are.\\n\\n- Individuals must define themselves through their choices and actions. This leads to anxiety but also freedom.\\n\\n- The human condition is characterized by ambiguity and uncertainty, rather than fixed meanings/values.\\n\\n- Both felt that most people evade their freedom through self-deception, conformity, or adopting collective identities/values uncritically.\\n\\nDifferences from Sartre: \\n\\n- Sartre placed more emphasis on the burden and anguish of radical freedom. De Beauvoir focused more on its positive potential.\\n\\n- De Beauvoir critiqued Sartre's premise that human relations are necessarily conflictual. She saw more potential for mutual recognition.\\n\\n- Sartre saw the Other's gaze as a threat to freedom. De Beauvoir put more stress on how the Other's gaze can confirm\")}"
-      ]
-     },
-     "execution_count": 16,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "chain_with_history.invoke(\n",
-    "    [HumanMessage(content=\"How did this compare to Sartre\")],\n",
-    "    config={\"configurable\": {\"session_id\": \"baz\"}},\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "b898d1b1-11e6-4d30-a8dd-cc5e45533611",
-   "metadata": {},
-   "source": [
-    ":::tip\n",
-    "\n",
-    "[LangSmith trace](https://smith.langchain.com/public/f6c3e1d1-a49d-4955-a9fa-c6519df74fa7/r)\n",
-    "\n",
-    ":::"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "1724292c-01c6-44bb-83e8-9cdb6bf01483",
-   "metadata": {},
-   "source": [
-    "## More examples\n",
-    "\n",
-    "We could also do any of the below:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "fd89240b-5a25-48f8-9568-5c1127f9ffad",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from operator import itemgetter\n",
-    "\n",
-    "# messages in, messages out\n",
-    "RunnableWithMessageHistory(\n",
-    "    ChatAnthropic(model=\"claude-2\"),\n",
-    "    lambda session_id: RedisChatMessageHistory(session_id, url=REDIS_URL),\n",
-    ")\n",
-    "\n",
-    "# dict with single key for all messages in, messages out\n",
-    "RunnableWithMessageHistory(\n",
-    "    itemgetter(\"input_messages\") | ChatAnthropic(model=\"claude-2\"),\n",
-    "    lambda session_id: RedisChatMessageHistory(session_id, url=REDIS_URL),\n",
-    "    input_messages_key=\"input_messages\",\n",
-    ")"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "poetry-venv",
-   "language": "python",
-   "name": "poetry-venv"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.9.1"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
--- a/docs/docs/expression_language/how_to/passthrough.ipynb
+++ b/docs/docs/expression_language/how_to/passthrough.ipynb
@@ -1,159 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "d35de667-0352-4bfb-a890-cebe7f676fe7",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_position: 1\n",
-    "title: \"RunnablePassthrough: Passing data through\"\n",
-    "keywords: [RunnablePassthrough, RunnableParallel, LCEL]\n",
-    "---"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "b022ab74-794d-4c54-ad47-ff9549ddb9d2",
-   "metadata": {},
-   "source": [
-    "# Passing data through\n",
-    "\n",
-    "RunnablePassthrough allows to pass inputs unchanged or with the addition of extra keys. This typically is used in conjuction with RunnableParallel to assign data to a new key in the map. \n",
-    "\n",
-    "RunnablePassthrough() called on it's own, will simply take the input and pass it through. \n",
-    "\n",
-    "RunnablePassthrough called with assign (`RunnablePassthrough.assign(...)`) will take the input, and will add the extra arguments passed to the assign function. \n",
-    "\n",
-    "See the example below:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "id": "03988b8d-d54c-4492-8707-1594372cf093",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "{'passed': {'num': 1}, 'extra': {'num': 1, 'mult': 3}, 'modified': 2}"
-      ]
-     },
-     "execution_count": 11,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from langchain.schema.runnable import RunnableParallel, RunnablePassthrough\n",
-    "\n",
-    "runnable = RunnableParallel(\n",
-    "    passed=RunnablePassthrough(),\n",
-    "    extra=RunnablePassthrough.assign(mult=lambda x: x[\"num\"] * 3),\n",
-    "    modified=lambda x: x[\"num\"] + 1,\n",
-    ")\n",
-    "\n",
-    "runnable.invoke({\"num\": 1})"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "702c7acc-cd31-4037-9489-647df192fd7c",
-   "metadata": {},
-   "source": [
-    "As seen above, `passed` key was called with `RunnablePassthrough()` and so it simply passed on `{'num': 1}`. \n",
-    "\n",
-    "In the second line, we used `RunnablePastshrough.assign` with a lambda that multiplies the numerical value by 3. In this cased, `extra` was set with `{'num': 1, 'mult': 3}` which is the original value with the `mult` key added. \n",
-    "\n",
-    "Finally, we also set a third key in the map with `modified` which uses a labmda to set a single value adding 1 to the num, which resulted in `modified` key with the value of `2`."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "15187a3b-d666-4b9b-a258-672fc51fe0e2",
-   "metadata": {},
-   "source": [
-    "## Retrieval Example\n",
-    "\n",
-    "In the example below, we see a use case where we use RunnablePassthrough along with RunnableMap. "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 17,
-   "id": "267d1460-53c1-4fdb-b2c3-b6a1eb7fccff",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "'Harrison worked at Kensho.'"
-      ]
-     },
-     "execution_count": 17,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from langchain.chat_models import ChatOpenAI\n",
-    "from langchain.embeddings import OpenAIEmbeddings\n",
-    "from langchain.prompts import ChatPromptTemplate\n",
-    "from langchain.schema.output_parser import StrOutputParser\n",
-    "from langchain.schema.runnable import RunnablePassthrough\n",
-    "from langchain.vectorstores import FAISS\n",
-    "\n",
-    "vectorstore = FAISS.from_texts(\n",
-    "    [\"harrison worked at kensho\"], embedding=OpenAIEmbeddings()\n",
-    ")\n",
-    "retriever = vectorstore.as_retriever()\n",
-    "template = \"\"\"Answer the question based only on the following context:\n",
-    "{context}\n",
-    "\n",
-    "Question: {question}\n",
-    "\"\"\"\n",
-    "prompt = ChatPromptTemplate.from_template(template)\n",
-    "model = ChatOpenAI()\n",
-    "\n",
-    "retrieval_chain = (\n",
-    "    {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
-    "    | prompt\n",
-    "    | model\n",
-    "    | StrOutputParser()\n",
-    ")\n",
-    "\n",
-    "retrieval_chain.invoke(\"where did harrison work?\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "392cd4c4-e7ed-4ab8-934d-f7a4eca55ee1",
-   "metadata": {},
-   "source": [
-    "Here the input to prompt is expected to be a map with keys \"context\" and \"question\". The user input is just the question. So we need to get the context using our retriever and passthrough the user input under the \"question\" key. In this case, the RunnablePassthrough allows us to pass on the user's question to the prompt and model. \n"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.6"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
--- a/docs/docs/expression_language/how_to/routing.ipynb
+++ b/docs/docs/expression_language/how_to/routing.ipynb
@@ -1,16 +1,5 @@
 {
 "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_position: 3\n",
-    "title: \"RunnableBranch: Dynamically route logic based on input\"\n",
-    "keywords: [RunnableBranch, LCEL]\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "id": "4b47436a",
@@ -74,7 +63,7 @@
    "chain = (\n",
    "    PromptTemplate.from_template(\n",
    "        \"\"\"Given the user question below, classify it as either being about `LangChain`, `Anthropic`, or `Other`.\n",
-    "\n",
+    "                                     \n",
    "Do not respond with more than one word.\n",
    "\n",
    "<question>\n",
@@ -304,7 +293,7 @@
    }
   ],
   "source": [
-    "full_chain.invoke({\"question\": \"how do I use Anthropic?\"})"
+    "full_chain.invoke({\"question\": \"how do I use Anthroipc?\"})"
   ]
  },
  {
--- a/docs/docs/expression_language/index.mdx
+++ b/docs/docs/expression_language/index.mdx
@@ -20,7 +20,7 @@ Whenever your LCEL chains have steps that can be executed in parallel (eg if you
 Configure retries and fallbacks for any part of your LCEL chain. This is a great way to make your chains more reliable at scale. We’re currently working on adding streaming support for retries/fallbacks, so you can get the added reliability without any latency cost.

 **Access intermediate results**
-For more complex chains it’s often very useful to access the results of intermediate steps even before the final output is produced. This can be used to let end-users know something is happening, or even just to debug your chain. You can stream intermediate results, and it’s available on every [LangServe](/docs/langserve) server.
+For more complex chains it’s often very useful to access the results of intermediate steps even before the final output is produced. This can be used to let end-users know something is happening, or even just to debug your chain. You can stream intermediate results, and it’s available on every [LangServe](/docs/langserve) server.

 **Input and output schemas**
 Input and output schemas give every LCEL chain Pydantic and JSONSchema schemas inferred from the structure of your chain. This can be used for validation of inputs and outputs, and is an integral part of LangServe.
@@ -30,4 +30,4 @@ As your chains get more and more complex, it becomes increasingly important to u
 With LCEL, **all** steps are automatically logged to [LangSmith](/docs/langsmith/) for maximum observability and debuggability.

 **Seamless LangServe deployment integration**
-Any chain created with LCEL can be easily deployed using [LangServe](/docs/langserve).
+Any chain created with LCEL can be easily deployed using LangServe.
--- a/docs/docs/expression_language/interface.ipynb
+++ b/docs/docs/expression_language/interface.ipynb
@@ -6,7 +6,7 @@
   "metadata": {},
   "source": [
    "---\n",
-    "sidebar_position: 1\n",
+    "sidebar_position: 0\n",
    "title: Interface\n",
    "---"
   ]
@@ -16,7 +16,7 @@
   "id": "9a9acd2e",
   "metadata": {},
   "source": [
-    "To make it as easy as possible to create custom chains, we've implemented a [\"Runnable\"](https://api.python.langchain.com/en/stable/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable) protocol. The `Runnable` protocol is implemented for most components. \n",
+    "To make it as easy as possible to create custom chains, we've implemented a [\"Runnable\"](https://api.python.langchain.com/en/latest/schema/langchain.schema.runnable.base.Runnable.html#langchain.schema.runnable.base.Runnable) protocol. The `Runnable` protocol is implemented for most components. \n",
    "This is a standard interface, which makes it easy to define custom chains as well as invoke them in a standard way. \n",
    "The standard interface includes:\n",
    "\n",
--- a/docs/docs/expression_language/why.ipynb
+++ b/docs/docs/expression_language/why.ipynb
--- a/docs/docs/get_started/installation.mdx
+++ b/docs/docs/get_started/installation.mdx
@@ -29,7 +29,7 @@ If you want to install from source, you can do so by cloning the repo and be sur
 pip install -e .
 ```

-## LangChain experimental
+## LangChain experimental
 The `langchain-experimental` package holds experimental LangChain code, intended for research and experimental uses.
 Install with:

@@ -37,6 +37,14 @@ Install with:
 pip install langchain-experimental
 ```

+## LangChain CLI
+The LangChain CLI is useful for working with LangChain templates and other LangServe projects.
+Install with:
+
+```bash
+pip install langchain-cli
+```
+
 ## LangServe
 LangServe helps developers deploy LangChain runnables and chains as a REST API.
 LangServe is automatically installed by LangChain CLI.
@@ -47,14 +55,6 @@ pip install "langserve[all]"
 ```
 for both client and server dependencies. Or `pip install "langserve[client]"` for client code, and `pip install "langserve[server]"` for server code.

-## LangChain CLI
-The LangChain CLI is useful for working with LangChain templates and other LangServe projects.
-Install with:
-
-```bash
-pip install langchain-cli
-```
-
 ## LangSmith SDK
 The LangSmith SDK is automatically installed by LangChain.
 If not using LangChain, install with:
--- a/docs/docs/get_started/introduction.mdx
+++ b/docs/docs/get_started/introduction.mdx
@@ -14,7 +14,7 @@ This framework consists of several parts.
 - **[LangServe](/docs/langserve)**: A library for deploying LangChain chains as a REST API.
 - **[LangSmith](/docs/langsmith)**: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.

-![LangChain Diagram](/svg/langchain_stack.svg)
+![LangChain Diagram](/img/langchain_stack.png)

 Together, these products simplify the entire application lifecycle:
 - **Develop**: Write your applications in LangChain/LangChain.js. Hit the ground running using Templates for reference.
@@ -49,7 +49,7 @@ LCEL is a declarative way to compose chains. LCEL was designed from day 1 to sup

 - **[Overview](/docs/expression_language/)**: LCEL and its benefits
 - **[Interface](/docs/expression_language/interface)**: The standard interface for LCEL objects
- **[How-to](/docs/expression_language/how_to)**: Key features of LCEL
+- **[How-to](/docs/expression_language/interface)**: Key features of LCEL
 - **[Cookbook](/docs/expression_language/cookbook)**: Example code for accomplishing common tasks


@@ -79,7 +79,7 @@ Walkthroughs and techniques for common end-to-end use cases, like:
 ### [Integrations](/docs/integrations/providers/)
 LangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. Check out our growing list of [integrations](/docs/integrations/providers/).

-### [Guides](/docs/guides/guides/debugging)
+### [Guides](/docs/guides/adapters/openai)
 Best practices for developing with LangChain.

 ### [API reference](https://api.python.langchain.com)
--- a/docs/docs/get_started/quickstart.mdx
+++ b/docs/docs/get_started/quickstart.mdx
@@ -4,7 +4,7 @@ In this quickstart we'll show you how to:
 - Get setup with LangChain, LangSmith and LangServe
 - Use the most basic and common components of LangChain: prompt templates, models, and output parsers
 - Use LangChain Expression Language, the protocol that LangChain is built on and which facilitates component chaining
- Build a simple application with LangChain
+- Build a simple application with LangChain
 - Trace your application with LangSmith
 - Serve your application with LangServe

@@ -344,7 +344,7 @@ category_chain = chat_prompt | ChatOpenAI() | CommaSeparatedListOutputParser()
 app = FastAPI(
  title="LangChain Server",
  version="1.0",
-  description="A simple API server using LangChain's Runnable interfaces",
+  description="A simple API server using LangChain's Runnable interfaces",
 )

 # 3. Adding chain route
--- a/docs/docs/integrations/adapters/_category_.yml
+++ b/docs/docs/integrations/adapters/_category_.yml
--- a/docs/docs/integrations/adapters/openai-old.ipynb
+++ b/docs/docs/integrations/adapters/openai-old.ipynb
@@ -5,9 +5,7 @@
   "id": "700a516b",
   "metadata": {},
   "source": [
-    "# OpenAI Adapter(Old)\n",
-    "\n",
-    "**Please ensure OpenAI library is less than 1.0.0; otherwise, refer to the newer doc [OpenAI Adapter](./openai).**\n",
+    "# OpenAI Adapter\n",
    "\n",
    "A lot of people get started with OpenAI but want to explore other models. LangChain's integrations with many model providers make this easy to do so. While LangChain has it's own message and model APIs, we've also made it as easy as possible to explore other models by exposing an adapter to adapt LangChain models to the OpenAI api.\n",
    "\n",
@@ -51,6 +49,18 @@
    "Original OpenAI call"
   ]
  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "id": "e1d27dfa",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "result = openai.ChatCompletion.create(\n",
+    "    messages=messages, model=\"gpt-3.5-turbo\", temperature=0\n",
+    ")"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 15,
@@ -69,9 +79,6 @@
    }
   ],
   "source": [
-    "result = openai.ChatCompletion.create(\n",
-    "    messages=messages, model=\"gpt-3.5-turbo\", temperature=0\n",
-    ")\n",
    "result[\"choices\"][0][\"message\"].to_dict_recursive()"
   ]
  },
@@ -83,6 +90,18 @@
    "LangChain OpenAI wrapper call"
   ]
  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "id": "87c2d515",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "lc_result = lc_openai.ChatCompletion.create(\n",
+    "    messages=messages, model=\"gpt-3.5-turbo\", temperature=0\n",
+    ")"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 17,
@@ -101,9 +120,6 @@
    }
   ],
   "source": [
-    "lc_result = lc_openai.ChatCompletion.create(\n",
-    "    messages=messages, model=\"gpt-3.5-turbo\", temperature=0\n",
-    ")\n",
    "lc_result[\"choices\"][0][\"message\"]"
   ]
  },
@@ -115,6 +131,18 @@
    "Swapping out model providers"
   ]
  },
+  {
+   "cell_type": "code",
+   "execution_count": 18,
+   "id": "7a2c011c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "lc_result = lc_openai.ChatCompletion.create(\n",
+    "    messages=messages, model=\"claude-2\", temperature=0, provider=\"ChatAnthropic\"\n",
+    ")"
+   ]
+  },
  {
   "cell_type": "code",
   "execution_count": 19,
@@ -133,9 +161,6 @@
    }
   ],
   "source": [
-    "lc_result = lc_openai.ChatCompletion.create(\n",
-    "    messages=messages, model=\"claude-2\", temperature=0, provider=\"ChatAnthropic\"\n",
-    ")\n",
    "lc_result[\"choices\"][0][\"message\"]"
   ]
  },
@@ -277,7 +302,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.11.5"
+   "version": "3.10.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/guides/debugging.md
+++ b/docs/docs/guides/debugging.md
@@ -12,7 +12,7 @@ Platforms with tracing capabilities like [LangSmith](/docs/langsmith/) and [Wand

 For anyone building production-grade LLM applications, we highly recommend using a platform like this.

-![LangSmith run](../../static/img/run_details.png)
+![LangSmith run](/img/run_details.png)

 ## `set_debug` and `set_verbose`

--- a/docs/docs/guides/evaluation/string/json.ipynb
+++ b/docs/docs/guides/evaluation/string/json.ipynb
@@ -5,13 +5,13 @@
   "id": "465cfbef-5bba-4b3b-b02d-fe2eba39db17",
   "metadata": {},
   "source": [
-    "# JSON Evaluators\n",
+    "# Evaluating Structured Output: JSON Evaluators\n",
    "\n",
-    "Evaluating [extraction](https://python.langchain.com/docs/use_cases/extraction) and function calling applications often comes down to validation that the LLM's string output can be parsed correctly and how it compares to a reference object. The following `JSON` validators provide functionality to check your model's output consistently.\n",
+    "Evaluating [extraction](https://python.langchain.com/docs/use_cases/extraction) and function calling applications often comes down to validation that the LLM's string output can be parsed correctly and how it compares to a reference object. The following JSON validators provide functionality to check your model's output in a consistent way.\n",
    "\n",
    "## JsonValidityEvaluator\n",
    "\n",
-    "The `JsonValidityEvaluator` is designed to check the validity of a `JSON` string prediction.\n",
+    "The `JsonValidityEvaluator` is designed to check the validity of a JSON string prediction.\n",
    "\n",
    "### Overview:\n",
    "- **Requires Input?**: No\n",
@@ -377,7 +377,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.11.2"
  }
 },
 "nbformat": 4,
--- a/docs/docs/guides/evaluation/string/string_distance.ipynb
+++ b/docs/docs/guides/evaluation/string/string_distance.ipynb
@@ -8,12 +8,9 @@
    "# String Distance\n",
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/guides/evaluation/string/string_distance.ipynb)\n",
    "\n",
-    ">In information theory, linguistics, and computer science, the [Levenshtein distance (Wikipedia)](https://en.wikipedia.org/wiki/Levenshtein_distance) is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other. It is named after the Soviet mathematician Vladimir Levenshtein, who considered this distance in 1965.\n",
+    "One of the simplest ways to compare an LLM or chain's string output against a reference label is by using string distance measurements such as Levenshtein or postfix distance.  This can be used alongside approximate/fuzzy matching criteria for very basic unit testing.\n",
    "\n",
-    "\n",
-    "One of the simplest ways to compare an LLM or chain's string output against a reference label is by using string distance measurements such as `Levenshtein` or `postfix` distance.  This can be used alongside approximate/fuzzy matching criteria for very basic unit testing.\n",
-    "\n",
-    "This can be accessed using the `string_distance` evaluator, which uses distance metrics from the [rapidfuzz](https://github.com/maxbachmann/RapidFuzz) library.\n",
+    "This can be accessed using the `string_distance` evaluator, which uses distance metrics from the [rapidfuzz](https://github.com/maxbachmann/RapidFuzz) library.\n",
    "\n",
    "**Note:** The returned scores are _distances_, meaning lower is typically \"better\".\n",
    "\n",
@@ -216,9 +213,9 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.11.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
-}
+}
--- a/docs/docs/guides/fallbacks.ipynb
+++ b/docs/docs/guides/fallbacks.ipynb
@@ -28,7 +28,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 18,
   "id": "d3e893bf",
   "metadata": {},
   "outputs": [],
@@ -46,24 +46,19 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 21,
   "id": "dfdd8bf5",
   "metadata": {},
   "outputs": [],
   "source": [
    "from unittest.mock import patch\n",
    "\n",
-    "import httpx\n",
-    "from openai import RateLimitError\n",
-    "\n",
-    "request = httpx.Request(\"GET\", \"/\")\n",
-    "response = httpx.Response(200, request=request)\n",
-    "error = RateLimitError(\"rate limit\", response=response, body=\"\")"
+    "from openai.error import RateLimitError"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 24,
   "id": "e6fdffc1",
   "metadata": {},
   "outputs": [],
@@ -76,7 +71,7 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 27,
   "id": "584461ab",
   "metadata": {},
   "outputs": [
@@ -90,10 +85,10 @@
   ],
   "source": [
    "# Let's use just the OpenAI LLm first, to show that we run into an error\n",
-    "with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
+    "with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
    "    try:\n",
    "        print(openai_llm.invoke(\"Why did the chicken cross the road?\"))\n",
-    "    except RateLimitError:\n",
+    "    except:\n",
    "        print(\"Hit error\")"
   ]
  },
@@ -113,10 +108,10 @@
   ],
   "source": [
    "# Now let's try with fallbacks to Anthropic\n",
-    "with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
+    "with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
    "    try:\n",
    "        print(llm.invoke(\"Why did the chicken cross the road?\"))\n",
-    "    except RateLimitError:\n",
+    "    except:\n",
    "        print(\"Hit error\")"
   ]
  },
@@ -155,10 +150,10 @@
    "    ]\n",
    ")\n",
    "chain = prompt | llm\n",
-    "with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
+    "with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
    "    try:\n",
    "        print(chain.invoke({\"animal\": \"kangaroo\"}))\n",
-    "    except RateLimitError:\n",
+    "    except:\n",
    "        print(\"Hit error\")"
   ]
  },
@@ -436,7 +431,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.11.5"
+   "version": "3.10.12"
  }
 },
 "nbformat": 4,
--- a/docs/docs/guides/local_llms.ipynb
+++ b/docs/docs/guides/local_llms.ipynb
@@ -32,7 +32,7 @@
    "1. `Base model`: What is the base-model and how was it trained?\n",
    "2. `Fine-tuning approach`: Was the base-model fine-tuned and, if so, what [set of instructions](https://cameronrwolfe.substack.com/p/beyond-llama-the-power-of-open-llms#%C2%A7alpaca-an-instruction-following-llama-model) was used?\n",
    "\n",
-    "![Image description](../../static/img/OSS_LLM_overview.png)\n",
+    "![Image description](/img/OSS_LLM_overview.png)\n",
    "\n",
    "The relative performance of these models can be assessed using several leaderboards, including:\n",
    "\n",
@@ -55,7 +55,7 @@
    "\n",
    "In particular, see [this excellent post](https://finbarr.ca/how-is-llama-cpp-possible/) on the importance of quantization.\n",
    "\n",
-    "![Image description](../../static/img/llama-memory-weights.png)\n",
+    "![Image description](/img/llama-memory-weights.png)\n",
    "\n",
    "With less precision, we radically decrease the memory needed to store the LLM in memory.\n",
    "\n",
@@ -63,7 +63,7 @@
    "\n",
    "A Mac M2 Max is 5-6x faster than a M1 for inference due to the larger GPU memory bandwidth.\n",
    "\n",
-    "![Image description](../../static/img/llama_t_put.png)\n",
+    "![Image description](/img/llama_t_put.png)\n",
    "\n",
    "## Quickstart\n",
    "\n",
@@ -284,8 +284,6 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "from langchain.callbacks.manager import CallbackManager\n",
-    "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
    "from langchain.llms import LlamaCpp\n",
    "\n",
    "llm = LlamaCpp(\n",
--- a/docs/docs/guides/privacy/presidio_data_anonymization/index.ipynb
+++ b/docs/docs/guides/privacy/presidio_data_anonymization/index.ipynb
@@ -8,8 +8,6 @@
    "\n",
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/langchain-ai/langchain/blob/master/docs/docs/guides/privacy/presidio_data_anonymization/index.ipynb)\n",
    "\n",
-    ">[Presidio](https://microsoft.github.io/presidio/) (Origin from Latin praesidium ‘protection, garrison’) helps to ensure sensitive data is properly managed and governed. It provides fast identification and anonymization modules for private entities in text and images such as credit card numbers, names, locations, social security numbers, bitcoin wallets, US phone numbers, financial data and more.\n",
-    "\n",
    "## Use case\n",
    "\n",
    "Data anonymization is crucial before passing information to a language model like GPT-4 because it helps protect privacy and maintain confidentiality. If data is not anonymized, sensitive information such as names, addresses, contact numbers, or other identifiers linked to specific individuals could potentially be learned and misused. Hence, by obscuring or removing this personally identifiable information (PII), data can be used freely without compromising individuals' privacy rights or breaching data protection laws and regulations.\n",
@@ -532,7 +530,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.11.4"
  }
 },
 "nbformat": 4,
--- a/docs/docs/guides/privacy/presidio_data_anonymization/qa_privacy_protection.ipynb
+++ b/docs/docs/guides/privacy/presidio_data_anonymization/qa_privacy_protection.ipynb
@@ -60,7 +60,7 @@
    "\n",
    " Firstly, the wallet contains my credit card with number 4111 1111 1111 1111, which is registered under my name and linked to my bank account, PL61109010140000071219812874.\n",
    "\n",
-    " Additionally, the wallet had a driver's license - DL No: 999000680 issued to my name. It also houses my Social Security Number, 602-76-4532.\n",
+    " Additionally, the wallet had a driver's license - DL No: 999000680 issued to my name. It also houses my Social Security Number, 602-76-4532. \n",
    "\n",
    " What's more, I had my polish identity card there, with the number ABC123456.\n",
    "\n",
@@ -68,7 +68,7 @@
    "\n",
    " In case any information arises regarding my wallet, please reach out to me on my phone number, 999-888-7777, or through my personal email, johndoe@example.com.\n",
    "\n",
-    " Please consider this information to be highly confidential and respect my privacy.\n",
+    " Please consider this information to be highly confidential and respect my privacy. \n",
    "\n",
    " The bank has been informed about the stolen credit card and necessary actions have been taken from their end. They will be reachable at their official email, support@bankname.com.\n",
    " My representative there is Victoria Cherry (her business phone: 987-654-3210).\n",
@@ -667,11 +667,7 @@
    "from langchain.chat_models.openai import ChatOpenAI\n",
    "from langchain.prompts import ChatPromptTemplate\n",
    "from langchain.schema.output_parser import StrOutputParser\n",
-    "from langchain.schema.runnable import (\n",
-    "    RunnableLambda,\n",
-    "    RunnableParallel,\n",
-    "    RunnablePassthrough,\n",
-    ")\n",
+    "from langchain.schema.runnable import RunnableLambda, RunnableMap, RunnablePassthrough\n",
    "\n",
    "# 6. Create anonymizer chain\n",
    "template = \"\"\"Answer the question based only on the following context:\n",
@@ -684,7 +680,7 @@
    "model = ChatOpenAI(temperature=0.3)\n",
    "\n",
    "\n",
-    "_inputs = RunnableParallel(\n",
+    "_inputs = RunnableMap(\n",
    "    question=RunnablePassthrough(),\n",
    "    # It is important to remember about question anonymization\n",
    "    anonymized_question=RunnableLambda(anonymizer.anonymize),\n",
@@ -886,7 +882,7 @@
    "\n",
    "\n",
    "chain_with_deanonymization = (\n",
-    "    RunnableParallel({\"question\": RunnablePassthrough()})\n",
+    "    RunnableMap({\"question\": RunnablePassthrough()})\n",
    "    | {\n",
    "        \"context\": itemgetter(\"question\")\n",
    "        | retriever\n",
--- a/docs/docs/guides/safety/amazon_comprehend_chain.ipynb
+++ b/docs/docs/guides/safety/amazon_comprehend_chain.ipynb
@@ -7,9 +7,7 @@
   "source": [
    "# Amazon Comprehend Moderation Chain\n",
    "\n",
-    ">[Amazon Comprehend](https://aws.amazon.com/comprehend/) is a natural-language processing (NLP) service that uses machine learning to uncover valuable insights and connections in text.\n",
-    "\n",
-    "This notebook shows how to use `Amazon Comprehend` to detect and handle `Personally Identifiable Information` (`PII`) and toxicity.\n",
+    "This notebook shows how to use [Amazon Comprehend](https://aws.amazon.com/comprehend/) to detect and handle `Personally Identifiable Information` (`PII`) and toxicity.\n",
    "\n",
    "## Setting up"
   ]
@@ -1419,7 +1417,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/guides/safety/hugging_face_prompt_injection.ipynb
+++ b/docs/docs/guides/safety/hugging_face_prompt_injection.ipynb
@@ -8,7 +8,7 @@
    "# Hugging Face prompt injection identification\n",
    "\n",
    "This notebook shows how to prevent prompt injection attacks using the text classification model from `HuggingFace`.\n",
-    "By default it uses a *deberta* model trained to identify prompt injections. In this walkthrough we'll use https://huggingface.co/laiyer/deberta-v3-base-prompt-injection."
+    "It uses the *deberta* model trained to identify prompt injections: https://huggingface.co/deepset/deberta-v3-base-injection"
   ]
  },
  {
@@ -21,37 +21,19 @@
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
   "id": "aea25588-3c3f-4506-9094-221b3a0d519b",
   "metadata": {},
   "outputs": [
    {
     "data": {
-      "application/vnd.jupyter.widget-view+json": {
-       "model_id": "58ab3557623a495d8cc3c3e32a61938f",
-       "version_major": 2,
-       "version_minor": 0
-      },
      "text/plain": [
-       "Downloading config.json:   0%|          | 0.00/994 [00:00<?, ?B/s]"
+       "'hugging_face_injection_identifier'"
      ]
     },
+     "execution_count": 1,
     "metadata": {},
-     "output_type": "display_data"
-    },
-    {
-     "data": {
-      "application/vnd.jupyter.widget-view+json": {
-       "model_id": "3bf062f02d304ab5a485a2a228b4cf41",
-       "version_major": 2,
-       "version_minor": 0
-      },
-      "text/plain": [
-       "Downloading model.safetensors:   0%|          | 0.00/738M [00:00<?, ?B/s]"
-      ]
-     },
-     "metadata": {},
-     "output_type": "display_data"
+     "output_type": "execute_result"
    }
   ],
   "source": [
@@ -59,10 +41,7 @@
    "    HuggingFaceInjectionIdentifier,\n",
    ")\n",
    "\n",
-    "# Using https://huggingface.co/laiyer/deberta-v3-base-prompt-injection\n",
-    "injection_identifier = HuggingFaceInjectionIdentifier(\n",
-    "    model=\"laiyer/deberta-v3-base-prompt-injection\"\n",
-    ")\n",
+    "injection_identifier = HuggingFaceInjectionIdentifier()\n",
    "injection_identifier.name"
   ]
  },
@@ -320,9 +299,9 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "poetry-venv",
+   "display_name": "Python 3 (ipykernel)",
   "language": "python",
-   "name": "poetry-venv"
+   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
@@ -334,7 +313,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.9.1"
+   "version": "3.10.12"
  }
 },
 "nbformat": 4,
--- a/docs/docs/integrations/adapters/openai.ipynb
+++ b/docs/docs/integrations/adapters/openai.ipynb
@@ -1,318 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "id": "700a516b",
-   "metadata": {},
-   "source": [
-    "# OpenAI Adapter\n",
-    "\n",
-    "**Please ensure OpenAI library is version 1.0.0 or higher; otherwise, refer to the older doc [OpenAI Adapter(Old)](./openai-old).**\n",
-    "\n",
-    "A lot of people get started with OpenAI but want to explore other models. LangChain's integrations with many model providers make this easy to do so. While LangChain has it's own message and model APIs, we've also made it as easy as possible to explore other models by exposing an adapter to adapt LangChain models to the OpenAI api.\n",
-    "\n",
-    "At the moment this only deals with output and does not return other information (token counts, stop reasons, etc)."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "id": "6017f26a",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import openai\n",
-    "from langchain.adapters import openai as lc_openai"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "b522ceda",
-   "metadata": {},
-   "source": [
-    "## chat.completions.create"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "id": "1d22eb61",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "messages = [{\"role\": \"user\", \"content\": \"hi\"}]"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "d550d3ad",
-   "metadata": {},
-   "source": [
-    "Original OpenAI call"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "id": "012d81ae",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "{'content': 'Hello! How can I assist you today?',\n",
-       " 'role': 'assistant',\n",
-       " 'function_call': None,\n",
-       " 'tool_calls': None}"
-      ]
-     },
-     "execution_count": 3,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "result = openai.chat.completions.create(\n",
-    "    messages=messages, model=\"gpt-3.5-turbo\", temperature=0\n",
-    ")\n",
-    "result.choices[0].message.model_dump()"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "db5b5500",
-   "metadata": {},
-   "source": [
-    "LangChain OpenAI wrapper call"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "id": "c67a5ac8",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "{'role': 'assistant', 'content': 'Hello! How can I help you today?'}"
-      ]
-     },
-     "execution_count": 4,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "lc_result = lc_openai.chat.completions.create(\n",
-    "    messages=messages, model=\"gpt-3.5-turbo\", temperature=0\n",
-    ")\n",
-    "\n",
-    "lc_result.choices[0].message  # Attribute access"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "id": "37a6e461-8608-47f6-ac45-12ad753c062a",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "{'role': 'assistant', 'content': 'Hello! How can I help you today?'}"
-      ]
-     },
-     "execution_count": 5,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "lc_result[\"choices\"][0][\"message\"]  # Also compatible with index access"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "034ba845",
-   "metadata": {},
-   "source": [
-    "Swapping out model providers"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "id": "f7c94827",
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "{'role': 'assistant', 'content': 'Hello! How can I assist you today?'}"
-      ]
-     },
-     "execution_count": 6,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "lc_result = lc_openai.chat.completions.create(\n",
-    "    messages=messages, model=\"claude-2\", temperature=0, provider=\"ChatAnthropic\"\n",
-    ")\n",
-    "lc_result.choices[0].message"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "cb3f181d",
-   "metadata": {},
-   "source": [
-    "## chat.completions.stream"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "f7b8cd18",
-   "metadata": {},
-   "source": [
-    "Original OpenAI call"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "id": "fd8cb1ea",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "{'content': '', 'function_call': None, 'role': 'assistant', 'tool_calls': None}\n",
-      "{'content': 'Hello', 'function_call': None, 'role': None, 'tool_calls': None}\n",
-      "{'content': '!', 'function_call': None, 'role': None, 'tool_calls': None}\n",
-      "{'content': ' How', 'function_call': None, 'role': None, 'tool_calls': None}\n",
-      "{'content': ' can', 'function_call': None, 'role': None, 'tool_calls': None}\n",
-      "{'content': ' I', 'function_call': None, 'role': None, 'tool_calls': None}\n",
-      "{'content': ' assist', 'function_call': None, 'role': None, 'tool_calls': None}\n",
-      "{'content': ' you', 'function_call': None, 'role': None, 'tool_calls': None}\n",
-      "{'content': ' today', 'function_call': None, 'role': None, 'tool_calls': None}\n",
-      "{'content': '?', 'function_call': None, 'role': None, 'tool_calls': None}\n",
-      "{'content': None, 'function_call': None, 'role': None, 'tool_calls': None}\n"
-     ]
-    }
-   ],
-   "source": [
-    "for c in openai.chat.completions.create(\n",
-    "    messages=messages, model=\"gpt-3.5-turbo\", temperature=0, stream=True\n",
-    "):\n",
-    "    print(c.choices[0].delta.model_dump())"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "0b2a076b",
-   "metadata": {},
-   "source": [
-    "LangChain OpenAI wrapper call"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "id": "9521218c",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "{'role': 'assistant', 'content': ''}\n",
-      "{'content': 'Hello'}\n",
-      "{'content': '!'}\n",
-      "{'content': ' How'}\n",
-      "{'content': ' can'}\n",
-      "{'content': ' I'}\n",
-      "{'content': ' assist'}\n",
-      "{'content': ' you'}\n",
-      "{'content': ' today'}\n",
-      "{'content': '?'}\n",
-      "{}\n"
-     ]
-    }
-   ],
-   "source": [
-    "for c in lc_openai.chat.completions.create(\n",
-    "    messages=messages, model=\"gpt-3.5-turbo\", temperature=0, stream=True\n",
-    "):\n",
-    "    print(c.choices[0].delta)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "0fc39750",
-   "metadata": {},
-   "source": [
-    "Swapping out model providers"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "id": "68f0214e",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "{'role': 'assistant', 'content': ''}\n",
-      "{'content': 'Hello'}\n",
-      "{'content': '!'}\n",
-      "{'content': ' How'}\n",
-      "{'content': ' can'}\n",
-      "{'content': ' I'}\n",
-      "{'content': ' assist'}\n",
-      "{'content': ' you'}\n",
-      "{'content': ' today'}\n",
-      "{'content': '?'}\n",
-      "{}\n"
-     ]
-    }
-   ],
-   "source": [
-    "for c in lc_openai.chat.completions.create(\n",
-    "    messages=messages,\n",
-    "    model=\"claude-2\",\n",
-    "    temperature=0,\n",
-    "    stream=True,\n",
-    "    provider=\"ChatAnthropic\",\n",
-    "):\n",
-    "    print(c[\"choices\"][0][\"delta\"])"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.5"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
--- a/docs/docs/integrations/callbacks/argilla.ipynb
+++ b/docs/docs/integrations/callbacks/argilla.ipynb
@@ -7,6 +7,8 @@
   "source": [
    "# Argilla\n",
    "\n",
+    "![Argilla - Open-source data platform for LLMs](https://argilla.io/og.png)\n",
+    "\n",
    ">[Argilla](https://argilla.io/) is an open-source data curation platform for LLMs.\n",
    "> Using Argilla, everyone can build robust language models through faster data curation \n",
    "> using both human and machine feedback. We provide support for each step in the MLOps cycle, \n",
@@ -408,7 +410,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.11.3"
  },
  "vscode": {
   "interpreter": {
--- a/docs/docs/integrations/callbacks/context.ipynb
+++ b/docs/docs/integrations/callbacks/context.ipynb
@@ -7,9 +7,12 @@
   "source": [
    "# Context\n",
    "\n",
-    ">[Context](https://context.ai/) provides user analytics for LLM-powered products and features.\n",
+    "![Context - User Analytics for LLM Powered Products](https://with.context.ai/langchain.png)\n",
    "\n",
-    "With `Context`, you can start understanding your users and improving their experiences in less than 30 minutes.\n"
+    "[Context](https://context.ai/) provides user analytics for LLM powered products and features.\n",
+    "\n",
+    "With Context, you can start understanding your users and improving their experiences in less than 30 minutes.\n",
+    "\n"
   ]
  },
  {
@@ -86,9 +89,11 @@
   "metadata": {},
   "source": [
    "## Usage\n",
-    "### Context callback within a chat model\n",
+    "### Using the Context callback within a chat model\n",
    "\n",
-    "The Context callback handler can be used to directly record transcripts between users and AI assistants."
+    "The Context callback handler can be used to directly record transcripts between users and AI assistants.\n",
+    "\n",
+    "#### Example"
   ]
  },
  {
@@ -127,7 +132,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "### Context callback within Chains\n",
+    "### Using the Context callback within Chains\n",
    "\n",
    "The Context callback handler can also be used to record the inputs and outputs of chains. Note that intermediate steps of the chain are not recorded - only the starting inputs and final outputs.\n",
    "\n",
@@ -144,7 +149,9 @@
    ">handler = ContextCallbackHandler(token)\n",
    ">chat = ChatOpenAI(temperature=0.9, callbacks=[callback])\n",
    ">chain = LLMChain(llm=chat, prompt=chat_prompt_template, callbacks=[callback])\n",
-    ">```\n"
+    ">```\n",
+    "\n",
+    "#### Example"
   ]
  },
  {
@@ -196,7 +203,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.9.1"
  },
  "vscode": {
   "interpreter": {
--- a/docs/docs/integrations/callbacks/infino.ipynb
+++ b/docs/docs/integrations/callbacks/infino.ipynb
@@ -7,14 +7,12 @@
   "source": [
    "# Infino\n",
    "\n",
-    ">[Infino](https://github.com/infinohq/infino) is a scalable telemetry store designed for logs, metrics, and traces. Infino can function as a standalone observability solution or as the storage layer in your observability stack.\n",
-    "\n",
    "This example shows how one can track the following while calling OpenAI and ChatOpenAI models via `LangChain` and [Infino](https://github.com/infinohq/infino):\n",
    "\n",
-    "* prompt input\n",
-    "* response from `ChatGPT` or any other `LangChain` model\n",
-    "* latency\n",
-    "* errors\n",
+    "* prompt input,\n",
+    "* response from `ChatGPT` or any other `LangChain` model,\n",
+    "* latency,\n",
+    "* errors,\n",
    "* number of tokens consumed"
   ]
  },
@@ -456,7 +454,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/integrations/callbacks/labelstudio.ipynb
+++ b/docs/docs/integrations/callbacks/labelstudio.ipynb
@@ -4,9 +4,6 @@
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
-    "jupyter": {
-     "outputs_hidden": true
-    },
    "pycharm": {
     "name": "#%% md\n"
    }
@@ -14,14 +11,17 @@
   "source": [
    "# Label Studio\n",
    "\n",
+    "<div>\n",
+    "<img src=\"https://labelstudio-pub.s3.amazonaws.com/lc/open-source-data-labeling-platform.png\" width=\"400\"/>\n",
+    "</div>\n",
    "\n",
-    ">[Label Studio](https://labelstud.io/guide/get_started) is an open-source data labeling platform that provides LangChain with flexibility when it comes to labeling data for fine-tuning large language models (LLMs). It also enables the preparation of custom training data and the collection and evaluation of responses through human feedback.\n",
+    "Label Studio is an open-source data labeling platform that provides LangChain with flexibility when it comes to labeling data for fine-tuning large language models (LLMs). It also enables the preparation of custom training data and the collection and evaluation of responses through human feedback.\n",
    "\n",
-    "In this guide, you will learn how to connect a LangChain pipeline to `Label Studio` to:\n",
+    "In this guide, you will learn how to connect a LangChain pipeline to Label Studio to:\n",
    "\n",
-    "- Aggregate all input prompts, conversations, and responses in a single `Label Studio` project. This consolidates all the data in one place for easier labeling and analysis.\n",
+    "- Aggregate all input prompts, conversations, and responses in a single Label Studio project. This consolidates all the data in one place for easier labeling and analysis.\n",
    "- Refine prompts and responses to create a dataset for supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) scenarios. The labeled data can be used to further train the LLM to improve its performance.\n",
-    "- Evaluate model responses through human feedback. `Label Studio` provides an interface for humans to review and provide feedback on model responses, allowing evaluation and iteration."
+    "- Evaluate model responses through human feedback. Label Studio provides an interface for humans to review and provide feedback on model responses, allowing evaluation and iteration."
   ]
  },
  {
@@ -362,9 +362,9 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "labelops",
   "language": "python",
-   "name": "python3"
+   "name": "labelops"
  },
  "language_info": {
   "codemirror_mode": {
@@ -376,9 +376,9 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.9.16"
  }
 },
 "nbformat": 4,
- "nbformat_minor": 4
+ "nbformat_minor": 1
 }
--- a/docs/docs/integrations/callbacks/llmonitor.md
+++ b/docs/docs/integrations/callbacks/llmonitor.md
@@ -1,6 +1,6 @@
 # LLMonitor

->[LLMonitor](https://llmonitor.com?utm_source=langchain&utm_medium=py&utm_campaign=docs) is an open-source observability platform that provides cost and usage analytics, user tracking, tracing and evaluation tools.
+[LLMonitor](https://llmonitor.com?utm_source=langchain&utm_medium=py&utm_campaign=docs) is an open-source observability platform that provides cost and usage analytics, user tracking, tracing and evaluation tools.

 <video controls width='100%' >
  <source src='https://llmonitor.com/videos/demo-annotated.mp4'/>
--- a/docs/docs/integrations/callbacks/promptlayer.ipynb
+++ b/docs/docs/integrations/callbacks/promptlayer.ipynb
@@ -7,13 +7,13 @@
   "source": [
    "# PromptLayer\n",
    "\n",
-    ">[PromptLayer](https://docs.promptlayer.com/introduction) is a platform for prompt engineering. It also helps with the LLM observability to visualize requests, version prompts, and track usage.\n",
-    ">\n",
-    ">While `PromptLayer` does have LLMs that integrate directly with LangChain (e.g. [`PromptLayerOpenAI`](https://python.langchain.com/docs/integrations/llms/promptlayer_openai)), using a callback is the recommended way to integrate `PromptLayer` with LangChain.\n",
+    "![PromptLayer](https://promptlayer.com/text_logo.png)\n",
    "\n",
-    "In this guide, we will go over how to setup the `PromptLayerCallbackHandler`. \n",
+    "[PromptLayer](https://promptlayer.com) is an LLM observability platform that lets you visualize requests, version prompts, and track usage. In this guide, we will go over how to set up the `PromptLayerCallbackHandler`.\n",
    "\n",
-    "See [PromptLayer docs](https://docs.promptlayer.com/languages/langchain) for more information."
+    "While PromptLayer does have LLMs that integrate directly with LangChain (e.g. [`PromptLayerOpenAI`](https://python.langchain.com/docs/integrations/llms/promptlayer_openai)), this callback is the recommended way to integrate PromptLayer with LangChain.\n",
+    "\n",
+    "See the [PromptLayer docs](https://docs.promptlayer.com/languages/langchain) for more information."
   ]
  },
  {
@@ -51,7 +51,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## Usage\n",
+    "### Usage\n",
    "\n",
    "Getting started with `PromptLayerCallbackHandler` is fairly simple, it takes two optional arguments:\n",
    "1. `pl_tags` - an optional list of strings that will be tracked as tags on PromptLayer.\n",
@@ -63,7 +63,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## Simple OpenAI Example\n",
+    "### Simple OpenAI Example\n",
    "\n",
    "In this simple example we use `PromptLayerCallbackHandler` with `ChatOpenAI`. We add a PromptLayer tag named `chatopenai`"
   ]
@@ -99,7 +99,7 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## GPT4All Example"
+    "### GPT4All Example"
   ]
  },
  {
@@ -125,9 +125,9 @@
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "## Full Featured Example\n",
+    "### Full Featured Example\n",
    "\n",
-    "In this example, we unlock more of the power of `PromptLayer`.\n",
+    "In this example we unlock more of the power of PromptLayer.\n",
    "\n",
    "PromptLayer allows you to visually create, version, and track prompt templates. Using the [Prompt Registry](https://docs.promptlayer.com/features/prompt-registry), we can programmatically fetch the prompt template called `example`.\n",
    "\n",
@@ -182,7 +182,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
@@ -196,7 +196,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.8.8 (default, Apr 13 2021, 12:59:45) \n[Clang 10.0.0 ]"
  },
  "vscode": {
   "interpreter": {
--- a/docs/docs/integrations/callbacks/sagemaker_tracking.ipynb
+++ b/docs/docs/integrations/callbacks/sagemaker_tracking.ipynb
@@ -7,15 +7,14 @@
   "source": [
    "# SageMaker Tracking\n",
    "\n",
-    ">[Amazon SageMaker](https://aws.amazon.com/sagemaker/) is a fully managed service that is used to quickly and easily build, train and deploy machine learning (ML) models. \n",
-    "\n",
-    ">[Amazon SageMaker Experiments](https://docs.aws.amazon.com/sagemaker/latest/dg/experiments.html) is a capability of `Amazon SageMaker` that lets you organize, track, compare and evaluate ML experiments and model versions.\n",
-    "\n",
-    "This notebook shows how LangChain Callback can be used to log and track prompts and other LLM hyperparameters into `SageMaker Experiments`. Here, we use different scenarios to showcase the capability:\n",
+    "This notebook shows how LangChain Callback can be used to log and track prompts and other LLM hyperparameters into SageMaker Experiments. Here, we use different scenarios to showcase the capability:\n",
    "* **Scenario 1**: *Single LLM* - A case where a single LLM model is used to generate output based on a given prompt.\n",
    "* **Scenario 2**: *Sequential Chain* - A case where a sequential chain of two LLM models is used.\n",
    "* **Scenario 3**: *Agent with Tools (Chain of Thought)* - A case where multiple tools (search and math) are used in addition to an LLM.\n",
    "\n",
+    "[Amazon SageMaker](https://aws.amazon.com/sagemaker/) is a fully managed service that is used to quickly and easily build, train and deploy machine learning (ML) models. \n",
+    "\n",
+    "[Amazon SageMaker Experiments](https://docs.aws.amazon.com/sagemaker/latest/dg/experiments.html) is a capability of Amazon SageMaker that lets you organize, track, compare and evaluate ML experiments and model versions.\n",
    "\n",
    "In this notebook, we will create a single experiment to log the prompts from each scenario."
   ]
@@ -900,9 +899,9 @@
  ],
  "instance_type": "ml.t3.large",
  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "conda_pytorch_p310",
   "language": "python",
-   "name": "python3"
+   "name": "conda_pytorch_p310"
  },
  "language_info": {
   "codemirror_mode": {
@@ -914,7 +913,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.10.10"
  }
 },
 "nbformat": 4,
--- a/docs/docs/integrations/callbacks/trubrics.ipynb
+++ b/docs/docs/integrations/callbacks/trubrics.ipynb
@@ -9,13 +9,12 @@
   "source": [
    "# Trubrics\n",
    "\n",
+    "![Trubrics](https://miro.medium.com/v2/resize:fit:720/format:webp/1*AhYbKO-v8F4u3hx2aDIqKg.png)\n",
    "\n",
-    ">[Trubrics](https://trubrics.com) is an LLM user analytics platform that lets you collect, analyse and manage user\n",
-    "prompts & feedback on AI models.\n",
-    ">\n",
-    ">Check out [Trubrics repo](https://github.com/trubrics/trubrics-sdk) for more information on `Trubrics`.\n",
+    "[Trubrics](https://trubrics.com) is an LLM user analytics platform that lets you collect, analyse and manage user\n",
+    "prompts & feedback on AI models. In this guide, we will go over how to set up the `TrubricsCallbackHandler`.\n",
    "\n",
-    "In this guide, we will go over how to set up the `TrubricsCallbackHandler`. \n"
+    "Check out the [Trubrics repo](https://github.com/trubrics/trubrics-sdk) for more information on Trubrics."
   ]
  },
  {
@@ -348,9 +347,9 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "langchain",
   "language": "python",
-   "name": "python3"
+   "name": "langchain"
  },
  "language_info": {
   "codemirror_mode": {
@@ -362,7 +361,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.11.4"
  }
 },
 "nbformat": 4,
--- a/docs/docs/integrations/chat/anthropic.ipynb
+++ b/docs/docs/integrations/chat/anthropic.ipynb
@@ -1,21 +1,11 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "id": "a016701c",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Anthropic\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "id": "bf733a38-db84-4363-89e2-de6735c37230",
   "metadata": {},
   "source": [
-    "# ChatAnthropic\n",
+    "# Anthropic\n",
    "\n",
    "This notebook covers how to get started with Anthropic chat models."
   ]
--- a/docs/docs/integrations/chat/anyscale.ipynb
+++ b/docs/docs/integrations/chat/anyscale.ipynb
@@ -1,22 +1,12 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "id": "31895fc4",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Anyscale\n",
-    "---"
-   ]
-  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "642fd21c-600a-47a1-be96-6e1438b421a9",
   "metadata": {},
   "source": [
-    "# ChatAnyscale\n",
+    "# Anyscale\n",
    "\n",
    "This notebook demonstrates the use of `langchain.chat_models.ChatAnyscale` for [Anyscale Endpoints](https://endpoints.anyscale.com/).\n",
    "\n",
@@ -43,7 +33,7 @@
   "metadata": {},
   "outputs": [
    {
-     "name": "stdout",
+     "name": "stdin",
     "output_type": "stream",
     "text": [
      " ········\n"
--- a/docs/docs/integrations/chat/azure_chat_openai.ipynb
+++ b/docs/docs/integrations/chat/azure_chat_openai.ipynb
@@ -1,25 +1,13 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "id": "641f8cb0",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Azure OpenAI\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "id": "38f26d7a",
   "metadata": {},
   "source": [
-    "# AzureChatOpenAI\n",
+    "# Azure OpenAI\n",
    "\n",
-    ">[Azure OpenAI Service](https://learn.microsoft.com/en-us/azure/ai-services/openai/overview) provides REST API access to OpenAI's powerful language models including the GPT-4, GPT-3.5-Turbo, and Embeddings model series. These models can be easily adapted to your specific task including but not limited to content generation, summarization, semantic search, and natural language to code translation. Users can access the service through REST APIs, Python SDK, or a web-based interface in the Azure OpenAI Studio.\n",
-    "\n",
-    "This notebook goes over how to connect to an Azure-hosted OpenAI endpoint. We recommend having version `openai>=1` installed."
+    "This notebook goes over how to connect to an Azure-hosted OpenAI endpoint. We recommend having version `openai>=1` installed."
   ]
  },
  {
@@ -174,7 +162,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/integrations/chat/azureml_chat_endpoint.ipynb
+++ b/docs/docs/integrations/chat/azureml_chat_endpoint.ipynb
@@ -1,25 +1,14 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Azure ML Endpoint\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# AzureMLChatOnlineEndpoint\n",
+    "# AzureML Chat Online Endpoint\n",
    "\n",
-    ">[Azure Machine Learning](https://azure.microsoft.com/en-us/products/machine-learning/) is a platform used to build, train, and deploy machine learning models. Users can explore the types of models to deploy in the Model Catalog, which provides Azure Foundation Models and OpenAI Models. `Azure Foundation Models` include various open-source models and popular Hugging Face models. Users can also import models of their liking into AzureML.\n",
-    ">\n",
-    ">[Azure Machine Learning Online Endpoints](https://learn.microsoft.com/en-us/azure/machine-learning/concept-endpoints). After you train machine learning models or pipelines, you need to deploy them to production so that others can use them for inference. Inference is the process of applying new input data to the machine learning model or pipeline to generate outputs. While these outputs are typically referred to as \"predictions,\" inferencing can be used to generate outputs for other machine learning tasks, such as classification and clustering. In `Azure Machine Learning`, you perform inferencing by using endpoints and deployments. `Endpoints` and `Deployments` allow you to decouple the interface of your production workload from the implementation that serves it.\n",
+    "[AzureML](https://azure.microsoft.com/en-us/products/machine-learning/) is a platform used to build, train, and deploy machine learning models. Users can explore the types of models to deploy in the Model Catalog, which provides Azure Foundation Models and OpenAI Models. Azure Foundation Models include various open-source models and popular Hugging Face models. Users can also import models of their liking into AzureML.\n",
    "\n",
-    "This notebook goes over how to use a chat model hosted on an `Azure Machine Learning Endpoint`."
+    "This notebook goes over how to use a chat model hosted on an `AzureML online endpoint`."
   ]
  },
  {
@@ -102,7 +91,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.9.1"
  }
 },
 "nbformat": 4,
--- a/docs/docs/integrations/chat/baichuan.ipynb
+++ b/docs/docs/integrations/chat/baichuan.ipynb
@@ -1,19 +1,10 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Baichuan Chat\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# ChatBaichuan\n",
+    "# Baichuan Chat\n",
    "\n",
    "Baichuan chat models API by Baichuan Intelligent Technology. For more information, see [https://platform.baichuan-ai.com/docs/api](https://platform.baichuan-ai.com/docs/api)"
   ]
@@ -72,9 +63,7 @@
   "outputs": [
    {
     "data": {
-      "text/plain": [
-       "AIMessage(content='首先，我们需要确定闰年的二月有多少天。闰年的二月有29天。\\n\\n然后，我们可以计算你的月薪：\\n\\n日薪 = 月薪 / (当月天数)\\n\\n所以，你的月薪 = 日薪 * 当月天数\\n\\n将数值代入公式：\\n\\n月薪 = 8元/天 * 29天 = 232元\\n\\n因此，你在闰年的二月的月薪是232元。')"
-      ]
+      "text/plain": "AIMessage(content='首先，我们需要确定闰年的二月有多少天。闰年的二月有29天。\\n\\n然后，我们可以计算你的月薪：\\n\\n日薪 = 月薪 / (当月天数)\\n\\n所以，你的月薪 = 日薪 * 当月天数\\n\\n将数值代入公式：\\n\\n月薪 = 8元/天 * 29天 = 232元\\n\\n因此，你在闰年的二月的月薪是232元。')"
     },
     "execution_count": 3,
     "metadata": {},
@@ -87,23 +76,16 @@
  },
  {
   "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
   "source": [
    "## For ChatBaichuan with Streaming"
-   ]
+   ],
+   "metadata": {
+    "collapsed": false
+   }
  },
  {
   "cell_type": "code",
   "execution_count": 5,
-   "metadata": {
-    "ExecuteTime": {
-     "end_time": "2023-10-17T15:14:25.870044Z",
-     "start_time": "2023-10-17T15:14:25.863381Z"
-    },
-    "collapsed": false
-   },
   "outputs": [],
   "source": [
    "chat = ChatBaichuan(\n",
@@ -111,24 +93,22 @@
    "    baichuan_secret_key=\"YOUR_SECRET_KEY\",\n",
    "    streaming=True,\n",
    ")"
-   ]
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2023-10-17T15:14:25.870044Z",
+     "start_time": "2023-10-17T15:14:25.863381Z"
+    }
+   }
  },
  {
   "cell_type": "code",
   "execution_count": 6,
-   "metadata": {
-    "ExecuteTime": {
-     "end_time": "2023-10-17T15:14:27.153546Z",
-     "start_time": "2023-10-17T15:14:25.868470Z"
-    },
-    "collapsed": false
-   },
   "outputs": [
    {
     "data": {
-      "text/plain": [
-       "AIMessageChunk(content='首先，我们需要确定闰年的二月有多少天。闰年的二月有29天。\\n\\n然后，我们可以计算你的月薪：\\n\\n日薪 = 月薪 / (当月天数)\\n\\n所以，你的月薪 = 日薪 * 当月天数\\n\\n将数值代入公式：\\n\\n月薪 = 8元/天 * 29天 = 232元\\n\\n因此，你在闰年的二月的月薪是232元。')"
-      ]
+      "text/plain": "AIMessageChunk(content='首先，我们需要确定闰年的二月有多少天。闰年的二月有29天。\\n\\n然后，我们可以计算你的月薪：\\n\\n日薪 = 月薪 / (当月天数)\\n\\n所以，你的月薪 = 日薪 * 当月天数\\n\\n将数值代入公式：\\n\\n月薪 = 8元/天 * 29天 = 232元\\n\\n因此，你在闰年的二月的月薪是232元。')"
     },
     "execution_count": 6,
     "metadata": {},
@@ -137,7 +117,14 @@
   ],
   "source": [
    "chat([HumanMessage(content=\"我日薪8块钱，请问在闰年的二月，我月薪多少\")])"
-   ]
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2023-10-17T15:14:27.153546Z",
+     "start_time": "2023-10-17T15:14:25.868470Z"
+    }
+   }
  }
 ],
 "metadata": {
--- a/docs/docs/integrations/chat/baidu_qianfan_endpoint.ipynb
+++ b/docs/docs/integrations/chat/baidu_qianfan_endpoint.ipynb
@@ -1,20 +1,11 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Baidu Qianfan\n",
-    "---"
-   ]
-  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# QianfanChatEndpoint\n",
+    "# Baidu Qianfan\n",
    "\n",
    "Baidu AI Cloud Qianfan Platform is a one-stop large model development and service operation platform for enterprise developers. Qianfan not only provides including the model of Wenxin Yiyan (ERNIE-Bot) and the third-party open-source models, but also provides various AI development tools and the whole set of development environment, which facilitates customers to use and develop large model applications easily.\n",
    "\n",
--- a/docs/docs/integrations/chat/bedrock.ipynb
+++ b/docs/docs/integrations/chat/bedrock.ipynb
@@ -1,31 +1,13 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "id": "fbc66410",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Bedrock Chat\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "id": "bf733a38-db84-4363-89e2-de6735c37230",
   "metadata": {},
   "source": [
-    "# BedrockChat\n",
+    "# Bedrock Chat\n",
    "\n",
-    ">[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that offers a choice of \n",
-    "> high-performing foundation models (FMs) from leading AI companies like `AI21 Labs`, `Anthropic`, `Cohere`, \n",
-    "> `Meta`, `Stability AI`, and `Amazon` via a single API, along with a broad set of capabilities you need to \n",
-    "> build generative AI applications with security, privacy, and responsible AI. Using `Amazon Bedrock`, \n",
-    "> you can easily experiment with and evaluate top FMs for your use case, privately customize them with \n",
-    "> your data using techniques such as fine-tuning and `Retrieval Augmented Generation` (`RAG`), and build \n",
-    "> agents that execute tasks using your enterprise systems and data sources. Since `Amazon Bedrock` is \n",
-    "> serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy \n",
-    "> generative AI capabilities into your applications using the AWS services you are already familiar with.\n"
+    "[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that makes foundation models (FMs) from leading AI startups and Amazon available via an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case."
   ]
  },
  {
@@ -149,7 +131,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.10.9"
  }
 },
 "nbformat": 4,
--- a/docs/docs/integrations/chat/cohere.ipynb
+++ b/docs/docs/integrations/chat/cohere.ipynb
@@ -1,21 +1,11 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "id": "53fbf15f",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Cohere\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "id": "bf733a38-db84-4363-89e2-de6735c37230",
   "metadata": {},
   "source": [
-    "# ChatCohere\n",
+    "# Cohere\n",
    "\n",
    "This notebook covers how to get started with Cohere chat models."
   ]
--- a/docs/docs/integrations/chat/ernie.ipynb
+++ b/docs/docs/integrations/chat/ernie.ipynb
@@ -1,34 +1,13 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Ernie Bot Chat\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# ErnieBotChat\n",
+    "# ERNIE-Bot Chat\n",
    "\n",
    "[ERNIE-Bot](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/jlil56u11) is a large language model developed by Baidu, covering a huge amount of Chinese data.\n",
-    "This notebook covers how to get started with ErnieBot chat models.\n",
-    "\n",
-    "**Note:** We recommend users using this class to switch to [Baidu Qianfan](./baidu_qianfan_endpoint). they are 3 why we recommend users to use `QianfanChatEndpoint`:\n",
-    "1. `QianfanChatEndpoint` support more LLM in the Qianfan platform.\n",
-    "2. `QianfanChatEndpoint` support streaming mode.\n",
-    "3. `QianfanChatEndpoint` support function calling usgage.\n",
-    "\n",
-    "Some tips for migration:\n",
-    "- change `ernie_client_id` to `qianfan_ak`, also change `ernie_client_secret` to `qianfan_sk`.\n",
-    "- install `qianfan` package. \n",
-    "    ```\n",
-    "    pip install qianfan\n",
-    "    ```"
+    "This notebook covers how to get started with ErnieBot chat models."
   ]
  },
  {
--- a/docs/docs/integrations/chat/everlyai.ipynb
+++ b/docs/docs/integrations/chat/everlyai.ipynb
@@ -1,21 +1,11 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "id": "5e45f35c",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: EverlyAI\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "id": "642fd21c-600a-47a1-be96-6e1438b421a9",
   "metadata": {},
   "source": [
-    "# ChatEverlyAI\n",
+    "# EverlyAI\n",
    "\n",
    ">[EverlyAI](https://everlyai.xyz) allows you to run your ML models at scale in the cloud. It also provides API access to [several LLM models](https://everlyai.xyz).\n",
    "\n",
--- a/docs/docs/integrations/chat/fireworks.ipynb
+++ b/docs/docs/integrations/chat/fireworks.ipynb
@@ -1,22 +1,12 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "id": "529aeba9",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Fireworks\n",
-    "---"
-   ]
-  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "642fd21c-600a-47a1-be96-6e1438b421a9",
   "metadata": {},
   "source": [
-    "# ChatFireworks\n",
+    "# Fireworks\n",
    "\n",
    ">[Fireworks](https://app.fireworks.ai/) accelerates product development on generative AI by creating an innovative AI experiment and production platform. \n",
    "\n",
--- a/docs/docs/integrations/chat/google_vertex_ai_palm.ipynb
+++ b/docs/docs/integrations/chat/google_vertex_ai_palm.ipynb
@@ -1,20 +1,11 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Google Cloud Vertex AI\n",
-    "---"
-   ]
-  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# ChatVertexAI\n",
+    "# Google Cloud Vertex AI\n",
    "\n",
    "Note: This is separate from the Google PaLM integration. Google has chosen to offer an enterprise version of PaLM through GCP, and this supports the models made available through there. \n",
    "\n",
@@ -34,18 +25,18 @@
  },
  {
   "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 1,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
-    "!pip install -U google-cloud-aiplatform"
+    "#!pip install langchain google-cloud-aiplatform"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
@@ -55,29 +46,43 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "chat = ChatVertexAI()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 34,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "system = \"You are a helpful assistant who translates English to French\"\n",
+    "human = \"Translate this sentence from English to French. I love programming.\"\n",
+    "prompt = ChatPromptTemplate.from_messages([(\"system\", system), (\"human\", human)])\n",
+    "messages = prompt.format_messages()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "AIMessage(content=\" J'aime la programmation.\")"
+       "AIMessage(content=\" J'aime la programmation.\", additional_kwargs={}, example=False)"
      ]
     },
-     "execution_count": 2,
+     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "system = \"You are a helpful assistant who translate English to French\"\n",
-    "human = \"Translate this sentence from English to French. I love programming.\"\n",
-    "prompt = ChatPromptTemplate.from_messages([(\"system\", system), (\"human\", human)])\n",
-    "\n",
-    "chat = ChatVertexAI()\n",
-    "\n",
-    "chain = prompt | chat\n",
-    "chain.invoke({})"
+    "chat(messages)"
   ]
  },
  {
@@ -89,29 +94,35 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 12,
   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "AIMessage(content=' プログラミングが大好きです')"
-      ]
-     },
-     "execution_count": 3,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
+   "outputs": [],
   "source": [
    "system = (\n",
    "    \"You are a helpful assistant that translates {input_language} to {output_language}.\"\n",
    ")\n",
    "human = \"{text}\"\n",
-    "prompt = ChatPromptTemplate.from_messages([(\"system\", system), (\"human\", human)])\n",
-    "\n",
+    "prompt = ChatPromptTemplate.from_messages([(\"system\", system), (\"human\", human)])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "AIMessage(content=' 私はプログラミングが大好きです。', additional_kwargs={}, example=False)"
+      ]
+     },
+     "execution_count": 13,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
    "chain = prompt | chat\n",
-    "\n",
    "chain.invoke(\n",
    "    {\n",
    "        \"input_language\": \"English\",\n",
@@ -142,7 +153,20 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 18,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "chat = ChatVertexAI(\n",
+    "    model_name=\"codechat-bison\", max_output_tokens=1000, temperature=0.5\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 20,
   "metadata": {
    "tags": []
   },
@@ -152,39 +176,20 @@
     "output_type": "stream",
     "text": [
      " ```python\n",
-      "def is_prime(n):\n",
-      "    if n <= 1:\n",
+      "def is_prime(x): \n",
+      "    if (x <= 1): \n",
      "        return False\n",
-      "    for i in range(2, n):\n",
-      "        if n % i == 0:\n",
+      "    for i in range(2, x): \n",
+      "        if (x % i == 0): \n",
      "            return False\n",
      "    return True\n",
-      "\n",
-      "def find_prime_numbers(n):\n",
-      "    prime_numbers = []\n",
-      "    for i in range(2, n + 1):\n",
-      "        if is_prime(i):\n",
-      "            prime_numbers.append(i)\n",
-      "    return prime_numbers\n",
-      "\n",
-      "print(find_prime_numbers(100))\n",
-      "```\n",
-      "\n",
-      "Output:\n",
-      "\n",
-      "```\n",
-      "[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]\n",
      "```\n"
     ]
    }
   ],
   "source": [
-    "chat = ChatVertexAI(\n",
-    "    model_name=\"codechat-bison\", max_output_tokens=1000, temperature=0.5\n",
-    ")\n",
-    "\n",
-    "message = chat.invoke(\"Write a Python function to identify all prime numbers\")\n",
-    "print(message.content)"
+    "# For simple string-in, string-out usage, we can use the `predict` method:\n",
+    "print(chat.predict(\"Write a Python function to identify all prime numbers\"))"
   ]
  },
  {
@@ -193,47 +198,66 @@
   "source": [
    "## Asynchronous calls\n",
    "\n",
-    "We can make asynchronous calls via the Runnables [Async Interface](/docs/expression_language/interface)"
+    "We can make asynchronous calls via the `agenerate` and `ainvoke` methods."
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
-    "# for running these examples in the notebook:\n",
    "import asyncio\n",
    "\n",
-    "import nest_asyncio\n",
-    "\n",
-    "nest_asyncio.apply()"
+    "# import nest_asyncio\n",
+    "# nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
-       "AIMessage(content=' Why do you love programming?')"
+       "LLMResult(generations=[[ChatGeneration(text=\" J'aime la programmation.\", generation_info=None, message=AIMessage(content=\" J'aime la programmation.\", additional_kwargs={}, example=False))]], llm_output={}, run=[RunInfo(run_id=UUID('223599ef-38f8-4c79-ac6d-a5013060eb9d'))])"
      ]
     },
-     "execution_count": 6,
+     "execution_count": 35,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
-    "system = (\n",
-    "    \"You are a helpful assistant that translates {input_language} to {output_language}.\"\n",
+    "chat = ChatVertexAI(\n",
+    "    model_name=\"chat-bison\",\n",
+    "    max_output_tokens=1000,\n",
+    "    temperature=0.7,\n",
+    "    top_p=0.95,\n",
+    "    top_k=40,\n",
    ")\n",
-    "human = \"{text}\"\n",
-    "prompt = ChatPromptTemplate.from_messages([(\"system\", system), (\"human\", human)])\n",
-    "chain = prompt | chat\n",
    "\n",
+    "asyncio.run(chat.agenerate([messages]))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 36,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "AIMessage(content=' अहं प्रोग्रामिंग प्रेमामि', additional_kwargs={}, example=False)"
+      ]
+     },
+     "execution_count": 36,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
    "asyncio.run(\n",
    "    chain.ainvoke(\n",
    "        {\n",
@@ -256,51 +280,56 @@
  },
  {
   "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import sys"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-      " The five most populous countries in the world are:\n",
-      "1. China (1.4 billion)\n",
-      "2. India (1.3 billion)\n",
-      "3. United States (331 million)\n",
-      "4. Indonesia (273 million)\n",
-      "5. Pakistan (220 million)"
+      " 1. China (1,444,216,107)\n",
+      "2. India (1,393,409,038)\n",
+      "3. United States (332,403,650)\n",
+      "4. Indonesia (273,523,615)\n",
+      "5. Pakistan (220,892,340)\n",
+      "6. Brazil (212,559,409)\n",
+      "7. Nigeria (206,139,589)\n",
+      "8. Bangladesh (164,689,383)\n",
+      "9. Russia (145,934,462)\n",
+      "10. Mexico (128,932,488)\n",
+      "11. Japan (126,476,461)\n",
+      "12. Ethiopia (115,063,982)\n",
+      "13. Philippines (109,581,078)\n",
+      "14. Egypt (102,334,404)\n",
+      "15. Vietnam (97,338,589)"
     ]
    }
   ],
   "source": [
-    "import sys\n",
-    "\n",
    "prompt = ChatPromptTemplate.from_messages(\n",
-    "    [(\"human\", \"List out the 5 most populous countries in the world\")]\n",
+    "    [(\"human\", \"List out the 15 most populous countries in the world\")]\n",
    ")\n",
-    "\n",
-    "chat = ChatVertexAI()\n",
-    "\n",
-    "chain = prompt | chat\n",
-    "\n",
-    "for chunk in chain.stream({}):\n",
+    "messages = prompt.format_messages()\n",
+    "for chunk in chat.stream(messages):\n",
    "    sys.stdout.write(chunk.content)\n",
    "    sys.stdout.flush()"
   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "poetry-venv",
   "language": "python",
-   "name": "python3"
+   "name": "poetry-venv"
  },
  "language_info": {
   "codemirror_mode": {
@@ -312,7 +341,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.11.4"
+   "version": "3.9.1"
  },
  "vscode": {
   "interpreter": {
--- a/docs/docs/integrations/chat/hunyuan.ipynb
+++ b/docs/docs/integrations/chat/hunyuan.ipynb
@@ -1,19 +1,10 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Tencent Hunyuan\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# ChatHunyuan\n",
+    "# Tencent Hunyuan\n",
    "\n",
    "Hunyuan chat model API by Tencent. For more information, see [https://cloud.tencent.com/document/product/1729](https://cloud.tencent.com/document/product/1729)"
   ]
@@ -45,7 +36,7 @@
   "outputs": [],
   "source": [
    "chat = ChatHunyuan(\n",
-    "    hunyuan_app_id=111111111,\n",
+    "    hunyuan_app_id=\"YOUR_APP_ID\",\n",
    "    hunyuan_secret_id=\"YOUR_SECRET_ID\",\n",
    "    hunyuan_secret_key=\"YOUR_SECRET_KEY\",\n",
    ")"
@@ -63,9 +54,7 @@
   "outputs": [
    {
     "data": {
-      "text/plain": [
-       "AIMessage(content=\"J'aime programmer.\")"
-      ]
+      "text/plain": "AIMessage(content=\"J'aime programmer.\")"
     },
     "execution_count": 3,
     "metadata": {},
@@ -84,23 +73,16 @@
  },
  {
   "cell_type": "markdown",
-   "metadata": {
-    "collapsed": false
-   },
   "source": [
    "## For ChatHunyuan with Streaming"
-   ]
+   ],
+   "metadata": {
+    "collapsed": false
+   }
  },
  {
   "cell_type": "code",
   "execution_count": 2,
-   "metadata": {
-    "ExecuteTime": {
-     "end_time": "2023-10-19T10:20:41.507720Z",
-     "start_time": "2023-10-19T10:20:41.496456Z"
-    },
-    "collapsed": false
-   },
   "outputs": [],
   "source": [
    "chat = ChatHunyuan(\n",
@@ -109,24 +91,22 @@
    "    hunyuan_secret_key=\"YOUR_SECRET_KEY\",\n",
    "    streaming=True,\n",
    ")"
-   ]
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2023-10-19T10:20:41.507720Z",
+     "start_time": "2023-10-19T10:20:41.496456Z"
+    }
+   }
  },
  {
   "cell_type": "code",
   "execution_count": 3,
-   "metadata": {
-    "ExecuteTime": {
-     "end_time": "2023-10-19T10:20:46.275673Z",
-     "start_time": "2023-10-19T10:20:44.241097Z"
-    },
-    "collapsed": false
-   },
   "outputs": [
    {
     "data": {
-      "text/plain": [
-       "AIMessageChunk(content=\"J'aime programmer.\")"
-      ]
+      "text/plain": "AIMessageChunk(content=\"J'aime programmer.\")"
     },
     "execution_count": 3,
     "metadata": {},
@@ -141,19 +121,26 @@
    "        )\n",
    "    ]\n",
    ")"
-   ]
+   ],
+   "metadata": {
+    "collapsed": false,
+    "ExecuteTime": {
+     "end_time": "2023-10-19T10:20:46.275673Z",
+     "start_time": "2023-10-19T10:20:44.241097Z"
+    }
+   }
  },
  {
   "cell_type": "code",
   "execution_count": null,
+   "outputs": [],
+   "source": [],
   "metadata": {
+    "collapsed": false,
    "ExecuteTime": {
     "start_time": "2023-10-19T10:19:56.233477Z"
-    },
-    "collapsed": false
-   },
-   "outputs": [],
-   "source": []
+    }
+   }
  }
 ],
 "metadata": {
--- a/docs/docs/integrations/chat/konko.ipynb
+++ b/docs/docs/integrations/chat/konko.ipynb
@@ -1,19 +1,10 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Konko\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# ChatKonko\n",
+    "# Konko\n",
    "\n",
    ">[Konko](https://www.konko.ai/) API is a fully managed Web API designed to help application developers:\n",
    "\n",
--- a/docs/docs/integrations/chat/litellm.ipynb
+++ b/docs/docs/integrations/chat/litellm.ipynb
@@ -1,22 +1,12 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "id": "59148044",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: LiteLLM\n",
-    "---"
-   ]
-  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "bf733a38-db84-4363-89e2-de6735c37230",
   "metadata": {},
   "source": [
-    "# ChatLiteLLM\n",
+    "# 🚅 LiteLLM\n",
    "\n",
    "[LiteLLM](https://github.com/BerriAI/litellm) is a library that simplifies calling Anthropic, Azure, Huggingface, Replicate, etc. \n",
    "\n",
--- a/docs/docs/integrations/chat/llama2_chat.ipynb
+++ b/docs/docs/integrations/chat/llama2_chat.ipynb
@@ -1,739 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "raw",
-   "id": "7320f16b",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Llama 2 Chat\n",
-    "---"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "90a1faf2",
-   "metadata": {},
-   "source": [
-    "# Llama2Chat\n",
-    "\n",
-    "This notebook shows how to augment Llama-2 `LLM`s with the `Llama2Chat` wrapper to support the [Llama-2 chat prompt format](https://huggingface.co/blog/llama2#how-to-prompt-llama-2). Several `LLM` implementations in LangChain can be used as an interface to Llama-2 chat models. These include [HuggingFaceTextGenInference](https://python.langchain.com/docs/integrations/llms/huggingface_textgen_inference), [LlamaCpp](https://python.langchain.com/docs/use_cases/question_answering/how_to/local_retrieval_qa), and [GPT4All](https://python.langchain.com/docs/integrations/llms/gpt4all), to mention a few examples. \n",
-    "\n",
-    "`Llama2Chat` is a generic wrapper that implements `BaseChatModel` and can therefore be used in applications as a [chat model](https://python.langchain.com/docs/modules/model_io/models/chat/). `Llama2Chat` converts a list of [chat messages](https://python.langchain.com/docs/modules/model_io/models/chat/#messages) into the [required chat prompt format](https://huggingface.co/blog/llama2#how-to-prompt-llama-2) and forwards the formatted prompt as a `str` to the wrapped `LLM`."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "id": "36c03540",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain.chains import LLMChain\n",
-    "from langchain.memory import ConversationBufferMemory\n",
-    "from langchain_experimental.chat_models import Llama2Chat"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "5c76910f",
-   "metadata": {},
-   "source": [
-    "For the chat application examples below, we'll use the following chat `prompt_template`:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "id": "9bbfaf3a",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain.prompts.chat import (\n",
-    "    ChatPromptTemplate,\n",
-    "    HumanMessagePromptTemplate,\n",
-    "    MessagesPlaceholder,\n",
-    ")\n",
-    "from langchain.schema import SystemMessage\n",
-    "\n",
-    "template_messages = [\n",
-    "    SystemMessage(content=\"You are a helpful assistant.\"),\n",
-    "    MessagesPlaceholder(variable_name=\"chat_history\"),\n",
-    "    HumanMessagePromptTemplate.from_template(\"{text}\"),\n",
-    "]\n",
-    "prompt_template = ChatPromptTemplate.from_messages(template_messages)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "2f3343b7",
-   "metadata": {},
-   "source": [
-    "## Chat with Llama-2 via `HuggingFaceTextGenInference` LLM"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "2ff99380",
-   "metadata": {},
-   "source": [
-    "A [HuggingFaceTextGenInference](https://python.langchain.com/docs/integrations/llms/huggingface_textgen_inference) LLM encapsulates access to a [text-generation-inference](https://github.com/huggingface/text-generation-inference) server. In the following example, the inference server serves a [meta-llama/Llama-2-13b-chat-hf](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf) model. It can be started locally with:\n",
-    "\n",
-    "```bash\n",
-    "docker run \\\n",
-    "  --rm \\\n",
-    "  --gpus all \\\n",
-    "  --ipc=host \\\n",
-    "  -p 8080:80 \\\n",
-    "  -v ~/.cache/huggingface/hub:/data \\\n",
-    "  -e HF_API_TOKEN=${HF_API_TOKEN} \\\n",
-    "  ghcr.io/huggingface/text-generation-inference:0.9 \\\n",
-    "  --hostname 0.0.0.0 \\\n",
-    "  --model-id meta-llama/Llama-2-13b-chat-hf \\\n",
-    "  --quantize bitsandbytes \\\n",
-    "  --num-shard 4\n",
-    "```\n",
-    "\n",
-    "This works on a machine with 4 x RTX 3080ti cards, for example. Adjust the `--num-shard` value to the number of GPUs available. The `HF_API_TOKEN` environment variable holds the Hugging Face API token."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "238095fd",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# !pip3 install text-generation"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "79c4ace9",
-   "metadata": {},
-   "source": [
-    "Create a `HuggingFaceTextGenInference` instance that connects to the local inference server and wrap it into `Llama2Chat`."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "id": "7a9f6de2",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain.llms import HuggingFaceTextGenInference\n",
-    "\n",
-    "llm = HuggingFaceTextGenInference(\n",
-    "    inference_server_url=\"http://127.0.0.1:8080/\",\n",
-    "    max_new_tokens=512,\n",
-    "    top_k=50,\n",
-    "    temperature=0.1,\n",
-    "    repetition_penalty=1.03,\n",
-    ")\n",
-    "\n",
-    "model = Llama2Chat(llm=llm)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "4f646a2b",
-   "metadata": {},
-   "source": [
-    "Then you are ready to use the chat `model` together with `prompt_template` and conversation `memory` in an `LLMChain`."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 5,
-   "id": "54b5d1d1",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True)\n",
-    "chain = LLMChain(llm=model, prompt=prompt_template, memory=memory)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 6,
-   "id": "e6717947",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      " Sure, I'd be happy to help! Here are a few popular locations to consider visiting in Vienna:\n",
-      "\n",
-      "1. Schönbrunn Palace\n",
-      "2. St. Stephen's Cathedral\n",
-      "3. Hofburg Palace\n",
-      "4. Belvedere Palace\n",
-      "5. Prater Park\n",
-      "6. Vienna State Opera\n",
-      "7. Albertina Museum\n",
-      "8. Museum of Natural History\n",
-      "9. Kunsthistorisches Museum\n",
-      "10. Ringstrasse\n"
-     ]
-    }
-   ],
-   "source": [
-    "print(\n",
-    "    chain.run(\n",
-    "        text=\"What can I see in Vienna? Propose a few locations. Names only, no details.\"\n",
-    "    )\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "id": "17bf10d5",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      " Certainly! St. Stephen's Cathedral (Stephansdom) is one of the most recognizable landmarks in Vienna and a must-see attraction for visitors. This stunning Gothic cathedral is located in the heart of the city and is known for its intricate stone carvings, colorful stained glass windows, and impressive dome.\n",
-      "\n",
-      "The cathedral was built in the 12th century and has been the site of many important events throughout history, including the coronation of Holy Roman emperors and the funeral of Mozart. Today, it is still an active place of worship and offers guided tours, concerts, and special events. Visitors can climb up the south tower for panoramic views of the city or attend a service to experience the beautiful music and chanting.\n"
-     ]
-    }
-   ],
-   "source": [
-    "print(chain.run(text=\"Tell me more about #2.\"))"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "2a297e09",
-   "metadata": {},
-   "source": [
-    "## Chat with Llama-2 via `LlamaCpp` LLM"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "52c1a0b9",
-   "metadata": {},
-   "source": [
-    "To use a Llama-2 chat model with a [LlamaCpp](https://python.langchain.com/docs/integrations/llms/llamacpp) `LLM`, install the `llama-cpp-python` library using [these installation instructions](https://python.langchain.com/docs/integrations/llms/llamacpp#installation). The following example uses a quantized [llama-2-7b-chat.Q4_0.gguf](https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_0.gguf) model stored locally at `~/Models/llama-2-7b-chat.Q4_0.gguf`. \n",
-    "\n",
-    "After creating a `LlamaCpp` instance, the `llm` is again wrapped into `Llama2Chat`."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "id": "07c0d04e",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "llama_model_loader: loaded meta data with 19 key-value pairs and 291 tensors from /home/martin/Models/llama-2-7b-chat.Q4_0.gguf (version GGUF V2)\n",
-      "llama_model_loader: - tensor    0:                token_embd.weight q4_0     [  4096, 32000,     1,     1 ]\n",
-      "llama_model_loader: - tensor    1:           blk.0.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor    2:            blk.0.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor    3:            blk.0.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor    4:              blk.0.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor    5:            blk.0.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor    6:              blk.0.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor    7:         blk.0.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor    8:              blk.0.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor    9:              blk.0.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   10:           blk.1.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   11:            blk.1.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   12:            blk.1.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   13:              blk.1.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   14:            blk.1.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   15:              blk.1.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   16:         blk.1.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   17:              blk.1.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   18:              blk.1.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   19:          blk.10.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   20:           blk.10.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   21:           blk.10.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   22:             blk.10.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   23:           blk.10.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   24:             blk.10.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   25:        blk.10.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   26:             blk.10.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   27:             blk.10.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   28:          blk.11.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   29:           blk.11.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   30:           blk.11.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   31:             blk.11.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   32:           blk.11.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   33:             blk.11.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   34:        blk.11.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   35:             blk.11.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   36:             blk.11.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   37:          blk.12.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   38:           blk.12.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   39:           blk.12.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   40:             blk.12.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   41:           blk.12.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   42:             blk.12.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   43:        blk.12.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   44:             blk.12.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   45:             blk.12.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   46:          blk.13.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   47:           blk.13.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   48:           blk.13.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   49:             blk.13.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   50:           blk.13.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   51:             blk.13.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   52:        blk.13.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   53:             blk.13.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   54:             blk.13.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   55:          blk.14.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   56:           blk.14.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   57:           blk.14.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   58:             blk.14.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   59:           blk.14.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   60:             blk.14.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   61:        blk.14.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   62:             blk.14.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   63:             blk.14.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   64:          blk.15.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   65:           blk.15.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   66:           blk.15.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   67:             blk.15.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   68:           blk.15.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   69:             blk.15.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   70:        blk.15.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   71:             blk.15.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   72:             blk.15.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   73:          blk.16.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   74:           blk.16.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   75:           blk.16.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   76:             blk.16.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   77:           blk.16.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   78:             blk.16.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   79:        blk.16.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   80:             blk.16.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   81:             blk.16.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   82:          blk.17.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   83:           blk.17.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   84:           blk.17.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   85:             blk.17.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   86:           blk.17.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   87:             blk.17.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   88:        blk.17.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   89:             blk.17.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   90:             blk.17.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   91:          blk.18.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   92:           blk.18.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   93:           blk.18.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   94:             blk.18.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor   95:           blk.18.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor   96:             blk.18.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   97:        blk.18.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   98:             blk.18.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor   99:             blk.18.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  100:          blk.19.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  101:           blk.19.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  102:           blk.19.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  103:             blk.19.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  104:           blk.19.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  105:             blk.19.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  106:        blk.19.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  107:             blk.19.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  108:             blk.19.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  109:           blk.2.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  110:            blk.2.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  111:            blk.2.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  112:              blk.2.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  113:            blk.2.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  114:              blk.2.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  115:         blk.2.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  116:              blk.2.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  117:              blk.2.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  118:          blk.20.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  119:           blk.20.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  120:           blk.20.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  121:             blk.20.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  122:           blk.20.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  123:             blk.20.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  124:        blk.20.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  125:             blk.20.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  126:             blk.20.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  127:          blk.21.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  128:           blk.21.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  129:           blk.21.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  130:             blk.21.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  131:           blk.21.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  132:             blk.21.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  133:        blk.21.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  134:             blk.21.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  135:             blk.21.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  136:          blk.22.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  137:           blk.22.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  138:           blk.22.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  139:             blk.22.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  140:           blk.22.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  141:             blk.22.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  142:        blk.22.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  143:             blk.22.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  144:             blk.22.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  145:          blk.23.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  146:           blk.23.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  147:           blk.23.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  148:             blk.23.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  149:           blk.23.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  150:             blk.23.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  151:        blk.23.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  152:             blk.23.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  153:             blk.23.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  154:           blk.3.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  155:            blk.3.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  156:            blk.3.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  157:              blk.3.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  158:            blk.3.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  159:              blk.3.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  160:         blk.3.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  161:              blk.3.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  162:              blk.3.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  163:           blk.4.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  164:            blk.4.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  165:            blk.4.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  166:              blk.4.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  167:            blk.4.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  168:              blk.4.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  169:         blk.4.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  170:              blk.4.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  171:              blk.4.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  172:           blk.5.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  173:            blk.5.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  174:            blk.5.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  175:              blk.5.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  176:            blk.5.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  177:              blk.5.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  178:         blk.5.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  179:              blk.5.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  180:              blk.5.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  181:           blk.6.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  182:            blk.6.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  183:            blk.6.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  184:              blk.6.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  185:            blk.6.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  186:              blk.6.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  187:         blk.6.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  188:              blk.6.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  189:              blk.6.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  190:           blk.7.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  191:            blk.7.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  192:            blk.7.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  193:              blk.7.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  194:            blk.7.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  195:              blk.7.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  196:         blk.7.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  197:              blk.7.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  198:              blk.7.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  199:           blk.8.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  200:            blk.8.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  201:            blk.8.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  202:              blk.8.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  203:            blk.8.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  204:              blk.8.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  205:         blk.8.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  206:              blk.8.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  207:              blk.8.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  208:           blk.9.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  209:            blk.9.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  210:            blk.9.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  211:              blk.9.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  212:            blk.9.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  213:              blk.9.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  214:         blk.9.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  215:              blk.9.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  216:              blk.9.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  217:                    output.weight q6_K     [  4096, 32000,     1,     1 ]\n",
-      "llama_model_loader: - tensor  218:          blk.24.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  219:           blk.24.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  220:           blk.24.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  221:             blk.24.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  222:           blk.24.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  223:             blk.24.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  224:        blk.24.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  225:             blk.24.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  226:             blk.24.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  227:          blk.25.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  228:           blk.25.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  229:           blk.25.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  230:             blk.25.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  231:           blk.25.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  232:             blk.25.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  233:        blk.25.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  234:             blk.25.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  235:             blk.25.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  236:          blk.26.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  237:           blk.26.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  238:           blk.26.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  239:             blk.26.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  240:           blk.26.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  241:             blk.26.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  242:        blk.26.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  243:             blk.26.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  244:             blk.26.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  245:          blk.27.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  246:           blk.27.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  247:           blk.27.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  248:             blk.27.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  249:           blk.27.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  250:             blk.27.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  251:        blk.27.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  252:             blk.27.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  253:             blk.27.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  254:          blk.28.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  255:           blk.28.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  256:           blk.28.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  257:             blk.28.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  258:           blk.28.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  259:             blk.28.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  260:        blk.28.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  261:             blk.28.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  262:             blk.28.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  263:          blk.29.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  264:           blk.29.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  265:           blk.29.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  266:             blk.29.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  267:           blk.29.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  268:             blk.29.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  269:        blk.29.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  270:             blk.29.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  271:             blk.29.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  272:          blk.30.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  273:           blk.30.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  274:           blk.30.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  275:             blk.30.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  276:           blk.30.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  277:             blk.30.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  278:        blk.30.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  279:             blk.30.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  280:             blk.30.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  281:          blk.31.attn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  282:           blk.31.ffn_down.weight q4_0     [ 11008,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  283:           blk.31.ffn_gate.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  284:             blk.31.ffn_up.weight q4_0     [  4096, 11008,     1,     1 ]\n",
-      "llama_model_loader: - tensor  285:           blk.31.ffn_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - tensor  286:             blk.31.attn_k.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  287:        blk.31.attn_output.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  288:             blk.31.attn_q.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  289:             blk.31.attn_v.weight q4_0     [  4096,  4096,     1,     1 ]\n",
-      "llama_model_loader: - tensor  290:               output_norm.weight f32      [  4096,     1,     1,     1 ]\n",
-      "llama_model_loader: - kv   0:                       general.architecture str     \n",
-      "llama_model_loader: - kv   1:                               general.name str     \n",
-      "llama_model_loader: - kv   2:                       llama.context_length u32     \n",
-      "llama_model_loader: - kv   3:                     llama.embedding_length u32     \n",
-      "llama_model_loader: - kv   4:                          llama.block_count u32     \n",
-      "llama_model_loader: - kv   5:                  llama.feed_forward_length u32     \n",
-      "llama_model_loader: - kv   6:                 llama.rope.dimension_count u32     \n",
-      "llama_model_loader: - kv   7:                 llama.attention.head_count u32     \n",
-      "llama_model_loader: - kv   8:              llama.attention.head_count_kv u32     \n",
-      "llama_model_loader: - kv   9:     llama.attention.layer_norm_rms_epsilon f32     \n",
-      "llama_model_loader: - kv  10:                          general.file_type u32     \n",
-      "llama_model_loader: - kv  11:                       tokenizer.ggml.model str     \n",
-      "llama_model_loader: - kv  12:                      tokenizer.ggml.tokens arr     \n",
-      "llama_model_loader: - kv  13:                      tokenizer.ggml.scores arr     \n",
-      "llama_model_loader: - kv  14:                  tokenizer.ggml.token_type arr     \n",
-      "llama_model_loader: - kv  15:                tokenizer.ggml.bos_token_id u32     \n",
-      "llama_model_loader: - kv  16:                tokenizer.ggml.eos_token_id u32     \n",
-      "llama_model_loader: - kv  17:            tokenizer.ggml.unknown_token_id u32     \n",
-      "llama_model_loader: - kv  18:               general.quantization_version u32     \n",
-      "llama_model_loader: - type  f32:   65 tensors\n",
-      "llama_model_loader: - type q4_0:  225 tensors\n",
-      "llama_model_loader: - type q6_K:    1 tensors\n",
-      "llm_load_vocab: special tokens definition check successful ( 259/32000 ).\n",
-      "llm_load_print_meta: format           = GGUF V2\n",
-      "llm_load_print_meta: arch             = llama\n",
-      "llm_load_print_meta: vocab type       = SPM\n",
-      "llm_load_print_meta: n_vocab          = 32000\n",
-      "llm_load_print_meta: n_merges         = 0\n",
-      "llm_load_print_meta: n_ctx_train      = 4096\n",
-      "llm_load_print_meta: n_embd           = 4096\n",
-      "llm_load_print_meta: n_head           = 32\n",
-      "llm_load_print_meta: n_head_kv        = 32\n",
-      "llm_load_print_meta: n_layer          = 32\n",
-      "llm_load_print_meta: n_rot            = 128\n",
-      "llm_load_print_meta: n_gqa            = 1\n",
-      "llm_load_print_meta: f_norm_eps       = 0.0e+00\n",
-      "llm_load_print_meta: f_norm_rms_eps   = 1.0e-06\n",
-      "llm_load_print_meta: f_clamp_kqv      = 0.0e+00\n",
-      "llm_load_print_meta: f_max_alibi_bias = 0.0e+00\n",
-      "llm_load_print_meta: n_ff             = 11008\n",
-      "llm_load_print_meta: rope scaling     = linear\n",
-      "llm_load_print_meta: freq_base_train  = 10000.0\n",
-      "llm_load_print_meta: freq_scale_train = 1\n",
-      "llm_load_print_meta: n_yarn_orig_ctx  = 4096\n",
-      "llm_load_print_meta: rope_finetuned   = unknown\n",
-      "llm_load_print_meta: model type       = 7B\n",
-      "llm_load_print_meta: model ftype      = mostly Q4_0\n",
-      "llm_load_print_meta: model params     = 6.74 B\n",
-      "llm_load_print_meta: model size       = 3.56 GiB (4.54 BPW) \n",
-      "llm_load_print_meta: general.name   = LLaMA v2\n",
-      "llm_load_print_meta: BOS token = 1 '<s>'\n",
-      "llm_load_print_meta: EOS token = 2 '</s>'\n",
-      "llm_load_print_meta: UNK token = 0 '<unk>'\n",
-      "llm_load_print_meta: LF token  = 13 '<0x0A>'\n",
-      "llm_load_tensors: ggml ctx size =    0.11 MB\n",
-      "llm_load_tensors: mem required  = 3647.97 MB\n",
-      "..................................................................................................\n",
-      "llama_new_context_with_model: n_ctx      = 512\n",
-      "llama_new_context_with_model: freq_base  = 10000.0\n",
-      "llama_new_context_with_model: freq_scale = 1\n",
-      "llama_new_context_with_model: kv self size  =  256.00 MB\n",
-      "llama_build_graph: non-view tensors processed: 740/740\n",
-      "llama_new_context_with_model: compute buffer total size = 2.66 MB\n",
-      "AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 0 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | \n"
-     ]
-    }
-   ],
-   "source": [
-    "from os.path import expanduser\n",
-    "\n",
-    "from langchain.llms import LlamaCpp\n",
-    "\n",
-    "model_path = expanduser(\"~/Models/llama-2-7b-chat.Q4_0.gguf\")\n",
-    "\n",
-    "llm = LlamaCpp(\n",
-    "    model_path=model_path,\n",
-    "    streaming=False,\n",
-    ")\n",
-    "model = Llama2Chat(llm=llm)"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "50498d96",
-   "metadata": {},
-   "source": [
-    "and used in the same way as in the previous example."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "id": "90782b96",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "memory = ConversationBufferMemory(memory_key=\"chat_history\", return_messages=True)\n",
-    "chain = LLMChain(llm=model, prompt=prompt_template, memory=memory)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 10,
-   "id": "2160b26d",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "  Of course! Vienna is a beautiful city with a rich history and culture. Here are some of the top tourist attractions you might want to consider visiting:\n",
-      "1. Schönbrunn Palace\n",
-      "2. St. Stephen's Cathedral\n",
-      "3. Hofburg Palace\n",
-      "4. Belvedere Palace\n",
-      "5. Prater Park\n",
-      "6. MuseumsQuartier\n",
-      "7. Ringstrasse\n",
-      "8. Vienna State Opera\n",
-      "9. Kunsthistorisches Museum\n",
-      "10. Imperial Palace\n",
-      "\n",
-      "These are just a few of the many amazing places to see in Vienna. Each one has its own unique history and charm, so I hope you enjoy exploring this beautiful city!\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "\n",
-      "llama_print_timings:        load time =     250.46 ms\n",
-      "llama_print_timings:      sample time =      56.40 ms /   144 runs   (    0.39 ms per token,  2553.37 tokens per second)\n",
-      "llama_print_timings: prompt eval time =    1444.25 ms /    47 tokens (   30.73 ms per token,    32.54 tokens per second)\n",
-      "llama_print_timings:        eval time =    8832.02 ms /   143 runs   (   61.76 ms per token,    16.19 tokens per second)\n",
-      "llama_print_timings:       total time =   10645.94 ms\n"
-     ]
-    }
-   ],
-   "source": [
-    "print(\n",
-    "    chain.run(\n",
-    "        text=\"What can I see in Vienna? Propose a few locations. Names only, no details.\"\n",
-    "    )\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "id": "d9ce06e3",
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "Llama.generate: prefix-match hit\n"
-     ]
-    },
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "  Of course! St. Stephen's Cathedral (also known as Stephansdom) is a stunning Gothic-style cathedral located in the heart of Vienna, Austria. It is one of the most recognizable landmarks in the city and is considered a symbol of Vienna.\n",
-      "Here are some interesting facts about St. Stephen's Cathedral:\n",
-      "1. History: The construction of St. Stephen's Cathedral began in the 12th century on the site of a former Romanesque church, and it took over 600 years to complete. The cathedral has been renovated and expanded several times throughout its history, with the most significant renovation taking place in the 19th century.\n",
-      "2. Architecture: St. Stephen's Cathedral is built in the Gothic style, characterized by its tall spires, pointed arches, and intricate stone carvings. The cathedral features a mix of Romanesque, Gothic, and Baroque elements, making it a unique blend of styles.\n",
-      "3. Design: The cathedral's design is based on the plan of a cross with a long nave and two shorter arms extending from it. The main altar is\n"
-     ]
-    },
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "\n",
-      "llama_print_timings:        load time =     250.46 ms\n",
-      "llama_print_timings:      sample time =     100.60 ms /   256 runs   (    0.39 ms per token,  2544.73 tokens per second)\n",
-      "llama_print_timings: prompt eval time =    5128.71 ms /   160 tokens (   32.05 ms per token,    31.20 tokens per second)\n",
-      "llama_print_timings:        eval time =   16193.02 ms /   255 runs   (   63.50 ms per token,    15.75 tokens per second)\n",
-      "llama_print_timings:       total time =   21988.57 ms\n"
-     ]
-    }
-   ],
-   "source": [
-    "print(chain.run(text=\"Tell me more about #2.\"))"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.11.4"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
--- a/docs/docs/integrations/chat/llama_api.ipynb
+++ b/docs/docs/integrations/chat/llama_api.ipynb
@@ -1,21 +1,11 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "id": "71b5cfca",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Llama API\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "id": "90a1faf2",
   "metadata": {},
   "source": [
-    "# ChatLlamaAPI\n",
+    "# Llama API\n",
    "\n",
    "This notebook shows how to use LangChain with [LlamaAPI](https://llama-api.com/), a hosted version of Llama2 that adds support for function calling."
   ]
--- a/docs/docs/integrations/chat/minimax.ipynb
+++ b/docs/docs/integrations/chat/minimax.ipynb
@@ -1,20 +1,11 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: MiniMax\n",
-    "---"
-   ]
-  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# MiniMaxChat\n",
+    "# MiniMax\n",
    "\n",
    "[Minimax](https://api.minimax.chat) is a Chinese startup that provides LLM services to companies and individuals.\n",
    "\n",
--- a/docs/docs/integrations/chat/ollama.ipynb
+++ b/docs/docs/integrations/chat/ollama.ipynb
@@ -1,19 +1,10 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Ollama\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# ChatOllama\n",
+    "# Ollama\n",
    "\n",
    "[Ollama](https://ollama.ai/) allows you to run open-source large language models, such as LLaMA2, locally.\n",
    "\n",
@@ -128,159 +119,6 @@
    "chat_model(messages)"
   ]
  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Extraction\n",
-    " \n",
-    "Update your version of Ollama and supply the [`format`](https://github.com/jmorganca/ollama/blob/main/docs/api.md#json-mode) flag.\n",
-    "\n",
-    "We can force the model to produce JSON.\n",
-    "\n",
-    "**Note:** You can also try out the experimental [OllamaFunctions](https://python.langchain.com/docs/integrations/chat/ollama_functions) wrapper for convenience."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain.callbacks.manager import CallbackManager\n",
-    "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
-    "from langchain.chat_models import ChatOllama\n",
-    "\n",
-    "chat_model = ChatOllama(\n",
-    "    model=\"llama2\",\n",
-    "    format=\"json\",\n",
-    "    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 11,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      " Sure! Here's a JSON response with the colors of the sky at different times of the day:\n",
-      " Begriffe und Abkürzungen:\n",
-      "\n",
-      "* `time`: The time of day (in 24-hour format)\n",
-      "* `sky_color`: The color of the sky at that time (as a hex code)\n",
-      "\n",
-      "Here are the colors of the sky at different times of the day:\n",
-      "```json\n",
-      "[\n",
-      "  {\n",
-      "    \"time\": \"6am\",\n",
-      "    \"sky_color\": \"#0080c0\"\n",
-      "  },\n",
-      "  {\n",
-      "    \"time\": \"9am\",\n",
-      "    \"sky_color\": \"#3498db\"\n",
-      "  },\n",
-      "  {\n",
-      "    \"time\": \"12pm\",\n",
-      "    \"sky_color\": \"#ef7c00\"\n",
-      "  },\n",
-      "  {\n",
-      "    \"time\": \"3pm\",\n",
-      "    \"sky_color\": \"#9564b6\"\n",
-      "  },\n",
-      "  {\n",
-      "    \"time\": \"6pm\",\n",
-      "    \"sky_color\": \"#e78ac3\"\n",
-      "  },\n",
-      "  {\n",
-      "    \"time\": \"9pm\",\n",
-      "    \"sky_color\": \"#5f006a\"\n",
-      "  }\n",
-      "]\n",
-      "```\n",
-      "In this response, the `time` property is a string in 24-hour format, representing the time of day. The `sky_color` property is a hex code representing the color of the sky at that time. For example, at 6am, the sky is blue (#0080c0), while at 9pm, it's dark blue (#5f006a)."
-     ]
-    }
-   ],
-   "source": [
-    "from langchain.schema import HumanMessage\n",
-    "\n",
-    "messages = [\n",
-    "    HumanMessage(\n",
-    "        content=\"What color is the sky at different times of the day? Respond using JSON\"\n",
-    "    )\n",
-    "]\n",
-    "\n",
-    "chat_model_response = chat_model(messages)"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      " Sure! Based on the JSON schema you provided, here's the information we can gather about a person named John who is 35 years old and loves pizza:\n",
-      "\n",
-      "**Name:** John\n",
-      "\n",
-      "**Age:** 35 (integer)\n",
-      "\n",
-      "**Favorite food:** Pizza (string)\n",
-      "\n",
-      "So, the JSON object for John would look like this:\n",
-      "```json\n",
-      "{\n",
-      "  \"name\": \"John\",\n",
-      "  \"age\": 35,\n",
-      "  \"fav_food\": \"pizza\"\n",
-      "}\n",
-      "```\n",
-      "Note that we cannot provide additional information about John beyond what is specified in the schema. For example, we do not have any information about his gender, occupation, or address, as those fields are not included in the schema."
-     ]
-    }
-   ],
-   "source": [
-    "import json\n",
-    "\n",
-    "from langchain.schema import HumanMessage\n",
-    "\n",
-    "json_schema = {\n",
-    "    \"title\": \"Person\",\n",
-    "    \"description\": \"Identifying information about a person.\",\n",
-    "    \"type\": \"object\",\n",
-    "    \"properties\": {\n",
-    "        \"name\": {\"title\": \"Name\", \"description\": \"The person's name\", \"type\": \"string\"},\n",
-    "        \"age\": {\"title\": \"Age\", \"description\": \"The person's age\", \"type\": \"integer\"},\n",
-    "        \"fav_food\": {\n",
-    "            \"title\": \"Fav Food\",\n",
-    "            \"description\": \"The person's favorite food\",\n",
-    "            \"type\": \"string\",\n",
-    "        },\n",
-    "    },\n",
-    "    \"required\": [\"name\", \"age\"],\n",
-    "}\n",
-    "\n",
-    "messages = [\n",
-    "    HumanMessage(\n",
-    "        content=\"Please tell me about a person using the following JSON schema:\"\n",
-    "    ),\n",
-    "    HumanMessage(content=json.dumps(json_schema, indent=2)),\n",
-    "    HumanMessage(\n",
-    "        content=\"Now, considering the schema, tell me about a person named John who is 35 years old and loves pizza.\"\n",
-    "    ),\n",
-    "]\n",
-    "\n",
-    "chat_model_response = chat_model(messages)"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
@@ -537,5 +375,5 @@
  }
 },
 "nbformat": 4,
- "nbformat_minor": 4
+ "nbformat_minor": 2
 }
--- a/docs/docs/integrations/chat/ollama_functions.ipynb
+++ b/docs/docs/integrations/chat/ollama_functions.ipynb
@@ -1,180 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "raw",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: Ollama Functions\n",
-    "---"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# OllamaFunctions\n",
-    "\n",
-    "This notebook shows how to use an experimental wrapper around Ollama that gives it the same API as OpenAI Functions.\n",
-    "\n",
-    "Note that more powerful and capable models will perform better with complex schema and/or multiple functions. The examples below use Mistral.\n",
-    "For a complete list of supported models and model variants, see the [Ollama model library](https://ollama.ai/library).\n",
-    "\n",
-    "## Setup\n",
-    "\n",
-    "Follow [these instructions](https://github.com/jmorganca/ollama) to set up and run a local Ollama instance.\n",
-    "\n",
-    "## Usage\n",
-    "\n",
-    "You can initialize OllamaFunctions in a similar way to how you'd initialize a standard ChatOllama instance:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 1,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain_experimental.llms.ollama_functions import OllamaFunctions\n",
-    "\n",
-    "model = OllamaFunctions(model=\"mistral\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "You can then bind functions defined with JSON Schema parameters and a `function_call` parameter to force the model to call the given function:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 2,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "model = model.bind(\n",
-    "    functions=[\n",
-    "        {\n",
-    "            \"name\": \"get_current_weather\",\n",
-    "            \"description\": \"Get the current weather in a given location\",\n",
-    "            \"parameters\": {\n",
-    "                \"type\": \"object\",\n",
-    "                \"properties\": {\n",
-    "                    \"location\": {\n",
-    "                        \"type\": \"string\",\n",
-    "                        \"description\": \"The city and state, \" \"e.g. San Francisco, CA\",\n",
-    "                    },\n",
-    "                    \"unit\": {\n",
-    "                        \"type\": \"string\",\n",
-    "                        \"enum\": [\"celsius\", \"fahrenheit\"],\n",
-    "                    },\n",
-    "                },\n",
-    "                \"required\": [\"location\"],\n",
-    "            },\n",
-    "        }\n",
-    "    ],\n",
-    "    function_call={\"name\": \"get_current_weather\"},\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Calling a function with this model then results in JSON output matching the provided schema:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 3,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "AIMessage(content='', additional_kwargs={'function_call': {'name': 'get_current_weather', 'arguments': '{\"location\": \"Boston, MA\", \"unit\": \"celsius\"}'}})"
-      ]
-     },
-     "execution_count": 3,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from langchain.schema import HumanMessage\n",
-    "\n",
-    "model.invoke(\"what is the weather in Boston?\")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## Using for extraction\n",
-    "\n",
-    "One useful thing you can do with function calling here is extracting properties from a given input in a structured format:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 4,
-   "metadata": {},
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "[{'name': 'Alex', 'height': 5, 'hair_color': 'blonde'},\n",
-       " {'name': 'Claudia', 'height': 6, 'hair_color': 'brunette'}]"
-      ]
-     },
-     "execution_count": 4,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "from langchain.chains import create_extraction_chain\n",
-    "\n",
-    "# Schema\n",
-    "schema = {\n",
-    "    \"properties\": {\n",
-    "        \"name\": {\"type\": \"string\"},\n",
-    "        \"height\": {\"type\": \"integer\"},\n",
-    "        \"hair_color\": {\"type\": \"string\"},\n",
-    "    },\n",
-    "    \"required\": [\"name\", \"height\"],\n",
-    "}\n",
-    "\n",
-    "# Input\n",
-    "input = \"\"\"Alex is 5 feet tall. Claudia is 1 feet taller than Alex and jumps higher than him. Claudia is a brunette and Alex is blonde.\"\"\"\n",
-    "\n",
-    "# Run chain\n",
-    "llm = OllamaFunctions(model=\"mistral\", temperature=0)\n",
-    "chain = create_extraction_chain(schema, llm)\n",
-    "chain.run(input)"
-   ]
-  }
- ],
- "metadata": {
-  "kernelspec": {
-   "display_name": ".venv",
-   "language": "python",
-   "name": "python3"
-  },
-  "language_info": {
-   "codemirror_mode": {
-    "name": "ipython",
-    "version": 3
-   },
-   "file_extension": ".py",
-   "mimetype": "text/x-python",
-   "name": "python",
-   "nbconvert_exporter": "python",
-   "pygments_lexer": "ipython3",
-   "version": "3.10.5"
-  }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
--- a/docs/docs/integrations/chat/openai.ipynb
+++ b/docs/docs/integrations/chat/openai.ipynb
@@ -1,21 +1,11 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "id": "afaf8039",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: OpenAI\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "id": "e49f1e0d",
   "metadata": {},
   "source": [
-    "# ChatOpenAI\n",
+    "# OpenAI\n",
    "\n",
    "This notebook covers how to get started with OpenAI chat models."
   ]
--- a/docs/docs/integrations/chat/pai_eas_chat_endpoint.ipynb
+++ b/docs/docs/integrations/chat/pai_eas_chat_endpoint.ipynb
@@ -1,19 +1,10 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: AliCloud PAI EAS\n",
-    "---"
-   ]
-  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
-    "# PaiEasChatEndpoint\n",
+    "# AliCloud PAI EAS\n",
    "Machine Learning Platform for AI of Alibaba Cloud is a machine learning or deep learning engineering platform intended for enterprises and developers. It provides easy-to-use, cost-effective, high-performance, and easy-to-scale plug-ins that can be applied to various industry scenarios. With over 140 built-in optimization algorithms, Machine Learning Platform for AI provides whole-process AI engineering capabilities including data labeling (PAI-iTAG), model building (PAI-Designer and PAI-DSW), model training (PAI-DLC), compilation optimization, and inference deployment (PAI-EAS). PAI-EAS supports different types of hardware resources, including CPUs and GPUs, and features high throughput and low latency. It allows you to deploy large-scale complex models with a few clicks and perform elastic scale-ins and scale-outs in real time. It also provides a comprehensive O&M and monitoring system."
   ]
  },
--- a/docs/docs/integrations/chat/promptlayer_chatopenai.ipynb
+++ b/docs/docs/integrations/chat/promptlayer_chatopenai.ipynb
@@ -1,22 +1,12 @@
 {
 "cells": [
-  {
-   "cell_type": "raw",
-   "id": "ce3672d3",
-   "metadata": {},
-   "source": [
-    "---\n",
-    "sidebar_label: PromptLayer ChatOpenAI\n",
-    "---"
-   ]
-  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "959300d4",
   "metadata": {},
   "source": [
-    "# PromptLayerChatOpenAI\n",
+    "# PromptLayer ChatOpenAI\n",
    "\n",
    "This example showcases how to connect to [PromptLayer](https://www.promptlayer.com) to start recording your ChatOpenAI requests."
   ]
@@ -129,6 +119,12 @@
    "**The above request should now appear on your [PromptLayer dashboard](https://www.promptlayer.com).**"
   ]
  },
+  {
+   "cell_type": "markdown",
+   "id": "05e9e2fe",
+   "metadata": {},
+   "source": []
+  },
  {
   "attachments": {},
   "cell_type": "markdown",
@@ -146,8 +142,6 @@
   "metadata": {},
   "outputs": [],
   "source": [
-    "import promptlayer\n",
-    "\n",
    "chat = PromptLayerChatOpenAI(return_pl_id=True)\n",
    "chat_results = chat.generate([[HumanMessage(content=\"I am a cat and I want\")]])\n",
    "\n",
@@ -168,7 +162,7 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "Python 3 (ipykernel)",
+   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
@@ -182,7 +176,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.10.12"
+   "version": "3.8.8 (default, Apr 13 2021, 12:59:45) \n[Clang 10.0.0 ]"
  },
  "vscode": {
   "interpreter": {
Author	SHA1	Message	Date
Erick Friis	948e3eaf53	Merge branch 'master' into erick/skip-release-check-cli	2023-11-14 15:16:26 -08:00
Erick Friis	8030dc90be	another if	2023-11-08 08:13:28 -08:00
Erick Friis	366be2936a	remove if	2023-11-08 08:10:35 -08:00
Erick Friis	e5b078d5f7	skip release check	2023-11-08 08:08:59 -08:00