Mirror of https://github.com/hwchase17/langchain.git, synced 2026-02-15 09:39:11 +00:00.

Compare commits: langchain-... vs cc/0.4/doc (43 commits)

91e825b92c, 281488a5cf, 8d2ba88ef0, 45a067509f, 23c3fa65d4, 3c5cc349b6, 5cfb7ce57b, 13d67cf37e, 978119ef3c, dd68b762d9, c784f63701, aed20287af, 5ada33b3e6, 7f989d3c3b, b7968c2b7d, a1c79711b3, 1dc22c602e, 18732e5b8b, 2f0c6421a1, 8f19ca30b0, cfe13f673a, 5599c59d4a, 11d68a0b9e, 566774a893, 255a6d668a, cbf4c0e565, dc66737f03, 499dc35cfb, 42c1159991, cc6139860c, ae8f58ac6f, 346731544b, c1b86cc929, 376f70be96, a369b3aed5, 5eec2207c0, 9b468a10a5, b7494d6566, ac2de920b1, e02eed5489, 5414527236, 881c6534a6, 5e9eb19a83
2 .github/CONTRIBUTING.md (vendored)

@@ -7,4 +7,4 @@ To learn how to contribute to LangChain, please follow the [contribution guide h

## New features

For new features, please start a new [discussion on our forum](https://forum.langchain.com/), where the maintainers will help with scoping out the necessary changes.
For new features, please start a new [discussion](https://forum.langchain.com/), where the maintainers will help with scoping out the necessary changes.
7 .github/ISSUE_TEMPLATE/config.yml (vendored)

@@ -1,9 +1,6 @@
blank_issues_enabled: false
version: 2.1
contact_links:
  - name: Documentation
    url: https://github.com/langchain-ai/docs/issues/new?template=langchain.yml
    about: Report an issue related to the LangChain documentation
  - name: LangChain Forum
    url: https://forum.langchain.com/
    about: General community discussions and support
    url: https://forum.langchain.com/
    about: General community discussions, support, and feature requests
59 .github/ISSUE_TEMPLATE/documentation.yml (vendored, new file)

@@ -0,0 +1,59 @@
name: Documentation
description: Report an issue related to the LangChain documentation.
title: "docs: <Please write a comprehensive title after the 'docs: ' prefix>"
labels: [documentation]

body:
  - type: markdown
    attributes:
      value: |
        Thank you for taking the time to report an issue in the documentation.

        Only report issues with documentation here, explain if there are
        any missing topics or if you found a mistake in the documentation.

        Do **NOT** use this to ask usage questions or reporting issues with your code.

        If you have usage questions or need help solving some problem,
        please use the [LangChain Forum](https://forum.langchain.com/).

        If you're in the wrong place, here are some helpful links to find a better
        place to ask your question:

        * [LangChain Forum](https://forum.langchain.com/),
        * [LangChain Github Issues](https://github.com/langchain-ai/langchain/issues?q=is%3Aissue),
        * [LangChain documentation with the integrated search](https://python.langchain.com/docs/get_started/introduction),
        * [LangChain how-to guides](https://python.langchain.com/docs/how_to/),
        * [API Reference](https://python.langchain.com/api_reference/),
        * [LangChain ChatBot](https://chat.langchain.com/)
        * [GitHub search](https://github.com/langchain-ai/langchain),
  - type: input
    id: url
    attributes:
      label: URL
      description: URL to documentation
    validations:
      required: false
  - type: checkboxes
    id: checks
    attributes:
      label: Checklist
      description: Please confirm and check all the following options.
      options:
        - label: I added a very descriptive title to this issue.
          required: true
        - label: I included a link to the documentation page I am referring to (if applicable).
          required: true
  - type: textarea
    attributes:
      label: "Issue with current documentation:"
      description: >
        Please make sure to leave a reference to the document/code you're
        referring to. Feel free to include names of classes, functions, methods
        or concepts you'd like to see documented more.
  - type: textarea
    attributes:
      label: "Idea or request for content:"
      description: >
        Please describe as clearly as possible what topics you think are missing
        from the current documentation.
11 .github/PULL_REQUEST_TEMPLATE.md (vendored)

@@ -1,5 +1,3 @@
(Replace this entire block of text)

Thank you for contributing to LangChain! Follow these steps to mark your pull request as ready for review. **If any of these steps are not completed, your PR will not be considered for review.**

- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}

@@ -11,13 +9,14 @@ Thank you for contributing to LangChain! Follow these steps to mark your pull re
  - feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert, release
  - Allowed `{SCOPE}` values (optional):
    - core, cli, langchain, standard-tests, docs, anthropic, chroma, deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama, openai, perplexity, prompty, qdrant, xai
  - *Note:* the `{DESCRIPTION}` must not start with an uppercase letter.
  - Note: the `{DESCRIPTION}` must not start with an uppercase letter.
  - Once you've written the title, please delete this checklist item; do not include it in the PR.

- [ ] **PR message**: ***Delete this entire checklist*** and replace with
  - **Description:** a description of the change. Include a [closing keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) if applicable to a relevant issue.
  - **Issue:** the issue # it fixes, if applicable (e.g. Fixes #123)
  - **Dependencies:** any dependencies required for this change
  - **Twitter handle:** if your PR gets announced, and you'd like a mention, we'll gladly shout you out!

- [ ] **Add tests and docs**: If you're adding a new integration, you must include:
  1. A test for the integration, preferably unit tests that do not rely on network access,

@@ -27,7 +26,7 @@ Thank you for contributing to LangChain! Follow these steps to mark your pull re

Additional guidelines:

- Most PRs should not touch more than one package.
- Please do not add dependencies to `pyproject.toml` files (even optional ones) unless they are **required** for unit tests.
- Changes should be backwards compatible.
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
2 .github/scripts/check_diff.py (vendored)

@@ -132,8 +132,6 @@ def _get_configs_for_single_dir(job: str, dir_: str) -> List[Dict[str, str]]:

    elif dir_ == "libs/langchain" and job == "extended-tests":
        py_versions = ["3.9", "3.13"]
    elif dir_ == "libs/langchain_v1":
        py_versions = ["3.10", "3.13"]

    elif dir_ == ".":
        # unable to install with 3.13 because tokenizers doesn't support 3.13 yet

@@ -27,7 +27,7 @@ jobs:
    timeout-minutes: 20
    name: 'Python ${{ inputs.python-version }}'
    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      - name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
        uses: "./.github/actions/uv_setup"
2 .github/workflows/_integration_test.yml (vendored)

@@ -28,7 +28,7 @@ jobs:
    runs-on: ubuntu-latest
    name: 'Python ${{ inputs.python-version }}'
    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      - name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
        uses: "./.github/actions/uv_setup"
2 .github/workflows/_lint.yml (vendored)

@@ -33,7 +33,7 @@ jobs:
    timeout-minutes: 20
    steps:
      - name: '📋 Checkout Code'
        uses: actions/checkout@v5
        uses: actions/checkout@v4

      - name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
        uses: "./.github/actions/uv_setup"
17 .github/workflows/_release.yml (vendored)

@@ -43,7 +43,7 @@ jobs:
      version: ${{ steps.check-version.outputs.version }}

    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      - name: Set up Python + uv
        uses: "./.github/actions/uv_setup"

@@ -92,7 +92,7 @@ jobs:
    outputs:
      release-body: ${{ steps.generate-release-body.outputs.release-body }}
    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4
        with:
          repository: langchain-ai/langchain
          path: langchain

@@ -199,7 +199,7 @@ jobs:
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      # We explicitly *don't* set up caching here. This ensures our tests are
      # maximally sensitive to catching breakage.

@@ -289,8 +289,7 @@ jobs:
        env:
          MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
        run: |
          VIRTUAL_ENV=.venv uv pip install --force-reinstall --editable .
          VIRTUAL_ENV=.venv uv pip install --force-reinstall $MIN_VERSIONS
          VIRTUAL_ENV=.venv uv pip install --force-reinstall $MIN_VERSIONS --editable .
          make tests
        working-directory: ${{ inputs.working-directory }}

@@ -363,7 +362,7 @@ jobs:
      AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
      AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      # We implement this conditional as Github Actions does not have good support
      # for conditionally needing steps. https://github.com/actions/runner/issues/491

@@ -394,7 +393,7 @@ jobs:
          git ls-remote --tags origin "langchain-${{ matrix.partner }}*" \
            | awk '{print $2}' \
            | sed 's|refs/tags/||' \
            | grep -E '[0-9]+\.[0-9]+\.[0-9]+$' \
            | grep -Ev '==[^=]*(\.?dev[0-9]*|\.?rc[0-9]*)$' \
            | sort -Vr \
            | head -n 1
          )"

@@ -441,7 +440,7 @@ jobs:
        working-directory: ${{ inputs.working-directory }}

    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      - name: Set up Python + uv
        uses: "./.github/actions/uv_setup"

@@ -480,7 +479,7 @@ jobs:
        working-directory: ${{ inputs.working-directory }}

    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      - name: Set up Python + uv
        uses: "./.github/actions/uv_setup"
2 .github/workflows/_test.yml (vendored)

@@ -32,7 +32,7 @@ jobs:
    name: 'Python ${{ inputs.python-version }}'
    steps:
      - name: '📋 Checkout Code'
        uses: actions/checkout@v5
        uses: actions/checkout@v4

      - name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
        uses: "./.github/actions/uv_setup"
2 .github/workflows/_test_doc_imports.yml (vendored)

@@ -21,7 +21,7 @@ jobs:
    name: '🔍 Check Doc Imports (Python ${{ inputs.python-version }})'
    steps:
      - name: '📋 Checkout Code'
        uses: actions/checkout@v5
        uses: actions/checkout@v4

      - name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
        uses: "./.github/actions/uv_setup"
2 .github/workflows/_test_pydantic.yml (vendored)

@@ -34,7 +34,7 @@ jobs:
    name: 'Pydantic ~=${{ inputs.pydantic-version }}'
    steps:
      - name: '📋 Checkout Code'
        uses: actions/checkout@v5
        uses: actions/checkout@v4

      - name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
        uses: "./.github/actions/uv_setup"
4 .github/workflows/_test_release.yml (vendored)

@@ -27,7 +27,7 @@ jobs:
      version: ${{ steps.check-version.outputs.version }}

    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      - name: '🐍 Set up Python + UV'
        uses: "./.github/actions/uv_setup"

@@ -83,7 +83,7 @@ jobs:
      id-token: write

    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      - uses: actions/download-artifact@v5
        with:
6 .github/workflows/api_doc_build.yml (vendored)

@@ -17,10 +17,10 @@ jobs:
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4
        with:
          path: langchain
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4
        with:
          repository: langchain-ai/langchain-api-docs-html
          path: langchain-api-docs-html

@@ -72,7 +72,7 @@ jobs:
          done

      - name: '🐍 Setup Python ${{ env.PYTHON_VERSION }}'
        uses: actions/setup-python@v6
        uses: actions/setup-python@v5
        id: setup-python
        with:
          python-version: ${{ env.PYTHON_VERSION }}
2 .github/workflows/check-broken-links.yml (vendored)

@@ -13,7 +13,7 @@ jobs:
    if: github.repository_owner == 'langchain-ai' || github.event_name != 'schedule'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4
      - name: '🟢 Setup Node.js 18.x'
        uses: actions/setup-node@v4
        with:
31 .github/workflows/check_core_versions.yml (vendored)

@@ -16,34 +16,19 @@ jobs:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      - name: '✅ Verify pyproject.toml & version.py Match'
        run: |
          # Check core versions
          CORE_PYPROJECT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' libs/core/pyproject.toml)
          CORE_VERSION_PY_VERSION=$(grep -Po '(?<=^VERSION = ")[^"]*' libs/core/langchain_core/version.py)
          PYPROJECT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' libs/core/pyproject.toml)
          VERSION_PY_VERSION=$(grep -Po '(?<=^VERSION = ")[^"]*' libs/core/langchain_core/version.py)

          # Compare core versions
          if [ "$CORE_PYPROJECT_VERSION" != "$CORE_VERSION_PY_VERSION" ]; then
          # Compare the two versions
          if [ "$PYPROJECT_VERSION" != "$VERSION_PY_VERSION" ]; then
            echo "langchain-core versions in pyproject.toml and version.py do not match!"
            echo "pyproject.toml version: $CORE_PYPROJECT_VERSION"
            echo "version.py version: $CORE_VERSION_PY_VERSION"
            echo "pyproject.toml version: $PYPROJECT_VERSION"
            echo "version.py version: $VERSION_PY_VERSION"
            exit 1
          else
            echo "Core versions match: $CORE_PYPROJECT_VERSION"
          fi

          # Check langchain_v1 versions
          LANGCHAIN_PYPROJECT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' libs/langchain_v1/pyproject.toml)
          LANGCHAIN_INIT_PY_VERSION=$(grep -Po '(?<=^__version__ = ")[^"]*' libs/langchain_v1/langchain/__init__.py)

          # Compare langchain_v1 versions
          if [ "$LANGCHAIN_PYPROJECT_VERSION" != "$LANGCHAIN_INIT_PY_VERSION" ]; then
            echo "langchain_v1 versions in pyproject.toml and __init__.py do not match!"
            echo "pyproject.toml version: $LANGCHAIN_PYPROJECT_VERSION"
            echo "version.py version: $LANGCHAIN_INIT_PY_VERSION"
            exit 1
          else
            echo "Langchain v1 versions match: $LANGCHAIN_PYPROJECT_VERSION"
            echo "Versions match: $PYPROJECT_VERSION"
          fi
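For reference, the same consistency check can be sketched in Python (this is an illustration, not part of the workflow; the file paths assume the monorepo layout):

```python
# Rough Python equivalent of the bash version check above (illustrative only).
import re
from pathlib import Path

def read_version(path: str, pattern: str) -> str:
    # Extract the first capture group of the given pattern from the file.
    match = re.search(pattern, Path(path).read_text(), re.MULTILINE)
    if match is None:
        raise ValueError(f"no version string found in {path}")
    return match.group(1)

pyproject = read_version("libs/core/pyproject.toml", r'^version = "([^"]*)"')
version_py = read_version("libs/core/langchain_core/version.py", r'^VERSION = "([^"]*)"')

if pyproject != version_py:
    raise SystemExit(f"version mismatch: {pyproject} != {version_py}")
print(f"Core versions match: {pyproject}")
```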
6 .github/workflows/check_diffs.yml (vendored)

@@ -33,9 +33,9 @@ jobs:
    if: ${{ !contains(github.event.pull_request.labels.*.name, 'ci-ignore') }}
    steps:
      - name: '📋 Checkout Code'
        uses: actions/checkout@v5
        uses: actions/checkout@v4
      - name: '🐍 Setup Python 3.11'
        uses: actions/setup-python@v6
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: '📂 Get Changed Files'

@@ -138,7 +138,7 @@ jobs:
      run:
        working-directory: ${{ matrix.job-configs.working-directory }}
    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      - name: '🐍 Set up Python ${{ matrix.job-configs.python-version }} + UV'
        uses: "./.github/actions/uv_setup"
4 .github/workflows/check_new_docs.yml (vendored)

@@ -22,8 +22,8 @@ jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
      - uses: actions/setup-python@v6
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - id: files
4 .github/workflows/codspeed.yml (vendored)

@@ -36,7 +36,7 @@ jobs:
      fail-fast: false

    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      # We have to use 3.12 as 3.13 is not yet supported
      - name: '📦 Install UV Package Manager'

@@ -44,7 +44,7 @@ jobs:
        with:
          python-version: "3.12"

      - uses: actions/setup-python@v6
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
2 .github/workflows/people.yml (vendored)

@@ -19,7 +19,7 @@ jobs:
        env:
          GITHUB_CONTEXT: ${{ toJson(github) }}
        run: echo "$GITHUB_CONTEXT"
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4
      # Ref: https://github.com/actions/runner/issues/2033
      - name: '🔧 Fix Git Safe Directory in Container'
        run: mkdir -p /home/runner/work/_temp/_github_home && printf "[safe]\n\tdirectory = /github/workspace" > /home/runner/work/_temp/_github_home/.gitconfig
2 .github/workflows/pr_lint.yml (vendored)

@@ -62,7 +62,7 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - name: '✅ Validate Conventional Commits Format'
        uses: amannn/action-semantic-pull-request@v6
        uses: amannn/action-semantic-pull-request@v5
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
6 .github/workflows/run_notebooks.yml (vendored)

@@ -26,7 +26,7 @@ jobs:
    if: github.repository == 'langchain-ai/langchain' || github.event_name != 'schedule'
    name: '📑 Test Documentation Notebooks'
    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4

      - name: '🐍 Set up Python + UV'
        uses: "./.github/actions/uv_setup"

@@ -35,12 +35,12 @@ jobs:

      - name: '🔐 Authenticate to Google Cloud'
        id: 'auth'
        uses: google-github-actions/auth@v3
        uses: google-github-actions/auth@v2
        with:
          credentials_json: '${{ secrets.GOOGLE_CREDENTIALS }}'

      - name: '🔐 Configure AWS Credentials'
        uses: aws-actions/configure-aws-credentials@v5
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
12 .github/workflows/scheduled_test.yml (vendored)

@@ -20,7 +20,7 @@ env:
  POETRY_VERSION: "1.8.4"
  UV_FROZEN: "true"
  DEFAULT_LIBS: '["libs/partners/openai", "libs/partners/anthropic", "libs/partners/fireworks", "libs/partners/groq", "libs/partners/mistralai", "libs/partners/xai", "libs/partners/google-vertexai", "libs/partners/google-genai", "libs/partners/aws"]'
  POETRY_LIBS: ("libs/partners/aws")
  POETRY_LIBS: ("libs/partners/google-vertexai" "libs/partners/google-genai" "libs/partners/aws")

jobs:
  # Generate dynamic test matrix based on input parameters or defaults

@@ -68,14 +68,14 @@ jobs:
      working-directory: ${{ fromJSON(needs.compute-matrix.outputs.matrix).working-directory }}

    steps:
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4
        with:
          path: langchain
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4
        with:
          repository: langchain-ai/langchain-google
          path: langchain-google
      - uses: actions/checkout@v5
      - uses: actions/checkout@v4
        with:
          repository: langchain-ai/langchain-aws
          path: langchain-aws

@@ -106,12 +106,12 @@ jobs:

      - name: '🔐 Authenticate to Google Cloud'
        id: 'auth'
        uses: google-github-actions/auth@v3
        uses: google-github-actions/auth@v2
        with:
          credentials_json: '${{ secrets.GOOGLE_CREDENTIALS }}'

      - name: '🔐 Configure AWS Credentials'
        uses: aws-actions/configure-aws-credentials@v5
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
18 README.md

@@ -8,12 +8,16 @@
<br>
</div>

[](https://github.com/langchain-ai/langchain/releases)
[](https://github.com/langchain-ai/langchain/actions/workflows/check_diffs.yml)
[](https://opensource.org/licenses/MIT)
[](https://pypistats.org/packages/langchain-core)
[](https://pypistats.org/packages/langchain-core)
[](https://star-history.com/#langchain-ai/langchain)
[](https://github.com/langchain-ai/langchain/issues)
[](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
[<img src="https://github.com/codespaces/badge.svg" alt="Open in Github Codespace" title="Open in Github Codespace" width="150" height="20">](https://codespaces.new/langchain-ai/langchain)
[](https://codspeed.io/langchain-ai/langchain)
[](https://twitter.com/langchainai)
[](https://codspeed.io/langchain-ai/langchain)

> [!NOTE]
> Looking for the JS/TS library? Check out [LangChain.js](https://github.com/langchain-ai/langchainjs).

@@ -41,7 +45,7 @@ interface for models, embeddings, vector stores, and more.
Use LangChain for:

- **Real-time data augmentation**. Easily connect LLMs to diverse data sources and
external/internal systems, drawing from LangChain’s vast library of integrations with
external / internal systems, drawing from LangChain’s vast library of integrations with
model providers, tools, vector stores, retrievers, and more.
- **Model interoperability**. Swap models in and out as your engineering team
experiments to find the best choice for your application’s needs. As the industry

@@ -56,7 +60,7 @@ applications.

To improve your LLM application development, pair LangChain with:

- [LangSmith](https://www.langchain.com/langsmith) - Helpful for agent evals and
- [LangSmith](http://www.langchain.com/langsmith) - Helpful for agent evals and
observability. Debug poor-performing LLM app runs, evaluate agent trajectories, gain
visibility in production, and improve performance over time.
- [LangGraph](https://langchain-ai.github.io/langgraph/) - Build agents that can

@@ -64,8 +68,9 @@ reliably handle complex tasks with LangGraph, our low-level agent orchestration
framework. LangGraph offers customizable architecture, long-term memory, and
human-in-the-loop workflows — and is trusted in production by companies like LinkedIn,
Uber, Klarna, and GitLab.
- [LangGraph Platform](https://docs.langchain.com/langgraph-platform) - Deploy
and scale agents effortlessly with a purpose-built deployment platform for long-running, stateful workflows. Discover, reuse, configure, and share agents across
- [LangGraph Platform](https://langchain-ai.github.io/langgraph/concepts/langgraph_platform/) - Deploy
and scale agents effortlessly with a purpose-built deployment platform for long
running, stateful workflows. Discover, reuse, configure, and share agents across
teams — and iterate quickly with visual prototyping in
[LangGraph Studio](https://langchain-ai.github.io/langgraph/concepts/langgraph_studio/).

@@ -80,4 +85,3 @@ concepts behind the LangChain framework.
- [LangChain Forum](https://forum.langchain.com/): Connect with the community and share all of your technical questions, ideas, and feedback.
- [API Reference](https://python.langchain.com/api_reference/): Detailed reference on
navigating base packages and integrations for LangChain.
- [Chat LangChain](https://chat.langchain.com/): Ask questions & chat with our documentation.
@@ -4,9 +4,9 @@ LangChain has a large ecosystem of integrations with various external resources

## Best practices

When building such applications, developers should remember to follow good security practices:
When building such applications developers should remember to follow good security practices:

* [**Limit Permissions**](https://en.wikipedia.org/wiki/Principle_of_least_privilege): Scope permissions specifically to the application's need. Granting broad or excessive permissions can introduce significant security vulnerabilities. To avoid such vulnerabilities, consider using read-only credentials, disallowing access to sensitive resources, using sandboxing techniques (such as running inside a container), specifying proxy configurations to control external requests, etc., as appropriate for your application.
* [**Limit Permissions**](https://en.wikipedia.org/wiki/Principle_of_least_privilege): Scope permissions specifically to the application's need. Granting broad or excessive permissions can introduce significant security vulnerabilities. To avoid such vulnerabilities, consider using read-only credentials, disallowing access to sensitive resources, using sandboxing techniques (such as running inside a container), specifying proxy configurations to control external requests, etc. as appropriate for your application.
* **Anticipate Potential Misuse**: Just as humans can err, so can Large Language Models (LLMs). Always assume that any system access or credentials may be used in any way allowed by the permissions they are assigned. For example, if a pair of database credentials allows deleting data, it's safest to assume that any LLM able to use those credentials may in fact delete data.
* [**Defense in Depth**](https://en.wikipedia.org/wiki/Defense_in_depth_(computing)): No security technique is perfect. Fine-tuning and good chain design can reduce, but not eliminate, the odds that a Large Language Model (LLM) may make a mistake. It's best to combine multiple layered security approaches rather than relying on any single layer of defense to ensure security. For example: use both read-only permissions and sandboxing to ensure that LLMs are only able to access data that is explicitly meant for them to use.
@@ -67,7 +67,8 @@ All out of scope targets defined by huntr as well as:
  for more details, but generally tools interact with the real world. Developers are
  expected to understand the security implications of their code and are responsible
  for the security of their tools.
* Code documented with security notices. This will be decided on a case-by-case basis, but likely will not be eligible for a bounty as the code is already
* Code documented with security notices. This will be decided on a case by
  case basis, but likely will not be eligible for a bounty as the code is already
  documented with guidelines for developers that should be followed for making their
  application secure.
* Any LangSmith related repositories or APIs (see [Reporting LangSmith Vulnerabilities](#reporting-langsmith-vulnerabilities)).
@@ -64,4 +64,3 @@ Notebook | Description
[visual_RAG_vdms.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/visual_RAG_vdms.ipynb) | Performs Visual Retrieval-Augmented-Generation (RAG) using videos and scene descriptions generated by open source models.
[contextual_rag.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/contextual_rag.ipynb) | Performs contextual retrieval-augmented generation (RAG) prepending chunk-specific explanatory context to each chunk before embedding.
[rag-agents-locally-on-intel-cpu.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/local_rag_agents_intel_cpu.ipynb) | Build a RAG agent locally with open source models that routes questions through one of two paths to find answers. The agent generates answers based on documents retrieved from either the vector database or retrieved from web search. If the vector database lacks relevant information, the agent opts for web search. Open-source models for LLM and embeddings are used locally on an Intel Xeon CPU to execute this pipeline.
[rag_mlflow_tracking_evaluation.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/rag_mlflow_tracking_evaluation.ipynb) | Guide on how to create a RAG pipeline and track + evaluate it with MLflow.
@@ -79,17 +79,6 @@
    "tool_executor = ToolExecutor(tools)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "168152fc",
   "metadata": {},
   "source": [
    "📘 **Note on `SystemMessage` usage with LangGraph-based agents**\n",
    "\n",
    "When constructing the `messages` list for an agent, you *must* manually include any `SystemMessage`s.\n",
    "Unlike some agent executors in LangChain that set a default, LangGraph requires explicit inclusion."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fe6e8f78-1ef7-42ad-b2bf-835ed5850553",
@@ -1,455 +0,0 @@
The removed file is a Jupyter notebook; its cells are rendered below.

# RAG Pipeline with MLflow Tracking, Tracing & Evaluation

This notebook demonstrates how to build a complete Retrieval-Augmented Generation (RAG) pipeline using LangChain and integrate it with MLflow for experiment tracking, tracing, and evaluation.

- **RAG Pipeline Construction**: Build a complete RAG system using LangChain components
- **MLflow Integration**: Track experiments, parameters, and artifacts
- **Tracing**: Monitor inputs, outputs, retrieved documents, scores, prompts, and timings
- **Evaluation**: Use MLflow's built-in scorers to assess RAG performance
- **Best Practices**: Implement proper configuration management and reproducible experiments

We'll build a RAG system that can answer questions about academic papers by:
1. Loading and chunking documents from ArXiv
2. Creating embeddings and a vector store
3. Setting up a retrieval-augmented generation chain
4. Tracking all experiments with MLflow
5. Evaluating the system's performance

#### Setup

```python
%pip install -U langchain mlflow langchain-community arxiv pymupdf langchain-text-splitters langchain-openai
```

```python
import os
import mlflow
from mlflow.genai.scorers import RelevanceToQuery, Correctness, ExpectationsGuidelines
from langchain_community.document_loaders import ArxivLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
```

```python
os.environ["OPENAI_API_KEY"] = "<YOUR OPENAI API KEY>"

mlflow.set_experiment("LangChain-RAG-MLflow")
mlflow.langchain.autolog()
```

Define all hyperparameters and configuration in a centralized dictionary. This makes it easy to:
- Track different experiment configurations
- Reproduce results
- Perform hyperparameter tuning

**Key Parameters**:
- `chunk_size`: Size of text chunks for document splitting
- `chunk_overlap`: Overlap between consecutive chunks
- `retriever_k`: Number of documents to retrieve
- `embeddings_model`: OpenAI embedding model
- `llm`: Language model for generation
- `temperature`: Sampling temperature for the LLM

```python
CONFIG = {
    "chunk_size": 400,
    "chunk_overlap": 80,
    "retriever_k": 3,
    "embeddings_model": "text-embedding-3-small",
    "system_prompt": "You are a helpful assistant. Use the following context to answer the question. Use three sentences maximum and keep the answer concise.",
    "llm": "gpt-5-nano",
    "temperature": 0,
}
```

#### ArXiv Document Loading and Processing

```python
# Load documents from ArXiv
loader = ArxivLoader(
    query="1706.03762",
    load_max_docs=1,
)
docs = loader.load()
print(docs[0].metadata)

# Split documents into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=CONFIG["chunk_size"],
    chunk_overlap=CONFIG["chunk_overlap"],
)
chunks = splitter.split_documents(docs)


# Join chunks into a single string
def join_chunks(chunks):
    return "\n\n".join([chunk.page_content for chunk in chunks])
```

Output:
```
{'Published': '2023-08-02', 'Title': 'Attention Is All You Need', 'Authors': 'Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin', 'Summary': 'The dominant sequence transduction models are based on complex recurrent or\nconvolutional neural networks in an encoder-decoder configuration. The best\nperforming models also connect the encoder and decoder through an attention\nmechanism. We propose a new simple network architecture, the Transformer, based\nsolely on attention mechanisms, dispensing with recurrence and convolutions\nentirely. Experiments on two machine translation tasks show these models to be\nsuperior in quality while being more parallelizable and requiring significantly\nless time to train. Our model achieves 28.4 BLEU on the WMT 2014\nEnglish-to-German translation task, improving over the existing best results,\nincluding ensembles by over 2 BLEU. On the WMT 2014 English-to-French\ntranslation task, our model establishes a new single-model state-of-the-art\nBLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction\nof the training costs of the best models from the literature. We show that the\nTransformer generalizes well to other tasks by applying it successfully to\nEnglish constituency parsing both with large and limited training data.'}
```

#### Vector Store and Retriever Setup

```python
# Create embeddings
embeddings = OpenAIEmbeddings(model=CONFIG["embeddings_model"])

# Create vector store from documents
vectorstore = InMemoryVectorStore.from_documents(
    chunks,
    embedding=embeddings,
)

# Create retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": CONFIG["retriever_k"]})
```

#### RAG Chain Construction using [LCEL](https://python.langchain.com/docs/concepts/lcel/)

Flow:
1. Query → Retriever (finds relevant chunks)
2. Chunks → join_chunks (creates context)
3. Context + Query → Prompt Template
4. Prompt → Language Model → Response

```python
# Initialize the language model
llm = ChatOpenAI(model=CONFIG["llm"], temperature=CONFIG["temperature"])

# Create the prompt template
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", CONFIG["system_prompt"] + "\n\nContext:\n{context}\n\n"),
        ("human", "\n{question}\n"),
    ]
)

# Construct the RAG chain
rag_chain = (
    {
        "context": retriever | RunnableLambda(join_chunks),
        "question": RunnablePassthrough(),
    }
    | prompt
    | llm
    | StrOutputParser()
)
```

#### Prediction Function with MLflow Tracing

Create a prediction function decorated with `@mlflow.trace` to automatically log:
- Input queries
- Retrieved documents
- Generated responses
- Execution time
- Chain intermediate steps

```python
@mlflow.trace
def predict_fn(question: str) -> str:
    return rag_chain.invoke(question)


# Test the prediction function
sample_question = "What is the main idea of the paper?"
response = predict_fn(sample_question)
print(f"Question: {sample_question}")
print(f"Response: {response}")
```

Output:
```
Question: What is the main idea of the paper?
Response: The main idea is to replace recurrent/convolutional sequence models with a pure attention-based architecture called the Transformer. It uses self-attention to model dependencies between all positions in the input and output, enabling full parallelization and better handling of long-range relations. This approach achieves strong results on translation and can extend to other modalities.
```

#### Evaluation Dataset and Scoring

Define an evaluation dataset and run systematic evaluation using [MLflow's built-in scorers](https://mlflow.org/docs/latest/genai/eval-monitor/scorers/llm-judge/predefined/#available-scorers):

<u>Evaluation Components:</u>
- **Dataset**: Questions with expected concepts and facts
- **Scorers**:
  - `RelevanceToQuery`: Measures how relevant the response is to the question
  - `Correctness`: Evaluates factual accuracy of the response
  - `ExpectationsGuidelines`: Checks that output matches expectation guidelines

<u>Best Practices:</u>
- Create diverse test cases covering different query types
- Include expected concepts to guide evaluation
- Use multiple scoring metrics for comprehensive assessment

```python
# Define evaluation dataset
eval_dataset = [
    {
        "inputs": {"question": "What is the main idea of the paper?"},
        "expectations": {
            "key_concepts": ["attention mechanism", "transformer", "neural network"],
            "expected_facts": [
                "attention mechanism is a key component of the transformer model"
            ],
            "guidelines": ["The response must be factual and concise"],
        },
    },
    {
        "inputs": {
            "question": "What's the difference between a transformer and a recurrent neural network?"
        },
        "expectations": {
            "key_concepts": ["sequential", "attention mechanism", "hidden state"],
            "expected_facts": [
                "transformer processes data in parallel while RNN processes data sequentially"
            ],
            "guidelines": [
                "The response must be factual and focus on the difference between the two models"
            ],
        },
    },
    {
        "inputs": {"question": "What does the attention mechanism do?"},
        "expectations": {
            "key_concepts": ["query", "key", "value", "relationship", "similarity"],
            "expected_facts": [
                "attention allows the model to weigh the importance of different parts of the input sequence when processing it"
            ],
            "guidelines": [
                "The response must be factual and explain the concept of attention"
            ],
        },
    },
]

# Run evaluation with MLflow
with mlflow.start_run(run_name="baseline_eval") as run:
    # Log configuration parameters
    mlflow.log_params(CONFIG)

    # Run evaluation
    results = mlflow.genai.evaluate(
        data=eval_dataset,
        predict_fn=predict_fn,
        scorers=[RelevanceToQuery(), Correctness(), ExpectationsGuidelines()],
    )
```

Output:
```
2025/08/23 20:14:39 INFO mlflow.models.evaluation.utils.trace: Auto tracing is temporarily enabled during the model evaluation for computing some metrics and debugging. To disable tracing, call `mlflow.autolog(disable=True)`.
2025/08/23 20:14:39 INFO mlflow.genai.utils.data_validation: Testing model prediction with the first sample in the dataset.
Evaluating: 0%|          | 0/3 [Elapsed: 00:00, Remaining: ?]

✨ Evaluation completed.

Metrics and evaluation results are logged to the MLflow run:
  Run name: baseline_eval
  Run ID: a2218d9f24c9415f8040d3b77af103a9

To view the detailed evaluation results with sample-wise scores,
open the Traces tab in the Run page in the MLflow UI.
```

#### Launch MLflow UI to check out the results

<u>What you'll see in the UI:</u>
- **Experiments**: Compare different RAG configurations
- **Runs**: Individual experiment runs with metrics and parameters
- **Traces**: Detailed execution traces showing retrieval and generation steps
- **Evaluation Results**: Scoring metrics and detailed comparisons
- **Artifacts**: Saved models, datasets, and other files

Navigate to `http://localhost:5000` after running the command below.

```python
!mlflow ui
```

You should see something like this

[screenshot of the MLflow UI, lost in the scrape]
@@ -97,7 +97,7 @@ def skip_private_members(app, what, name, obj, skip, options):
    if hasattr(obj, "__doc__") and obj.__doc__ and ":private:" in obj.__doc__:
        return True
    if name == "__init__" and obj.__objclass__ is object:
        # don't document default init
        # dont document default init
        return True
    return None
@@ -217,7 +217,11 @@ def _load_package_modules(
        # Get the full namespace of the module
        namespace = str(relative_module_name).replace(".py", "").replace("/", ".")
        # Keep only the top level namespace
        top_namespace = namespace.split(".")[0]
        # (but make special exception for content_blocks and v1.messages)
        if namespace == "messages.content_blocks" or namespace == "v1.messages":
            top_namespace = namespace  # Keep full namespace for content_blocks
        else:
            top_namespace = namespace.split(".")[0]

        try:
            # If submodule is present, we need to construct the paths in a slightly
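As a quick illustration of the special-casing in the hunk above, a toy reproduction of the namespace logic (the module paths are made up for the example):

```python
# Toy reproduction of the top-namespace logic shown in the hunk above.
def top_namespace(relative_module_name: str) -> str:
    # Strip the ".py" suffix and convert the path into a dotted namespace.
    namespace = relative_module_name.replace(".py", "").replace("/", ".")
    # Keep the full namespace only for the two special-cased modules.
    if namespace in ("messages.content_blocks", "v1.messages"):
        return namespace
    return namespace.split(".")[0]

print(top_namespace("messages/content_blocks.py"))  # messages.content_blocks
print(top_namespace("runnables/base.py"))           # runnables
```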
@@ -545,14 +549,7 @@ def _build_index(dirs: List[str]) -> None:
        "ai21": "AI21",
        "ibm": "IBM",
    }
    ordered = [
        "core",
        "langchain",
        "text-splitters",
        "community",
        "experimental",
        "standard-tests",
    ]
    ordered = ["core", "langchain", "text-splitters", "community", "experimental"]
    main_ = [dir_ for dir_ in ordered if dir_ in dirs]
    integrations = sorted(dir_ for dir_ in dirs if dir_ not in main_)
    doc = """# LangChain Python API Reference
File diff suppressed because one or more lines are too long
@@ -31,7 +31,7 @@ The conceptual guide does not cover step-by-step instructions or specific implem
- **[Vector stores](/docs/concepts/vectorstores)**: Storage of and efficient search over vectors and associated metadata.
- **[Retriever](/docs/concepts/retrievers)**: A component that returns relevant documents from a knowledge base in response to a query.
- **[Retrieval Augmented Generation (RAG)](/docs/concepts/rag)**: A technique that enhances language models by combining them with external knowledge bases.
- **[Agents](/docs/concepts/agents)**: Use a [language model](/docs/concepts/chat_models) to choose a sequence of actions to take. Agents can interact with external resources via [tools](/docs/concepts/tools).
- **[Agents](/docs/concepts/agents)**: Use a [language model](/docs/concepts/chat_models) to choose a sequence of actions to take. Agents can interact with external resources via [tool](/docs/concepts/tools).
- **[Prompt templates](/docs/concepts/prompt_templates)**: Component for factoring out the static parts of a model "prompt" (usually a sequence of messages). Useful for serializing, versioning, and reusing these static parts.
- **[Output parsers](/docs/concepts/output_parsers)**: Responsible for taking the output of a model and transforming it into a more suitable format for downstream tasks. Output parsers were primarily useful prior to the general availability of [tool calling](/docs/concepts/tool_calling) and [structured outputs](/docs/concepts/structured_outputs).
- **[Few-shot prompting](/docs/concepts/few_shot_prompting)**: A technique for improving model performance by providing a few examples of the task to perform in the prompt.
@@ -48,7 +48,7 @@ The conceptual guide does not cover step-by-step instructions or specific implem
- **[AIMessage](/docs/concepts/messages#aimessage)**: Represents a complete response from an AI model.
- **[astream_events](/docs/concepts/chat_models#key-methods)**: Stream granular information from [LCEL](/docs/concepts/lcel) chains.
- **[BaseTool](/docs/concepts/tools/#tool-interface)**: The base class for all tools in LangChain.
- **[batch](/docs/concepts/runnables)**: Used to execute a runnable with batch inputs.
- **[batch](/docs/concepts/runnables)**: Use to execute a runnable with batch inputs.
- **[bind_tools](/docs/concepts/tool_calling/#tool-binding)**: Allows models to interact with tools.
- **[Caching](/docs/concepts/chat_models#caching)**: Storing results to avoid redundant calls to a chat model.
- **[Chat models](/docs/concepts/multimodality/#multimodality-in-chat-models)**: Chat models that handle multiple data modalities.
@@ -147,7 +147,7 @@ An `AIMessage` has the following attributes. The attributes which are **standard
| `tool_calls` | Standardized | Tool calls associated with the message. See [tool calling](/docs/concepts/tool_calling) for details. |
| `invalid_tool_calls` | Standardized | Tool calls with parsing errors associated with the message. See [tool calling](/docs/concepts/tool_calling) for details. |
| `usage_metadata` | Standardized | Usage metadata for a message, such as [token counts](/docs/concepts/tokens). See [Usage Metadata API Reference](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.UsageMetadata.html). |
| `id` | Standardized | An optional unique identifier for the message, ideally provided by the provider/model that created the message. See [Message IDs](#message-ids) for details. |
| `id` | Standardized | An optional unique identifier for the message, ideally provided by the provider/model that created the message. |
| `response_metadata` | Raw | Response metadata, e.g., response headers, logprobs, token counts. |
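To make the attribute categories concrete, a minimal sketch of reading them (this assumes langchain-openai is installed and `OPENAI_API_KEY` is set; the model name is an assumption):

```python
# Illustrative sketch: standardized vs. raw attributes on an AIMessage.
from langchain_openai import ChatOpenAI

msg = ChatOpenAI(model="gpt-4o-mini").invoke("Hello!")
print(msg.content)            # standardized: the response text
print(msg.tool_calls)         # standardized: [] when no tools were called
print(msg.usage_metadata)     # standardized: e.g. input/output token counts
print(msg.response_metadata)  # raw: provider-specific details such as finish_reason
```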
#### content
@@ -243,37 +243,3 @@ At the moment, the output of the model will be in terms of LangChain messages, s
need OpenAI format for the output as well.

The [convert_to_openai_messages](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.convert_to_openai_messages.html) utility function can be used to convert from LangChain messages to OpenAI format.
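A minimal sketch of that conversion (assumes langchain-core; exact dict contents can vary by version):

```python
# Convert LangChain messages to OpenAI-style role/content dicts.
from langchain_core.messages import AIMessage, HumanMessage, convert_to_openai_messages

messages = [
    HumanMessage(content="What's the capital of France?"),
    AIMessage(content="The capital of France is Paris."),
]

print(convert_to_openai_messages(messages))
# [{'role': 'user', 'content': "What's the capital of France?"},
#  {'role': 'assistant', 'content': 'The capital of France is Paris.'}]
```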
## Message IDs

LangChain messages include an optional `id` field that serves as a unique identifier. Understanding when and how these IDs are assigned can be helpful for debugging, tracing, and working with message history.

### When Messages Get IDs

Messages receive IDs in the following scenarios:

**Automatically assigned by LangChain:**
- When generated through chat model invocation (`.invoke()`, `.stream()`, `.astream()`) with an active run manager/tracing context
- IDs follow the format:
  - `run-$RUN_ID` (e.g., `run-ba48f958-6402-41a5-b461-5e250a4ebd36-0`)
  - `run-$RUN_ID-$IDX` (e.g., `run-ba48f958-6402-41a5-b461-5e250a4ebd36-1`) when there are multiple generations from a single chat model invocation.

**Provider-assigned IDs (highest priority):**
- When the model provider assigns its own ID to the message
- These take precedence over LangChain-generated run IDs
- Format varies by provider

### When Messages Don't Get IDs

Messages will **not** receive IDs in these situations:

- **Manual message creation**: Messages created directly (e.g., `AIMessage(content="hello")`) without going through chat models
- **No run manager context**: When there's no active callback/tracing infrastructure

### ID Priority System

LangChain follows a clear precedence system for message IDs:

1. **Provider-assigned IDs** (highest priority): IDs from the model provider
2. **LangChain run IDs** (medium priority): IDs starting with `run-`
3. **Manual IDs** (lowest priority): IDs explicitly set by users
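A minimal sketch of these rules, using manual construction only (no model call is made, so nothing overrides the IDs here):

```python
from langchain_core.messages import AIMessage

# Manual creation: no chat model or tracing context, so no ID is assigned.
manual = AIMessage(content="hello")
print(manual.id)  # None

# An explicitly set ID is kept here, but it has the lowest priority overall;
# a provider- or run-assigned ID would take precedence for model-generated messages.
tagged = AIMessage(content="hello", id="my-custom-id")
print(tagged.id)  # my-custom-id
```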
@@ -53,29 +53,17 @@ This is how you use MessagesPlaceholder.

```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage
from langchain_core.messages import HumanMessage

prompt_template = ChatPromptTemplate([
    ("system", "You are a helpful assistant"),
    MessagesPlaceholder("msgs")
])

# Simple example with one message
prompt_template.invoke({"msgs": [HumanMessage(content="hi!")]})

# More complex example with conversation history
messages_to_pass = [
    HumanMessage(content="What's the capital of France?"),
    AIMessage(content="The capital of France is Paris."),
    HumanMessage(content="And what about Germany?")
]

formatted_prompt = prompt_template.invoke({"msgs": messages_to_pass})
print(formatted_prompt)
```

This will produce a list of four messages total: the system message plus the three messages we passed in (two HumanMessages and one AIMessage).
This will produce a list of two messages, the first one being a system message, and the second one being the HumanMessage we passed in.
If we had passed in 5 messages, then it would have produced 6 messages in total (the system message plus the 5 passed in).
This is useful for letting a list of messages be slotted into a particular spot.
@@ -29,22 +29,6 @@ model_with_structure = model.with_structured_output(schema)
|
||||
structured_output = model_with_structure.invoke(user_input)
|
||||
```
|
||||
|
||||
:::warning[Tool Order Matters]
|
||||
|
||||
When combining structured output with additional tools, bind tools **first**, then apply structured output:
|
||||
|
||||
```python
|
||||
# Correct
|
||||
model_with_tools = model.bind_tools([tool1, tool2])
|
||||
structured_model = model_with_tools.with_structured_output(schema)
|
||||
|
||||
# Incorrect - will cause tool resolution errors
|
||||
structured_model = model.with_structured_output(schema)
|
||||
broken_model = structured_model.bind_tools([tool1, tool2])
|
||||
```
|
||||
|
||||
:::
|
||||
|
||||
## Schema definition
|
||||
|
||||
The central concept is that the output structure of model responses needs to be represented in some way.
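A Pydantic model is one common way to express that structure. A minimal sketch, reusing the `model` from the snippet above (the schema itself is illustrative):

```python
from pydantic import BaseModel, Field


class Joke(BaseModel):
    """A joke to tell the user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")


structured_model = model.with_structured_output(Joke)
structured_model.invoke("Tell me a joke about cats")
# -> Joke(setup='...', punchline='...')
```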
|
||||
|
||||
@@ -171,26 +171,6 @@ Please see the [InjectedState](https://langchain-ai.github.io/langgraph/referenc
|
||||
|
||||
Please see the [InjectedStore](https://langchain-ai.github.io/langgraph/reference/prebuilt/#langgraph.prebuilt.tool_node.InjectedStore) documentation for more details.
|
||||
|
||||
## Tool Artifacts vs. Injected State
|
||||
|
||||
Although similar conceptually, tool artifacts in LangChain and [injected state in LangGraph](https://langchain-ai.github.io/langgraph/reference/agents/#langgraph.prebuilt.tool_node.InjectedState) serve different purposes and operate at different levels of abstraction.
|
||||
|
||||
**Tool Artifacts**
|
||||
|
||||
- **Purpose:** Store and pass data between tool executions within a single chain/workflow
|
||||
- **Scope:** Limited to tool-to-tool communication
|
||||
- **Lifecycle:** Tied to individual tool calls and their immediate context
|
||||
- **Usage:** Temporary storage for intermediate results that tools need to share
|
||||
|
||||
**Injected State (LangGraph)**
|
||||
|
||||
- **Purpose:** Maintain persistent state across the entire graph execution
|
||||
- **Scope:** Global to the entire graph workflow
|
||||
- **Lifecycle:** Persists throughout the entire graph execution and can be saved/restored
|
||||
- **Usage:** Long-term state management, conversation memory, user context, workflow checkpointing
|
||||
|
||||
Tool artifacts are ephemeral data passed between tools, while injected state is persistent workflow-level state that survives across multiple steps, tool calls, and even execution sessions in LangGraph.
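As a concrete illustration of the tool-artifact side, here is a minimal sketch using the `content_and_artifact` response format, where the string content is sent back to the model and the artifact is kept for downstream code (the tool itself is hypothetical):

```python
import random

from langchain_core.tools import tool


@tool(response_format="content_and_artifact")
def generate_numbers(n: int) -> tuple[str, list[int]]:
    """Generate n random single-digit numbers."""
    numbers = [random.randint(0, 9) for _ in range(n)]
    # First element goes to the model; second is the artifact
    return f"Generated {n} numbers", numbers


# Invoking with a ToolCall-shaped dict yields a ToolMessage carrying the artifact
message = generate_numbers.invoke(
    {"name": "generate_numbers", "args": {"n": 3}, "id": "call_1", "type": "tool_call"}
)
print(message.content)   # 'Generated 3 numbers'
print(message.artifact)  # e.g. [4, 0, 7]
```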
|
||||
|
||||
## Best practices
|
||||
|
||||
When designing tools to be used by models, keep the following in mind:
|
||||
|
||||
@@ -7,4 +7,4 @@ Traces contain individual steps called `runs`. These can be individual calls fro
|
||||
tool, or sub-chains.
|
||||
Tracing gives you observability inside your chains and agents, and is vital in diagnosing issues.
|
||||
|
||||
For a deeper dive, check out [this LangSmith conceptual guide](https://docs.langchain.com/langsmith/observability-quickstart).
|
||||
For a deeper dive, check out [this LangSmith conceptual guide](https://docs.smith.langchain.com/concepts/tracing).
|
||||
|
||||
@@ -3,9 +3,9 @@
|
||||
Here are some things to keep in mind for all types of contributions:
|
||||
|
||||
- Follow the ["fork and pull request"](https://docs.github.com/en/get-started/exploring-projects-on-github/contributing-to-a-project) workflow.
|
||||
- Fill out the checked-in pull request template when opening pull requests. Note related issues.
|
||||
- Fill out the checked-in pull request template when opening pull requests. Note related issues and tag relevant maintainers.
|
||||
- Ensure your PR passes formatting, linting, and testing checks before requesting a review.
|
||||
- If you would like comments or feedback on your current progress, please open an issue or discussion.
|
||||
- If you would like comments or feedback on your current progress, please open an issue or discussion and tag a maintainer.
|
||||
- See the sections on [Testing](setup.mdx#testing) and [Formatting and Linting](setup.mdx#formatting-and-linting) for how to run these checks locally.
|
||||
- Backwards compatibility is key. Your changes must not be breaking, except in the case of critical bug and security fixes.
|
||||
- Look for duplicate PRs or issues that have already been opened before opening a new one.
|
||||
|
||||
@@ -223,49 +223,6 @@ If codespell is incorrectly flagging a word, you can skip spellcheck for that wo
|
||||
ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogyny,unsecure'
|
||||
```
|
||||
|
||||
### Pre-commit
|
||||
|
||||
We use [pre-commit](https://pre-commit.com/) to ensure commits are formatted/linted.
|
||||
|
||||
#### Installing Pre-commit
|
||||
|
||||
First, install pre-commit:
|
||||
|
||||
```bash
|
||||
# Option 1: Using uv (recommended)
|
||||
uv tool install pre-commit
|
||||
|
||||
# Option 2: Using Homebrew (globally for macOS/Linux)
|
||||
brew install pre-commit
|
||||
|
||||
# Option 3: Using pip
|
||||
pip install pre-commit
|
||||
```
|
||||
|
||||
Then install the git hook scripts:
|
||||
|
||||
```bash
|
||||
pre-commit install
|
||||
```
|
||||
|
||||
#### How Pre-commit Works
|
||||
|
||||
Once installed, pre-commit will automatically run on every `git commit`. Hooks are specified in `.pre-commit-config.yaml` and will:
|
||||
|
||||
- Format code using `ruff` for the specific library/package you're modifying
|
||||
- Only run on files that have changed
|
||||
- Prevent commits if formatting fails
|
||||
|
||||
#### Skipping Pre-commit
|
||||
|
||||
In exceptional cases, you can skip pre-commit hooks with:
|
||||
|
||||
```bash
|
||||
git commit --no-verify
|
||||
```
|
||||
|
||||
However, this is discouraged as the CI system will still enforce the same formatting rules.
|
||||
|
||||
## Working with optional dependencies
|
||||
|
||||
`langchain`, `langchain-community`, and `langchain-experimental` rely on optional dependencies to keep these packages lightweight.
|
||||
|
||||
@@ -79,7 +79,7 @@ Here are some high-level tips on writing a good how-to guide:
|
||||
|
||||
### Conceptual guide
|
||||
|
||||
LangChain's conceptual guides fall under the **Explanation** quadrant of Diataxis. These guides should cover LangChain terms and concepts
|
||||
LangChain's conceptual guide falls under the **Explanation** quadrant of Diataxis. These guides should cover LangChain terms and concepts
|
||||
in a more abstract way than how-to guides or tutorials, targeting curious users interested in
|
||||
gaining a deeper understanding of, and insights into, the framework. Try to avoid excessively large code examples, as the primary goal is to
|
||||
provide perspective to the user rather than to finish a practical project. These guides should cover **why** things work the way they do.
|
||||
@@ -105,7 +105,7 @@ Here are some high-level tips on writing a good conceptual guide:
|
||||
### References
|
||||
|
||||
References contain detailed, low-level information that describes exactly what functionality exists and how to use it.
|
||||
In LangChain, these are mainly our API reference pages, which are populated from docstrings within code.
|
||||
In LangChain, this is mainly our API reference pages, which are populated from docstrings within code.
|
||||
References pages are generally not read end-to-end, but are consulted as necessary when a user needs to know
|
||||
how to use something specific.
|
||||
|
||||
@@ -119,7 +119,7 @@ but here are some high-level tips on writing a good docstring:
|
||||
- Be concise
|
||||
- Discuss special cases and deviations from a user's expectations
|
||||
- Go into detail on required inputs and outputs
|
||||
- Light details on when one might use the feature are fine, but in-depth details belong in other sections
|
||||
- Light details on when one might use the feature are fine, but in-depth details belong in other sections.
|
||||
|
||||
Each category serves a distinct purpose and requires a specific approach to writing and structuring the content.
|
||||
|
||||
@@ -127,17 +127,17 @@ Each category serves a distinct purpose and requires a specific approach to writ
|
||||
|
||||
Here are some other guidelines you should think about when writing and organizing documentation.
|
||||
|
||||
We generally do not merge new tutorials from outside contributors without an acute need.
|
||||
We generally do not merge new tutorials from outside contributors without an actue need.
|
||||
We welcome updates as well as new integration docs, how-tos, and references.
|
||||
|
||||
### Avoid duplication
|
||||
|
||||
Multiple pages that cover the same material in depth are difficult to maintain and cause confusion. There should
|
||||
be only one (very rarely two) canonical pages for a given concept or feature. Instead, you should link to other guides.
|
||||
be only one (very rarely two), canonical pages for a given concept or feature. Instead, you should link to other guides.
|
||||
|
||||
### Link to other sections
|
||||
|
||||
Because sections of the docs do not exist in a vacuum, it is important to link to other sections frequently
|
||||
Because sections of the docs do not exist in a vacuum, it is important to link to other sections frequently,
|
||||
to allow a developer to learn more about an unfamiliar topic within the flow of reading.
|
||||
|
||||
This includes linking to the API references and conceptual sections!
|
||||
|
||||
@@ -33,7 +33,7 @@ Sometimes you want to make a small change, like fixing a typo, and the easiest w
|
||||
- Click the "Commit changes..." button at the top-right corner of the page.
|
||||
- Give your commit a title like "Fix typo in X section."
|
||||
- Optionally, write an extended commit description.
|
||||
- Click "Propose changes".
|
||||
- Click "Propose changes"
|
||||
|
||||
5. **Submit a pull request (PR):**
|
||||
- GitHub will redirect you to a page where you can create a pull request.
|
||||
|
||||
@@ -159,7 +159,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 8,
|
||||
"id": "321e3036-abd2-4e1f-bcc6-606efd036954",
|
||||
"metadata": {
|
||||
"execution": {
|
||||
@@ -183,7 +183,7 @@
|
||||
],
|
||||
"source": [
|
||||
"configurable_model.invoke(\n",
|
||||
" \"what's your name\", config={\"configurable\": {\"model\": \"claude-3-5-sonnet-latest\"}}\n",
|
||||
" \"what's your name\", config={\"configurable\": {\"model\": \"claude-3-5-sonnet-20240620\"}}\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
@@ -234,7 +234,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 7,
|
||||
"id": "6c8755ba-c001-4f5a-a497-be3f1db83244",
|
||||
"metadata": {
|
||||
"execution": {
|
||||
@@ -261,7 +261,7 @@
|
||||
" \"what's your name\",\n",
|
||||
" config={\n",
|
||||
" \"configurable\": {\n",
|
||||
" \"first_model\": \"claude-3-5-sonnet-latest\",\n",
|
||||
" \"first_model\": \"claude-3-5-sonnet-20240620\",\n",
|
||||
" \"first_temperature\": 0.5,\n",
|
||||
" \"first_max_tokens\": 100,\n",
|
||||
" }\n",
|
||||
@@ -336,7 +336,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 9,
|
||||
"id": "e57dfe9f-cd24-4e37-9ce9-ccf8daf78f89",
|
||||
"metadata": {
|
||||
"execution": {
|
||||
@@ -368,14 +368,14 @@
|
||||
"source": [
|
||||
"llm_with_tools.invoke(\n",
|
||||
" \"what's bigger in 2024 LA or NYC\",\n",
|
||||
" config={\"configurable\": {\"model\": \"claude-3-5-sonnet-latest\"}},\n",
|
||||
" config={\"configurable\": {\"model\": \"claude-3-5-sonnet-20240620\"}},\n",
|
||||
").tool_calls"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "langchain-monorepo",
|
||||
"display_name": "langchain",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
@@ -389,7 +389,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.12.11"
|
||||
"version": "3.10.16"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|
||||
@@ -741,13 +741,13 @@
|
||||
"\n",
|
||||
"If you're using tools with agents, you will likely need an error handling strategy, so the agent can recover from the error and continue execution.\n",
|
||||
"\n",
|
||||
"A simple strategy is to throw a `ToolException` from inside the tool and specify an error handler using `handle_tool_errors`. \n",
|
||||
"A simple strategy is to throw a `ToolException` from inside the tool and specify an error handler using `handle_tool_error`. \n",
|
||||
"\n",
|
||||
"When the error handler is specified, the exception will be caught and the error handler will decide which output to return from the tool.\n",
|
||||
"\n",
|
||||
"You can set `handle_tool_errors` to `True`, a string value, or a function. If it's a function, the function should take a `ToolException` as a parameter and return a value.\n",
|
||||
"You can set `handle_tool_error` to `True`, a string value, or a function. If it's a function, the function should take a `ToolException` as a parameter and return a value.\n",
|
||||
"\n",
|
||||
"Please note that only raising a `ToolException` won't be effective. You need to first set the `handle_tool_errors` of the tool because its default value is `False`."
|
||||
"Please note that only raising a `ToolException` won't be effective. You need to first set the `handle_tool_error` of the tool because its default value is `False`."
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -777,7 +777,7 @@
|
||||
"id": "9d93b217-1d44-4d31-8956-db9ea680ff4f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Here's an example with the default `handle_tool_errors=True` behavior."
|
||||
"Here's an example with the default `handle_tool_error=True` behavior."
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -807,7 +807,7 @@
|
||||
"source": [
|
||||
"get_weather_tool = StructuredTool.from_function(\n",
|
||||
" func=get_weather,\n",
|
||||
" handle_tool_errors=True,\n",
|
||||
" handle_tool_error=True,\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"get_weather_tool.invoke({\"city\": \"foobar\"})"
|
||||
@@ -818,7 +818,7 @@
|
||||
"id": "f91d6dc0-3271-4adc-a155-21f2e62ffa56",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"We can set `handle_tool_errors` to a string that will always be returned."
|
||||
"We can set `handle_tool_error` to a string that will always be returned."
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -848,7 +848,7 @@
|
||||
"source": [
|
||||
"get_weather_tool = StructuredTool.from_function(\n",
|
||||
" func=get_weather,\n",
|
||||
" handle_tool_errors=\"There is no such city, but it's probably above 0K there!\",\n",
|
||||
" handle_tool_error=\"There is no such city, but it's probably above 0K there!\",\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"get_weather_tool.invoke({\"city\": \"foobar\"})"
|
||||
@@ -893,7 +893,7 @@
|
||||
"\n",
|
||||
"get_weather_tool = StructuredTool.from_function(\n",
|
||||
" func=get_weather,\n",
|
||||
" handle_tool_errors=_handle_error,\n",
|
||||
" handle_tool_error=_handle_error,\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"get_weather_tool.invoke({\"city\": \"foobar\"})"
|
||||
|
||||
@@ -565,7 +565,7 @@
|
||||
"id": "3ac2c37a-06a1-40d3-a192-9078eb83994b",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<table><thead><tr><th colspan=\"3\">Table 1: Current layout detection models in the LayoutParser model zoo</th></tr><tr><th>Dataset</th><th>Base Model1</th><th>Large Model Notes</th></tr></thead><tbody><tr><td>PubLayNet [38]</td><td>F/M</td><td>Layouts of modern scientific documents</td></tr><tr><td>PRImA</td><td>M</td><td>Layouts of scanned modern magazines and scientific reports</td></tr><tr><td>Newspaper</td><td>F</td><td>Layouts of scanned US newspapers from the 20th century</td></tr><tr><td>TableBank [18]</td><td>F</td><td>Table region on modern scientific and business document</td></tr><tr><td>HJDataset</td><td>F/M</td><td>Layouts of history Japanese documents</td></tr></tbody></table>"
|
||||
"<table><thead><tr><th colspan=\"3\">able 1. LUllclll 1ayoul actCCLloll 1110AdCs 111 L1C LayoOulralsel 1110U4cl 200</th></tr><tr><th>Dataset</th><th>| Base Model\\'|</th><th>Notes</th></tr></thead><tbody><tr><td>PubLayNet [38]</td><td>F/M</td><td>Layouts of modern scientific documents</td></tr><tr><td>PRImA</td><td>M</td><td>Layouts of scanned modern magazines and scientific reports</td></tr><tr><td>Newspaper</td><td>F</td><td>Layouts of scanned US newspapers from the 20th century</td></tr><tr><td>TableBank [18]</td><td>F</td><td>Table region on modern scientific and business document</td></tr><tr><td>HJDataset</td><td>F/M</td><td>Layouts of history Japanese documents</td></tr></tbody></table>"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
@@ -122,13 +122,13 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"\n",
|
||||
"from langchain_experimental.graph_transformers import LLMGraphTransformer\n",
|
||||
"# from langchain_experimental.graph_transformers import LLMGraphTransformer\n",
|
||||
"from langchain_openai import ChatOpenAI\n",
|
||||
"\n",
|
||||
"llm = ChatOpenAI(temperature=0, model_name=\"gpt-4-turbo\")\n",
|
||||
|
||||
@@ -5,7 +5,7 @@ sidebar_class_name: hidden
|
||||
|
||||
# How-to guides
|
||||
|
||||
Here you’ll find answers to "How do I….?" types of questions.
|
||||
Here you’ll find answers to “How do I….?” types of questions.
|
||||
These guides are *goal-oriented* and *concrete*; they're meant to help you complete a specific task.
|
||||
For conceptual explanations see the [Conceptual guide](/docs/concepts/).
|
||||
For end-to-end walkthroughs see [Tutorials](/docs/tutorials).
|
||||
@@ -47,7 +47,7 @@ See [supported integrations](/docs/integrations/chat/) for details on getting st
|
||||
- [How to: use chat model to call tools](/docs/how_to/tool_calling)
|
||||
- [How to: stream tool calls](/docs/how_to/tool_streaming)
|
||||
- [How to: handle rate limits](/docs/how_to/chat_model_rate_limiting)
|
||||
- [How to: few-shot prompt tool behavior](/docs/how_to/tools_few_shot)
|
||||
- [How to: few shot prompt tool behavior](/docs/how_to/tools_few_shot)
|
||||
- [How to: bind model-specific formatted tools](/docs/how_to/tools_model_specific)
|
||||
- [How to: force a specific tool call](/docs/how_to/tool_choice)
|
||||
- [How to: pass multimodal data directly to models](/docs/how_to/multimodal_inputs/)
|
||||
@@ -64,8 +64,8 @@ See [supported integrations](/docs/integrations/chat/) for details on getting st
|
||||
|
||||
[Prompt Templates](/docs/concepts/prompt_templates) are responsible for formatting user input into a format that can be passed to a language model.
|
||||
|
||||
- [How to: use few-shot examples](/docs/how_to/few_shot_examples)
|
||||
- [How to: use few-shot examples in chat models](/docs/how_to/few_shot_examples_chat/)
|
||||
- [How to: use few shot examples](/docs/how_to/few_shot_examples)
|
||||
- [How to: use few shot examples in chat models](/docs/how_to/few_shot_examples_chat/)
|
||||
- [How to: partially format prompt templates](/docs/how_to/prompts_partial)
|
||||
- [How to: compose prompts together](/docs/how_to/prompts_composition)
|
||||
- [How to: use multimodal prompts](/docs/how_to/multimodal_prompts/)
|
||||
@@ -168,7 +168,7 @@ See [supported integrations](/docs/integrations/vectorstores/) for details on ge
|
||||
|
||||
Indexing is the process of keeping your vectorstore in-sync with the underlying data source.
|
||||
|
||||
- [How to: reindex data to keep your vectorstore in sync with the underlying data source](/docs/how_to/indexing)
|
||||
- [How to: reindex data to keep your vectorstore in-sync with the underlying data source](/docs/how_to/indexing)
|
||||
|
||||
### Tools
|
||||
|
||||
@@ -178,7 +178,7 @@ LangChain [Tools](/docs/concepts/tools) contain a description of the tool (to pa
|
||||
- [How to: use built-in tools and toolkits](/docs/how_to/tools_builtin)
|
||||
- [How to: use chat models to call tools](/docs/how_to/tool_calling)
|
||||
- [How to: pass tool outputs to chat models](/docs/how_to/tool_results_pass_to_model)
|
||||
- [How to: pass runtime values to tools](/docs/how_to/tool_runtime)
|
||||
- [How to: pass run time values to tools](/docs/how_to/tool_runtime)
|
||||
- [How to: add a human-in-the-loop for tools](/docs/how_to/tools_human)
|
||||
- [How to: handle tool errors](/docs/how_to/tools_error)
|
||||
- [How to: force models to call a tool](/docs/how_to/tool_choice)
|
||||
@@ -297,7 +297,7 @@ For a high-level tutorial, check out [this guide](/docs/tutorials/sql_qa/).
|
||||
You can use an LLM to do question answering over graph databases.
|
||||
For a high-level tutorial, check out [this guide](/docs/tutorials/graph/).
|
||||
|
||||
- [How to: add a semantic layer over a database](/docs/how_to/graph_semantic)
|
||||
- [How to: add a semantic layer over the database](/docs/how_to/graph_semantic)
|
||||
- [How to: construct knowledge graphs](/docs/how_to/graph_constructing)
|
||||
|
||||
### Summarization
|
||||
@@ -345,7 +345,7 @@ LangGraph is an extension of LangChain aimed at
|
||||
building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.
|
||||
|
||||
LangGraph documentation is currently hosted on a separate site.
|
||||
You can find the [LangGraph guides here](https://langchain-ai.github.io/langgraph/guides/).
|
||||
You can peruse [LangGraph how-to guides here](https://langchain-ai.github.io/langgraph/how-tos/).
|
||||
|
||||
## [LangSmith](https://docs.smith.langchain.com/)
|
||||
|
||||
|
||||
@@ -61,7 +61,7 @@
|
||||
" * document addition by id (`add_documents` method with `ids` argument)\n",
|
||||
" * delete by id (`delete` method with `ids` argument)\n",
|
||||
"\n",
|
||||
"Compatible Vectorstores: `AnalyticDB`, `AstraDB`, `AwaDB`, `AzureCosmosDBNoSqlVectorSearch`, `AzureCosmosDBVectorSearch`, `AzureSearch`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `Milvus`, `MongoDBAtlasVectorSearch`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SingleStoreDB`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `Vald`, `VDMS`, `Vearch`, `VespaStore`, `Weaviate`, `Yellowbrick`, `ZepVectorStore`, `TencentVectorDB`, `OpenSearchVectorSearch`.\n",
|
||||
"Compatible Vectorstores: `Aerospike`, `AnalyticDB`, `AstraDB`, `AwaDB`, `AzureCosmosDBNoSqlVectorSearch`, `AzureCosmosDBVectorSearch`, `AzureSearch`, `Bagel`, `Cassandra`, `Chroma`, `CouchbaseVectorStore`, `DashVector`, `DatabricksVectorSearch`, `DeepLake`, `Dingo`, `ElasticVectorSearch`, `ElasticsearchStore`, `FAISS`, `HanaDB`, `Milvus`, `MongoDBAtlasVectorSearch`, `MyScale`, `OpenSearchVectorSearch`, `PGVector`, `Pinecone`, `Qdrant`, `Redis`, `Rockset`, `ScaNN`, `SingleStoreDB`, `SupabaseVectorStore`, `SurrealDBStore`, `TimescaleVector`, `Vald`, `VDMS`, `Vearch`, `VespaStore`, `Weaviate`, `Yellowbrick`, `ZepVectorStore`, `TencentVectorDB`, `OpenSearchVectorSearch`.\n",
|
||||
" \n",
|
||||
"## Caution\n",
|
||||
"\n",
|
||||
|
||||
@@ -45,7 +45,7 @@
|
||||
"A few frameworks for this have emerged to support inference of open-source LLMs on various devices:\n",
|
||||
"\n",
|
||||
"1. [`llama.cpp`](https://github.com/ggerganov/llama.cpp): C++ implementation of llama inference code with [weight optimization / quantization](https://finbarr.ca/how-is-llama-cpp-possible/)\n",
|
||||
"2. [`gpt4all`](https://docs.gpt4all.io/index.html): Optimized C backend for inference\n",
|
||||
"2. [`gpt4all`](https://github.com/nomic-ai/gpt4all): Optimized C backend for inference\n",
|
||||
"3. [`ollama`](https://github.com/ollama/ollama): Bundles model weights and environment into an app that runs on device and serves the LLM\n",
|
||||
"4. [`llamafile`](https://github.com/Mozilla-Ocho/llamafile): Bundles model weights and everything needed to run the model in a single file, allowing you to run the LLM locally from this file without any additional installation steps\n",
|
||||
"\n",
|
||||
@@ -74,12 +74,12 @@
|
||||
"\n",
|
||||
"## Quickstart\n",
|
||||
"\n",
|
||||
"[Ollama](https://ollama.com/) is one way to easily run inference on macOS.\n",
|
||||
"[Ollama](https://ollama.ai/) is one way to easily run inference on macOS.\n",
|
||||
" \n",
|
||||
"The instructions [here](https://github.com/ollama/ollama?tab=readme-ov-file#ollama) provide details, which we summarize:\n",
|
||||
" \n",
|
||||
"* [Download and run](https://ollama.ai/download) the app\n",
|
||||
"* From command line, fetch a model from this [list of options](https://ollama.com/search): e.g., `ollama pull gpt-oss:20b`\n",
|
||||
"* From command line, fetch a model from this [list of options](https://ollama.com/search): e.g., `ollama pull llama3.1:8b`\n",
|
||||
"* When the app is running, all models are automatically served on `localhost:11434`\n"
|
||||
]
|
||||
},
|
||||
@@ -95,7 +95,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 2,
|
||||
"id": "86178adb",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -113,7 +113,7 @@
|
||||
"source": [
|
||||
"from langchain_ollama import ChatOllama\n",
|
||||
"\n",
|
||||
"llm = ChatOllama(model=\"gpt-oss:20b\", validate_model_on_init=True)\n",
|
||||
"llm = ChatOllama(model=\"gpt-oss:20b\")\n",
|
||||
"\n",
|
||||
"llm.invoke(\"The first man on the moon was ...\").content"
|
||||
]
|
||||
@@ -149,40 +149,7 @@
|
||||
],
|
||||
"source": [
|
||||
"for chunk in llm.stream(\"The first man on the moon was ...\"):\n",
|
||||
" print(chunk, end=\"|\", flush=True)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "e5731060",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Ollama also includes a chat model wrapper that handles formatting conversation turns:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"id": "f14a778a",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"AIMessage(content='The answer is a historic one!\\n\\nThe first man to walk on the Moon was Neil Armstrong, an American astronaut and commander of the Apollo 11 mission. On July 20, 1969, Armstrong stepped out of the lunar module Eagle onto the surface of the Moon, famously declaring:\\n\\n\"That\\'s one small step for man, one giant leap for mankind.\"\\n\\nArmstrong was followed by fellow astronaut Edwin \"Buzz\" Aldrin, who also walked on the Moon during the mission. Michael Collins remained in orbit around the Moon in the command module Columbia.\\n\\nNeil Armstrong passed away on August 25, 2012, but his legacy as a pioneering astronaut and engineer continues to inspire people around the world!', response_metadata={'model': 'llama3.1:8b', 'created_at': '2024-08-01T00:38:29.176717Z', 'message': {'role': 'assistant', 'content': ''}, 'done_reason': 'stop', 'done': True, 'total_duration': 10681861417, 'load_duration': 34270292, 'prompt_eval_count': 19, 'prompt_eval_duration': 6209448000, 'eval_count': 141, 'eval_duration': 4432022000}, id='run-7bed57c5-7f54-4092-912c-ae49073dcd48-0', usage_metadata={'input_tokens': 19, 'output_tokens': 141, 'total_tokens': 160})"
|
||||
]
|
||||
},
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from langchain_ollama import ChatOllama\n",
|
||||
"\n",
|
||||
"chat_model = ChatOllama(model=\"llama3.1:8b\")\n",
|
||||
"\n",
|
||||
"chat_model.invoke(\"Who was the first man on the moon?\")"
|
||||
" print(chunk.text(), end=\"|\", flush=True)"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -212,7 +179,7 @@
|
||||
"\n",
|
||||
"In particular, ensure that conda is using the correct virtual environment that you created (`miniforge3`).\n",
|
||||
"\n",
|
||||
"e.g., for me:\n",
|
||||
"e.g.,\n",
|
||||
"\n",
|
||||
"```shell\n",
|
||||
"conda activate /Users/rlm/miniforge3/envs/llama\n",
|
||||
@@ -234,18 +201,18 @@
|
||||
"\n",
|
||||
"There are various ways to gain access to quantized model weights.\n",
|
||||
"\n",
|
||||
"1. [`HuggingFace`](https://huggingface.co/TheBloke) - Many quantized model are available for download and can be run with framework such as [`llama.cpp`](https://github.com/ggerganov/llama.cpp). You can also download models in [`llamafile` format](https://huggingface.co/models?other=llamafile) from HuggingFace.\n",
|
||||
"2. [`gpt4all`](https://gpt4all.io/index.html) - The model explorer offers a leaderboard of metrics and associated quantized models available for download \n",
|
||||
"3. [`ollama`](https://github.com/jmorganca/ollama) - Several models can be accessed directly via `pull`\n",
|
||||
"1. [HuggingFace](https://huggingface.co/TheBloke) - Many quantized model are available for download and can be run with framework such as [`llama.cpp`](https://github.com/ggerganov/llama.cpp). You can also download models in [`llamafile` format](https://huggingface.co/models?other=llamafile) from HuggingFace.\n",
|
||||
"2. [gpt4all](https://gpt4all.io/index.html) - The model explorer offers a leaderboard of metrics and associated quantized models available for download \n",
|
||||
"3. [ollama](https://github.com/ollama/ollama) - Several models can be accessed directly via `pull`\n",
|
||||
"\n",
|
||||
"### Ollama\n",
|
||||
"\n",
|
||||
"With [Ollama](https://github.com/ollama/ollama), fetch a model via `ollama pull <model family>:<tag>`."
|
||||
"With [Ollama](https://github.com/ollama/ollama), fetch a model via `ollama pull <model family>:<tag>`:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 42,
|
||||
"id": "8ecd2f78",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -680,17 +647,11 @@
|
||||
"\n",
|
||||
"In addition, [here](https://blog.langchain.dev/using-langsmith-to-support-fine-tuning-of-open-source-llms/) is an overview on fine-tuning, which can utilize open-source LLMs."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "14c2c170",
|
||||
"metadata": {},
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "langchain",
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
@@ -704,7 +665,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.12.11"
|
||||
"version": "3.10.5"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|
||||
@@ -74,12 +74,12 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"execution_count": null,
|
||||
"id": "a88ff70c",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_experimental.text_splitter import SemanticChunker\n",
|
||||
"# from langchain_experimental.text_splitter import SemanticChunker\n",
|
||||
"from langchain_openai.embeddings import OpenAIEmbeddings\n",
|
||||
"\n",
|
||||
"text_splitter = SemanticChunker(OpenAIEmbeddings())"
|
||||
|
||||
@@ -612,56 +612,11 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 18,
|
||||
"execution_count": null,
|
||||
"id": "35ea904e-795f-411b-bef8-6484dbb6e35c",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"\n",
|
||||
"\n",
|
||||
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
|
||||
"\u001b[32;1m\u001b[1;3m\n",
|
||||
"Invoking: `python_repl_ast` with `{'query': \"df[['Age', 'Fare']].corr().iloc[0,1]\"}`\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"\u001b[0m\u001b[36;1m\u001b[1;3m0.11232863699941621\u001b[0m\u001b[32;1m\u001b[1;3m\n",
|
||||
"Invoking: `python_repl_ast` with `{'query': \"df[['Fare', 'Survived']].corr().iloc[0,1]\"}`\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"\u001b[0m\u001b[36;1m\u001b[1;3m0.2561785496289603\u001b[0m\u001b[32;1m\u001b[1;3mThe correlation between Age and Fare is approximately 0.112, and the correlation between Fare and Survival is approximately 0.256.\n",
|
||||
"\n",
|
||||
"Therefore, the correlation between Fare and Survival (0.256) is greater than the correlation between Age and Fare (0.112).\u001b[0m\n",
|
||||
"\n",
|
||||
"\u001b[1m> Finished chain.\u001b[0m\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"{'input': \"What's the correlation between age and fare? is that greater than the correlation between fare and survival?\",\n",
|
||||
" 'output': 'The correlation between Age and Fare is approximately 0.112, and the correlation between Fare and Survival is approximately 0.256.\\n\\nTherefore, the correlation between Fare and Survival (0.256) is greater than the correlation between Age and Fare (0.112).'}"
|
||||
]
|
||||
},
|
||||
"execution_count": 18,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from langchain_experimental.agents import create_pandas_dataframe_agent\n",
|
||||
"\n",
|
||||
"agent = create_pandas_dataframe_agent(\n",
|
||||
" llm, df, agent_type=\"openai-tools\", verbose=True, allow_dangerous_code=True\n",
|
||||
")\n",
|
||||
"agent.invoke(\n",
|
||||
" {\n",
|
||||
" \"input\": \"What's the correlation between age and fare? is that greater than the correlation between fare and survival?\"\n",
|
||||
" }\n",
|
||||
")"
|
||||
]
|
||||
"outputs": [],
|
||||
"source": "from langchain_experimental.agents import create_pandas_dataframe_agent\n\nagent = create_pandas_dataframe_agent(\n llm, df, agent_type=\"openai-tools\", verbose=True, allow_dangerous_code=True\n)\nagent.invoke(\n {\n \"input\": \"What's the correlation between age and fare? is that greater than the correlation between fare and survival?\"\n }\n)"
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
@@ -786,4 +741,4 @@
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 5
|
||||
}
|
||||
}
|
||||
@@ -998,91 +998,6 @@
|
||||
"\n",
|
||||
"chain.invoke({\"query\": query})"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "xfejabhtn2",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Combining with Additional Tools\n",
|
||||
"\n",
|
||||
"When you need to use both structured output and additional tools (like web search), note the order of operations:\n",
|
||||
"\n",
|
||||
"**Correct Order**:\n",
|
||||
"```python\n",
|
||||
"# 1. Bind tools first\n",
|
||||
"llm_with_tools = llm.bind_tools([web_search_tool, calculator_tool])\n",
|
||||
"\n",
|
||||
"# 2. Apply structured output\n",
|
||||
"structured_llm = llm_with_tools.with_structured_output(MySchema)\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"**Incorrect Order**:\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"# This will fail with \"Tool 'MySchema' not found\" error\n",
|
||||
"structured_llm = llm.with_structured_output(MySchema)\n",
|
||||
"broken_llm = structured_llm.bind_tools([web_search_tool])\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "653798ca",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**Why Order Matters:**\n",
|
||||
"`with_structured_output()` internally uses tool calling to enforce the schema. When you bind additional tools afterward, it creates a conflict in the tool resolution system."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1345f4a4",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**Complete Example:**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "0835637b",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from pydantic import BaseModel, Field\n",
|
||||
"from langchain_openai import ChatOpenAI\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"class SearchResult(BaseModel):\n",
|
||||
" \"\"\"Structured search result.\"\"\"\n",
|
||||
"\n",
|
||||
" query: str = Field(description=\"The search query\")\n",
|
||||
" findings: str = Field(description=\"Summary of findings\")\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# Define tools\n",
|
||||
"search_tool = {\n",
|
||||
" \"type\": \"function\",\n",
|
||||
" \"function\": {\n",
|
||||
" \"name\": \"web_search\",\n",
|
||||
" \"description\": \"Search the web for information\",\n",
|
||||
" \"parameters\": {\n",
|
||||
" \"type\": \"object\",\n",
|
||||
" \"properties\": {\"query\": {\"type\": \"string\", \"description\": \"Search query\"}},\n",
|
||||
" \"required\": [\"query\"],\n",
|
||||
" },\n",
|
||||
" },\n",
|
||||
"}\n",
|
||||
"\n",
|
||||
"# Correct approach\n",
|
||||
"llm = ChatOpenAI()\n",
|
||||
"llm_with_search = llm.bind_tools([search_tool])\n",
|
||||
"structured_search_llm = llm_with_search.with_structured_output(SearchResult)\n",
|
||||
"\n",
|
||||
"# Now you can use both search and get structured output\n",
|
||||
"result = structured_search_llm.invoke(\"Search for latest AI research and summarize\")"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
|
||||
@@ -147,7 +147,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 5,
|
||||
"id": "74de0286-b003-4b48-9cdd-ecab435515ca",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -157,7 +157,7 @@
|
||||
"\n",
|
||||
"from langchain_anthropic import ChatAnthropic\n",
|
||||
"\n",
|
||||
"llm = ChatAnthropic(model=\"claude-3-5-sonnet-latest\", temperature=0)"
|
||||
"llm = ChatAnthropic(model=\"claude-3-5-sonnet-20240620\", temperature=0)"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
@@ -55,7 +55,7 @@
|
||||
"source": [
|
||||
"## Defining tool schemas\n",
|
||||
"\n",
|
||||
"For a model to be able to call tools, we need to pass in tool schemas that describe what the tool does and what its arguments are. Chat models that support tool calling features implement a `.bind_tools()` method for passing tool schemas to the model. Tool schemas can be passed in as Python functions (with typehints and docstrings), Pydantic models, TypedDict classes, or LangChain [Tool objects](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.base.BaseTool.html#basetool). Subsequent invocations of the model will pass in these tool schemas along with the prompt.\n",
|
||||
"For a model to be able to call tools, we need to pass in tool schemas that describe what the tool does and what it's arguments are. Chat models that support tool calling features implement a `.bind_tools()` method for passing tool schemas to the model. Tool schemas can be passed in as Python functions (with typehints and docstrings), Pydantic models, TypedDict classes, or LangChain [Tool objects](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.base.BaseTool.html#basetool). Subsequent invocations of the model will pass in these tool schemas along with the prompt.\n",
|
||||
"\n",
|
||||
"### Python functions\n",
|
||||
"Our tool schemas can be Python functions:"
|
||||
|
||||
@@ -38,7 +38,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
@@ -53,7 +53,7 @@
|
||||
"if \"ANTHROPIC_API_KEY\" not in os.environ:\n",
|
||||
" os.environ[\"ANTHROPIC_API_KEY\"] = getpass()\n",
|
||||
"\n",
|
||||
"model = ChatAnthropic(model=\"claude-3-5-sonnet-latest\", temperature=0)"
|
||||
"model = ChatAnthropic(model=\"claude-3-5-sonnet-20240620\", temperature=0)"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
@@ -53,7 +53,7 @@
|
||||
"\n",
|
||||
"To keep the most recent messages, we set `strategy=\"last\"`. We'll also set `include_system=True` to include the `SystemMessage`, and `start_on=\"human\"` to make sure the resulting chat history is valid. \n",
|
||||
"\n",
|
||||
"This is a good default configuration when using `trim_messages` based on token count. Remember to adjust `token_counter` and `max_tokens` for your use case. Keep in mind that new queries added to the chat history will be included in the token count unless you trim prior to adding the new query.\n",
|
||||
"This is a good default configuration when using `trim_messages` based on token count. Remember to adjust `token_counter` and `max_tokens` for your use case.\n",
|
||||
"\n",
|
||||
"Notice that for our `token_counter` we can pass in a function (more on that below) or a language model (since language models have a message token counting method). It makes sense to pass in a model when you're trimming your messages to fit into the context window of that specific model:"
|
||||
]
|
||||
@@ -525,7 +525,7 @@
|
||||
"id": "4d91d390-e7f7-467b-ad87-d100411d7a21",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Looking at [the LangSmith trace](https://smith.langchain.com/public/65af12c4-c24d-4824-90f0-6547566e59bb/r) we can see that before the messages are passed to the model they are first trimmed.\n",
|
||||
"Looking at the LangSmith trace we can see that before the messages are passed to the model they are first trimmed: https://smith.langchain.com/public/65af12c4-c24d-4824-90f0-6547566e59bb/r\n",
|
||||
"\n",
|
||||
"Looking at just the trimmer, we can see that it's a Runnable object that can be invoked like all Runnables:"
|
||||
]
|
||||
@@ -620,7 +620,7 @@
|
||||
"id": "556b7b4c-43cb-41de-94fc-1a41f4ec4d2e",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Looking at [the LangSmith trace](https://smith.langchain.com/public/17dd700b-9994-44ca-930c-116e00997315/r) we can see that we retrieve all of our messages but before the messages are passed to the model they are trimmed to be just the system message and last human message."
|
||||
"Looking at the LangSmith trace we can see that we retrieve all of our messages but before the messages are passed to the model they are trimmed to be just the system message and last human message: https://smith.langchain.com/public/17dd700b-9994-44ca-930c-116e00997315/r"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -630,7 +630,7 @@
|
||||
"source": [
|
||||
"## API reference\n",
|
||||
"\n",
|
||||
"For a complete description of all arguments head to the [API reference](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.trim_messages.html)."
|
||||
"For a complete description of all arguments head to the API reference: https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.trim_messages.html"
|
||||
]
|
||||
}
|
||||
],
|
||||
|
||||
@@ -7,7 +7,10 @@
|
||||
"source": [
|
||||
"# Confident\n",
|
||||
"\n",
|
||||
">[DeepEval](https://confident-ai.com) package for unit testing LLMs."
|
||||
">[DeepEval](https://confident-ai.com) package for unit testing LLMs.\n",
|
||||
"> Using Confident, everyone can build robust language models through faster iterations\n",
|
||||
"> using both unit testing and integration testing. We provide support for each step in the iteration\n",
|
||||
"> from synthetic data creation to testing.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -39,7 +42,7 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!pip install deepeval langchain langchain-openai"
|
||||
"%pip install --upgrade --quiet langchain langchain-openai langchain-community deepeval langchain-chroma"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -61,29 +64,11 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 11,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/html": [
|
||||
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">🎉🥳 Congratulations! You've successfully logged in! 🙌 \n",
|
||||
"</pre>\n"
|
||||
],
|
||||
"text/plain": [
|
||||
"🎉🥳 Congratulations! You've successfully logged in! 🙌 \n"
|
||||
]
|
||||
},
|
||||
"metadata": {},
|
||||
"output_type": "display_data"
|
||||
}
|
||||
],
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"import deepeval\n",
|
||||
"\n",
|
||||
"api_key = os.getenv(\"DEEPEVAL_API_KEY\")\n",
|
||||
"deepeval.login(api_key)"
|
||||
"!deepeval login"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -91,9 +76,12 @@
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Setup Confident AI Callback (Modern)\n",
|
||||
"### Setup DeepEval\n",
|
||||
"\n",
|
||||
"The previous DeepEvalCallbackHandler and metric tracking are deprecated. Please use the new integration below."
|
||||
"You can, by default, use the `DeepEvalCallbackHandler` to set up the metrics you want to track. However, this has limited support for metrics at the moment (more to be added soon). It currently supports:\n",
|
||||
"- [Answer Relevancy](https://docs.confident-ai.com/docs/measuring_llm_performance/answer_relevancy)\n",
|
||||
"- [Bias](https://docs.confident-ai.com/docs/measuring_llm_performance/debias)\n",
|
||||
"- [Toxicness](https://docs.confident-ai.com/docs/measuring_llm_performance/non_toxic)"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -102,15 +90,10 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from deepeval.integrations.langchain import CallbackHandler\n",
|
||||
"from deepeval.metrics.answer_relevancy import AnswerRelevancy\n",
|
||||
"\n",
|
||||
"handler = CallbackHandler(\n",
|
||||
" name=\"My Trace\",\n",
|
||||
" tags=[\"production\", \"v1\"],\n",
|
||||
" metadata={\"experiment\": \"A/B\"},\n",
|
||||
" thread_id=\"thread-123\",\n",
|
||||
" user_id=\"user-456\",\n",
|
||||
")"
|
||||
"# Here we want to make sure the answer is minimally relevant\n",
|
||||
"answer_relevancy_metric = AnswerRelevancy(minimum_score=0.5)"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -120,11 +103,186 @@
|
||||
"source": [
|
||||
"## Get Started"
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"To use the `DeepEvalCallbackHandler`, we need the `implementation_name`. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_community.callbacks.confident_callback import DeepEvalCallbackHandler\n",
|
||||
"\n",
|
||||
"deepeval_callback = DeepEvalCallbackHandler(\n",
|
||||
" implementation_name=\"langchainQuickstart\", metrics=[answer_relevancy_metric]\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Scenario 1: Feeding into LLM\n",
|
||||
"\n",
|
||||
"You can then feed it into your LLM with OpenAI."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"LLMResult(generations=[[Generation(text='\\n\\nQ: What did the fish say when he hit the wall? \\nA: Dam.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\\n\\nThe Moon \\n\\nThe moon is high in the midnight sky,\\nSparkling like a star above.\\nThe night so peaceful, so serene,\\nFilling up the air with love.\\n\\nEver changing and renewing,\\nA never-ending light of grace.\\nThe moon remains a constant view,\\nA reminder of life’s gentle pace.\\n\\nThrough time and space it guides us on,\\nA never-fading beacon of hope.\\nThe moon shines down on us all,\\nAs it continues to rise and elope.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\\n\\nQ. What did one magnet say to the other magnet?\\nA. \"I find you very attractive!\"', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text=\"\\n\\nThe world is charged with the grandeur of God.\\nIt will flame out, like shining from shook foil;\\nIt gathers to a greatness, like the ooze of oil\\nCrushed. Why do men then now not reck his rod?\\n\\nGenerations have trod, have trod, have trod;\\nAnd all is seared with trade; bleared, smeared with toil;\\nAnd wears man's smudge and shares man's smell: the soil\\nIs bare now, nor can foot feel, being shod.\\n\\nAnd for all this, nature is never spent;\\nThere lives the dearest freshness deep down things;\\nAnd though the last lights off the black West went\\nOh, morning, at the brown brink eastward, springs —\\n\\nBecause the Holy Ghost over the bent\\nWorld broods with warm breast and with ah! bright wings.\\n\\n~Gerard Manley Hopkins\", generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text='\\n\\nQ: What did one ocean say to the other ocean?\\nA: Nothing, they just waved.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text=\"\\n\\nA poem for you\\n\\nOn a field of green\\n\\nThe sky so blue\\n\\nA gentle breeze, the sun above\\n\\nA beautiful world, for us to love\\n\\nLife is a journey, full of surprise\\n\\nFull of joy and full of surprise\\n\\nBe brave and take small steps\\n\\nThe future will be revealed with depth\\n\\nIn the morning, when dawn arrives\\n\\nA fresh start, no reason to hide\\n\\nSomewhere down the road, there's a heart that beats\\n\\nBelieve in yourself, you'll always succeed.\", generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {'completion_tokens': 504, 'total_tokens': 528, 'prompt_tokens': 24}, 'model_name': 'text-davinci-003'})"
|
||||
]
|
||||
},
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from langchain_openai import OpenAI\n",
|
||||
"\n",
|
||||
"llm = OpenAI(\n",
|
||||
" temperature=0,\n",
|
||||
" callbacks=[deepeval_callback],\n",
|
||||
" verbose=True,\n",
|
||||
" openai_api_key=\"<YOUR_API_KEY>\",\n",
|
||||
")\n",
|
||||
"output = llm.generate(\n",
|
||||
" [\n",
|
||||
" \"What is the best evaluation tool out there? (no bias at all)\",\n",
|
||||
" ]\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"You can then check the metric if it was successful by calling the `is_successful()` method."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"answer_relevancy_metric.is_successful()\n",
|
||||
"# returns True/False"
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Once you have ran that, you should be able to see our dashboard below. \n",
|
||||
"\n",
|
||||
""
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Scenario 2: Tracking an LLM in a chain without callbacks\n",
|
||||
"\n",
|
||||
"To track an LLM in a chain without callbacks, you can plug into it at the end.\n",
|
||||
"\n",
|
||||
"We can start by defining a simple chain as shown below."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import requests\n",
|
||||
"from langchain.chains import RetrievalQA\n",
|
||||
"from langchain_chroma import Chroma\n",
|
||||
"from langchain_community.document_loaders import TextLoader\n",
|
||||
"from langchain_openai import OpenAI, OpenAIEmbeddings\n",
|
||||
"from langchain_text_splitters import CharacterTextSplitter\n",
|
||||
"\n",
|
||||
"text_file_url = \"https://raw.githubusercontent.com/hwchase17/chat-your-data/master/state_of_the_union.txt\"\n",
|
||||
"\n",
|
||||
"openai_api_key = \"sk-XXX\"\n",
|
||||
"\n",
|
||||
"with open(\"state_of_the_union.txt\", \"w\") as f:\n",
|
||||
" response = requests.get(text_file_url)\n",
|
||||
" f.write(response.text)\n",
|
||||
"\n",
|
||||
"loader = TextLoader(\"state_of_the_union.txt\")\n",
|
||||
"documents = loader.load()\n",
|
||||
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
|
||||
"texts = text_splitter.split_documents(documents)\n",
|
||||
"\n",
|
||||
"embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)\n",
|
||||
"docsearch = Chroma.from_documents(texts, embeddings)\n",
|
||||
"\n",
|
||||
"qa = RetrievalQA.from_chain_type(\n",
|
||||
" llm=OpenAI(openai_api_key=openai_api_key),\n",
|
||||
" chain_type=\"stuff\",\n",
|
||||
" retriever=docsearch.as_retriever(),\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# Providing a new question-answering pipeline\n",
|
||||
"query = \"Who is the president?\"\n",
|
||||
"result = qa.run(query)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"After defining a chain, you can then manually check for answer similarity."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"answer_relevancy_metric.measure(result, query)\n",
|
||||
"answer_relevancy_metric.is_successful()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"attachments": {},
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### What's next?\n",
|
||||
"\n",
|
||||
"You can create your own custom metrics [here](https://docs.confident-ai.com/docs/quickstart/custom-metrics). \n",
|
||||
"\n",
|
||||
"DeepEval also offers other features such as being able to [automatically create unit tests](https://docs.confident-ai.com/docs/quickstart/synthetic-data-creation), [tests for hallucination](https://docs.confident-ai.com/docs/measuring_llm_performance/factual_consistency).\n",
|
||||
"\n",
|
||||
"If you are interested, check out our Github repository here [https://github.com/confident-ai/deepeval](https://github.com/confident-ai/deepeval). We welcome any PRs and discussions on how to improve LLM performance."
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "langchain",
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
@@ -138,7 +296,12 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.12.11"
|
||||
"version": "3.10.12"
|
||||
},
|
||||
"vscode": {
|
||||
"interpreter": {
|
||||
"hash": "a53ebf4a859167383b364e7e7521d0add3c2dbbdecce4edf676e8c4634ff3fbb"
|
||||
}
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|
||||
@@ -124,7 +124,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 4,
|
||||
"id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -132,7 +132,7 @@
|
||||
"from langchain_anthropic import ChatAnthropic\n",
|
||||
"\n",
|
||||
"llm = ChatAnthropic(\n",
|
||||
" model=\"claude-3-5-sonnet-latest\",\n",
|
||||
" model=\"claude-3-5-sonnet-20240620\",\n",
|
||||
" temperature=0,\n",
|
||||
" max_tokens=1024,\n",
|
||||
" timeout=None,\n",
|
||||
@@ -970,8 +970,8 @@
|
||||
"source": [
|
||||
"### In tool results (agentic RAG)\n",
|
||||
"\n",
|
||||
":::info\n",
|
||||
"Requires ``langchain-anthropic>=0.3.17``\n",
|
||||
":::info Requires ``langchain-anthropic>=0.3.17``\n",
|
||||
"\n",
|
||||
":::\n",
|
||||
"\n",
|
||||
"Claude supports a [search_result](https://docs.anthropic.com/en/docs/build-with-claude/search-results) content block representing citable results from queries against a knowledge base or other custom source. These content blocks can be passed to claude both top-line (as in the above example) and within a tool result. This allows Claude to cite elements of its response using the result of a tool call.\n",
|
||||
@@ -998,6 +998,8 @@
|
||||
" ]\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"We also need to specify the `search-results-2025-06-09` beta when instantiating ChatAnthropic. You can see an end-to-end example below.\n",
|
||||
"\n",
|
||||
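Before the end-to-end example, here is a minimal sketch of what a `search_result` block inside a tool result can look like. The field names follow Anthropic's search results documentation; the title, source, and text values are illustrative:

```python
# Illustrative only -- shape per Anthropic's search-results docs; values are made up.
search_result_block = {
    "type": "search_result",
    "title": "Acme Corp Q1 report",             # assumed example title
    "source": "https://example.com/q1-report",  # assumed example source URL
    "content": [{"type": "text", "text": "Revenue grew 12% quarter over quarter."}],
    "citations": {"enabled": True},
}

# A tool can return a list of such blocks so Claude can cite them in its answer.
```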
"<details>\n",
|
||||
"<summary>End to end example with LangGraph</summary>\n",
|
||||
"\n",
|
||||
@@ -1238,110 +1240,6 @@
|
||||
"response = llm_with_tools.invoke(\"How do I update a web app to TypeScript 5.5?\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "kloc4rvd1w",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Web search + structured output\n",
|
||||
"\n",
|
||||
"When combining web search tools with structured output, it's important to **bind the tools first and then apply structured output**:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "rjjergy6ef",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from pydantic import BaseModel, Field\n",
|
||||
"from langchain_anthropic import ChatAnthropic\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# Define structured output schema\n",
|
||||
"class ResearchResult(BaseModel):\n",
|
||||
" \"\"\"Structured research result from web search.\"\"\"\n",
|
||||
"\n",
|
||||
" topic: str = Field(description=\"The research topic\")\n",
|
||||
" summary: str = Field(description=\"Summary of key findings\")\n",
|
||||
" key_points: list[str] = Field(description=\"List of important points discovered\")\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# Configure web search tool\n",
|
||||
"websearch_tools = [\n",
|
||||
" {\n",
|
||||
" \"type\": \"web_search_20250305\",\n",
|
||||
" \"name\": \"web_search\",\n",
|
||||
" \"max_uses\": 10,\n",
|
||||
" }\n",
|
||||
"]\n",
|
||||
"\n",
|
||||
"llm = ChatAnthropic(model=\"claude-3-5-sonnet-20241022\")\n",
|
||||
"\n",
|
||||
"# Correct order: bind tools first, then structured output\n",
|
||||
"llm_with_search = llm.bind_tools(websearch_tools)\n",
|
||||
"research_llm = llm_with_search.with_structured_output(ResearchResult)\n",
|
||||
"\n",
|
||||
"# Now you can use both web search and get structured output\n",
|
||||
"result = research_llm.invoke(\"Research the latest developments in quantum computing\")\n",
|
||||
"print(f\"Topic: {result.topic}\")\n",
|
||||
"print(f\"Summary: {result.summary}\")\n",
|
||||
"print(f\"Key Points: {result.key_points}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "c580c20a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Web fetching\n",
|
||||
"\n",
|
||||
"Claude can use a [web fetching tool](https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/web-fetch-tool) to run searches and ground its responses with citations."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "5cf6ad08",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
":::info\n",
|
||||
"Web search tool is supported since ``langchain-anthropic>=0.3.20``\n",
|
||||
":::"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"id": "c4804be1",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_anthropic import ChatAnthropic\n",
|
||||
"\n",
|
||||
"llm = ChatAnthropic(\n",
|
||||
" model=\"claude-3-5-haiku-latest\",\n",
|
||||
" betas=[\"web-fetch-2025-09-10\"], # Enable web fetch beta\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"tool = {\"type\": \"web_fetch_20250910\", \"name\": \"web_fetch\", \"max_uses\": 3}\n",
|
||||
"llm_with_tools = llm.bind_tools([tool])\n",
|
||||
"\n",
|
||||
"response = llm_with_tools.invoke(\n",
|
||||
" \"Please analyze the content at https://example.com/article\"\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "088c41d0",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
":::warning\n",
|
||||
"Note: you must add the `'web-fetch-2025-09-10'` beta header to use this tool.\n",
|
||||
":::"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "1478cdc6-2e52-4870-80f9-b4ddf88f2db2",
|
||||
@@ -1351,14 +1249,14 @@
|
||||
"\n",
|
||||
"Claude can use a [code execution tool](https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/code-execution-tool) to execute Python code in a sandboxed environment.\n",
|
||||
"\n",
|
||||
":::info\n",
|
||||
"Code execution is supported since ``langchain-anthropic>=0.3.14``\n",
|
||||
":::info Code execution is supported since ``langchain-anthropic>=0.3.14``\n",
|
||||
"\n",
|
||||
":::"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 1,
|
||||
"id": "2ce13632-a2da-439f-a429-f66481501630",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -1367,7 +1265,7 @@
|
||||
"\n",
|
||||
"llm = ChatAnthropic(\n",
|
||||
" model=\"claude-sonnet-4-20250514\",\n",
|
||||
" betas=[\"code-execution-2025-05-22\"], # Enable code execution beta\n",
|
||||
" betas=[\"code-execution-2025-05-22\"],\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"tool = {\"type\": \"code_execution_20250522\", \"name\": \"code_execution\"}\n",
|
||||
@@ -1378,16 +1276,6 @@
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "a6b5e15a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
":::warning\n",
|
||||
"Note: you must add the `'code_execution_20250522'` beta header to use this tool.\n",
|
||||
":::"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "24076f91-3a3d-4e53-9618-429888197061",
|
||||
@@ -1466,14 +1354,14 @@
|
||||
"\n",
|
||||
"Claude can use a [MCP connector tool](https://docs.anthropic.com/en/docs/agents-and-tools/mcp-connector) for model-generated calls to remote MCP servers.\n",
|
||||
"\n",
|
||||
":::info\n",
|
||||
"Remote MCP is supported since ``langchain-anthropic>=0.3.14``\n",
|
||||
":::info Remote MCP is supported since ``langchain-anthropic>=0.3.14``\n",
|
||||
"\n",
|
||||
":::"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 1,
|
||||
"id": "22fc4a89-e6d8-4615-96cb-2e117349aebf",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -1485,17 +1373,17 @@
|
||||
" \"type\": \"url\",\n",
|
||||
" \"url\": \"https://mcp.deepwiki.com/mcp\",\n",
|
||||
" \"name\": \"deepwiki\",\n",
|
||||
" \"tool_configuration\": { # Optional configuration\n",
|
||||
" \"tool_configuration\": { # optional configuration\n",
|
||||
" \"enabled\": True,\n",
|
||||
" \"allowed_tools\": [\"ask_question\"],\n",
|
||||
" },\n",
|
||||
" \"authorization_token\": \"PLACEHOLDER\", # Optional authorization\n",
|
||||
" \"authorization_token\": \"PLACEHOLDER\", # optional authorization\n",
|
||||
" }\n",
|
||||
"]\n",
|
||||
"\n",
|
||||
"llm = ChatAnthropic(\n",
|
||||
" model=\"claude-sonnet-4-20250514\",\n",
|
||||
" betas=[\"mcp-client-2025-04-04\"], # Enable MCP beta\n",
|
||||
" betas=[\"mcp-client-2025-04-04\"],\n",
|
||||
" mcp_servers=mcp_servers,\n",
|
||||
")\n",
|
||||
"\n",
|
||||
@@ -1505,16 +1393,6 @@
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "0d6d7197",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
":::warning\n",
|
||||
"Note: you must add the `'mcp-client-2025-04-04'` beta header to use this tool.\n",
|
||||
":::"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "2fd5d545-a40d-42b1-ad0c-0a79e2536c9b",
|
||||
|
||||
@@ -129,7 +129,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 1,
|
||||
"id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -137,7 +137,7 @@
|
||||
"from langchain_aws import ChatBedrockConverse\n",
|
||||
"\n",
|
||||
"llm = ChatBedrockConverse(\n",
|
||||
" model_id=\"anthropic.claude-3-5-sonnet-latest-v1:0\",\n",
|
||||
" model_id=\"anthropic.claude-3-5-sonnet-20240620-v1:0\",\n",
|
||||
" # region_name=...,\n",
|
||||
" # aws_access_key_id=...,\n",
|
||||
" # aws_secret_access_key=...,\n",
|
||||
|
||||
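For context, once the constructor above completes, the model is used like any other LangChain chat model. A minimal usage sketch (the prompt is illustrative):

```python
# Minimal usage sketch -- assumes the ChatBedrockConverse instance created above.
response = llm.invoke("Summarize the AWS shared responsibility model in one sentence.")
print(response.content)
```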
@@ -53,7 +53,7 @@
|
||||
"source": [
|
||||
"### Installation\n",
|
||||
"\n",
|
||||
"The LangChain OCIGenAI integration lives in the `langchain-oci` package and you will also need to install the `oci` package:"
|
||||
"The LangChain OCIGenAI integration lives in the `langchain-community` package and you will also need to install the `oci` package:"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -63,7 +63,7 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install -qU langchain-oci"
|
||||
"%pip install -qU langchain-community oci"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -83,7 +83,7 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_oci.chat_models import ChatOCIGenAI\n",
|
||||
"from langchain_community.chat_models.oci_generative_ai import ChatOCIGenAI\n",
|
||||
"from langchain_core.messages import AIMessage, HumanMessage, SystemMessage\n",
|
||||
"\n",
|
||||
"chat = ChatOCIGenAI(\n",
|
||||
|
||||
@@ -23,12 +23,16 @@
|
||||
"\n",
|
||||
"It optimizes setup and configuration details, including GPU usage.\n",
|
||||
"\n",
|
||||
"For a complete list of supported models and model variants, see the [Ollama model library](https://github.com/jmorganca/ollama#model-library).\n",
|
||||
"For a complete list of supported models and model variants, see the [Ollama model library](https://ollama.com/search).\n",
|
||||
"\n",
|
||||
":::warning\n",
|
||||
"This page is for the new v1 `ChatOllama` class with standard content block output. If you are looking for the legacy v0 `Ollama` class, see the [v0.3 documentation](https://python.langchain.com/v0.3/docs/integrations/chat/ollama/).\n",
|
||||
":::\n",
|
||||
"\n",
|
||||
"## Overview\n",
|
||||
"### Integration details\n",
|
||||
"\n",
|
||||
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/docs/integrations/chat/ollama) | Package downloads | Package latest |\n",
|
||||
"| Class | Package | Local | Serializable | [JS support](https://js.langchain.com/docs/integrations/chat/ollama/) | Package downloads | Package latest |\n",
|
||||
"| :--- | :--- | :---: | :---: | :---: | :---: | :---: |\n",
|
||||
"| [ChatOllama](https://python.langchain.com/api_reference/ollama/chat_models/langchain_ollama.chat_models.ChatOllama.html#chatollama) | [langchain-ollama](https://python.langchain.com/api_reference/ollama/index.html) | ✅ | ❌ | ✅ |  |  |\n",
|
||||
"\n",
|
||||
@@ -52,7 +56,7 @@
|
||||
">\n",
|
||||
"> On Linux (or WSL), the models will be stored at `/usr/share/ollama/.ollama/models`\n",
|
||||
"\n",
|
||||
"* Specify the exact version of the model of interest as such `ollama pull gpt-oss:20b` (View the [various tags for the `Vicuna`](https://ollama.ai/library/vicuna/tags) model in this instance)\n",
|
||||
"* Specify the exact version of the model of interest as such `ollama pull gpt-oss:20b`\n",
|
||||
"* To view all pulled models, use `ollama list`\n",
|
||||
"* To chat directly with a model from the command line, use `ollama run <name-of-model>`\n",
|
||||
"* View the [Ollama documentation](https://github.com/ollama/ollama/blob/main/docs/README.md) for more commands. You can run `ollama help` in the terminal to see available commands.\n"
|
||||
@@ -103,7 +107,7 @@
|
||||
"metadata": {},
|
||||
"source": [
|
||||
":::warning\n",
|
||||
"Make sure you're using the latest Ollama version!\n",
|
||||
"Make sure you're using the latest Ollama client version!\n",
|
||||
":::\n",
|
||||
"\n",
|
||||
"Update by running:"
|
||||
@@ -131,15 +135,16 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"execution_count": 2,
|
||||
"id": "cb09c344-1836-4e0c-acf8-11d13ac1dbae",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_ollama import ChatOllama\n",
|
||||
"from langchain_ollama.v1 import ChatOllama\n",
|
||||
"\n",
|
||||
"llm = ChatOllama(\n",
|
||||
" model=\"llama3.1\",\n",
|
||||
" model=\"gpt-oss:20b\",\n",
|
||||
" validate_model_on_init=True,\n",
|
||||
" temperature=0,\n",
|
||||
" # other params...\n",
|
||||
")"
|
||||
@@ -162,46 +167,56 @@
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"AIMessage(content='The translation of \"I love programming\" in French is:\\n\\n\"J\\'adore le programmation.\"', additional_kwargs={}, response_metadata={'model': 'llama3.1', 'created_at': '2025-06-25T18:43:00.483666Z', 'done': True, 'done_reason': 'stop', 'total_duration': 619971208, 'load_duration': 27793125, 'prompt_eval_count': 35, 'prompt_eval_duration': 36354583, 'eval_count': 22, 'eval_duration': 555182667, 'model_name': 'llama3.1'}, id='run--348bb5ef-9dd9-4271-bc7e-a9ddb54c28c1-0', usage_metadata={'input_tokens': 35, 'output_tokens': 22, 'total_tokens': 57})"
|
||||
]
|
||||
},
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"AIMessage(type='ai', name=None, id='lc_run--5521db11-a5eb-4e46-956c-1455151cdaa3-0', lc_version='v1', content=[{'type': 'text', 'text': 'The translation of \"I love programming\" to French is:\\n\\n\"Je aime le programmation\"\\n\\nHowever, a more common and idiomatic way to express this in French would be:\\n\\n\"J\\'aime programmer\"\\n\\nThis phrase uses the verb \"aimer\" (to love) in the present tense, which is more suitable for expressing a general feeling or preference.'}], usage_metadata={'input_tokens': 34, 'output_tokens': 73, 'total_tokens': 107}, response_metadata={'model_name': 'llama3.2', 'created_at': '2025-08-08T23:07:44.439483Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1410566833, 'load_duration': 28419542, 'prompt_eval_count': 34, 'prompt_eval_duration': 141642125, 'eval_count': 73, 'eval_duration': 1240075000}, parsed=None)\n",
|
||||
"\n",
|
||||
"Content:\n",
|
||||
"The translation of \"I love programming\" to French is:\n",
|
||||
"\n",
|
||||
"\"Je aime le programmation\"\n",
|
||||
"\n",
|
||||
"However, a more common and idiomatic way to express this in French would be:\n",
|
||||
"\n",
|
||||
"\"J'aime programmer\"\n",
|
||||
"\n",
|
||||
"This phrase uses the verb \"aimer\" (to love) in the present tense, which is more suitable for expressing a general feeling or preference.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"messages = [\n",
|
||||
" (\n",
|
||||
" \"system\",\n",
|
||||
" \"You are a helpful assistant that translates English to French. Translate the user sentence.\",\n",
|
||||
" ),\n",
|
||||
" (\"human\", \"I love programming.\"),\n",
|
||||
"]\n",
|
||||
"ai_msg = llm.invoke(messages)\n",
|
||||
"ai_msg"
|
||||
"ai_msg = llm.invoke(\"Translate 'I love programming' to French.\")\n",
|
||||
"print(f\"{ai_msg}\\n\")\n",
|
||||
"print(f\"Content:\\n{ai_msg.text}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "ede35e47",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Streaming"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"id": "d86145b3-bfef-46e8-b227-4dda5c9c2705",
|
||||
"execution_count": 10,
|
||||
"id": "77474829",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"The translation of \"I love programming\" in French is:\n",
|
||||
"\n",
|
||||
"\"J'adore le programmation.\"\n"
|
||||
"Hi| there|!| I|'m| just| a| chat|bot|,| so| I| don|'t| have| feelings|,| but| I|'m| here| and| ready| to| help| you| with| anything| you| need|!| How| can| I| assist| you| today|?| 😊|"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(ai_msg.content)"
|
||||
"for chunk in llm.stream(\"How are you doing?\"):\n",
|
||||
" if chunk.text:\n",
|
||||
" print(chunk.text, end=\"|\", flush=True)"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -223,10 +238,10 @@
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"AIMessage(content='\"Programmieren ist meine Leidenschaft.\"\\n\\n(I translated \"programming\" to the German word \"Programmieren\", and added \"ist meine Leidenschaft\" which means \"is my passion\")', additional_kwargs={}, response_metadata={'model': 'llama3.1', 'created_at': '2025-06-25T18:43:29.350032Z', 'done': True, 'done_reason': 'stop', 'total_duration': 1194744459, 'load_duration': 26982500, 'prompt_eval_count': 30, 'prompt_eval_duration': 117043458, 'eval_count': 41, 'eval_duration': 1049892167, 'model_name': 'llama3.1'}, id='run--efc6436e-2346-43d9-8118-3c20b3cdf0d0-0', usage_metadata={'input_tokens': 30, 'output_tokens': 41, 'total_tokens': 71})"
|
||||
"'Ich liebe Programmierung.'"
|
||||
]
|
||||
},
|
||||
"execution_count": 7,
|
||||
"execution_count": 19,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -245,13 +260,15 @@
|
||||
")\n",
|
||||
"\n",
|
||||
"chain = prompt | llm\n",
|
||||
"chain.invoke(\n",
|
||||
"result = chain.invoke(\n",
|
||||
" {\n",
|
||||
" \"input_language\": \"English\",\n",
|
||||
" \"output_language\": \"German\",\n",
|
||||
" \"input\": \"I love programming.\",\n",
|
||||
" }\n",
|
||||
")"
|
||||
")\n",
|
||||
"\n",
|
||||
"result.text"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -272,7 +289,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"execution_count": 13,
|
||||
"id": "f767015f",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -280,16 +297,16 @@
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"[{'name': 'validate_user', 'args': {'addresses': ['123 Fake St, Boston, MA', '234 Pretend Boulevard, Houston, TX'], 'user_id': '123'}, 'id': 'aef33a32-a34b-4b37-b054-e0d85584772f', 'type': 'tool_call'}]\n"
|
||||
"[{'type': 'tool_call', 'id': 'f365489e-1dc4-4d60-aaff-e56290ae4f99', 'name': 'validate_user', 'args': {'addresses': ['123 Fake St in Boston MA', '234 Pretend Boulevard in Houston TX'], 'user_id': 123}}]\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from typing import List\n",
|
||||
"\n",
|
||||
"from langchain_core.messages import AIMessage\n",
|
||||
"from langchain_core.v1.messages import AIMessage\n",
|
||||
"from langchain_core.tools import tool\n",
|
||||
"from langchain_ollama import ChatOllama\n",
|
||||
"from langchain_ollama.v1 import ChatOllama\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"@tool\n",
|
||||
@@ -319,6 +336,50 @@
|
||||
" print(result.tool_calls)"
|
||||
]
|
||||
},
|
||||
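The cell above only prints `result.tool_calls`. As a sketch of actually executing those calls (assuming the `validate_user` tool defined with `@tool` earlier in that cell; recent `langchain-core` versions let a tool consume a tool-call dict directly and return a `ToolMessage`):

```python
# Sketch: execute each tool call the model requested and collect the results.
# Assumes `validate_user` is the @tool-decorated function from the cell above.
tool_registry = {"validate_user": validate_user}

for tool_call in result.tool_calls:
    tool = tool_registry[tool_call["name"]]
    tool_message = tool.invoke(tool_call)  # returns a ToolMessage
    print(tool_message)
```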
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4321b6a8",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Structured output"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 16,
|
||||
"id": "20f8ae70",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Name: Alice, Age: 28, Job: Software Engineer\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from langchain_ollama.v1 import ChatOllama\n",
|
||||
"from pydantic import BaseModel, Field\n",
|
||||
"\n",
|
||||
"llm = ChatOllama(model=\"llama3.2\", validate_model_on_init=True, temperature=0)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"class Person(BaseModel):\n",
|
||||
" \"\"\"Information about a person.\"\"\"\n",
|
||||
"\n",
|
||||
" name: str = Field(description=\"The person's full name\")\n",
|
||||
" age: int = Field(description=\"The person's age in years\")\n",
|
||||
" occupation: str = Field(description=\"The person's job or profession\")\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"structured_llm = llm.with_structured_output(Person)\n",
|
||||
"response: Person = structured_llm.invoke(\n",
|
||||
" \"Tell me about a fictional software engineer named Alice who is 28 years old.\"\n",
|
||||
")\n",
|
||||
"print(f\"Name: {response.name}, Age: {response.age}, Job: {response.occupation}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "4c5e0197",
|
||||
@@ -326,9 +387,9 @@
|
||||
"source": [
|
||||
"## Multi-modal\n",
|
||||
"\n",
|
||||
"Ollama has limited support for multi-modal LLMs, such as [gemma3](https://ollama.com/library/gemma3)\n",
|
||||
"Ollama has limited support for multi-modal LLMs, such as [gemma3](https://ollama.com/library/gemma3).\n",
|
||||
"\n",
|
||||
"Be sure to update Ollama so that you have the most recent version to support multi-modal."
|
||||
"### Image input"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -411,15 +472,15 @@
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"90%\n"
|
||||
"Based on the image, the dollar-based gross retention rate is **90%**.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from langchain_core.messages import HumanMessage\n",
|
||||
"from langchain_ollama import ChatOllama\n",
|
||||
"from langchain_core.v1.messages import HumanMessage\n",
|
||||
"from langchain_ollama.v1 import ChatOllama\n",
|
||||
"\n",
|
||||
"llm = ChatOllama(model=\"bakllava\", temperature=0)\n",
|
||||
"llm = ChatOllama(model=\"gemma3:4b\", validate_model_on_init=True, temperature=0)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def prompt_func(data):\n",
|
||||
@@ -427,8 +488,9 @@
|
||||
" image = data[\"image\"]\n",
|
||||
"\n",
|
||||
" image_part = {\n",
|
||||
" \"type\": \"image_url\",\n",
|
||||
" \"image_url\": f\"data:image/jpeg;base64,{image}\",\n",
|
||||
" \"type\": \"image\",\n",
|
||||
" \"base64\": f\"data:image/jpeg;base64,{image}\",\n",
|
||||
" \"mime_type\": \"image/jpeg\",\n",
|
||||
" }\n",
|
||||
"\n",
|
||||
" content_parts = []\n",
|
||||
@@ -438,7 +500,7 @@
|
||||
" content_parts.append(image_part)\n",
|
||||
" content_parts.append(text_part)\n",
|
||||
"\n",
|
||||
" return [HumanMessage(content=content_parts)]\n",
|
||||
" return [HumanMessage(content_parts)]\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"from langchain_core.output_parsers import StrOutputParser\n",
|
||||
@@ -457,11 +519,9 @@
|
||||
"id": "fb6a331f-1507-411f-89e5-c4d598154f3c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Reasoning models and custom message roles\n",
|
||||
"## Reasoning models\n",
|
||||
"\n",
|
||||
"Some models, such as IBM's [Granite 3.2](https://ollama.com/library/granite3.2), support custom message roles to enable thinking processes.\n",
|
||||
"\n",
|
||||
"To access Granite 3.2's thinking features, pass a message with a `\"control\"` role with content set to `\"thinking\"`. Because `\"control\"` is a non-standard message role, we can use a [ChatMessage](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.chat.ChatMessage.html) object to implement it:"
|
||||
"Many models support outputting their reasoning process in addition to the final answer. This is useful for debugging and understanding how the model arrived at its conclusion. This train of thought reasoning is available in models such as `gpt-oss`, `qwen3:8b`, and `deepseek-r1`. To enable reasoning output, set the `reasoning` parameter to `True` either when instantiating the model or during invocation."
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -474,30 +534,25 @@
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Here is my thought process:\n",
|
||||
"The user is asking for the value of 3 raised to the power of 3, which is a basic exponentiation operation.\n",
|
||||
"\n",
|
||||
"Here is my response:\n",
|
||||
"\n",
|
||||
"3^3 (read as \"3 to the power of 3\") equals 27. \n",
|
||||
"\n",
|
||||
"This calculation is performed by multiplying 3 by itself three times: 3*3*3 = 27.\n"
|
||||
"Response including reasoning: [{'type': 'reasoning', 'reasoning': \"Okay, so I need to figure out what 3^3 is. Let me start by recalling what exponents mean. From what I remember, when you have a number raised to a power, like a^b, it means you multiply the number by itself b times. So, for example, 2^3 would be 2 multiplied by itself three times: 2 × 2 × 2. Let me check if that's right. Yeah, I think that's correct. So applying that to 3^3, it should be 3 multiplied by itself three times.\\n\\nWait, let me make sure I'm not confusing the base and the exponent. The base is the number being multiplied, and the exponent is how many times it's multiplied. So in 3^3, the base is 3 and the exponent is 3. That means I need to multiply 3 by itself three times. Let me write that out step by step.\\n\\nFirst, multiply the first two 3s: 3 × 3. What's 3 times 3? That's 9. Okay, so the first multiplication gives me 9. Now, I need to multiply that result by the third 3. So 9 × 3. Let me calculate that. 9 times 3 is... 27. So putting it all together, 3 × 3 × 3 equals 27. \\n\\nWait, let me verify that again. Maybe I should do it in a different way to make sure I didn't make a mistake. Let's break it down. 3^3 is the same as 3 × 3 × 3. Let me compute 3 × 3 first, which is 9, and then multiply that by 3. 9 × 3 is indeed 27. Hmm, that seems right. \\n\\nAlternatively, I can think of exponents as repeated multiplication. So 3^1 is 3, 3^2 is 3 × 3 = 9, and 3^3 is 3 × 3 × 3 = 27. Yeah, that progression makes sense. Each time the exponent increases by 1, you multiply by the base again. So starting from 3^1 = 3, then 3^2 is 3 × 3 = 9, then 3^3 is 9 × 3 = 27. \\n\\nIs there another way to check this? Maybe using exponent rules. For example, if I know that 3^2 is 9, then multiplying by another 3 would give me 3^3. Since 9 × 3 is 27, that confirms it again. \\n\\nAlternatively, maybe I can use logarithms or something else, but that might be overcomplicating. Since exponents are straightforward multiplication, I think my initial calculation is correct. \\n\\nWait, just to be thorough, maybe I can use a calculator to verify. Let me imagine pressing 3, then the exponent key, then 3. If I do that, it should give me 27. Yeah, that's what I remember. So all methods point to 27. \\n\\nI think I've checked it multiple ways: breaking down the multiplication step by step, using the exponent progression, and even considering a calculator verification. All of them lead to the same answer. Therefore, I'm confident that 3^3 equals 27.\\n\"}, {'type': 'text', 'text': 'To determine the value of $3^3$, we start by understanding what an exponent represents. The expression $a^b$ means multiplying the base $a$ by itself $b$ times. \\n\\n### Step-by-Step Calculation:\\n1. **Identify the base and exponent**: \\n In $3^3$, the base is **3**, and the exponent is **3**. This means we multiply 3 by itself three times.\\n\\n2. **Perform the multiplication**: \\n - First, multiply the first two 3s: \\n $3 \\\\times 3 = 9$ \\n - Next, multiply the result by the third 3: \\n $9 \\\\times 3 = 27$\\n\\n3. **Verify the result**: \\n - $3^1 = 3$ \\n - $3^2 = 3 \\\\times 3 = 9$ \\n - $3^3 = 3 \\\\times 3 \\\\times 3 = 27$ \\n This progression confirms the calculation.\\n\\n### Final Answer:\\n$$\\n3^3 = \\\\boxed{27}\\n$$'}]\n",
|
||||
"Response without reasoning: [{'type': 'text', 'text': \"Sure! Let's break down what **3³** means and how to calculate it step by step.\\n\\n---\\n\\n### Step 1: Understand the notation\\nThe expression **3³** means **3 multiplied by itself three times**. The small number (3) is called the **exponent**, and it tells us how many times the base number (3) is used as a factor.\\n\\nSo:\\n$$\\n3^3 = 3 \\\\times 3 \\\\times 3\\n$$\\n\\n---\\n\\n### Step 2: Perform the multiplication step by step\\n\\n1. Multiply the first two 3s:\\n $$\\n 3 \\\\times 3 = 9\\n $$\\n\\n2. Now multiply the result by the third 3:\\n $$\\n 9 \\\\times 3 = 27\\n $$\\n\\n---\\n\\n### Step 3: Final Answer\\n\\n$$\\n3^3 = 27\\n$$\\n\\n---\\n\\n### Summary\\n- **3³** means **3 × 3 × 3**\\n- **3 × 3 = 9**\\n- **9 × 3 = 27**\\n- So, **3³ = 27**\\n\\nLet me know if you'd like to explore exponents further!\"}]\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from langchain_core.messages import ChatMessage, HumanMessage\n",
|
||||
"from langchain_ollama import ChatOllama\n",
|
||||
"from langchain_ollama.v1 import ChatOllama\n",
|
||||
"\n",
|
||||
"llm = ChatOllama(model=\"granite3.2:8b\")\n",
|
||||
"# All outputs from `llm` will include reasoning unless overridden during invocation\n",
|
||||
"llm = ChatOllama(model=\"qwen3:8b\", validate_model_on_init=True, reasoning=True)\n",
|
||||
"\n",
|
||||
"messages = [\n",
|
||||
" ChatMessage(role=\"control\", content=\"thinking\"),\n",
|
||||
" HumanMessage(\"What is 3^3?\"),\n",
|
||||
"]\n",
|
||||
"response_a = llm.invoke(\"What is 3^3? Explain your reasoning step by step.\")\n",
|
||||
"print(f\"Response including reasoning: {response_a.content}\")\n",
|
||||
"\n",
|
||||
"response = llm.invoke(messages)\n",
|
||||
"print(response.content)"
|
||||
"# Test override; note no ReasoningContentBlock in the response\n",
|
||||
"response_b = llm.invoke(\n",
|
||||
" \"What is 3^3? Explain your reasoning step by step.\", reasoning=False\n",
|
||||
")\n",
|
||||
"print(f\"Response without reasoning: {response_b.content}\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -505,7 +560,7 @@
|
||||
"id": "6271d032-da40-44d4-9b52-58370e164be3",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Note that the model exposes its thought process in addition to its final response."
|
||||
"Note that the model exposes its thought process as a `ReasoningContentBlock` addition to its final response."
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -521,7 +576,7 @@
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "langchain",
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
@@ -535,7 +590,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.12.11"
|
||||
"version": "3.10.4"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|
||||
@@ -7,7 +7,7 @@
|
||||
"source": [
|
||||
"# Azure AI Data\n",
|
||||
"\n",
|
||||
">[Azure AI Foundry (formerly Azure AI Studio)](https://ai.azure.com/) provides the capability to upload data assets to cloud storage and register existing data assets from the following sources:\n",
|
||||
">[Azure AI Studio](https://ai.azure.com/) provides the capability to upload data assets to cloud storage and register existing data assets from the following sources:\n",
|
||||
">\n",
|
||||
">- `Microsoft OneLake`\n",
|
||||
">- `Azure Blob Storage`\n",
|
||||
|
||||
@@ -2,91 +2,67 @@
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"jupyter": {
|
||||
"outputs_hidden": false
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# Oracle Autonomous Database\n",
|
||||
"\n",
|
||||
"Oracle Autonomous Database is a cloud database that uses machine learning to automate database tuning, security, backups, updates, and other routine management tasks traditionally performed by DBAs.\n",
|
||||
"Oracle autonomous database is a cloud database that uses machine learning to automate database tuning, security, backups, updates, and other routine management tasks traditionally performed by DBAs.\n",
|
||||
"\n",
|
||||
"This notebook covers how to load documents from Oracle Autonomous Database.\n",
|
||||
"This notebook covers how to load documents from oracle autonomous database, the loader supports connection with connection string or tns configuration.\n",
|
||||
"\n",
|
||||
"## Prerequisites\n",
|
||||
"1. Install python-oracledb:\n",
|
||||
"\n",
|
||||
" `pip install oracledb`\n",
|
||||
" \n",
|
||||
" See [Installing python-oracledb](https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html).\n",
|
||||
"\n",
|
||||
"2. A database that python-oracledb's default 'Thin' mode can connected to. This is true of Oracle Autonomous Database, see [python-oracledb Architecture](https://python-oracledb.readthedocs.io/en/latest/user_guide/introduction.html#architecture).\n"
|
||||
]
|
||||
"1. Database runs in a 'Thin' mode:\n",
|
||||
" https://python-oracledb.readthedocs.io/en/latest/user_guide/appendix_b.html\n",
|
||||
"2. `pip install oracledb`:\n",
|
||||
" https://python-oracledb.readthedocs.io/en/latest/user_guide/installation.html"
|
||||
],
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"jupyter": {
|
||||
"outputs_hidden": false
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## Instructions"
|
||||
]
|
||||
],
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"jupyter": {
|
||||
"outputs_hidden": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"pip install oracledb"
|
||||
]
|
||||
],
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"jupyter": {
|
||||
"outputs_hidden": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_community.document_loaders import OracleAutonomousDatabaseLoader\n",
|
||||
"from settings import s"
|
||||
]
|
||||
],
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"jupyter": {
|
||||
"outputs_hidden": false
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"With mutual TLS authentication (mTLS), wallet_location and wallet_password parameters are required to create the connection. See python-oracledb documentation [Connecting to Oracle Cloud Autonomous Databases](https://python-oracledb.readthedocs.io/en/latest/user_guide/connection_handling.html#connecting-to-oracle-cloud-autonomous-databases)."
|
||||
]
|
||||
"With mutual TLS authentication (mTLS), wallet_location and wallet_password are required to create the connection, user can create connection by providing either connection string or tns configuration details."
|
||||
],
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
}
|
||||
},
|
||||
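The code cell that follows is truncated by the diff; for orientation, a representative mTLS call looks roughly like the sketch below. The `wallet_location`/`wallet_password` parameter names and the `s` settings object come from the visible fragments; the remaining attribute names are assumptions:

```python
# Representative sketch -- attribute names on `s` are assumptions, not shown in the diff.
SQL_QUERY = "select prod_id, time_id from sh.costs fetch first 5 rows only"

doc_loader = OracleAutonomousDatabaseLoader(
    query=SQL_QUERY,
    user=s.USERNAME,
    password=s.PASSWORD,
    connection_string=s.CONNECTION_STRING,
    wallet_location=s.WALLET_LOCATION,  # required for mTLS
    wallet_password=s.PASSWORD,         # matches the fragment visible below
)
docs = doc_loader.load()
```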
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"jupyter": {
|
||||
"outputs_hidden": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"SQL_QUERY = \"select prod_id, time_id from sh.costs fetch first 5 rows only\"\n",
|
||||
@@ -113,30 +89,24 @@
|
||||
" wallet_password=s.PASSWORD,\n",
|
||||
")\n",
|
||||
"doc_2 = doc_loader_2.load()"
|
||||
]
|
||||
],
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"jupyter": {
|
||||
"outputs_hidden": false
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"With 1-way TLS authentication, only the database credentials and connection string are required to establish a connection.\n",
|
||||
"The example below also shows passing bind variable values with the argument \"parameters\"."
|
||||
]
|
||||
"With TLS authentication, wallet_location and wallet_password are not required.\n",
|
||||
"Bind variable option is provided by argument \"parameters\"."
|
||||
],
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
}
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"jupyter": {
|
||||
"outputs_hidden": false
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"SQL_QUERY = \"select channel_id, channel_desc from sh.channels where channel_desc = :1 fetch first 5 rows only\"\n",
|
||||
@@ -161,28 +131,31 @@
|
||||
" parameters=[\"Direct Sales\"],\n",
|
||||
")\n",
|
||||
"doc_4 = doc_loader_4.load()"
|
||||
]
|
||||
],
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
}
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
"version": 2
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.12.11"
|
||||
"pygments_lexer": "ipython2",
|
||||
"version": "2.7.6"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 4
|
||||
"nbformat_minor": 0
|
||||
}
|
||||
|
||||
@@ -1,334 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Oxylabs"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"[Oxylabs](https://oxylabs.io/) is a web intelligence collection platform that enables companies worldwide to unlock data-driven insights.\n",
|
||||
"\n",
|
||||
"## Overview\n",
|
||||
"\n",
|
||||
"Oxylabs document loader allows to load data from search engines, e-commerce sites, travel platforms, and any other website. It supports geolocation, browser rendering, data parsing, multiple user agents and many more parameters. Check out [Oxylabs documentation](https://developers.oxylabs.io/scraping-solutions/web-scraper-api) for more information.\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"### Integration details\n",
|
||||
"\n",
|
||||
"| Class | Package | Local | Serializable | Pricing |\n",
|
||||
"|:--------------|:------------------------------------------------------------------|:-----:|:------------:|:-----------------------------:|\n",
|
||||
"| OxylabsLoader | [langchain-oxylabs](https://github.com/oxylabs/langchain-oxylabs) | ✅ | ❌ | Free 5,000 results for 1 week |\n",
|
||||
"\n",
|
||||
"### Loader features\n",
|
||||
"| Document Lazy Loading |\n",
|
||||
"|:---------------------:|\n",
|
||||
"| ✅ |\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Setup"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Install the required dependencies.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"scrolled": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install -U langchain-oxylabs"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Credentials\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Set up the proper API keys and environment variables.\n",
|
||||
"Create your API user credentials: Sign up for a free trial or purchase the product\n",
|
||||
"in the [Oxylabs dashboard](https://dashboard.oxylabs.io/en/registration)\n",
|
||||
"to create your API user credentials (OXYLABS_USERNAME and OXYLABS_PASSWORD)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import getpass\n",
|
||||
"import os\n",
|
||||
"\n",
|
||||
"os.environ[\"OXYLABS_USERNAME\"] = getpass.getpass(\"Enter your Oxylabs username: \")\n",
|
||||
"os.environ[\"OXYLABS_PASSWORD\"] = getpass.getpass(\"Enter your Oxylabs password: \")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Initialization"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2025-08-06T10:57:51.630011Z",
|
||||
"start_time": "2025-08-06T10:57:51.623814Z"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_oxylabs import OxylabsLoader"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2025-08-06T10:57:53.685413Z",
|
||||
"start_time": "2025-08-06T10:57:53.628859Z"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = OxylabsLoader(\n",
|
||||
" urls=[\n",
|
||||
" \"https://sandbox.oxylabs.io/products/1\",\n",
|
||||
" \"https://sandbox.oxylabs.io/products/2\",\n",
|
||||
" ],\n",
|
||||
" params={\"markdown\": True},\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": "## Load"
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 18,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2025-08-06T10:59:51.487327Z",
|
||||
"start_time": "2025-08-06T10:59:48.592743Z"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"2751\n",
|
||||
"[](/)\n",
|
||||
"\n",
|
||||
"Game platforms:\n",
|
||||
"\n",
|
||||
"* **All**\n",
|
||||
"\n",
|
||||
"* [Nintendo platform](/products/category/nintendo)\n",
|
||||
"\n",
|
||||
"+ wii\n",
|
||||
"+ wii-u\n",
|
||||
"+ nintendo-64\n",
|
||||
"+ switch\n",
|
||||
"+ gamecube\n",
|
||||
"+ game-boy-advance\n",
|
||||
"+ 3ds\n",
|
||||
"+ ds\n",
|
||||
"\n",
|
||||
"* [Xbox platform](/products/category/xbox-platform)\n",
|
||||
"\n",
|
||||
"* **Dreamcast**\n",
|
||||
"\n",
|
||||
"* [Playstation platform](/products/category/playstation-platform)\n",
|
||||
"\n",
|
||||
"* **Pc**\n",
|
||||
"\n",
|
||||
"* **Stadia**\n",
|
||||
"\n",
|
||||
"Go Back\n",
|
||||
"\n",
|
||||
"Note!This is a sandbox website used for web scraping. Information listed in this website does not have any real meaning and should not be associated with the actual products.\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"## The Legend of Zelda: Ocarina of Time\n",
|
||||
"\n",
|
||||
"**Developer:** Nintendo**Platform:****Type:** singleplayer\n",
|
||||
"\n",
|
||||
"As a young boy, Link is tricked by Ganondorf, the King of the Gerudo Thieves. The evil human uses Link to g\n",
|
||||
"5542\n",
|
||||
"[](/)\n",
|
||||
"\n",
|
||||
"Game platforms:\n",
|
||||
"\n",
|
||||
"* **All**\n",
|
||||
"\n",
|
||||
"* [Nintendo platform](/products/category/nintendo)\n",
|
||||
"\n",
|
||||
"+ wii\n",
|
||||
"+ wii-u\n",
|
||||
"+ nintendo-64\n",
|
||||
"+ switch\n",
|
||||
"+ gamecube\n",
|
||||
"+ game-boy-advance\n",
|
||||
"+ 3ds\n",
|
||||
"+ ds\n",
|
||||
"\n",
|
||||
"* [Xbox platform](/products/category/xbox-platform)\n",
|
||||
"\n",
|
||||
"* **Dreamcast**\n",
|
||||
"\n",
|
||||
"* [Playstation platform](/products/category/playstation-platform)\n",
|
||||
"\n",
|
||||
"* **Pc**\n",
|
||||
"\n",
|
||||
"* **Stadia**\n",
|
||||
"\n",
|
||||
"Go Back\n",
|
||||
"\n",
|
||||
"Note!This is a sandbox website used for web scraping. Information listed in this website does not have any real meaning and should not be associated with the actual products.\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"## Super Mario Galaxy\n",
|
||||
"\n",
|
||||
"**Developer:** Nintendo**Platform:****Type:** singleplayer\n",
|
||||
"\n",
|
||||
"[Metacritic's 2007 Wii Game of the Year] The ultimate Nintendo hero is taking the ultimate step ... out into space. Join Mario as he ushers in a new era of video games, de\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"for document in loader.load():\n",
|
||||
" print(document.page_content[:1000])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"metadata": {},
|
||||
"cell_type": "markdown",
|
||||
"source": "## Lazy Load"
|
||||
},
|
||||
{
|
||||
"metadata": {},
|
||||
"cell_type": "code",
|
||||
"outputs": [],
|
||||
"execution_count": null,
|
||||
"source": [
|
||||
"for document in loader.lazy_load():\n",
|
||||
" print(document.page_content[:1000])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Advanced examples\n",
|
||||
"\n",
|
||||
"The following examples show the usage of `OxylabsLoader` with geolocation, currency, pagination and user agent parameters for Amazon Search and Google Search sources."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 21,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2025-08-06T11:04:19.901122Z",
|
||||
"start_time": "2025-08-06T11:04:19.838933Z"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = OxylabsLoader(\n",
|
||||
" queries=[\"gaming headset\", \"gaming chair\", \"computer mouse\"],\n",
|
||||
" params={\n",
|
||||
" \"source\": \"amazon_search\",\n",
|
||||
" \"parse\": True,\n",
|
||||
" \"geo_location\": \"DE\",\n",
|
||||
" \"currency\": \"EUR\",\n",
|
||||
" \"pages\": 3,\n",
|
||||
" },\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 23,
|
||||
"metadata": {
|
||||
"ExecuteTime": {
|
||||
"end_time": "2025-08-06T11:07:17.648142Z",
|
||||
"start_time": "2025-08-06T11:07:17.595629Z"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"loader = OxylabsLoader(\n",
|
||||
" queries=[\"europe gdp per capita\", \"us gdp per capita\"],\n",
|
||||
" params={\n",
|
||||
" \"source\": \"google_search\",\n",
|
||||
" \"parse\": True,\n",
|
||||
" \"geo_location\": \"Paris, France\",\n",
|
||||
" \"user_agent_type\": \"mobile\",\n",
|
||||
" },\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## API reference\n",
|
||||
"\n",
|
||||
"[More information about this package.](https://github.com/oxylabs/langchain-oxylabs)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.11.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
||||
@@ -132,12 +132,13 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_core.documents import Document\n",
|
||||
"from langchain_experimental.graph_transformers import LLMGraphTransformer\n",
|
||||
"\n",
|
||||
"# from langchain_experimental.graph_transformers import LLMGraphTransformer\n",
|
||||
"from langchain_openai import ChatOpenAI\n",
|
||||
"\n",
|
||||
"# Define the LLMGraphTransformer\n",
|
||||
|
||||
@@ -548,12 +548,12 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_core.documents import Document\n",
|
||||
"from langchain_experimental.graph_transformers import LLMGraphTransformer"
|
||||
"# from langchain_experimental.graph_transformers import LLMGraphTransformer"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
||||
@@ -44,7 +44,9 @@
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": "%pip install --upgrade --quiet llama-cpp-python"
|
||||
"source": [
|
||||
"%pip install --upgrade --quiet llama-cpp-python"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
@@ -62,7 +64,9 @@
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": "!CMAKE_ARGS=\"-DGGML_CUDA=on\" FORCE_CMAKE=1 pip install llama-cpp-python"
|
||||
"source": [
|
||||
"!CMAKE_ARGS=\"-DGGML_CUDA=on\" FORCE_CMAKE=1 pip install llama-cpp-python"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
@@ -76,7 +80,9 @@
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": "!CMAKE_ARGS=\"-DGGML_CUDA=on\" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir"
|
||||
"source": [
|
||||
"!CMAKE_ARGS=\"-DGGML_CUDA=on\" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
@@ -94,7 +100,9 @@
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": "!CMAKE_ARGS=\"-DLLAMA_METAL=on\" FORCE_CMAKE=1 pip install llama-cpp-python"
|
||||
"source": [
|
||||
"!CMAKE_ARGS=\"-DLLAMA_METAL=on\" FORCE_CMAKE=1 pip install llama-cpp-python"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
@@ -108,7 +116,9 @@
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": "!CMAKE_ARGS=\"-DLLAMA_METAL=on\" FORCE_CMAKE=1 pip install llama-cpp-python --force-reinstall --no-binary :all: --no-cache-dir"
|
||||
"source": [
|
||||
"!CMAKE_ARGS=\"-DLLAMA_METAL=on\" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
@@ -164,7 +174,9 @@
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": "!python -m pip install -e . --force-reinstall --no-cache-dir"
|
||||
"source": [
|
||||
"!python -m pip install -e . --force-reinstall --no-cache-dir"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
@@ -706,4 +718,4 @@
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 4
|
||||
}
|
||||
}
|
||||
|
||||
@@ -31,7 +31,7 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!pip install -U langchain-oci"
|
||||
"!pip install -U oci langchain-community"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -47,7 +47,7 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_oci.llms import OCIGenAI\n",
|
||||
"from langchain_community.llms.oci_generative_ai import OCIGenAI\n",
|
||||
"\n",
|
||||
"llm = OCIGenAI(\n",
|
||||
" model_id=\"cohere.command\",\n",
|
||||
|
||||
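The constructor above is cut off mid-call. For reference, a typical completed instantiation looks like the sketch below; the endpoint and compartment values are placeholders (the parameter names `service_endpoint` and `compartment_id` follow the OCIGenAI integration, but verify against your installed version):

```python
# Sketch of a completed constructor -- endpoint/compartment values are placeholders.
llm = OCIGenAI(
    model_id="cohere.command",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="MY_COMPARTMENT_OCID",
)

print(llm.invoke("Tell me one fact about OCI."))
```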
@@ -1,215 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# RecallioMemory + LangChain Integration Demo\n",
|
||||
"A minimal notebook to show drop-in usage of RecallioMemory in LangChain (with scoped writes and recall)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%pip install recallio langchain langchain-recallio openai"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Setup: API Keys & Imports"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_recallio.memory import RecallioMemory\n",
|
||||
"from langchain_openai import ChatOpenAI\n",
|
||||
"from langchain.prompts import ChatPromptTemplate\n",
|
||||
"import os\n",
|
||||
"\n",
|
||||
"# Set your keys here or use environment variables\n",
|
||||
"RECALLIO_API_KEY = os.getenv(\"RECALLIO_API_KEY\", \"YOUR_RECALLIO_API_KEY\")\n",
|
||||
"OPENAI_API_KEY = os.getenv(\"OPENAI_API_KEY\", \"YOUR_OPENAI_API_KEY\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Initialize RecallioMemory"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"memory = RecallioMemory(\n",
|
||||
" project_id=\"project_abc\",\n",
|
||||
" api_key=RECALLIO_API_KEY,\n",
|
||||
" session_id=\"demo-session-001\",\n",
|
||||
" user_id=\"demo-user-42\",\n",
|
||||
" default_tags=[\"test\", \"langchain\"],\n",
|
||||
" return_messages=True,\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Build a LangChain ConversationChain with RecallioMemory"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# You can swap in any supported LLM here\n",
|
||||
"llm = ChatOpenAI(api_key=OPENAI_API_KEY, temperature=0)\n",
|
||||
"prompt = ChatPromptTemplate.from_messages(\n",
|
||||
" [\n",
|
||||
" (\n",
|
||||
" \"system\",\n",
|
||||
" \"The following is a friendly conversation between a human and an AI. \"\n",
|
||||
" \"The AI is talkative and provides lots of specific details from its context. \"\n",
|
||||
" \"If the AI does not know the answer to a question, it truthfully says it does not know.\",\n",
|
||||
" ),\n",
|
||||
" (\"placeholder\", \"{history}\"), # RecallioMemory will fill this slot\n",
|
||||
" (\"human\", \"{input}\"),\n",
|
||||
" ]\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"# LCEL chain that returns an AIMessage\n",
|
||||
"base_chain = prompt | llm\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# Create a stateful chain using RecallioMemory\n",
|
||||
"def chat_with_memory(user_input: str):\n",
|
||||
" # Load conversation history from memory\n",
|
||||
" memory_vars = memory.load_memory_variables({\"input\": user_input})\n",
|
||||
"\n",
|
||||
" # Run the chain with history and user input\n",
|
||||
" response = base_chain.invoke(\n",
|
||||
" {\"input\": user_input, \"history\": memory_vars.get(\"history\", \"\")}\n",
|
||||
" )\n",
|
||||
"\n",
|
||||
" # Save the conversation to memory\n",
|
||||
" memory.save_context({\"input\": user_input}, {\"output\": response.content})\n",
|
||||
"\n",
|
||||
" return response"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Example: Chat with Memory"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Bot: Hello Guillaume! It's nice to meet you. How can I assist you today?\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# First user message – note the AI remembers the name\n",
|
||||
"resp1 = chat_with_memory(\"Hi! My name is Guillaume. Remember that.\")\n",
|
||||
"print(\"Bot:\", resp1.content)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Bot: Your name is Guillaume.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Second user message – AI should recall the name from memory\n",
|
||||
"resp2 = chat_with_memory(\"What is my name?\")\n",
|
||||
"print(\"Bot:\", resp2.content)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## See What Is Stored in Recallio\n",
|
||||
"This is for debugging/demo only; in production, you wouldn't do this on every run."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Current memory variables: {'history': [HumanMessage(content='Name is Guillaume', additional_kwargs={}, response_metadata={})]}\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(\"Current memory variables:\", memory.load_memory_variables({}))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Clear Memory (Optional Cleanup - Requires Manager level Key)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# memory.clear()\n",
|
||||
"# print(\"Memory cleared.\")"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"name": "python",
|
||||
"version": "3.10"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
||||
26
docs/docs/integrations/providers/aerospike.mdx
Normal file
@@ -0,0 +1,26 @@
# Aerospike

>[Aerospike](https://aerospike.com/docs/vector) is a high-performance, distributed database known for its speed and scalability, now with support for vector storage and search, enabling retrieval and search of embedding vectors for machine learning and AI applications.
> See the documentation for Aerospike Vector Search (AVS) [here](https://aerospike.com/docs/vector).

## Installation and Setup

Install the AVS Python SDK and AVS langchain vector store:

```bash
pip install aerospike-vector-search langchain-aerospike
```

See the documentation for the Python SDK [here](https://aerospike-vector-search-python-client.readthedocs.io/en/latest/index.html).
The documentation for the AVS langchain vector store is [here](https://langchain-aerospike.readthedocs.io/en/latest/).

## Vector Store

To import this vectorstore:

```python
from langchain_aerospike.vectorstores import Aerospike
```

See a usage example [here](https://python.langchain.com/docs/integrations/vectorstores/aerospike/).
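
For orientation, a minimal sketch of wiring the store together. This is not the canonical example from the linked page: the client construction follows the AVS Python SDK docs and may differ between SDK versions, and the `namespace` value and embeddings class are placeholders.

```python
# Sketch only: treat the client/seed names and parameters as assumptions and
# verify them against the AVS SDK version you install.
from aerospike_vector_search import Client, types
from langchain_aerospike.vectorstores import Aerospike
from langchain_openai import OpenAIEmbeddings  # any Embeddings implementation works

# Assumes a local AVS instance; host/port are placeholders.
client = Client(seeds=types.HostPort(host="localhost", port=5000))

store = Aerospike(
    client=client,
    embedding=OpenAIEmbeddings(),
    namespace="test",
)
store.add_texts(["Aerospike Vector Search stores and searches embeddings."])
print(store.similarity_search("vector search", k=1))
```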

@@ -1,38 +0,0 @@
# Anchor Browser

[Anchor](https://anchorbrowser.io?utm=langchain) is a platform for agentic AI browser automation, which solves the challenge of automating workflows for web applications that lack APIs or have limited API coverage. It simplifies the creation, deployment, and management of browser-based automations, transforming complex web interactions into simple API endpoints.

`langchain-anchorbrowser` provides 3 main tools:
- `AnchorContentTool` - For web content extraction in Markdown or HTML format.
- `AnchorScreenshotTool` - For web page screenshots.
- `AnchorWebTaskTools` - To perform web tasks.

## Quickstart

### Installation

Install the package:

```bash
pip install langchain-anchorbrowser
```

### Usage

Import and use your intended tool. For the full list of available Anchor Browser tools, see the **Tool Features** table on the [Anchor Browser tool page](/docs/integrations/tools/anchor_browser).

```python
from langchain_anchorbrowser import AnchorContentTool

# Get Markdown content for https://www.anchorbrowser.io
AnchorContentTool().invoke(
    {"url": "https://www.anchorbrowser.io", "format": "markdown"}
)
```

## Additional Resources

- [PyPi](https://pypi.org/project/langchain-anchorbrowser)
- [Github](https://github.com/anchorbrowser/langchain-anchorbrowser)
- [Anchor Browser Docs](https://docs.anchorbrowser.io/introduction?utm=langchain)
- [Anchor Browser API Reference](https://docs.anchorbrowser.io/api-reference/ai-tools/perform-web-task?utm=langchain)
@@ -21,38 +21,6 @@ pip install deepeval

See an [example](/docs/integrations/callbacks/confident).


## Modern Integration Example

Install the required packages:

```bash
pip install deepeval langchain langchain-openai
```

Authenticate with your API key:

```python
import os
import deepeval

# Load API key from environment variable for security
api_key = os.environ.get("DEEPEVAL_API_KEY")
deepeval.login(api_key)
from langchain.callbacks.confident_callback import DeepEvalCallbackHandler
```

Use the new callback handler:

```python
from deepeval.integrations.langchain import CallbackHandler

handler = CallbackHandler(
    name="My Trace",
    tags=["production", "v1"],
    metadata={"experiment": "A/B"},
    thread_id="thread-123",
    user_id="user-456"
)
```

See the [full example](/docs/integrations/callbacks/confident).
@@ -929,41 +929,6 @@ from langchain_google_community.gmail.search import GmailSearch
from langchain_google_community.gmail.send_message import GmailSendMessage
```

### MCP Toolbox

[MCP Toolbox](https://github.com/googleapis/genai-toolbox) provides a simple and efficient way to connect to your databases, including those on Google Cloud like [Cloud SQL](https://cloud.google.com/sql/docs) and [AlloyDB](https://cloud.google.com/alloydb/docs/overview). With MCP Toolbox, you can seamlessly integrate your database with LangChain to build powerful, data-driven applications.

#### Installation

To get started, [install the Toolbox server and client](https://github.com/googleapis/genai-toolbox/releases/).


[Configure](https://googleapis.github.io/genai-toolbox/getting-started/configure/) a `tools.yaml` to define your tools, and then execute toolbox to start the server:

```bash
toolbox --tools-file "tools.yaml"
```

Then, install the Toolbox client:

```bash
pip install toolbox-langchain
```

#### Getting Started

Here is a quick example of how to use MCP Toolbox to connect to your database:

```python
from toolbox_langchain import ToolboxClient

async with ToolboxClient("http://127.0.0.1:5000") as client:
    tools = client.load_toolset()
```

See [usage example and setup instructions](/docs/integrations/tools/toolbox).

### Memory

Store conversation history using Google Cloud databases.

@@ -2,10 +2,17 @@

This will help you get started with DigitalOcean Gradient [chat models](/docs/concepts/chat_models).

## Overview
### Integration details

| Class | Package | Package downloads | Package latest |
| :--- | :--- | :---: | :---: |
| [ChatGradient](https://python.langchain.com/api_reference/langchain-gradient/chat_models/langchain_gradient.chat_models.ChatGradient.html) | [langchain-gradient](https://python.langchain.com/api_reference/langchain-gradient/) |  |  |


## Setup

langchain-gradient uses DigitalOcean's Gradient™ AI Platform.
langchain-gradient uses DigitalOcean Gradient Platform.

Create an account on DigitalOcean, acquire a `DIGITALOCEAN_INFERENCE_KEY` API key from the Gradient Platform, and install the `langchain-gradient` integration package.
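
A minimal sketch of the resulting setup. The import path is inferred from the `langchain_gradient.chat_models.ChatGradient` API reference above; the model name and the `api_key` parameter are assumptions, so verify them against the package docs.

```python
import os

from langchain_gradient import ChatGradient  # import path inferred from the API reference

# Assumes DIGITALOCEAN_INFERENCE_KEY is set; the model name is a placeholder.
llm = ChatGradient(
    model="llama3.3-70b-instruct",
    api_key=os.environ["DIGITALOCEAN_INFERENCE_KEY"],
)
print(llm.invoke("Hello from the Gradient Platform!").content)
```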


@@ -77,7 +77,7 @@ from langchain_ibm import WatsonxRerank
See a [usage example](/docs/integrations/tools/ibm_watsonx).

```python
from langchain_ibm.agent_toolkits.utility import WatsonxToolkit
from langchain_ibm import WatsonxToolkit
```

## DB2

@@ -40,11 +40,11 @@ embeddings.embed_query("What is the meaning of life?")
```

## LLMs
`ModelScopeEndpoint` class exposes LLMs from ModelScope.
`ModelScopeLLM` class exposes LLMs from ModelScope.

```python
from langchain_modelscope import ModelScopeEndpoint
from langchain_modelscope import ModelScopeLLM

llm = ModelScopeEndpoint(model="Qwen/Qwen2.5-Coder-32B-Instruct")
llm = ModelScopeLLM(model="Qwen/Qwen2.5-Coder-32B-Instruct")
llm.invoke("The meaning of life is")
```

@@ -11,17 +11,17 @@ The `LangChain` integrations related to [Oracle Cloud Infrastructure](https://ww
To use, you should have the latest `oci` python SDK and the langchain_community package installed.

```bash
pip install -U langchain_oci
pip install -U oci langchain-community
```

See [chat](/docs/integrations/llms/oci_generative_ai), [complete](/docs/integrations/chat/oci_generative_ai), and [embedding](/docs/integrations/text_embedding/oci_generative_ai) usage examples.

```python
from langchain_oci.chat_models import ChatOCIGenAI
from langchain_community.chat_models import ChatOCIGenAI

from langchain_oci.llms import OCIGenAI
from langchain_community.llms import OCIGenAI

from langchain_oci.embeddings import OCIGenAIEmbeddings
from langchain_community.embeddings import OCIGenAIEmbeddings
```
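
For orientation, a hedged sketch of instantiating the chat model with the post-move import. The model ID, endpoint, and compartment OCID below are placeholders, not values from this changeset.

```python
from langchain_oci.chat_models import ChatOCIGenAI

# All identifiers below are placeholders; substitute values from your tenancy.
chat = ChatOCIGenAI(
    model_id="cohere.command-r-plus",
    service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    compartment_id="<compartment-ocid>",
)
print(chat.invoke("Say hello.").content)
```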

## OCI Data Science Model Deployment Endpoint
@@ -42,8 +42,8 @@ See [chat](/docs/integrations/chat/oci_data_science) and [complete](/docs/integr


```python
from langchain_oci.chat_models import ChatOCIModelDeployment
from langchain_community.chat_models import ChatOCIModelDeployment

from langchain_oci.llms import OCIModelDeploymentLLM
from langchain_community.llms import OCIModelDeploymentLLM
```


@@ -3,11 +3,13 @@
>[Ollama](https://ollama.com/) allows you to run open-source large language models,
> such as [gpt-oss](https://ollama.com/library/gpt-oss), locally.
>
>`Ollama` bundles model weights, configuration, and data into a single package, defined by a Modelfile.
>It optimizes setup and configuration details, including GPU usage.
>For a complete list of supported models and model variants, see the [Ollama model library](https://ollama.ai/library).
>The `ollama` [package](https://pypi.org/project/ollama/0.5.3/) bundles model weights,
> configuration, and data into a single package, defined by a Modelfile. It optimizes
> setup and configuration details, including GPU usage.
>For a complete list of supported models and model variants, see the
> [Ollama model library](https://ollama.com/search).

See [this guide](/docs/how_to/local_llms#ollama) for more details
See [this guide](/docs/how_to/local_llms/#ollama) for more details
on how to use `ollama` with LangChain.
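
For orientation, a minimal sketch of what usage looks like once the setup below is complete. It assumes the `langchain-ollama` integration package is installed and uses the same model name as the pull example further down.

```python
from langchain_ollama import ChatOllama  # assumes `pip install langchain-ollama`

# Assumes `ollama serve` is running and the model below has been pulled.
llm = ChatOllama(model="gpt-oss:20b")
print(llm.invoke("Why is the sky blue?").content)
```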

## Installation and Setup
@@ -23,7 +25,7 @@ Ollama will start as a background service automatically, if this is disabled, ru
ollama serve
```

After starting ollama, run `ollama pull <name-of-model>` to download a model from the [Ollama model library](https://ollama.ai/library):
After starting ollama, run `ollama pull <name-of-model>` to download a model from the [Ollama model library](https://ollama.com/library):

```bash
ollama pull gpt-oss:20b

@@ -1,31 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Recallio\n",
"\n",
"[Recallio](https://recallio.ai/) is a powerful API that lets you store, index, and retrieve application “memories” with built-in fact extraction, dynamic summaries, reranked recall, and a full knowledge-graph layer.\n",
"\n",
"\n",
"## Installation\n",
"\n",
"```bash\n",
"pip install langchain-recallio\n",
"```\n",
"\n",
"```python\n",
"from langchain_recallio.memory import RecallioMemory\n",
"```"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
@@ -1,26 +0,0 @@
# Scrapeless

[Scrapeless](https://scrapeless.com) offers flexible and feature-rich data acquisition services with extensive parameter customization and multi-format export support.

## Installation and Setup

```bash
pip install langchain-scrapeless
```

You'll need to set up your Scrapeless API key:

```python
import os
os.environ["SCRAPELESS_API_KEY"] = "your-api-key"
```

## Tools

The Scrapeless integration provides several tools (a short usage sketch follows the list):

- [ScrapelessDeepSerpGoogleSearchTool](/docs/integrations/tools/scrapeless_scraping_api) - Enables comprehensive extraction of Google SERP data across all result types.
- [ScrapelessDeepSerpGoogleTrendsTool](/docs/integrations/tools/scrapeless_scraping_api) - Retrieves keyword trend data from Google, including popularity over time, regional interest, and related searches.
- [ScrapelessUniversalScrapingTool](/docs/integrations/tools/scrapeless_universal_scraping) - Access and extract data from JS-rendered websites that typically block bots.
- [ScrapelessCrawlerCrawlTool](/docs/integrations/tools/scrapeless_crawl) - Crawl a website and its linked pages to extract comprehensive data.
- [ScrapelessCrawlerScrapeTool](/docs/integrations/tools/scrapeless_crawl) - Extract information from a single webpage.
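
A minimal usage sketch, mirroring the Universal Scraping notebook elsewhere in this changeset:

```python
from langchain_scrapeless import ScrapelessUniversalScrapingTool

tool = ScrapelessUniversalScrapingTool()

# Fetch a JS-rendered page and return it as Markdown.
result = tool.invoke({"url": "https://example.com", "response_type": "markdown"})
print(result)
```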
@@ -1,43 +0,0 @@
# langchain-siliconflow

This package contains the LangChain integration with SiliconFlow.

## Installation

```bash
pip install -U langchain-siliconflow
```

You should configure credentials by setting the following environment variable:

```bash
export SILICONFLOW_API_KEY="your-api-key"
```

You can set the following environment variable to use the `.cn` endpoint:

```bash
export SILICONFLOW_BASE_URL="https://api.siliconflow.cn/v1"
```

## Chat Models

`ChatSiliconFlow` class exposes chat models from SiliconFlow.

```python
from langchain_siliconflow import ChatSiliconFlow

llm = ChatSiliconFlow()
llm.invoke("Sing a ballad of LangChain.")
```

## Embeddings

`SiliconFlowEmbeddings` class exposes embeddings from SiliconFlow.

```python
from langchain_siliconflow import SiliconFlowEmbeddings

embeddings = SiliconFlowEmbeddings()
embeddings.embed_query("What is the meaning of life?")
```
@@ -1,23 +0,0 @@
# MCP Toolbox

The [MCP Toolbox](https://googleapis.github.io/genai-toolbox/getting-started/introduction/) in LangChain allows you to equip an agent with a set of tools. When the agent receives a query, it can intelligently select and use the most appropriate tool provided by MCP Toolbox to fulfill the request.

## What is it?

MCP Toolbox is essentially a container for your tools. Think of it as a multi-tool device for your agent; it can hold any tools you create. The agent then decides which specific tool to use based on the user's input.

This is particularly useful when you have an agent that needs to perform a variety of tasks that require different capabilities.

## Installation

To get started, you'll need to install the necessary package:

```bash
pip install toolbox-langchain
```
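
A minimal sketch of loading tools from a running Toolbox server, mirroring the snippet in the Google provider page earlier in this changeset; the URL assumes a local server on port 5000.

```python
from toolbox_langchain import ToolboxClient

async def load_tools():
    # Assumes `toolbox --tools-file "tools.yaml"` is already serving locally.
    async with ToolboxClient("http://127.0.0.1:5000") as client:
        return client.load_toolset()  # LangChain-compatible tools for an agent
```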

## Tutorial

For a complete, step-by-step guide on how to create, configure, and use MCP Toolbox with your agents, please refer to our detailed Jupyter notebook tutorial.

**[➡️ View the full tutorial here](/docs/integrations/tools/toolbox)**.
@@ -1,101 +0,0 @@
# TrueFoundry

TrueFoundry provides an enterprise-ready [AI Gateway](https://www.truefoundry.com/ai-gateway) that brings governance and observability to agentic frameworks like LangChain. TrueFoundry AI Gateway serves as a unified interface for LLM access, providing:

- **Unified API Access**: Connect to 250+ LLMs (OpenAI, Claude, Gemini, Groq, Mistral) through one API
- **Low Latency**: Sub-3ms internal latency with intelligent routing and load balancing
- **Enterprise Security**: SOC 2, HIPAA, GDPR compliance with RBAC and audit logging
- **Quota and cost management**: Token-based quotas, rate limiting, and comprehensive usage tracking
- **Observability**: Full request/response logging, metrics, and traces with customizable retention


## Prerequisites

Before integrating LangChain with TrueFoundry, ensure you have:

1. **TrueFoundry Account**: A [TrueFoundry account](https://www.truefoundry.com/register) with at least one model provider configured. Follow the quick start guide [here](https://docs.truefoundry.com/gateway/quick-start).
2. **Personal Access Token**: Generate a token by following the [TrueFoundry token generation guide](https://docs.truefoundry.com/gateway/authentication).

## Quickstart

You can connect to TrueFoundry's unified LLM gateway through the `ChatOpenAI` interface.

- Set the `base_url` to your TrueFoundry endpoint (explained below)
- Set the `api_key` to your TrueFoundry [PAT (Personal Access Token)](https://docs.truefoundry.com/gateway/authentication#personal-access-token-pat)
- Use the same `model-name` as shown in the unified code snippet

### Installation

```bash
pip install langchain-openai
```

### Basic Setup

Connect to TrueFoundry by updating the `ChatOpenAI` model in LangChain:

```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    api_key=TRUEFOUNDRY_API_KEY,
    base_url=TRUEFOUNDRY_GATEWAY_BASE_URL,
    model="openai-main/gpt-4o"  # Similarly, you can call any model from any model provider
)

llm.invoke("What is the meaning of life, universe and everything?")
```

The request is routed through your TrueFoundry gateway to the specified model provider. TrueFoundry automatically handles rate limiting, load balancing, and observability.

### LangGraph Integration

```python
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, MessagesState
from langchain_core.messages import HumanMessage

# Define your LangGraph workflow
def call_model(state: MessagesState):
    model = ChatOpenAI(
        api_key=TRUEFOUNDRY_API_KEY,
        base_url=TRUEFOUNDRY_GATEWAY_BASE_URL,
        # Copy the exact model name from gateway
        model="openai-main/gpt-4o"
    )
    response = model.invoke(state["messages"])
    return {"messages": [response]}

# Build workflow
workflow = StateGraph(MessagesState)
workflow.add_node("agent", call_model)
workflow.set_entry_point("agent")
workflow.set_finish_point("agent")

app = workflow.compile()

# Run agent through TrueFoundry
result = app.invoke({"messages": [HumanMessage(content="Hello!")]})
```


## Observability and Governance

With the Metrics Dashboard, you can monitor and analyze:

- **Performance Metrics**: Track key latency metrics like Request Latency, Time to First Token (TTFT), and Inter-Token Latency (ITL) with P99, P90, and P50 percentiles
- **Cost and Token Usage**: Gain visibility into your application's costs with detailed breakdowns of input/output tokens and the associated expenses for each model
- **Usage Patterns**: Understand how your application is being used with detailed analytics on user activity, model distribution, and team-based usage
- **Rate Limiting & Load Balancing**: Configure limits, distribute traffic across models, and set up fallbacks

## Support

For questions, issues, or support:

- **Email**: [support@truefoundry.com](mailto:support@truefoundry.com)
- **Documentation**: [https://docs.truefoundry.com/](https://docs.truefoundry.com/)
@@ -103,9 +103,7 @@
"cell_type": "markdown",
"id": "c84fb993",
"metadata": {},
"source": [
"To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:"
]
"source": "To enable automated tracing of your model calls, set your [LangSmith](https://docs.smith.langchain.com/) API key:"
},
{
"cell_type": "code",
@@ -159,7 +157,7 @@
"from langchain_google_vertexai import VertexAIEmbeddings\n",
"\n",
"# Initialize a specific embeddings model version\n",
"embeddings = VertexAIEmbeddings(model_name=\"gemini-embedding-001\")"
"embeddings = VertexAIEmbeddings(model_name=\"text-embedding-004\")"
]
},
{

@@ -31,7 +31,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install -U langchain_oci"
"!pip install -U oci"
]
},
{
@@ -71,7 +71,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_oci.embeddings import OCIGenAIEmbeddings\n",
"from langchain_community.embeddings import OCIGenAIEmbeddings\n",
"\n",
"# use default authN method API-key\n",
"embeddings = OCIGenAIEmbeddings(\n",

@@ -1,304 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a6f91f20",
"metadata": {},
"source": [
"# Anchor Browser\n",
"\n",
"Anchor is a platform for agentic AI browser automation, which solves the challenge of automating workflows for web applications that lack APIs or have limited API coverage. It simplifies the creation, deployment, and management of browser-based automations, transforming complex web interactions into simple API endpoints.\n",
"\n",
"This notebook provides a quick overview for getting started with Anchor Browser tools. For more information about Anchor Browser, visit [Anchorbrowser.io](https://anchorbrowser.io?utm=langchain) or the [Anchor Browser Docs](https://docs.anchorbrowser.io?utm=langchain).\n",
"\n",
"## Overview\n",
"\n",
"### Integration details\n",
"\n",
"The Anchor Browser package for LangChain is [langchain-anchorbrowser](https://pypi.org/project/langchain-anchorbrowser).\n",
"\n",
"\n",
"### Tool features\n",
"| Tool Name | Package | Description | Parameters |\n",
"| :--- | :--- | :--- | :---|\n",
"| `AnchorContentTool` | langchain-anchorbrowser | Extract text content from web pages | `url`, `format` |\n",
"| `AnchorScreenshotTool` | langchain-anchorbrowser | Take screenshots of web pages | `url`, `width`, `height`, `image_quality`, `wait`, `scroll_all_content`, `capture_full_height`, `s3_target_address` |\n",
"| `AnchorWebTaskToolKit` | langchain-anchorbrowser | Perform intelligent web tasks using AI (Simple & Advanced modes) | see below |\n",
"\n",
"The parameters allowed in `langchain-anchorbrowser` are only a subset of those listed in the Anchor Browser API reference respectively: [Get Webpage Content](https://docs.anchorbrowser.io/sdk-reference/tools/get-webpage-content?utm=langchain), [Screenshot Webpage](https://docs.anchorbrowser.io/sdk-reference/tools/screenshot-webpage?utm=langchain), and [Perform Web Task](https://docs.anchorbrowser.io/sdk-reference/ai-tools/perform-web-task?utm=langchain).\n",
"\n",
"**Info:** Anchor currently implements the `SimpleAnchorWebTaskTool` and `AdvancedAnchorWebTaskTool` tools for LangChain with the `browser_use` agent.\n",
"\n",
"#### AnchorWebTaskToolKit Tools\n",
"\n",
"The difference between each tool in this toolkit is the pydantic configuration structure.\n",
"| Tool Name | Package | Parameters |\n",
"| :--- | :--- | :--- |\n",
"| `SimpleAnchorWebTaskTool` | langchain-anchorbrowser | prompt, url |\n",
"| `AdvancedAnchorWebTaskTool` | langchain-anchorbrowser | prompt, url, output_schema |\n",
"\n",
"## Setup\n",
"\n",
"The integration lives in the `langchain-anchorbrowser` package."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f85b4089",
"metadata": {},
"outputs": [],
"source": [
"%pip install --quiet -U langchain-anchorbrowser pydantic"
]
},
{
"cell_type": "markdown",
"id": "b15e9266",
"metadata": {},
"source": [
"### Credentials\n",
"\n",
"Use your Anchor Browser credentials. Get them from the Anchor Browser [API Keys page](https://app.anchorbrowser.io/api-keys?utm=langchain) as needed."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e0b178a2-8816-40ca-b57c-ccdd86dde9c9",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"import os\n",
"\n",
"if not os.environ.get(\"ANCHORBROWSER_API_KEY\"):\n",
"    os.environ[\"ANCHORBROWSER_API_KEY\"] = getpass.getpass(\"ANCHORBROWSER API key:\\n\")"
]
},
{
"cell_type": "markdown",
"id": "1c97218f-f366-479d-8bf7-fe9f2f6df73f",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Easily instantiate Anchor Browser tool instances."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8b3ddfe9-ca79-494c-a7ab-1f56d9407a64",
"metadata": {},
"outputs": [],
"source": [
"from langchain_anchorbrowser import (\n",
"    AnchorContentTool,\n",
"    AnchorScreenshotTool,\n",
"    AdvancedAnchorWebTaskTool,\n",
")\n",
"\n",
"anchor_content_tool = AnchorContentTool()\n",
"anchor_screenshot_tool = AnchorScreenshotTool()\n",
"anchor_advanced_web_task_tool = AdvancedAnchorWebTaskTool()"
]
},
{
"cell_type": "markdown",
"id": "74147a1a",
"metadata": {},
"source": [
"## Invocation\n",
"\n",
"### [Invoke directly with args](/docs/concepts/tools/#use-the-tool-directly)\n",
"\n",
"The full list of available arguments appears above in the tool features table."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "65310a8b-eb0c-4d9e-a618-4f4abe2414fc",
"metadata": {},
"outputs": [],
"source": [
"# Get Markdown content for https://www.anchorbrowser.io\n",
"anchor_content_tool.invoke(\n",
"    {\"url\": \"https://www.anchorbrowser.io\", \"format\": \"markdown\"}\n",
")\n",
"\n",
"# Get a screenshot for https://docs.anchorbrowser.io\n",
"anchor_screenshot_tool.invoke(\n",
"    {\"url\": \"https://docs.anchorbrowser.io\", \"width\": 1280, \"height\": 720}\n",
")\n",
"\n",
"# Define a Pydantic model for the web task output schema\n",
"from pydantic import BaseModel\n",
"from typing import List\n",
"\n",
"\n",
"class NodeCpuUsage(BaseModel):\n",
"    node: str\n",
"    cluster: str\n",
"    cpu_avg_percentage: float\n",
"\n",
"\n",
"class OutputSchema(BaseModel):\n",
"    nodes_cpu_usage: List[NodeCpuUsage]\n",
"\n",
"\n",
"# Run a web task to collect data from a web page\n",
"anchor_advanced_web_task_tool.invoke(\n",
"    {\n",
"        \"prompt\": \"Collect the node names and their CPU average %\",\n",
"        \"url\": \"https://play.grafana.org/a/grafana-k8s-app/navigation/nodes?from=now-1h&to=now&refresh=1m\",\n",
"        \"output_schema\": OutputSchema.model_json_schema(),\n",
"    }\n",
")"
]
},
{
"cell_type": "markdown",
"id": "d6e73897",
"metadata": {},
"source": [
"### [Invoke with ToolCall](/docs/concepts/tool_calling/#tool-execution)\n",
"\n",
"We can also invoke the tool with a model-generated ToolCall, in which case a ToolMessage will be returned:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f90e33a7",
"metadata": {},
"outputs": [],
"source": [
"# This is usually generated by a model, but we'll create a tool call directly for demo purposes.\n",
"model_generated_tool_call = {\n",
"    \"args\": {\"url\": \"https://www.anchorbrowser.io\", \"format\": \"markdown\"},\n",
"    \"id\": \"1\",\n",
"    \"name\": anchor_content_tool.name,\n",
"    \"type\": \"tool_call\",\n",
"}\n",
"anchor_content_tool.invoke(model_generated_tool_call)"
]
},
{
"cell_type": "markdown",
"id": "659f9fbd-6fcf-445f-aa8c-72d8e60154bd",
"metadata": {},
"source": [
"## Chaining\n",
"\n",
"We can use our tool in a chain by first binding it to a [tool-calling model](/docs/how_to/tool_calling/) and then calling it:\n",
"## Use within an agent"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c67bfd54",
"metadata": {},
"outputs": [],
"source": [
"%pip install -qU langchain langchain-openai"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "af3123ad-7a02-40e5-b58e-7d56e23e5830",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chat_models import init_chat_model\n",
"\n",
"llm = init_chat_model(model=\"gpt-4o\", model_provider=\"openai\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "210511c8",
"metadata": {},
"outputs": [],
"source": [
"if not os.environ.get(\"OPENAI_API_KEY\"):\n",
"    os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OPENAI API key:\\n\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fdbf35b5-3aaf-4947-9ec6-48c21533fb95",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnableConfig, chain\n",
"\n",
"prompt = ChatPromptTemplate(\n",
"    [\n",
"        (\"system\", \"You are a helpful assistant.\"),\n",
"        (\"human\", \"{user_input}\"),\n",
"        (\"placeholder\", \"{messages}\"),\n",
"    ]\n",
")\n",
"\n",
"# specifying tool_choice will force the model to call this tool.\n",
"llm_with_tools = llm.bind_tools(\n",
"    [anchor_content_tool], tool_choice=anchor_content_tool.name\n",
")\n",
"\n",
"llm_chain = prompt | llm_with_tools\n",
"\n",
"\n",
"@chain\n",
"def tool_chain(user_input: str, config: RunnableConfig):\n",
"    input_ = {\"user_input\": user_input}\n",
"    ai_msg = llm_chain.invoke(input_, config=config)\n",
"    tool_msgs = anchor_content_tool.batch(ai_msg.tool_calls, config=config)\n",
"    return llm_chain.invoke({**input_, \"messages\": [ai_msg, *tool_msgs]}, config=config)\n",
"\n",
"\n",
"tool_chain.invoke(input())"
]
},
{
"cell_type": "markdown",
"id": "4ac8146c",
"metadata": {},
"source": [
"## API reference\n",
"\n",
" - [PyPi](https://pypi.org/project/langchain-anchorbrowser)\n",
" - [Github](https://github.com/anchorbrowser/langchain-anchorbrowser)\n",
" - [Anchor Browser Docs](https://docs.anchorbrowser.io/introduction?utm=langchain)\n",
" - [Anchor Browser API Reference](https://docs.anchorbrowser.io/api-reference/ai-tools/perform-web-task?utm=langchain)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "langchain",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -30,7 +30,7 @@
"\n",
"| Class | Package | Serializable | [JS support](https://js.langchain.com/docs/integrations/toolkits/ibm/) | Package downloads | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: | :---: |\n",
"| [WatsonxToolkit](https://python.langchain.com/api_reference/ibm/agent_toolkits/langchain_ibm.agent_toolkits.utility.toolkit.WatsonxToolkit.html) | [langchain-ibm](https://python.langchain.com/api_reference/ibm/index.html) | ❌ | ✅ |  |  |"
"| [WatsonxToolkit](https://python.langchain.com/api_reference/ibm/toolkit/langchain_ibm.toolkit.WatsonxToolkit.html) | [langchain-ibm](https://python.langchain.com/api_reference/ibm/index.html) | ❌ | ✅ |  |  |"
]
},
{
@@ -78,9 +78,7 @@
"import os\n",
"\n",
"os.environ[\"WATSONX_URL\"] = \"your service instance url\"\n",
"os.environ[\"WATSONX_TOKEN\"] = \"your token for accessing the CLOUD or CPD cluster\"\n",
"os.environ[\"WATSONX_PASSWORD\"] = \"your password for accessing the CPD cluster\"\n",
"os.environ[\"WATSONX_USERNAME\"] = \"your username for accessing the CPD cluster\""
"os.environ[\"WATSONX_TOKEN\"] = \"your token for accessing the service instance\""
]
},
{
@@ -118,38 +116,17 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from langchain_ibm.agent_toolkits.utility import WatsonxToolkit\n",
"from langchain_ibm import WatsonxToolkit\n",
"\n",
"watsonx_toolkit = WatsonxToolkit(\n",
"    url=\"https://us-south.ml.cloud.ibm.com\",\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, you can use Cloud Pak for Data credentials. For details, see [watsonx.ai software setup](https://ibm.github.io/watsonx-ai-python-sdk/setup_cpd.html)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"watsonx_toolkit = WatsonxToolkit(\n",
"    url=\"PASTE YOUR URL HERE\",\n",
"    username=\"PASTE YOUR USERNAME HERE\",\n",
"    password=\"PASTE YOUR PASSWORD HERE\",\n",
"    version=\"5.2\",\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -176,7 +153,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tools"
"## Tools\n"
]
},
{
@@ -210,14 +187,6 @@
"watsonx_toolkit.get_tools()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> **NOTE** \n",
"> The list of available tools may vary depending on whether it is IBM watsonx.ai for IBM Cloud or IBM watsonx.ai software."
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -251,7 +220,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"metadata": {},
"outputs": [
{
@@ -266,7 +235,7 @@
}
],
"source": [
"search_result = google_search.invoke({\"q\": \"IBM\"})\n",
"search_result = google_search.invoke(input=\"IBM\")\n",
"search_result"
]
},
@@ -339,7 +308,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
@@ -348,7 +317,7 @@
"config = {\"maxResults\": 3}\n",
"google_search.set_tool_config(config)\n",
"\n",
"search_result = google_search.invoke({\"q\": \"IBM\"})\n",
"search_result = google_search.invoke(input=\"IBM\")\n",
"output = json.loads(search_result.get(\"output\"))"
]
},
@@ -609,13 +578,13 @@
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all `WatsonxToolkit` features and configurations head to the [API reference](https://python.langchain.com/api_reference/ibm/agent_toolkits/langchain_ibm.agent_toolkits.utility.toolkit.WatsonxToolkit.html)."
"For detailed documentation of all `WatsonxToolkit` features and configurations head to the [API reference](https://python.langchain.com/api_reference/ibm/toolkit/langchain_ibm.toolkit.WatsonxToolkit.html)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "langchain_ibm_repo_env",
"display_name": "langchain_env",
"language": "python",
"name": "python3"
},

@@ -32,7 +32,6 @@
"| [SmartScraperTool](https://python.langchain.com/docs/integrations/tools/scrapegraph) | langchain-scrapegraph | ✅ | ❌ |  |\n",
"| [SmartCrawlerTool](https://python.langchain.com/docs/integrations/tools/scrapegraph) | langchain-scrapegraph | ✅ | ❌ |  |\n",
"| [MarkdownifyTool](https://python.langchain.com/docs/integrations/tools/scrapegraph) | langchain-scrapegraph | ✅ | ❌ |  |\n",
"| [AgenticScraperTool](https://python.langchain.com/docs/integrations/tools/scrapegraph) | langchain-scrapegraph | ✅ | ❌ |  |\n",
"| [GetCreditsTool](https://python.langchain.com/docs/integrations/tools/scrapegraph) | langchain-scrapegraph | ✅ | ❌ |  |\n",
"\n",
"### Tool features\n",
@@ -42,7 +41,6 @@
"| SmartScraperTool | Extract structured data from websites | URL + prompt | JSON |\n",
"| SmartCrawlerTool | Extract data from multiple pages with crawling | URL + prompt + crawl options | JSON |\n",
"| MarkdownifyTool | Convert webpages to markdown | URL | Markdown text |\n",
"| AgenticScraperTool | Extract specifying steps | URL | Markdown text |\n",
"| GetCreditsTool | Check API credits | None | Credit info |\n",
"\n",
"\n",
@@ -53,7 +51,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 3,
"id": "f85b4089",
"metadata": {},
"outputs": [
@@ -81,7 +79,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"id": "e0b178a2",
"metadata": {},
"outputs": [],
@@ -287,7 +285,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 7,
"id": "f90e33a7",
"metadata": {},
"outputs": [
@@ -331,7 +329,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 5,
"id": "af3123ad",
"metadata": {},
"outputs": [
@@ -355,7 +353,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 8,
"id": "fdbf35b5",
"metadata": {},
"outputs": [

File diff suppressed because one or more lines are too long
@@ -1,339 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a6f91f20",
"metadata": {},
"source": [
"# Scrapeless\n",
"\n",
"**Scrapeless** offers flexible and feature-rich data acquisition services with extensive parameter customization and multi-format export support. These capabilities empower LangChain to integrate and leverage external data more effectively. The core functional modules include:\n",
"\n",
"**DeepSerp**\n",
"- **Google Search**: Enables comprehensive extraction of Google SERP data across all result types.\n",
"  - Supports selection of localized Google domains (e.g., `google.com`, `google.ad`) to retrieve region-specific search results.\n",
"  - Pagination supported for retrieving results beyond the first page.\n",
"  - Supports a search result filtering toggle to control whether to exclude duplicate or similar content.\n",
"- **Google Trends**: Retrieves keyword trend data from Google, including popularity over time, regional interest, and related searches.\n",
"  - Supports multi-keyword comparison.\n",
"  - Supports multiple data types: `interest_over_time`, `interest_by_region`, `related_queries`, and `related_topics`.\n",
"  - Allows filtering by specific Google properties (Web, YouTube, News, Shopping) for source-specific trend analysis.\n",
"\n",
"**Universal Scraping**\n",
"- Designed for modern, JavaScript-heavy websites, allowing dynamic content extraction.\n",
"  - Global premium proxy support for bypassing geo-restrictions and improving reliability.\n",
"\n",
"**Crawler**\n",
"- **Crawl**: Recursively crawl a website and its linked pages to extract site-wide content.\n",
"  - Supports configurable crawl depth and scoped URL targeting.\n",
"- **Scrape**: Extract content from a single webpage with high precision.\n",
"  - Supports \"main content only\" extraction to exclude ads, footers, and other non-essential elements.\n",
"  - Allows batch scraping of multiple standalone URLs.\n",
"\n",
"## Overview\n",
"\n",
"### Integration details\n",
"\n",
"| Class | Package | Serializable | JS support | Package latest |\n",
"| :--- | :--- | :---: | :---: | :---: |\n",
"| [ScrapelessUniversalScrapingTool](https://pypi.org/project/langchain-scrapeless/) | [langchain-scrapeless](https://pypi.org/project/langchain-scrapeless/) | ✅ | ❌ |  |\n",
"\n",
"### Tool features\n",
"\n",
"|Native async|Returns artifact|Return data|\n",
"|:-:|:-:|:-:|\n",
"|✅|✅|html, markdown, links, metadata, structured content|\n",
"\n",
"\n",
"## Setup\n",
"\n",
"The integration lives in the `langchain-scrapeless` package."
]
},
{
"cell_type": "raw",
"id": "ca676665",
"metadata": {
"vscode": {
"languageId": "raw"
}
},
"source": [
"!pip install langchain-scrapeless"
]
},
{
"cell_type": "markdown",
"id": "b15e9266",
"metadata": {},
"source": [
"### Credentials\n",
"\n",
"You'll need a Scrapeless API key to use this tool. You can set it as an environment variable:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e0b178a2-8816-40ca-b57c-ccdd86dde9c9",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"SCRAPELESS_API_KEY\"] = \"your-api-key\""
]
},
{
"cell_type": "markdown",
"id": "1c97218f-f366-479d-8bf7-fe9f2f6df73f",
"metadata": {},
"source": [
"## Instantiation\n",
"\n",
"Here we show how to instantiate the Scrapeless Universal Scraping Tool. This tool allows you to scrape any website using a headless browser with JavaScript rendering capabilities, customizable output types, and geo-specific proxy support.\n",
"\n",
"The tool accepts the following parameters during instantiation:\n",
"- `url` (required, str): The URL of the website to scrape.\n",
"- `headless` (optional, bool): Whether to use a headless browser. Default is True.\n",
"- `js_render` (optional, bool): Whether to enable JavaScript rendering. Default is True.\n",
"- `js_wait_until` (optional, str): Defines when to consider the JavaScript-rendered page ready. Default is `'domcontentloaded'`. Options include:\n",
"  - `load`: Wait until the page is fully loaded.\n",
"  - `domcontentloaded`: Wait until the DOM is fully loaded.\n",
"  - `networkidle0`: Wait until the network is idle.\n",
"  - `networkidle2`: Wait until the network is idle for 2 seconds.\n",
"- `outputs` (optional, str): The specific type of data to extract from the page. Options include:\n",
"  - `phone_numbers`\n",
"  - `headings`\n",
"  - `images`\n",
"  - `audios`\n",
"  - `videos`\n",
"  - `links`\n",
"  - `menus`\n",
"  - `hashtags`\n",
"  - `emails`\n",
"  - `metadata`\n",
"  - `tables`\n",
"  - `favicon`\n",
"- `response_type` (optional, str): Defines the format of the response. Default is `'html'`. Options include:\n",
"  - `html`: Return the raw HTML of the page.\n",
"  - `plaintext`: Return the plain text content.\n",
"  - `markdown`: Return a Markdown version of the page.\n",
"  - `png`: Return a PNG screenshot.\n",
"  - `jpeg`: Return a JPEG screenshot.\n",
"- `response_image_full_page` (optional, bool): Whether to capture and return a full-page image when using screenshot output (png or jpeg). Default is False.\n",
"- `selector` (optional, str): A specific CSS selector to scope scraping within a part of the page. Default is `None`.\n",
"- `proxy_country` (optional, str): Two-letter country code for geo-specific proxy access (e.g., `'us'`, `'gb'`, `'de'`, `'jp'`). Default is `'ANY'`."
]
},
{
"cell_type": "markdown",
"id": "74147a1a",
"metadata": {},
"source": [
"## Invocation\n",
"\n",
"### Basic Usage"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "65310a8b-eb0c-4d9e-a618-4f4abe2414fc",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<!DOCTYPE html><html><head>\n",
"    <title>Example Domain</title>\n",
"\n",
"    <meta charset=\"utf-8\">\n",
"    <meta http-equiv=\"Content-type\" content=\"text/html; charset=utf-8\">\n",
"    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1\">\n",
"    <style type=\"text/css\">\n",
"    body {\n",
"        background-color: #f0f0f2;\n",
"        margin: 0;\n",
"        padding: 0;\n",
"        font-family: -apple-system, system-ui, BlinkMacSystemFont, \"Segoe UI\", \"Open Sans\", \"Helvetica Neue\", Helvetica, Arial, sans-serif;\n",
"        \n",
"    }\n",
"    div {\n",
"        width: 600px;\n",
"        margin: 5em auto;\n",
"        padding: 2em;\n",
"        background-color: #fdfdff;\n",
"        border-radius: 0.5em;\n",
"        box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);\n",
"    }\n",
"    a:link, a:visited {\n",
"        color: #38488f;\n",
"        text-decoration: none;\n",
"    }\n",
"    @media (max-width: 700px) {\n",
"        div {\n",
"            margin: 0 auto;\n",
"            width: auto;\n",
"        }\n",
"    }\n",
"    </style>\n",
"</head>\n",
"\n",
"<body>\n",
"<div>\n",
"    <h1>Example Domain</h1>\n",
"    <p>This domain is for use in illustrative examples in documents. You may use this\n",
"    domain in literature without prior coordination or asking for permission.</p>\n",
"    <p><a href=\"https://www.iana.org/domains/example\">More information...</a></p>\n",
"</div>\n",
"\n",
"\n",
"</body></html>\n"
]
}
],
"source": [
"from langchain_scrapeless import ScrapelessUniversalScrapingTool\n",
"\n",
"tool = ScrapelessUniversalScrapingTool()\n",
"\n",
"# Basic usage\n",
"result = tool.invoke(\"https://example.com\")\n",
"print(result)"
]
},
{
"cell_type": "markdown",
"id": "d6e73897",
"metadata": {},
"source": [
"### Advanced Usage with Parameters"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f90e33a7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"# Well hello there.\n",
"\n",
"Welcome to exmaple.com.\n",
"Chances are you got here by mistake (example.com, anyone?)\n"
]
}
],
"source": [
"from langchain_scrapeless import ScrapelessUniversalScrapingTool\n",
"\n",
"tool = ScrapelessUniversalScrapingTool()\n",
"\n",
"result = tool.invoke({\"url\": \"https://exmaple.com\", \"response_type\": \"markdown\"})\n",
"print(result)"
]
},
{
"cell_type": "markdown",
"id": "659f9fbd-6fcf-445f-aa8c-72d8e60154bd",
"metadata": {},
"source": [
"### Use within an agent"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "af3123ad-7a02-40e5-b58e-7d56e23e5830",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"================================\u001b[1m Human Message \u001b[0m=================================\n",
"\n",
"Use the scrapeless scraping tool to fetch https://www.scrapeless.com/en and extract the h1 tag.\n",
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
"Tool Calls:\n",
"  scrapeless_universal_scraping (call_jBrvMVL2ixhvf6gklhi7Gqtb)\n",
" Call ID: call_jBrvMVL2ixhvf6gklhi7Gqtb\n",
"  Args:\n",
"    url: https://www.scrapeless.com/en\n",
"    outputs: headings\n",
"=================================\u001b[1m Tool Message \u001b[0m=================================\n",
"Name: scrapeless_universal_scraping\n",
"\n",
"{\"headings\":[\"Effortless Web Scraping Toolkitfor Business and Developers\",\"4.8\",\"4.5\",\"8.5\",\"A Flexible Toolkit for Accessing Public Web Data\",\"Deep SerpApi\",\"Scraping Browser\",\"Universal Scraping API\",\"Customized Services\",\"From Simple Data Scraping to Complex Anti-Bot Challenges, Scrapeless Has You Covered.\",\"Fully Compatible with Key Programming Languages and Tools\",\"Enterprise-level Data Scraping Solution\",\"Customized Data Scraping Solutions\",\"High Concurrency and High-Performance Scraping\",\"Data Cleaning and Transformation\",\"Real-Time Data Push and API Integration\",\"Data Security and Privacy Protection\",\"Enterprise-level SLA\",\"Why Scrapeless: Simplify Your Data Flow Effortlessly.\",\"Articles\",\"Organized Fresh Data\",\"Prices\",\"No need to hassle with browser maintenance\",\"Reviews\",\"Only pay for successful requests\",\"Products\",\"Fully scalable\",\"Unleash Your Competitive Edgein Data within the Industry\",\"Regulate Compliance for All Users\",\"Web Scraping Blog\",\"Scrapeless MCP Server Is Officially Live! Build Your Ultimate AI-Web Connector\",\"Product Updates | New Profile Feature\",\"How to Track Your Ranking on ChatGPT?\",\"For Scraping\",\"For Data\",\"For AI\",\"Top Scraper API\",\"Learning Center\",\"Legal\"]}\n",
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
"\n",
"The h1 tag extracted from the website https://www.scrapeless.com/en is \"Effortless Web Scraping Toolkit for Business and Developers\".\n"
]
}
],
"source": [
"from langchain_openai import ChatOpenAI\n",
"from langchain_scrapeless import ScrapelessUniversalScrapingTool\n",
"from langgraph.prebuilt import create_react_agent\n",
"\n",
"llm = ChatOpenAI()\n",
"\n",
"tool = ScrapelessUniversalScrapingTool()\n",
"\n",
"# Use the tool with an agent\n",
"tools = [tool]\n",
"agent = create_react_agent(llm, tools)\n",
"\n",
"for chunk in agent.stream(\n",
"    {\n",
"        \"messages\": [\n",
"            (\n",
"                \"human\",\n",
"                \"Use the scrapeless scraping tool to fetch https://www.scrapeless.com/en and extract the h1 tag.\",\n",
"            )\n",
"        ]\n",
"    },\n",
"    stream_mode=\"values\",\n",
"):\n",
"    chunk[\"messages\"][-1].pretty_print()"
]
},
{
"cell_type": "markdown",
"id": "4ac8146c",
"metadata": {},
"source": [
"## API reference\n",
"\n",
"- [Scrapeless Documentation](https://docs.scrapeless.com/en/universal-scraping-api/quickstart/introduction/)\n",
"- [Scrapeless API Reference](https://apidocs.scrapeless.com/api-12948840)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "langchain",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -153,7 +153,7 @@
"from langgraph.prebuilt import create_react_agent\n",
"\n",
"llm = ChatAnthropic(\n",
"    model=\"claude-3-5-sonnet-latest\",\n",
"    model=\"claude-3-5-sonnet-20240620\",\n",
")\n",
"\n",
"langgraph_agent_executor = create_react_agent(llm, stripe_agent_toolkit.get_tools())\n",

File diff suppressed because one or more lines are too long
@@ -73,9 +73,8 @@
]
},
{
"cell_type": "markdown",
"id": "72461be913bfaf2b",
"metadata": {},
"cell_type": "markdown",
"source": [
"## Instantiation\n",
"\n",
@@ -84,26 +83,26 @@
"Instantiation\n",
"The tool accepts various parameters during instantiation:\n",
"\n",
"- `max_results` (optional, int): Maximum number of search results to return. Default is 5.\n",
"- `topic` (optional, str): Category of the search. Can be `'general'`, `'news'`, or `'finance'`. Default is `'general'`.\n",
"- `include_answer` (optional, bool): Include an answer to original query in results. Default is False.\n",
"- `include_raw_content` (optional, bool): Include cleaned and parsed HTML of each search result. Default is False.\n",
"- `include_images` (optional, bool): Include a list of query related images in the response. Default is False.\n",
"- `include_image_descriptions` (optional, bool): Include descriptive text for each image. Default is False.\n",
"- `search_depth` (optional, str): Depth of the search, either `'basic'` or `'advanced'`. Default is `'basic'`.\n",
"- `time_range` (optional, str): The time range back from the current date to filter results - `'day'`, `'week'`, `'month'`, or `'year'`. Default is None.\n",
"- `include_domains` (optional, List[str]): List of domains to specifically include. Default is None.\n",
"- `exclude_domains` (optional, List[str]): List of domains to specifically exclude. Default is None.\n",
"- max_results (optional, int): Maximum number of search results to return. Default is 5.\n",
"- topic (optional, str): Category of the search. Can be \"general\", \"news\", or \"finance\". Default is \"general\".\n",
"- include_answer (optional, bool): Include an answer to original query in results. Default is False.\n",
"- include_raw_content (optional, bool): Include cleaned and parsed HTML of each search result. Default is False.\n",
"- include_images (optional, bool): Include a list of query related images in the response. Default is False.\n",
"- include_image_descriptions (optional, bool): Include descriptive text for each image. Default is False.\n",
"- search_depth (optional, str): Depth of the search, either \"basic\" or \"advanced\". Default is \"basic\".\n",
"- time_range (optional, str): The time range back from the current date to filter results - \"day\", \"week\", \"month\", or \"year\". Default is None.\n",
"- include_domains (optional, List[str]): List of domains to specifically include. Default is None.\n",
"- exclude_domains (optional, List[str]): List of domains to specifically exclude. Default is None.\n",
"\n",
"For a comprehensive overview of the available parameters, refer to the [Tavily Search API documentation](https://docs.tavily.com/documentation/api-reference/endpoint/search)"
]
],
"id": "72461be913bfaf2b"
},
{
"cell_type": "code",
"execution_count": null,
"id": "dc382e5426394836",
"metadata": {},
"cell_type": "code",
"outputs": [],
"execution_count": null,
"source": [
"from langchain_tavily import TavilySearch\n",
"\n",
@@ -119,12 +118,12 @@
"    # include_domains=None,\n",
"    # exclude_domains=None\n",
")"
]
],
"id": "dc382e5426394836"
},
{
"cell_type": "markdown",
"id": "f997d2733b63f655",
"metadata": {},
"cell_type": "markdown",
"source": [
"## Invocation\n",
"\n",
@@ -135,22 +134,18 @@
"- The following arguments can also be set during invocation: `search_depth`, `time_range`, `include_domains`, `exclude_domains`, `include_images`\n",
"- For reliability and performance reasons, certain parameters that affect response size cannot be modified during invocation: `include_answer` and `include_raw_content`. These limitations prevent unexpected context window issues and ensure consistent results.\n",
"\n",
":::note\n",
"\n",
"The optional arguments are available for agents to set dynamically: if you set an argument during instantiation and then invoke the tool with a different value, the tool will use the value you passed during invocation.\n",
"\n",
":::"
]
"NOTE: The optional arguments are available for agents to dynamically set, if you set an argument during instantiation and then invoke the tool with a different value, the tool will use the value you passed during invocation."
],
"id": "f997d2733b63f655"
},
{
"cell_type": "code",
"execution_count": null,
"id": "5e75399230ab9fc1",
"metadata": {},
"cell_type": "code",
"outputs": [],
"source": [
"tool.invoke({\"query\": \"What happened at the last wimbledon\"})"
]
"execution_count": null,
"source": "tool.invoke({\"query\": \"What happened at the last wimbledon\"})",
"id": "5e75399230ab9fc1"
},
{
"cell_type": "markdown",
@@ -159,7 +154,7 @@
"source": [
"### [Invoke with ToolCall](/docs/concepts/tools)\n",
"\n",
"We can also invoke the tool with a model-generated `ToolCall`, in which case a `ToolMessage` will be returned:"
"We can also invoke the tool with a model-generated ToolCall, in which case a ToolMessage will be returned:"
]
},
{
@@ -238,7 +233,7 @@
"id": "1020a506-473b-4e6a-a563-7aaf92c4d183",
"metadata": {},
"source": [
"We will need to install `langgraph`:"
"We will need to install langgraph:"
]
},
{
@@ -261,21 +256,21 @@
"name": "stdout",
"output_type": "stream",
"text": [
"================================\u001b[1m Human Message \u001b[0m=================================\n",
"================================\u001B[1m Human Message \u001B[0m=================================\n",
"\n",
"What nation hosted the Euro 2024? Include only wikipedia sources.\n",
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
"==================================\u001B[1m Ai Message \u001B[0m==================================\n",
"Tool Calls:\n",
"  tavily_search (call_yxmR4K2uadsQ8LKoyi8JyoLD)\n",
" Call ID: call_yxmR4K2uadsQ8LKoyi8JyoLD\n",
"  Args:\n",
"    query: Euro 2024 host nation\n",
"    include_domains: ['wikipedia.org']\n",
"=================================\u001b[1m Tool Message \u001b[0m=================================\n",
"=================================\u001B[1m Tool Message \u001B[0m=================================\n",
"Name: tavily_search\n",
"\n",
"{\"query\": \"Euro 2024 host nation\", \"follow_up_questions\": null, \"answer\": null, \"images\": [], \"results\": [{\"title\": \"UEFA Euro 2024 - Wikipedia\", \"url\": \"https://en.wikipedia.org/wiki/UEFA_Euro_2024\", \"content\": \"Tournament details Host country Germany Dates 14 June – 14 July Teams 24 Venue(s) 10 (in 10 host cities) Final positions Champions Spain (4th title) Runners-up England Tournament statistics Matches played 51 Goals scored 117 (2.29 per match) Attendance 2,681,288 (52,574 per match) Top scorer(s) Harry Kane Georges Mikautadze Jamal Musiala Cody Gakpo Ivan Schranz Dani Olmo (3 goals each) Best player(s) Rodri Best young player Lamine Yamal ← 2020 2028 → The 2024 UEFA European Football Championship, commonly referred to as UEFA Euro 2024 (stylised as UEFA EURO 2024) or simply Euro 2024, was the 17th UEFA European Championship, the quadrennial international football championship organised by UEFA for the European men's national teams of their member associations. Germany hosted the tournament, which took place from 14 June to 14 July 2024. The tournament involved 24 teams, with Georgia making their European Championship debut. [4] Host nation Germany were eliminated by Spain in the quarter-finals; Spain went on to win the tournament for a record fourth time after defeating England 2–1 in the final.\", \"score\": 0.9104262, \"raw_content\": null}, {\"title\": \"UEFA Euro 2024 - Simple English Wikipedia, the free encyclopedia\", \"url\": \"https://simple.wikipedia.org/wiki/UEFA_Euro_2024\", \"content\": \"The 2024 UEFA European Football Championship, also known as UEFA Euro 2024 or simply Euro 2024, was the 17th edition of the UEFA European Championship. Germany was hosting the tournament. ... The UEFA Executive Committee voted for the host in a secret ballot, with only a simple majority (more than half of the valid votes) required to determine\", \"score\": 0.81418616, \"raw_content\": null}, {\"title\": \"Championnat d'Europe de football 2024 — Wikipédia\", \"url\": \"https://fr.wikipedia.org/wiki/Championnat_d'Europe_de_football_2024\", \"content\": \"Le Championnat d'Europe de l'UEFA de football 2024 est la 17 e édition du Championnat d'Europe de football, communément abrégé en Euro 2024, compétition organisée par l'UEFA et rassemblant les meilleures équipes nationales masculines européennes. L'Allemagne est désignée pays organisateur de la compétition le 27 septembre 2018. C'est la troisième fois que des matches du Championnat\", \"score\": 0.8055255, \"raw_content\": null}, {\"title\": \"UEFA Euro 2024 bids - Wikipedia\", \"url\": \"https://en.wikipedia.org/wiki/UEFA_Euro_2024_bids\", \"content\": \"The bidding process of UEFA Euro 2024 ended on 27 September 2018 in Nyon, Switzerland, when Germany was announced to be the host. [1] Two bids came before the deadline, 3 March 2017, which were Germany and Turkey as single bids. ... Press agencies revealed on 24 October 2013, that the European football governing body UEFA would have decided on\", \"score\": 0.7882741, \"raw_content\": null}, {\"title\": \"2024 UEFA European Under-19 Championship - Wikipedia\", \"url\": \"https://en.wikipedia.org/wiki/2024_UEFA_European_Under-19_Championship\", \"content\": \"The 2024 UEFA European Under-19 Championship (also known as UEFA Under-19 Euro 2024) was the 21st edition of the UEFA European Under-19 Championship (71st edition if the Under-18 and Junior eras are included), the annual international youth football championship organised by UEFA for the men's under-19 national teams of Europe. Northern Ireland hosted the tournament from 15 to 28 July 2024.\", \"score\": 0.7783298, \"raw_content\": null}], \"response_time\": 1.67}\n",
"==================================\u001b[1m Ai Message \u001b[0m==================================\n",
"==================================\u001B[1m Ai Message \u001B[0m==================================\n",
"\n",
"The nation that hosted Euro 2024 was Germany. You can find more information on the [Wikipedia page for UEFA Euro 2024](https://en.wikipedia.org/wiki/UEFA_Euro_2024).\n"
]
@@ -309,14 +304,8 @@
"source": [
"## API reference\n",
"\n",
"For detailed documentation of all Tavily Search API features and configurations head to the [API reference](https://docs.tavily.com/documentation/api-reference/endpoint/search)."
"For detailed documentation of all Tavily Search API features and configurations head to the API reference: https://docs.tavily.com/documentation/api-reference/endpoint/search"
]
},
{
"cell_type": "markdown",
"id": "589ff839",
"metadata": {},
"source": []
}
],
"metadata": {

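The Tavily hunks above describe invoking the tool with a model-generated `ToolCall` and overriding instantiation arguments at invocation time. A minimal sketch, assuming `langchain-tavily` is installed and `TAVILY_API_KEY` is set; the call id below is a made-up placeholder:

```python
# Sketch: invoking TavilySearch with a ToolCall-shaped dict, which returns a
# ToolMessage rather than a raw result dict.
from langchain_tavily import TavilySearch

tool = TavilySearch(max_results=5, topic="general")

model_generated_tool_call = {
    "name": "tavily_search",
    "args": {"query": "Euro 2024 host nation", "include_domains": ["wikipedia.org"]},
    "id": "call_hypothetical_123",  # placeholder id for illustration
    "type": "tool_call",
}
tool_msg = tool.invoke(model_generated_tool_call)  # -> ToolMessage
print(tool_msg.content)
```

Note how `include_domains` is passed per-invocation here, overriding whatever was set at instantiation, as the notebook's note describes.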
@@ -1,378 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "554b9f85",
"metadata": {},
"source": [
"# MCP Toolbox for Databases\n",
"\n",
"Integrate your databases with LangChain agents using MCP Toolbox.\n",
"\n",
"## Overview\n",
"\n",
"[MCP Toolbox for Databases](https://github.com/googleapis/genai-toolbox) is an open source MCP server for databases, designed with enterprise-grade, production-quality deployments in mind. It enables you to develop tools more easily, quickly, and securely by handling complexities such as connection pooling, authentication, and more.\n",
"\n",
"Toolbox Tools can be seamlessly integrated with LangChain applications. For more\n",
"information on [getting\n",
"started](https://googleapis.github.io/genai-toolbox/getting-started/local_quickstart/) or\n",
"[configuring](https://googleapis.github.io/genai-toolbox/getting-started/configure/)\n",
"MCP Toolbox, see the\n",
"[documentation](https://googleapis.github.io/genai-toolbox/getting-started/introduction/).\n",
"\n",
""
]
},
{
"cell_type": "markdown",
"id": "788ff64c",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"This guide assumes you have already done the following:\n",
"\n",
"1. Installed [Python 3.9+](https://wiki.python.org/moin/BeginnersGuide/Download) and [pip](https://pip.pypa.io/en/stable/installation/).\n",
"2. Installed [PostgreSQL 16+ and the `psql` command-line client](https://www.postgresql.org/download/)."
]
},
{
"cell_type": "markdown",
"id": "4847d196",
"metadata": {},
"source": [
"### 1. Set up your Database\n",
"\n",
"First, let's set up a PostgreSQL database. We'll create a new database, a dedicated user for MCP Toolbox, and a `hotels` table with some sample data.\n",
"\n",
"Connect to PostgreSQL using the `psql` command. You may need to adjust the command based on your PostgreSQL setup (e.g., if you need to specify a host or a different superuser).\n",
"\n",
"```bash\n",
"psql -U postgres\n",
"```\n",
"\n",
"Now, run the following SQL commands to create the user, database, and grant the necessary permissions:\n",
"\n",
"```sql\n",
"CREATE USER toolbox_user WITH PASSWORD 'my-password';\n",
"CREATE DATABASE toolbox_db;\n",
"GRANT ALL PRIVILEGES ON DATABASE toolbox_db TO toolbox_user;\n",
"ALTER DATABASE toolbox_db OWNER TO toolbox_user;\n",
"```\n",
"\n",
"Connect to your newly created database with the new user:\n",
"\n",
"```sql\n",
"\\c toolbox_db toolbox_user\n",
"```\n",
"\n",
"Finally, create the `hotels` table and insert some data:\n",
"\n",
"```sql\n",
"CREATE TABLE hotels(\n",
"   id INTEGER NOT NULL PRIMARY KEY,\n",
"   name VARCHAR NOT NULL,\n",
"   location VARCHAR NOT NULL,\n",
"   price_tier VARCHAR NOT NULL,\n",
"   booked BIT NOT NULL\n",
");\n",
"\n",
"INSERT INTO hotels(id, name, location, price_tier, booked)\n",
"VALUES \n",
"  (1, 'Hilton Basel', 'Basel', 'Luxury', B'0'),\n",
"  (2, 'Marriott Zurich', 'Zurich', 'Upscale', B'0'),\n",
"  (3, 'Hyatt Regency Basel', 'Basel', 'Upper Upscale', B'0');\n",
"```\n",
"You can now exit `psql` by typing `\\q`."
]
},
{
"cell_type": "markdown",
"id": "855133f8",
"metadata": {},
"source": [
"### 2. Install MCP Toolbox\n",
"\n",
"Next, we will install MCP Toolbox, define our tools in a `tools.yaml` configuration file, and run the MCP Toolbox server.\n",
"\n",
"For **macOS** users, the easiest way to install is with [Homebrew](https://formulae.brew.sh/formula/mcp-toolbox):\n",
"\n",
"```bash\n",
"brew install mcp-toolbox\n",
"```\n",
"\n",
"For other platforms, [download the latest MCP Toolbox binary for your operating system and architecture.](https://github.com/googleapis/genai-toolbox/releases)\n",
"\n",
"Create a `tools.yaml` file. This file defines the data sources MCP Toolbox can connect to and the tools it can expose to your agent. For production use, always use environment variables for secrets.\n",
"\n",
"```yaml\n",
"sources:\n",
"  my-pg-source:\n",
"    kind: postgres\n",
"    host: 127.0.0.1\n",
"    port: 5432\n",
"    database: toolbox_db\n",
"    user: toolbox_user\n",
"    password: my-password\n",
"\n",
"tools:\n",
"  search-hotels-by-location:\n",
"    kind: postgres-sql\n",
"    source: my-pg-source\n",
"    description: Search for hotels based on location.\n",
"    parameters:\n",
"      - name: location\n",
"        type: string\n",
"        description: The location of the hotel.\n",
"    statement: SELECT id, name, location, price_tier FROM hotels WHERE location ILIKE '%' || $1 || '%';\n",
"  book-hotel:\n",
"    kind: postgres-sql\n",
"    source: my-pg-source\n",
"    description: >-\n",
"      Book a hotel by its ID. If the hotel is successfully booked, returns a confirmation message.\n",
"    parameters:\n",
"      - name: hotel_id\n",
"        type: integer\n",
"        description: The ID of the hotel to book.\n",
"    statement: UPDATE hotels SET booked = B'1' WHERE id = $1;\n",
"\n",
"toolsets:\n",
"  hotel_toolset:\n",
"    - search-hotels-by-location\n",
"    - book-hotel\n",
"```\n",
"\n",
"Now, in a separate terminal window, start the MCP Toolbox server. If you installed via Homebrew, you can just run `toolbox`. If you downloaded the binary manually, you'll need to run `./toolbox` from the directory where you saved it:\n",
"\n",
"```bash\n",
"toolbox --tools-file \"tools.yaml\"\n",
"```\n",
"\n",
"MCP Toolbox will start on `http://127.0.0.1:5000` by default and will hot-reload if you make changes to your `tools.yaml` file."
]
},
{
"cell_type": "markdown",
"id": "b9b2f041",
"metadata": {},
"source": [
"## Instantiation"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4c31f3b",
"metadata": {},
"outputs": [],
"source": [
"!pip install toolbox-langchain"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "14a68a49",
"metadata": {},
"outputs": [],
"source": [
"from toolbox_langchain import ToolboxClient\n",
"\n",
"with ToolboxClient(\"http://127.0.0.1:5000\") as client:\n",
"    search_tool = await client.aload_tool(\"search-hotels-by-location\")"
]
},
{
"cell_type": "markdown",
"id": "95eec50c",
"metadata": {},
"source": [
"## Invocation\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8e99351b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[{\"id\":1,\"location\":\"Basel\",\"name\":\"Hilton Basel\",\"price_tier\":\"Luxury\"},{\"id\":3,\"location\":\"Basel\",\"name\":\"Hyatt Regency Basel\",\"price_tier\":\"Upper Upscale\"}]\n"
]
}
],
"source": [
"from toolbox_langchain import ToolboxClient\n",
"\n",
"with ToolboxClient(\"http://127.0.0.1:5000\") as client:\n",
"    search_tool = await client.aload_tool(\"search-hotels-by-location\")\n",
"    results = search_tool.invoke({\"location\": \"Basel\"})\n",
"    print(results)"
]
},
{
"cell_type": "markdown",
"id": "9e8dbd39",
"metadata": {},
"source": [
"## Use within an agent\n",
"\n",
"Now for the fun part! We'll install the required LangChain packages and create an agent that can use the tools we defined in MCP Toolbox."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9b716a84",
"metadata": {
"id": "install-packages"
},
"outputs": [],
"source": [
"%pip install -U --quiet toolbox-langchain langgraph langchain-google-vertexai"
]
},
{
"cell_type": "markdown",
"id": "affda34b",
"metadata": {},
"source": [
"With the packages installed, we can define our agent. We will use `ChatVertexAI` for the model and `ToolboxClient` to load our tools. The `create_react_agent` from `langgraph.prebuilt` creates a robust agent that can reason about which tools to call.\n",
"\n",
"**Note:** Ensure your MCP Toolbox server is running in a separate terminal before executing the code below."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ddd82892",
"metadata": {},
"outputs": [],
"source": [
"from langgraph.prebuilt import create_react_agent\n",
"from langchain_google_vertexai import ChatVertexAI\n",
"from langgraph.checkpoint.memory import MemorySaver\n",
"from toolbox_langchain import ToolboxClient\n",
"\n",
"prompt = \"\"\"\n",
"You're a helpful hotel assistant. You handle hotel searching and booking.\n",
"When the user searches for a hotel, list the full details for each hotel found: id, name, location, and price tier.\n",
"Always use the hotel ID for booking operations.\n",
"For any bookings, provide a clear confirmation message.\n",
"Don't ask for clarification or confirmation from the user; perform the requested action directly.\n",
"\"\"\"\n",
"\n",
"\n",
"async def run_queries(agent_executor):\n",
"    config = {\"configurable\": {\"thread_id\": \"hotel-thread-1\"}}\n",
"\n",
"    # --- Query 1: Search for hotels ---\n",
"    query1 = \"I need to find a hotel in Basel.\"\n",
"    print(f'\\n--- USER: \"{query1}\" ---')\n",
"    inputs1 = {\"messages\": [(\"user\", prompt + query1)]}\n",
"    async for event in agent_executor.astream_events(\n",
"        inputs1, config=config, version=\"v2\"\n",
"    ):\n",
"        if event[\"event\"] == \"on_chat_model_end\" and event[\"data\"][\"output\"].content:\n",
"            print(f\"--- AGENT: ---\\n{event['data']['output'].content}\")\n",
"\n",
"    # --- Query 2: Book a hotel ---\n",
"    query2 = \"Great, please book the Hyatt Regency Basel for me.\"\n",
"    print(f'\\n--- USER: \"{query2}\" ---')\n",
"    inputs2 = {\"messages\": [(\"user\", query2)]}\n",
"    async for event in agent_executor.astream_events(\n",
"        inputs2, config=config, version=\"v2\"\n",
"    ):\n",
"        if event[\"event\"] == \"on_chat_model_end\" and event[\"data\"][\"output\"].content:\n",
"            print(f\"--- AGENT: ---\\n{event['data']['output'].content}\")"
]
},
{
"cell_type": "markdown",
"id": "54552733",
"metadata": {},
"source": [
"## Run the agent"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9f7c199b",
"metadata": {},
"outputs": [],
"source": [
"async def main():\n",
"    await run_hotel_agent()\n",
"\n",
"\n",
"async def run_hotel_agent():\n",
"    model = ChatVertexAI(model_name=\"gemini-2.5-flash\")\n",
"\n",
"    # Load the tools from the running MCP Toolbox server\n",
"    async with ToolboxClient(\"http://127.0.0.1:5000\") as client:\n",
"        tools = await client.aload_toolset(\"hotel_toolset\")\n",
"\n",
"        agent = create_react_agent(model, tools, checkpointer=MemorySaver())\n",
"\n",
"        await run_queries(agent)\n",
"\n",
"\n",
"await main()"
]
},
{
"cell_type": "markdown",
"id": "79bce43d",
"metadata": {},
"source": [
"You've successfully connected a LangChain agent to a local database using MCP Toolbox! 🥳\n",
"\n",
"## API reference\n",
"\n",
"The primary class for this integration is `ToolboxClient`.\n",
"\n",
"For more information, see the following resources:\n",
"- [Toolbox Official Documentation](https://googleapis.github.io/genai-toolbox/)\n",
"- [Toolbox GitHub Repository](https://github.com/googleapis/genai-toolbox)\n",
"- [Toolbox LangChain SDK](https://github.com/googleapis/mcp-toolbox-python-sdk/tree/main/packages/toolbox-langchain)\n",
"\n",
"MCP Toolbox has a variety of features to make developing Gen AI tools for databases seamless:\n",
"- [Authenticated Parameters](https://googleapis.github.io/genai-toolbox/resources/tools/#authenticated-parameters): Bind tool inputs to values from OIDC tokens automatically, making it easy to run sensitive queries without potentially leaking data\n",
"- [Authorized Invocations](https://googleapis.github.io/genai-toolbox/resources/tools/#authorized-invocations): Restrict access to use a tool based on the user's Auth token\n",
"- [OpenTelemetry](https://googleapis.github.io/genai-toolbox/how-to/export_telemetry/): Get metrics and tracing from MCP Toolbox with [OpenTelemetry](https://opentelemetry.io/docs/)\n",
"\n",
"# Community and Support\n",
"\n",
"We encourage you to get involved with the community:\n",
"- ⭐️ Head over to the [GitHub repository](https://github.com/googleapis/genai-toolbox) to get started and follow along with updates.\n",
"- 📚 Dive into the [official documentation](https://googleapis.github.io/genai-toolbox/getting-started/introduction/) for more advanced features and configurations.\n",
"- 💬 Join our [Discord server](https://discord.com/invite/a4XjGqtmnG) to connect with the community and ask questions."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

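The removed notebook above demonstrates `aload_tool` and `.invoke` against the tools defined in `tools.yaml`. A minimal sketch of the full search-then-book flow using only those calls, assuming the MCP Toolbox server from that config is running on http://127.0.0.1:5000:

```python
# Sketch of the hotel booking flow, restricted to calls the notebook itself
# demonstrates (aload_tool and .invoke on the loaded tools).
from toolbox_langchain import ToolboxClient


async def book_basel_hotel():
    async with ToolboxClient("http://127.0.0.1:5000") as client:
        search_tool = await client.aload_tool("search-hotels-by-location")
        book_tool = await client.aload_tool("book-hotel")

        hotels = search_tool.invoke({"location": "Basel"})  # JSON list of matching hotels
        print(hotels)

        confirmation = book_tool.invoke({"hotel_id": 3})  # book the Hyatt Regency Basel
        print(confirmation)
```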
555
docs/docs/integrations/vectorstores/aerospike.ipynb
Normal file
@@ -0,0 +1,555 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Aerospike\n",
"\n",
"[Aerospike Vector Search](https://aerospike.com/docs/vector) (AVS) is an\n",
"extension to the Aerospike Database that enables searches across very large\n",
"datasets stored in Aerospike. This new service lives outside of Aerospike and\n",
"builds an index to perform those searches.\n",
"\n",
"This notebook showcases the functionality of the [LangChain Aerospike VectorStore\n",
"integration](https://github.com/aerospike/langchain-aerospike).\n",
"\n",
"## Install AVS\n",
"\n",
"Before using this notebook, we need to have a running AVS instance. Use one of\n",
"the [available installation methods](https://aerospike.com/docs/vector/install).\n",
"\n",
"When finished, store your AVS instance's IP address and port to use later\n",
"in this demo:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"AVS_HOST = \"<avs_ip>\"\n",
"AVS_PORT = 5000"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install Dependencies\n",
"The `sentence-transformers` dependency is large. This step could take several minutes to complete."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"vscode": {
"languageId": "shellscript"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m25.0.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.1.1\u001b[0m\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n"
]
}
],
"source": [
"!pip install --upgrade --quiet aerospike-vector-search==4.2.0 langchain-aerospike langchain-community sentence-transformers langchain"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Download Quotes Dataset\n",
"\n",
"We will download a dataset of approximately 100,000 quotes and use a subset of those quotes for semantic search."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2025-05-07 21:06:30--  https://github.com/aerospike/aerospike-vector-search-examples/raw/7dfab0fccca0852a511c6803aba46578729694b5/quote-semantic-search/container-volumes/quote-search/data/quotes.csv.tgz\n",
"Resolving github.com (github.com)... 140.82.116.3\n",
"Connecting to github.com (github.com)|140.82.116.3|:443... connected.\n",
"HTTP request sent, awaiting response... 301 Moved Permanently\n",
"Location: https://github.com/aerospike/aerospike-vector/raw/7dfab0fccca0852a511c6803aba46578729694b5/quote-semantic-search/container-volumes/quote-search/data/quotes.csv.tgz [following]\n",
"--2025-05-07 21:06:30--  https://github.com/aerospike/aerospike-vector/raw/7dfab0fccca0852a511c6803aba46578729694b5/quote-semantic-search/container-volumes/quote-search/data/quotes.csv.tgz\n",
"Reusing existing connection to github.com:443.\n",
"HTTP request sent, awaiting response... 302 Found\n",
"Location: https://raw.githubusercontent.com/aerospike/aerospike-vector/7dfab0fccca0852a511c6803aba46578729694b5/quote-semantic-search/container-volumes/quote-search/data/quotes.csv.tgz [following]\n",
"--2025-05-07 21:06:30--  https://raw.githubusercontent.com/aerospike/aerospike-vector/7dfab0fccca0852a511c6803aba46578729694b5/quote-semantic-search/container-volumes/quote-search/data/quotes.csv.tgz\n",
"Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.111.133, 185.199.108.133, ...\n",
"Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 11597643 (11M) [application/octet-stream]\n",
"Saving to: ‘quotes.csv.tgz’\n",
"\n",
"quotes.csv.tgz      100%[===================>]  11.06M  12.7MB/s    in 0.9s    \n",
"\n",
"2025-05-07 21:06:32 (12.7 MB/s) - ‘quotes.csv.tgz’ saved [11597643/11597643]\n",
"\n"
]
}
],
"source": [
"!wget https://github.com/aerospike/aerospike-vector-search-examples/raw/7dfab0fccca0852a511c6803aba46578729694b5/quote-semantic-search/container-volumes/quote-search/data/quotes.csv.tgz"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load the Quotes Into Documents\n",
"\n",
"We will load our quotes dataset using the `CSVLoader` document loader. In this case, `lazy_load` returns an iterator to ingest our quotes more efficiently. In this example, we only load 5,000 quotes."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"import itertools\n",
"import os\n",
"import tarfile\n",
"\n",
"from langchain_community.document_loaders.csv_loader import CSVLoader\n",
"\n",
"filename = \"./quotes.csv\"\n",
"\n",
"if not os.path.exists(filename) and os.path.exists(filename + \".tgz\"):\n",
"    # Untar the file\n",
"    with tarfile.open(filename + \".tgz\", \"r:gz\") as tar:\n",
"        tar.extractall(path=os.path.dirname(filename))\n",
"\n",
"NUM_QUOTES = 5000\n",
"documents = CSVLoader(filename, metadata_columns=[\"author\", \"category\"]).lazy_load()\n",
"documents = list(\n",
"    itertools.islice(documents, NUM_QUOTES)\n",
")  # Allows us to slice an iterator"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"page_content='quote: I'm selfish, impatient and a little insecure. I make mistakes, I am out of control and at times hard to handle. But if you can't handle me at my worst, then you sure as hell don't deserve me at my best.' metadata={'source': './quotes.csv', 'row': 0, 'author': 'Marilyn Monroe', 'category': 'attributed-no-source, best, life, love, mistakes, out-of-control, truth, worst'}\n"
]
}
],
"source": [
"print(documents[0])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create your Embedder\n",
"\n",
"In this step, we use HuggingFaceEmbeddings and the \"all-MiniLM-L6-v2\" sentence transformer model to embed our documents so we can perform a vector search."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/var/folders/h5/lm2_c1xs3s32kwp11prnpftw0000gp/T/ipykernel_84638/3255399720.py:6: LangChainDeprecationWarning: The class `HuggingFaceEmbeddings` was deprecated in LangChain 0.2.2 and will be removed in 1.0. An updated version of the class exists in the :class:`~langchain-huggingface package and should be used instead. To use it run `pip install -U :class:`~langchain-huggingface` and import as `from :class:`~langchain_huggingface import HuggingFaceEmbeddings``.\n",
"  embedder = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")\n",
"/Users/dwelch/Desktop/everything/projects/langchain/myfork/langchain/.venv/lib/python3.11/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
"  from .autonotebook import tqdm as notebook_tqdm\n"
]
}
],
"source": [
"from aerospike_vector_search.types import VectorDistanceMetric\n",
"from langchain_community.embeddings import HuggingFaceEmbeddings\n",
"\n",
"MODEL_DIM = 384\n",
"MODEL_DISTANCE_CALC = VectorDistanceMetric.COSINE\n",
"embedder = HuggingFaceEmbeddings(model_name=\"all-MiniLM-L6-v2\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create an Aerospike Index and Embed Documents\n",
"\n",
"Before we add documents, we need to create an index in the Aerospike Database. In the example below, we use some convenience code that checks to see if the expected index already exists."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"quote-miniLM-L6-v2 does not exist. Creating index\n"
]
}
],
"source": [
"from aerospike_vector_search import Client, HostPort\n",
"from aerospike_vector_search.types import VectorDistanceMetric\n",
"from langchain_aerospike.vectorstores import Aerospike\n",
"\n",
"# Here we are using the AVS host and port you configured earlier\n",
"seed = HostPort(host=AVS_HOST, port=AVS_PORT)\n",
"\n",
"# The namespace of where to place our vectors. This should match the vector configured in your docstore.conf file.\n",
"NAMESPACE = \"test\"\n",
"\n",
"# The name of our new index.\n",
"INDEX_NAME = \"quote-miniLM-L6-v2\"\n",
"\n",
"# AVS needs to know which metadata key contains our vector when creating the index and inserting documents.\n",
"VECTOR_KEY = \"vector\"\n",
"\n",
"client = Client(seeds=seed)\n",
"index_exists = False\n",
"\n",
"# Check if the index already exists. If not, create it\n",
"for index in client.index_list():\n",
"    if index[\"id\"][\"namespace\"] == NAMESPACE and index[\"id\"][\"name\"] == INDEX_NAME:\n",
"        index_exists = True\n",
"        print(f\"{INDEX_NAME} already exists. Skipping creation\")\n",
"        break\n",
"\n",
"if not index_exists:\n",
"    print(f\"{INDEX_NAME} does not exist. Creating index\")\n",
"    client.index_create(\n",
"        namespace=NAMESPACE,\n",
"        name=INDEX_NAME,\n",
"        vector_field=VECTOR_KEY,\n",
"        vector_distance_metric=MODEL_DISTANCE_CALC,\n",
"        dimensions=MODEL_DIM,\n",
"        index_labels={\n",
"            \"model\": \"miniLM-L6-v2\",\n",
"            \"date\": \"05/04/2024\",\n",
"            \"dim\": str(MODEL_DIM),\n",
"            \"distance\": \"cosine\",\n",
"        },\n",
"    )\n",
"\n",
"docstore = Aerospike.from_documents(\n",
"    documents,\n",
"    embedder,\n",
"    client=client,\n",
"    namespace=NAMESPACE,\n",
"    vector_key=VECTOR_KEY,\n",
"    index_name=INDEX_NAME,\n",
"    distance_strategy=MODEL_DISTANCE_CALC,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Search the Documents\n",
"Now that we have embedded our vectors, we can use vector search on our quotes."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"~~~~ Document 0 ~~~~\n",
"auto-generated id: 4984b472-8a32-4552-b3eb-f03b31b68031\n",
"author: Carl Sagan, Cosmos\n",
"quote: The Cosmos is all that is or was or ever will be. Our feeblest contemplations of the Cosmos stir us -- there is a tingling in the spine, a catch in the voice, a faint sensation, as if a distant memory, of falling from a height. We know we are approaching the greatest of mysteries.\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n",
"~~~~ Document 1 ~~~~\n",
"auto-generated id: 486c8d87-8dd7-450d-9008-d7549e680ffb\n",
"author: Renee Ahdieh, The Rose & the Dagger\n",
"quote: From the stars, to the stars.\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n",
"~~~~ Document 2 ~~~~\n",
"auto-generated id: 4b43b309-ce51-498c-b225-5254383b5b4a\n",
"author: Elizabeth Gilbert\n",
"quote: The love that moves the sun and the other stars.\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n",
"~~~~ Document 3 ~~~~\n",
"auto-generated id: af784a10-f498-4570-bf81-2ffdca35440e\n",
"author: Dante Alighieri, Paradiso\n",
"quote: Love, that moves the sun and the other stars\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n",
"~~~~ Document 4 ~~~~\n",
"auto-generated id: b45d5d5e-d818-4206-ae6b-b1d166ea3d43\n",
"author: Thich Nhat Hanh, Teachings on Love\n",
"quote: Through my love for you, I want to express my love for the whole cosmos, the whole of humanity, and all beings. By living with you, I want to learn to love everyone and all species. If I succeed in loving you, I will be able to love everyone and all species on Earth... This is the real message of love.\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n"
]
}
],
"source": [
"query = \"A quote about the beauty of the cosmos\"\n",
"docs = docstore.similarity_search(\n",
"    query, k=5, index_name=INDEX_NAME, metadata_keys=[\"_id\", \"author\"]\n",
")\n",
"\n",
"\n",
"def print_documents(docs):\n",
"    for i, doc in enumerate(docs):\n",
"        print(\"~~~~ Document\", i, \"~~~~\")\n",
"        print(\"auto-generated id:\", doc.metadata[\"_id\"])\n",
"        print(\"author: \", doc.metadata[\"author\"])\n",
"        print(doc.page_content)\n",
"        print(\"~~~~~~~~~~~~~~~~~~~~\\n\")\n",
"\n",
"\n",
"print_documents(docs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Embedding Additional Quotes as Text\n",
"\n",
"We can use `add_texts` to add additional quotes."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"New IDs\n",
"['adf8064e-9c0e-46e2-b193-169c36432f4c', 'cf65b5ed-a0f4-491a-86ad-dcacc23c2815', '2ef52efd-d9b7-4077-bc14-defdf0b7dd2f']\n"
]
}
],
"source": [
"docstore = Aerospike(\n",
"    client,\n",
"    embedder,\n",
"    NAMESPACE,\n",
"    index_name=INDEX_NAME,\n",
"    vector_key=VECTOR_KEY,\n",
"    distance_strategy=MODEL_DISTANCE_CALC,\n",
")\n",
"\n",
"ids = docstore.add_texts(\n",
"    [\n",
"        \"quote: Rebellions are built on hope.\",\n",
"        \"quote: Logic is the beginning of wisdom, not the end.\",\n",
"        \"quote: If wishes were fishes, we’d all cast nets.\",\n",
"    ],\n",
"    metadatas=[\n",
"        {\"author\": \"Jyn Erso, Rogue One\"},\n",
"        {\"author\": \"Spock, Star Trek\"},\n",
"        {\"author\": \"Frank Herbert, Dune\"},\n",
"    ],\n",
")\n",
"\n",
"print(\"New IDs\")\n",
"print(ids)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Search Documents Using Max Marginal Relevance Search\n",
"\n",
"We can use max marginal relevance search to find vectors that are similar to our query but dissimilar to each other. In this example, we create a retriever object using `as_retriever`, but this could be done just as easily by calling `docstore.max_marginal_relevance_search` directly. The `lambda_mult` search argument determines the diversity of our query response. 0 corresponds to maximum diversity and 1 to minimum diversity."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"~~~~ Document 0 ~~~~\n",
"auto-generated id: 91e77b39-a528-40c6-a58a-486ae85f991a\n",
"author: John Grogan, Marley and Me: Life and Love With the World's Worst Dog\n",
"quote: Such short little lives our pets have to spend with us, and they spend most of it waiting for us to come home each day. It is amazing how much love and laughter they bring into our lives and even how much closer we become with each other because of them.\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n",
"~~~~ Document 1 ~~~~\n",
"auto-generated id: c585b4ec-92b5-4579-948c-0529373abc2a\n",
"author: John Grogan, Marley and Me: Life and Love With the World's Worst Dog\n",
"quote: Dogs are great. Bad dogs, if you can really call them that, are perhaps the greatest of them all.\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n",
"~~~~ Document 2 ~~~~\n",
"auto-generated id: 5768b31c-fac4-4af7-84b4-fb11bbfcb590\n",
"author: Colleen Houck, Tiger's Curse\n",
"quote: He then put both hands on the door on either side of my head and leaned in close, pinning me against it. I trembled like a downy rabbit caught in the clutches of a wolf. The wolf came closer. He bent his head and began nuzzling my cheek. The problem was…I wanted the wolf to devour me.\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n",
"~~~~ Document 3 ~~~~\n",
"auto-generated id: 94f1b9fb-ad57-4f65-b470-7f49dd6c274c\n",
"author: Ray Bradbury\n",
"quote: Stuff your eyes with wonder,\" he said, \"live as if you'd drop dead in ten seconds. See the world. It's more fantastic than any dream made or paid for in factories. Ask no guarantees, ask for no security, there never was such an animal. And if there were, it would be related to the great sloth which hangs upside down in a tree all day every day, sleeping its life away. To hell with that,\" he said, \"shake the tree and knock the great sloth down on his ass.\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n"
]
}
],
"source": [
"query = \"A quote about our favorite four-legged pets\"\n",
"retriever = docstore.as_retriever(\n",
"    search_type=\"mmr\", search_kwargs={\"fetch_k\": 20, \"lambda_mult\": 0.7}\n",
")\n",
"matched_docs = retriever.invoke(query)\n",
"\n",
"print_documents(matched_docs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Search Documents with a Relevance Threshold\n",
"\n",
"Another useful feature is a similarity search with a relevance threshold. Generally, we only want results that are most similar to our query but also within some range of proximity. A relevance of 1 is most similar and a relevance of 0 is most dissimilar."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"~~~~ Document 0 ~~~~\n",
"auto-generated id: 6d9e67a6-0427-41e6-9e24-050518120d74\n",
"author: Roy T. Bennett, The Light in the Heart\n",
"quote: Never lose hope. Storms make people stronger and never last forever.\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n",
"~~~~ Document 1 ~~~~\n",
"auto-generated id: 7d426e59-7935-4bcf-a676-cbe8dd4860e7\n",
"author: Roy T. Bennett, The Light in the Heart\n",
"quote: Difficulties and adversities viciously force all their might on us and cause us to fall apart, but they are necessary elements of individual growth and reveal our true potential. We have got to endure and overcome them, and move forward. Never lose hope. Storms make people stronger and never last forever.\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n",
"~~~~ Document 2 ~~~~\n",
"auto-generated id: 6ec05e48-d162-440d-8819-001d2f3712f9\n",
"author: Vincent van Gogh, The Letters of Vincent van Gogh\n",
"quote: There is peace even in the storm\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n",
"~~~~ Document 3 ~~~~\n",
"auto-generated id: d3c3de59-4da4-4ae6-8f6d-83ed905dd320\n",
"author: Edwin Morgan, A Book of Lives\n",
"quote: Valentine WeatherKiss me with rain on your eyelashes,come on, let us sway together,under the trees, and to hell with thunder.\n",
"~~~~~~~~~~~~~~~~~~~~\n",
"\n"
]
}
],
"source": [
"query = \"A quote about stormy weather\"\n",
"retriever = docstore.as_retriever(\n",
"    search_type=\"similarity_score_threshold\",\n",
"    search_kwargs={\n",
"        \"score_threshold\": 0.4\n",
"    },  # A greater value returns items with more relevance\n",
")\n",
"matched_docs = retriever.invoke(query)\n",
"\n",
"print_documents(matched_docs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Clean up\n",
"\n",
"We need to make sure we close our client to release resources and clean up threads."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"client.close()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Ready. Set. Search!\n",
"\n",
"Now that you are up to speed with Aerospike Vector Search's LangChain integration, you have the power of the Aerospike Database and the LangChain ecosystem at your fingertips. Happy building!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.12"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
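The MMR section of the new Aerospike notebook notes that the retriever wrapper could be replaced by calling `docstore.max_marginal_relevance_search` directly. A minimal sketch of that direct call, assuming the `docstore` built above; `lambda_mult=0.7` mirrors the retriever example:

```python
# Sketch: direct MMR call on the Aerospike docstore, equivalent to the
# as_retriever(search_type="mmr") example in the notebook.
query = "A quote about our favorite four-legged pets"
matched_docs = docstore.max_marginal_relevance_search(
    query, k=4, fetch_k=20, lambda_mult=0.7  # 0 = max diversity, 1 = min diversity
)
for doc in matched_docs:
    print(doc.page_content)
```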
@@ -9,7 +9,7 @@
"\n",
"This notebook covers how to get started with the `Chroma` vector store.\n",
"\n",
">[Chroma](https://docs.trychroma.com/getting-started) is an AI-native open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0. View the full docs of `Chroma` at [this page](https://docs.trychroma.com/integrations/frameworks/langchain), and find the API reference for the LangChain integration at [this page](https://python.langchain.com/api_reference/chroma/vectorstores/langchain_chroma.vectorstores.Chroma.html).\n",
">[Chroma](https://docs.trychroma.com/getting-started) is an AI-native open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0. View the full docs of `Chroma` at [this page](https://docs.trychroma.com/reference/py-collection), and find the API reference for the LangChain integration at [this page](https://python.langchain.com/api_reference/chroma/vectorstores/langchain_chroma.vectorstores.Chroma.html).\n",
"\n",
":::info Chroma Cloud\n",
"\n",
@@ -522,39 +522,6 @@
"vector_store.delete(ids=uuids[-1])"
]
},
{
"cell_type": "markdown",
"id": "675b3708-b5ef-4298-b950-eac27096b456",
"metadata": {},
"source": [
"### Fork a vector store\n",
"\n",
"Forking lets you create a new `Chroma` vector store from an existing one instantly, using copy-on-write under the hood. This means that your new `Chroma` store is identical to the origin, but any modifications to it will not affect the origin, and vice-versa.\n",
"\n",
"Forks are great for any use case that benefits from data versioning. You can learn more about forking in the [Chroma docs](https://docs.trychroma.com/cloud/collection-forking).\n",
"\n",
"Note: Forking is only available on `Chroma` instances with a Chroma Cloud connection."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e08a0c79-4d2a-49ff-be63-d8591c268764",
"metadata": {},
"outputs": [],
"source": [
"forked_store = vector_store.fork(new_name=\"my_forked_collection\")\n",
"\n",
"updated_document_2 = Document(\n",
"    page_content=\"The weather forecast for tomorrow is extremely hot, with a high of 100 degrees.\",\n",
"    metadata={\"source\": \"news\"},\n",
"    id=2,\n",
")\n",
"\n",
"# Update does not affect 'vector_store'\n",
"forked_store.update(ids=[\"2\"], documents=[updated_document_2])"
]
},
{
"cell_type": "markdown",
"id": "213acf08",
@@ -642,7 +609,7 @@
"source": [
"#### Other search methods\n",
"\n",
"There are a variety of other search methods that are not covered in this notebook. For a full list of the search abilities available for `Chroma` check out the [API reference](https://python.langchain.com/api_reference/chroma/vectorstores/langchain_chroma.vectorstores.Chroma.html).\n",
"There are a variety of other search methods that are not covered in this notebook, such as MMR search or searching by vector. For a full list of the search abilities available for `AstraDBVectorStore` check out the [API reference](https://python.langchain.com/api_reference/astradb/vectorstores/langchain_astradb.vectorstores.AstraDBVectorStore.html).\n",
"\n",
"### Query by turning into retriever\n",
"\n",
@@ -703,7 +670,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.0"
"version": "3.12.0"
}
},
"nbformat": 4,

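The removed Chroma section documents copy-on-write forking via `vector_store.fork(new_name=...)`. A minimal sketch of checking the isolation it describes, under the same assumption as those docs (a `vector_store` backed by a Chroma Cloud connection, since forking is Cloud-only); the text and ids below are illustrative placeholders:

```python
# Sketch: a fork sees its own writes while the origin store stays unchanged,
# per the copy-on-write semantics described in the removed docs.
forked_store = vector_store.fork(new_name="my_forked_collection")

forked_store.add_texts(["Only in the fork."], ids=["fork-only"])

# The fork returns the new document...
print(forked_store.similarity_search("Only in the fork.", k=1))

# ...while the origin has no record of it.
print(vector_store.similarity_search("Only in the fork.", k=1))
```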
@@ -23,7 +23,7 @@
"metadata": {},
"outputs": [],
"source": [
"! docker run -d -p 8123:8123 -p 9000:9000 --name langchain-clickhouse-server --ulimit nofile=262144:262144 -e CLICKHOUSE_SKIP_USER_SETUP=1 clickhouse/clickhouse-server:25.7"
"! docker run -d -p 8123:8123 -p9000:9000 --name langchain-clickhouse-server --ulimit nofile=262144:262144 clickhouse/clickhouse-server:24.7.6.8"
]
},
{
@@ -310,8 +310,7 @@
"    where_str=f\"{meta}.source = 'tweet'\",\n",
")\n",
"for res in results:\n",
"    page_content, metadata = res\n",
"    print(f\"* {page_content} [{metadata}]\")"
"    print(f\"* {res.page_content} [{res.metadata}]\")"
]
},
{

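The ClickHouse hunk above switches result handling between tuple unpacking and `Document` attribute access. A minimal sketch of the attribute-access pattern, which matches LangChain's standard `similarity_search` return type; the `vector_store` and query text here are assumed placeholders:

```python
# Sketch: similarity_search returns Document objects in standard LangChain
# vector stores, so results are read via attributes, not tuple unpacking.
results = vector_store.similarity_search(
    "LangChain provides abstractions to make working with LLMs easy",
    k=2,
)
for res in results:
    print(f"* {res.page_content} [{res.metadata}]")
```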
@@ -28,25 +28,10 @@
"1. You must install and set up the JaguarDB server and its HTTP gateway server.\n",
"   Please refer to the instructions in:\n",
"   [www.jaguardb.com](http://www.jaguardb.com)\n",
"\n",
"   **Method One: Docker**\n",
"\n",
"   For quick setup in a docker environment:\n",
"   docker pull jaguardb/jaguardb\n",
"   docker run -d -p 8888:8888 -p 8080:8080 --name jaguardb jaguardb/jaguardb\n",
"\n",
"   **Method Two: Quick Setup (Linux)**\n",
"\n",
"   Without Docker, run:\n",
"   ```\n",
"   curl -fsSL http://jaguardb.com/install.sh | sh\n",
"   ```\n",
"   This installs both the Jaguar vector database and HTTP gateway.\n",
"   The servers will start automatically after installation.\n",
"\n",
"\n",
"\n",
"\n",
"2. You must install the http client package for JaguarDB:\n",
"   ```\n",
"   pip install -U jaguardb-http-client\n",

@@ -591,7 +591,7 @@
},
{
"cell_type": "code",
"execution_count": 36,
"execution_count": null,
"metadata": {
"azdata_cell_guid": "d9127900-0942-48f1-bd4d-081c7fa3fcae",
"language": "python"
@@ -606,7 +606,7 @@
}
],
"source": [
"from langchain.document_loaders import AzureBlobStorageFileLoader\n",
"from langchain_community.document_loaders import AzureBlobStorageFileLoader\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_core.documents import Document\n",
"\n",

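The hunk above migrates the loader import from `langchain.document_loaders` to `langchain_community.document_loaders`. A minimal sketch of the updated usage; the keyword argument names follow the community loader's documented constructor, and the connection string, container, and blob names are placeholders:

```python
# Sketch: updated import path for the Azure Blob Storage file loader.
from langchain_community.document_loaders import AzureBlobStorageFileLoader

loader = AzureBlobStorageFileLoader(
    conn_str="<azure_storage_connection_string>",  # placeholder
    container="<container_name>",                  # placeholder
    blob_name="<blob_name>",                       # placeholder
)
docs = loader.load()  # returns a list of Document objects
```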
@@ -47,20 +47,7 @@
"\n",
"Some Weaviate instances, such as those running on WCS, have authentication enabled, such as API key and/or username+password authentication.\n",
"\n",
"Read the [client authentication guide](https://weaviate.io/developers/weaviate/client-libraries/python#authentication) for more information, as well as the [in-depth authentication configuration page](https://weaviate.io/developers/weaviate/configuration/authentication).\n",
"\n",
"### Connect to an existing collection (reuse an index)\n",
"If you already created a collection in your local Weaviate instance, you can connect to it directly:",
"\n",
"```python\n",
"from langchain_weaviate import WeaviateVectorStore\n",
"\n",
"store = WeaviateVectorStore(\n",
"    client=weaviate_client,\n",
"    index_name=\"Test\",\n",
"    text_key=\"text\",\n",
")\n",
"```\n"
"Read the [client authentication guide](https://weaviate.io/developers/weaviate/client-libraries/python#authentication) for more information, as well as the [in-depth authentication configuration page](https://weaviate.io/developers/weaviate/configuration/authentication)."
]
},
{

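The removed Weaviate passage shows attaching a `WeaviateVectorStore` to an existing collection. A minimal sketch extending it with a query, assuming a connected `weaviate_client` and that the "Test" collection stores its text under the "text" property; if the collection was not created with a server-side vectorizer, an embedding model would also need to be passed:

```python
# Sketch: reuse an existing Weaviate collection and run a similarity search.
from langchain_weaviate import WeaviateVectorStore

store = WeaviateVectorStore(
    client=weaviate_client,  # an already-connected weaviate client (assumed)
    index_name="Test",
    text_key="text",
)
docs = store.similarity_search("example query", k=3)
for doc in docs:
    print(doc.page_content)
```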
Some files were not shown because too many files have changed in this diff.