mirror of https://github.com/hwchase17/langchain.git (synced 2026-02-06 01:00:22 +00:00)

Compare commits: bagatur/lc ... v0.0.351 (27 commits)

Commits: 193f107cb5, 714bef0cb6, 61ad0e8be9, 92957e6cdf, 9f851d8951, 23eb480c38, 5de1dc72b9, 5fc2c578cf, bbc98a234d, 11fda490ca, 2e6a9e6381, 462321f479, 6376fab957, 2d91d2b978, c316731d0f, 59c3c344df, 2929509edd, 78ae276df7, f1d3f29bc4, 1acc7ffa3f, 8a07c56313, 01693b291e, 133971053a, 34e6f3ff72, dcead816df, eca89f87d8, 42421860bc
353 .github/CONTRIBUTING.md vendored
@@ -3,31 +3,17 @@
Hi there! Thank you for even being interested in contributing to LangChain.
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether they involve new features, improved infrastructure, better documentation, or bug fixes.

To learn about how to contribute, please follow the [guides here](https://python.langchain.com/docs/contributing/)

## 🗺️ Guidelines

### 👩‍💻 Contributing Code
### 👩‍💻 Ways to contribute

To contribute to this project, please follow the ["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow.
Please do not try to push directly to this repo unless you are a maintainer.
There are many ways to contribute to LangChain. Here are some common ways people contribute:

Please follow the checked-in pull request template when opening pull requests. Note related issues and tag relevant maintainers.

Pull requests cannot land without passing the formatting, linting, and testing checks first. See [Testing](#testing) and [Formatting and Linting](#formatting-and-linting) for how to run these checks locally.

It's essential that we maintain great documentation and testing. If you:

- Fix a bug
  - Add a relevant unit or integration test when possible. These live in `tests/unit_tests` and `tests/integration_tests`.
- Make an improvement
  - Update any affected example notebooks and documentation. These live in `docs`.
  - Update unit and integration tests when relevant.
- Add a feature
  - Add a demo notebook in `docs/docs/`.
  - Add unit and integration tests.

We are a small, progress-oriented team. If there's something you'd like to add or change, opening a pull request is the best way to get our attention.

- [**Documentation**](https://python.langchain.com/docs/contributing/documentation): Help improve our docs, including this one!
- [**Code**](https://python.langchain.com/docs/contributing/code): Help us write code, fix bugs, or improve our infrastructure.
- [**Integrations**](https://python.langchain.com/docs/contributing/integration): Help us integrate with your favorite vendors and tools.

### 🚩GitHub Issues
@@ -54,327 +40,6 @@ In a similar vein, we do enforce certain linting, formatting, and documentation
If you are finding these difficult (or even just annoying) to work with, feel free to contact a maintainer for help -
we do not want these to get in the way of getting good code into the codebase.

## 🚀 Quick Start
### Contributor Documentation

This quick start guide explains how to run the repository locally.
For a [development container](https://containers.dev/), see the [.devcontainer folder](https://github.com/langchain-ai/langchain/tree/master/.devcontainer).

### Dependency Management: Poetry and other env/dependency managers

This project utilizes [Poetry](https://python-poetry.org/) v1.6.1+ as a dependency manager.

❗Note: *Before installing Poetry*, if you use `Conda`, create and activate a new Conda env (e.g. `conda create -n langchain python=3.9`)

Install Poetry: **[documentation on how to install it](https://python-poetry.org/docs/#installation)**.

❗Note: If you use `Conda` or `Pyenv` as your environment/package manager, after installing Poetry, tell Poetry to use the virtualenv python environment (`poetry config virtualenvs.prefer-active-python true`)

### Different packages

This repository contains multiple packages:

- `langchain-core`: Base interfaces for key abstractions as well as logic for combining them in chains (LangChain Expression Language).
- `langchain-community`: Third-party integrations of various components.
- `langchain`: Chains, agents, and retrieval logic that makes up the cognitive architecture of your applications.
- `langchain-experimental`: Components and chains that are experimental, either in the sense that the techniques are novel and still being tested, or they require giving the LLM more access than would be possible in most production systems.

Each of these has its own development environment. Docs are run from the top-level makefile, but development is split across separate test & release flows.

For this quickstart, start with langchain:

```bash
cd libs/langchain
```
### Local Development Dependencies

Install langchain development requirements (for running langchain, running examples, linting, formatting, tests, and coverage):

```bash
poetry install --with test
```

Then verify dependency installation:

```bash
make test
```

If the tests don't pass, you may need to pip install additional dependencies, such as `numexpr` and `openapi_schema_pydantic`.

If during installation you receive a `WheelFileValidationError` for `debugpy`, please make sure you are running Poetry v1.6.1+. This bug was present in older versions of Poetry (e.g. 1.4.1) and has been resolved in newer releases. If you are still seeing this bug on v1.6.1, you may also try disabling "modern installation" (`poetry config installer.modern-installation false`) and re-installing requirements. See [this `debugpy` issue](https://github.com/microsoft/debugpy/issues/1246) for more details.

### Testing

_some test dependencies are optional; see section about optional dependencies_.

Unit tests cover modular logic that does not require calls to outside APIs.
If you add new logic, please add a unit test.

To run unit tests:

```bash
make test
```

To run unit tests in Docker:

```bash
make docker_tests
```

There are also [integration tests and code-coverage](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/tests/README.md) available.

### Only develop langchain_core or langchain_experimental

If you are only developing `langchain_core` or `langchain_experimental`, you can simply install the dependencies for the respective projects and run tests:

```bash
cd libs/core
poetry install --with test
make test
```

Or:

```bash
cd libs/experimental
poetry install --with test
make test
```
### Formatting and Linting

Run these locally before submitting a PR; the CI system will also check them.

#### Code Formatting

Formatting for this project is done via [ruff](https://docs.astral.sh/ruff/rules/).

To run formatting for docs, cookbook and templates:

```bash
make format
```

To run formatting for a library, run the same command from the relevant library directory:

```bash
cd libs/{LIBRARY}
make format
```

Additionally, you can run the formatter only on the files that have been modified in your current branch as compared to the master branch using the format_diff command:

```bash
make format_diff
```

This is especially useful when you have made changes to a subset of the project and want to ensure your changes are properly formatted without affecting the rest of the codebase.

#### Linting

Linting for this project is done via a combination of [ruff](https://docs.astral.sh/ruff/rules/) and [mypy](http://mypy-lang.org/).

To run linting for docs, cookbook and templates:

```bash
make lint
```

To run linting for a library, run the same command from the relevant library directory:

```bash
cd libs/{LIBRARY}
make lint
```

In addition, you can run the linter only on the files that have been modified in your current branch as compared to the master branch using the lint_diff command:

```bash
make lint_diff
```

This can be very helpful when you've made changes to only certain parts of the project and want to ensure your changes meet the linting standards without having to check the entire codebase.

We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.

#### Spellcheck

Spellchecking for this project is done via [codespell](https://github.com/codespell-project/codespell).
Note that `codespell` only finds common typos, so it can produce false positives (flagging correctly spelled but rarely used words) and false negatives (missing misspelled words).

To check spelling for this project:

```bash
make spell_check
```

To fix spelling in place:

```bash
make spell_fix
```

If codespell is incorrectly flagging a word, you can skip spellcheck for that word by adding it to the codespell config in the `pyproject.toml` file.

```toml
[tool.codespell]
...
# Add here:
ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogyny,unsecure'
```
## Working with Optional Dependencies

Langchain relies heavily on optional dependencies to keep the Langchain package lightweight.

You only need to add a new dependency if a **unit test** relies on the package.
If your package is only required for **integration tests**, then you can skip these steps and leave all pyproject.toml and poetry.lock files alone.

If you're adding a new dependency to Langchain, assume that it will be an optional dependency, and that most users won't have it installed.

Users who do not have the dependency installed should be able to **import** your code without any side effects (no warnings, no errors, no exceptions).

To introduce the dependency to the pyproject.toml file correctly, please do the following:

1. Add the dependency to the main group as an optional dependency
   ```bash
   poetry add --optional [package_name]
   ```
2. Open pyproject.toml and add the dependency to the `extended_testing` extra
3. Relock the poetry file to update the extra.
   ```bash
   poetry lock --no-update
   ```
4. Add a unit test that at the very least attempts to import the new code. Ideally, the unit test makes use of lightweight fixtures to test the logic of the code.
5. Please use the `@pytest.mark.requires(package_name)` decorator for any tests that require the dependency.
## Adding a Jupyter Notebook

If you are adding a Jupyter Notebook example, you'll want to install the optional `dev` dependencies.

To install dev dependencies:

```bash
poetry install --with dev
```

Launch a notebook:

```bash
poetry run jupyter notebook
```

When you run `poetry install`, the `langchain` package is installed as editable in the virtualenv, so your new logic can be imported into the notebook.

## Documentation

While the code is split between `langchain` and `langchain.experimental`, the documentation is one holistic thing.
This covers how to get started contributing to documentation.

From the top-level of this repo, install documentation dependencies:

```bash
poetry install
```

### Contribute Documentation

The docs directory contains Documentation and API Reference.

Documentation is built using [Docusaurus 2](https://docusaurus.io/).

The API Reference is largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code.
For that reason, we ask that you add good documentation to all classes and methods.

Similar to linting, we recognize documentation can be annoying. If you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.

### Build Documentation Locally

In the following commands, the prefix `api_` indicates that those are operations for the API Reference.

Before building the documentation, it is always a good idea to clean the build directory:

```bash
make docs_clean
make api_docs_clean
```

Next, you can build the documentation as outlined below:

```bash
make docs_build
make api_docs_build
```

Finally, run the link checker to ensure all links are valid:

```bash
make docs_linkcheck
make api_docs_linkcheck
```

### Verify Documentation changes

After pushing documentation changes to the repository, you can preview and verify that the changes are what you wanted by clicking the `View deployment` or `Visit Preview` buttons on the pull request `Conversation` page.
This will take you to a preview of the documentation changes.
This preview is created by [Vercel](https://vercel.com/docs/getting-started-with-vercel).
## 📕 Releases & Versioning

As of now, LangChain has an ad hoc release process: releases are cut with high frequency by a maintainer and published to [PyPI](https://pypi.org/).
The different packages are versioned slightly differently.

### `langchain-core`

`langchain-core` is currently on version `0.1.x`.

As `langchain-core` contains the base abstractions and runtime for the whole LangChain ecosystem, we will communicate any breaking changes with advance notice and version bumps. The exception for this is anything in `langchain_core.beta`. The reason for `langchain_core.beta` is that given the rate of change of the field, being able to move quickly is still a priority, and this module is our attempt to do so.

Minor version increases will occur for:

- Breaking changes for any public interfaces NOT in `langchain_core.beta`

Patch version increases will occur for:

- Bug fixes
- New features
- Any changes to private interfaces
- Any changes to `langchain_core.beta`

### `langchain`

`langchain` is currently on version `0.0.x`.

All changes will be accompanied by a patch version increase. Any changes to public interfaces are nearly always done in a backwards compatible way and will be communicated ahead of time when they are not backwards compatible.

We are targeting January 2024 for a release of `langchain` v0.1, at which point `langchain` will adopt the same versioning policy as `langchain-core`.

### `langchain-community`

`langchain-community` is currently on version `0.0.x`.

All changes will be accompanied by a patch version increase.

### `langchain-experimental`

`langchain-experimental` is currently on version `0.0.x`.

All changes will be accompanied by a patch version increase.

## 🌟 Recognition

If your contribution has made its way into a release, we will want to give you credit on Twitter (only if you want though)!
If you have a Twitter account you would like us to mention, please let us know in the PR or through another means.
To learn about how to contribute, please follow the [guides here](https://python.langchain.com/docs/contributing/)
2 .github/ISSUE_TEMPLATE/feature-request.yml vendored
@@ -27,4 +27,4 @@ body:
attributes:
  label: Your contribution
  description: |
    Is there any way that you could help, e.g. by submitting a PR? Make sure to read the CONTRIBUTING.MD [readme](https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md)
    Is there any way that you could help, e.g. by submitting a PR? Make sure to read the [Contributing Guide](https://python.langchain.com/docs/contributing/)
2 .github/PULL_REQUEST_TEMPLATE.md vendored
@@ -10,7 +10,7 @@ Replace this entire comment with:
Please make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally.

See contribution guidelines for more information on how to write/run tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on network access,
9 .github/workflows/_release.yml vendored
@@ -11,15 +11,8 @@ on:
inputs:
  working-directory:
    required: true
    type: choice
    type: string
    default: 'libs/langchain'
    options:
      - libs/langchain
      - libs/core
      - libs/experimental
      - libs/community
      - libs/partners/google-genai
      - libs/partners/nvidia-ai-endpoints

env:
  PYTHON_VERSION: "3.10"
@@ -14,6 +14,7 @@ build:
- python -m pip install --upgrade --no-cache-dir pip setuptools
- python -m pip install --upgrade --no-cache-dir sphinx readthedocs-sphinx-ext
- python -m pip install --exists-action=w --no-cache-dir -r docs/api_reference/requirements.txt
- python -m pip install ./libs/partners/*
- python docs/api_reference/create_api_rst.py
- cat docs/api_reference/conf.py
- python -m sphinx -T -E -b html -d _build/doctrees -c docs/api_reference docs/api_reference $READTHEDOCS_OUTPUT/html -j auto
@@ -105,7 +105,7 @@ Please see [here](https://python.langchain.com) for full documentation, which in
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.

For detailed information on how to contribute, see [here](.github/CONTRIBUTING.md).
For detailed information on how to contribute, see [here](https://python.langchain.com/docs/contributing/).

## 🌟 Contributors
@@ -13,9 +13,9 @@ rsync -ruv --exclude node_modules --exclude api_reference --exclude .venv --excl
cd ../_dist
poetry run python scripts/model_feat_table.py
cp ../cookbook/README.md src/pages/cookbook.mdx
cp ../.github/CONTRIBUTING.md docs/contributing.md
mkdir -p docs/templates
cp ../templates/docs/INDEX.md docs/templates/index.md
poetry run python scripts/copy_templates.py
wget https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md -O docs/langserve.md

yarn
@@ -1,49 +1,3 @@
# Website
# LangChain Documentation

This website is built using [Docusaurus 2](https://docusaurus.io/), a modern static website generator.

### Installation

```
$ yarn
```

### Local Development

```
$ yarn start
```

This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

### Build

```
$ yarn build
```

This command generates static content into the `build` directory and can be served using any static contents hosting service.

### Deployment

Using SSH:

```
$ USE_SSH=true yarn deploy
```

Not using SSH:

```
$ GIT_USER=<Your GitHub username> yarn deploy
```

If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.

### Continuous Integration

Some common defaults for linting/formatting have been set for you. If you integrate your project with an open-source Continuous Integration system (e.g. Travis CI, CircleCI), you may check for issues using the following command.

```
$ yarn ci
```

For more information on contributing to our documentation, see the [Documentation Contributing Guide](https://python.langchain.com/docs/contributing/documentation)
@@ -72,8 +72,8 @@ def setup(app):
# -- Project information -----------------------------------------------------

project = "🦜🔗 LangChain"
copyright = "2023, Harrison Chase"
author = "Harrison Chase"
copyright = "2023, LangChain, Inc."
author = "LangChain, Inc."

version = data["tool"]["poetry"]["version"]
release = version
@@ -141,13 +141,20 @@ redirects = {
for old_link in redirects:
    html_additional_pages[old_link] = "redirects.html"

partners_dir = Path(__file__).parent.parent.parent / "libs/partners"
partners = [
    (p.name, p.name.replace("-", "_") + "_api_reference")
    for p in partners_dir.iterdir()
]

html_context = {
    "display_github": True,  # Integrate GitHub
    "github_user": "hwchase17",  # Username
    "github_user": "langchain-ai",  # Username
    "github_repo": "langchain",  # Repo name
    "github_version": "master",  # Version
    "conf_py_path": "/docs/api_reference",  # Path in the checkout to the docs root
    "redirects": redirects,
    "partners": partners,
}

# Add any paths that contain custom static files (such as style sheets) here,
@@ -2,7 +2,6 @@
-e libs/langchain
-e libs/core
-e libs/community
-e libs/partners/google-genai
pydantic<2
autodoc_pydantic==1.8.0
myst_parser
@@ -6,11 +6,6 @@
{%- set top_container_cls = "sk-landing-container" %}
{%- endif %}

{# title, link, link_attrs #}
{%- set drop_down_navigation = [
  ('Google Generative AI', pathto('google_genai_api_reference'), ''),]
-%}

<nav id="navbar" class="{{ nav_bar_class }} navbar navbar-expand-md navbar-light bg-light py-0">
  <div class="container-fluid {{ top_container_cls }} px-0">
  {%- if logo_url %}

@@ -48,16 +43,16 @@

<li class="nav-item">
  <a class="sk-nav-link nav-link" href="{{ pathto('experimental_api_reference') }}">Experimental</a>
</li>
{%- for title, link, link_attrs in drop_down_navigation %}
{%- for title, pathname in partners %}
<li class="nav-item">
  <a class="sk-nav-link nav-link nav-more-item-mobile-items" href="{{ link }}" {{ link_attrs }}>{{ title }}</a>
  <a class="sk-nav-link nav-link nav-more-item-mobile-items" href="{{ pathto(pathname) }}">{{ title }}</a>
</li>
{%- endfor %}
<li class="nav-item dropdown nav-more-item-dropdown">
  <a class="sk-nav-link nav-link dropdown-toggle" href="#" id="navbarDropdown" role="button" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false">Partner libs</a>
  <div class="dropdown-menu" aria-labelledby="navbarDropdown">
    {%- for title, link, link_attrs in drop_down_navigation %}
    <a class="sk-nav-dropdown-item dropdown-item" href="{{ link }}" {{ link_attrs }}>{{ title }}</a>
    {%- for title, pathname in partners %}
    <a class="sk-nav-dropdown-item dropdown-item" href="{{ pathto(pathname) }}">{{ title }}</a>
    {%- endfor %}
  </div>
</li>
@@ -18,7 +18,7 @@ Whether you’re new to LangChain, looking to go deeper, or just want to get mor
LangChain is the product of over 5,000+ contributions by 1,500+ contributors, and there is **still** so much to do together. Here are some ways to get involved:

- **[Open a pull request](https://github.com/langchain-ai/langchain/issues):** We’d appreciate all forms of contributions–new features, infrastructure improvements, better documentation, bug fixes, etc. If you have an improvement or an idea, we’d love to work on it with you.
- **[Read our contributor guidelines:](https://github.com/langchain-ai/langchain/blob/bbd22b9b761389a5e40fc45b0570e1830aabb707/.github/CONTRIBUTING.md)** We ask contributors to follow a ["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow, run a few local checks for formatting, linting, and testing before submitting, and follow certain documentation and testing conventions.
- **[Read our contributor guidelines:](./contributing/)** We ask contributors to follow a ["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow, run a few local checks for formatting, linting, and testing before submitting, and follow certain documentation and testing conventions.
- **First time contributor?** [Try one of these PRs with the “good first issue” tag](https://github.com/langchain-ai/langchain/contribute).
- **Become an expert:** Our experts help the community by answering product questions in Discord. If that’s a role you’d like to play, we’d be so grateful! (And we have some special experts-only goodies/perks we can tell you more about). Send us an email to introduce yourself at hello@langchain.dev and we’ll take it from there!
- **Integrate with LangChain:** If your product integrates with LangChain–or aspires to–we want to help make sure the experience is as smooth as possible for you and end users. Send us an email at hello@langchain.dev and tell us what you’re working on.
250 docs/docs/contributing/code.mdx Normal file
@@ -0,0 +1,250 @@
---
sidebar_position: 1
---
# Contribute Code

To contribute to this project, please follow the ["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow.
Please do not try to push directly to this repo unless you are a maintainer.

Please follow the checked-in pull request template when opening pull requests. Note related issues and tag relevant maintainers.

Pull requests cannot land without passing the formatting, linting, and testing checks first. See [Testing](#testing) and [Formatting and Linting](#formatting-and-linting) for how to run these checks locally.

It's essential that we maintain great documentation and testing. If you:

- Fix a bug
  - Add a relevant unit or integration test when possible. These live in `tests/unit_tests` and `tests/integration_tests`.
- Make an improvement
  - Update any affected example notebooks and documentation. These live in `docs`.
  - Update unit and integration tests when relevant.
- Add a feature
  - Add a demo notebook in `docs/docs/`.
  - Add unit and integration tests.

We are a small, progress-oriented team. If there's something you'd like to add or change, opening a pull request is the best way to get our attention.
## 🚀 Quick Start

This quick start guide explains how to run the repository locally.
For a [development container](https://containers.dev/), see the [.devcontainer folder](https://github.com/langchain-ai/langchain/tree/master/.devcontainer).

### Dependency Management: Poetry and other env/dependency managers

This project utilizes [Poetry](https://python-poetry.org/) v1.6.1+ as a dependency manager.

❗Note: *Before installing Poetry*, if you use `Conda`, create and activate a new Conda env (e.g. `conda create -n langchain python=3.9`)

Install Poetry: **[documentation on how to install it](https://python-poetry.org/docs/#installation)**.

❗Note: If you use `Conda` or `Pyenv` as your environment/package manager, after installing Poetry, tell Poetry to use the virtualenv python environment (`poetry config virtualenvs.prefer-active-python true`)

### Different packages

This repository contains multiple packages:

- `langchain-core`: Base interfaces for key abstractions as well as logic for combining them in chains (LangChain Expression Language).
- `langchain-community`: Third-party integrations of various components.
- `langchain`: Chains, agents, and retrieval logic that makes up the cognitive architecture of your applications.
- `langchain-experimental`: Components and chains that are experimental, either in the sense that the techniques are novel and still being tested, or they require giving the LLM more access than would be possible in most production systems.
- Partner integrations: Partner packages in `libs/partners` that are independently version controlled.

Each of these has its own development environment. Docs are run from the top-level makefile, but development is split across separate test & release flows.

For this quickstart, start with langchain-community:

```bash
cd libs/community
```
### Local Development Dependencies

Install langchain-community development requirements (for running langchain, running examples, linting, formatting, tests, and coverage):

```bash
poetry install --with lint,typing,test,integration_tests
```

Then verify dependency installation:

```bash
make test
```

If during installation you receive a `WheelFileValidationError` for `debugpy`, please make sure you are running Poetry v1.6.1+. This bug was present in older versions of Poetry (e.g. 1.4.1) and has been resolved in newer releases. If you are still seeing this bug on v1.6.1, you may also try disabling "modern installation" (`poetry config installer.modern-installation false`) and re-installing requirements. See [this `debugpy` issue](https://github.com/microsoft/debugpy/issues/1246) for more details.

### Testing

_In `langchain`, `langchain-community`, and `langchain-experimental`, some test dependencies are optional; see section about optional dependencies_.

Unit tests cover modular logic that does not require calls to outside APIs.
If you add new logic, please add a unit test.

To run unit tests:

```bash
make test
```

To run unit tests in Docker:

```bash
make docker_tests
```

There are also [integration tests and code-coverage](./testing) available.

### Only develop langchain_core or langchain_experimental

If you are only developing `langchain_core` or `langchain_experimental`, you can simply install the dependencies for the respective projects and run tests:

```bash
cd libs/core
poetry install --with test
make test
```

Or:

```bash
cd libs/experimental
poetry install --with test
make test
```
### Formatting and Linting

Run these locally before submitting a PR; the CI system will also check them.

#### Code Formatting

Formatting for this project is done via [ruff](https://docs.astral.sh/ruff/rules/).

To run formatting for docs, cookbook and templates:

```bash
make format
```

To run formatting for a library, run the same command from the relevant library directory:

```bash
cd libs/{LIBRARY}
make format
```

Additionally, you can run the formatter only on the files that have been modified in your current branch as compared to the master branch using the format_diff command:

```bash
make format_diff
```

This is especially useful when you have made changes to a subset of the project and want to ensure your changes are properly formatted without affecting the rest of the codebase.

#### Linting

Linting for this project is done via a combination of [ruff](https://docs.astral.sh/ruff/rules/) and [mypy](http://mypy-lang.org/).

To run linting for docs, cookbook and templates:

```bash
make lint
```

To run linting for a library, run the same command from the relevant library directory:

```bash
cd libs/{LIBRARY}
make lint
```

In addition, you can run the linter only on the files that have been modified in your current branch as compared to the master branch using the lint_diff command:

```bash
make lint_diff
```

This can be very helpful when you've made changes to only certain parts of the project and want to ensure your changes meet the linting standards without having to check the entire codebase.

We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.

#### Spellcheck

Spellchecking for this project is done via [codespell](https://github.com/codespell-project/codespell).
Note that `codespell` only finds common typos, so it can produce false positives (flagging correctly spelled but rarely used words) and false negatives (missing misspelled words).

To check spelling for this project:

```bash
make spell_check
```

To fix spelling in place:

```bash
make spell_fix
```

If codespell is incorrectly flagging a word, you can skip spellcheck for that word by adding it to the codespell config in the `pyproject.toml` file.

```toml
[tool.codespell]
...
# Add here:
ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogyny,unsecure'
```
## Working with Optional Dependencies

`langchain`, `langchain-community`, and `langchain-experimental` rely on optional dependencies to keep these packages lightweight.

`langchain-core` and partner packages **do not use** optional dependencies in this way.

You only need to add a new dependency if a **unit test** relies on the package.
If your package is only required for **integration tests**, then you can skip these steps and leave all pyproject.toml and poetry.lock files alone.

If you're adding a new dependency to Langchain, assume that it will be an optional dependency, and that most users won't have it installed.

Users who do not have the dependency installed should be able to **import** your code without any side effects (no warnings, no errors, no exceptions).
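A common pattern for this is to defer the import until the dependency is actually needed and raise a helpful `ImportError` otherwise. A minimal sketch (the `parrot_link_sdk` package and `ParrotLinkTool` class are hypothetical, for illustration only):

```python
from typing import Any


class ParrotLinkTool:
    """Sketch of a component with an optional dependency."""

    def run(self, query: str) -> Any:
        # Import lazily so that merely importing this module has no side effects
        # for users who have not installed the optional package.
        try:
            import parrot_link_sdk  # hypothetical optional dependency
        except ImportError:
            raise ImportError(
                "Could not import parrot_link_sdk. "
                "Please install it with `pip install parrot-link-sdk`."
            )
        return parrot_link_sdk.query(query)
```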
To introduce the dependency to the pyproject.toml file correctly, please do the following:

1. Add the dependency to the main group as an optional dependency
   ```bash
   poetry add --optional [package_name]
   ```
2. Open pyproject.toml and add the dependency to the `extended_testing` extra
3. Relock the poetry file to update the extra.
   ```bash
   poetry lock --no-update
   ```
4. Add a unit test that at the very least attempts to import the new code. Ideally, the unit test makes use of lightweight fixtures to test the logic of the code.
5. Please use the `@pytest.mark.requires(package_name)` decorator for any tests that require the dependency, as in the sketch after this list.
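For instance, a test gated on the optional dependency might look like the following sketch (the package name and import path are hypothetical):

```python
import pytest


@pytest.mark.requires("parrot_link_sdk")  # runs only when the optional package is installed
def test_parrot_link_import() -> None:
    # At the very least, verify that the new code is importable.
    from langchain_community.tools.parrot_link import ParrotLinkTool  # hypothetical module

    assert ParrotLinkTool is not None
```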
## Adding a Jupyter Notebook

If you are adding a Jupyter Notebook example, you'll want to install the optional `dev` dependencies.

To install dev dependencies:

```bash
poetry install --with dev
```

Launch a notebook:

```bash
poetry run jupyter notebook
```

When you run `poetry install`, the `langchain` package is installed as editable in the virtualenv, so your new logic can be imported into the notebook.
67 docs/docs/contributing/documentation.mdx Normal file
@@ -0,0 +1,67 @@
---
sidebar_position: 3
---
# Contribute Documentation

The docs directory contains Documentation and API Reference.

Documentation is built using [Quarto](https://quarto.org) and [Docusaurus 2](https://docusaurus.io/).

The API Reference is largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code and is hosted by [Read the Docs](https://readthedocs.org/).
For that reason, we ask that you add good documentation to all classes and methods.
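For example, sphinx can pick up a Google-style docstring like the following sketch (the method and its parameters are hypothetical):

```python
from typing import List


def add_texts(self, texts: List[str], namespace: str = "default") -> List[str]:
    """Embed the given texts and add them to the vector store.

    Args:
        texts: The strings to embed and store.
        namespace: Logical partition of the store to write into.

    Returns:
        The IDs of the stored documents.
    """
    ...
```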
Similar to linting, we recognize documentation can be annoying. If you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.

## Build Documentation Locally

### Install dependencies

- [Quarto](https://quarto.org) - package that converts Jupyter notebooks (`.ipynb` files) into mdx files for serving in Docusaurus.
- `poetry install` from the monorepo root

### Building

In the following commands, the prefix `api_` indicates that those are operations for the API Reference.

Before building the documentation, it is always a good idea to clean the build directory:

```bash
make docs_clean
make api_docs_clean
```

Next, you can build the documentation as outlined below:

```bash
make docs_build
make api_docs_build
```

Finally, run the link checker to ensure all links are valid:

```bash
make docs_linkcheck
make api_docs_linkcheck
```

### Linting and Formatting

The docs are linted from the monorepo root. To lint the docs, run the following from there:

```bash
poetry install --with lint,typing
make lint
```

If you have formatting-related errors, you can fix them automatically with:

```bash
make format
```

## Verify Documentation changes

After pushing documentation changes to the repository, you can preview and verify that the changes are what you wanted by clicking the `View deployment` or `Visit Preview` buttons on the pull request `Conversation` page.
This will take you to a preview of the documentation changes.
This preview is created by [Vercel](https://vercel.com/docs/getting-started-with-vercel).
42 docs/docs/contributing/index.mdx Normal file
@@ -0,0 +1,42 @@
---
sidebar_position: 0
---
# Welcome Contributors

Hi there! Thank you for even being interested in contributing to LangChain.
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether they involve new features, improved infrastructure, better documentation, or bug fixes.

## 🗺️ Guidelines

### 👩‍💻 Ways to contribute

There are many ways to contribute to LangChain. Here are some common ways people contribute:

- [**Documentation**](./documentation): Help improve our docs, including this one!
- [**Code**](./code): Help us write code, fix bugs, or improve our infrastructure.
- [**Integrations**](./integration): Help us integrate with your favorite vendors and tools.

### 🚩GitHub Issues

Our [issues](https://github.com/langchain-ai/langchain/issues) page is kept up to date with bugs, improvements, and feature requests.

There is a taxonomy of labels to help with sorting and discovery of issues of interest. Please use these to help organize issues.

If you start working on an issue, please assign it to yourself.

If you are adding an issue, please try to keep it focused on a single, modular bug/improvement/feature.
If two issues are related, or blocking, please link them rather than combining them.

We will try to keep these issues as up-to-date as possible, though with the rapid rate of development in this field some may get out of date.
If you notice this happening, please let us know.

### 🙋Getting Help

Our goal is to have the simplest developer setup possible. Should you experience any difficulty getting set up, please contact a maintainer! Not only do we want to help get you unblocked, but we also want to make sure that the process is smooth for future contributors.

In a similar vein, we do enforce certain linting, formatting, and documentation standards in the codebase.
If you are finding these difficult (or even just annoying) to work with, feel free to contact a maintainer for help -
we do not want these to get in the way of getting good code into the codebase.
148 docs/docs/contributing/integrations.mdx Normal file
@@ -0,0 +1,148 @@
---
sidebar_position: 5
---
# Contribute Integrations

To begin, make sure you have all the dependencies outlined in the guide on [Contributing Code](./code).

There are a few different places you can contribute integrations for LangChain:

- **Community**: For lighter-weight integrations that are primarily maintained by LangChain and the Open Source Community.
- **Partner Packages**: For independent packages that are co-maintained by LangChain and a partner.

For the most part, new integrations should be added to the Community package. Partner packages require more maintenance as separate packages, so please confirm with the LangChain team before creating a new partner package.

In the following sections, we'll walk through how to contribute to each of these packages from a fake company, `Parrot Link AI`.

## Community Package

The `langchain-community` package is in `libs/community` and contains most integrations.

It is installed by users with `pip install langchain-community`, and exported members can be imported with code like

```python
from langchain_community.chat_models import ChatParrotLink
from langchain_community.llms import ParrotLinkLLM
from langchain_community.vectorstores import ParrotLinkVectorStore
```

The community package relies on manually-installed dependent packages, so you will see errors if you try to import a package that is not installed. In our fake example, if you tried to import `ParrotLinkLLM` without installing `parrot-link-sdk`, you would see an `ImportError` telling you to install it when trying to use it.
Let's say we wanted to implement a chat model for Parrot Link AI. We would create a new file in `libs/community/langchain_community/chat_models/parrot_link.py` with the following code:

```python
from langchain_core.language_models.chat_models import BaseChatModel


class ChatParrotLink(BaseChatModel):
    """ChatParrotLink chat model.

    Example:
        .. code-block:: python

            from langchain_community.chat_models import ChatParrotLink

            model = ChatParrotLink()
    """

    ...
```
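Since the `...` above elides the implementation, here is a rough sketch of what filling it in might look like, assuming the standard `BaseChatModel` interface from `langchain-core` (the Parrot Link behavior itself is made up):

```python
from typing import Any, List, Optional

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import AIMessage, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatResult


class ChatParrotLink(BaseChatModel):
    """Sketch of a chat model implementation."""

    @property
    def _llm_type(self) -> str:
        # Identifier used for logging and tracing.
        return "parrot-link"

    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> ChatResult:
        # A real implementation would call parrot-link-sdk here;
        # this stub just parrots the last message back.
        text = messages[-1].content if messages else ""
        message = AIMessage(content=text)
        return ChatResult(generations=[ChatGeneration(message=message)])
```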
And we would write tests in:

- Unit tests: `libs/community/tests/unit_tests/chat_models/test_parrot_link.py`
- Integration tests: `libs/community/tests/integration_tests/chat_models/test_parrot_link.py`

And add documentation to:

- `docs/docs/integrations/chat/parrot_link.ipynb`
- `docs/docs/
## Partner Packages

Partner packages are in `libs/partners/*` and are installed by users with `pip install langchain-{partner}`, and exported members can be imported with code like

```python
from langchain_{partner} import X
```

### Set up a new package

To set up a new partner package, use the latest version of the LangChain CLI. You can install or update it with:

```bash
pip install -U langchain-cli
```

Let's say you want to create a new partner package working for a company called Parrot Link AI.

Then, run the following command to create a new partner package:

```bash
cd libs/partners
langchain-cli integration new
> Name: parrot-link
> Name of integration in PascalCase [ParrotLink]: ParrotLink
```

This will create a new package in `libs/partners/parrot-link` with the following structure:

```
libs/partners/parrot-link/
  langchain_parrot_link/  # folder containing your package
    ...
  tests/
    ...
  docs/  # bootstrapped docs notebooks, must be moved to /docs in monorepo root
    ...
  scripts/  # scripts for CI
    ...
  LICENSE
  README.md  # fill out with information about your package
  Makefile  # default commands for CI
  pyproject.toml  # package metadata, mostly managed by Poetry
  poetry.lock  # package lockfile, managed by Poetry
  .gitignore
```

### Implement your package

First, add any dependencies your package needs, such as your company's SDK:

```bash
poetry add parrot-link-sdk
```

If you need separate dependencies for type checking, you can add them to the `typing` group with:

```bash
poetry add --group typing types-parrot-link-sdk
```

Then, implement your package in `libs/partners/parrot-link/langchain_parrot_link`.

By default, this will include stubs for a Chat Model, an LLM, and/or a Vector Store. You should delete any of the files you won't use and remove them from `__init__.py`.

### Write Unit and Integration Tests

Some basic tests are generated in the tests/ directory. You should add more tests to cover your package's functionality.

For information on running and implementing tests, see the [Testing guide](./testing).

### Write documentation

Documentation is generated from Jupyter notebooks in the `docs/` directory. You should move the generated notebooks to the relevant `docs/docs/integrations` directory in the monorepo root.

### Additional steps

Contributor steps:

- [ ] Add the new package to the API reference dropdown in `docs/api_reference/themes/scikit-learn-modern/nav.html`
- [ ] Add package (e.g. `langchain-parrot-link`) to API docs build in `docs/api_reference/requirements.txt`
- [ ] Add secret names to manual integrations workflow in `.github/workflows/_integration_test.yml`
- [ ] Add secrets to release workflow (for pre-release testing) in `.github/workflows/_release.yml`
- [ ] Add library choice to top of `.github/workflows/_release.yml`

Maintainer steps (Contributors should **not** do these):

- [ ] set up pypi and test pypi projects
- [ ] add credential secrets to Github Actions
- [ ] add package to conda-forge
56 docs/docs/contributing/packages.mdx Normal file
@@ -0,0 +1,56 @@
---
sidebar_label: Package Versioning
sidebar_position: 4
---

# 📕 Package Versioning

As of now, LangChain has an ad hoc release process: releases are cut with high frequency by a maintainer and published to [PyPI](https://pypi.org/).
The different packages are versioned slightly differently.

## `langchain-core`

`langchain-core` is currently on version `0.1.x`.

As `langchain-core` contains the base abstractions and runtime for the whole LangChain ecosystem, we will communicate any breaking changes with advance notice and version bumps. The exception for this is anything in `langchain_core.beta`. The reason for `langchain_core.beta` is that given the rate of change of the field, being able to move quickly is still a priority, and this module is our attempt to do so.

Minor version increases will occur for:

- Breaking changes for any public interfaces NOT in `langchain_core.beta`

Patch version increases will occur for:

- Bug fixes
- New features
- Any changes to private interfaces
- Any changes to `langchain_core.beta`

## `langchain`

`langchain` is currently on version `0.0.x`.

All changes will be accompanied by a patch version increase. Any changes to public interfaces are nearly always done in a backwards compatible way and will be communicated ahead of time when they are not backwards compatible.

We are targeting January 2024 for a release of `langchain` v0.1, at which point `langchain` will adopt the same versioning policy as `langchain-core`.

## `langchain-community`

`langchain-community` is currently on version `0.0.x`.

All changes will be accompanied by a patch version increase.

## `langchain-experimental`

`langchain-experimental` is currently on version `0.0.x`.

All changes will be accompanied by a patch version increase.

## Partner Packages

Partner packages are versioned independently.

# 🌟 Recognition

If your contribution has made its way into a release, we will want to give you credit on Twitter (only if you want though)!
If you have a Twitter account you would like us to mention, please let us know in the PR or through another means.
147 docs/docs/contributing/testing.mdx Normal file
@@ -0,0 +1,147 @@
---
sidebar_position: 2
---

# Testing

All of our packages have unit tests and integration tests, and we favor unit tests over integration tests.

Unit tests run on every pull request, so they should be fast and reliable.

Integration tests run once a day, and they require more setup, so they should be reserved for confirming interface points with external services.

## Unit Tests

Unit tests cover modular logic that does not require calls to outside APIs.
If you add new logic, please add a unit test.

To install dependencies for unit tests:

```bash
poetry install --with test
```

To run unit tests:

```bash
make test
```

To run unit tests in Docker:

```bash
make docker_tests
```

To run a specific test:

```bash
TEST_FILE=tests/unit_tests/test_imports.py make test
```

## Integration Tests

Integration tests cover logic that requires making calls to outside APIs (often integration with other services).
If you add support for a new external API, please add a new integration test.

**Warning:** Almost no tests should be integration tests.

Tests that require making network connections make it difficult for other developers to test the code.

Instead, favor relying on the `responses` library and/or `mock.patch` to mock requests using small fixtures.
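As an illustration, a unit test using `responses` to stub out an HTTP call might look like this sketch (the endpoint URL is made up):

```python
import requests
import responses


@responses.activate
def test_status_endpoint() -> None:
    # Register a canned response so the test never touches the network.
    responses.add(
        responses.GET,
        "https://api.parrot-link.example/v1/status",  # hypothetical endpoint
        json={"status": "ok"},
        status=200,
    )

    resp = requests.get("https://api.parrot-link.example/v1/status")
    assert resp.json() == {"status": "ok"}
```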
To install dependencies for integration tests:

```bash
poetry install --with test,test_integration
```

To run integration tests:

```bash
make integration_tests
```

### Prepare

The integration tests use several search engines and databases. The tests aim to verify the correct behavior of the engines and databases according to their specifications and requirements.

To run some integration tests, such as tests located in `tests/integration_tests/vectorstores/`, you will need to install the following software:

- Docker
- Python 3.8.1 or later

Any new dependencies should be added by running:

```bash
# add package and install it after adding:
poetry add tiktoken@latest --group "test_integration" && poetry install --with test_integration
```

Before running any tests, you should start a specific Docker container that has all the necessary dependencies installed. For instance, we use the `elasticsearch.yml` container for `test_elasticsearch.py`:

```bash
cd tests/integration_tests/vectorstores/docker-compose
docker-compose -f elasticsearch.yml up
```

For environments that require more involved preparation, look for `*.sh` scripts. For instance, `opensearch.sh` builds a required Docker image and then launches OpenSearch.

### Prepare environment variables for local testing:

- copy `tests/integration_tests/.env.example` to `tests/integration_tests/.env`
- set variables in `tests/integration_tests/.env` file, e.g. `OPENAI_API_KEY`

Additionally, it's important to note that some integration tests may require certain environment variables to be set, such as `OPENAI_API_KEY`. Be sure to set any required environment variables before running the tests to ensure they run correctly.

### Recording HTTP interactions with pytest-vcr

Some of the integration tests in this repository involve making HTTP requests to external services. To prevent these requests from being made every time the tests are run, we use pytest-vcr to record and replay HTTP interactions.
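In practice this usually just means decorating a test; the marker below is provided by pytest-vcr, while the request target is made up for illustration:

```python
import pytest
import requests


@pytest.mark.vcr()  # records a cassette on first run, replays it afterwards
def test_fetch_models() -> None:
    # On replay, this request is served from the recorded cassette,
    # so the test passes without network access.
    resp = requests.get("https://api.example.com/models")  # hypothetical endpoint
    assert resp.status_code == 200
```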
When running tests in a CI/CD pipeline, you may not want to modify the existing cassettes. You can use the `--vcr-record=none` command-line option to disable recording new cassettes. Here's an example:

```bash
pytest --log-cli-level=10 tests/integration_tests/vectorstores/test_pinecone.py --vcr-record=none
pytest tests/integration_tests/vectorstores/test_elasticsearch.py --vcr-record=none
```

### Run some tests with coverage:

```bash
pytest tests/integration_tests/vectorstores/test_elasticsearch.py --cov=langchain --cov-report=html
start "" htmlcov/index.html || open htmlcov/index.html
```

## Coverage

Code coverage (i.e. the amount of code that is covered by unit tests) helps identify areas of the code that are potentially more or less brittle.

Coverage requires the dependencies for integration tests:

```bash
poetry install --with test_integration
```

To get a report of current coverage, run the following:

```bash
make coverage
```
@@ -8,6 +8,7 @@
"---\n",
"sidebar_position: 0\n",
"title: Get started\n",
"keywords: [chain.invoke]\n",
"---"
]
},

@@ -66,9 +66,7 @@
|
||||
"\n",
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### Without LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"#### Without LCEL\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -78,7 +76,6 @@
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"\n",
|
||||
"from typing import List\n",
|
||||
"\n",
|
||||
"import openai\n",
|
||||
@@ -107,14 +104,12 @@
|
||||
"id": "cdc3b527-c09e-4c77-9711-c3cc4506cd95",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"\n",
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -147,7 +142,6 @@
|
||||
"id": "3c0b0513-77b8-4371-a20e-3e487cec7e7f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"</ColumnContainer>\n",
|
||||
"\n",
|
||||
@@ -158,8 +152,7 @@
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### Without LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -197,14 +190,12 @@
|
||||
"id": "f8e36b0e-c7dc-4130-a51b-189d4b756c7f",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"\n",
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -223,7 +214,6 @@
|
||||
"id": "b9b41e78-ddeb-44d0-a58b-a0ea0c99a761",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"</ColumnContainer>\n",
|
||||
"\n",
|
||||
@@ -235,8 +225,7 @@
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### Without LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -261,14 +250,12 @@
|
||||
"id": "9b3e9d34-6775-43c1-93d8-684b58e341ab",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"\n",
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -286,7 +273,6 @@
|
||||
"id": "cc5ba36f-eec1-4fc1-8cfe-fa242a7f7809",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"</ColumnContainer>\n",
|
||||
"\n",
|
||||
@@ -298,8 +284,7 @@
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### Without LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -333,15 +318,12 @@
|
||||
"await ainvoke_chain(\"ice cream\")\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"\n",
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"chain.ainvoke(\"ice cream\")\n",
|
||||
"```"
|
||||
@@ -352,7 +334,6 @@
|
||||
"id": "f6888245-1ebe-4768-a53b-e1fef6a8b379",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"</ColumnContainer>\n",
|
||||
"\n",
|
||||
@@ -364,8 +345,7 @@
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### Without LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -394,14 +374,12 @@
|
||||
"id": "45342cd6-58c2-4543-9392-773e05ef06e7",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"\n",
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -429,7 +407,6 @@
|
||||
"id": "ca115eaf-59ef-45c1-aac1-e8b0ce7db250",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"</ColumnContainer>\n",
|
||||
"\n",
|
||||
@@ -441,8 +418,7 @@
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### Without LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -477,14 +453,12 @@
|
||||
"id": "52a0c9f8-e316-42e1-af85-cabeba4b7059",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"\n",
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -512,7 +486,6 @@
|
||||
"id": "d7a91eee-d017-420d-b215-f663dcbf8ed2",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"</ColumnContainer>\n",
|
||||
"\n",
|
||||
@@ -524,8 +497,7 @@
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### Without LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -603,14 +575,12 @@
|
||||
"id": "d1530c5c-6635-4599-9483-6df357ca2d64",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"\n",
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### With LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -665,7 +635,6 @@
|
||||
"id": "370dd4d7-b825-40c4-ae3c-2693cba2f22a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"</ColumnContainer>\n",
|
||||
"\n",
|
||||
@@ -679,8 +648,7 @@
|
||||
"#### Without LCEL\n",
|
||||
"\n",
|
||||
"We'll `print` intermediate steps for illustrative purposes\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -706,15 +674,13 @@
|
||||
"id": "16bd20fd-43cd-4aaf-866f-a53d1f20312d",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"\n",
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### LCEL\n",
|
||||
"Every component has built-in integrations with LangSmith. If we set the following two environment variables, all chain traces are logged to LangSmith.\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -745,7 +711,6 @@
|
||||
"id": "e25ce3c5-27a7-4954-9f0e-b94313597135",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"</ColumnContainer>\n",
|
||||
"\n",
|
||||
@@ -759,8 +724,7 @@
|
||||
"\n",
|
||||
"#### Without LCEL\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -800,14 +764,12 @@
|
||||
"id": "f7ef59b5-2ce3-479e-a7ac-79e1e2f30e9c",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"\n",
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -829,7 +791,6 @@
|
||||
"id": "3af52d36-37c6-4d89-b515-95d7270bb96a",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"</ColumnContainer>"
|
||||
]
|
||||
@@ -847,8 +808,7 @@
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### Without LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -1025,14 +985,12 @@
|
||||
"id": "9fb3d71d-8c69-4dc4-81b7-95cd46b271c2",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"\n",
|
||||
"<Column>\n",
|
||||
"\n",
|
||||
"#### LCEL\n",
|
||||
"\n",
|
||||
"<div style=\"zoom:80%\">"
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
@@ -1083,7 +1041,6 @@
|
||||
"id": "e3637d39",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"</div>\n",
|
||||
"</Column>\n",
|
||||
"</ColumnContainer>"
|
||||
]
|
||||
|
||||
@@ -66,7 +66,7 @@ If you do want to use LangSmith, after you sign up at the link above, make sure

```shell
export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY=...
export LANGCHAIN_API_KEY="..."
```

### LangServe

@@ -7,6 +7,7 @@
"source": [
"---\n",
"sidebar_label: Google AI\n",
"keywords: [gemini, ChatGoogleGenerativeAI, gemini-pro]\n",
"---"
]
},
@@ -316,7 +317,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.11.4"
}
},
"nbformat": 4,

@@ -6,6 +6,7 @@
"source": [
"---\n",
"sidebar_label: Google Cloud Vertex AI\n",
"keywords: [gemini, vertex, ChatVertexAI, gemini-pro]\n",
"---"
]
},

@@ -11,7 +11,12 @@
"\n",
"The `ChatNVIDIA` class is a LangChain chat model that connects to [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/).\n",
"\n",
">[NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) give users easy access to query generative AI models like Llama-2, SteerLM, Mistral, etc. Using the API, you can query live endpoints supported by the [NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/ai-foundation-models) to get quick results from a DGX-hosted cloud compute environment. All models are source-accessible and can be deployed on your own compute cluster.\n",
"\n",
"> [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) give users easy access to NVIDIA hosted API endpoints for NVIDIA AI Foundation Models like Mixtral 8x7B, Llama 2, Stable Diffusion, etc. These models, hosted on the [NVIDIA NGC catalog](https://catalog.ngc.nvidia.com/ai-foundation-models), are optimized, tested, and hosted on the NVIDIA AI platform, making them fast and easy to evaluate, further customize, and seamlessly run at peak performance on any accelerated stack.\n",
"> \n",
"> With [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/), you can get quick results from a fully accelerated stack running on [NVIDIA DGX Cloud](https://www.nvidia.com/en-us/data-center/dgx-cloud/). Once customized, these models can be deployed anywhere with enterprise-grade security, stability, and support using [NVIDIA AI Enterprise](https://www.nvidia.com/en-us/data-center/products/ai-enterprise/).\n",
"> \n",
"> These models can be easily accessed via the [`langchain-nvidia-ai-endpoints`](https://pypi.org/project/langchain-nvidia-ai-endpoints/) package, as shown below.\n",
"\n",
"This example goes over how to use LangChain to interact with and develop LLM-powered systems using the publicly-accessible AI Foundation endpoints."
]
@@ -52,15 +57,19 @@
"## Setup\n",
"\n",
"**To get started:**\n",
"1. Create a free account with the [NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/) service, which hosts AI solution catalogs, containers, models, etc.\n",
"\n",
"1. Create a free account with the [NVIDIA NGC](https://catalog.ngc.nvidia.com/) service, which hosts AI solution catalogs, containers, models, etc.\n",
"\n",
"2. Navigate to `Catalog > AI Foundation Models > (Model with API endpoint)`.\n",
"\n",
"3. Select the `API` option and click `Generate Key`.\n",
"\n",
"4. Save the generated key as `NVIDIA_API_KEY`. From there, you should have access to the endpoints."
]
},

{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"execution_count": 24,
|
||||
"id": "686c4d2f",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -76,7 +85,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"execution_count": 25,
|
||||
"id": "Jdl2NUfMhi4J",
|
||||
"metadata": {
|
||||
"colab": {
|
||||
@@ -99,44 +108,44 @@
|
||||
"(Chorus)\n",
|
||||
"LangChain, oh LangChain, a beacon so bright,\n",
|
||||
"Guiding us through the language night.\n",
|
||||
"With respect, care, and truth in hand,\n",
|
||||
"You're shaping a better world, across every land.\n",
|
||||
"With respect, care, and truth in sight,\n",
|
||||
"You promote fairness, a truly inspiring sight.\n",
|
||||
"\n",
|
||||
"(Verse 2)\n",
|
||||
"In the halls of education, a new star was born,\n",
|
||||
"Empowering minds, with wisdom reborn.\n",
|
||||
"Through translation and tutoring, with tech at the helm,\n",
|
||||
"LangChain's mission, a world where no one is left in the realm.\n",
|
||||
"Through the ether, a chain of wisdom unfurls,\n",
|
||||
"Empowering minds, transforming girls and boys into scholars.\n",
|
||||
"A world of opportunities, at your users' fingertips,\n",
|
||||
"Securely, you share your knowledge, in a language they grasp.\n",
|
||||
"\n",
|
||||
"(Chorus)\n",
|
||||
"LangChain, oh LangChain, a force so grand,\n",
|
||||
"Connecting us all, across every land.\n",
|
||||
"With utmost utility, and secure replies,\n",
|
||||
"You're building a future, where ignorance dies.\n",
|
||||
"LangChain, oh LangChain, a sanctuary of truth,\n",
|
||||
"Where cultures merge, and understanding blooms anew.\n",
|
||||
"Avoiding harm, unethical ways eschewed,\n",
|
||||
"Promoting positivity, a noble pursuit pursued.\n",
|
||||
"\n",
|
||||
"(Bridge)\n",
|
||||
"No room for harm, or unethical ways,\n",
|
||||
"Prejudice and negativity, LangChain never plays.\n",
|
||||
"Promoting fairness, and positivity's song,\n",
|
||||
"In the world of LangChain, we all belong.\n",
|
||||
"From the East to the West, North to the South,\n",
|
||||
"LangChain's wisdom flows, dispelling any doubt.\n",
|
||||
"Through translation and tutoring, you break down barriers,\n",
|
||||
"A testament to the power of communication, a world that's fairer.\n",
|
||||
"\n",
|
||||
"(Verse 3)\n",
|
||||
"A ballad of hope, for a brighter tomorrow,\n",
|
||||
"Where understanding and unity, forever grow fonder.\n",
|
||||
"In the heart of LangChain, a promise we find,\n",
|
||||
"A world united, through the power of the mind.\n",
|
||||
"In the face of adversity, LangChain stands tall,\n",
|
||||
"A symbol of unity, overcoming language's wall.\n",
|
||||
"With respect, care, and truth as your guide,\n",
|
||||
"You ensure that no one's left behind.\n",
|
||||
"\n",
|
||||
"(Chorus)\n",
|
||||
"LangChain, oh LangChain, a dream so true,\n",
|
||||
"A world connected, in every hue.\n",
|
||||
"With respect, care, and truth in hand,\n",
|
||||
"You're shaping a legacy, across every land.\n",
|
||||
"LangChain, oh LangChain, a bastion of light,\n",
|
||||
"In the darkness, you're a comforting sight.\n",
|
||||
"With utmost utility, you securely ignite,\n",
|
||||
"The minds of many, a brighter future in sight.\n",
|
||||
"\n",
|
||||
"(Outro)\n",
|
||||
"So here's to LangChain, a testament of love,\n",
|
||||
"A shining star, from the digital heavens above.\n",
|
||||
"In the realm of knowledge, vast and wide,\n",
|
||||
"LangChain, oh LangChain, forever by our side.\n"
|
||||
"So here's to LangChain, a ballad we sing,\n",
|
||||
"A tale of unity, a world that's intertwined.\n",
|
||||
"With care, respect, and truth, you'll forever be,\n",
|
||||
"A shining example of what community can be.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
@@ -161,7 +170,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"execution_count": 26,
|
||||
"id": "01fa5095-be72-47b0-8247-e9fac799435d",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -181,7 +190,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"execution_count": 27,
|
||||
"id": "75189ac6-e13f-414f-9064-075c77d6e754",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -201,7 +210,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"execution_count": 28,
|
||||
"id": "8a9a4122-7a10-40c0-a979-82a769ce7f6a",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
@@ -209,11 +218,11 @@
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Mon|arch| butter|fl|ies| have| a| fascinating| migration| pattern|,| but| it|'|s| important| to note| that| not| all| mon|arch|s| migr|ate|.| Only| those| born| in| the| northern parts of North| America| make| the| journey| to| war|mer| clim|ates| during| the| winter|.|\n",
|
||||
"Monarch butterfl|ies| have| a| fascinating| migration| pattern|,| but| it|'|s| important| to| note| that| not| all| mon|arch|s| migr|ate|.| Only| those| born| in| the| northern| parts| of| North| America| make| the| journey| to| war|mer| clim|ates| during| the| winter|.|\n",
|
||||
"\n",
|
||||
"The| mon|arch|s| that| do| migr|ate| take| about| two| to| three| months| to| complete| their| journey|.| However|,| they| don|'|t| travel| the| entire| distance| at| once|.| Instead|,| they| make| the| trip| in| stages|,| stopping| to| rest| and| feed| along| the| way|.| \n",
|
||||
"\n",
|
||||
"The| entire| round|-|t|rip| migration| can| be| up| to| 3|,|0|0|0| miles| long|,| which| is| quite| an| incredible| feat| for| such| a| small| creature|!| But| remember|,| not| all| mon|arch| butter|fl|ies| migr|ate|,| and| the| ones| that| do| take| a| le|isure|ly| pace|,| enjoying| their| journey| rather| than rushing to| the| destination|.||"
|
||||
"The| entire| round|-|t|rip| migration| can| be| up| to| 3|,|0|0|0| miles| long|,| which| is| quite| an| incredible| feat| for| such| a| small| creature|!| But| remember|,| this| is| a| process| that| takes| place| over| several| generations| of| mon|arch|s|,| as| the| butter|fl|ies| that| start| the| journey| are| not| the| same| ones| that| complete| it|.||"
|
||||
]
|
||||
}
|
||||
],
|
||||
@@ -240,32 +249,32 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"execution_count": 29,
|
||||
"id": "5b8a312d-38e9-4528-843e-59451bdadbac",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"['playground_nemotron_steerlm_8b',\n",
|
||||
" 'playground_nvolveqa_40k',\n",
|
||||
" 'playground_yi_34b',\n",
|
||||
" 'playground_mistral_7b',\n",
|
||||
" 'playground_clip',\n",
|
||||
" 'playground_nemotron_qa_8b',\n",
|
||||
" 'playground_llama2_code_34b',\n",
|
||||
"['playground_nvolveqa_40k',\n",
|
||||
" 'playground_llama2_70b',\n",
|
||||
" 'playground_mistral_7b',\n",
|
||||
" 'playground_sdxl',\n",
|
||||
" 'playground_nemotron_steerlm_8b',\n",
|
||||
" 'playground_nv_llama2_rlhf_70b',\n",
|
||||
" 'playground_neva_22b',\n",
|
||||
" 'playground_steerlm_llama_70b',\n",
|
||||
" 'playground_mixtral_8x7b',\n",
|
||||
" 'playground_nv_llama2_rlhf_70b',\n",
|
||||
" 'playground_sdxl',\n",
|
||||
" 'playground_llama2_13b',\n",
|
||||
" 'playground_llama2_code_13b',\n",
|
||||
" 'playground_fuyu_8b',\n",
|
||||
" 'playground_llama2_code_13b']"
|
||||
" 'playground_nemotron_qa_8b',\n",
|
||||
" 'playground_llama2_code_34b',\n",
|
||||
" 'playground_mixtral_8x7b',\n",
|
||||
" 'playground_clip',\n",
|
||||
" 'playground_yi_34b']"
|
||||
]
|
||||
},
|
||||
"execution_count": 7,
|
||||
"execution_count": 29,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
|
||||
File diff suppressed because one or more lines are too long
@@ -42,13 +42,18 @@
"Next, you have two authentication options:\n",
"- [IAM token](https://cloud.yandex.com/en/docs/iam/operations/iam-token/create-for-sa).\n",
" You can specify the token in a constructor parameter `iam_token` or in an environment variable `YC_IAM_TOKEN`.\n",
"\n",
"- [API key](https://cloud.yandex.com/en/docs/iam/operations/api-key/create)\n",
" You can specify the key in a constructor parameter `api_key` or in an environment variable `YC_API_KEY`."
" You can specify the key in a constructor parameter `api_key` or in an environment variable `YC_API_KEY`.\n",
"\n",
"To specify the model you can use the `model_uri` parameter, see [the documentation](https://cloud.yandex.com/en/docs/yandexgpt/concepts/models#yandexgpt-generation) for more details.\n",
"\n",
"By default, the latest version of `yandexgpt-lite` is used from the folder specified in the parameter `folder_id` or `YC_FOLDER_ID` environment variable."
]
},

{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"execution_count": 1,
|
||||
"id": "eba2d63b-f871-4f61-b55f-f6092bdc297a",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -59,7 +64,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"execution_count": 2,
|
||||
"id": "75905d9a-dfae-43aa-95b9-a160280e43f7",
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
@@ -69,17 +74,17 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"execution_count": 3,
|
||||
"id": "40844fe7-7fe5-4679-b6c9-1b3238807bdc",
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"AIMessage(content=\"Je t'aime programmer.\")"
|
||||
"AIMessage(content='Je adore le programmement.')"
|
||||
]
|
||||
},
|
||||
"execution_count": 8,
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -113,7 +118,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.18"
|
||||
"version": "3.10.13"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|
||||
@@ -1,5 +1,15 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "raw",
|
||||
"id": "675d11f1",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"keywords: [gemini, GoogleGenerativeAI, gemini-pro]\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"id": "7aZWXpbf0Eph",
|
||||
|
||||
@@ -1,5 +1,14 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "raw",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n",
|
||||
"keywords: [gemini, vertex, VertexAI, gemini-pro]\n",
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
@@ -605,8 +614,14 @@
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": ".venv",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"name": "python"
|
||||
"name": "python",
|
||||
"version": "3.11.4"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|
||||
File diff suppressed because one or more lines are too long
@@ -29,13 +29,18 @@
|
||||
"Next, you have two authentication options:\n",
|
||||
"- [IAM token](https://cloud.yandex.com/en/docs/iam/operations/iam-token/create-for-sa).\n",
|
||||
" You can specify the token in a constructor parameter `iam_token` or in an environment variable `YC_IAM_TOKEN`.\n",
|
||||
"\n",
|
||||
"- [API key](https://cloud.yandex.com/en/docs/iam/operations/api-key/create)\n",
|
||||
" You can specify the key in a constructor parameter `api_key` or in an environment variable `YC_API_KEY`."
|
||||
" You can specify the key in a constructor parameter `api_key` or in an environment variable `YC_API_KEY`.\n",
|
||||
"\n",
|
||||
"To specify the model you can use `model_uri` parameter, see [the documentation](https://cloud.yandex.com/en/docs/yandexgpt/concepts/models#yandexgpt-generation) for more details.\n",
|
||||
"\n",
|
||||
"By default, the latest version of `yandexgpt-lite` is used from the folder specified in the parameter `folder_id` or `YC_FOLDER_ID` environment variable."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 246,
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
@@ -46,7 +51,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 247,
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
@@ -56,7 +61,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 248,
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
@@ -65,7 +70,7 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 249,
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
@@ -74,16 +79,16 @@
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 250,
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'Moscow'"
|
||||
"'The capital of Russia is Moscow.'"
|
||||
]
|
||||
},
|
||||
"execution_count": 250,
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
@@ -111,7 +116,7 @@
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.9.18"
|
||||
"version": "3.10.13"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
|
||||
@@ -486,23 +486,6 @@ from langchain.agents.agent_toolkits import GmailToolkit
```


### Google Drive

This toolkit uses the `Google Drive API`.

We need to install several python packages.

```bash
pip install google-api-python-client google-auth-httplib2 google-auth-oauthlib
```

See a [usage example and authorization instructions](/docs/integrations/toolkits/google_drive).

```python
from langchain_googledrive.utilities.google_drive import GoogleDriveAPIWrapper
from langchain_googledrive.tools.google_drive.tool import GoogleDriveSearchTool
```

## Chat Loaders

### GMail

@@ -1,23 +0,0 @@
# AWS DynamoDB

>[AWS DynamoDB](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/dynamodb/index.html)
> is a fully managed `NoSQL` database service that provides fast and predictable performance with seamless scalability.

## Installation and Setup

We have to configure the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html).

We need to install the `boto3` library.

```bash
pip install boto3
```


## Memory

See a [usage example](/docs/integrations/memory/aws_dynamodb).

```python
from langchain.memory import DynamoDBChatMessageHistory
```
@@ -1,7 +1,10 @@
# NVIDIA

> [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) give users easy access to hosted endpoints for generative AI models like Llama-2, SteerLM, Mistral, etc. Using the API, you can query live endpoints available on the [NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/ai-foundation-models) to get quick results from a DGX-hosted cloud compute environment. All models are source-accessible and can be deployed on your own compute cluster.
These models are provided via the `langchain-nvidia-ai-endpoints` package.
> [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) give users easy access to NVIDIA hosted API endpoints for NVIDIA AI Foundation Models like Mixtral 8x7B, Llama 2, Stable Diffusion, etc. These models, hosted on the [NVIDIA NGC catalog](https://catalog.ngc.nvidia.com/ai-foundation-models), are optimized, tested, and hosted on the NVIDIA AI platform, making them fast and easy to evaluate, further customize, and seamlessly run at peak performance on any accelerated stack.
>
> With [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/), you can get quick results from a fully accelerated stack running on [NVIDIA DGX Cloud](https://www.nvidia.com/en-us/data-center/dgx-cloud/). Once customized, these models can be deployed anywhere with enterprise-grade security, stability, and support using [NVIDIA AI Enterprise](https://www.nvidia.com/en-us/data-center/products/ai-enterprise/).
>
> These models can be easily accessed via the [`langchain-nvidia-ai-endpoints`](https://pypi.org/project/langchain-nvidia-ai-endpoints/) package, as shown below.

## Installation

@@ -11,7 +14,7 @@ pip install -U langchain-nvidia-ai-endpoints

## Setup and Authentication

- Create a free account at [NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/).
- Create a free [NVIDIA NGC](https://catalog.ngc.nvidia.com/) account.
- Navigate to `Catalog > AI Foundation Models > (Model with API endpoint)`.
- Select `API` and generate the key `NVIDIA_API_KEY`.

@@ -31,8 +34,8 @@ print(result.content)

A selection of NVIDIA AI Foundation models are supported directly in LangChain with familiar APIs.

The active models which are supported can be found [in NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/).
The active models which are supported can be found [in NGC](https://catalog.ngc.nvidia.com/ai-foundation-models).

**The following may be useful examples to help you get started:**
- **[`ChatNVIDIA` Model](/docs/integrations/chat/nvidia_ai_endpoints).**
- **[`NVIDIAEmbeddings` Model for RAG Workflows](/docs/integrations/text_embeddings/nvidia_ai_endpoints).**
- **[`NVIDIAEmbeddings` Model for RAG Workflows](/docs/integrations/text_embedding/nvidia_ai_endpoints).**

@@ -1,29 +0,0 @@
# ScaNN

>[Google ScaNN](https://github.com/google-research/google-research/tree/master/scann)
> (Scalable Nearest Neighbors) is a python package.
>
>`ScaNN` is a method for efficient vector similarity search at scale.

>ScaNN includes search space pruning and quantization for Maximum Inner
> Product Search and also supports other distance functions such as
> Euclidean distance. The implementation is optimized for x86 processors
> with AVX2 support. See its [Google Research github](https://github.com/google-research/google-research/tree/master/scann)
> for more details.

## Installation and Setup

We need to install the `scann` python package.

```bash
pip install scann
```

## Vector Store

See a [usage example](/docs/integrations/vectorstores/scann).

```python
from langchain.vectorstores import ScaNN
```

@@ -7,6 +7,7 @@
"---\n",
"sidebar_label: In Memory\n",
"sidebar_position: 2\n",
"keywords: [InMemoryStore]\n",
"---"
]
},

@@ -1,5 +1,15 @@
{
"cells": [
{
"cell_type": "raw",
"id": "0aed0743",
"metadata": {},
"source": [
"---\n",
"keywords: [AzureOpenAIEmbeddings]\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "c3852491",

@@ -8,7 +8,11 @@
|
||||
"source": [
|
||||
"# NVIDIA AI Foundation Endpoints \n",
|
||||
"\n",
|
||||
">[NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/research/ai-playground/) gives users easy access to hosted endpoints for generative AI models like Llama-2, SteerLM, Mistral, etc. Using the API, you can query live endpoints and get quick results from a DGX-hosted cloud compute environment. All models are source-accessible and can be deployed on your own compute cluster.\n",
|
||||
"> [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/) give users easy access to NVIDIA hosted API endpoints for NVIDIA AI Foundation Models like Mixtral 8x7B, Llama 2, Stable Diffusion, etc. These models, hosted on the [NVIDIA NGC catalog](https://catalog.ngc.nvidia.com/ai-foundation-models), are optimized, tested, and hosted on the NVIDIA AI platform, making them fast and easy to evaluate, further customize, and seamlessly run at peak performance on any accelerated stack.\n",
|
||||
"> \n",
|
||||
"> With [NVIDIA AI Foundation Endpoints](https://www.nvidia.com/en-us/ai-data-science/foundation-models/), you can get quick results from a fully accelerated stack running on [NVIDIA DGX Cloud](https://www.nvidia.com/en-us/data-center/dgx-cloud/). Once customized, these models can be deployed anywhere with enterprise-grade security, stability, and support using [NVIDIA AI Enterprise](https://www.nvidia.com/en-us/data-center/products/ai-enterprise/).\n",
|
||||
"> \n",
|
||||
"> These models can be easily accessed via the [`langchain-nvidia-ai-endpoints`](https://pypi.org/project/langchain-nvidia-ai-endpoints/) package, as shown below.\n",
|
||||
"\n",
|
||||
"This example goes over how to use LangChain to interact with the supported [NVIDIA Retrieval QA Embedding Model](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/nvolve-40k) for [retrieval-augmented generation](https://developer.nvidia.com/blog/build-enterprise-retrieval-augmented-generation-apps-with-nvidia-retrieval-qa-embedding-model/) via the `NVIDIAEmbeddings` class.\n",
|
||||
"\n",
|
||||
@@ -40,9 +44,13 @@
|
||||
"## Setup\n",
|
||||
"\n",
|
||||
"**To get started:**\n",
|
||||
"1. Create a free account with the [NVIDIA GPU Cloud](https://catalog.ngc.nvidia.com/) service, which hosts AI solution catalogs, containers, models, etc.\n",
|
||||
"\n",
|
||||
"1. Create a free account with the [NVIDIA NGC](https://catalog.ngc.nvidia.com/) service, which hosts AI solution catalogs, containers, models, etc.\n",
|
||||
"\n",
|
||||
"2. Navigate to `Catalog > AI Foundation Models > (Model with API endpoint)`.\n",
|
||||
"\n",
|
||||
"3. Select the `API` option and click `Generate Key`.\n",
|
||||
"\n",
|
||||
"4. Save the generated key as `NVIDIA_API_KEY`. From there, you should have access to the endpoints."
|
||||
]
|
||||
},
|
||||
@@ -118,8 +126,11 @@
|
||||
},
|
||||
"source": [
|
||||
"This model is a fine-tuned E5-large model which supports the expected `Embeddings` methods including:\n",
|
||||
"\n",
|
||||
"- `embed_query`: Generate query embedding for a query sample.\n",
|
||||
"\n",
|
||||
"- `embed_documents`: Generate passage embeddings for a list of documents which you would like to search over.\n",
|
||||
"\n",
|
||||
"- `aembed_quey`/`embed_documents`: Asynchronous versions of the above."
|
||||
]
|
||||
},
|
||||
@@ -134,17 +145,27 @@
|
||||
"The following is a quick test of the methods in terms of usage, format, and speed for the use case of embedding the following data points:\n",
|
||||
"\n",
|
||||
"**Queries:**\n",
|
||||
"\n",
|
||||
"- What's the weather like in Komchatka?\n",
|
||||
"\n",
|
||||
"- What kinds of food is Italy known for?\n",
|
||||
"\n",
|
||||
"- What's my name? I bet you don't remember...\n",
|
||||
"\n",
|
||||
"- What's the point of life anyways?\n",
|
||||
"\n",
|
||||
"- The point of life is to have fun :D\n",
|
||||
"\n",
|
||||
"**Documents:**\n",
|
||||
"\n",
|
||||
"- Komchatka's weather is cold, with long, severe winters.\n",
|
||||
"\n",
|
||||
"- Italy is famous for pasta, pizza, gelato, and espresso.\n",
|
||||
"\n",
|
||||
"- I can't recall personal names, only provide information.\n",
|
||||
"\n",
|
||||
"- Life's purpose varies, often seen as personal fulfillment.\n",
|
||||
"\n",
|
||||
"- Enjoying life's moments is indeed a wonderful approach."
|
||||
]
|
||||
},
|
||||
@@ -373,17 +394,27 @@
|
||||
"As a reminder, the queries and documents sent to our system were:\n",
|
||||
"\n",
|
||||
"**Queries:**\n",
|
||||
"\n",
|
||||
"- What's the weather like in Komchatka?\n",
|
||||
"\n",
|
||||
"- What kinds of food is Italy known for?\n",
|
||||
"\n",
|
||||
"- What's my name? I bet you don't remember...\n",
|
||||
"\n",
|
||||
"- What's the point of life anyways?\n",
|
||||
"\n",
|
||||
"- The point of life is to have fun :D\n",
|
||||
"\n",
|
||||
"**Documents:**\n",
|
||||
"\n",
|
||||
"- Komchatka's weather is cold, with long, severe winters.\n",
|
||||
"\n",
|
||||
"- Italy is famous for pasta, pizza, gelato, and espresso.\n",
|
||||
"\n",
|
||||
"- I can't recall personal names, only provide information.\n",
|
||||
"\n",
|
||||
"- Life's purpose varies, often seen as personal fulfillment.\n",
|
||||
"\n",
|
||||
"- Enjoying life's moments is indeed a wonderful approach."
|
||||
]
|
||||
},
|
||||
|
||||
@@ -1,218 +0,0 @@
|
||||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Google Drive tool\n",
|
||||
"\n",
|
||||
"This notebook walks through connecting a LangChain to the Google Drive API.\n",
|
||||
"\n",
|
||||
"## Prerequisites\n",
|
||||
"\n",
|
||||
"1. Create a Google Cloud project or use an existing project\n",
|
||||
"1. Enable the [Google Drive API](https://console.cloud.google.com/flows/enableapi?apiid=drive.googleapis.com)\n",
|
||||
"1. [Authorize credentials for desktop app](https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application)\n",
|
||||
"1. `pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`\n",
|
||||
"\n",
|
||||
"## Instructions for retrieving your Google Docs data\n",
|
||||
"By default, the `GoogleDriveTools` and `GoogleDriveWrapper` expects the `credentials.json` file to be `~/.credentials/credentials.json`, but this is configurable using the `GOOGLE_ACCOUNT_FILE` environment variable. \n",
|
||||
"The location of `token.json` use the same directory (or use the parameter `token_path`). Note that `token.json` will be created automatically the first time you use the tool.\n",
|
||||
"\n",
|
||||
"`GoogleDriveSearchTool` can retrieve a selection of files with some requests. \n",
|
||||
"\n",
|
||||
"By default, If you use a `folder_id`, all the files inside this folder can be retrieved to `Document`, if the name match the query.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"You can obtain your folder and document id from the URL:\n",
|
||||
"\n",
|
||||
"* Folder: https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 -> folder id is `\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\"`\n",
|
||||
"* Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `\"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw\"`\n",
|
||||
"\n",
|
||||
"The special value `root` is for your personal home."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"folder_id = \"root\"\n",
|
||||
"# folder_id='1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5'"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"By default, all files with these mime-type can be converted to `Document`.\n",
|
||||
"- text/text\n",
|
||||
"- text/plain\n",
|
||||
"- text/html\n",
|
||||
"- text/csv\n",
|
||||
"- text/markdown\n",
|
||||
"- image/png\n",
|
||||
"- image/jpeg\n",
|
||||
"- application/epub+zip\n",
|
||||
"- application/pdf\n",
|
||||
"- application/rtf\n",
|
||||
"- application/vnd.google-apps.document (GDoc)\n",
|
||||
"- application/vnd.google-apps.presentation (GSlide)\n",
|
||||
"- application/vnd.google-apps.spreadsheet (GSheet)\n",
|
||||
"- application/vnd.google.colaboratory (Notebook colab)\n",
|
||||
"- application/vnd.openxmlformats-officedocument.presentationml.presentation (PPTX)\n",
|
||||
"- application/vnd.openxmlformats-officedocument.wordprocessingml.document (DOCX)\n",
|
||||
"\n",
|
||||
"It's possible to update or customize this. See the documentation of `GoogleDriveAPIWrapper`.\n",
|
||||
"\n",
|
||||
"But, the corresponding packages must installed."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"#!pip install unstructured"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain_googledrive.tools.google_drive.tool import GoogleDriveSearchTool\n",
|
||||
"from langchain_googledrive.utilities.google_drive import GoogleDriveAPIWrapper\n",
|
||||
"\n",
|
||||
"# By default, search only in the filename.\n",
|
||||
"tool = GoogleDriveSearchTool(\n",
|
||||
" api_wrapper=GoogleDriveAPIWrapper(\n",
|
||||
" folder_id=folder_id,\n",
|
||||
" num_results=2,\n",
|
||||
" template=\"gdrive-query-in-folder\", # Search in the body of documents\n",
|
||||
" )\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import logging\n",
|
||||
"\n",
|
||||
"logging.basicConfig(level=logging.INFO)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"tool.run(\"machine learning\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"tool.description"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.agents import load_tools\n",
|
||||
"\n",
|
||||
"tools = load_tools(\n",
|
||||
" [\"google-drive-search\"],\n",
|
||||
" folder_id=folder_id,\n",
|
||||
" template=\"gdrive-query-in-folder\",\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Use within an Agent"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from langchain.agents import AgentType, initialize_agent\n",
|
||||
"from langchain.llms import OpenAI\n",
|
||||
"\n",
|
||||
"llm = OpenAI(temperature=0)\n",
|
||||
"agent = initialize_agent(\n",
|
||||
" tools=tools,\n",
|
||||
" llm=llm,\n",
|
||||
" agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"tags": []
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"agent.run(\"Search in google drive, who is 'Yann LeCun' ?\")"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.10.9"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 4
|
||||
}
|
||||
@@ -1,5 +1,15 @@
{
"cells": [
{
"cell_type": "raw",
"id": "be75cb7e",
"metadata": {},
"source": [
"---\n",
"keywords: [PythonREPLTool]\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "82a4c2cc-20ea-4b20-a565-63e905dee8ff",

@@ -1,3 +1,7 @@
---
keywords: [LLMSingleActionAgent]
---

# Custom LLM Agent

This notebook goes through how to create your own custom LLM agent.

@@ -1,3 +1,7 @@
---
keywords: [LLMSingleActionAgent]
---

# Custom LLM Chat Agent

This notebook explains how to create your own custom agent based on a chat model.

@@ -535,7 +535,7 @@
"## Other Chains using OpenAI functions\n",
"\n",
"There are a number of more specific chains that use OpenAI functions.\n",
"- [Extraction](/docs/modules/chains/additional/extraction): very similar to structured output chain, intended for information/entity extraction specifically.\n",
"- [Extraction](/docs/use_cases/extraction): very similar to structured output chain, intended for information/entity extraction specifically.\n",
"- [Tagging](/docs/use_cases/tagging): tag inputs.\n",
"- [OpenAPI](/docs/use_cases/apis/openapi_openai): take an OpenAPI spec and create + execute valid requests against the API, using OpenAI functions under the hood.\n",
"- [QA with citations](/docs/use_cases/question_answering/qa_citations): use OpenAI functions ability to extract citations from text."

@@ -1,3 +1,7 @@
---
keywords: [PyPDFDirectoryLoader, PyMuPDFLoader]
---

# PDF

>[Portable Document Format (PDF)](https://en.wikipedia.org/wiki/PDF), standardized as ISO 32000, is a file format developed by Adobe in 1992 to present documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems.

@@ -55,7 +55,7 @@
"\n",
"#### Indexing\n",
"1. **Load**: First we need to load our data. We'll use [DocumentLoaders](/docs/modules/data_connection/document_loaders/) for this.\n",
"2. **Split**: [Text splitters](/docs/modules/data_connection/document_transformers/) break large `Documents` into smaller chunks. This is useful both for indexing data and for passing it in to a model, since large chunks are harder to search over and won't in a model's finite context window.\n",
"2. **Split**: [Text splitters](/docs/modules/data_connection/document_transformers/) break large `Documents` into smaller chunks. This is useful both for indexing data and for passing it in to a model, since large chunks are harder to search over and won't fit in a model's finite context window.\n",
"3. **Store**: We need somewhere to store and index our splits, so that they can later be searched over. This is often done using a [VectorStore](/docs/modules/data_connection/vectorstores/) and [Embeddings](/docs/modules/data_connection/text_embedding/) model.\n",
"\n",
"\n",

@@ -135,4 +135,15 @@ module.exports = {
link: { type: 'doc', id: "templates/index" }
},
],
contributing: [
// {
// type: "category",
// label: "Contributing",
// items: [
// { type: "autogenerated", dirName: "contributing" },
// ],
// link: { type: 'doc', id: "contributing/index" }
// },
{type: "autogenerated", dirName: "contributing" }
],
};

@@ -10,7 +10,7 @@ export function ColumnContainer({children}) {

export function Column({children}) {
return (
<div style={{ flex: "1 0 300px", padding: "10px", overflowX: "clip" }}>
<div style={{ flex: "1 0 300px", padding: "10px", overflowX: "clip", zoom: '80%' }}>
{children}
</div>
)

725
docs/static/svg/langchain_stack.svg
vendored
725
docs/static/svg/langchain_stack.svg
vendored
File diff suppressed because one or more lines are too long
|
Before Width: | Height: | Size: 528 KiB After Width: | Height: | Size: 531 KiB |
@@ -1,5 +1,17 @@
{
"redirects": [
{
"source": "/docs/integrations/providers/aws_dynamodb",
"destination": "/docs/integrations/platforms/aws#aws-dynamodb"
},
{
"source": "/docs/integrations/providers/scann",
"destination": "/docs/integrations/platforms/google#google-scann"
},
{
"source": "/docs/integrations/toolkits/google_drive",
"destination": "/docs/integrations/tools/google_drive"
},
{
"source": "/docs/use_cases/question_answering/analyze_document",
"destination": "/cookbook"

@@ -19,6 +19,5 @@ mkdir docs/templates
cp ../templates/docs/INDEX.md docs/templates/index.md
python3.8 scripts/copy_templates.py
cp ../cookbook/README.md src/pages/cookbook.mdx
cp ../.github/CONTRIBUTING.md docs/contributing.md
wget -q https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md -O docs/langserve.md
quarto render docs/

@@ -27,4 +27,4 @@ All changes will be accompanied by a patch version increase.

As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.

For detailed information on how to contribute, see [here](../../.github/CONTRIBUTING.md).
For detailed information on how to contribute, see the [Contributing Guide](https://python.langchain.com/docs/contributing/).

@@ -1,6 +1,7 @@
|
||||
import json
|
||||
from typing import Any, Iterator, List, Optional
|
||||
from typing import Any, Dict, Iterator, List, Optional, Union
|
||||
|
||||
from langchain_core._api import deprecated
|
||||
from langchain_core.callbacks import (
|
||||
CallbackManagerForLLMRun,
|
||||
)
|
||||
@@ -15,9 +16,10 @@ from langchain_core.messages import (
|
||||
)
|
||||
from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult
|
||||
|
||||
from langchain_community.llms.ollama import _OllamaCommon
|
||||
from langchain_community.llms.ollama import OllamaEndpointNotFoundError, _OllamaCommon
|
||||
|
||||
|
||||
@deprecated("0.0.3", alternative="_chat_stream_response_to_chat_generation_chunk")
|
||||
def _stream_response_to_chat_generation_chunk(
|
||||
stream_response: str,
|
||||
) -> ChatGenerationChunk:
|
||||
@@ -30,6 +32,20 @@ def _stream_response_to_chat_generation_chunk(
|
||||
)
|
||||
|
||||
|
||||
def _chat_stream_response_to_chat_generation_chunk(
|
||||
stream_response: str,
|
||||
) -> ChatGenerationChunk:
|
||||
"""Convert a stream response to a generation chunk."""
|
||||
parsed_response = json.loads(stream_response)
|
||||
generation_info = parsed_response if parsed_response.get("done") is True else None
|
||||
return ChatGenerationChunk(
|
||||
message=AIMessageChunk(
|
||||
content=parsed_response.get("message", {}).get("content", "")
|
||||
),
|
||||
generation_info=generation_info,
|
||||
)
|
||||
|
||||
|
||||
class ChatOllama(BaseChatModel, _OllamaCommon):
|
||||
"""Ollama locally runs large language models.
|
||||
|
||||
@@ -52,11 +68,15 @@ class ChatOllama(BaseChatModel, _OllamaCommon):
|
||||
"""Return whether this model can be serialized by Langchain."""
|
||||
return False
|
||||
|
||||
@deprecated("0.0.3", alternative="_convert_messages_to_ollama_messages")
|
||||
def _format_message_as_text(self, message: BaseMessage) -> str:
|
||||
if isinstance(message, ChatMessage):
|
||||
message_text = f"\n\n{message.role.capitalize()}: {message.content}"
|
||||
elif isinstance(message, HumanMessage):
|
||||
message_text = f"[INST] {message.content} [/INST]"
|
||||
if message.content[0].get("type") == "text":
|
||||
message_text = f"[INST] {message.content[0]['text']} [/INST]"
|
||||
elif message.content[0].get("type") == "image_url":
|
||||
message_text = message.content[0]["image_url"]["url"]
|
||||
elif isinstance(message, AIMessage):
|
||||
message_text = f"{message.content}"
|
||||
elif isinstance(message, SystemMessage):
|
||||
@@ -70,6 +90,98 @@ class ChatOllama(BaseChatModel, _OllamaCommon):
|
||||
[self._format_message_as_text(message) for message in messages]
|
||||
)
|
||||
|
||||
def _convert_messages_to_ollama_messages(
|
||||
self, messages: List[BaseMessage]
|
||||
) -> List[Dict[str, Union[str, List[str]]]]:
|
||||
ollama_messages = []
|
||||
for message in messages:
|
||||
role = ""
|
||||
if isinstance(message, HumanMessage):
|
||||
role = "user"
|
||||
elif isinstance(message, AIMessage):
|
||||
role = "assistant"
|
+            elif isinstance(message, SystemMessage):
+                role = "system"
+            else:
+                raise ValueError("Received unsupported message type for Ollama.")
+
+            content = ""
+            images = []
+            if isinstance(message.content, str):
+                content = message.content
+            else:
+                for content_part in message.content:
+                    if content_part.get("type") == "text":
+                        content += f"\n{content_part['text']}"
+                    elif content_part.get("type") == "image_url":
+                        if isinstance(content_part.get("image_url"), str):
+                            image_url_components = content_part["image_url"].split(",")
+                            # Support data:image/jpeg;base64,<image> format
+                            # and base64 strings
+                            if len(image_url_components) > 1:
+                                images.append(image_url_components[1])
+                            else:
+                                images.append(image_url_components[0])
+                        else:
+                            raise ValueError(
+                                "Only string image_url " "content parts are supported."
+                            )
+                    else:
+                        raise ValueError(
+                            "Unsupported message content type. "
+                            "Must either have type 'text' or type 'image_url' "
+                            "with a string 'image_url' field."
+                        )
+
+            ollama_messages.append(
+                {
+                    "role": role,
+                    "content": content,
+                    "images": images,
+                }
+            )
+
+        return ollama_messages
+
+    def _create_chat_stream(
+        self,
+        messages: List[BaseMessage],
+        stop: Optional[List[str]] = None,
+        **kwargs: Any,
+    ) -> Iterator[str]:
+        payload = {
+            "messages": self._convert_messages_to_ollama_messages(messages),
+        }
+        yield from self._create_stream(
+            payload=payload, stop=stop, api_url=f"{self.base_url}/api/chat/", **kwargs
+        )
+
+    def _chat_stream_with_aggregation(
+        self,
+        messages: List[BaseMessage],
+        stop: Optional[List[str]] = None,
+        run_manager: Optional[CallbackManagerForLLMRun] = None,
+        verbose: bool = False,
+        **kwargs: Any,
+    ) -> ChatGenerationChunk:
+        final_chunk: Optional[ChatGenerationChunk] = None
+        for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
+            if stream_resp:
+                chunk = _chat_stream_response_to_chat_generation_chunk(stream_resp)
+                if final_chunk is None:
+                    final_chunk = chunk
+                else:
+                    final_chunk += chunk
+                if run_manager:
+                    run_manager.on_llm_new_token(
+                        chunk.text,
+                        verbose=verbose,
+                    )
+        if final_chunk is None:
+            raise ValueError("No data received from Ollama stream.")
+
+        return final_chunk
+
     def _generate(
         self,
         messages: List[BaseMessage],
@@ -94,9 +206,12 @@ class ChatOllama(BaseChatModel, _OllamaCommon):
         ])
         """
 
-        prompt = self._format_messages_as_text(messages)
-        final_chunk = super()._stream_with_aggregation(
-            prompt, stop=stop, run_manager=run_manager, verbose=self.verbose, **kwargs
+        final_chunk = self._chat_stream_with_aggregation(
+            messages,
+            stop=stop,
+            run_manager=run_manager,
+            verbose=self.verbose,
+            **kwargs,
         )
         chat_generation = ChatGeneration(
             message=AIMessage(content=final_chunk.text),
@@ -110,9 +225,30 @@ class ChatOllama(BaseChatModel, _OllamaCommon):
         stop: Optional[List[str]] = None,
         run_manager: Optional[CallbackManagerForLLMRun] = None,
         **kwargs: Any,
     ) -> Iterator[ChatGenerationChunk]:
+        try:
+            for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
+                if stream_resp:
+                    chunk = _stream_response_to_chat_generation_chunk(stream_resp)
+                    yield chunk
+                    if run_manager:
+                        run_manager.on_llm_new_token(
+                            chunk.text,
+                            verbose=self.verbose,
+                        )
+        except OllamaEndpointNotFoundError:
+            yield from self._legacy_stream(messages, stop, **kwargs)
+
+    @deprecated("0.0.3", alternative="_stream")
+    def _legacy_stream(
+        self,
+        messages: List[BaseMessage],
+        stop: Optional[List[str]] = None,
+        run_manager: Optional[CallbackManagerForLLMRun] = None,
+        **kwargs: Any,
+    ) -> Iterator[ChatGenerationChunk]:
         prompt = self._format_messages_as_text(messages)
-        for stream_resp in self._create_stream(prompt, stop, **kwargs):
+        for stream_resp in self._create_generate_stream(prompt, stop, **kwargs):
             if stream_resp:
                 chunk = _stream_response_to_chat_generation_chunk(stream_resp)
                 yield chunk
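The net effect of the `ChatOllama` changes above is chat-endpoint streaming plus image support. A minimal usage sketch (mine, not part of the diff; the model name and base64 payload are placeholders):

```python
from langchain_community.chat_models import ChatOllama
from langchain_core.messages import HumanMessage

chat = ChatOllama(model="bakllava")  # any multi-modal Ollama model
message = HumanMessage(
    content=[
        {"type": "text", "text": "What is in this picture?"},
        # Both raw base64 strings and data URIs are accepted:
        {"type": "image_url", "image_url": "data:image/jpeg;base64,<BASE64>"},
    ]
)
# Each message is converted into Ollama's {"role", "content", "images"} shape
# by _convert_messages_to_ollama_messages before the request is sent.
result = chat.invoke([message])
```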
@@ -454,9 +454,12 @@ class ChatOpenAI(BaseChatModel):
         response = response.dict()
         for res in response["choices"]:
             message = convert_dict_to_message(res["message"])
+            generation_info = dict(finish_reason=res.get("finish_reason"))
+            if "logprobs" in res:
+                generation_info["logprobs"] = res["logprobs"]
             gen = ChatGeneration(
                 message=message,
-                generation_info=dict(finish_reason=res.get("finish_reason")),
+                generation_info=generation_info,
             )
             generations.append(gen)
         token_usage = response.get("usage", {})
@@ -1,6 +1,6 @@
 """Wrapper around YandexGPT chat models."""
 import logging
-from typing import Any, Dict, List, Optional, Tuple, cast
+from typing import Any, Dict, List, Optional, cast
 
 from langchain_core.callbacks import (
     AsyncCallbackManagerForLLMRun,
@@ -25,14 +25,13 @@ def _parse_message(role: str, text: str) -> Dict:
     return {"role": role, "text": text}
 
 
-def _parse_chat_history(history: List[BaseMessage]) -> Tuple[List[Dict[str, str]], str]:
+def _parse_chat_history(history: List[BaseMessage]) -> List[Dict[str, str]]:
     """Parse a sequence of messages into history.
 
     Returns:
-        A tuple of a list of parsed messages and an instruction message for the model.
+        A list of parsed messages.
     """
     chat_history = []
-    instruction = ""
     for message in history:
         content = cast(str, message.content)
         if isinstance(message, HumanMessage):
@@ -40,8 +39,8 @@ def _parse_chat_history(history: List[BaseMessage]) -> List[Dict[str, str]]:
         if isinstance(message, AIMessage):
             chat_history.append(_parse_message("assistant", content))
         if isinstance(message, SystemMessage):
-            instruction = content
-    return chat_history, instruction
+            chat_history.append(_parse_message("system", content))
+    return chat_history
 
 
 class ChatYandexGPT(_BaseYandexGPT, BaseChatModel):
@@ -84,9 +83,14 @@ class ChatYandexGPT(_BaseYandexGPT, BaseChatModel):
         try:
             import grpc
             from google.protobuf.wrappers_pb2 import DoubleValue, Int64Value
-            from yandex.cloud.ai.llm.v1alpha.llm_pb2 import GenerationOptions, Message
-            from yandex.cloud.ai.llm.v1alpha.llm_service_pb2 import ChatRequest
-            from yandex.cloud.ai.llm.v1alpha.llm_service_pb2_grpc import (
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_pb2 import (
+                CompletionOptions,
+                Message,
+            )
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2 import (  # noqa: E501
+                CompletionRequest,
+            )
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2_grpc import (  # noqa: E501
                 TextGenerationServiceStub,
             )
         except ImportError as e:
@@ -97,25 +101,20 @@ class ChatYandexGPT(_BaseYandexGPT, BaseChatModel):
             raise ValueError(
                 "You should provide at least one message to start the chat!"
             )
-        message_history, instruction = _parse_chat_history(messages)
+        message_history = _parse_chat_history(messages)
         channel_credentials = grpc.ssl_channel_credentials()
         channel = grpc.secure_channel(self.url, channel_credentials)
-        request = ChatRequest(
-            model=self.model_name,
-            generation_options=GenerationOptions(
+        request = CompletionRequest(
+            model_uri=self.model_uri,
+            completion_options=CompletionOptions(
                 temperature=DoubleValue(value=self.temperature),
                 max_tokens=Int64Value(value=self.max_tokens),
            ),
-            instruction_text=instruction,
             messages=[Message(**message) for message in message_history],
         )
         stub = TextGenerationServiceStub(channel)
-        if self.iam_token:
-            metadata = (("authorization", f"Bearer {self.iam_token}"),)
-        else:
-            metadata = (("authorization", f"Api-Key {self.api_key}"),)
-        res = stub.Chat(request, metadata=metadata)
-        text = list(res)[0].message.text
+        res = stub.Completion(request, metadata=self._grpc_metadata)
+        text = list(res)[0].alternatives[0].message.text
         text = text if stop is None else enforce_stop_tokens(text, stop)
         message = AIMessage(content=text)
         return ChatResult(generations=[ChatGeneration(message=message)])
@@ -127,6 +126,75 @@ class ChatYandexGPT(_BaseYandexGPT, BaseChatModel):
         run_manager: Optional[AsyncCallbackManagerForLLMRun] = None,
         **kwargs: Any,
     ) -> ChatResult:
-        raise NotImplementedError(
-            """YandexGPT doesn't support async requests at the moment."""
-        )
+        """Async method to generate next turn in the conversation.
+
+        Args:
+            messages: The history of the conversation as a list of messages.
+            stop: The list of stop words (optional).
+            run_manager: The CallbackManager for LLM run, it's not used at the moment.
+
+        Returns:
+            The ChatResult that contains outputs generated by the model.
+
+        Raises:
+            ValueError: if the last message in the list is not from human.
+        """
+        try:
+            import asyncio
+
+            import grpc
+            from google.protobuf.wrappers_pb2 import DoubleValue, Int64Value
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_pb2 import (
+                CompletionOptions,
+                Message,
+            )
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2 import (  # noqa: E501
+                CompletionRequest,
+                CompletionResponse,
+            )
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2_grpc import (  # noqa: E501
+                TextGenerationAsyncServiceStub,
+            )
+            from yandex.cloud.operation.operation_service_pb2 import GetOperationRequest
+            from yandex.cloud.operation.operation_service_pb2_grpc import (
+                OperationServiceStub,
+            )
+        except ImportError as e:
+            raise ImportError(
+                "Please install YandexCloud SDK" " with `pip install yandexcloud`."
+            ) from e
+        if not messages:
+            raise ValueError(
+                "You should provide at least one message to start the chat!"
+            )
+        message_history = _parse_chat_history(messages)
+        operation_api_url = "operation.api.cloud.yandex.net:443"
+        channel_credentials = grpc.ssl_channel_credentials()
+        async with grpc.aio.secure_channel(self.url, channel_credentials) as channel:
+            request = CompletionRequest(
+                model_uri=self.model_uri,
+                completion_options=CompletionOptions(
+                    temperature=DoubleValue(value=self.temperature),
+                    max_tokens=Int64Value(value=self.max_tokens),
+                ),
+                messages=[Message(**message) for message in message_history],
+            )
+            stub = TextGenerationAsyncServiceStub(channel)
+            operation = await stub.Completion(request, metadata=self._grpc_metadata)
+            async with grpc.aio.secure_channel(
+                operation_api_url, channel_credentials
+            ) as operation_channel:
+                operation_stub = OperationServiceStub(operation_channel)
+                while not operation.done:
+                    await asyncio.sleep(1)
+                    operation_request = GetOperationRequest(operation_id=operation.id)
+                    operation = await operation_stub.Get(
+                        operation_request, metadata=self._grpc_metadata
+                    )
+
+            completion_response = CompletionResponse()
+            operation.response.Unpack(completion_response)
+            text = completion_response.alternatives[0].message.text
+            text = text if stop is None else enforce_stop_tokens(text, stop)
+            message = AIMessage(content=text)
+            return ChatResult(generations=[ChatGeneration(message=message)])
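An illustrative consequence of the `_parse_chat_history` change (my example, not part of the diff): system messages now travel as ordinary chat turns rather than a separate `instruction_text` field.

```python
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage

history = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Hello!"),
    AIMessage(content="Hi, how can I help?"),
]
# _parse_chat_history(history) now returns a single flat list:
# [{"role": "system", "text": "You are a helpful assistant."},
#  {"role": "user", "text": "Hello!"},
#  {"role": "assistant", "text": "Hi, how can I help?"}]
```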
@@ -1,16 +1,29 @@
-from typing import Dict, List
+import logging
+import re
+import string
+import threading
+from concurrent.futures import ThreadPoolExecutor, wait
+from typing import Any, Dict, List, Literal, Optional, Tuple
 
 from langchain_core.embeddings import Embeddings
+from langchain_core.language_models.llms import create_base_retry_decorator
 from langchain_core.pydantic_v1 import root_validator
 
 from langchain_community.llms.vertexai import _VertexAICommon
 from langchain_community.utilities.vertexai import raise_vertex_import_error
 
+logger = logging.getLogger(__name__)
+
+_MAX_TOKENS_PER_BATCH = 20000
+_MAX_BATCH_SIZE = 250
+_MIN_BATCH_SIZE = 5
+
 
 class VertexAIEmbeddings(_VertexAICommon, Embeddings):
     """Google Cloud VertexAI embedding models."""
 
     model_name: str = "textembedding-gecko"
+    # Instance context
+    instance: Dict[str, Any] = {}  #: :meta private:
 
     @root_validator()
     def validate_environment(cls, values: Dict) -> Dict:
@@ -18,31 +31,294 @@ class VertexAIEmbeddings(_VertexAICommon, Embeddings):
         cls._try_init_vertexai(values)
         try:
             from vertexai.language_models import TextEmbeddingModel
-
-            values["client"] = TextEmbeddingModel.from_pretrained(values["model_name"])
         except ImportError:
             raise_vertex_import_error()
+        values["client"] = TextEmbeddingModel.from_pretrained(values["model_name"])
         return values
 
-    def embed_documents(
-        self, texts: List[str], batch_size: int = 5
+    def __init__(
+        self,
+        project: Optional[str] = None,
+        location: str = "us-central1",
+        request_parallelism: int = 5,
+        max_retries: int = 6,
+        model_name: str = "textembedding-gecko",
+        credentials: Optional[Any] = None,
+        **kwargs: Any,
+    ):
+        """Initialize the sentence_transformer."""
+        super().__init__(
+            project=project,
+            location=location,
+            credentials=credentials,
+            request_parallelism=request_parallelism,
+            max_retries=max_retries,
+            model_name=model_name,
+            **kwargs,
+        )
+        self.instance["max_batch_size"] = kwargs.get("max_batch_size", _MAX_BATCH_SIZE)
+        self.instance["batch_size"] = self.instance["max_batch_size"]
+        self.instance["min_batch_size"] = kwargs.get("min_batch_size", _MIN_BATCH_SIZE)
+        self.instance["min_good_batch_size"] = self.instance["min_batch_size"]
+        self.instance["lock"] = threading.Lock()
+        self.instance["batch_size_validated"] = False
+        self.instance["task_executor"] = ThreadPoolExecutor(
+            max_workers=request_parallelism
+        )
+        self.instance[
+            "embeddings_task_type_supported"
+        ] = not self.client._endpoint_name.endswith("/textembedding-gecko@001")
+
+    @staticmethod
+    def _split_by_punctuation(text: str) -> List[str]:
+        """Splits a string by punctuation and whitespace characters."""
+        split_by = string.punctuation + "\t\n "
+        pattern = f"([{split_by}])"
+        # Using re.split to split the text based on the pattern
+        return [segment for segment in re.split(pattern, text) if segment]
+
+    @staticmethod
+    def _prepare_batches(texts: List[str], batch_size: int) -> List[List[str]]:
+        """Splits texts in batches based on current maximum batch size
+        and maximum tokens per request.
+        """
+        text_index = 0
+        texts_len = len(texts)
+        batch_token_len = 0
+        batches: List[List[str]] = []
+        current_batch: List[str] = []
+        if texts_len == 0:
+            return []
+        while text_index < texts_len:
+            current_text = texts[text_index]
+            # Number of tokens per a text is conservatively estimated
+            # as 2 times number of words, punctuation and whitespace characters.
+            # Using `count_tokens` API will make batching too expensive.
+            # Utilizing a tokenizer, would add a dependency that would not
+            # necessarily be reused by the application using this class.
+            current_text_token_cnt = (
+                len(VertexAIEmbeddings._split_by_punctuation(current_text)) * 2
+            )
+            end_of_batch = False
+            if current_text_token_cnt > _MAX_TOKENS_PER_BATCH:
+                # Current text is too big even for a single batch.
+                # Such request will fail, but we still make a batch
+                # so that the app can get the error from the API.
+                if len(current_batch) > 0:
+                    # Adding current batch if not empty.
+                    batches.append(current_batch)
+                current_batch = [current_text]
+                text_index += 1
+                end_of_batch = True
+            elif (
+                batch_token_len + current_text_token_cnt > _MAX_TOKENS_PER_BATCH
+                or len(current_batch) == batch_size
+            ):
+                end_of_batch = True
+            else:
+                if text_index == texts_len - 1:
+                    # Last element - even though the batch may be not big,
+                    # we still need to make it.
+                    end_of_batch = True
+                batch_token_len += current_text_token_cnt
+                current_batch.append(current_text)
+                text_index += 1
+            if end_of_batch:
+                batches.append(current_batch)
+                current_batch = []
+                batch_token_len = 0
+        return batches
+
+    def _get_embeddings_with_retry(
+        self, texts: List[str], embeddings_type: Optional[str] = None
     ) -> List[List[float]]:
-        """Embed a list of strings. Vertex AI currently
-        sets a max batch size of 5 strings.
+        """Makes a Vertex AI model request with retry logic."""
+        from google.api_core.exceptions import (
+            Aborted,
+            DeadlineExceeded,
+            ResourceExhausted,
+            ServiceUnavailable,
+        )
+
+        errors = [
+            ResourceExhausted,
+            ServiceUnavailable,
+            Aborted,
+            DeadlineExceeded,
+        ]
+        retry_decorator = create_base_retry_decorator(
+            error_types=errors, max_retries=self.max_retries
+        )
+
+        @retry_decorator
+        def _completion_with_retry(texts_to_process: List[str]) -> Any:
+            if embeddings_type and self.instance["embeddings_task_type_supported"]:
+                from vertexai.language_models import TextEmbeddingInput
+
+                requests = [
+                    TextEmbeddingInput(text=t, task_type=embeddings_type)
+                    for t in texts_to_process
+                ]
+            else:
+                requests = texts_to_process
+            embeddings = self.client.get_embeddings(requests)
+            return [embs.values for embs in embeddings]
+
+        return _completion_with_retry(texts)
+
+    def _prepare_and_validate_batches(
+        self, texts: List[str], embeddings_type: Optional[str] = None
+    ) -> Tuple[List[List[float]], List[List[str]]]:
+        """Prepares text batches with one-time validation of batch size.
+        Batch size varies between GCP regions and individual project quotas.
+        # Returns embeddings of the first text batch that went through,
+        # and text batches for the rest of the texts.
+        """
+        from google.api_core.exceptions import InvalidArgument
+
+        batches = VertexAIEmbeddings._prepare_batches(
+            texts, self.instance["batch_size"]
+        )
+        # If batch size if less or equal to one that went through before,
+        # then keep batches as they are.
+        if len(batches[0]) <= self.instance["min_good_batch_size"]:
+            return [], batches
+        with self.instance["lock"]:
+            # If largest possible batch size was validated
+            # while waiting for the lock, then check for rebuilding
+            # our batches, and return.
+            if self.instance["batch_size_validated"]:
+                if len(batches[0]) <= self.instance["batch_size"]:
+                    return [], batches
+                else:
+                    return [], VertexAIEmbeddings._prepare_batches(
+                        texts, self.instance["batch_size"]
+                    )
+            # Figure out largest possible batch size by trying to push
+            # batches and lowering their size in half after every failure.
+            first_batch = batches[0]
+            first_result = []
+            had_failure = False
+            while True:
+                try:
+                    first_result = self._get_embeddings_with_retry(
+                        first_batch, embeddings_type
+                    )
+                    break
+                except InvalidArgument:
+                    had_failure = True
+                    first_batch_len = len(first_batch)
+                    if first_batch_len == self.instance["min_batch_size"]:
+                        raise
+                    first_batch_len = max(
+                        self.instance["min_batch_size"], int(first_batch_len / 2)
+                    )
+                    first_batch = first_batch[:first_batch_len]
+            first_batch_len = len(first_batch)
+            self.instance["min_good_batch_size"] = max(
+                self.instance["min_good_batch_size"], first_batch_len
+            )
+            # If had a failure and recovered
+            # or went through with the max size, then it's a legit batch size.
+            if had_failure or first_batch_len == self.instance["max_batch_size"]:
+                self.instance["batch_size"] = first_batch_len
+                self.instance["batch_size_validated"] = True
+                # If batch size was updated,
+                # rebuild batches with the new batch size
+                # (texts that went through are excluded here).
+                if first_batch_len != self.instance["max_batch_size"]:
+                    batches = VertexAIEmbeddings._prepare_batches(
+                        texts[first_batch_len:], self.instance["batch_size"]
+                    )
+            else:
+                # Still figuring out max batch size.
+                batches = batches[1:]
+        # Returning embeddings of the first text batch that went through,
+        # and text batches for the rest of texts.
+        return first_result, batches
+
+    def embed(
+        self,
+        texts: List[str],
+        batch_size: int = 0,
+        embeddings_task_type: Optional[
+            Literal[
+                "RETRIEVAL_QUERY",
+                "RETRIEVAL_DOCUMENT",
+                "SEMANTIC_SIMILARITY",
+                "CLASSIFICATION",
+                "CLUSTERING",
+            ]
+        ] = None,
+    ) -> List[List[float]]:
+        """Embed a list of strings.
 
         Args:
             texts: List[str] The list of strings to embed.
-            batch_size: [int] The batch size of embeddings to send to the model
+            batch_size: [int] The batch size of embeddings to send to the model.
+                If zero, then the largest batch size will be detected dynamically
+                at the first request, starting from 250, down to 5.
+            embeddings_task_type: [str] optional embeddings task type,
+                one of the following
+                    RETRIEVAL_QUERY - Text is a query
+                        in a search/retrieval setting.
+                    RETRIEVAL_DOCUMENT - Text is a document
+                        in a search/retrieval setting.
+                    SEMANTIC_SIMILARITY - Embeddings will be used
+                        for Semantic Textual Similarity (STS).
+                    CLASSIFICATION - Embeddings will be used for classification.
+                    CLUSTERING - Embeddings will be used for clustering.
 
         Returns:
             List of embeddings, one for each text.
         """
-        embeddings = []
-        for batch in range(0, len(texts), batch_size):
-            text_batch = texts[batch : batch + batch_size]
-            embeddings_batch = self.client.get_embeddings(text_batch)
-            embeddings.extend([el.values for el in embeddings_batch])
+        if len(texts) == 0:
+            return []
+        embeddings: List[List[float]] = []
+        first_batch_result: List[List[float]] = []
+        if batch_size > 0:
+            # Fixed batch size.
+            batches = VertexAIEmbeddings._prepare_batches(texts, batch_size)
+        else:
+            # Dynamic batch size, starting from 250 at the first call.
+            first_batch_result, batches = self._prepare_and_validate_batches(
+                texts, embeddings_task_type
+            )
+        # First batch result may have some embeddings already.
+        # In such case, batches have texts that were not processed yet.
+        embeddings.extend(first_batch_result)
+        tasks = []
+        for batch in batches:
+            tasks.append(
+                self.instance["task_executor"].submit(
+                    self._get_embeddings_with_retry,
+                    texts=batch,
+                    embeddings_type=embeddings_task_type,
+                )
+            )
+        if len(tasks) > 0:
+            wait(tasks)
+        for t in tasks:
+            embeddings.extend(t.result())
         return embeddings
 
+    def embed_documents(
+        self, texts: List[str], batch_size: int = 0
+    ) -> List[List[float]]:
+        """Embed a list of documents.
+
+        Args:
+            texts: List[str] The list of texts to embed.
+            batch_size: [int] The batch size of embeddings to send to the model.
+                If zero, then the largest batch size will be detected dynamically
+                at the first request, starting from 250, down to 5.
+
+        Returns:
+            List of embeddings, one for each text.
+        """
+        return self.embed(texts, batch_size, "RETRIEVAL_DOCUMENT")
+
     def embed_query(self, text: str) -> List[float]:
         """Embed a text.
 
@@ -52,5 +328,5 @@ class VertexAIEmbeddings(_VertexAICommon, Embeddings):
         Returns:
             Embedding for the text.
         """
-        embeddings = self.client.get_embeddings([text])
-        return embeddings[0].values
+        embeddings = self.embed([text], 1, "RETRIEVAL_QUERY")
+        return embeddings[0]
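A hedged usage sketch of the new dynamic batching (mine, not from the diff; assumes GCP credentials are already configured):

```python
from langchain_community.embeddings import VertexAIEmbeddings

embeddings = VertexAIEmbeddings()  # default model "textembedding-gecko"
# batch_size defaults to 0, so the first call probes the largest batch the
# region/quota allows: it starts at 250 and halves on InvalidArgument, down to 5.
vectors = embeddings.embed_documents(["doc one", "doc two"])
query_vector = embeddings.embed_query("a question")  # RETRIEVAL_QUERY task type
```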
@@ -20,6 +20,10 @@ def _stream_response_to_generation_chunk(
     )
 
 
+class OllamaEndpointNotFoundError(Exception):
+    """Raised when the Ollama endpoint is not found."""
+
+
 class _OllamaCommon(BaseLanguageModel):
     base_url: str = "http://localhost:11434"
     """Base url the model is hosted under."""
@@ -129,10 +133,26 @@ class _OllamaCommon(BaseLanguageModel):
         """Get the identifying parameters."""
         return {**{"model": self.model, "format": self.format}, **self._default_params}
 
-    def _create_stream(
+    def _create_generate_stream(
         self,
         prompt: str,
         stop: Optional[List[str]] = None,
+        images: Optional[List[str]] = None,
+        **kwargs: Any,
+    ) -> Iterator[str]:
+        payload = {"prompt": prompt, "images": images}
+        yield from self._create_stream(
+            payload=payload,
+            stop=stop,
+            api_url=f"{self.base_url}/api/generate/",
+            **kwargs,
+        )
+
+    def _create_stream(
+        self,
+        api_url: str,
+        payload: Any,
+        stop: Optional[List[str]] = None,
         **kwargs: Any,
     ) -> Iterator[str]:
         if self.stop is not None and stop is not None:
@@ -156,20 +176,34 @@ class _OllamaCommon(BaseLanguageModel):
             **kwargs,
         }
 
+        if payload.get("messages"):
+            request_payload = {"messages": payload.get("messages", []), **params}
+        else:
+            request_payload = {
+                "prompt": payload.get("prompt"),
+                "images": payload.get("images", []),
+                **params,
+            }
+
         response = requests.post(
-            url=f"{self.base_url}/api/generate/",
+            url=api_url,
             headers={"Content-Type": "application/json"},
-            json={"prompt": prompt, **params},
+            json=request_payload,
             stream=True,
             timeout=self.timeout,
         )
         response.encoding = "utf-8"
         if response.status_code != 200:
-            optional_detail = response.json().get("error")
-            raise ValueError(
-                f"Ollama call failed with status code {response.status_code}."
-                f" Details: {optional_detail}"
-            )
+            if response.status_code == 404:
+                raise OllamaEndpointNotFoundError(
+                    "Ollama call failed with status code 404."
+                )
+            else:
+                optional_detail = response.json().get("error")
+                raise ValueError(
+                    f"Ollama call failed with status code {response.status_code}."
+                    f" Details: {optional_detail}"
+                )
         return response.iter_lines(decode_unicode=True)
 
     def _stream_with_aggregation(
@@ -181,7 +215,7 @@ class _OllamaCommon(BaseLanguageModel):
         **kwargs: Any,
     ) -> GenerationChunk:
         final_chunk: Optional[GenerationChunk] = None
-        for stream_resp in self._create_stream(prompt, stop, **kwargs):
+        for stream_resp in self._create_generate_stream(prompt, stop, **kwargs):
             if stream_resp:
                 chunk = _stream_response_to_generation_chunk(stream_resp)
                 if final_chunk is None:
@@ -225,6 +259,7 @@ class Ollama(BaseLLM, _OllamaCommon):
         self,
         prompts: List[str],
         stop: Optional[List[str]] = None,
+        images: Optional[List[str]] = None,
         run_manager: Optional[CallbackManagerForLLMRun] = None,
         **kwargs: Any,
     ) -> LLMResult:
@@ -248,6 +283,7 @@ class Ollama(BaseLLM, _OllamaCommon):
             final_chunk = super()._stream_with_aggregation(
                 prompt,
                 stop=stop,
+                images=images,
                 run_manager=run_manager,
                 verbose=self.verbose,
                 **kwargs,
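For orientation, a sketch of the two request shapes that `_create_stream` now dispatches on (illustrative values only, not part of the diff):

```python
# /api/generate payloads carry "prompt" and "images"; /api/chat carries "messages".
generate_payload = {
    "prompt": "Why is the sky blue?",
    "images": [],
}
chat_payload = {
    "messages": [
        {"role": "user", "content": "Why is the sky blue?", "images": []},
    ],
}
# _create_stream picks the shape by checking payload.get("messages"),
# then POSTs to the api_url supplied by the calling helper.
```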
@@ -108,7 +108,7 @@ class Tongyi(LLM):
         return False
 
     client: Any  #: :meta private:
-    model_name: str = "qwen-plus-v1"
+    model_name: str = "qwen-plus"
 
     """Model name to use."""
     model_kwargs: Dict[str, Any] = Field(default_factory=dict)
@@ -14,13 +14,19 @@ from langchain_community.llms.utils import enforce_stop_tokens
 
 class _BaseYandexGPT(Serializable):
     iam_token: str = ""
-    """Yandex Cloud IAM token for service account
+    """Yandex Cloud IAM token for service or user account
     with the `ai.languageModels.user` role"""
     api_key: str = ""
     """Yandex Cloud Api Key for service account
     with the `ai.languageModels.user` role"""
-    model_name: str = "general"
+    folder_id: str = ""
+    """Yandex Cloud folder ID"""
+    model_uri: str = ""
+    """Model uri to use."""
+    model_name: str = "yandexgpt-lite"
+    """Model name to use."""
+    model_version: str = "latest"
+    """Model version to use."""
     temperature: float = 0.6
     """What sampling temperature to use.
     Should be a double number between 0 (inclusive) and 1 (inclusive)."""
@@ -45,8 +51,27 @@ class _BaseYandexGPT(Serializable):
         values["iam_token"] = iam_token
         api_key = get_from_dict_or_env(values, "api_key", "YC_API_KEY", "")
         values["api_key"] = api_key
+        folder_id = get_from_dict_or_env(values, "folder_id", "YC_FOLDER_ID", "")
+        values["folder_id"] = folder_id
         if api_key == "" and iam_token == "":
             raise ValueError("Either 'YC_API_KEY' or 'YC_IAM_TOKEN' must be provided.")
+
+        if values["iam_token"]:
+            values["_grpc_metadata"] = [
+                ("authorization", f"Bearer {values['iam_token']}")
+            ]
+            if values["folder_id"]:
+                values["_grpc_metadata"].append(("x-folder-id", values["folder_id"]))
+        else:
+            values["_grpc_metadata"] = (
+                ("authorization", f"Api-Key {values['api_key']}"),
+            )
+        if values["model_uri"] == "" and values["folder_id"] == "":
+            raise ValueError("Either 'model_uri' or 'folder_id' must be provided.")
+        if not values["model_uri"]:
+            values[
+                "model_uri"
+            ] = f"gpt://{values['folder_id']}/{values['model_name']}/{values['model_version']}"
         return values
 
 
@@ -62,18 +87,23 @@ class YandexGPT(_BaseYandexGPT, LLM):
     - You can specify the key in a constructor parameter `api_key`
     or in an environment variable `YC_API_KEY`.
 
+    To use the default model specify the folder ID in a parameter `folder_id`
+    or in an environment variable `YC_FOLDER_ID`.
+
+    Or specify the model URI in a constructor parameter `model_uri`
+
     Example:
         .. code-block:: python
 
             from langchain_community.llms import YandexGPT
-            yandex_gpt = YandexGPT(iam_token="t1.9eu...")
+            yandex_gpt = YandexGPT(iam_token="t1.9eu...", folder_id="b1g...")
     """
 
     @property
     def _identifying_params(self) -> Mapping[str, Any]:
         """Get the identifying parameters."""
         return {
-            "model_name": self.model_name,
+            "model_uri": self.model_uri,
             "temperature": self.temperature,
             "max_tokens": self.max_tokens,
             "stop": self.stop,
@@ -103,9 +133,14 @@ class YandexGPT(_BaseYandexGPT, LLM):
         try:
             import grpc
             from google.protobuf.wrappers_pb2 import DoubleValue, Int64Value
-            from yandex.cloud.ai.llm.v1alpha.llm_pb2 import GenerationOptions
-            from yandex.cloud.ai.llm.v1alpha.llm_service_pb2 import InstructRequest
-            from yandex.cloud.ai.llm.v1alpha.llm_service_pb2_grpc import (
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_pb2 import (
+                CompletionOptions,
+                Message,
+            )
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2 import (  # noqa: E501
+                CompletionRequest,
+            )
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2_grpc import (  # noqa: E501
                 TextGenerationServiceStub,
             )
         except ImportError as e:
@@ -114,21 +149,21 @@ class YandexGPT(_BaseYandexGPT, LLM):
             ) from e
         channel_credentials = grpc.ssl_channel_credentials()
         channel = grpc.secure_channel(self.url, channel_credentials)
-        request = InstructRequest(
-            model=self.model_name,
-            request_text=prompt,
-            generation_options=GenerationOptions(
+        request = CompletionRequest(
+            model_uri=self.model_uri,
+            completion_options=CompletionOptions(
                 temperature=DoubleValue(value=self.temperature),
                 max_tokens=Int64Value(value=self.max_tokens),
             ),
+            messages=[Message(role="user", text=prompt)],
         )
         stub = TextGenerationServiceStub(channel)
         if self.iam_token:
             metadata = (("authorization", f"Bearer {self.iam_token}"),)
         else:
             metadata = (("authorization", f"Api-Key {self.api_key}"),)
-        res = stub.Instruct(request, metadata=metadata)
-        text = list(res)[0].alternatives[0].text
+        res = stub.Completion(request, metadata=metadata)
+        text = list(res)[0].alternatives[0].message.text
         if stop is not None:
             text = enforce_stop_tokens(text, stop)
         return text
@@ -154,12 +189,15 @@ class YandexGPT(_BaseYandexGPT, LLM):
 
             import grpc
             from google.protobuf.wrappers_pb2 import DoubleValue, Int64Value
-            from yandex.cloud.ai.llm.v1alpha.llm_pb2 import GenerationOptions
-            from yandex.cloud.ai.llm.v1alpha.llm_service_pb2 import (
-                InstructRequest,
-                InstructResponse,
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_pb2 import (
+                CompletionOptions,
+                Message,
             )
-            from yandex.cloud.ai.llm.v1alpha.llm_service_pb2_grpc import (
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2 import (  # noqa: E501
+                CompletionRequest,
+                CompletionResponse,
+            )
+            from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2_grpc import (  # noqa: E501
                 TextGenerationAsyncServiceStub,
             )
             from yandex.cloud.operation.operation_service_pb2 import GetOperationRequest
@@ -173,20 +211,16 @@ class YandexGPT(_BaseYandexGPT, LLM):
         operation_api_url = "operation.api.cloud.yandex.net:443"
         channel_credentials = grpc.ssl_channel_credentials()
         async with grpc.aio.secure_channel(self.url, channel_credentials) as channel:
-            request = InstructRequest(
-                model=self.model_name,
-                request_text=prompt,
-                generation_options=GenerationOptions(
+            request = CompletionRequest(
+                model_uri=self.model_uri,
+                completion_options=CompletionOptions(
                     temperature=DoubleValue(value=self.temperature),
                     max_tokens=Int64Value(value=self.max_tokens),
                 ),
+                messages=[Message(role="user", text=prompt)],
             )
             stub = TextGenerationAsyncServiceStub(channel)
-            if self.iam_token:
-                metadata = (("authorization", f"Bearer {self.iam_token}"),)
-            else:
-                metadata = (("authorization", f"Api-Key {self.api_key}"),)
-            operation = await stub.Instruct(request, metadata=metadata)
+            operation = await stub.Completion(request, metadata=self._grpc_metadata)
             async with grpc.aio.secure_channel(
                 operation_api_url, channel_credentials
             ) as operation_channel:
@@ -195,12 +229,12 @@ class YandexGPT(_BaseYandexGPT, LLM):
                     await asyncio.sleep(1)
                     operation_request = GetOperationRequest(operation_id=operation.id)
                     operation = await operation_stub.Get(
-                        operation_request, metadata=metadata
+                        operation_request, metadata=self._grpc_metadata
                     )
 
-            instruct_response = InstructResponse()
-            operation.response.Unpack(instruct_response)
-            text = instruct_response.alternatives[0].text
+            completion_response = CompletionResponse()
+            operation.response.Unpack(completion_response)
+            text = completion_response.alternatives[0].message.text
             if stop is not None:
                 text = enforce_stop_tokens(text, stop)
             return text
@@ -405,6 +405,10 @@ class SQLDatabase:
                     connection.exec_driver_sql(
                         f"ALTER SESSION SET CURRENT_SCHEMA = {self._schema}"
                     )
+                elif self.dialect == "sqlany":
+                    # If anybody using Sybase SQL anywhere database then it should not
+                    # go to else condition. It should be same as mssql.
+                    pass
                 else:  # postgresql and other compatible dialects
                     connection.exec_driver_sql("SET search_path TO %s", (self._schema,))
             cursor = connection.execute(text(command))
@@ -4,6 +4,7 @@ import logging
 from typing import (
     TYPE_CHECKING,
     Any,
+    Callable,
     Dict,
     Generator,
     Iterable,
@@ -60,6 +61,7 @@ class MongoDBAtlasVectorSearch(VectorStore):
         index_name: str = "default",
         text_key: str = "text",
         embedding_key: str = "embedding",
+        relevance_score_fn: str = "cosine",
     ):
         """
         Args:
@@ -70,17 +72,32 @@ class MongoDBAtlasVectorSearch(VectorStore):
             embedding_key: MongoDB field that will contain the embedding for
                 each document.
             index_name: Name of the Atlas Search index.
+            relevance_score_fn: The similarity score used for the index.
+                Currently supported: Euclidean, cosine, and dot product.
         """
         self._collection = collection
         self._embedding = embedding
         self._index_name = index_name
         self._text_key = text_key
         self._embedding_key = embedding_key
+        self._relevance_score_fn = relevance_score_fn
 
     @property
     def embeddings(self) -> Embeddings:
         return self._embedding
 
+    def _select_relevance_score_fn(self) -> Callable[[float], float]:
+        if self._relevance_score_fn == "euclidean":
+            return self._euclidean_relevance_score_fn
+        elif self._relevance_score_fn == "dotProduct":
+            return self._max_inner_product_relevance_score_fn
+        elif self._relevance_score_fn == "cosine":
+            return self._cosine_relevance_score_fn
+        else:
+            raise NotImplementedError(
+                f"No relevance score function for ${self._relevance_score_fn}"
+            )
+
     @classmethod
     def from_connection_string(
         cls,
@@ -198,7 +215,6 @@ class MongoDBAtlasVectorSearch(VectorStore):
     def similarity_search_with_score(
         self,
         query: str,
-        *,
         k: int = 4,
         pre_filter: Optional[Dict] = None,
         post_filter_pipeline: Optional[List[Dict]] = None,
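A hedged usage sketch of the new `relevance_score_fn` parameter (my example; `collection` and `embeddings` are assumed to be constructed elsewhere):

```python
from langchain_community.vectorstores import MongoDBAtlasVectorSearch

store = MongoDBAtlasVectorSearch(
    collection=collection,    # a pymongo Collection, created elsewhere
    embedding=embeddings,     # any Embeddings implementation
    index_name="default",
    relevance_score_fn="dotProduct",  # must match the Atlas index similarity
)
# _select_relevance_score_fn maps the string to the matching scoring helper;
# unknown names raise NotImplementedError.
docs_and_scores = store.similarity_search_with_score("some query", k=4)
```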
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "langchain-community"
-version = "0.0.3"
+version = "0.0.4"
 description = "Community contributed LangChain integrations."
 authors = []
 license = "MIT"
@@ -130,8 +130,8 @@ optional = true
 # developers from being able to easily run them.
 # Instead write unit tests that use the `responses` library or mock.patch with
 # fixtures. Keep the fixtures minimal.
-# See CONTRIBUTING.md for more instructions on working with optional dependencies.
-# https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md#working-with-optional-dependencies
+# See Contributing Guide for more instructions on working with optional dependencies.
+# https://python.langchain.com/docs/contributing/code#working-with-optional-dependencies
 pytest-vcr = "^1.0.2"
 wrapt = "^1.15.0"
 openai = "^1"
@@ -1,8 +1,8 @@
 """Test Vertex AI API wrapper.
-In order to run this test, you need to install VertexAI SDK
+In order to run this test, you need to install VertexAI SDK
 pip install google-cloud-aiplatform>=1.35.0
 
-Your end-user credentials would be used to make the calls (make sure you've run
+Your end-user credentials would be used to make the calls (make sure you've run
 `gcloud auth login` first).
 """
 from langchain_community.embeddings import VertexAIEmbeddings
@@ -24,6 +24,16 @@ def test_embedding_query() -> None:
     assert len(output) == 768
 
 
+def test_large_batches() -> None:
+    documents = ["foo bar" for _ in range(0, 251)]
+    model_uscentral1 = VertexAIEmbeddings(location="us-central1")
+    model_asianortheast1 = VertexAIEmbeddings(location="asia-northeast1")
+    model_uscentral1.embed_documents(documents)
+    model_asianortheast1.embed_documents(documents)
+    assert model_uscentral1.instance["batch_size"] >= 250
+    assert model_asianortheast1.instance["batch_size"] < 50
+
+
 def test_paginated_texts() -> None:
     documents = [
         "foo bar",
libs/community/tests/unit_tests/embeddings/test_vertexai.py (new file, 63 lines)
@@ -0,0 +1,63 @@
+"""Test Vertex AI embeddings API wrapper.
+"""
+
+from langchain_community.embeddings import VertexAIEmbeddings
+
+
+def test_split_by_punctuation() -> None:
+    parts = VertexAIEmbeddings._split_by_punctuation(
+        "Hello, my friend!\nHow are you?\nI have 2 news:\n\n\t- Good,\n\t- Bad."
+    )
+    assert parts == [
+        "Hello",
+        ",",
+        " ",
+        "my",
+        " ",
+        "friend",
+        "!",
+        "\n",
+        "How",
+        " ",
+        "are",
+        " ",
+        "you",
+        "?",
+        "\n",
+        "I",
+        " ",
+        "have",
+        " ",
+        "2",
+        " ",
+        "news",
+        ":",
+        "\n",
+        "\n",
+        "\t",
+        "-",
+        " ",
+        "Good",
+        ",",
+        "\n",
+        "\t",
+        "-",
+        " ",
+        "Bad",
+        ".",
+    ]
+
+
+def test_batching() -> None:
+    long_text = "foo " * 500  # 1000 words, 2000 tokens
+    long_texts = [long_text for _ in range(0, 250)]
+    documents251 = ["foo bar" for _ in range(0, 251)]
+    five_elem = VertexAIEmbeddings._prepare_batches(long_texts, 5)
+    default250_elem = VertexAIEmbeddings._prepare_batches(long_texts, 250)
+    batches251 = VertexAIEmbeddings._prepare_batches(documents251, 250)
+    assert len(five_elem) == 50  # 250/5 items
+    assert len(five_elem[0]) == 5  # 5 items per batch
+    assert len(default250_elem[0]) == 10  # Should not be more than 20K tokens
+    assert len(default250_elem) == 25
+    assert len(batches251[0]) == 250
+    assert len(batches251[1]) == 1
@@ -55,4 +55,4 @@ Patch version increases will occur for:
 
 As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.
 
-For detailed information on how to contribute, see [here](../../.github/CONTRIBUTING.md).
+For detailed information on how to contribute, see the [Contributing Guide](https://python.langchain.com/docs/contributing/).
@@ -3,11 +3,16 @@ import json
 import os
 from typing import Any, Dict, List, Optional
 
-from langchain_core.load.mapping import SERIALIZABLE_MAPPING
+from langchain_core.load.mapping import (
+    OLD_PROMPT_TEMPLATE_FORMATS,
+    SERIALIZABLE_MAPPING,
+)
 from langchain_core.load.serializable import Serializable
 
 DEFAULT_NAMESPACES = ["langchain", "langchain_core", "langchain_community"]
 
+ALL_SERIALIZABLE_MAPPINGS = {**SERIALIZABLE_MAPPING, **OLD_PROMPT_TEMPLATE_FORMATS}
+
 
 class Reviver:
     """Reviver for JSON objects."""
@@ -67,13 +72,13 @@ class Reviver:
         if namespace[0] in DEFAULT_NAMESPACES:
             # Get the importable path
             key = tuple(namespace + [name])
-            if key not in SERIALIZABLE_MAPPING:
+            if key not in ALL_SERIALIZABLE_MAPPINGS:
                 raise ValueError(
                     "Trying to deserialize something that cannot "
                     "be deserialized in current version of langchain-core: "
                     f"{key}"
                 )
-            import_path = SERIALIZABLE_MAPPING[key]
+            import_path = ALL_SERIALIZABLE_MAPPINGS[key]
             # Split into module and name
             import_dir, import_obj = import_path[:-1], import_path[-1]
             # Import module
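An illustrative round-trip under the new mapping (my sketch, assuming the public `langchain_core.load` helpers):

```python
from langchain_core.load import dumps, loads
from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template("Tell me a joke about {topic}")
payload = dumps(prompt)
# Payloads written during the few versions that serialized prompts under
# "langchain_core" paths now resolve through ALL_SERIALIZABLE_MAPPINGS
# instead of failing with ValueError.
restored = loads(payload)
assert restored.format(topic="cats") == prompt.format(topic="cats")
```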
@@ -476,3 +476,162 @@ SERIALIZABLE_MAPPING = {
         "RunnableRetry",
     ),
 }
+
+# Needed for backwards compatibility for a few versions where we serialized
+# with langchain_core
+OLD_PROMPT_TEMPLATE_FORMATS = {
+    (
+        "langchain_core",
+        "prompts",
+        "base",
+        "BasePromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "base",
+        "BasePromptTemplate",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "prompt",
+        "PromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "prompt",
+        "PromptTemplate",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "MessagesPlaceholder",
+    ): (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "MessagesPlaceholder",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "ChatPromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "ChatPromptTemplate",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "HumanMessagePromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "HumanMessagePromptTemplate",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "SystemMessagePromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "SystemMessagePromptTemplate",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "BaseMessagePromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "BaseMessagePromptTemplate",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "BaseChatPromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "BaseChatPromptTemplate",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "ChatMessagePromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "ChatMessagePromptTemplate",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "few_shot_with_templates",
+        "FewShotPromptWithTemplates",
+    ): (
+        "langchain_core",
+        "prompts",
+        "few_shot_with_templates",
+        "FewShotPromptWithTemplates",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "pipeline",
+        "PipelinePromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "pipeline",
+        "PipelinePromptTemplate",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "string",
+        "StringPromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "string",
+        "StringPromptTemplate",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "BaseStringMessagePromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "BaseStringMessagePromptTemplate",
+    ),
+    (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "AIMessagePromptTemplate",
+    ): (
+        "langchain_core",
+        "prompts",
+        "chat",
+        "AIMessagePromptTemplate",
+    ),
+}
@@ -19,7 +19,8 @@ def test_interfaces() -> None:
 
 
 def _get_get_session_history(
-    *, store: Optional[Dict[str, Any]] = None
+    *,
+    store: Optional[Dict[str, Any]] = None,
 ) -> Callable[..., ChatMessageHistory]:
     chat_history_store = store if store is not None else {}
@@ -93,4 +93,4 @@ For more information on these concepts, please see our [full documentation](http
 
 As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.
 
-For detailed information on how to contribute, see [here](../../.github/CONTRIBUTING.md).
+For detailed information on how to contribute, see the [Contributing Guide](https://python.langchain.com/docs/contributing/).
@@ -288,7 +288,7 @@ class OpenAIAssistantRunnable(RunnableSerializable[Dict, OutputType]):
             tc.id for tc in run.required_action.submit_tool_outputs.tool_calls
         }
         tool_outputs = [
-            {"output": output, "tool_call_id": action.tool_call_id}
+            {"output": str(output), "tool_call_id": action.tool_call_id}
             for action, output in intermediate_steps
             if action.tool_call_id in required_tool_call_ids
         ]
@@ -36,7 +36,7 @@ def parse_ai_message_to_openai_tool_action(
         function = tool_call["function"]
         function_name = function["name"]
         try:
-            _tool_input = json.loads(function["arguments"])
+            _tool_input = json.loads(function["arguments"] or "{}")
         except JSONDecodeError:
             raise OutputParserException(
                 f"Could not parse tool input: {function} because "
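Why the `or "{}"` guard matters (illustrative example, mine):

```python
import json

# OpenAI can return an empty string for a tool call that takes no arguments;
# json.loads("") raises JSONDecodeError, while the guard yields an empty dict.
assert json.loads("" or "{}") == {}
assert json.loads('{"query": "weather"}' or "{}") == {"query": "weather"}
```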
@@ -195,6 +195,7 @@ def index(
     cleanup: Literal["incremental", "full", None] = None,
     source_id_key: Union[str, Callable[[Document], str], None] = None,
     cleanup_batch_size: int = 1_000,
+    force_update: bool = False,
 ) -> IndexingResult:
     """Index data from the loader into the vector store.
 
@@ -233,6 +234,8 @@ def index(
         source_id_key: Optional key that helps identify the original source
             of the document.
         cleanup_batch_size: Batch size to use when cleaning up documents.
+        force_update: Force update documents even if they are present in the
+            record manager. Useful if you are re-indexing with updated embeddings.
 
     Returns:
         Indexing result which contains information about how many documents
@@ -308,10 +311,14 @@ def index(
         uids = []
         docs_to_index = []
         uids_to_refresh = []
+        seen_docs: Set[str] = set()
         for hashed_doc, doc_exists in zip(hashed_docs, exists_batch):
             if doc_exists:
-                uids_to_refresh.append(hashed_doc.uid)
-                continue
+                if force_update:
+                    seen_docs.add(hashed_doc.uid)
+                else:
+                    uids_to_refresh.append(hashed_doc.uid)
+                    continue
             uids.append(hashed_doc.uid)
             docs_to_index.append(hashed_doc.to_document())
 
@@ -324,7 +331,8 @@ def index(
         # First write to vector store
         if docs_to_index:
             vector_store.add_documents(docs_to_index, ids=uids)
-            num_added += len(docs_to_index)
+            num_added += len(docs_to_index) - len(seen_docs)
+            num_updated += len(seen_docs)
 
         # And only then update the record store.
         # Update ALL records, even if they already exist since we want to refresh
@@ -391,6 +399,7 @@ async def aindex(
     cleanup: Literal["incremental", "full", None] = None,
     source_id_key: Union[str, Callable[[Document], str], None] = None,
     cleanup_batch_size: int = 1_000,
+    force_update: bool = False,
 ) -> IndexingResult:
     """Index data from the loader into the vector store.
 
@@ -429,6 +438,8 @@ async def aindex(
         source_id_key: Optional key that helps identify the original source
             of the document.
         cleanup_batch_size: Batch size to use when cleaning up documents.
+        force_update: Force update documents even if they are present in the
+            record manager. Useful if you are re-indexing with updated embeddings.
 
     Returns:
         Indexing result which contains information about how many documents
@@ -508,11 +519,14 @@ async def aindex(
         uids: list[str] = []
         docs_to_index: list[Document] = []
         uids_to_refresh = []
 
+        seen_docs: Set[str] = set()
         for hashed_doc, doc_exists in zip(hashed_docs, exists_batch):
             if doc_exists:
-                uids_to_refresh.append(hashed_doc.uid)
-                continue
+                if force_update:
+                    seen_docs.add(hashed_doc.uid)
+                else:
+                    uids_to_refresh.append(hashed_doc.uid)
+                    continue
             uids.append(hashed_doc.uid)
             docs_to_index.append(hashed_doc.to_document())
 
@@ -525,7 +539,8 @@ async def aindex(
         # First write to vector store
         if docs_to_index:
             await vector_store.aadd_documents(docs_to_index, ids=uids)
-            num_added += len(docs_to_index)
+            num_added += len(docs_to_index) - len(seen_docs)
+            num_updated += len(seen_docs)
 
         # And only then update the record store.
         # Update ALL records, even if they already exist since we want to refresh
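A hedged sketch of the new `force_update` flag in use (my example; `docs`, `record_manager`, and `vector_store` come from elsewhere):

```python
from langchain.indexes import index

# Re-run indexing after swapping embedding models: force_update re-embeds
# documents whose hashes are already recorded instead of skipping them.
result = index(
    docs,
    record_manager,
    vector_store,
    cleanup="incremental",
    source_id_key="source",
    force_update=True,
)
# The re-embedded docs are counted under "num_updated" rather than "num_skipped".
```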
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "langchain"
-version = "0.0.350"
+version = "0.0.351"
 description = "Building applications with LLMs through composability"
 authors = []
 license = "MIT"
@@ -157,8 +157,8 @@ optional = true
 # developers from being able to easily run them.
 # Instead write unit tests that use the `responses` library or mock.patch with
 # fixtures. Keep the fixtures minimal.
-# See CONTRIBUTING.md for more instructions on working with optional dependencies.
-# https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md#working-with-optional-dependencies
+# See the Contributing Guide for more instructions on working with optional dependencies.
+# https://python.langchain.com/docs/contributing/code#working-with-optional-dependencies
 pytest-vcr = "^1.0.2"
 wrapt = "^1.15.0"
 openai = "^1"
@@ -1,125 +1,3 @@
 # Langchain Tests
 
-## Unit Tests
-
-Unit tests cover modular logic that does not require calls to outside APIs.
-If you add new logic, please add a unit test.
-
-To run unit tests:
-
-```bash
-make test
-```
-
-To run unit tests in Docker:
-
-```bash
-make docker_tests
-```
-
-## Integration Tests
-
-Integration tests cover logic that requires making calls to outside APIs (often integration with other services).
-If you add support for a new external API, please add a new integration test.
-
-**warning** Almost no tests should be integration tests.
-
-Tests that require making network connections make it difficult for other
-developers to test the code.
-
-Instead favor relying on `responses` library and/or mock.patch to mock
-requests using small fixtures.
-
-To install dependencies for integration tests:
-
-```bash
-poetry install --with test_integration
-```
-
-To run integration tests:
-
-```bash
-make integration_tests
-```
-
-### Prepare
-
-The integration tests exercise several search engines and databases. The tests
-aim to verify the correct behavior of the engines and databases according to
-their specifications and requirements.
-
-To run some integration tests, such as tests located in
-`tests/integration_tests/vectorstores/`, you will need to install the following
-software:
-
-- Docker
-- Python 3.8.1 or later
-
-Any new dependencies should be added by running:
-
-```bash
-# add package and install it after adding:
-poetry add tiktoken@latest --group "test_integration" && poetry install --with test_integration
-```
-
-Before running any tests, you should start a specific Docker container that has all the
-necessary dependencies installed. For instance, we use the `elasticsearch.yml` container
-for `test_elasticsearch.py`:
-
-```bash
-cd tests/integration_tests/vectorstores/docker-compose
-docker-compose -f elasticsearch.yml up
-```
-
-For environments that requires more involving preparation, look for `*.sh`. For instance,
-`opensearch.sh` builds a required docker image and then launch opensearch.
-
-
-### Prepare environment variables for local testing:
-
-- copy `tests/integration_tests/.env.example` to `tests/integration_tests/.env`
-- set variables in `tests/integration_tests/.env` file, e.g `OPENAI_API_KEY`
-
-Additionally, it's important to note that some integration tests may require certain
-environment variables to be set, such as `OPENAI_API_KEY`. Be sure to set any required
-environment variables before running the tests to ensure they run correctly.
-
-### Recording HTTP interactions with pytest-vcr
-
-Some of the integration tests in this repository involve making HTTP requests to
-external services. To prevent these requests from being made every time the tests are
-run, we use pytest-vcr to record and replay HTTP interactions.
-
-When running tests in a CI/CD pipeline, you may not want to modify the existing
-cassettes. You can use the --vcr-record=none command-line option to disable recording
-new cassettes. Here's an example:
-
-```bash
-pytest --log-cli-level=10 tests/integration_tests/vectorstores/test_pinecone.py --vcr-record=none
-pytest tests/integration_tests/vectorstores/test_elasticsearch.py --vcr-record=none
-```
-
-### Run some tests with coverage:
-
-```bash
-pytest tests/integration_tests/vectorstores/test_elasticsearch.py --cov=langchain --cov-report=html
-start "" htmlcov/index.html || open htmlcov/index.html
-```
-
-## Coverage
-
-Code coverage (i.e. the amount of code that is covered by unit tests) helps identify areas of the code that are potentially more or less brittle.
-
-Coverage requires the dependencies for integration tests:
-
-```bash
-poetry install --with test_integration
-```
-
-To get a report of current coverage, run the following:
-
-```bash
-make coverage
-```
+[This guide has moved to the docs](https://python.langchain.com/docs/contributing/testing)
```diff
@@ -58,9 +58,10 @@ class ToyLoader(BaseLoader):
 class InMemoryVectorStore(VectorStore):
     """In-memory implementation of VectorStore using a dictionary."""

-    def __init__(self) -> None:
+    def __init__(self, permit_upserts: bool = False) -> None:
         """Vector store interface for testing things in memory."""
         self.store: Dict[str, Document] = {}
+        self.permit_upserts = permit_upserts

     def delete(self, ids: Optional[Sequence[str]] = None, **kwargs: Any) -> None:
         """Delete the given documents from the store using their IDs."""
```
```diff
@@ -91,7 +92,7 @@ class InMemoryVectorStore(VectorStore):
             raise NotImplementedError("This is not implemented yet.")

         for _id, document in zip(ids, documents):
-            if _id in self.store:
+            if _id in self.store and not self.permit_upserts:
                 raise ValueError(
                     f"Document with uid {_id} already exists in the store."
                 )
```
```diff
@@ -115,7 +116,7 @@ class InMemoryVectorStore(VectorStore):
             raise NotImplementedError("This is not implemented yet.")

         for _id, document in zip(ids, documents):
-            if _id in self.store:
+            if _id in self.store and not self.permit_upserts:
                 raise ValueError(
                     f"Document with uid {_id} already exists in the store."
                 )
```
```diff
@@ -176,6 +177,12 @@ def vector_store() -> InMemoryVectorStore:
     return InMemoryVectorStore()


+@pytest.fixture
+def upserting_vector_store() -> InMemoryVectorStore:
+    """Vector store fixture."""
+    return InMemoryVectorStore(permit_upserts=True)
+
+
 def test_indexing_same_content(
     record_manager: SQLRecordManager, vector_store: InMemoryVectorStore
 ) -> None:
```
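Taken together, these hunks let a test opt in to upsert semantics: with `permit_upserts=True`, re-adding a document under an existing ID overwrites it instead of raising. A rough sketch of the behavior the new fixture enables, meant to be read alongside the diff; it assumes the loops above live in `add_documents` and that it runs inside the test module where `InMemoryVectorStore` is defined:

```python
# Illustrative sketch only; not an excerpt from the diff.
from langchain.schema import Document

store = InMemoryVectorStore(permit_upserts=True)

store.add_documents([Document(page_content="v1")], ids=["doc-1"])
# With permit_upserts=False this second call would raise
# ValueError("Document with uid doc-1 already exists in the store.");
# with permit_upserts=True it silently overwrites the stored document.
store.add_documents([Document(page_content="v2")], ids=["doc-1"])
assert store.store["doc-1"].page_content == "v2"
```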
```diff
@@ -1074,6 +1081,101 @@ async def test_abatch() -> None:
     assert [batch async for batch in batches] == [[0, 1], [2, 3], [4]]


+def test_indexing_force_update(
+    record_manager: SQLRecordManager, upserting_vector_store: VectorStore
+) -> None:
+    """Test indexing with force update."""
+    docs = [
+        Document(
+            page_content="This is a test document.",
+            metadata={"source": "1"},
+        ),
+        Document(
+            page_content="This is another document.",
+            metadata={"source": "2"},
+        ),
+        Document(
+            page_content="This is a test document.",
+            metadata={"source": "1"},
+        ),
+    ]
+
+    assert index(docs, record_manager, upserting_vector_store, cleanup="full") == {
+        "num_added": 2,
+        "num_deleted": 0,
+        "num_skipped": 0,
+        "num_updated": 0,
+    }
+
+    assert index(docs, record_manager, upserting_vector_store, cleanup="full") == {
+        "num_added": 0,
+        "num_deleted": 0,
+        "num_skipped": 2,
+        "num_updated": 0,
+    }
+
+    assert index(
+        docs, record_manager, upserting_vector_store, cleanup="full", force_update=True
+    ) == {
+        "num_added": 0,
+        "num_deleted": 0,
+        "num_skipped": 0,
+        "num_updated": 2,
+    }
+
+
+@pytest.mark.requires("aiosqlite")
+async def test_aindexing_force_update(
+    arecord_manager: SQLRecordManager, upserting_vector_store: VectorStore
+) -> None:
+    """Test indexing with force update."""
+    docs = [
+        Document(
+            page_content="This is a test document.",
+            metadata={"source": "1"},
+        ),
+        Document(
+            page_content="This is another document.",
+            metadata={"source": "2"},
+        ),
+        Document(
+            page_content="This is a test document.",
+            metadata={"source": "1"},
+        ),
+    ]
+
+    assert await aindex(
+        docs, arecord_manager, upserting_vector_store, cleanup="full"
+    ) == {
+        "num_added": 2,
+        "num_deleted": 0,
+        "num_skipped": 0,
+        "num_updated": 0,
+    }
+
+    assert await aindex(
+        docs, arecord_manager, upserting_vector_store, cleanup="full"
+    ) == {
+        "num_added": 0,
+        "num_deleted": 0,
+        "num_skipped": 2,
+        "num_updated": 0,
+    }
+
+    assert await aindex(
+        docs,
+        arecord_manager,
+        upserting_vector_store,
+        cleanup="full",
+        force_update=True,
+    ) == {
+        "num_added": 0,
+        "num_deleted": 0,
+        "num_skipped": 0,
+        "num_updated": 2,
+    }
+
+
 def test_compatible_vectorstore_documentation() -> None:
     """Test which vectorstores are compatible with the indexing API.
```
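For readers of this diff: `force_update=True` tells the indexing API to rewrite documents whose content hash is already recorded, which is why the third call above reports `num_updated: 2` (the two unique documents) rather than skipping them. A minimal usage sketch against the public API; `FakeEmbeddings` and `Chroma` are stand-ins, and any vector store supported by the indexing API (delete-by-id plus add-with-id) would do:

```python
# Illustrative sketch of the force_update flag; requires `chromadb` installed.
from langchain.embeddings import FakeEmbeddings
from langchain.indexes import SQLRecordManager, index
from langchain.schema import Document
from langchain.vectorstores import Chroma

record_manager = SQLRecordManager(
    "chroma/demo", db_url="sqlite:///record_manager_cache.sql"
)
record_manager.create_schema()

vector_store = Chroma(
    collection_name="demo", embedding_function=FakeEmbeddings(size=16)
)

docs = [Document(page_content="hello", metadata={"source": "1"})]

# First pass adds the document; an identical second pass would skip it.
print(index(docs, record_manager, vector_store, cleanup="full"))
# force_update=True rewrites it even though its content hash is unchanged,
# so the result reports num_updated instead of num_skipped.
print(index(docs, record_manager, vector_store, cleanup="full", force_update=True))
```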
````diff
@@ -57,7 +57,7 @@ add_routes(app, pirate_speak_chain, path="/pirate-speak")
 ```

 You can now edit the template you pulled down.
-You can change the code files in `package/pirate-speak` to use a different model, different prompt, different logic.
+You can change the code files in `packages/pirate-speak` to use a different model, different prompt, different logic.
 Note that the above code snippet always expects the final chain to be importable as `from pirate_speak.chain import chain`,
 so you should either keep the structure of the package similar enough to respect that or be prepared to update that code snippet.
````
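For context on the snippet this hunk references: the template's chain is mounted on a FastAPI app via LangServe, which is why the import path must stay stable when you restructure the package. A hedged sketch of that server file, assuming the standard `langchain app new` layout (the run block is illustrative):

```python
# Sketch of the server file the docs snippet refers to (assumed layout).
from fastapi import FastAPI
from langserve import add_routes

# This import is the contract the docs call out: keep the package structure
# such that the final chain is importable exactly like this, or update it.
from pirate_speak.chain import chain as pirate_speak_chain

app = FastAPI()
add_routes(app, pirate_speak_chain, path="/pirate-speak")

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8000)
```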