Compare commits


1 Commit

Author      SHA1         Message   Date
Dev 2049    1aca66459b   wip       2023-05-26 02:53:40 -07:00
4528 changed files with 448978 additions and 410698 deletions

.devcontainer/Dockerfile (new file, +42 lines)

@@ -0,0 +1,42 @@
# This is a Dockerfile for the developer container
# Use the Python base image
ARG VARIANT="3.11-bullseye"
FROM mcr.microsoft.com/vscode/devcontainers/python:0-${VARIANT} AS langchain-dev-base
USER vscode
# Define the version of Poetry to install (default is 1.4.2)
# Define the directory of python virtual environment
ARG PYTHON_VIRTUALENV_HOME=/home/vscode/langchain-py-env \
POETRY_VERSION=1.4.2
ENV POETRY_VIRTUALENVS_IN_PROJECT=false \
POETRY_NO_INTERACTION=true
# Create a Python virtual environment for Poetry and install it
RUN python3 -m venv ${PYTHON_VIRTUALENV_HOME} && \
$PYTHON_VIRTUALENV_HOME/bin/pip install --upgrade pip && \
$PYTHON_VIRTUALENV_HOME/bin/pip install poetry==${POETRY_VERSION}
ENV PATH="$PYTHON_VIRTUALENV_HOME/bin:$PATH" \
VIRTUAL_ENV=$PYTHON_VIRTUALENV_HOME
# Setup for bash
RUN poetry completions bash >> /home/vscode/.bash_completion && \
echo "export PATH=$PYTHON_VIRTUALENV_HOME/bin:$PATH" >> ~/.bashrc
# Set the working directory for the app
WORKDIR /workspaces/langchain
# Use a multi-stage build to install dependencies
FROM langchain-dev-base AS langchain-dev-dependencies
ARG PYTHON_VIRTUALENV_HOME
# Copy only the dependency files for installation
COPY pyproject.toml poetry.lock poetry.toml ./
# Install the Poetry dependencies (this layer will be cached as long as the dependencies don't change)
RUN poetry install --no-interaction --no-ansi --with dev,test,docs
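For reference, the image above can also be built manually (a sketch, not part of the commit; run it from the repository root so the build context contains the `pyproject.toml`, `poetry.lock`, and `poetry.toml` files the Dockerfile copies):

```bash
# build the dev image defined above; the tag name is illustrative
docker build -f .devcontainer/Dockerfile -t langchain-dev .
```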

.devcontainer/README.md (deleted)

@@ -1,49 +0,0 @@
# Dev container
This project includes a [dev container](https://containers.dev/), which lets you use a container as a full-featured dev environment.
You can use the dev container configuration in this folder to build and run the app without needing to install any of its tools locally! You can use it in [GitHub Codespaces](https://github.com/features/codespaces) or the [VS Code Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers).
## GitHub Codespaces
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/langchain-ai/langchain)
You may use the button above, or follow these steps to open this repo in a Codespace:
1. Click the **Code** drop-down menu at the top of <https://github.com/langchain-ai/langchain>.
1. Click on the **Codespaces** tab.
1. Click **Create codespace on master**.
For more info, check out the [GitHub documentation](https://docs.github.com/en/free-pro-team@latest/github/developing-online-with-codespaces/creating-a-codespace#creating-a-codespace).
## VS Code Dev Containers
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
> [!NOTE]
> If you click the link above you will open the main repo (`langchain-ai/langchain`) and *not* your local cloned repo. This is fine if you only want to run and test the library, but if you want to contribute you can use the link below and replace with your username and cloned repo name:
```txt
https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/<YOUR_USERNAME>/<YOUR_CLONED_REPO_NAME>
```
Then you will have a local cloned repo where you can contribute and then create pull requests.
If you already have VS Code and Docker installed, you can use the button above to get started. This will use VSCode to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.
Alternatively you can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:
1. If this is your first time using a development container, please ensure your system meets the pre-reqs (i.e. have Docker installed) in the [getting started steps](https://aka.ms/vscode-remote/containers/getting-started).
2. Open a locally cloned copy of the code:
- Fork and Clone this repository to your local filesystem.
- Press <kbd>F1</kbd> and select the **Dev Containers: Open Folder in Container...** command.
- Select the cloned copy of this folder, wait for the container to start, and try things out!
You can learn more in the [Dev Containers documentation](https://code.visualstudio.com/docs/devcontainers/containers).
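If you prefer the terminal, the [Dev Containers CLI](https://github.com/devcontainers/cli) can start the same container (a sketch; the CLI is an assumption here, installed separately with `npm install -g @devcontainers/cli`):

```bash
# from the root of your cloned fork
devcontainer up --workspace-folder .
```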
## Tips and tricks
- If you are working with the same repository folder in a container and Windows, you'll want consistent line endings (otherwise you may see hundreds of changes in the SCM view). The `.gitattributes` file in the root of this repo will disable line ending conversion and should prevent this. See [tips and tricks](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files) for more info.
- If you'd like to review the contents of the image used in this dev container, you can check it out in the [devcontainers/images](https://github.com/devcontainers/images/tree/main/src/python) repo.

.devcontainer/devcontainer.json (changed)

@@ -1,58 +1,33 @@
// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-docker-compose
// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-dockerfile
{
// Name for the dev container
"name": "langchain",
// Point to a Docker Compose file
"dockerComposeFile": "./docker-compose.yaml",
// Required when using Docker Compose. The name of the service to connect to once running
"service": "langchain",
// The optional 'workspaceFolder' property is the path VS Code should open by default when
// connected. This is typically a file mount in .devcontainer/docker-compose.yml
"workspaceFolder": "/workspaces/langchain",
"mounts": [
"source=langchain-workspaces,target=/workspaces/langchain,type=volume"
],
// Prevent the container from shutting down
"overrideCommand": true,
// Features to add to the dev container. More info: https://containers.dev/features
"features": {
"ghcr.io/devcontainers/features/git:1": {},
"ghcr.io/devcontainers/features/github-cli:1": {}
},
"containerEnv": {
"UV_LINK_MODE": "copy"
},
// Use 'forwardPorts' to make a list of ports inside the container available locally.
// "forwardPorts": [],
// Run commands after the container is created
"postCreateCommand": "uv sync && echo 'LangChain (Python) dev environment ready!'",
// Configure tool-specific properties.
"customizations": {
"vscode": {
"extensions": [
"ms-python.python",
"ms-python.debugpy",
"ms-python.mypy-type-checker",
"ms-python.isort",
"unifiedjs.vscode-mdx",
"davidanson.vscode-markdownlint",
"ms-toolsai.jupyter",
"GitHub.copilot",
"GitHub.copilot-chat"
],
"settings": {
"python.defaultInterpreterPath": ".venv/bin/python",
"python.formatting.provider": "none",
"[python]": {
"editor.formatOnSave": true,
"editor.codeActionsOnSave": {
"source.organizeImports": true
}
}
}
}
}
// Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.
// "remoteUser": "root"
"dockerComposeFile": "./docker-compose.yaml",
"service": "langchain",
"workspaceFolder": "/workspaces/langchain",
"name": "langchain",
"customizations": {
"vscode": {
"extensions": [
"ms-python.python"
],
"settings": {
"python.defaultInterpreterPath": "/home/vscode/langchain-py-env/bin/python3.11"
}
}
},
// Features to add to the dev container. More info: https://containers.dev/features.
"features": {},
// Use 'forwardPorts' to make a list of ports inside the container available locally.
// "forwardPorts": [],
// Uncomment the next line to run commands after the container is created.
// "postCreateCommand": "cat /etc/os-release",
// Uncomment to connect as an existing user other than the container default. More info: https://aka.ms/dev-containers-non-root.
// "remoteUser": "devcontainer"
"remoteUser": "vscode",
"overrideCommand": true
}

.devcontainer/docker-compose.yaml (changed)

@@ -2,12 +2,30 @@ version: '3'
services:
langchain:
build:
dockerfile: libs/langchain/dev.Dockerfile
context: ..
dockerfile: .devcontainer/Dockerfile
context: ../
volumes:
- ../:/workspaces/langchain
networks:
- langchain-network
- langchain-network
# environment:
# MONGO_ROOT_USERNAME: root
# MONGO_ROOT_PASSWORD: example123
# depends_on:
# - mongo
# mongo:
# image: mongo
# restart: unless-stopped
# environment:
# MONGO_INITDB_ROOT_USERNAME: root
# MONGO_INITDB_ROOT_PASSWORD: example123
# ports:
# - "27017:27017"
# networks:
# - langchain-network
networks:
langchain-network:
driver: bridge
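The same service can also be started without VS Code (a sketch; run it from the `.devcontainer/` directory, since the compose file resolves its build context relative to itself):

```bash
# build the image and start the dev service in the background
docker compose up --build -d langchain
```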

.editorconfig (deleted)

@@ -1,52 +0,0 @@
# top-most EditorConfig file
root = true
# All files
[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
# Python files
[*.py]
indent_style = space
indent_size = 4
max_line_length = 88
# JSON files
[*.json]
indent_style = space
indent_size = 2
# YAML files
[*.{yml,yaml}]
indent_style = space
indent_size = 2
# Markdown files
[*.md]
indent_style = space
indent_size = 2
trim_trailing_whitespace = false
# Configuration files
[*.{toml,ini,cfg}]
indent_style = space
indent_size = 4
# Shell scripts
[*.sh]
indent_style = space
indent_size = 2
# Makefile
[Makefile]
indent_style = tab
indent_size = 4
# Jupyter notebooks
[*.ipynb]
# Jupyter may include trailing whitespace in cell
# outputs that's semantically meaningful
trim_trailing_whitespace = false

.gitattributes (vendored, deleted, -3 lines)

@@ -1,3 +0,0 @@
* text=auto eol=lf
*.{cmd,[cC][mM][dD]} text eol=crlf
*.{bat,[bB][aA][tT]} text eol=crlf

.github/CODEOWNERS (vendored, deleted, -3 lines)

@@ -1,3 +0,0 @@
/.github/ @baskaryan @ccurme @eyurtsev
/libs/core/ @eyurtsev
/libs/partners/ @ccurme @mdrxy

.github/CODE_OF_CONDUCT.md (deleted)

@@ -1,132 +0,0 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
conduct@langchain.dev.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations

.github/CONTRIBUTING.md (changed)

@@ -1,6 +1,206 @@
# Contributing to LangChain
Hi there! Thank you for even being interested in contributing to LangChain.
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether they involve new features, improved infrastructure, better documentation, or bug fixes.
As an open source project in a rapidly developing field, we are extremely open
to contributions, whether they be in the form of new features, improved infra, better documentation, or bug fixes.
## 🗺️ Guidelines
### 👩‍💻 Contributing Code
To contribute to this project, please follow a ["fork and pull request"](https://docs.github.com/en/get-started/quickstart/contributing-to-projects) workflow.
Please do not try to push directly to this repo unless you are a maintainer.
Please follow the checked-in pull request template when opening pull requests. Note related issues and tag relevant
maintainers.
Pull requests cannot land without passing the formatting, linting and testing checks first. See
[Common Tasks](#-common-tasks) for how to run these checks locally.
It's essential that we maintain great documentation and testing. If you:
- Fix a bug
- Add a relevant unit or integration test when possible. These live in `tests/unit_tests` and `tests/integration_tests`.
- Make an improvement
- Update any affected example notebooks and documentation. These live in `docs`.
- Update unit and integration tests when relevant.
- Add a feature
- Add a demo notebook in `docs/modules`.
- Add unit and integration tests.
We're a small, building-oriented team. If there's something you'd like to add or change, opening a pull request is the
best way to get our attention.
### 🚩 GitHub Issues
Our [issues](https://github.com/hwchase17/langchain/issues) page is kept up to date
with bugs, improvements, and feature requests.
There is a taxonomy of labels to help with sorting and discovery of issues of interest. Please use these to help
organize issues.
If you start working on an issue, please assign it to yourself.
If you are adding an issue, please try to keep it focused on a single, modular bug/improvement/feature.
If two issues are related, or blocking, please link them rather than combining them.
We will try to keep these issues as up to date as possible, though
with the rapid rate of development in this field some may get out of date.
If you notice this happening, please let us know.
### 🙋 Getting Help
Our goal is to have the simplest developer setup possible. Should you experience any difficulty getting set up, please
contact a maintainer! Not only do we want to help get you unblocked, but we also want to make sure that the process is
smooth for future contributors.
In a similar vein, we do enforce certain linting, formatting, and documentation standards in the codebase.
If you are finding these difficult (or even just annoying) to work with, feel free to contact a maintainer for help -
we do not want these to get in the way of getting good code into the codebase.
## 🚀 Quick Start
This project uses [Poetry](https://python-poetry.org/) as a dependency manager. Check out Poetry's [documentation on how to install it](https://python-poetry.org/docs/#installation) on your system before proceeding.
❗Note: If you use `Conda` or `Pyenv` as your environment / package manager, avoid dependency conflicts by doing the following first:
1. *Before installing Poetry*, create and activate a new Conda env (e.g. `conda create -n langchain python=3.9`)
2. Install Poetry (see above)
3. Tell Poetry to use the virtualenv python environment (`poetry config virtualenvs.prefer-active-python true`)
4. Continue with the following steps (see the sketch below).
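A consolidated sketch of steps 1-4, assuming bash and Conda (`pipx` stands in for whichever documented Poetry install method you choose):

```bash
conda create -n langchain python=3.9
conda activate langchain
pipx install poetry          # or any other documented install method
poetry config virtualenvs.prefer-active-python true
```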
To install requirements:
```bash
poetry install -E all
```
This will install all requirements for running the package, examples, linting, formatting, tests, and coverage. Note the `-E all` flag will install all optional dependencies necessary for integration testing.
❗Note: If you're running Poetry 1.4.1 and receive a `WheelFileValidationError` for `debugpy` during installation, you can try either downgrading to Poetry 1.4.0 or disabling "modern installation" (`poetry config installer.modern-installation false`) and re-install requirements. See [this `debugpy` issue](https://github.com/microsoft/debugpy/issues/1246) for more details.
Now, you should be able to run the common tasks in the following section. To double-check, run `make test`; all tests should pass. If they don't, you may need to `pip install` additional dependencies such as `numexpr` and `openapi_schema_pydantic`.
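If `make test` does fail with missing imports, the workaround mentioned above looks like this (a sketch; it installs into the Poetry-managed environment):

```bash
poetry run pip install numexpr openapi_schema_pydantic
```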
## ✅ Common Tasks
Type `make` for a list of common tasks.
### Code Formatting
Formatting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/) and [isort](https://pycqa.github.io/isort/).
To run formatting for this project:
```bash
make format
```
### Linting
Linting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/), [isort](https://pycqa.github.io/isort/), [flake8](https://flake8.pycqa.org/en/latest/), and [mypy](http://mypy-lang.org/).
To run linting for this project:
```bash
make lint
```
We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.
### Coverage
Code coverage (i.e. the amount of code that is covered by unit tests) helps identify areas of the code that are potentially more or less brittle.
To get a report of current coverage, run the following:
```bash
make coverage
```
### Testing
Unit tests cover modular logic that does not require calls to outside APIs.
To run unit tests:
```bash
make test
```
To run unit tests in Docker:
```bash
make docker_tests
```
If you add new logic, please add a unit test.
Integration tests cover logic that requires making calls to outside APIs (often integration with other services).
To run integration tests:
```bash
make integration_tests
```
If you add support for a new external API, please add a new integration test.
### Adding a Jupyter Notebook
If you are adding a Jupyter notebook example, you'll want to install the optional `dev` dependencies.
To install dev dependencies:
```bash
poetry install --with dev
```
Launch a notebook:
```bash
poetry run jupyter notebook
```
When you run `poetry install`, the `langchain` package is installed as editable in the virtualenv, so your new logic can be imported into the notebook.
## Documentation
### Contribute Documentation
Docs are largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code.
For that reason, we ask that you add good documentation to all classes and methods.
Similar to linting, we recognize documentation can be annoying. If you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.
### Build Documentation Locally
Before building the documentation, it is always a good idea to clean the build directory:
```bash
make docs_clean
```
Next, you can run the linkchecker to make sure all links are valid:
```bash
make docs_linkcheck
```
Finally, you can build the documentation as outlined below:
```bash
make docs_build
```
## 🏭 Release Process
As of now, LangChain has an ad hoc release process: releases are cut with high frequency by
a developer and published to [PyPI](https://pypi.org/project/langchain/).
LangChain follows the [semver](https://semver.org/) versioning standard. However, as pre-1.0 software,
even patch releases may contain [non-backwards-compatible changes](https://semver.org/#spec-item-4).
### 🌟 Recognition
If your contribution has made its way into a release, we will want to give you credit on Twitter (only if you want though)!
If you have a Twitter account you would like us to mention, please let us know in the PR or in another manner.
To learn how to contribute to LangChain, please follow the [contribution guide here](https://docs.langchain.com/oss/python/contributing).

.github/ISSUE_TEMPLATE/bug-report.yml (changed)

@@ -1,142 +1,106 @@
name: "\U0001F41B Bug Report"
description: Report a bug in LangChain. To report a security issue, please instead use the security option below. For questions, please use the LangChain forum.
labels: ["bug"]
type: bug
description: Submit a bug report to help us improve LangChain
labels: ["02 Bug Report"]
body:
- type: markdown
attributes:
value: |
Thank you for taking the time to file a bug report.
value: >
Thank you for taking the time to file a bug report. Before creating a new
issue, please make sure to take a few moments to check the issue tracker
for existing issues about the bug.
For usage questions, feature requests and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
Check these before submitting to see if your issue has already been reported, fixed or if there's another way to solve your problem:
* [Documentation](https://docs.langchain.com/oss/python/langchain/overview),
* [API Reference Documentation](https://reference.langchain.com/python/),
* [LangChain ChatBot](https://chat.langchain.com/)
* [GitHub search](https://github.com/langchain-ai/langchain),
* [LangChain Forum](https://forum.langchain.com/),
- type: checkboxes
id: checks
- type: textarea
id: system-info
attributes:
label: Checked other resources
description: Please confirm and check all the following options.
options:
- label: This is a bug, not a usage question.
required: true
- label: I added a clear and descriptive title that summarizes this issue.
required: true
- label: I used the GitHub search to find a similar question and didn't find it.
required: true
- label: I am sure that this is a bug in LangChain rather than my code.
required: true
- label: The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
required: true
- label: This is not related to the langchain-community package.
required: true
- label: I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.
required: true
- type: checkboxes
id: package
label: System Info
description: Please share your system info with us.
placeholder: LangChain version, platform, python version, ...
validations:
required: true
- type: textarea
id: who-can-help
attributes:
label: Package (Required)
label: Who can help?
description: |
Which `langchain` package(s) is this bug related to? Select at least one.
Your issue will be replied to more quickly if you can figure out the right person to tag with @
If you know how to use git blame, that is the easiest way, otherwise, here is a rough guide of **who to tag**.
Note that if the package you are reporting for is not listed here, it is not in this repository (e.g. `langchain-google-genai` is in [`langchain-ai/langchain-google`](https://github.com/langchain-ai/langchain-google/)).
The core maintainers strive to read all issues, but tagging them will help them prioritize.
Please report issues for other packages to their respective repositories.
Please tag fewer than 3 people.
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoader Abstractions
- @eyurtsev
LLM/Chat Wrappers
- @hwchase17
- @agola11
Tools / Toolkits
- @vowelparrot
placeholder: "@Username ..."
- type: checkboxes
id: information-scripts-examples
attributes:
label: Information
description: "The problem arises when using:"
options:
- label: langchain
- label: langchain-openai
- label: langchain-anthropic
- label: langchain-classic
- label: langchain-core
- label: langchain-cli
- label: langchain-model-profiles
- label: langchain-tests
- label: langchain-text-splitters
- label: langchain-chroma
- label: langchain-deepseek
- label: langchain-exa
- label: langchain-fireworks
- label: langchain-groq
- label: langchain-huggingface
- label: langchain-mistralai
- label: langchain-nomic
- label: langchain-ollama
- label: langchain-perplexity
- label: langchain-prompty
- label: langchain-qdrant
- label: langchain-xai
- label: Other / not sure / general
- label: "The official example notebooks/scripts"
- label: "My own modified scripts"
- type: checkboxes
id: related-components
attributes:
label: Related Components
description: "Select the components related to the issue (if applicable):"
options:
- label: "LLMs/Chat Models"
- label: "Embedding Models"
- label: "Prompts / Prompt Templates / Prompt Selectors"
- label: "Output Parsers"
- label: "Document Loaders"
- label: "Vector Stores / Retrievers"
- label: "Memory"
- label: "Agents / Agent Executors"
- label: "Tools / Toolkits"
- label: "Chains"
- label: "Callbacks/Tracing"
- label: "Async"
- type: textarea
id: reproduction
validations:
required: true
attributes:
label: Example Code (Python)
label: Reproduction
description: |
Please add a self-contained, [minimal, reproducible, example](https://stackoverflow.com/help/minimal-reproducible-example) with your use case.
Please provide a [code sample](https://stackoverflow.com/help/minimal-reproducible-example) that reproduces the problem you ran into. It can be a Colab link or just a code snippet.
If you have code snippets, error messages, stack traces please provide them here as well.
Important! Use code tags to correctly format your code. See https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting
Avoid screenshots when possible, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
If a maintainer can copy it, run it, and see it right away, there's a much higher chance that you'll be able to get help.
**Important!**
* Avoid screenshots, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
* Reduce your code to the minimum required to reproduce the issue if possible.
(This will be automatically formatted into code, so no need for backticks.)
render: python
placeholder: |
from langchain_core.runnables import RunnableLambda
Steps to reproduce the behavior:
def bad_code(inputs) -> int:
raise NotImplementedError('For demo purpose')
1.
2.
3.
chain = RunnableLambda(bad_code)
chain.invoke('Hello!')
- type: textarea
attributes:
label: Error Message and Stack Trace (if applicable)
description: |
If you are reporting an error, please copy and paste the full error message and
stack trace.
(This will be automatically formatted into code, so no need for backticks.)
render: shell
- type: textarea
id: description
attributes:
label: Description
description: |
What is the problem, question, or error?
Write a short description telling what you are doing, what you expect to happen, and what is currently happening.
placeholder: |
* I'm trying to use the `langchain` library to do X.
* I expect to see Y.
* Instead, it does Z.
id: expected-behavior
validations:
required: true
- type: textarea
id: system-info
attributes:
label: System Info
description: |
Please share your system info with us.
Run the following command in your terminal and paste the output here:
`python -m langchain_core.sys_info`
or if you have an existing python interpreter running:
```python
from langchain_core import sys_info
sys_info.print_sys_info()
```
placeholder: |
python -m langchain_core.sys_info
validations:
required: true
label: Expected behavior
description: "A clear and concise description of what you would expect to happen."

.github/ISSUE_TEMPLATE/config.yml (changed)

@@ -1,18 +1,6 @@
blank_issues_enabled: false
blank_issues_enabled: true
version: 2.1
contact_links:
- name: 📚 Documentation issue
url: https://github.com/langchain-ai/docs/issues/new?template=01-langchain.yml
about: Report an issue related to the LangChain documentation
- name: 💬 LangChain Forum
url: https://forum.langchain.com/
about: General community discussions and support
- name: 📚 LangChain Documentation
url: https://docs.langchain.com/oss/python/langchain/overview
about: View the official LangChain documentation
- name: 📚 API Reference Documentation
url: https://reference.langchain.com/python/
about: View the official LangChain API reference documentation
- name: 💬 LangChain Forum
url: https://forum.langchain.com/
about: Ask questions and get help from the community
- name: Discord
url: https://discord.gg/6adMQxSpJS
about: General community discussions

.github/ISSUE_TEMPLATE/documentation.yml (new file, +19 lines)

@@ -0,0 +1,19 @@
name: Documentation
description: Report an issue related to the LangChain documentation.
title: "DOC: <Please write a comprehensive title after the 'DOC: ' prefix>"
labels: [03 - Documentation]
body:
- type: textarea
attributes:
label: "Issue with current documentation:"
description: >
Please make sure to leave a reference to the document/code you're
referring to.
- type: textarea
attributes:
label: "Idea or request for content:"
description: >
Please describe as clearly as possible what topics you think are missing
from the current documentation.

.github/ISSUE_TEMPLATE/feature-request.yml (changed)

@@ -1,152 +1,30 @@
name: " Feature Request"
description: Request a new feature or enhancement for LangChain. For questions, please use the LangChain forum.
labels: ["feature request"]
type: feature
name: "\U0001F680 Feature request"
description: Submit a proposal/request for a new LangChain feature
labels: ["02 Feature Request"]
body:
- type: markdown
attributes:
value: |
Thank you for taking the time to request a new feature.
Use this to request NEW FEATURES or ENHANCEMENTS in LangChain. For bug reports, please use the bug report template. For usage questions and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
Relevant links to check before filing a feature request to see if your request has already been made or
if there's another way to achieve what you want:
* [Documentation](https://docs.langchain.com/oss/python/langchain/overview),
* [API Reference Documentation](https://reference.langchain.com/python/),
* [LangChain ChatBot](https://chat.langchain.com/)
* [GitHub search](https://github.com/langchain-ai/langchain),
* [LangChain Forum](https://forum.langchain.com/),
- type: checkboxes
id: checks
attributes:
label: Checked other resources
description: Please confirm and check all the following options.
options:
- label: This is a feature request, not a bug report or usage question.
required: true
- label: I added a clear and descriptive title that summarizes the feature request.
required: true
- label: I used the GitHub search to find a similar feature request and didn't find it.
required: true
- label: I checked the LangChain documentation and API reference to see if this feature already exists.
required: true
- label: This is not related to the langchain-community package.
required: true
- type: checkboxes
id: package
attributes:
label: Package (Required)
description: |
Which `langchain` package(s) is this request related to? Select at least one.
Note that if the package you are requesting is not listed here, it is not in this repository (e.g. `langchain-google-genai` is in `langchain-ai/langchain-google`).
Please submit feature requests for other packages to their respective repositories.
options:
- label: langchain
- label: langchain-openai
- label: langchain-anthropic
- label: langchain-classic
- label: langchain-core
- label: langchain-cli
- label: langchain-model-profiles
- label: langchain-tests
- label: langchain-text-splitters
- label: langchain-chroma
- label: langchain-deepseek
- label: langchain-exa
- label: langchain-fireworks
- label: langchain-groq
- label: langchain-huggingface
- label: langchain-mistralai
- label: langchain-nomic
- label: langchain-ollama
- label: langchain-perplexity
- label: langchain-prompty
- label: langchain-qdrant
- label: langchain-xai
- label: Other / not sure / general
- type: textarea
id: feature-description
id: feature-request
validations:
required: true
attributes:
label: Feature Description
label: Feature request
description: |
Please provide a clear and concise description of the feature you would like to see added to LangChain.
A clear and concise description of the feature proposal. Please provide links to any relevant GitHub repos, papers, or other resources if relevant.
What specific functionality are you requesting? Be as detailed as possible.
placeholder: |
I would like LangChain to support...
This feature would allow users to...
- type: textarea
id: use-case
id: motivation
validations:
required: true
attributes:
label: Use Case
label: Motivation
description: |
Describe the specific use case or problem this feature would solve.
Please outline the motivation for the proposal. Is your feature request related to a problem? e.g., I'm always frustrated when [...]. If this is related to another GitHub issue, please link here too.
Why do you need this feature? What problem does it solve for you or other users?
placeholder: |
I'm trying to build an application that...
Currently, I have to work around this by...
This feature would help me/users to...
- type: textarea
id: proposed-solution
id: contribution
validations:
required: false
required: true
attributes:
label: Proposed Solution
label: Your contribution
description: |
If you have ideas about how this feature could be implemented, please describe them here.
This is optional but can be helpful for maintainers to understand your vision.
placeholder: |
I think this could be implemented by...
The API could look like...
```python
# Example of how the feature might work
```
- type: textarea
id: alternatives
validations:
required: false
attributes:
label: Alternatives Considered
description: |
Have you considered any alternative solutions or workarounds?
What other approaches have you tried or considered?
placeholder: |
I've tried using...
Alternative approaches I considered:
1. ...
2. ...
But these don't work because...
- type: textarea
id: additional-context
validations:
required: false
attributes:
label: Additional Context
description: |
Add any other context, screenshots, examples, or references that would help explain your feature request.
placeholder: |
Related issues: #...
Similar features in other libraries:
- ...
Additional context or examples:
- ...
Is there any way that you could help, e.g. by submitting a PR? Make sure to read the CONTRIBUTING.md [readme](https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md)

.github/ISSUE_TEMPLATE/other.yml (vendored, new file, +18 lines)

@@ -0,0 +1,18 @@
name: Other Issue
description: Raise an issue that wouldn't be covered by the other templates.
title: "Issue: <Please write a comprehensive title after the 'Issue: ' prefix>"
labels: [04 - Other]
body:
- type: textarea
attributes:
label: "Issue you'd like to raise."
description: >
Please describe the issue you'd like to raise as clearly as possible.
Make sure to include any relevant links or references.
- type: textarea
attributes:
label: "Suggestion:"
description: >
Please outline a suggestion to improve the issue here.

.github/ISSUE_TEMPLATE/privileged.yml (deleted)

@@ -1,50 +0,0 @@
name: 🔒 Privileged
description: You are a LangChain maintainer, or were asked directly by a maintainer to create an issue here. If not, check the other options.
body:
- type: markdown
attributes:
value: |
If you are not a LangChain maintainer, employee, or were not asked directly by a maintainer to create an issue, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead.
- type: checkboxes
id: privileged
attributes:
label: Privileged issue
description: Confirm that you are allowed to create an issue here.
options:
- label: I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create an issue here.
required: true
- type: textarea
id: content
attributes:
label: Issue Content
description: Add the content of the issue here.
- type: checkboxes
id: package
attributes:
label: Package (Required)
description: |
Please select package(s) that this issue is related to.
options:
- label: langchain
- label: langchain-openai
- label: langchain-anthropic
- label: langchain-classic
- label: langchain-core
- label: langchain-cli
- label: langchain-model-profiles
- label: langchain-tests
- label: langchain-text-splitters
- label: langchain-chroma
- label: langchain-deepseek
- label: langchain-exa
- label: langchain-fireworks
- label: langchain-groq
- label: langchain-huggingface
- label: langchain-mistralai
- label: langchain-nomic
- label: langchain-ollama
- label: langchain-perplexity
- label: langchain-prompty
- label: langchain-qdrant
- label: langchain-xai
- label: Other / not sure / general

.github/ISSUE_TEMPLATE/task.yml (deleted)

@@ -1,121 +0,0 @@
name: "📋 Task"
description: Create a task for project management and tracking by LangChain maintainers. If you are not a maintainer, please use other templates or the forum.
labels: ["task"]
type: task
body:
- type: markdown
attributes:
value: |
Thanks for creating a task to help organize LangChain development.
This template is for **maintainer tasks** such as project management, development planning, refactoring, documentation updates, and other organizational work.
If you are not a LangChain maintainer or were not asked directly by a maintainer to create a task, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead or use the appropriate bug report or feature request templates on the previous page.
- type: checkboxes
id: maintainer
attributes:
label: Maintainer task
description: Confirm that you are allowed to create a task here.
options:
- label: I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create a task here.
required: true
- type: textarea
id: task-description
attributes:
label: Task Description
description: |
Provide a clear and detailed description of the task.
What needs to be done? Be specific about the scope and requirements.
placeholder: |
This task involves...
The goal is to...
Specific requirements:
- ...
- ...
validations:
required: true
- type: textarea
id: acceptance-criteria
attributes:
label: Acceptance Criteria
description: |
Define the criteria that must be met for this task to be considered complete.
What are the specific deliverables or outcomes expected?
placeholder: |
This task will be complete when:
- [ ] ...
- [ ] ...
- [ ] ...
validations:
required: true
- type: textarea
id: context
attributes:
label: Context and Background
description: |
Provide any relevant context, background information, or links to related issues/PRs.
Why is this task needed? What problem does it solve?
placeholder: |
Background:
- ...
Related issues/PRs:
- #...
Additional context:
- ...
validations:
required: false
- type: textarea
id: dependencies
attributes:
label: Dependencies
description: |
List any dependencies or blockers for this task.
Are there other tasks, issues, or external factors that need to be completed first?
placeholder: |
This task depends on:
- [ ] Issue #...
- [ ] PR #...
- [ ] External dependency: ...
Blocked by:
- ...
validations:
required: false
- type: checkboxes
id: package
attributes:
label: Package (Required)
description: |
Please select package(s) that this task is related to.
options:
- label: langchain
- label: langchain-openai
- label: langchain-anthropic
- label: langchain-classic
- label: langchain-core
- label: langchain-cli
- label: langchain-model-profiles
- label: langchain-tests
- label: langchain-text-splitters
- label: langchain-chroma
- label: langchain-deepseek
- label: langchain-exa
- label: langchain-fireworks
- label: langchain-groq
- label: langchain-huggingface
- label: langchain-mistralai
- label: langchain-nomic
- label: langchain-ollama
- label: langchain-perplexity
- label: langchain-prompty
- label: langchain-qdrant
- label: langchain-xai
- label: Other / not sure / general

.github/PULL_REQUEST_TEMPLATE.md (changed)

@@ -1,28 +1,46 @@
(Replace this entire block of text)
# Your PR Title (What it does)
Thank you for contributing to LangChain! Follow these steps to mark your pull request as ready for review. **If any of these steps are not completed, your PR will not be considered for review.**
<!--
Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution.
- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
- Examples:
- feat(core): add multi-tenant support
- fix(cli): resolve flag parsing error
- docs(openai): update API usage examples
- Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert, release
- Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, text-splitters, docs, anthropic, chroma, deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama, openai, perplexity, prompty, qdrant, xai, infra
- Once you've written the title, please delete this checklist item; do not include it in the PR.
Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change.
- [ ] **PR message**: ***Delete this entire checklist*** and replace with
- **Description:** a description of the change. Include a [closing keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) if applicable to a relevant issue.
- **Issue:** the issue # it fixes, if applicable (e.g. Fixes #123)
- **Dependencies:** any dependencies required for this change
After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost.
-->
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. **We will not consider a PR unless these three are passing in CI.** See [contribution guidelines](https://docs.langchain.com/oss/python/contributing) for more.
<!-- Remove if not applicable -->
Additional guidelines:
Fixes # (issue)
- Most PRs should not touch more than one package.
- Please do not add dependencies to `pyproject.toml` files (even optional ones) unless they are **required** for unit tests. Likewise, please do not update the `uv.lock` files unless you are adding a required dependency.
- Changes should be backwards compatible.
- Make sure optional dependencies are imported within a function.
## Before submitting
<!-- If you're adding a new integration, include an integration test and an example notebook showing its use! -->
## Who can review?
Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:
<!-- For a quicker response, figure out the right person to tag with @
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoaders
- @eyurtsev
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @vowelparrot
VectorStores / Retrievers / Memory
- @dev2049
-->

.github/actions/poetry_setup/action.yml (vendored, new file, +76 lines)

@@ -0,0 +1,76 @@
# An action for setting up poetry install with caching.
# Using a custom action since the default action does not
# take poetry install groups into account.
# Action code from:
# https://github.com/actions/setup-python/issues/505#issuecomment-1273013236
name: poetry-install-with-caching
description: Poetry install with support for caching of dependency groups.
inputs:
python-version:
description: Python version, supporting MAJOR.MINOR only
required: true
poetry-version:
description: Poetry version
required: true
install-command:
description: Command run for installing dependencies
required: false
default: poetry install
cache-key:
description: Cache key to use for manual handling of caching
required: true
working-directory:
description: Directory to run install-command in
required: false
default: ""
runs:
using: composite
steps:
- uses: actions/setup-python@v4
name: Setup python ${{ inputs.python-version }}
with:
python-version: ${{ inputs.python-version }}
- uses: actions/cache@v3
id: cache-pip
name: Cache Pip ${{ inputs.python-version }}
env:
SEGMENT_DOWNLOAD_TIMEOUT_MIN: "15"
with:
path: |
~/.cache/pip
key: pip-${{ runner.os }}-${{ runner.arch }}-py-${{ inputs.python-version }}
- run: pipx install poetry==${{ inputs.poetry-version }} --python python${{ inputs.python-version }}
shell: bash
- name: Check Poetry File
shell: bash
run: |
poetry check
- name: Check lock file
shell: bash
run: |
poetry lock --check
- uses: actions/cache@v3
id: cache-poetry
env:
SEGMENT_DOWNLOAD_TIMEOUT_MIN: "15"
with:
path: |
~/.cache/pypoetry/virtualenvs
~/.cache/pypoetry/cache
~/.cache/pypoetry/artifacts
key: poetry-${{ runner.os }}-${{ runner.arch }}-py-${{ inputs.python-version }}-poetry-${{ inputs.poetry-version }}-${{ inputs.cache-key }}-${{ hashFiles('poetry.lock') }}
- run: ${{ inputs.install-command }}
working-directory: ${{ inputs.working-directory }}
shell: bash

.github/actions/uv_setup/action.yml (deleted)

@@ -1,39 +0,0 @@
# Helper to set up Python and uv with caching
name: uv-install
description: Set up Python and uv with caching
inputs:
python-version:
description: Python version, supporting MAJOR.MINOR only
required: true
enable-cache:
description: Enable caching for uv dependencies
required: false
default: "true"
cache-suffix:
description: Custom cache key suffix for cache invalidation
required: false
default: ""
working-directory:
description: Working directory for cache glob scoping
required: false
default: "**"
env:
UV_VERSION: "0.5.25"
runs:
using: composite
steps:
- name: Install uv and set the python version
uses: astral-sh/setup-uv@v6
with:
version: ${{ env.UV_VERSION }}
python-version: ${{ inputs.python-version }}
enable-cache: ${{ inputs.enable-cache }}
cache-dependency-glob: |
${{ inputs.working-directory }}/pyproject.toml
${{ inputs.working-directory }}/uv.lock
${{ inputs.working-directory }}/requirements*.txt
cache-suffix: ${{ inputs.cache-suffix }}

.github/copilot-instructions.md (deleted)

@@ -1,330 +0,0 @@
# Global Development Guidelines for LangChain Projects
## Core Development Principles
### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL
**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**
**Bad - Breaking Change:**
```python
def get_user(id, verbose=False): # Changed from `user_id`
pass
```
**Good - Stable Interface:**
```python
def get_user(user_id: str, verbose: bool = False) -> User:
"""Retrieve user by ID with optional verbose output."""
pass
```
**Before making ANY changes to public APIs:**
- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
- Mark experimental features clearly with docstring admonitions (using MkDocs Material, like `!!! warning`)
🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?"
### 2. Code Quality Standards
**All Python code MUST include type hints and return types.**
**Bad:**
```python
def p(u, d):
return [x for x in u if x not in d]
```
**Good:**
```python
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
"""Filter out users that are not in the known users set.
Args:
users: List of user identifiers to filter.
known_users: Set of known/valid user identifiers.
Returns:
List of users that are not in the known_users set.
"""
return [user for user in users if user not in known_users]
```
**Style Requirements:**
- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
- Avoid unnecessary abstraction or premature optimization
- Follow existing patterns in the codebase you're modifying
### 3. Testing Requirements
**Every new feature or bugfix MUST be covered by unit tests.**
**Test Organization:**
- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests: `tests/integration_tests/` (network calls permitted)
- Use `pytest` as the testing framework
**Test Quality Checklist:**
- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)
Checklist questions:
- [ ] Does the test suite fail if your new logic is broken?
- [ ] Are all expected behaviors exercised (happy path, invalid input, etc)?
- [ ] Do tests use fixtures or mocks where needed?
```python
def test_filter_unknown_users():
"""Test filtering unknown users from a list."""
users = ["alice", "bob", "charlie"]
known_users = {"alice", "bob"}
result = filter_unknown_users(users, known_users)
assert result == ["charlie"]
assert len(result) == 1
```
### 4. Security and Risk Assessment
**Security Checklist:**
- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Proper exception handling (no bare `except:`); build error messages in a `msg` variable before raising
- Remove unreachable/commented code before committing
- Watch for race conditions and resource leaks (file handles, sockets, threads)
- Ensure proper resource cleanup (file handles, connections)
**Bad:**
```python
def load_config(path):
with open(path) as f:
return eval(f.read()) # ⚠️ Never eval config
```
**Good:**
```python
import json
def load_config(path: str) -> dict:
with open(path) as f:
return json.load(f)
```
### 5. Documentation Standards
**Use Google-style docstrings with Args and Returns sections for all public functions.**
**Insufficient Documentation:**
```python
def send_email(to, msg):
"""Send an email to a recipient."""
```
**Complete Documentation:**
```python
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
"""
Send an email to a recipient with specified priority.
Args:
to: The email address of the recipient.
msg: The message body to send.
priority: Email priority level.
Returns:
True if email was sent successfully, False otherwise.
Raises:
InvalidEmailError: If the email address format is invalid.
SMTPConnectionError: If unable to connect to email server.
"""
```
**Documentation Guidelines:**
- Types go in function signatures, NOT in docstrings
- Focus on "why" rather than "what" in descriptions
- Document all parameters, return values, and exceptions
- Keep descriptions concise but clear
📌 *Tip:* Keep descriptions concise but clear. Only document return values if non-obvious.
### 6. Architectural Improvements
**When you encounter code that could be improved, suggest better designs:**
**Poor Design:**
```python
def process_data(data, db_conn, email_client, logger):
# Function doing too many things
validated = validate_data(data)
result = db_conn.save(validated)
email_client.send_notification(result)
logger.log(f"Processed {len(data)} items")
return result
```
**Better Design:**
```python
@dataclass
class ProcessingResult:
"""Result of data processing operation."""
items_processed: int
success: bool
errors: List[str] = field(default_factory=list)
class DataProcessor:
"""Handles data validation, storage, and notification."""
def __init__(self, db_conn: Database, email_client: EmailClient):
self.db = db_conn
self.email = email_client
def process(self, data: List[dict]) -> ProcessingResult:
"""Process and store data with notifications.
Args:
data: List of data items to process.
Returns:
ProcessingResult with details of the operation.
"""
validated = self._validate_data(data)
result = self.db.save(validated)
self._notify_completion(result)
return result
```
**Design Improvement Areas:**
If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would:
- Reduce code duplication through shared utilities
- Improve separation of concerns (single responsibility)
- Make unit testing easier through dependency injection
- Add clarity without adding complexity
- Prefer dataclasses for structured data
## Development Tools & Commands
### Package Management
```bash
# Add package
uv add package-name
# Sync project dependencies
uv sync
uv lock
```
### Testing
```bash
# Run unit tests (no network)
make test
# Don't run integration tests, as API keys must be set
# Run specific test file
uv run --group test pytest tests/unit_tests/test_specific.py
```
### Code Quality
```bash
# Lint code
make lint
# Format code
make format
# Type checking
uv run --group lint mypy .
```
### Dependency Management Patterns
**Local Development Dependencies:**
```toml
[tool.uv.sources]
langchain-core = { path = "../core", editable = true }
langchain-tests = { path = "../standard-tests", editable = true }
```
**For tools, use the `@tool` decorator from `langchain_core.tools`:**
```python
from langchain_core.tools import tool
@tool
def search_database(query: str) -> str:
"""Search the database for relevant information.
Args:
query: The search query string.
"""
# Implementation goes here; a placeholder result is shown for illustration
results = f"No matches found for: {query}"
return results
```
## Commit Standards
**Use Conventional Commits format for PR titles:**
- `feat(core): add multi-tenant support`
- `fix(cli)!: resolve flag parsing error` (a breaking change adds an exclamation mark after the type/scope)
- `docs: update API usage examples`
- `docs(openai): update API usage examples`
## Framework-Specific Guidelines
- Follow the existing patterns in `langchain_core` for base abstractions
- Implement proper streaming support where applicable
- Avoid deprecated components
### Partner Integrations
- Follow the established patterns in existing partner libraries
- Implement standard interfaces (`BaseChatModel`, `BaseEmbeddings`, etc.; a sketch follows this list)
- Include comprehensive integration tests
- Document API key requirements and authentication
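For a chat model, the minimum surface is a `_generate` method and an `_llm_type` property on a `BaseChatModel` subclass. A minimal sketch; `ChatExampleProvider`, its `model` default, and the echo behavior are hypothetical placeholders for a real provider client:
```python
from typing import Any, List, Optional

from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import AIMessage, BaseMessage
from langchain_core.outputs import ChatGeneration, ChatResult


class ChatExampleProvider(BaseChatModel):
    """Skeleton chat model integration (hypothetical provider)."""

    model: str = "example-model-v1"

    @property
    def _llm_type(self) -> str:
        return "chat-example-provider"

    def _generate(
        self,
        messages: List[BaseMessage],
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> ChatResult:
        # A real integration would call the provider's API here;
        # this stub just echoes the last message back.
        text = str(messages[-1].content)
        return ChatResult(
            generations=[ChatGeneration(message=AIMessage(content=text))]
        )
```
The standard test suites from `langchain-tests` can then be run against such a class to verify interface compliance.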
---
## Quick Reference Checklist
Before submitting code changes:
- [ ] **Breaking Changes**: Verified no public API changes
- [ ] **Type Hints**: All functions have complete type annotations
- [ ] **Tests**: New functionality is fully tested
- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.)
- [ ] **Documentation**: Google-style docstrings for public functions
- [ ] **Code Quality**: `make lint` and `make format` pass
- [ ] **Architecture**: Suggested improvements where applicable
- [ ] **Commit Message**: Follows Conventional Commits format

View File

@@ -1,11 +0,0 @@
# Please see the documentation for all configuration options:
# https://docs.github.com/github/administering-a-repository/configuration-options-for-dependency-updates
# and
# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
version: 2
updates:
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"

View File

@@ -1,25 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<svg id="Layer_1" data-name="Layer 1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1584.81 250">
<defs>
<style>
.cls-1 {
fill: #1c3c3c;
stroke-width: 0px;
}
</style>
</defs>
<g id="LanChain-logo">
<g id="LangChain-logotype">
<polygon class="cls-1" points="596.33 49.07 596.33 200.67 700.76 200.67 700.76 177.78 620.04 177.78 620.04 49.07 596.33 49.07"/>
<path class="cls-1" d="M1126.83,49.07c-20.53,0-37.95,7.4-50.38,21.41-12.32,13.88-18.82,33.36-18.82,56.33,0,47.23,27.25,77.75,69.41,77.75,29.71,0,52.71-15.54,61.54-41.56l2.14-6.31-23.53-8.94-2.17,7.03c-5.26,17.01-18.75,26.38-37.99,26.38-27.48,0-44.55-20.82-44.55-54.34s17.23-54.34,44.97-54.34c19.23,0,30.31,7.54,35.95,24.44l2.46,7.37,22.91-10.75-2.1-5.9c-8.96-25.22-29.65-38.56-59.85-38.56Z"/>
<path class="cls-1" d="M756.43,85.05c-22.76,0-39.78,10.67-46.69,29.27-.44,1.19-1.77,4.78-1.77,4.78l19.51,12.62,2.65-6.91c4.52-11.78,12.88-17.27,26.3-17.27s21.1,6.51,20.96,19.33c0,.52-.04,2.09-.04,2.09,0,0-17.76,2.88-25.08,4.43-31.23,6.6-44.31,18.52-44.31,38.02,0,10.39,5.77,21.64,16.3,27.95,6.32,3.78,14.57,5.21,23.68,5.21,5.99,0,11.81-.89,17.2-2.53,12.25-4.07,15.67-12.07,15.67-12.07v10.46h20.29v-74.78c0-25.42-16.7-40.6-44.67-40.6ZM777.46,164.85c0,7.86-8.56,18.93-28.5,18.93-5.63,0-9.62-1.49-12.28-3.71-3.56-2.97-4.73-7.24-4.24-11.01.21-1.64,1.2-5.17,4.87-8.23,3.75-3.13,10.38-5.37,20.62-7.6,8.42-1.83,19.54-3.85,19.54-3.85v15.48Z"/>
<path class="cls-1" d="M876.11,85.04c-2.82,0-5.57.2-8.24.57-18.17,2.73-23.49,11.96-23.49,11.96l.02-9.31h-22.74s0,112.19,0,112.19h23.71v-62.18c0-21.13,15.41-30.75,29.73-30.75,15.48,0,23,8.32,23,25.45v67.48h23.71v-70.74c0-27.56-17.51-44.67-45.69-44.67Z"/>
<path class="cls-1" d="M1539.12,85.04c-2.82,0-5.57.2-8.24.57-18.17,2.73-23.49,11.96-23.49,11.96v-9.32h-22.72v112.2h23.71v-62.18c0-21.13,15.41-30.75,29.73-30.75,15.48,0,23,8.32,23,25.45v67.48h23.71v-70.74c0-27.56-17.51-44.67-45.69-44.67Z"/>
<path class="cls-1" d="M1020.76,88.26v11.55s-5.81-14.77-32.24-14.77c-32.84,0-53.24,22.66-53.24,59.15,0,20.59,6.58,36.8,18.19,47.04,9.03,7.96,21.09,12.04,35.45,12.32,9.99.19,16.46-2.53,20.5-5.1,7.76-4.94,10.64-9.63,10.64-9.63,0,0-.33,3.67-.93,8.64-.43,3.6-1.24,6.13-1.24,6.13h0c-3.61,12.85-14.17,20.28-29.57,20.28s-24.73-5.07-26.58-15.06l-23.05,6.88c3.98,19.2,22,30.66,48.2,30.66,17.81,0,31.77-4.84,41.5-14.4,9.81-9.64,14.79-23.53,14.79-41.29v-102.41h-22.42ZM1019.26,145.21c0,22.44-10.96,35.84-29.32,35.84-19.67,0-30.95-13.44-30.95-36.86s11.28-36.66,30.95-36.66c17.92,0,29.15,13.34,29.32,34.82v2.86Z"/>
<path class="cls-1" d="M1259.01,85.04c-2.6,0-5.13.17-7.59.49-17.88,2.79-23.14,11.9-23.14,11.9v-2.67h-.01s0-45.69,0-45.69h-23.71v151.39h23.71v-62.18c0-21.27,15.41-30.95,29.73-30.95,15.48,0,23,8.32,23,25.45v67.68h23.71v-70.94c0-27.01-17.94-44.47-45.69-44.47Z"/>
<circle class="cls-1" cx="1450.93" cy="64.47" r="15.37"/>
<path class="cls-1" d="M1439.14,88.2v56.94h0c-6.75-5.56-14.6-9.75-23.5-12.26v-7.23c0-25.42-16.7-40.6-44.67-40.6-22.76,0-39.78,10.67-46.69,29.27-.44,1.19-1.77,4.78-1.77,4.78l19.51,12.62,2.65-6.91c4.52-11.78,12.88-17.27,26.3-17.27s21.1,6.51,20.96,19.33c0,.08,0,1.15,0,2.86-10.04-.28-19.38.69-27.77,2.66,0,0,0,0,0,0-11.06,2.5-31.6,8.85-38.94,25.36-.05.11-1.13,2.96-1.13,2.96-1.06,3.28-1.59,6.84-1.59,10.7,0,10.39,5.77,21.64,16.3,27.95,6.32,3.78,14.57,5.21,23.68,5.21,5.88,0,11.6-.86,16.91-2.44,12.49-4.04,15.96-12.16,15.96-12.16v10.47h20.29v-34.27c-5.7-3.56-14.26-5.66-23.65-5.64,0,2.65,0,4.33,0,4.33,0,7.86-8.56,18.93-28.5,18.93-5.63,0-9.62-1.49-12.28-3.71-3.56-2.97-4.73-7.24-4.24-11.01.21-1.64,1.2-5.17,4.87-8.23l-.04-.11c8.42-6.89,24.97-9.64,40.17-9.04v.03c12.94.47,22.62,3.01,29.53,7.77,1.88,1.19,3.65,2.52,5.28,3.98,6.94,6.23,9.73,13.9,10.93,18.38,1.95,7.31,1.43,18.57,1.43,18.57h23.59v-112.2h-23.59Z"/>
</g>
<path id="LangChain-symbol" class="cls-1" d="M393.52,75.2c9.66,9.66,9.66,25.38,0,35.04l-21.64,21.29-.22-1.22c-1.58-8.75-5.74-16.69-12.02-22.97-4.73-4.72-10.32-8.21-16.62-10.37-3.91,3.93-6.06,9.08-6.06,14.5,0,1.1.1,2.24.3,3.38,3.47,1.25,6.54,3.18,9.12,5.76,9.66,9.66,9.66,25.38,0,35.04l-18.84,18.84c-4.83,4.83-11.17,7.24-17.52,7.24s-12.69-2.41-17.52-7.24c-9.66-9.66-9.66-25.38,0-35.04l21.64-21.28.22,1.22c1.57,8.73,5.73,16.67,12.03,22.96,4.74,4.74,9.99,7.89,16.28,10.04l1.16-1.16c3.52-3.52,5.45-8.2,5.45-13.19,0-1.11-.1-2.22-.29-3.31-3.63-1.2-6.62-2.91-9.34-5.63-3.92-3.92-6.36-8.93-7.04-14.48-.05-.4-.08-.79-.12-1.19-.54-7.23,2.07-14.29,7.16-19.37l18.84-18.84c4.67-4.67,10.89-7.25,17.52-7.25s12.85,2.57,17.52,7.25ZM491.9,125c0,68.93-56.08,125-125,125H125C56.08,250,0,193.93,0,125S56.08,0,125,0h241.9c68.93,0,125,56.08,125,125ZM240.9,187.69c1.97-2.39-7.13-9.12-8.99-11.59-3.78-4.1-3.8-10-6.35-14.79-6.24-14.46-13.41-28.81-23.44-41.05-10.6-13.39-23.68-24.47-35.17-37.04-8.53-8.77-10.81-21.26-18.34-30.69-10.38-15.33-43.2-19.51-48.01,2.14.02.68-.19,1.11-.78,1.54-2.66,1.93-5.03,4.14-7.02,6.81-4.87,6.78-5.62,18.28.46,24.37.2-3.21.31-6.24,2.85-8.54,4.7,4.03,11.8,5.46,17.25,2.45,12.04,17.19,9.04,40.97,18.6,59.49,2.64,4.38,5.3,8.85,8.69,12.69,2.75,4.28,12.25,9.33,12.81,13.29.1,6.8-.7,14.23,3.76,19.92,2.1,4.26-3.06,8.54-7.22,8.01-5.4.74-11.99-3.63-16.72-.94-1.67,1.81-4.94-.19-6.38,2.32-.5,1.3-3.2,3.13-1.59,4.38,1.79-1.36,3.45-2.78,5.86-1.97-.36,1.96,1.19,2.24,2.42,2.81-.04,1.33-.82,2.69.2,3.82,1.19-1.2,1.9-2.9,3.79-3.4,6.28,8.37,12.67-8.47,26.26-.89-2.76-.14-5.21.21-7.07,2.48-.46.51-.85,1.11-.04,1.77,7.33-4.73,7.29,1.62,12.05-.33,3.66-1.91,7.3-4.3,11.65-3.62-4.23,1.22-4.4,4.62-6.88,7.49-.42.44-.62.94-.13,1.67,8.78-.74,9.5-3.66,16.59-7.24,5.29-3.23,10.56,4.6,15.14.14,1.01-.97,2.39-.64,3.64-.77-1.6-8.53-19.19,1.56-18.91-9.88,5.66-3.85,4.36-11.22,4.74-17.17,6.51,3.61,13.75,5.71,20.13,9.16,3.22,5.2,8.27,12.07,15,11.62.18-.52.34-.98.53-1.51,2.04.35,4.66,1.7,5.78-.88,3.05,3.19,7.53,3.03,11.52,2.21,2.95-2.4-5.55-5.82-6.69-8.29ZM419.51,92.72c0-11.64-4.52-22.57-12.73-30.78-8.21-8.21-19.14-12.73-30.79-12.73s-22.58,4.52-30.79,12.73l-18.84,18.84c-4.4,4.4-7.74,9.57-9.93,15.36l-.13.33-.34.1c-6.84,2.11-12.87,5.73-17.92,10.78l-18.84,18.84c-16.97,16.98-16.97,44.6,0,61.57,8.21,8.21,19.14,12.73,30.78,12.73h0c11.64,0,22.58-4.52,30.79-12.73l18.84-18.84c4.38-4.38,7.7-9.53,9.89-15.31l.13-.33.34-.11c6.72-2.06,12.92-5.8,17.95-10.82l18.84-18.84c8.21-8.21,12.73-19.14,12.73-30.79ZM172.38,173.6c-1.62,6.32-2.15,17.09-10.37,17.4-.68,3.65,2.53,5.02,5.44,3.85,2.89-1.33,4.26,1.05,5.23,3.42,4.46.65,11.06-1.49,11.31-6.77-6.66-3.84-8.72-11.14-11.62-17.9Z"/>
</g>
</svg>


View File

@@ -1,25 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<svg id="Layer_1" data-name="Layer 1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1584.81 250">
<defs>
<style>
.cls-1 {
fill: #fff;
stroke-width: 0px;
}
</style>
</defs>
<g id="LanChain-logo">
<g id="LangChain-logotype">
<polygon class="cls-1" points="596.33 49.07 596.33 200.67 700.76 200.67 700.76 177.78 620.04 177.78 620.04 49.07 596.33 49.07"/>
<path class="cls-1" d="M1126.83,49.07c-20.53,0-37.95,7.4-50.38,21.41-12.32,13.88-18.82,33.36-18.82,56.33,0,47.23,27.25,77.75,69.41,77.75,29.71,0,52.71-15.54,61.54-41.56l2.14-6.31-23.53-8.94-2.17,7.03c-5.26,17.01-18.75,26.38-37.99,26.38-27.48,0-44.55-20.82-44.55-54.34s17.23-54.34,44.97-54.34c19.23,0,30.31,7.54,35.95,24.44l2.46,7.37,22.91-10.75-2.1-5.9c-8.96-25.22-29.65-38.56-59.85-38.56Z"/>
<path class="cls-1" d="M756.43,85.05c-22.76,0-39.78,10.67-46.69,29.27-.44,1.19-1.77,4.78-1.77,4.78l19.51,12.62,2.65-6.91c4.52-11.78,12.88-17.27,26.3-17.27s21.1,6.51,20.96,19.33c0,.52-.04,2.09-.04,2.09,0,0-17.76,2.88-25.08,4.43-31.23,6.6-44.31,18.52-44.31,38.02,0,10.39,5.77,21.64,16.3,27.95,6.32,3.78,14.57,5.21,23.68,5.21,5.99,0,11.81-.89,17.2-2.53,12.25-4.07,15.67-12.07,15.67-12.07v10.46h20.29v-74.78c0-25.42-16.7-40.6-44.67-40.6ZM777.46,164.85c0,7.86-8.56,18.93-28.5,18.93-5.63,0-9.62-1.49-12.28-3.71-3.56-2.97-4.73-7.24-4.24-11.01.21-1.64,1.2-5.17,4.87-8.23,3.75-3.13,10.38-5.37,20.62-7.6,8.42-1.83,19.54-3.85,19.54-3.85v15.48Z"/>
<path class="cls-1" d="M876.11,85.04c-2.82,0-5.57.2-8.24.57-18.17,2.73-23.49,11.96-23.49,11.96l.02-9.31h-22.74s0,112.19,0,112.19h23.71v-62.18c0-21.13,15.41-30.75,29.73-30.75,15.48,0,23,8.32,23,25.45v67.48h23.71v-70.74c0-27.56-17.51-44.67-45.69-44.67Z"/>
<path class="cls-1" d="M1539.12,85.04c-2.82,0-5.57.2-8.24.57-18.17,2.73-23.49,11.96-23.49,11.96v-9.32h-22.72v112.2h23.71v-62.18c0-21.13,15.41-30.75,29.73-30.75,15.48,0,23,8.32,23,25.45v67.48h23.71v-70.74c0-27.56-17.51-44.67-45.69-44.67Z"/>
<path class="cls-1" d="M1020.76,88.26v11.55s-5.81-14.77-32.24-14.77c-32.84,0-53.24,22.66-53.24,59.15,0,20.59,6.58,36.8,18.19,47.04,9.03,7.96,21.09,12.04,35.45,12.32,9.99.19,16.46-2.53,20.5-5.1,7.76-4.94,10.64-9.63,10.64-9.63,0,0-.33,3.67-.93,8.64-.43,3.6-1.24,6.13-1.24,6.13h0c-3.61,12.85-14.17,20.28-29.57,20.28s-24.73-5.07-26.58-15.06l-23.05,6.88c3.98,19.2,22,30.66,48.2,30.66,17.81,0,31.77-4.84,41.5-14.4,9.81-9.64,14.79-23.53,14.79-41.29v-102.41h-22.42ZM1019.26,145.21c0,22.44-10.96,35.84-29.32,35.84-19.67,0-30.95-13.44-30.95-36.86s11.28-36.66,30.95-36.66c17.92,0,29.15,13.34,29.32,34.82v2.86Z"/>
<path class="cls-1" d="M1259.01,85.04c-2.6,0-5.13.17-7.59.49-17.88,2.79-23.14,11.9-23.14,11.9v-2.67h-.01s0-45.69,0-45.69h-23.71v151.39h23.71v-62.18c0-21.27,15.41-30.95,29.73-30.95,15.48,0,23,8.32,23,25.45v67.68h23.71v-70.94c0-27.01-17.94-44.47-45.69-44.47Z"/>
<circle class="cls-1" cx="1450.93" cy="64.47" r="15.37"/>
<path class="cls-1" d="M1439.14,88.2v56.94h0c-6.75-5.56-14.6-9.75-23.5-12.26v-7.23c0-25.42-16.7-40.6-44.67-40.6-22.76,0-39.78,10.67-46.69,29.27-.44,1.19-1.77,4.78-1.77,4.78l19.51,12.62,2.65-6.91c4.52-11.78,12.88-17.27,26.3-17.27s21.1,6.51,20.96,19.33c0,.08,0,1.15,0,2.86-10.04-.28-19.38.69-27.77,2.66,0,0,0,0,0,0-11.06,2.5-31.6,8.85-38.94,25.36-.05.11-1.13,2.96-1.13,2.96-1.06,3.28-1.59,6.84-1.59,10.7,0,10.39,5.77,21.64,16.3,27.95,6.32,3.78,14.57,5.21,23.68,5.21,5.88,0,11.6-.86,16.91-2.44,12.49-4.04,15.96-12.16,15.96-12.16v10.47h20.29v-34.27c-5.7-3.56-14.26-5.66-23.65-5.64,0,2.65,0,4.33,0,4.33,0,7.86-8.56,18.93-28.5,18.93-5.63,0-9.62-1.49-12.28-3.71-3.56-2.97-4.73-7.24-4.24-11.01.21-1.64,1.2-5.17,4.87-8.23l-.04-.11c8.42-6.89,24.97-9.64,40.17-9.04v.03c12.94.47,22.62,3.01,29.53,7.77,1.88,1.19,3.65,2.52,5.28,3.98,6.94,6.23,9.73,13.9,10.93,18.38,1.95,7.31,1.43,18.57,1.43,18.57h23.59v-112.2h-23.59Z"/>
</g>
<path id="LangChain-symbol" class="cls-1" d="M393.52,75.2c9.66,9.66,9.66,25.38,0,35.04l-21.64,21.29-.22-1.22c-1.58-8.75-5.74-16.69-12.02-22.97-4.73-4.72-10.32-8.21-16.62-10.37-3.91,3.93-6.06,9.08-6.06,14.5,0,1.1.1,2.24.3,3.38,3.47,1.25,6.54,3.18,9.12,5.76,9.66,9.66,9.66,25.38,0,35.04l-18.84,18.84c-4.83,4.83-11.17,7.24-17.52,7.24s-12.69-2.41-17.52-7.24c-9.66-9.66-9.66-25.38,0-35.04l21.64-21.28.22,1.22c1.57,8.73,5.73,16.67,12.03,22.96,4.74,4.74,9.99,7.89,16.28,10.04l1.16-1.16c3.52-3.52,5.45-8.2,5.45-13.19,0-1.11-.1-2.22-.29-3.31-3.63-1.2-6.62-2.91-9.34-5.63-3.92-3.92-6.36-8.93-7.04-14.48-.05-.4-.08-.79-.12-1.19-.54-7.23,2.07-14.29,7.16-19.37l18.84-18.84c4.67-4.67,10.89-7.25,17.52-7.25s12.85,2.57,17.52,7.25ZM491.9,125c0,68.93-56.08,125-125,125H125C56.08,250,0,193.93,0,125S56.08,0,125,0h241.9C435.82,0,491.9,56.08,491.9,125ZM240.9,187.69c1.97-2.39-7.13-9.12-8.99-11.59-3.78-4.1-3.8-10-6.35-14.79-6.24-14.46-13.41-28.81-23.44-41.05-10.6-13.39-23.68-24.47-35.17-37.04-8.53-8.77-10.81-21.26-18.34-30.69-10.38-15.33-43.2-19.51-48.01,2.14.02.68-.19,1.11-.78,1.54-2.66,1.93-5.03,4.14-7.02,6.81-4.87,6.78-5.62,18.28.46,24.37.2-3.21.31-6.24,2.85-8.54,4.7,4.03,11.8,5.46,17.25,2.45,12.04,17.19,9.04,40.97,18.6,59.49,2.64,4.38,5.3,8.85,8.69,12.69,2.75,4.28,12.25,9.33,12.81,13.29.1,6.8-.7,14.23,3.76,19.92,2.1,4.26-3.06,8.54-7.22,8.01-5.4.74-11.99-3.63-16.72-.94-1.67,1.81-4.94-.19-6.38,2.32-.5,1.3-3.2,3.13-1.59,4.38,1.79-1.36,3.45-2.78,5.86-1.97-.36,1.96,1.19,2.24,2.42,2.81-.04,1.33-.82,2.69.2,3.82,1.19-1.2,1.9-2.9,3.79-3.4,6.28,8.37,12.67-8.47,26.26-.89-2.76-.14-5.21.21-7.07,2.48-.46.51-.85,1.11-.04,1.77,7.33-4.73,7.29,1.62,12.05-.33,3.66-1.91,7.3-4.3,11.65-3.62-4.23,1.22-4.4,4.62-6.88,7.49-.42.44-.62.94-.13,1.67,8.78-.74,9.5-3.66,16.59-7.24,5.29-3.23,10.56,4.6,15.14.14,1.01-.97,2.39-.64,3.64-.77-1.6-8.53-19.19,1.56-18.91-9.88,5.66-3.85,4.36-11.22,4.74-17.17,6.51,3.61,13.75,5.71,20.13,9.16,3.22,5.2,8.27,12.07,15,11.62.18-.52.34-.98.53-1.51,2.04.35,4.66,1.7,5.78-.88,3.05,3.19,7.53,3.03,11.52,2.21,2.95-2.4-5.55-5.82-6.69-8.29ZM419.51,92.72c0-11.64-4.52-22.57-12.73-30.78-8.21-8.21-19.14-12.73-30.79-12.73s-22.58,4.52-30.79,12.73l-18.84,18.84c-4.4,4.4-7.74,9.57-9.93,15.36l-.13.33-.34.1c-6.84,2.11-12.87,5.73-17.92,10.78l-18.84,18.84c-16.97,16.98-16.97,44.6,0,61.57,8.21,8.21,19.14,12.73,30.78,12.73h0c11.64,0,22.58-4.52,30.79-12.73l18.84-18.84c4.38-4.38,7.7-9.53,9.89-15.31l.13-.33.34-.11c6.72-2.06,12.92-5.8,17.95-10.82l18.84-18.84c8.21-8.21,12.73-19.14,12.73-30.79ZM172.38,173.6c-1.62,6.32-2.15,17.09-10.37,17.4-.68,3.65,2.53,5.02,5.44,3.85,2.89-1.33,4.26,1.05,5.23,3.42,4.46.65,11.06-1.49,11.31-6.77-6.66-3.84-8.72-11.14-11.62-17.9Z"/>
</g>
</svg>


View File

@@ -1,163 +0,0 @@
# Label PRs (config)
# Automatically applies labels based on changed files and branch patterns
# Core packages
core:
- changed-files:
- any-glob-to-any-file:
- "libs/core/**/*"
langchain-classic:
- changed-files:
- any-glob-to-any-file:
- "libs/langchain/**/*"
langchain:
- changed-files:
- any-glob-to-any-file:
- "libs/langchain_v1/**/*"
cli:
- changed-files:
- any-glob-to-any-file:
- "libs/cli/**/*"
standard-tests:
- changed-files:
- any-glob-to-any-file:
- "libs/standard-tests/**/*"
model-profiles:
- changed-files:
- any-glob-to-any-file:
- "libs/model-profiles/**/*"
text-splitters:
- changed-files:
- any-glob-to-any-file:
- "libs/text-splitters/**/*"
# Partner integrations
integration:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/**/*"
anthropic:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/anthropic/**/*"
chroma:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/chroma/**/*"
deepseek:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/deepseek/**/*"
exa:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/exa/**/*"
fireworks:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/fireworks/**/*"
groq:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/groq/**/*"
huggingface:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/huggingface/**/*"
mistralai:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/mistralai/**/*"
nomic:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/nomic/**/*"
ollama:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/ollama/**/*"
openai:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/openai/**/*"
perplexity:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/perplexity/**/*"
prompty:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/prompty/**/*"
qdrant:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/qdrant/**/*"
xai:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/xai/**/*"
# Infrastructure and DevOps
infra:
- changed-files:
- any-glob-to-any-file:
- ".github/**/*"
- "Makefile"
- ".pre-commit-config.yaml"
- "scripts/**/*"
- "docker/**/*"
- "Dockerfile*"
github_actions:
- changed-files:
- any-glob-to-any-file:
- ".github/workflows/**/*"
- ".github/actions/**/*"
dependencies:
- changed-files:
- any-glob-to-any-file:
- "**/pyproject.toml"
- "uv.lock"
- "**/requirements*.txt"
- "**/poetry.lock"
# Documentation
documentation:
- changed-files:
- any-glob-to-any-file:
- "**/*.md"
- "**/*.rst"
- "**/README*"
# Security related changes
security:
- changed-files:
- any-glob-to-any-file:
- "**/*security*"
- "**/*auth*"
- "**/*credential*"
- "**/*secret*"
- "**/*token*"
- ".github/workflows/security*"

View File

@@ -1,340 +0,0 @@
"""Analyze git diffs to determine which directories need to be tested.
Intelligently determines which LangChain packages and directories need to be tested,
linted, or built based on the changes. Handles dependency relationships between
packages, maps file changes to appropriate CI job configurations, and outputs JSON
configurations for GitHub Actions.
- Maps changed files to affected package directories (libs/core, libs/partners/*, etc.)
- Builds dependency graph to include dependent packages when core components change
- Generates test matrix configurations with appropriate Python versions
- Handles special cases for Pydantic version testing and performance benchmarks
Used as part of the check_diffs workflow.
"""
import glob
import json
import os
import sys
from collections import defaultdict
from pathlib import Path
from typing import Dict, List, Set
import tomllib
from get_min_versions import get_min_version_from_toml
from packaging.requirements import Requirement
LANGCHAIN_DIRS = [
"libs/core",
"libs/text-splitters",
"libs/langchain",
"libs/langchain_v1",
"libs/model-profiles",
]
# When set to True, we are ignoring core dependents
# in order to be able to get CI to pass for each individual
# package that depends on core
# e.g. if you touch core, we don't then add textsplitters/etc to CI
IGNORE_CORE_DEPENDENTS = False
# ignored partners are removed from dependents
# but still run if directly edited
IGNORED_PARTNERS = [
# remove huggingface from dependents because of CI instability
# specifically in huggingface jobs
# https://github.com/langchain-ai/langchain/issues/25558
"huggingface",
# prompty exhibiting issues with numpy for Python 3.13
# https://github.com/langchain-ai/langchain/actions/runs/12651104685/job/35251034969?pr=29065
"prompty",
]
def all_package_dirs() -> Set[str]:
return {
"/".join(path.split("/")[:-1]).lstrip("./")
for path in glob.glob("./libs/**/pyproject.toml", recursive=True)
if "libs/cli" not in path and "libs/standard-tests" not in path
}
def dependents_graph() -> dict:
"""Construct a mapping of package -> dependents
Done such that we can run tests on all dependents of a package when a change is made.
"""
dependents = defaultdict(set)
for path in glob.glob("./libs/**/pyproject.toml", recursive=True):
if "template" in path:
continue
# load regular and test deps from pyproject.toml
with open(path, "rb") as f:
pyproject = tomllib.load(f)
pkg_dir = "libs" + "/".join(path.split("libs")[1].split("/")[:-1])
for dep in [
*pyproject["project"]["dependencies"],
*pyproject["dependency-groups"]["test"],
]:
requirement = Requirement(dep)
package_name = requirement.name
if "langchain" in dep:
dependents[package_name].add(pkg_dir)
continue
# load extended deps from extended_testing_deps.txt
package_path = Path(path).parent
extended_requirement_path = package_path / "extended_testing_deps.txt"
if extended_requirement_path.exists():
with open(extended_requirement_path, "r") as f:
extended_deps = f.read().splitlines()
for depline in extended_deps:
if depline.startswith("-e "):
# editable dependency
assert depline.startswith("-e ../partners/"), (
"Extended test deps should only editable install partner packages"
)
partner = depline.split("partners/")[1]
dep = f"langchain-{partner}"
else:
dep = depline.split("==")[0]
if "langchain" in dep:
dependents[dep].add(pkg_dir)
for k in dependents:
for partner in IGNORED_PARTNERS:
if f"libs/partners/{partner}" in dependents[k]:
dependents[k].remove(f"libs/partners/{partner}")
return dependents
def add_dependents(dirs_to_eval: Set[str], dependents: dict) -> List[str]:
updated = set()
for dir_ in dirs_to_eval:
# handle core manually because it has so many dependents
if "core" in dir_:
updated.add(dir_)
continue
pkg = "langchain-" + dir_.split("/")[-1]
updated.update(dependents[pkg])
updated.add(dir_)
return list(updated)
def _get_configs_for_single_dir(job: str, dir_: str) -> List[Dict[str, str]]:
if job == "test-pydantic":
return _get_pydantic_test_configs(dir_)
if job == "codspeed":
py_versions = ["3.13"]
elif dir_ == "libs/core":
py_versions = ["3.10", "3.11", "3.12", "3.13", "3.14"]
# custom logic for specific directories
elif dir_ in {"libs/partners/chroma"}:
py_versions = ["3.10", "3.13"]
else:
py_versions = ["3.10", "3.14"]
return [{"working-directory": dir_, "python-version": py_v} for py_v in py_versions]
def _get_pydantic_test_configs(
dir_: str, *, python_version: str = "3.12"
) -> List[Dict[str, str]]:
with open("./libs/core/uv.lock", "rb") as f:
core_uv_lock_data = tomllib.load(f)
for package in core_uv_lock_data["package"]:
if package["name"] == "pydantic":
core_max_pydantic_minor = package["version"].split(".")[1]
break
with open(f"./{dir_}/uv.lock", "rb") as f:
dir_uv_lock_data = tomllib.load(f)
for package in dir_uv_lock_data["package"]:
if package["name"] == "pydantic":
dir_max_pydantic_minor = package["version"].split(".")[1]
break
core_min_pydantic_version = get_min_version_from_toml(
"./libs/core/pyproject.toml", "release", python_version, include=["pydantic"]
)["pydantic"]
core_min_pydantic_minor = (
core_min_pydantic_version.split(".")[1]
if "." in core_min_pydantic_version
else "0"
)
dir_min_pydantic_version = get_min_version_from_toml(
f"./{dir_}/pyproject.toml", "release", python_version, include=["pydantic"]
).get("pydantic", "0.0.0")
dir_min_pydantic_minor = (
dir_min_pydantic_version.split(".")[1]
if "." in dir_min_pydantic_version
else "0"
)
max_pydantic_minor = min(
int(dir_max_pydantic_minor),
int(core_max_pydantic_minor),
)
min_pydantic_minor = max(
int(dir_min_pydantic_minor),
int(core_min_pydantic_minor),
)
configs = [
{
"working-directory": dir_,
"pydantic-version": f"2.{v}.0",
"python-version": python_version,
}
for v in range(min_pydantic_minor, max_pydantic_minor + 1)
]
return configs
def _get_configs_for_multi_dirs(
job: str, dirs_to_run: Dict[str, Set[str]], dependents: dict
) -> List[Dict[str, str]]:
if job == "lint":
dirs = add_dependents(
dirs_to_run["lint"] | dirs_to_run["test"] | dirs_to_run["extended-test"],
dependents,
)
elif job in ["test", "compile-integration-tests", "dependencies", "test-pydantic"]:
dirs = add_dependents(
dirs_to_run["test"] | dirs_to_run["extended-test"], dependents
)
elif job == "extended-tests":
dirs = list(dirs_to_run["extended-test"])
elif job == "codspeed":
dirs = list(dirs_to_run["codspeed"])
else:
raise ValueError(f"Unknown job: {job}")
return [
config for dir_ in dirs for config in _get_configs_for_single_dir(job, dir_)
]
if __name__ == "__main__":
files = sys.argv[1:]
dirs_to_run: Dict[str, set] = {
"lint": set(),
"test": set(),
"extended-test": set(),
"codspeed": set(),
}
docs_edited = False
if len(files) >= 300:
# max diff length is 300 files - there are likely files missing
dirs_to_run["lint"] = all_package_dirs()
dirs_to_run["test"] = all_package_dirs()
dirs_to_run["extended-test"] = set(LANGCHAIN_DIRS)
for file in files:
if any(
file.startswith(dir_)
for dir_ in (
".github/workflows",
".github/tools",
".github/actions",
".github/scripts/check_diff.py",
)
):
# Infrastructure changes (workflows, actions, CI scripts) trigger tests on
# all core packages as a safety measure. This ensures that changes to CI/CD
# infrastructure don't inadvertently break package testing, even if the change
# appears unrelated (e.g., documentation build workflows). This is intentionally
# conservative to catch unexpected side effects from workflow modifications.
#
# Example: A PR modifying .github/workflows/api_doc_build.yml will trigger
# lint/test jobs for libs/core, libs/text-splitters, libs/langchain, and
# libs/langchain_v1, even though the workflow may only affect documentation.
dirs_to_run["extended-test"].update(LANGCHAIN_DIRS)
if file.startswith("libs/core"):
dirs_to_run["codspeed"].add("libs/core")
if any(file.startswith(dir_) for dir_ in LANGCHAIN_DIRS):
# add that dir and all dirs after in LANGCHAIN_DIRS
# for extended testing
found = False
for dir_ in LANGCHAIN_DIRS:
if dir_ == "libs/core" and IGNORE_CORE_DEPENDENTS:
dirs_to_run["extended-test"].add(dir_)
continue
if file.startswith(dir_):
found = True
if found:
dirs_to_run["extended-test"].add(dir_)
elif file.startswith("libs/standard-tests"):
# TODO: update to include all packages that rely on standard-tests (all partner packages)
# Note: won't run on external repo partners
dirs_to_run["lint"].add("libs/standard-tests")
dirs_to_run["test"].add("libs/standard-tests")
dirs_to_run["test"].add("libs/partners/mistralai")
dirs_to_run["test"].add("libs/partners/openai")
dirs_to_run["test"].add("libs/partners/anthropic")
dirs_to_run["test"].add("libs/partners/fireworks")
dirs_to_run["test"].add("libs/partners/groq")
elif file.startswith("libs/cli"):
dirs_to_run["lint"].add("libs/cli")
dirs_to_run["test"].add("libs/cli")
elif file.startswith("libs/partners"):
partner_dir = file.split("/")[2]
if os.path.isdir(f"libs/partners/{partner_dir}") and [
filename
for filename in os.listdir(f"libs/partners/{partner_dir}")
if not filename.startswith(".")
] != ["README.md"]:
dirs_to_run["test"].add(f"libs/partners/{partner_dir}")
# Skip codspeed for partners without benchmarks or in IGNORED_PARTNERS
if partner_dir not in IGNORED_PARTNERS:
dirs_to_run["codspeed"].add(f"libs/partners/{partner_dir}")
# Skip if the directory was deleted or is just a tombstone readme
elif file.startswith("libs/"):
# Check if this is a root-level file in libs/ (e.g., libs/README.md)
file_parts = file.split("/")
if len(file_parts) == 2:
# Root-level file in libs/, skip it (no tests needed)
continue
raise ValueError(
f"Unknown lib: {file}. check_diff.py likely needs "
"an update for this new library!"
)
elif file in [
"pyproject.toml",
"uv.lock",
]: # root uv files
docs_edited = True
dependents = dependents_graph()
# we now have dirs_by_job
# todo: clean this up
map_job_to_configs = {
job: _get_configs_for_multi_dirs(job, dirs_to_run, dependents)
for job in [
"lint",
"test",
"extended-tests",
"compile-integration-tests",
"dependencies",
"test-pydantic",
"codspeed",
]
}
for key, value in map_job_to_configs.items():
json_output = json.dumps(value)
print(f"{key}={json_output}")

View File

@@ -1,36 +0,0 @@
"""Check that no dependencies allow prereleases unless we're releasing a prerelease."""
import sys
import tomllib
if __name__ == "__main__":
# Get the TOML file path from the command line argument
toml_file = sys.argv[1]
with open(toml_file, "rb") as file:
toml_data = tomllib.load(file)
# See if we're releasing an rc or dev version
version = toml_data["project"]["version"]
releasing_rc = "rc" in version or "dev" in version
# If not, iterate through dependencies and make sure none allow prereleases
if not releasing_rc:
dependencies = toml_data["project"]["dependencies"]
for dep_version in dependencies:
dep_version_string = (
dep_version["version"] if isinstance(dep_version, dict) else dep_version
)
if "rc" in dep_version_string:
raise ValueError(
f"Dependency {dep_version} has a prerelease version. Please remove this."
)
if isinstance(dep_version, dict) and dep_version.get(
"allow-prereleases", False
):
raise ValueError(
f"Dependency {dep_version} has allow-prereleases set to true. Please remove this."
)

View File

@@ -1,199 +0,0 @@
"""Get minimum versions of dependencies from a pyproject.toml file."""
import sys
from collections import defaultdict
if sys.version_info >= (3, 11):
import tomllib
else:
    # For Python 3.10 and below, which doesn't have stdlib tomllib
import tomli as tomllib
import re
from typing import List
import requests
from packaging.requirements import Requirement
from packaging.specifiers import SpecifierSet
from packaging.version import Version, parse
MIN_VERSION_LIBS = [
"langchain-core",
"langchain",
"langchain-text-splitters",
"numpy",
"SQLAlchemy",
]
# some libs only get checked on release because of simultaneous changes in
# multiple libs
SKIP_IF_PULL_REQUEST = [
"langchain-core",
"langchain-text-splitters",
"langchain",
]
def get_pypi_versions(package_name: str) -> List[str]:
"""Fetch all available versions for a package from PyPI.
Args:
package_name: Name of the package
Returns:
List of all available versions
Raises:
requests.exceptions.RequestException: If PyPI API request fails
KeyError: If package not found or response format unexpected
"""
pypi_url = f"https://pypi.org/pypi/{package_name}/json"
response = requests.get(pypi_url)
response.raise_for_status()
return list(response.json()["releases"].keys())
def get_minimum_version(package_name: str, spec_string: str) -> str | None:
"""Find the minimum published version that satisfies the given constraints.
Args:
package_name: Name of the package
spec_string: Version specification string (e.g., ">=0.2.43,<0.4.0,!=0.3.0")
Returns:
Minimum compatible version or None if no compatible version found
"""
# Rewrite occurrences of ^0.0.z to 0.0.z (can be anywhere in constraint string)
spec_string = re.sub(r"\^0\.0\.(\d+)", r"0.0.\1", spec_string)
# Rewrite occurrences of ^0.y.z to >=0.y.z,<0.y+1 (can be anywhere in constraint string)
for y in range(1, 10):
spec_string = re.sub(
rf"\^0\.{y}\.(\d+)", rf">=0.{y}.\1,<0.{y + 1}", spec_string
)
# Rewrite occurrences of ^x.y.z to >=x.y.z,<x+1.0.0 (can be anywhere in constraint string)
for x in range(1, 10):
spec_string = re.sub(
rf"\^{x}\.(\d+)\.(\d+)", rf">={x}.\1.\2,<{x + 1}", spec_string
)
spec_set = SpecifierSet(spec_string)
all_versions = get_pypi_versions(package_name)
valid_versions = []
for version_str in all_versions:
try:
version = parse(version_str)
if spec_set.contains(version):
valid_versions.append(version)
except ValueError:
continue
return str(min(valid_versions)) if valid_versions else None
def _check_python_version_from_requirement(
requirement: Requirement, python_version: str
) -> bool:
if not requirement.marker:
return True
else:
marker_str = str(requirement.marker)
if "python_version" in marker_str or "python_full_version" in marker_str:
python_version_str = "".join(
char
for char in marker_str
if char.isdigit() or char in (".", "<", ">", "=", ",")
)
return check_python_version(python_version, python_version_str)
return True
def get_min_version_from_toml(
toml_path: str,
versions_for: str,
python_version: str,
*,
include: list | None = None,
):
# Parse the TOML file
with open(toml_path, "rb") as file:
toml_data = tomllib.load(file)
dependencies = defaultdict(list)
for dep in toml_data["project"]["dependencies"]:
requirement = Requirement(dep)
dependencies[requirement.name].append(requirement)
# Initialize a dictionary to store the minimum versions
min_versions = {}
# Iterate over the libs in MIN_VERSION_LIBS
for lib in set(MIN_VERSION_LIBS + (include or [])):
if versions_for == "pull_request" and lib in SKIP_IF_PULL_REQUEST:
# some libs only get checked on release because of simultaneous
# changes in multiple libs
continue
# Check if the lib is present in the dependencies
if lib in dependencies:
if include and lib not in include:
continue
requirements = dependencies[lib]
            for requirement in requirements:
                if _check_python_version_from_requirement(requirement, python_version):
                    version_string = str(requirement.specifier)
                    break
            else:
                # No requirement applies to this Python version; skip this lib
                continue
            # Use get_minimum_version to find the minimum published version
            # that satisfies version_string
min_version = get_minimum_version(lib, version_string)
# Store the minimum version in the min_versions dictionary
min_versions[lib] = min_version
return min_versions
def check_python_version(version_string, constraint_string):
"""Check if the given Python version matches the given constraints.
Args:
version_string: A string representing the Python version (e.g. "3.8.5").
constraint_string: A string representing the package's Python version
constraints (e.g. ">=3.6, <4.0").
Returns:
True if the version matches the constraints
"""
# Rewrite occurrences of ^0.0.z to 0.0.z (can be anywhere in constraint string)
constraint_string = re.sub(r"\^0\.0\.(\d+)", r"0.0.\1", constraint_string)
# Rewrite occurrences of ^0.y.z to >=0.y.z,<0.y+1.0 (can be anywhere in constraint string)
for y in range(1, 10):
constraint_string = re.sub(
rf"\^0\.{y}\.(\d+)", rf">=0.{y}.\1,<0.{y + 1}.0", constraint_string
)
# Rewrite occurrences of ^x.y.z to >=x.y.z,<x+1.0.0 (can be anywhere in constraint string)
    for x in range(1, 10):
        constraint_string = re.sub(
            rf"\^{x}\.(\d+)\.(\d+)", rf">={x}.\1.\2,<{x + 1}.0.0", constraint_string
        )
try:
version = Version(version_string)
constraints = SpecifierSet(constraint_string)
return version in constraints
except Exception as e:
print(f"Error: {e}")
return False
if __name__ == "__main__":
# Get the TOML file path from the command line argument
toml_file = sys.argv[1]
versions_for = sys.argv[2]
python_version = sys.argv[3]
assert versions_for in ["release", "pull_request"]
# Call the function to get the minimum versions
min_versions = get_min_version_from_toml(toml_file, versions_for, python_version)
print(" ".join([f"{lib}=={version}" for lib, version in min_versions.items()]))

View File

@@ -1,756 +0,0 @@
#!/usr/bin/env python3
#
# git-restore-mtime - Change mtime of files based on commit date of last change
#
# Copyright (C) 2012 Rodrigo Silva (MestreLion) <linux@rodrigosilva.com>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. See <http://www.gnu.org/licenses/gpl.html>
#
# Source: https://github.com/MestreLion/git-tools
# Version: July 13, 2023 (commit hash 5f832e72453e035fccae9d63a5056918d64476a2)
"""
Change the modification time (mtime) of files in work tree, based on the
date of the most recent commit that modified the file, including renames.
Ignores untracked files and uncommitted deletions, additions and renames, and
by default modifications too.
---
Useful prior to generating release tarballs, so each file is archived with a
date that is similar to the date when the file was actually last modified,
assuming the actual modification date and its commit date are close.
"""
# TODO:
# - Add -z on git whatchanged/ls-files, so we don't deal with filename decoding
# - When Python is bumped to 3.7, use text instead of universal_newlines on subprocess
# - Update "Statistics for some large projects" with modern hardware and repositories.
# - Create a README.md for git-restore-mtime alone. It deserves extensive documentation
# - Move Statistics there
# - See git-extras as a good example on project structure and documentation
# FIXME:
# - When current dir is outside the worktree, e.g. using --work-tree, `git ls-files`
# assume any relative pathspecs are to worktree root, not the current dir. As such,
# relative pathspecs may not work.
# - Renames are tricky:
# - R100 should not change mtime, but original name is not on filelist. Should
# track renames until a valid (A, M) mtime found and then set on current name.
# - Should set mtime for both current and original directories.
# - Check mode changes with unchanged blobs?
# - Check file (A, D) for the directory mtime is not sufficient:
# - Renames also change dir mtime, unless rename was on a parent dir
# - If most recent change of all files in a dir was a Modification (M),
# dir might not be touched at all.
# - Dirs containing only subdirectories but no direct files will also
# not be touched. They're files' [grand]parent dir, but never their dirname().
# - Some solutions:
# - After files done, perform some dir processing for missing dirs, finding latest
# file (A, D, R)
# - Simple approach: dir mtime is the most recent child (dir or file) mtime
# - Use a virtual concept of "created at most at" to fill missing info, bubble up
# to parents and grandparents
# - When handling [grand]parent dirs, stay inside <pathspec>
# - Better handling of merge commits. `-m` is plain *wrong*. `-c/--cc` is perfect, but
# painfully slow. First pass without merge commits is not accurate. Maybe add a new
# `--accurate` mode for `--cc`?
if __name__ != "__main__":
raise ImportError("{} should not be used as a module.".format(__name__))
import argparse
import datetime
import logging
import os.path
import shlex
import signal
import subprocess
import sys
import time
__version__ = "2022.12+dev"
# Update symlinks only if the platform supports not following them
UPDATE_SYMLINKS = bool(os.utime in getattr(os, "supports_follow_symlinks", []))
# Call os.path.normpath() only if not in a POSIX platform (Windows)
NORMALIZE_PATHS = os.path.sep != "/"
# How many files to process in each batch when re-trying merge commits
STEPMISSING = 100
# (Extra) keywords for the os.utime() call performed by touch()
UTIME_KWS = {} if not UPDATE_SYMLINKS else {"follow_symlinks": False}
# Command-line interface ######################################################
def parse_args():
parser = argparse.ArgumentParser(description=__doc__.split("\n---")[0])
group = parser.add_mutually_exclusive_group()
group.add_argument(
"--quiet",
"-q",
dest="loglevel",
action="store_const",
const=logging.WARNING,
default=logging.INFO,
help="Suppress informative messages and summary statistics.",
)
group.add_argument(
"--verbose",
"-v",
action="count",
help="""
Print additional information for each processed file.
Specify twice to further increase verbosity.
""",
)
parser.add_argument(
"--cwd",
"-C",
metavar="DIRECTORY",
help="""
Run as if %(prog)s was started in directory %(metavar)s.
This affects how --work-tree, --git-dir and PATHSPEC arguments are handled.
See 'man 1 git' or 'git --help' for more information.
""",
)
parser.add_argument(
"--git-dir",
dest="gitdir",
metavar="GITDIR",
help="""
Path to the git repository, by default auto-discovered by searching
the current directory and its parents for a .git/ subdirectory.
""",
)
parser.add_argument(
"--work-tree",
dest="workdir",
metavar="WORKTREE",
help="""
Path to the work tree root, by default the parent of GITDIR if it's
automatically discovered, or the current directory if GITDIR is set.
""",
)
parser.add_argument(
"--force",
"-f",
default=False,
action="store_true",
help="""
Force updating files with uncommitted modifications.
Untracked files and uncommitted deletions, renames and additions are
always ignored.
""",
)
parser.add_argument(
"--merge",
"-m",
default=False,
action="store_true",
help="""
Include merge commits.
Leads to more recent times and more files per commit, thus with the same
time, which may or may not be what you want.
Including merge commits may lead to fewer commits being evaluated as files
are found sooner, which can improve performance, sometimes substantially.
But as merge commits are usually huge, processing them may also take longer.
By default, merge commits are only used for files missing from regular commits.
""",
)
parser.add_argument(
"--first-parent",
default=False,
action="store_true",
help="""
Consider only the first parent, the "main branch", when evaluating merge commits.
Only effective when merge commits are processed, either when --merge is
used or when finding missing files after the first regular log search.
See --skip-missing.
""",
)
parser.add_argument(
"--skip-missing",
"-s",
dest="missing",
default=True,
action="store_false",
help="""
Do not try to find missing files.
If merge commits were not evaluated with --merge and some files were
not found in regular commits, by default %(prog)s searches for these
files again in the merge commits.
This option disables this retry, so files found only in merge commits
will not have their timestamp updated.
""",
)
parser.add_argument(
"--no-directories",
"-D",
dest="dirs",
default=True,
action="store_false",
help="""
Do not update directory timestamps.
By default, use the time of its most recently created, renamed or deleted file.
Note that just modifying a file will NOT update its directory time.
""",
)
parser.add_argument(
"--test",
"-t",
default=False,
action="store_true",
help="Test run: do not actually update any file timestamp.",
)
parser.add_argument(
"--commit-time",
"-c",
dest="commit_time",
default=False,
action="store_true",
help="Use commit time instead of author time.",
)
parser.add_argument(
"--oldest-time",
"-o",
dest="reverse_order",
default=False,
action="store_true",
help="""
Update times based on the oldest, instead of the most recent commit of a file.
This reverses the order in which the git log is processed to emulate a
file "creation" date. Note this will be inaccurate for files deleted and
re-created at later dates.
""",
)
parser.add_argument(
"--skip-older-than",
metavar="SECONDS",
type=int,
help="""
Ignore files that are currently older than %(metavar)s.
Useful in workflows that assume such files already have a correct timestamp,
as it may improve performance by processing fewer files.
""",
)
parser.add_argument(
"--skip-older-than-commit",
"-N",
default=False,
action="store_true",
help="""
Ignore files older than the timestamp it would be updated to.
Such files may be considered "original", likely in the author's repository.
""",
)
parser.add_argument(
"--unique-times",
default=False,
action="store_true",
help="""
Set the microseconds to a unique value per commit.
Allows telling apart changes that would otherwise have identical timestamps,
as git's time accuracy is in seconds.
""",
)
parser.add_argument(
"pathspec",
nargs="*",
metavar="PATHSPEC",
help="""
Only modify paths matching %(metavar)s, relative to current directory.
By default, update all but untracked files and submodules.
""",
)
parser.add_argument(
"--version",
"-V",
action="version",
version="%(prog)s version {version}".format(version=get_version()),
)
args_ = parser.parse_args()
if args_.verbose:
args_.loglevel = max(logging.TRACE, logging.DEBUG // args_.verbose)
args_.debug = args_.loglevel <= logging.DEBUG
return args_
def get_version(version=__version__):
if not version.endswith("+dev"):
return version
try:
cwd = os.path.dirname(os.path.realpath(__file__))
return Git(cwd=cwd, errors=False).describe().lstrip("v")
except Git.Error:
return "-".join((version, "unknown"))
# Helper functions ############################################################
def setup_logging():
"""Add TRACE logging level and corresponding method, return the root logger"""
logging.TRACE = TRACE = logging.DEBUG // 2
logging.Logger.trace = lambda _, m, *a, **k: _.log(TRACE, m, *a, **k)
return logging.getLogger()
def normalize(path):
r"""Normalize paths from git, handling non-ASCII characters.
Git stores paths as UTF-8 normalization form C.
If path contains non-ASCII or non-printable characters, git outputs the UTF-8
in octal-escaped notation, escaping double-quotes and backslashes, and then
double-quoting the whole path.
https://git-scm.com/docs/git-config#Documentation/git-config.txt-corequotePath
This function reverts this encoding, so:
normalize(r'"Back\\slash_double\"quote_a\303\247a\303\255"') =>
r'Back\slash_double"quote_açaí')
Paths with invalid UTF-8 encoding, such as single 0x80-0xFF bytes (e.g, from
Latin1/Windows-1251 encoding) are decoded using surrogate escape, the same
method used by Python for filesystem paths. So 0xE6 ("æ" in Latin1, r'\\346'
from Git) is decoded as "\udce6". See https://peps.python.org/pep-0383/ and
https://vstinner.github.io/painful-history-python-filesystem-encoding.html
Also see notes on `windows/non-ascii-paths.txt` about path encodings on
non-UTF-8 platforms and filesystems.
"""
if path and path[0] == '"':
# Python 2: path = path[1:-1].decode("string-escape")
# Python 3: https://stackoverflow.com/a/46650050/624066
path = (
path[1:-1] # Remove enclosing double quotes
.encode("latin1") # Convert to bytes, required by 'unicode-escape'
.decode("unicode-escape") # Perform the actual octal-escaping decode
.encode("latin1") # 1:1 mapping to bytes, UTF-8 encoded
.decode("utf8", "surrogateescape")
) # Decode from UTF-8
if NORMALIZE_PATHS:
# Make sure the slash matches the OS; for Windows we need a backslash
path = os.path.normpath(path)
return path
def dummy(*_args, **_kwargs):
"""No-op function used in dry-run tests"""
def touch(path, mtime):
"""The actual mtime update"""
os.utime(path, (mtime, mtime), **UTIME_KWS)
def touch_ns(path, mtime_ns):
"""The actual mtime update, using nanoseconds for unique timestamps"""
os.utime(path, None, ns=(mtime_ns, mtime_ns), **UTIME_KWS)
def isodate(secs: int):
# time.localtime() accepts floats, but discards fractional part
return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(secs))
def isodate_ns(ns: int):
# for integers fromtimestamp() is equivalent and ~16% slower than isodate()
return datetime.datetime.fromtimestamp(ns / 1000000000).isoformat(sep=" ")
def get_mtime_ns(secs: int, idx: int):
# Time resolution for filesystems and functions:
# ext-4 and other POSIX filesystems: 1 nanosecond
# NTFS (Windows default): 100 nanoseconds
# datetime.datetime() (due to 64-bit float epoch): 1 microsecond
us = idx % 1000000 # 10**6
return 1000 * (1000000 * secs + us)
def get_mtime_path(path):
return os.path.getmtime(path)
# Git class and parse_log(), the heart of the script ##########################
class Git:
def __init__(self, workdir=None, gitdir=None, cwd=None, errors=True):
self.gitcmd = ["git"]
self.errors = errors
self._proc = None
if workdir:
self.gitcmd.extend(("--work-tree", workdir))
if gitdir:
self.gitcmd.extend(("--git-dir", gitdir))
if cwd:
self.gitcmd.extend(("-C", cwd))
self.workdir, self.gitdir = self._get_repo_dirs()
def ls_files(self, paths: list = None):
return (normalize(_) for _ in self._run("ls-files --full-name", paths))
def ls_dirty(self, force=False):
return (
normalize(_[3:].split(" -> ", 1)[-1])
for _ in self._run("status --porcelain")
if _[:2] != "??" and (not force or (_[0] in ("R", "A") or _[1] == "D"))
)
def log(
self,
merge=False,
first_parent=False,
commit_time=False,
reverse_order=False,
paths: list = None,
):
cmd = "whatchanged --pretty={}".format("%ct" if commit_time else "%at")
if merge:
cmd += " -m"
if first_parent:
cmd += " --first-parent"
if reverse_order:
cmd += " --reverse"
return self._run(cmd, paths)
def describe(self):
return self._run("describe --tags", check=True)[0]
def terminate(self):
if self._proc is None:
return
try:
self._proc.terminate()
except OSError:
# Avoid errors on OpenBSD
pass
def _get_repo_dirs(self):
return (
os.path.normpath(_)
for _ in self._run(
"rev-parse --show-toplevel --absolute-git-dir", check=True
)
)
def _run(self, cmdstr: str, paths: list = None, output=True, check=False):
cmdlist = self.gitcmd + shlex.split(cmdstr)
if paths:
cmdlist.append("--")
cmdlist.extend(paths)
popen_args = dict(universal_newlines=True, encoding="utf8")
if not self.errors:
popen_args["stderr"] = subprocess.DEVNULL
log.trace("Executing: %s", " ".join(cmdlist))
if not output:
return subprocess.call(cmdlist, **popen_args)
if check:
try:
stdout: str = subprocess.check_output(cmdlist, **popen_args)
return stdout.splitlines()
except subprocess.CalledProcessError as e:
raise self.Error(e.returncode, e.cmd, e.output, e.stderr)
self._proc = subprocess.Popen(cmdlist, stdout=subprocess.PIPE, **popen_args)
return (_.rstrip() for _ in self._proc.stdout)
def __del__(self):
self.terminate()
class Error(subprocess.CalledProcessError):
"""Error from git executable"""
def parse_log(filelist, dirlist, stats, git, merge=False, filterlist=None):
mtime = 0
datestr = isodate(0)
for line in git.log(
merge, args.first_parent, args.commit_time, args.reverse_order, filterlist
):
stats["loglines"] += 1
# Blank line between Date and list of files
if not line:
continue
# Date line
if line[0] != ":": # Faster than `not line.startswith(':')`
stats["commits"] += 1
mtime = int(line)
if args.unique_times:
mtime = get_mtime_ns(mtime, stats["commits"])
if args.debug:
datestr = isodate(mtime)
continue
# File line: three tokens if it describes a renaming, otherwise two
tokens = line.split("\t")
# Possible statuses:
# M: Modified (content changed)
# A: Added (created)
# D: Deleted
# T: Type changed: to/from regular file, symlinks, submodules
# R099: Renamed (moved), with % of unchanged content. 100 = pure rename
# Not possible in log: C=Copied, U=Unmerged, X=Unknown, B=pairing Broken
status = tokens[0].split(" ")[-1]
file = tokens[-1]
# Handles non-ASCII chars and OS path separator
file = normalize(file)
def do_file():
if args.skip_older_than_commit and get_mtime_path(file) <= mtime:
stats["skip"] += 1
return
if args.debug:
log.debug(
"%d\t%d\t%d\t%s\t%s",
stats["loglines"],
stats["commits"],
stats["files"],
datestr,
file,
)
try:
touch(os.path.join(git.workdir, file), mtime)
stats["touches"] += 1
except Exception as e:
log.error("ERROR: %s: %s", e, file)
stats["errors"] += 1
def do_dir():
if args.debug:
log.debug(
"%d\t%d\t-\t%s\t%s",
stats["loglines"],
stats["commits"],
datestr,
"{}/".format(dirname or "."),
)
try:
touch(os.path.join(git.workdir, dirname), mtime)
stats["dirtouches"] += 1
except Exception as e:
log.error("ERROR: %s: %s", e, dirname)
stats["direrrors"] += 1
if file in filelist:
stats["files"] -= 1
filelist.remove(file)
do_file()
if args.dirs and status in ("A", "D"):
dirname = os.path.dirname(file)
if dirname in dirlist:
dirlist.remove(dirname)
do_dir()
# All files done?
if not stats["files"]:
git.terminate()
return
# Main Logic ##################################################################
def main():
start = time.time() # yes, Wall time. CPU time is not realistic for users.
stats = {
_: 0
for _ in (
"loglines",
"commits",
"touches",
"skip",
"errors",
"dirtouches",
"direrrors",
)
}
logging.basicConfig(level=args.loglevel, format="%(message)s")
log.trace("Arguments: %s", args)
# First things first: Where and Who are we?
if args.cwd:
log.debug("Changing directory: %s", args.cwd)
try:
os.chdir(args.cwd)
except OSError as e:
log.critical(e)
return e.errno
# Using both os.chdir() and `git -C` is redundant, but might prevent side effects
# `git -C` alone could be enough if we make sure that:
# - all paths, including args.pathspec, are processed by git: ls-files, rev-parse
# - touch() / os.utime() path argument is always prepended with git.workdir
try:
git = Git(workdir=args.workdir, gitdir=args.gitdir, cwd=args.cwd)
except Git.Error as e:
# Not in a git repository, and git already informed user on stderr. So we just...
return e.returncode
# Get the files managed by git and build file list to be processed
if UPDATE_SYMLINKS and not args.skip_older_than:
filelist = set(git.ls_files(args.pathspec))
else:
filelist = set()
for path in git.ls_files(args.pathspec):
fullpath = os.path.join(git.workdir, path)
# Symlink (to file, to dir or broken - git handles the same way)
if not UPDATE_SYMLINKS and os.path.islink(fullpath):
log.warning(
"WARNING: Skipping symlink, no OS support for updates: %s", path
)
continue
# skip files which are older than given threshold
if (
args.skip_older_than
and start - get_mtime_path(fullpath) > args.skip_older_than
):
continue
# Always add files relative to worktree root
filelist.add(path)
# If --force, silently ignore uncommitted deletions (not in the filesystem)
# and renames / additions (will not be found in log anyway)
if args.force:
filelist -= set(git.ls_dirty(force=True))
# Otherwise, ignore any dirty files
else:
dirty = set(git.ls_dirty())
if dirty:
log.warning(
"WARNING: Modified files in the working directory were ignored."
"\nTo include such files, commit your changes or use --force."
)
filelist -= dirty
# Build dir list to be processed
dirlist = set(os.path.dirname(_) for _ in filelist) if args.dirs else set()
stats["totalfiles"] = stats["files"] = len(filelist)
log.info("{0:,} files to be processed in work dir".format(stats["totalfiles"]))
if not filelist:
# Nothing to do. Exit silently and without errors, just like git does
return
# Process the log until all files are 'touched'
log.debug("Line #\tLog #\tF.Left\tModification Time\tFile Name")
parse_log(filelist, dirlist, stats, git, args.merge, args.pathspec)
# Missing files
if filelist:
# Try to find them in merge logs, if not done already
# (usually HUGE, thus MUCH slower!)
if args.missing and not args.merge:
filterlist = list(filelist)
missing = len(filterlist)
log.info(
"{0:,} files not found in log, trying merge commits".format(missing)
)
for i in range(0, missing, STEPMISSING):
parse_log(
filelist,
dirlist,
stats,
git,
merge=True,
filterlist=filterlist[i : i + STEPMISSING],
)
# Still missing some?
for file in filelist:
log.warning("WARNING: not found in the log: %s", file)
# Final statistics
# Suggestion: use git-log --before=mtime to brag about skipped log entries
def log_info(msg, *a, width=13):
ifmt = "{:%d,}" % (width,) # not using 'n' for consistency with ffmt
ffmt = "{:%d,.2f}" % (width,)
# %-formatting lacks a thousand separator, must pre-render with .format()
log.info(msg.replace("%d", ifmt).replace("%f", ffmt).format(*a))
log_info(
"Statistics:\n%f seconds\n%d log lines processed\n%d commits evaluated",
time.time() - start,
stats["loglines"],
stats["commits"],
)
if args.dirs:
if stats["direrrors"]:
log_info("%d directory update errors", stats["direrrors"])
log_info("%d directories updated", stats["dirtouches"])
if stats["touches"] != stats["totalfiles"]:
log_info("%d files", stats["totalfiles"])
if stats["skip"]:
log_info("%d files skipped", stats["skip"])
if stats["files"]:
log_info("%d files missing", stats["files"])
if stats["errors"]:
log_info("%d file update errors", stats["errors"])
log_info("%d files updated", stats["touches"])
if args.test:
log.info("TEST RUN - No files modified!")
# Keep only essential, global assignments here. Any other logic must be in main()
log = setup_logging()
args = parse_args()
# Set the actual touch() and other functions based on command-line arguments
if args.unique_times:
touch = touch_ns
isodate = isodate_ns
# Make sure this is always set last to ensure --test behaves as intended
if args.test:
touch = dummy
# UI done, it's showtime!
try:
sys.exit(main())
except KeyboardInterrupt:
log.info("\nAborting")
signal.signal(signal.SIGINT, signal.SIG_DFL)
os.kill(os.getpid(), signal.SIGINT)

View File

@@ -1,65 +0,0 @@
# Validates that a package's integration tests compile without syntax or import errors.
#
# (If an integration test fails to compile, it won't run.)
#
# Called as part of check_diffs.yml workflow
#
# Runs pytest with compile marker to check syntax/imports.
name: "🔗 Compile Integration Tests"
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
python-version:
required: true
type: string
description: "Python version to use"
permissions:
contents: read
env:
UV_FROZEN: "true"
jobs:
build:
defaults:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
timeout-minutes: 20
name: "Python ${{ inputs.python-version }}"
steps:
- uses: actions/checkout@v5
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ inputs.python-version }}
cache-suffix: compile-integration-tests-${{ inputs.working-directory }}
working-directory: ${{ inputs.working-directory }}
- name: "📦 Install Integration Dependencies"
shell: bash
run: uv sync --group test --group test_integration
- name: "🔗 Check Integration Tests Compile"
shell: bash
run: uv run pytest -m compile tests/integration_tests
- name: "🧹 Verify Clean Working Directory"
shell: bash
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

View File

@@ -1,75 +0,0 @@
# Runs linting.
#
# Uses the package's Makefile to run the checks, specifically the
# `lint_package` and `lint_tests` targets.
#
# Called as part of check_diffs.yml workflow.
name: "🧹 Linting"
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
python-version:
required: true
type: string
description: "Python version to use"
permissions:
contents: read
env:
WORKDIR: ${{ inputs.working-directory == '' && '.' || inputs.working-directory }}
# This env var allows us to get inline annotations when ruff has complaints.
RUFF_OUTPUT_FORMAT: github
UV_FROZEN: "true"
jobs:
# Linting job - runs quality checks on package and test code
build:
name: "Python ${{ inputs.python-version }}"
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- name: "📋 Checkout Code"
uses: actions/checkout@v5
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ inputs.python-version }}
cache-suffix: lint-${{ inputs.working-directory }}
working-directory: ${{ inputs.working-directory }}
- name: "📦 Install Lint & Typing Dependencies"
working-directory: ${{ inputs.working-directory }}
run: |
uv sync --group lint --group typing
- name: "🔍 Analyze Package Code with Linters"
working-directory: ${{ inputs.working-directory }}
run: |
make lint_package
- name: "📦 Install Test Dependencies (non-partners)"
# (For directories NOT starting with libs/partners/)
if: ${{ ! startsWith(inputs.working-directory, 'libs/partners/') }}
working-directory: ${{ inputs.working-directory }}
run: |
uv sync --inexact --group test
- name: "📦 Install Test Dependencies"
if: ${{ startsWith(inputs.working-directory, 'libs/partners/') }}
working-directory: ${{ inputs.working-directory }}
run: |
uv sync --inexact --group test --group test_integration
- name: "🔍 Analyze Test Code with Linters"
working-directory: ${{ inputs.working-directory }}
run: |
make lint_tests

View File

@@ -1,556 +0,0 @@
# Builds and publishes LangChain packages to PyPI.
#
# Manually triggered, though can be used as a reusable workflow (workflow_call).
#
# Handles version bumping, building, and publishing to PyPI with authentication.
name: "🚀 Package Release"
run-name: "Release ${{ inputs.working-directory }} ${{ inputs.release-version }}"
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
workflow_dispatch:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
default: "libs/langchain"
release-version:
required: true
type: string
default: "0.1.0"
description: "New version of package being released"
dangerous-nonmaster-release:
required: false
type: boolean
default: false
description: "Release from a non-master branch (danger!) - Only use for hotfixes"
env:
PYTHON_VERSION: "3.11"
UV_FROZEN: "true"
UV_NO_SYNC: "true"
permissions:
contents: write # Required for creating GitHub releases
jobs:
# Build the distribution package and extract version info
# Runs in isolated environment with minimal permissions for security
build:
if: github.ref == 'refs/heads/master' || inputs.dangerous-nonmaster-release
environment: Scheduled testing
runs-on: ubuntu-latest
permissions:
contents: read
outputs:
pkg-name: ${{ steps.check-version.outputs.pkg-name }}
version: ${{ steps.check-version.outputs.version }}
steps:
- uses: actions/checkout@v5
- name: Set up Python + uv
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
# We want to keep this build stage *separate* from the release stage,
# so that there's no sharing of permissions between them.
# (The release stage has trusted publishing and GitHub repo contents write
# access; the build stage must not share either.)
#
# Otherwise, a malicious `build` step (e.g. via a compromised dependency)
# could get access to our GitHub or PyPI credentials.
#
# Per the trusted publishing GitHub Action:
# > It is strongly advised to separate jobs for building [...]
# > from the publish job.
# https://github.com/pypa/gh-action-pypi-publish#non-goals
- name: Build project for distribution
run: uv build
working-directory: ${{ inputs.working-directory }}
- name: Upload build
uses: actions/upload-artifact@v5
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Check version
id: check-version
shell: python
working-directory: ${{ inputs.working-directory }}
run: |
import os
import tomllib
with open("pyproject.toml", "rb") as f:
data = tomllib.load(f)
pkg_name = data["project"]["name"]
version = data["project"]["version"]
with open(os.environ["GITHUB_OUTPUT"], "a") as f:
f.write(f"pkg-name={pkg_name}\n")
f.write(f"version={version}\n")
release-notes:
needs:
- build
runs-on: ubuntu-latest
permissions:
contents: read
outputs:
release-body: ${{ steps.generate-release-body.outputs.release-body }}
steps:
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain
path: langchain
sparse-checkout: | # this only grabs files for relevant dir
${{ inputs.working-directory }}
ref: ${{ github.ref }} # this scopes to just ref'd branch
fetch-depth: 0 # this fetches entire commit history
- name: Check tags
id: check-tags
shell: bash
working-directory: langchain/${{ inputs.working-directory }}
env:
PKG_NAME: ${{ needs.build.outputs.pkg-name }}
VERSION: ${{ needs.build.outputs.version }}
run: |
# Handle regular versions and pre-release versions differently
if [[ "$VERSION" == *"-"* ]]; then
# This is a pre-release version (contains a hyphen)
# Extract the base version without the pre-release suffix
BASE_VERSION=${VERSION%%-*}
# Look for the latest release of the same base version
REGEX="^$PKG_NAME==$BASE_VERSION\$"
PREV_TAG=$(git tag --sort=-creatordate | (grep -P "$REGEX" || true) | head -1)
# If no exact base version match, look for the latest release of any kind
if [ -z "$PREV_TAG" ]; then
REGEX="^$PKG_NAME==\\d+\\.\\d+\\.\\d+\$"
PREV_TAG=$(git tag --sort=-creatordate | (grep -P "$REGEX" || true) | head -1)
fi
else
# Regular version handling
PREV_TAG="$PKG_NAME==${VERSION%.*}.$(( ${VERSION##*.} - 1 ))"; [[ "${VERSION##*.}" -eq 0 ]] && PREV_TAG=""
# backup case if releasing e.g. 0.3.0, looks up last release
# note if last release (chronologically) was e.g. 0.1.47 it will get
# that instead of the last 0.2 release
if [ -z "$PREV_TAG" ]; then
REGEX="^$PKG_NAME==\\d+\\.\\d+\\.\\d+\$"
echo $REGEX
PREV_TAG=$(git tag --sort=-creatordate | (grep -P $REGEX || true) | head -1)
fi
fi
# if PREV_TAG is empty or came out to 0.0.0, let it be empty
if [ -z "$PREV_TAG" ] || [ "$PREV_TAG" = "$PKG_NAME==0.0.0" ]; then
echo "No previous tag found - first release"
else
# confirm prev-tag actually exists in git repo with git tag
GIT_TAG_RESULT=$(git tag -l "$PREV_TAG")
if [ -z "$GIT_TAG_RESULT" ]; then
echo "Previous tag $PREV_TAG not found in git repo"
exit 1
fi
fi
TAG="${PKG_NAME}==${VERSION}"
if [ "$TAG" == "$PREV_TAG" ]; then
echo "No new version to release"
exit 1
fi
echo tag="$TAG" >> $GITHUB_OUTPUT
echo prev-tag="$PREV_TAG" >> $GITHUB_OUTPUT
- name: Generate release body
id: generate-release-body
working-directory: langchain
env:
WORKING_DIR: ${{ inputs.working-directory }}
PKG_NAME: ${{ needs.build.outputs.pkg-name }}
TAG: ${{ steps.check-tags.outputs.tag }}
PREV_TAG: ${{ steps.check-tags.outputs.prev-tag }}
run: |
PREAMBLE="Changes since $PREV_TAG"
# if PREV_TAG is empty or 0.0.0, then we are releasing the first version
if [ -z "$PREV_TAG" ] || [ "$PREV_TAG" = "$PKG_NAME==0.0.0" ]; then
PREAMBLE="Initial release"
PREV_TAG=$(git rev-list --max-parents=0 HEAD)
fi
{
echo 'release-body<<EOF'
echo $PREAMBLE
echo
git log --format="%s" "$PREV_TAG"..HEAD -- $WORKING_DIR
echo EOF
} >> "$GITHUB_OUTPUT"
test-pypi-publish:
needs:
- build
- release-notes
runs-on: ubuntu-latest
permissions:
# This permission is used for trusted publishing:
# https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/
#
# Trusted publishing has to also be configured on PyPI for each package:
# https://docs.pypi.org/trusted-publishers/adding-a-publisher/
id-token: write
steps:
- uses: actions/checkout@v5
- uses: actions/download-artifact@v6
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Publish to test PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: ${{ inputs.working-directory }}/dist/
verbose: true
print-hash: true
repository-url: https://test.pypi.org/legacy/
# We overwrite any existing distributions with the same name and version.
# This is *only for CI use* and is *extremely dangerous* otherwise!
# https://github.com/pypa/gh-action-pypi-publish#tolerating-release-package-file-duplicates
skip-existing: true
# Temp workaround since attestations are on by default as of gh-action-pypi-publish v1.11.0
attestations: false
pre-release-checks:
needs:
- build
- release-notes
- test-pypi-publish
runs-on: ubuntu-latest
permissions:
contents: read
timeout-minutes: 20
steps:
- uses: actions/checkout@v5
# We explicitly *don't* set up caching here. This ensures our tests are
# maximally sensitive to catching breakage.
#
# For example, here's a way that caching can cause a falsely-passing test:
# - Make the langchain package manifest no longer list a dependency package
# as a requirement. This means it won't be installed by `pip install`,
# and attempting to use it would cause a crash.
# - That dependency used to be required, so it may have been cached.
# When restoring the venv packages from cache, that dependency gets included.
# - Tests pass, because the dependency is present even though it wasn't specified.
# - The package is published, and it breaks on the missing dependency when
# used in the real world.
- name: Set up Python + uv
uses: "./.github/actions/uv_setup"
id: setup-python
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v6
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Import dist package
shell: bash
working-directory: ${{ inputs.working-directory }}
env:
PKG_NAME: ${{ needs.build.outputs.pkg-name }}
VERSION: ${{ needs.build.outputs.version }}
# Here we use:
# - The default regular PyPI index as the *primary* index, meaning
# that it takes priority (https://pypi.org/simple)
# - The test PyPI index as an extra index, so that any dependencies that
# are not found on test PyPI can be resolved and installed anyway.
# (https://test.pypi.org/simple). This will include the PKG_NAME==VERSION
# package because VERSION will not have been uploaded to regular PyPI yet.
# - attempt install again after 5 seconds if it fails because there is
# sometimes a delay in availability on test pypi
run: |
uv venv
VIRTUAL_ENV=.venv uv pip install dist/*.whl
# Replace all dashes in the package name with underscores,
# since that's how Python imports packages with dashes in the name.
# also remove _official suffix
IMPORT_NAME="$(echo "$PKG_NAME" | sed s/-/_/g | sed s/_official//g)"
uv run python -c "import $IMPORT_NAME; print(dir($IMPORT_NAME))"
- name: Import test dependencies
run: uv sync --group test
working-directory: ${{ inputs.working-directory }}
# Overwrite the local version of the package with the built version
- name: Import published package (again)
working-directory: ${{ inputs.working-directory }}
shell: bash
env:
PKG_NAME: ${{ needs.build.outputs.pkg-name }}
VERSION: ${{ needs.build.outputs.version }}
run: |
VIRTUAL_ENV=.venv uv pip install dist/*.whl
- name: Check for prerelease versions
# Block release if any dependencies allow prerelease versions
# (unless this is itself a prerelease version)
working-directory: ${{ inputs.working-directory }}
run: |
uv run python $GITHUB_WORKSPACE/.github/scripts/check_prerelease_dependencies.py pyproject.toml
- name: Run unit tests
run: make tests
working-directory: ${{ inputs.working-directory }}
- name: Get minimum versions
# Find the minimum published versions that satisfies the given constraints
working-directory: ${{ inputs.working-directory }}
id: min-version
run: |
VIRTUAL_ENV=.venv uv pip install packaging requests
python_version="$(uv run python --version | awk '{print $2}')"
min_versions="$(uv run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml release $python_version)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
- name: Run unit tests with minimum dependency versions
if: ${{ steps.min-version.outputs.min-versions != '' }}
env:
MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
run: |
VIRTUAL_ENV=.venv uv pip install --force-reinstall --editable .
VIRTUAL_ENV=.venv uv pip install --force-reinstall $MIN_VERSIONS
make tests
working-directory: ${{ inputs.working-directory }}
- name: Import integration test dependencies
run: uv sync --group test --group test_integration
working-directory: ${{ inputs.working-directory }}
- name: Run integration tests
# Uses the Makefile's `integration_tests` target for the specified package
if: ${{ startsWith(inputs.working-directory, 'libs/partners/') }}
env:
AI21_API_KEY: ${{ secrets.AI21_API_KEY }}
GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
TOGETHER_API_KEY: ${{ secrets.TOGETHER_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
AZURE_OPENAI_API_VERSION: ${{ secrets.AZURE_OPENAI_API_VERSION }}
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
GOOGLE_SEARCH_API_KEY: ${{ secrets.GOOGLE_SEARCH_API_KEY }}
GOOGLE_CSE_ID: ${{ secrets.GOOGLE_CSE_ID }}
GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
HUGGINGFACEHUB_API_TOKEN: ${{ secrets.HUGGINGFACEHUB_API_TOKEN }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
NOMIC_API_KEY: ${{ secrets.NOMIC_API_KEY }}
WATSONX_APIKEY: ${{ secrets.WATSONX_APIKEY }}
WATSONX_PROJECT_ID: ${{ secrets.WATSONX_PROJECT_ID }}
ASTRA_DB_API_ENDPOINT: ${{ secrets.ASTRA_DB_API_ENDPOINT }}
ASTRA_DB_APPLICATION_TOKEN: ${{ secrets.ASTRA_DB_APPLICATION_TOKEN }}
ASTRA_DB_KEYSPACE: ${{ secrets.ASTRA_DB_KEYSPACE }}
ES_URL: ${{ secrets.ES_URL }}
ES_CLOUD_ID: ${{ secrets.ES_CLOUD_ID }}
ES_API_KEY: ${{ secrets.ES_API_KEY }}
MONGODB_ATLAS_URI: ${{ secrets.MONGODB_ATLAS_URI }}
UPSTAGE_API_KEY: ${{ secrets.UPSTAGE_API_KEY }}
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
XAI_API_KEY: ${{ secrets.XAI_API_KEY }}
DEEPSEEK_API_KEY: ${{ secrets.DEEPSEEK_API_KEY }}
PPLX_API_KEY: ${{ secrets.PPLX_API_KEY }}
LANGCHAIN_TESTS_USER_AGENT: ${{ secrets.LANGCHAIN_TESTS_USER_AGENT }}
run: make integration_tests
working-directory: ${{ inputs.working-directory }}
# Test select published packages against new core
# Done when code changes are made to langchain-core
test-prior-published-packages-against-new-core:
# Installs the new, unreleased core alongside the previously published
# partner packages and runs integration tests
needs:
- build
- release-notes
- test-pypi-publish
- pre-release-checks
runs-on: ubuntu-latest
permissions:
contents: read
strategy:
matrix:
partner: [anthropic]
fail-fast: false # Continue testing other partners if one fails
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
ANTHROPIC_FILES_API_IMAGE_ID: ${{ secrets.ANTHROPIC_FILES_API_IMAGE_ID }}
ANTHROPIC_FILES_API_PDF_ID: ${{ secrets.ANTHROPIC_FILES_API_PDF_ID }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
AZURE_OPENAI_API_VERSION: ${{ secrets.AZURE_OPENAI_API_VERSION }}
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
LANGCHAIN_TESTS_USER_AGENT: ${{ secrets.LANGCHAIN_TESTS_USER_AGENT }}
steps:
- uses: actions/checkout@v5
# We implement this conditional as Github Actions does not have good support
# for conditionally needing steps. https://github.com/actions/runner/issues/491
# TODO: this seems to be resolved upstream, so we can probably remove this workaround
- name: Check if libs/core
run: |
if [ "${{ startsWith(inputs.working-directory, 'libs/core') }}" != "true" ]; then
echo "Not in libs/core. Exiting successfully."
exit 0
fi
- name: Set up Python + uv
if: startsWith(inputs.working-directory, 'libs/core')
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v6
if: startsWith(inputs.working-directory, 'libs/core')
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Test against ${{ matrix.partner }}
if: startsWith(inputs.working-directory, 'libs/core')
run: |
# Identify latest tag, excluding pre-releases
LATEST_PACKAGE_TAG="$(
git ls-remote --tags origin "langchain-${{ matrix.partner }}*" \
| awk '{print $2}' \
| sed 's|refs/tags/||' \
| grep -E '[0-9]+\.[0-9]+\.[0-9]+$' \
| sort -Vr \
| head -n 1
)"
echo "Latest package tag: $LATEST_PACKAGE_TAG"
# Shallow-fetch just that single tag
git fetch --depth=1 origin tag "$LATEST_PACKAGE_TAG"
# Checkout the latest package files
rm -rf $GITHUB_WORKSPACE/libs/partners/${{ matrix.partner }}/*
rm -rf $GITHUB_WORKSPACE/libs/standard-tests/*
cd $GITHUB_WORKSPACE/libs/
git checkout "$LATEST_PACKAGE_TAG" -- standard-tests/
git checkout "$LATEST_PACKAGE_TAG" -- partners/${{ matrix.partner }}/
cd partners/${{ matrix.partner }}
# Print as a sanity check
echo "Version number from pyproject.toml: "
cat pyproject.toml | grep "version = "
# Run tests
uv sync --group test --group test_integration
uv pip install ../../core/dist/*.whl
make integration_tests
publish:
# Publishes the package to PyPI
needs:
- build
- release-notes
- test-pypi-publish
- pre-release-checks
- test-prior-published-packages-against-new-core
runs-on: ubuntu-latest
permissions:
# This permission is used for trusted publishing:
# https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/
#
# Trusted publishing has to also be configured on PyPI for each package:
# https://docs.pypi.org/trusted-publishers/adding-a-publisher/
id-token: write
defaults:
run:
working-directory: ${{ inputs.working-directory }}
steps:
- uses: actions/checkout@v5
- name: Set up Python + uv
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v6
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Publish package distributions to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: ${{ inputs.working-directory }}/dist/
verbose: true
print-hash: true
# Temp workaround since attestations are on by default as of gh-action-pypi-publish v1.11.0
attestations: false
mark-release:
# Marks the GitHub release with the new version tag
needs:
- build
- release-notes
- test-pypi-publish
- pre-release-checks
- publish
runs-on: ubuntu-latest
permissions:
# This permission is needed by `ncipollo/release-action` to
# create the GitHub release/tag
contents: write
defaults:
run:
working-directory: ${{ inputs.working-directory }}
steps:
- uses: actions/checkout@v5
- name: Set up Python + uv
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v6
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Create Tag
uses: ncipollo/release-action@v1
with:
artifacts: "dist/*"
token: ${{ secrets.GITHUB_TOKEN }}
generateReleaseNotes: false
tag: ${{needs.build.outputs.pkg-name}}==${{ needs.build.outputs.version }}
body: ${{ needs.release-notes.outputs.release-body }}
commit: ${{ github.sha }}
makeLatest: ${{ needs.build.outputs.pkg-name == 'langchain-core'}}
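The check-tags step above reconstructs the previous release tag by decrementing the patch number, falling back to the newest plain X.Y.Z tag when the patch is zero (pre-release handling omitted here). A rough Python equivalent of that arithmetic; the tag list is hypothetical:

import re

def previous_tag(pkg, version, tags):
    # Sketch of the check-tags logic: guess the prior release tag.
    major_minor, _, patch = version.rpartition(".")
    if patch.isdigit() and int(patch) > 0:
        return f"{pkg}=={major_minor}.{int(patch) - 1}"
    # Patch is 0 (e.g. releasing 0.3.0): fall back to the newest plain
    # X.Y.Z tag for this package, mirroring the grep -P branch above.
    pattern = re.compile(rf"^{re.escape(pkg)}==\d+\.\d+\.\d+$")
    matching = [t for t in tags if pattern.match(t)]
    return matching[0] if matching else ""  # tags assumed newest-first

tags = ["langchain-core==0.2.43", "langchain-core==0.2.42"]  # hypothetical
print(previous_tag("langchain-core", "0.3.0", tags))  # langchain-core==0.2.43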

View File

@@ -1,85 +0,0 @@
# Runs unit tests with both current and minimum supported dependency versions
# to ensure compatibility across the supported range.
name: "🧪 Unit Testing"
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
python-version:
required: true
type: string
description: "Python version to use"
permissions:
contents: read
env:
UV_FROZEN: "true"
UV_NO_SYNC: "true"
jobs:
# Main test job - runs unit tests with current deps, then retests with minimum versions
build:
defaults:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
timeout-minutes: 20
name: "Python ${{ inputs.python-version }}"
steps:
- name: "📋 Checkout Code"
uses: actions/checkout@v5
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
uses: "./.github/actions/uv_setup"
id: setup-python
with:
python-version: ${{ inputs.python-version }}
cache-suffix: test-${{ inputs.working-directory }}
working-directory: ${{ inputs.working-directory }}
- name: "📦 Install Test Dependencies"
shell: bash
run: uv sync --group test --dev
- name: "🧪 Run Core Unit Tests"
shell: bash
run: |
make test
- name: "🔍 Calculate Minimum Dependency Versions"
working-directory: ${{ inputs.working-directory }}
id: min-version
shell: bash
run: |
VIRTUAL_ENV=.venv uv pip install packaging tomli requests
python_version="$(uv run python --version | awk '{print $2}')"
min_versions="$(uv run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml pull_request $python_version)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
- name: "🧪 Run Tests with Minimum Dependencies"
if: ${{ steps.min-version.outputs.min-versions != '' }}
env:
MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
run: |
VIRTUAL_ENV=.venv uv pip install $MIN_VERSIONS
make tests
working-directory: ${{ inputs.working-directory }}
- name: "🧹 Verify Clean Working Directory"
shell: bash
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

View File

@@ -1,73 +0,0 @@
# Facilitate unit testing against different Pydantic versions for a provided package.
name: "🐍 Pydantic Version Testing"
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
python-version:
required: false
type: string
description: "Python version to use"
default: "3.12"
pydantic-version:
required: true
type: string
description: "Pydantic version to test."
permissions:
contents: read
env:
UV_FROZEN: "true"
UV_NO_SYNC: "true"
jobs:
build:
defaults:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
timeout-minutes: 20
name: "Pydantic ~=${{ inputs.pydantic-version }}"
steps:
- name: "📋 Checkout Code"
uses: actions/checkout@v5
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ inputs.python-version }}
cache-suffix: test-pydantic-${{ inputs.working-directory }}
working-directory: ${{ inputs.working-directory }}
- name: "📦 Install Test Dependencies"
shell: bash
run: uv sync --group test
- name: "🔄 Install Specific Pydantic Version"
shell: bash
env:
PYDANTIC_VERSION: ${{ inputs.pydantic-version }}
run: VIRTUAL_ENV=.venv uv pip install "pydantic~=$PYDANTIC_VERSION"
- name: "🧪 Run Core Tests"
shell: bash
run: |
make test
- name: "🧹 Verify Clean Working Directory"
shell: bash
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

View File

@@ -1,107 +0,0 @@
name: Auto Label Issues by Package
on:
issues:
types: [opened, edited]
jobs:
label-by-package:
permissions:
issues: write
runs-on: ubuntu-latest
steps:
- name: Sync package labels
uses: actions/github-script@v6
with:
script: |
const body = context.payload.issue.body || "";
// Extract text under "### Package"
const match = body.match(/### Package\s+([\s\S]*?)\n###/i);
if (!match) return;
const packageSection = match[1].trim();
// Mapping table for package names to labels
const mapping = {
"langchain": "langchain",
"langchain-openai": "openai",
"langchain-anthropic": "anthropic",
"langchain-classic": "langchain-classic",
"langchain-core": "core",
"langchain-cli": "cli",
"langchain-model-profiles": "model-profiles",
"langchain-tests": "standard-tests",
"langchain-text-splitters": "text-splitters",
"langchain-chroma": "chroma",
"langchain-deepseek": "deepseek",
"langchain-exa": "exa",
"langchain-fireworks": "fireworks",
"langchain-groq": "groq",
"langchain-huggingface": "huggingface",
"langchain-mistralai": "mistralai",
"langchain-nomic": "nomic",
"langchain-ollama": "ollama",
"langchain-perplexity": "perplexity",
"langchain-prompty": "prompty",
"langchain-qdrant": "qdrant",
"langchain-xai": "xai",
};
// All possible package labels we manage
const allPackageLabels = Object.values(mapping);
const selectedLabels = [];
// Check if this is checkbox format (multiple selection)
const checkboxMatches = packageSection.match(/- \[x\]\s+([^\n\r]+)/gi);
if (checkboxMatches) {
// Handle checkbox format
for (const match of checkboxMatches) {
const packageName = match.replace(/- \[x\]\s+/i, '').trim();
const label = mapping[packageName];
if (label && !selectedLabels.includes(label)) {
selectedLabels.push(label);
}
}
} else {
// Handle dropdown format (single selection)
const label = mapping[packageSection];
if (label) {
selectedLabels.push(label);
}
}
// Get current issue labels
const issue = await github.rest.issues.get({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number
});
const currentLabels = issue.data.labels.map(label => label.name);
const currentPackageLabels = currentLabels.filter(label => allPackageLabels.includes(label));
// Determine labels to add and remove
const labelsToAdd = selectedLabels.filter(label => !currentPackageLabels.includes(label));
const labelsToRemove = currentPackageLabels.filter(label => !selectedLabels.includes(label));
// Add new labels
if (labelsToAdd.length > 0) {
await github.rest.issues.addLabels({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
labels: labelsToAdd
});
}
// Remove old labels
for (const label of labelsToRemove) {
await github.rest.issues.removeLabel({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
name: label
});
}
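The heart of the labeler script above is a set difference between the labels the issue form selects and the managed package labels currently applied, so non-package labels are never touched. The same diff rendered in Python, with hypothetical label names:

def diff_labels(selected, current, managed):
    # Which labels to add/remove, touching only the managed set.
    current_managed = [l for l in current if l in managed]
    to_add = [l for l in selected if l not in current_managed]
    to_remove = [l for l in current_managed if l not in selected]
    return to_add, to_remove

managed = {"core", "openai", "anthropic"}
print(diff_labels(["core"], ["openai", "bug"], managed))
# (['core'], ['openai'])  -- "bug" is untouched: not a managed package label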

View File

@@ -1,51 +0,0 @@
# Ensures version numbers in pyproject.toml and version.py stay in sync.
#
# (Prevents releases with mismatched version numbers)
name: "🔍 Check Version Equality"
on:
pull_request:
paths:
- "libs/core/pyproject.toml"
- "libs/core/langchain_core/version.py"
permissions:
contents: read
jobs:
check_version_equality:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v5
- name: "✅ Verify pyproject.toml & version.py Match"
run: |
# Check core versions
CORE_PYPROJECT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' libs/core/pyproject.toml)
CORE_VERSION_PY_VERSION=$(grep -Po '(?<=^VERSION = ")[^"]*' libs/core/langchain_core/version.py)
# Compare core versions
if [ "$CORE_PYPROJECT_VERSION" != "$CORE_VERSION_PY_VERSION" ]; then
echo "langchain-core versions in pyproject.toml and version.py do not match!"
echo "pyproject.toml version: $CORE_PYPROJECT_VERSION"
echo "version.py version: $CORE_VERSION_PY_VERSION"
exit 1
else
echo "Core versions match: $CORE_PYPROJECT_VERSION"
fi
# Check langchain_v1 versions
LANGCHAIN_PYPROJECT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' libs/langchain_v1/pyproject.toml)
LANGCHAIN_INIT_PY_VERSION=$(grep -Po '(?<=^__version__ = ")[^"]*' libs/langchain_v1/langchain/__init__.py)
# Compare langchain_v1 versions
if [ "$LANGCHAIN_PYPROJECT_VERSION" != "$LANGCHAIN_INIT_PY_VERSION" ]; then
echo "langchain_v1 versions in pyproject.toml and __init__.py do not match!"
echo "pyproject.toml version: $LANGCHAIN_PYPROJECT_VERSION"
echo "version.py version: $LANGCHAIN_INIT_PY_VERSION"
exit 1
else
echo "Langchain v1 versions match: $LANGCHAIN_PYPROJECT_VERSION"
fi
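The grep -Po lookbehinds above pull the quoted version strings out of each file. For reference, the same equality check sketched in Python, reading the paths named in the workflow:

import re
import sys
from pathlib import Path

def extract(path, key):
    # First `key = "..."` assignment at the start of a line, or None.
    m = re.search(rf'^{key} = "([^"]*)"', Path(path).read_text(), re.M)
    return m.group(1) if m else None

pyproject = extract("libs/core/pyproject.toml", "version")
version_py = extract("libs/core/langchain_core/version.py", "VERSION")
if pyproject != version_py:
    sys.exit(f"version mismatch: {pyproject} != {version_py}")
print(f"versions match: {pyproject}")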

View File

@@ -1,261 +0,0 @@
# Primary CI workflow.
#
# Only runs against packages that have changed files.
#
# Runs:
# - Linting (_lint.yml)
# - Unit Tests (_test.yml)
# - Pydantic compatibility tests (_test_pydantic.yml)
# - Integration test compilation checks (_compile_integration_test.yml)
# - Extended test suites that require additional dependencies
# - Codspeed benchmarks (if not labeled 'codspeed-ignore')
#
# Reports status to GitHub checks and PR status.
name: "🔧 CI"
on:
push:
branches: [master]
pull_request:
merge_group:
# Optimizes CI performance by canceling redundant workflow runs
# If another push to the same PR or branch happens while this workflow is still running,
# cancel the earlier run in favor of the next run.
#
# There's no point in testing an outdated version of the code. GitHub only allows
# a limited number of job runners to be active at the same time, so it's better to
# cancel pointless jobs early so that more useful jobs can run sooner.
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
permissions:
contents: read
env:
UV_FROZEN: "true"
UV_NO_SYNC: "true"
jobs:
# This job analyzes which files changed and creates a dynamic test matrix
# to only run tests/lints for the affected packages, improving CI efficiency
build:
name: "Detect Changes & Set Matrix"
runs-on: ubuntu-latest
if: ${{ !contains(github.event.pull_request.labels.*.name, 'ci-ignore') }}
steps:
- name: "📋 Checkout Code"
uses: actions/checkout@v5
- name: "🐍 Setup Python 3.11"
uses: actions/setup-python@v6
with:
python-version: "3.11"
- name: "📂 Get Changed Files"
id: files
uses: Ana06/get-changed-files@v2.3.0
- name: "🔍 Analyze Changed Files & Generate Build Matrix"
id: set-matrix
run: |
python -m pip install packaging requests
python .github/scripts/check_diff.py ${{ steps.files.outputs.all }} >> $GITHUB_OUTPUT
outputs:
lint: ${{ steps.set-matrix.outputs.lint }}
test: ${{ steps.set-matrix.outputs.test }}
extended-tests: ${{ steps.set-matrix.outputs.extended-tests }}
compile-integration-tests: ${{ steps.set-matrix.outputs.compile-integration-tests }}
dependencies: ${{ steps.set-matrix.outputs.dependencies }}
test-pydantic: ${{ steps.set-matrix.outputs.test-pydantic }}
codspeed: ${{ steps.set-matrix.outputs.codspeed }}
# Run linting only on packages that have changed files
lint:
needs: [build]
if: ${{ needs.build.outputs.lint != '[]' }}
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.lint) }}
fail-fast: false
uses: ./.github/workflows/_lint.yml
with:
working-directory: ${{ matrix.job-configs.working-directory }}
python-version: ${{ matrix.job-configs.python-version }}
secrets: inherit
# Run unit tests only on packages that have changed files
test:
needs: [build]
if: ${{ needs.build.outputs.test != '[]' }}
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.test) }}
fail-fast: false
uses: ./.github/workflows/_test.yml
with:
working-directory: ${{ matrix.job-configs.working-directory }}
python-version: ${{ matrix.job-configs.python-version }}
secrets: inherit
# Test compatibility with different Pydantic versions for affected packages
test-pydantic:
needs: [build]
if: ${{ needs.build.outputs.test-pydantic != '[]' }}
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.test-pydantic) }}
fail-fast: false
uses: ./.github/workflows/_test_pydantic.yml
with:
working-directory: ${{ matrix.job-configs.working-directory }}
pydantic-version: ${{ matrix.job-configs.pydantic-version }}
secrets: inherit
# Verify integration tests compile without actually running them (faster feedback)
compile-integration-tests:
name: "Compile Integration Tests"
needs: [build]
if: ${{ needs.build.outputs.compile-integration-tests != '[]' }}
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.compile-integration-tests) }}
fail-fast: false
uses: ./.github/workflows/_compile_integration_test.yml
with:
working-directory: ${{ matrix.job-configs.working-directory }}
python-version: ${{ matrix.job-configs.python-version }}
secrets: inherit
# Run extended test suites that require additional dependencies
extended-tests:
name: "Extended Tests"
needs: [build]
if: ${{ needs.build.outputs.extended-tests != '[]' }}
strategy:
matrix:
# note different variable for extended test dirs
job-configs: ${{ fromJson(needs.build.outputs.extended-tests) }}
fail-fast: false
runs-on: ubuntu-latest
timeout-minutes: 20
defaults:
run:
working-directory: ${{ matrix.job-configs.working-directory }}
steps:
- uses: actions/checkout@v5
- name: "🐍 Set up Python ${{ matrix.job-configs.python-version }} + UV"
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ matrix.job-configs.python-version }}
cache-suffix: extended-tests-${{ matrix.job-configs.working-directory }}
working-directory: ${{ matrix.job-configs.working-directory }}
- name: "📦 Install Dependencies & Run Extended Tests"
shell: bash
run: |
echo "Running extended tests, installing dependencies with uv..."
uv venv
uv sync --group test
VIRTUAL_ENV=.venv uv pip install -r extended_testing_deps.txt
VIRTUAL_ENV=.venv make extended_tests
- name: "🧹 Verify Clean Working Directory"
shell: bash
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'
# Run codspeed benchmarks only on packages that have changed files
codspeed:
name: "⚡ CodSpeed Benchmarks"
needs: [build]
if: ${{ needs.build.outputs.codspeed != '[]' && !contains(github.event.pull_request.labels.*.name, 'codspeed-ignore') }}
runs-on: ubuntu-latest
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.codspeed) }}
fail-fast: false
steps:
- uses: actions/checkout@v5
- name: "📦 Install UV Package Manager"
uses: astral-sh/setup-uv@v7
with:
python-version: "3.13"
- uses: actions/setup-python@v6
with:
python-version: "3.13"
- name: "📦 Install Test Dependencies"
run: uv sync --group test
working-directory: ${{ matrix.job-configs.working-directory }}
- name: "⚡ Run Benchmarks: ${{ matrix.job-configs.working-directory }}"
uses: CodSpeedHQ/action@v4
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
ANTHROPIC_FILES_API_IMAGE_ID: ${{ secrets.ANTHROPIC_FILES_API_IMAGE_ID }}
ANTHROPIC_FILES_API_PDF_ID: ${{ secrets.ANTHROPIC_FILES_API_PDF_ID }}
AZURE_OPENAI_API_VERSION: ${{ secrets.AZURE_OPENAI_API_VERSION }}
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
DEEPSEEK_API_KEY: ${{ secrets.DEEPSEEK_API_KEY }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
HUGGINGFACEHUB_API_TOKEN: ${{ secrets.HUGGINGFACEHUB_API_TOKEN }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
NOMIC_API_KEY: ${{ secrets.NOMIC_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
PPLX_API_KEY: ${{ secrets.PPLX_API_KEY }}
XAI_API_KEY: ${{ secrets.XAI_API_KEY }}
with:
token: ${{ secrets.CODSPEED_TOKEN }}
run: |
cd ${{ matrix.job-configs.working-directory }}
if [ "${{ matrix.job-configs.working-directory }}" = "libs/core" ]; then
uv run --no-sync pytest ./tests/benchmarks --codspeed
else
uv run --no-sync pytest ./tests/ --codspeed
fi
mode: ${{ matrix.job-configs.working-directory == 'libs/core' && 'walltime' || 'instrumentation' }}
# Final status check - ensures all required jobs passed before allowing merge
ci_success:
name: "✅ CI Success"
needs:
[
build,
lint,
test,
compile-integration-tests,
extended-tests,
test-pydantic,
codspeed,
]
if: |
always()
runs-on: ubuntu-latest
env:
JOBS_JSON: ${{ toJSON(needs) }}
RESULTS_JSON: ${{ toJSON(needs.*.result) }}
EXIT_CODE: ${{!contains(needs.*.result, 'failure') && !contains(needs.*.result, 'cancelled') && '0' || '1'}}
steps:
- name: "🎉 All Checks Passed"
run: |
echo $JOBS_JSON
echo $RESULTS_JSON
echo "Exiting with $EXIT_CODE"
exit $EXIT_CODE
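The ci_success gate collapses every needed job's result into one required status check: the EXIT_CODE expression passes unless some result is 'failure' or 'cancelled', so skipped jobs still count as success. The same predicate in Python, with a hypothetical results map:

def ci_success(results):
    # Mirror of the EXIT_CODE expression: pass unless any needed job
    # failed or was cancelled (skipped jobs still count as success).
    return not any(r in ("failure", "cancelled") for r in results.values())

results = {"lint": "success", "test": "success", "codspeed": "skipped"}
print(0 if ci_success(results) else 1)  # -> 0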

View File

@@ -1,181 +0,0 @@
# Routine integration tests against partner libraries with live API credentials.
#
# Uses `make integration_tests` for each library in the matrix.
#
# Runs daily. Can also be triggered manually for immediate updates.
name: "⏰ Integration Tests"
run-name: "Run Integration Tests - ${{ inputs.working-directory-force || 'all libs' }} (Python ${{ inputs.python-version-force || '3.10, 3.13' }})"
on:
workflow_dispatch:
inputs:
working-directory-force:
type: string
description: "From which folder this pipeline executes - defaults to all in matrix - example value: libs/partners/anthropic"
python-version-force:
type: string
description: "Python version to use - defaults to 3.10 and 3.13 in matrix - example value: 3.11"
schedule:
- cron: "0 13 * * *" # Runs daily at 1PM UTC (9AM EDT/6AM PDT)
permissions:
contents: read
env:
UV_FROZEN: "true"
DEFAULT_LIBS: '["libs/partners/openai", "libs/partners/anthropic", "libs/partners/fireworks", "libs/partners/groq", "libs/partners/mistralai", "libs/partners/xai", "libs/partners/google-vertexai", "libs/partners/google-genai", "libs/partners/aws"]'
jobs:
# Generate dynamic test matrix based on input parameters or defaults
# Only runs on the main repo (for scheduled runs) or when manually triggered
compute-matrix:
if: github.repository_owner == 'langchain-ai' || github.event_name != 'schedule'
runs-on: ubuntu-latest
name: "📋 Compute Test Matrix"
outputs:
matrix: ${{ steps.set-matrix.outputs.matrix }}
steps:
- name: "🔢 Generate Python & Library Matrix"
id: set-matrix
env:
DEFAULT_LIBS: ${{ env.DEFAULT_LIBS }}
WORKING_DIRECTORY_FORCE: ${{ github.event.inputs.working-directory-force || '' }}
PYTHON_VERSION_FORCE: ${{ github.event.inputs.python-version-force || '' }}
run: |
# echo "matrix=..." where matrix is a json formatted str with keys python-version and working-directory
# python-version should default to 3.10 and 3.13, but is overridden to [PYTHON_VERSION_FORCE] if set
# working-directory should default to DEFAULT_LIBS, but is overridden to [WORKING_DIRECTORY_FORCE] if set
python_version='["3.10", "3.13"]'
working_directory="$DEFAULT_LIBS"
if [ -n "$PYTHON_VERSION_FORCE" ]; then
python_version="[\"$PYTHON_VERSION_FORCE\"]"
fi
if [ -n "$WORKING_DIRECTORY_FORCE" ]; then
working_directory="[\"$WORKING_DIRECTORY_FORCE\"]"
fi
matrix="{\"python-version\": $python_version, \"working-directory\": $working_directory}"
echo $matrix
echo "matrix=$matrix" >> $GITHUB_OUTPUT
# Run integration tests against partner libraries with live API credentials
build:
if: github.repository_owner == 'langchain-ai' || github.event_name != 'schedule'
name: "🐍 Python ${{ matrix.python-version }}: ${{ matrix.working-directory }}"
runs-on: ubuntu-latest
needs: [compute-matrix]
timeout-minutes: 30
strategy:
fail-fast: false
matrix:
python-version: ${{ fromJSON(needs.compute-matrix.outputs.matrix).python-version }}
working-directory: ${{ fromJSON(needs.compute-matrix.outputs.matrix).working-directory }}
steps:
- uses: actions/checkout@v5
with:
path: langchain
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain-google
path: langchain-google
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain-aws
path: langchain-aws
- name: "📦 Organize External Libraries"
run: |
rm -rf \
langchain/libs/partners/google-genai \
langchain/libs/partners/google-vertexai
mv langchain-google/libs/genai langchain/libs/partners/google-genai
mv langchain-google/libs/vertexai langchain/libs/partners/google-vertexai
mv langchain-aws/libs/aws langchain/libs/partners/aws
- name: "🐍 Set up Python ${{ matrix.python-version }} + UV"
uses: "./langchain/.github/actions/uv_setup"
with:
python-version: ${{ matrix.python-version }}
- name: "🔐 Authenticate to Google Cloud"
id: "auth"
uses: google-github-actions/auth@v3
with:
credentials_json: "${{ secrets.GOOGLE_CREDENTIALS }}"
- name: "🔐 Configure AWS Credentials"
uses: aws-actions/configure-aws-credentials@v5
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ secrets.AWS_REGION }}
- name: "📦 Install Dependencies"
run: |
echo "Running scheduled tests, installing dependencies with uv..."
cd langchain/${{ matrix.working-directory }}
uv sync --group test --group test_integration
- name: "🚀 Run Integration Tests"
env:
AI21_API_KEY: ${{ secrets.AI21_API_KEY }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
ANTHROPIC_FILES_API_IMAGE_ID: ${{ secrets.ANTHROPIC_FILES_API_IMAGE_ID }}
ANTHROPIC_FILES_API_PDF_ID: ${{ secrets.ANTHROPIC_FILES_API_PDF_ID }}
ASTRA_DB_API_ENDPOINT: ${{ secrets.ASTRA_DB_API_ENDPOINT }}
ASTRA_DB_APPLICATION_TOKEN: ${{ secrets.ASTRA_DB_APPLICATION_TOKEN }}
ASTRA_DB_KEYSPACE: ${{ secrets.ASTRA_DB_KEYSPACE }}
AZURE_OPENAI_API_VERSION: ${{ secrets.AZURE_OPENAI_API_VERSION }}
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
DEEPSEEK_API_KEY: ${{ secrets.DEEPSEEK_API_KEY }}
ES_URL: ${{ secrets.ES_URL }}
ES_CLOUD_ID: ${{ secrets.ES_CLOUD_ID }}
ES_API_KEY: ${{ secrets.ES_API_KEY }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
GOOGLE_SEARCH_API_KEY: ${{ secrets.GOOGLE_SEARCH_API_KEY }}
GOOGLE_CSE_ID: ${{ secrets.GOOGLE_CSE_ID }}
GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
HUGGINGFACEHUB_API_TOKEN: ${{ secrets.HUGGINGFACEHUB_API_TOKEN }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
MONGODB_ATLAS_URI: ${{ secrets.MONGODB_ATLAS_URI }}
NOMIC_API_KEY: ${{ secrets.NOMIC_API_KEY }}
NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
PPLX_API_KEY: ${{ secrets.PPLX_API_KEY }}
TOGETHER_API_KEY: ${{ secrets.TOGETHER_API_KEY }}
UPSTAGE_API_KEY: ${{ secrets.UPSTAGE_API_KEY }}
WATSONX_APIKEY: ${{ secrets.WATSONX_APIKEY }}
WATSONX_PROJECT_ID: ${{ secrets.WATSONX_PROJECT_ID }}
XAI_API_KEY: ${{ secrets.XAI_API_KEY }}
LANGCHAIN_TESTS_USER_AGENT: ${{ secrets.LANGCHAIN_TESTS_USER_AGENT }}
run: |
cd langchain/${{ matrix.working-directory }}
make integration_tests
- name: "🧹 Clean up External Libraries"
# Clean up external libraries to avoid affecting the following git status check
run: |
rm -rf \
langchain/libs/partners/google-genai \
langchain/libs/partners/google-vertexai \
langchain/libs/partners/aws
- name: "🧹 Verify Clean Working Directory"
working-directory: langchain
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

38
.github/workflows/linkcheck.yml vendored Normal file
View File

@@ -0,0 +1,38 @@
name: linkcheck
on:
push:
branches: [master]
pull_request:
paths:
- 'docs/**'
env:
POETRY_VERSION: "1.4.2"
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.11"
steps:
- uses: actions/checkout@v3
- name: Install poetry
run: |
pipx install poetry==$POETRY_VERSION
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
cache: poetry
- name: Install dependencies
run: |
poetry install --with docs
- name: Build the docs
run: |
make docs_build
- name: Analyzing the docs with linkcheck
run: |
make docs_linkcheck

36
.github/workflows/lint.yml vendored Normal file
View File

@@ -0,0 +1,36 @@
name: lint
on:
push:
branches: [master]
pull_request:
env:
POETRY_VERSION: "1.4.2"
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.8"
- "3.9"
- "3.10"
- "3.11"
steps:
- uses: actions/checkout@v3
- name: Install poetry
run: |
pipx install poetry==$POETRY_VERSION
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
with:
python-version: ${{ matrix.python-version }}
cache: poetry
- name: Install dependencies
run: |
poetry install
- name: Analysing the code with our lint
run: |
make lint

View File

@@ -1,28 +0,0 @@
# Label PRs based on changed files.
#
# See `.github/pr-file-labeler.yml` to see rules for each label/directory.
name: "🏷️ Pull Request Labeler"
on:
# Safe since we're not checking out or running the PR's code
# Never check out the PR's head in a pull_request_target job
pull_request_target:
types: [opened, synchronize, reopened, edited]
jobs:
labeler:
name: "label"
permissions:
contents: read
pull-requests: write
issues: write
runs-on: ubuntu-latest
steps:
- name: Label Pull Request
uses: actions/labeler@v6
with:
repo-token: "${{ secrets.GITHUB_TOKEN }}"
configuration-path: .github/pr-file-labeler.yml
sync-labels: false

View File

@@ -1,44 +0,0 @@
# Label PRs based on their titles.
#
# Uses conventional commit types from PR titles to apply labels.
# Note: Scope-based labeling (e.g., integration labels) is handled by pr_labeler_file.yml
name: "🏷️ PR Title Labeler"
on:
# Safe since we're not checking out or running the PR's code
# Never check out the PR's head in a pull_request_target job
pull_request_target:
types: [opened, edited]
jobs:
pr-title-labeler:
name: "label"
permissions:
contents: read
pull-requests: write
issues: write
runs-on: ubuntu-latest
steps:
- name: Label PR based on title
uses: bcoe/conventional-release-labels@v1
with:
token: ${{ secrets.GITHUB_TOKEN }}
type_labels: >-
{
"feat": "feature",
"fix": "fix",
"docs": "documentation",
"style": "linting",
"refactor": "refactor",
"perf": "performance",
"test": "tests",
"build": "infra",
"ci": "infra",
"chore": "infra",
"revert": "revert",
"release": "release",
"breaking": "breaking"
}
ignored_types: '[]'

View File

@@ -1,110 +0,0 @@
# PR title linting.
#
# FORMAT (Conventional Commits 1.0.0):
#
# <type>[optional scope]: <description>
# [optional body]
# [optional footer(s)]
#
# Examples:
# feat(core): add multitenant support
# fix(cli): resolve flag parsing error
# docs: update API usage examples
# docs(openai): update API usage examples
#
# Allowed Types:
# * feat — a new feature (MINOR)
# * fix — a bug fix (PATCH)
# * docs — documentation only changes
# * style — formatting, linting, etc.; no code change or typing refactors
# * refactor — code change that neither fixes a bug nor adds a feature
# * perf — code change that improves performance
# * test — adding tests or correcting existing
# * build — changes that affect the build system/external dependencies
# * ci — continuous integration/configuration changes
# * chore — other changes that don't modify source or test files
# * revert — reverts a previous commit
# * release — prepare a new release
#
# Allowed Scope(s) (optional):
# core, cli, langchain, langchain_v1, langchain-classic, standard-tests,
# text-splitters, docs, anthropic, chroma, deepseek, exa, fireworks, groq,
# huggingface, mistralai, nomic, ollama, openai, perplexity, prompty, qdrant,
# xai, infra, deps
#
# Multiple scopes can be used by separating them with a comma.
#
# Rules:
# 1. The 'Type' must start with a lowercase letter.
# 2. Breaking changes: append "!" after type/scope (e.g., feat!: drop x support)
# 3. When releasing (updating the pyproject.toml and uv.lock), the commit message
# should be: `release(scope): x.y.z` (e.g., `release(core): 1.2.0` with no
# body, footer, or preceeding/proceeding text).
#
# Enforces Conventional Commits format for pull request titles to maintain a clear and
# machine-readable change history.
name: "🏷️ PR Title Lint"
permissions:
pull-requests: read
on:
pull_request:
types: [opened, edited, synchronize]
jobs:
# Validates that PR title follows Conventional Commits 1.0.0 specification
lint-pr-title:
name: "validate format"
runs-on: ubuntu-latest
steps:
- name: "✅ Validate Conventional Commits Format"
uses: amannn/action-semantic-pull-request@v6
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
types: |
feat
fix
docs
style
refactor
perf
test
build
ci
chore
revert
release
scopes: |
core
cli
langchain
langchain-classic
model-profiles
standard-tests
text-splitters
docs
anthropic
chroma
deepseek
exa
fireworks
groq
huggingface
mistralai
nomic
ollama
openai
perplexity
prompty
qdrant
xai
infra
requireScope: false
disallowScopes: |
release
[A-Z]+
ignoreLabels: |
ignore-lint-pr-title

49
.github/workflows/release.yml vendored Normal file
View File

@@ -0,0 +1,49 @@
name: release
on:
pull_request:
types:
- closed
branches:
- master
paths:
- 'pyproject.toml'
env:
POETRY_VERSION: "1.4.2"
jobs:
if_release:
if: |
${{ github.event.pull_request.merged == true }}
&& ${{ contains(github.event.pull_request.labels.*.name, 'release') }}
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Install poetry
run: pipx install poetry==$POETRY_VERSION
- name: Set up Python 3.10
uses: actions/setup-python@v4
with:
python-version: "3.10"
cache: "poetry"
- name: Build project for distribution
run: poetry build
- name: Check Version
id: check-version
run: |
echo version=$(poetry version --short) >> $GITHUB_OUTPUT
- name: Create Release
uses: ncipollo/release-action@v1
with:
artifacts: "dist/*"
token: ${{ secrets.GITHUB_TOKEN }}
draft: false
generateReleaseNotes: true
tag: v${{ steps.check-version.outputs.version }}
commit: master
- name: Publish to PyPI
env:
POETRY_PYPI_TOKEN_PYPI: ${{ secrets.PYPI_API_TOKEN }}
run: |
poetry publish

49
.github/workflows/test.yml vendored Normal file
View File

@@ -0,0 +1,49 @@
name: test
on:
push:
branches: [master]
pull_request:
workflow_dispatch:
env:
POETRY_VERSION: "1.4.2"
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.8"
- "3.9"
- "3.10"
- "3.11"
test_type:
- "core"
- "extended"
name: Python ${{ matrix.python-version }} ${{ matrix.test_type }}
steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ matrix.python-version }}
poetry-version: "1.4.2"
cache-key: ${{ matrix.test_type }}
install-command: |
if [ "${{ matrix.test_type }}" == "core" ]; then
echo "Running core tests, installing dependencies with poetry..."
poetry install
else
echo "Running extended tests, installing dependencies with poetry..."
poetry install -E extended_testing
fi
- name: Run ${{matrix.test_type}} tests
run: |
if [ "${{ matrix.test_type }}" == "core" ]; then
make test
else
make extended_tests
fi
shell: bash

View File

@@ -1,164 +0,0 @@
# Build the API reference documentation for v0.3 branch.
#
# Manual trigger only.
#
# Built HTML pushed to langchain-ai/langchain-api-docs-html.
#
# Looks for langchain-ai org repos in packages.yml and checks them out.
# Calls prep_api_docs_build.py.
name: "📚 API Docs (v0.3)"
run-name: "Build & Deploy API Reference (v0.3)"
on:
workflow_dispatch:
env:
PYTHON_VERSION: "3.11"
jobs:
build:
if: github.repository == 'langchain-ai/langchain' || github.event_name != 'schedule'
runs-on: ubuntu-latest
permissions:
contents: read
steps:
- uses: actions/checkout@v5
with:
ref: v0.3
path: langchain
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain-api-docs-html
path: langchain-api-docs-html
token: ${{ secrets.TOKEN_GITHUB_API_DOCS_HTML }}
- name: "📋 Extract Repository List with yq"
id: get-unsorted-repos
uses: mikefarah/yq@master
with:
cmd: |
# Extract repos from packages.yml that are in the langchain-ai org
# (excluding 'langchain' itself)
yq '
.packages[]
| select(
(
(.repo | test("^langchain-ai/"))
and
(.repo != "langchain-ai/langchain")
)
or
(.include_in_api_ref // false)
)
| .repo
' langchain/libs/packages.yml
- name: "📋 Parse YAML & Checkout Repositories"
env:
REPOS_UNSORTED: ${{ steps.get-unsorted-repos.outputs.result }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
# Get unique repositories
REPOS=$(echo "$REPOS_UNSORTED" | sort -u)
# Checkout each unique repository
for repo in $REPOS; do
# Validate repository format (allow any org with proper format)
if [[ ! "$repo" =~ ^[a-zA-Z0-9_.-]+/[a-zA-Z0-9_.-]+$ ]]; then
echo "Error: Invalid repository format: $repo"
exit 1
fi
REPO_NAME=$(echo $repo | cut -d'/' -f2)
# Additional validation for repo name
if [[ ! "$REPO_NAME" =~ ^[a-zA-Z0-9_.-]+$ ]]; then
echo "Error: Invalid repository name: $REPO_NAME"
exit 1
fi
echo "Checking out $repo to $REPO_NAME"
# Special handling for langchain-tavily: checkout by commit hash
if [[ "$REPO_NAME" == "langchain-tavily" ]]; then
git clone https://github.com/$repo.git $REPO_NAME
cd $REPO_NAME
git checkout f3515654724a9e87bdfe2c2f509d6cdde646e563
cd ..
else
git clone --depth 1 --branch v0.3 https://github.com/$repo.git $REPO_NAME
fi
done
- name: "🐍 Setup Python ${{ env.PYTHON_VERSION }}"
uses: actions/setup-python@v6
id: setup-python
with:
python-version: ${{ env.PYTHON_VERSION }}
- name: "📦 Install Initial Python Dependencies using uv"
working-directory: langchain
run: |
python -m pip install -U uv
python -m uv pip install --upgrade --no-cache-dir pip setuptools pyyaml
- name: "📦 Organize Library Directories"
# Places cloned partner packages into libs/partners structure
run: python langchain/.github/scripts/prep_api_docs_build.py
- name: "🧹 Clear Prior Build"
run:
# Remove artifacts from prior docs build
rm -rf langchain-api-docs-html/api_reference_build/html
- name: "📦 Install Documentation Dependencies using uv"
working-directory: langchain
run: |
# Install all partner packages in editable mode with overrides
python -m uv pip install $(ls ./libs/partners | grep -v azure-ai | xargs -I {} echo "./libs/partners/{}") --overrides ./docs/vercel_overrides.txt --prerelease=allow
# Install langchain-azure-ai with tools extra
python -m uv pip install "./libs/partners/azure-ai[tools]" --overrides ./docs/vercel_overrides.txt --prerelease=allow
# Install core langchain and other main packages
python -m uv pip install libs/core libs/langchain libs/text-splitters libs/community libs/experimental libs/standard-tests
# Install Sphinx and related packages for building docs
python -m uv pip install -r docs/api_reference/requirements.txt
- name: "🔧 Configure Git Settings"
working-directory: langchain
run: |
git config --local user.email "actions@github.com"
git config --local user.name "Github Actions"
- name: "📚 Build API Documentation"
working-directory: langchain
run: |
# Generate the API reference RST files
python docs/api_reference/create_api_rst.py
# Build the HTML documentation using Sphinx
# -T: show full traceback on exception
# -E: don't use cached environment (force rebuild, ignore cached doctrees)
# -b html: build HTML docs (vs. PDF, etc.)
# -d: path for the cached environment (parsed document trees / doctrees)
# - Separate from output dir for faster incremental builds
# -c: path to conf.py
# -j auto: parallel build using all available CPU cores
python -m sphinx -T -E -b html -d ../langchain-api-docs-html/_build/doctrees -c docs/api_reference docs/api_reference ../langchain-api-docs-html/api_reference_build/html -j auto
# Post-process the generated HTML
python docs/api_reference/scripts/custom_formatter.py ../langchain-api-docs-html/api_reference_build/html
# Default index page is blank so we copy in the actual home page.
cp ../langchain-api-docs-html/api_reference_build/html/{reference,index}.html
# Removes Sphinx's intermediate build artifacts after the build is complete.
rm -rf ../langchain-api-docs-html/_build/
# Commit and push changes to langchain-api-docs-html repo
- uses: EndBug/add-and-commit@v9
with:
cwd: langchain-api-docs-html
message: "Update API docs build from v0.3 branch"
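The yq filter above keeps packages whose repo lives in the langchain-ai org (excluding the monorepo itself) or that set include_in_api_ref. The same selection sketched in Python with PyYAML, assuming the packages.yml shape the filter implies:

import yaml  # PyYAML, already installed by the workflow above

def repos_to_checkout(packages_yml):
    # Repos in the langchain-ai org (minus the monorepo itself),
    # plus anything explicitly flagged include_in_api_ref.
    with open(packages_yml) as f:
        data = yaml.safe_load(f)
    repos = set()
    for pkg in data["packages"]:
        repo = pkg.get("repo", "")
        in_org = repo.startswith("langchain-ai/") and repo != "langchain-ai/langchain"
        if in_org or pkg.get("include_in_api_ref", False):
            repos.add(repo)
    return sorted(repos)  # unique + sorted, like `sort -u`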

View File

@@ -1,8 +0,0 @@
With the deprecation of v0 docs, the following files will need to be migrated/supported
in the new docs repo:
- run_notebooks.yml: New repo should run Integration tests on code snippets?
- people.yml: Need to fix and somehow display on the new docs site
- Subsequently, `.github/actions/people/`
- _test_doc_imports.yml
- check-broken-links.yml

34
.gitignore vendored
View File

@@ -1,8 +1,6 @@
.vs/
.claude/
.vscode/
.idea/
#Emacs backup
*~
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
@@ -32,12 +30,6 @@ share/python-wheels/
*.egg
MANIFEST
# Google GitHub Actions credentials files created by:
# https://github.com/google-github-actions/auth
#
# That action recommends adding this gitignore to prevent accidentally committing keys.
gha-creds-*.json
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
@@ -61,7 +53,6 @@ coverage.xml
*.py,cover
.hypothesis/
.pytest_cache/
.codspeed/
# Translations
*.mo
@@ -80,6 +71,9 @@ instance/
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
@@ -114,11 +108,13 @@ celerybeat.pid
# Environments
.env
.envrc
.venv*
venv*
.venv
.venvs
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
@@ -132,7 +128,6 @@ env.bak/
# mypy
.mypy_cache/
.mypy_cache_test/
.dmypy.json
dmypy.json
@@ -154,15 +149,4 @@ wandb/
# integration test artifacts
data_map*
\[('_type', 'fake'), ('stop', None)]
# Replit files
*replit*
node_modules
prof
virtualenv/
scratch/
.langgraph_api/
\[('_type', 'fake'), ('stop', None)]

View File

@@ -1,14 +0,0 @@
{
"MD013": false,
"MD024": {
"siblings_only": true
},
"MD025": false,
"MD033": false,
"MD034": false,
"MD036": false,
"MD041": false,
"MD046": {
"style": "fenced"
}
}
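For context, rule configs like this are consumed by markdownlint tooling, such as the `davidanson.vscode-markdownlint` extension recommended elsewhere in this repo. An illustrative CLI invocation (assuming `markdownlint-cli2`, which picks up `.markdownlint.json`, is available):

```bash
npx markdownlint-cli2 "**/*.md"
```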

View File

@@ -1,8 +0,0 @@
{
"mcpServers": {
"docs-langchain": {
"type": "http",
"url": "https://docs.langchain.com/mcp"
}
}
}

View File

@@ -1,99 +0,0 @@
repos:
- repo: local
hooks:
- id: core
name: format and lint core
language: system
entry: make -C libs/core format lint
files: ^libs/core/
pass_filenames: false
- id: langchain
name: format and lint langchain
language: system
entry: make -C libs/langchain format lint
files: ^libs/langchain/
pass_filenames: false
- id: standard-tests
name: format and lint standard-tests
language: system
entry: make -C libs/standard-tests format lint
files: ^libs/standard-tests/
pass_filenames: false
- id: text-splitters
name: format and lint text-splitters
language: system
entry: make -C libs/text-splitters format lint
files: ^libs/text-splitters/
pass_filenames: false
- id: anthropic
name: format and lint partners/anthropic
language: system
entry: make -C libs/partners/anthropic format lint
files: ^libs/partners/anthropic/
pass_filenames: false
- id: chroma
name: format and lint partners/chroma
language: system
entry: make -C libs/partners/chroma format lint
files: ^libs/partners/chroma/
pass_filenames: false
- id: exa
name: format and lint partners/exa
language: system
entry: make -C libs/partners/exa format lint
files: ^libs/partners/exa/
pass_filenames: false
- id: fireworks
name: format and lint partners/fireworks
language: system
entry: make -C libs/partners/fireworks format lint
files: ^libs/partners/fireworks/
pass_filenames: false
- id: groq
name: format and lint partners/groq
language: system
entry: make -C libs/partners/groq format lint
files: ^libs/partners/groq/
pass_filenames: false
- id: huggingface
name: format and lint partners/huggingface
language: system
entry: make -C libs/partners/huggingface format lint
files: ^libs/partners/huggingface/
pass_filenames: false
- id: mistralai
name: format and lint partners/mistralai
language: system
entry: make -C libs/partners/mistralai format lint
files: ^libs/partners/mistralai/
pass_filenames: false
- id: nomic
name: format and lint partners/nomic
language: system
entry: make -C libs/partners/nomic format lint
files: ^libs/partners/nomic/
pass_filenames: false
- id: ollama
name: format and lint partners/ollama
language: system
entry: make -C libs/partners/ollama format lint
files: ^libs/partners/ollama/
pass_filenames: false
- id: openai
name: format and lint partners/openai
language: system
entry: make -C libs/partners/openai format lint
files: ^libs/partners/openai/
pass_filenames: false
- id: prompty
name: format and lint partners/prompty
language: system
entry: make -C libs/partners/prompty format lint
files: ^libs/partners/prompty/
pass_filenames: false
- id: qdrant
name: format and lint partners/qdrant
language: system
entry: make -C libs/partners/qdrant format lint
files: ^libs/partners/qdrant/
pass_filenames: false
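A sketch of how a config like this is typically exercised, assuming the `pre-commit` CLI is installed:

```bash
# Register the hooks with git, then run every hook once across the repo
pre-commit install
pre-commit run --all-files
```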

26
.readthedocs.yaml Normal file
View File

@@ -0,0 +1,26 @@
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
# Required
version: 2
# Set the version of Python and other tools you might need
build:
os: ubuntu-22.04
tools:
python: "3.11"
# Build documentation in the docs/ directory with Sphinx
sphinx:
configuration: docs/conf.py
# If using Sphinx, optionally build your docs in additional formats such as PDF
# formats:
# - pdf
# Optionally declare the Python requirements required to build your docs
python:
install:
- requirements: docs/requirements.txt
- method: pip
path: .

View File

@@ -1,21 +0,0 @@
{
"recommendations": [
"ms-python.python",
"charliermarsh.ruff",
"ms-python.mypy-type-checker",
"ms-toolsai.jupyter",
"ms-toolsai.jupyter-keymap",
"ms-toolsai.jupyter-renderers",
"ms-toolsai.vscode-jupyter-cell-tags",
"ms-toolsai.vscode-jupyter-slideshow",
"yzhang.markdown-all-in-one",
"davidanson.vscode-markdownlint",
"bierner.markdown-mermaid",
"bierner.markdown-preview-github-styles",
"eamodio.gitlens",
"github.vscode-pull-request-github",
"github.vscode-github-actions",
"redhat.vscode-yaml",
"editorconfig.editorconfig",
],
}

78
.vscode/settings.json vendored
View File

@@ -1,78 +0,0 @@
{
"python.analysis.include": [
"libs/**",
],
"python.analysis.exclude": [
"**/node_modules",
"**/__pycache__",
"**/.pytest_cache",
"**/.*",
],
"python.analysis.autoImportCompletions": true,
"python.analysis.typeCheckingMode": "basic",
"python.testing.cwd": "${workspaceFolder}",
"python.linting.enabled": true,
"python.linting.ruffEnabled": true,
"[python]": {
"editor.formatOnSave": true,
"editor.codeActionsOnSave": {
"source.organizeImports.ruff": "explicit",
"source.fixAll": "explicit"
},
"editor.defaultFormatter": "charliermarsh.ruff"
},
"editor.rulers": [
88
],
"editor.tabSize": 4,
"editor.insertSpaces": true,
"editor.trimAutoWhitespace": true,
"files.trimTrailingWhitespace": true,
"files.insertFinalNewline": true,
"files.exclude": {
"**/__pycache__": true,
"**/.pytest_cache": true,
"**/*.pyc": true,
"**/.mypy_cache": true,
"**/.ruff_cache": true,
"_dist/**": true,
"**/node_modules": true,
"**/.git": false
},
"search.exclude": {
"**/__pycache__": true,
"**/*.pyc": true,
"_dist/**": true,
"**/node_modules": true,
"**/.git": true,
"uv.lock": true,
"yarn.lock": true
},
"git.autofetch": true,
"git.enableSmartCommit": true,
"jupyter.askForKernelRestart": false,
"jupyter.interactiveWindow.textEditor.executeSelection": true,
"[markdown]": {
"editor.wordWrap": "on",
"editor.quickSuggestions": {
"comments": "off",
"strings": "off",
"other": "off"
}
},
"[yaml]": {
"editor.tabSize": 2,
"editor.insertSpaces": true
},
"[json]": {
"editor.tabSize": 2,
"editor.insertSpaces": true
},
"python.terminal.activateEnvironment": false,
"python.defaultInterpreterPath": "./.venv/bin/python",
"github.copilot.chat.commitMessageGeneration.instructions": [
{
"file": ".github/workflows/pr_lint.yml"
}
]
}

326
AGENTS.md
View File

@@ -1,326 +0,0 @@
# Global Development Guidelines for LangChain Projects
## Core Development Principles
### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL
**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**
**Bad - Breaking Change:**
```python
def get_user(id, verbose=False): # Changed from `user_id`
pass
```
**Good - Stable Interface:**
```python
def get_user(user_id: str, verbose: bool = False) -> User:
"""Retrieve user by ID with optional verbose output."""
pass
```
**Before making ANY changes to public APIs:**
- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"` (see the sketch below)
- Mark experimental features clearly with docstring warnings (using MkDocs Material admonitions, like `!!! warning`)
🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?"
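A minimal sketch of that pattern, extending the `get_user` example above (`include_inactive` is a hypothetical new parameter):

```python
class User: ...  # hypothetical return type, as in the example above


def get_user(user_id: str, verbose: bool = False, *, include_inactive: bool = False) -> User:
    """Retrieve user by ID with optional verbose output.

    Args:
        user_id: Identifier of the user to fetch.
        verbose: Whether to include verbose output.
        include_inactive: Hypothetical new option. Keyword-only, so existing
            positional callers are unaffected.
    """
    ...
```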
### 2. Code Quality Standards
**All Python code MUST include type hints and return types.**
**Bad:**
```python
def p(u, d):
return [x for x in u if x not in d]
```
**Good:**
```python
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
"""Filter out users that are not in the known users set.
Args:
users: List of user identifiers to filter.
known_users: Set of known/valid user identifiers.
Returns:
List of users that are not in the known_users set.
"""
return [user for user in users if user not in known_users]
```
**Style Requirements:**
- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
- Avoid unnecessary abstraction or premature optimization
- Follow existing patterns in the codebase you're modifying
### 3. Testing Requirements
**Every new feature or bugfix MUST be covered by unit tests.**
**Test Organization:**
- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests: `tests/integration_tests/` (network calls permitted)
- Use `pytest` as the testing framework
**Test Quality Checklist:**
- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)
Checklist questions:
- [ ] Does the test suite fail if your new logic is broken?
- [ ] Are all expected behaviors exercised (happy path, invalid input, etc)?
- [ ] Do tests use fixtures or mocks where needed?
```python
def test_filter_unknown_users():
"""Test filtering unknown users from a list."""
users = ["alice", "bob", "charlie"]
known_users = {"alice", "bob"}
result = filter_unknown_users(users, known_users)
assert result == ["charlie"]
assert len(result) == 1
```
### 4. Security and Risk Assessment
**Security Checklist:**
- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
- Remove unreachable/commented code before committing
- Watch for race conditions and resource leaks (file handles, sockets, threads)
- Ensure proper resource cleanup (file handles, connections)
**Bad:**
```python
def load_config(path):
with open(path) as f:
return eval(f.read()) # ⚠️ Never eval config
```
**Good:**
```python
import json
def load_config(path: str) -> dict:
with open(path) as f:
return json.load(f)
```
### 5. Documentation Standards
**Use Google-style docstrings with Args section for all public functions.**
**Insufficient Documentation:**
```python
def send_email(to, msg):
"""Send an email to a recipient."""
```
**Complete Documentation:**
```python
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
"""
Send an email to a recipient with specified priority.
Args:
to: The email address of the recipient.
msg: The message body to send.
priority: Email priority level (`'low'`, `'normal'`, `'high'`).
Returns:
`True` if email was sent successfully, `False` otherwise.
Raises:
`InvalidEmailError`: If the email address format is invalid.
`SMTPConnectionError`: If unable to connect to email server.
"""
```
**Documentation Guidelines:**
- Types go in function signatures, NOT in docstrings
- If a default is present, DO NOT repeat it in the docstring unless there is post-processing or it is set conditionally.
- Focus on "why" rather than "what" in descriptions
- Document all parameters, return values, and exceptions
- Keep descriptions concise but clear
- Ensure American English spelling (e.g., "behavior", not "behaviour")
📌 *Tip:* Keep descriptions concise but clear. Only document return values if non-obvious.
### 6. Architectural Improvements
**When you encounter code that could be improved, suggest better designs:**
**Poor Design:**
```python
def process_data(data, db_conn, email_client, logger):
# Function doing too many things
validated = validate_data(data)
result = db_conn.save(validated)
email_client.send_notification(result)
logger.log(f"Processed {len(data)} items")
return result
```
**Better Design:**
```python
@dataclass
class ProcessingResult:
"""Result of data processing operation."""
items_processed: int
success: bool
errors: List[str] = field(default_factory=list)
class DataProcessor:
"""Handles data validation, storage, and notification."""
def __init__(self, db_conn: Database, email_client: EmailClient):
self.db = db_conn
self.email = email_client
def process(self, data: List[dict]) -> ProcessingResult:
"""Process and store data with notifications."""
validated = self._validate_data(data)
result = self.db.save(validated)
self._notify_completion(result)
return result
```
**Design Improvement Areas:**
If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would:
- Reduce code duplication through shared utilities
- Improve separation of concerns (single responsibility)
- Make unit testing easier through dependency injection
- Add clarity without adding complexity
- Prefer dataclasses for structured data
## Development Tools & Commands
### Package Management
```bash
# Add package
uv add package-name
# Sync project dependencies
uv sync
uv lock
```
### Testing
```bash
# Run unit tests (no network)
make test
# Don't run integration tests, as API keys must be set
# Run specific test file
uv run --group test pytest tests/unit_tests/test_specific.py
```
### Code Quality
```bash
# Lint code
make lint
# Format code
make format
# Type checking
uv run --group lint mypy .
```
### Dependency Management Patterns
**Local Development Dependencies:**
```toml
[tool.uv.sources]
langchain-core = { path = "../core", editable = true }
langchain-tests = { path = "../standard-tests", editable = true }
```
**For tools, use the `@tool` decorator from `langchain_core.tools`:**
```python
from langchain_core.tools import tool
@tool
def search_database(query: str) -> str:
"""Search the database for relevant information.
Args:
query: The search query string.
"""
    # Implementation here; placeholder result so the example runs
    results = f"No results found for: {query}"
    return results
```
## Commit Standards
**Use Conventional Commits format for PR titles:**
- `feat(core): add multi-tenant support`
- `fix(cli): resolve flag parsing error`
- `docs: update API usage examples`
- `docs(openai): update API usage examples`
## Framework-Specific Guidelines
- Follow the existing patterns in `langchain-core` for base abstractions
- Use `langchain_core.callbacks` for execution tracking (see the sketch after this list)
- Implement proper streaming support where applicable
- Avoid deprecated components like legacy `LLMChain`
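A minimal sketch of the callback pattern (assuming only that `BaseCallbackHandler` is exported by `langchain_core.callbacks`):

```python
from typing import Any

from langchain_core.callbacks import BaseCallbackHandler


class TokenCountingHandler(BaseCallbackHandler):
    """Counts streamed tokens for execution tracking; illustrative only."""

    def __init__(self) -> None:
        self.token_count = 0

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        """Invoked once per token when streaming is enabled."""
        self.token_count += 1
```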
### Partner Integrations
- Follow the established patterns in existing partner libraries
- Implement standard interfaces (`BaseChatModel`, `BaseEmbeddings`, etc.)
- Include comprehensive integration tests
- Document API key requirements and authentication
---
## Quick Reference Checklist
Before submitting code changes:
- [ ] **Breaking Changes**: Verified no public API changes
- [ ] **Type Hints**: All functions have complete type annotations
- [ ] **Tests**: New functionality is fully tested
- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.)
- [ ] **Documentation**: Google-style docstrings for public functions
- [ ] **Code Quality**: `make lint` and `make format` pass
- [ ] **Architecture**: Suggested improvements where applicable
- [ ] **Commit Message**: Follows Conventional Commits format

View File

@@ -5,4 +5,4 @@ authors:
given-names: "Harrison"
title: "LangChain"
date-released: 2022-10-17
url: "https://github.com/langchain-ai/langchain"
url: "https://github.com/hwchase17/langchain"

326
CLAUDE.md
View File

@@ -1,326 +0,0 @@
[Contents identical to AGENTS.md above; duplicate omitted.]

48
Dockerfile Normal file
View File

@@ -0,0 +1,48 @@
# This is a Dockerfile for running unit tests
ARG POETRY_HOME=/opt/poetry
# Use the Python base image
FROM python:3.11.2-bullseye AS builder
# Define the version of Poetry to install (default is 1.4.2)
ARG POETRY_VERSION=1.4.2
# Define the directory to install Poetry to (default is /opt/poetry)
ARG POETRY_HOME
# Create a Python virtual environment for Poetry and install it
RUN python3 -m venv ${POETRY_HOME} && \
$POETRY_HOME/bin/pip install --upgrade pip && \
$POETRY_HOME/bin/pip install poetry==${POETRY_VERSION}
# Test if Poetry is installed in the expected path
RUN echo "Poetry version:" && $POETRY_HOME/bin/poetry --version
# Set the working directory for the app
WORKDIR /app
# Use a multi-stage build to install dependencies
FROM builder AS dependencies
ARG POETRY_HOME
# Copy only the dependency files for installation
COPY pyproject.toml poetry.lock poetry.toml ./
# Install the Poetry dependencies (this layer will be cached as long as the dependencies don't change)
RUN $POETRY_HOME/bin/poetry install --no-interaction --no-ansi --with test
# Use a multi-stage build to run tests
FROM dependencies AS tests
# Copy the rest of the app source code (this layer will be invalidated and rebuilt whenever the source code changes)
COPY . .
RUN /opt/poetry/bin/poetry install --no-interaction --no-ansi --with test
# Set the entrypoint to run tests using Poetry
ENTRYPOINT ["/opt/poetry/bin/poetry", "run", "pytest"]
# Set the default command to run all unit tests
CMD ["tests/unit_tests"]

12
LICENSE
View File

@@ -1,6 +1,6 @@
MIT License
The MIT License
Copyright (c) LangChain, Inc.
Copyright (c) Harrison Chase
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
@@ -9,13 +9,13 @@ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

View File

@@ -1,9 +0,0 @@
# Migrating
Please see the following guides for migrating LangChain code:
* Migrate to [LangChain v1.0](https://docs.langchain.com/oss/python/migrate/langchain-v1)
* Migrate to [LangChain v0.3](https://python.langchain.com/docs/versions/v0_3/)
* Migrate to [LangChain v0.2](https://python.langchain.com/docs/versions/v0_2/)
* Migrating from [LangChain 0.0.x Chains](https://python.langchain.com/docs/versions/migrating_chains/)
* Upgrade to [LangGraph Memory](https://python.langchain.com/docs/versions/migrating_memory/)

70
Makefile Normal file
View File

@@ -0,0 +1,70 @@
.PHONY: all clean format lint test tests test_watch integration_tests docker_tests help extended_tests
all: help
coverage:
poetry run pytest --cov \
--cov-config=.coveragerc \
--cov-report xml \
--cov-report term-missing:skip-covered
clean: docs_clean
docs_build:
cd docs && poetry run make html
docs_clean:
cd docs && poetry run make clean
docs_linkcheck:
poetry run linkchecker docs/_build/html/index.html
format:
poetry run black .
poetry run ruff --select I --fix .
PYTHON_FILES=.
lint: PYTHON_FILES=.
lint_diff: PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master | grep -E '\.py$$')
lint lint_diff:
poetry run mypy $(PYTHON_FILES)
poetry run black $(PYTHON_FILES) --check
poetry run ruff .
TEST_FILE ?= tests/unit_tests/
test:
poetry run pytest --disable-socket --allow-unix-socket $(TEST_FILE)
tests:
poetry run pytest --disable-socket --allow-unix-socket $(TEST_FILE)
extended_tests:
poetry run pytest --disable-socket --allow-unix-socket --only-extended tests/unit_tests
test_watch:
poetry run ptw --now . -- tests/unit_tests
integration_tests:
poetry run pytest tests/integration_tests
docker_tests:
docker build -t my-langchain-image:test .
docker run --rm my-langchain-image:test
help:
@echo '----'
@echo 'coverage - run unit tests and generate coverage report'
@echo 'docs_build - build the documentation'
@echo 'docs_clean - clean the documentation build artifacts'
@echo 'docs_linkcheck - run linkchecker on the documentation'
@echo 'format - run code formatters'
@echo 'lint - run linters'
@echo 'test - run unit tests'
@echo 'tests - run unit tests'
@echo 'test TEST_FILE=<test_file> - run all tests in file'
@echo 'extended_tests - run only extended unit tests'
@echo 'test_watch - run unit tests in watch mode'
@echo 'integration_tests - run integration tests'
@echo 'docker_tests - run unit tests in docker'

127
README.md
View File

@@ -1,74 +1,93 @@
<div align="center">
<a href="https://www.langchain.com/">
<picture>
<source media="(prefers-color-scheme: light)" srcset=".github/images/logo-dark.svg">
<source media="(prefers-color-scheme: dark)" srcset=".github/images/logo-light.svg">
<img alt="LangChain Logo" src=".github/images/logo-dark.svg" width="80%">
</picture>
</a>
</div>
# 🦜️🔗 LangChain
<div align="center">
<h3>The platform for reliable agents.</h3>
</div>
⚡ Building applications with LLMs through composability ⚡
<div align="center">
<a href="https://opensource.org/licenses/MIT" target="_blank"><img src="https://img.shields.io/pypi/l/langchain" alt="PyPI - License"></a>
<a href="https://pypistats.org/packages/langchain" target="_blank"><img src="https://img.shields.io/pepy/dt/langchain" alt="PyPI - Downloads"></a>
<a href="https://pypi.org/project/langchain/#history" target="_blank"><img src="https://img.shields.io/pypi/v/langchain?label=%20" alt="Version"></a>
<a href="https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain" target="_blank"><img src="https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode" alt="Open in Dev Containers"></a>
<a href="https://codespaces.new/langchain-ai/langchain" target="_blank"><img src="https://github.com/codespaces/badge.svg" alt="Open in Github Codespace" title="Open in Github Codespace" width="150" height="20"></a>
<a href="https://codspeed.io/langchain-ai/langchain" target="_blank"><img src="https://img.shields.io/endpoint?url=https://codspeed.io/badge.json" alt="CodSpeed Badge"></a>
<a href="https://twitter.com/langchainai" target="_blank"><img src="https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI" alt="Twitter / X"></a>
</div>
[![lint](https://github.com/hwchase17/langchain/actions/workflows/lint.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/lint.yml)
[![test](https://github.com/hwchase17/langchain/actions/workflows/test.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/test.yml)
[![linkcheck](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml)
[![Downloads](https://static.pepy.tech/badge/langchain/month)](https://pepy.tech/project/langchain)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)
[![](https://dcbadge.vercel.app/api/server/6adMQxSpJS?compact=true&style=flat)](https://discord.gg/6adMQxSpJS)
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hwchase17/langchain)
[![GitHub star chart](https://img.shields.io/github/stars/hwchase17/langchain?style=social)](https://star-history.com/#hwchase17/langchain)
LangChain is a framework for building agents and LLM-powered applications. It helps you chain together interoperable components and third-party integrations to simplify AI application development, all while future-proofing decisions as the underlying technology evolves.
```bash
pip install langchain
```
Looking for the JS/TS version? Check out [LangChain.js](https://github.com/hwchase17/langchainjs).
If you're looking for more advanced customization or agent orchestration, check out [LangGraph](https://docs.langchain.com/oss/python/langgraph/overview), our framework for building controllable agent workflows.
**Production Support:** As you move your LangChains into production, we'd love to offer more comprehensive support.
Please fill out [this form](https://forms.gle/57d8AmXBYp8PP8tZA) and we'll set up a dedicated support Slack channel.
---
## Quick Install
**Documentation**:
`pip install langchain`
or
`conda install langchain -c conda-forge`
- [docs.langchain.com](https://docs.langchain.com/oss/python/langchain/overview): Comprehensive documentation, including conceptual overviews and guides
- [reference.langchain.com/python](https://reference.langchain.com/python): API reference docs for LangChain packages
## 🤔 What is this?
**Discussions**: Visit the [LangChain Forum](https://forum.langchain.com) to connect with the community and share all of your technical questions, ideas, and feedback.
Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. However, using these LLMs in isolation is often insufficient for creating a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge.
> [!NOTE]
> Looking for the JS/TS library? Check out [LangChain.js](https://github.com/langchain-ai/langchainjs).
This library aims to assist in the development of those types of applications. Common examples of these applications include:
## Why use LangChain?
**❓ Question Answering over specific documents**
LangChain helps developers build applications powered by LLMs through a standard interface for models, embeddings, vector stores, and more.
- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/question_answering.html)
- End-to-end Example: [Question Answering over Notion Database](https://github.com/hwchase17/notion-qa)
Use LangChain for:
**💬 Chatbots**
- **Real-time data augmentation**. Easily connect LLMs to diverse data sources and external/internal systems, drawing from LangChain's vast library of integrations with model providers, tools, vector stores, retrievers, and more.
- **Model interoperability**. Swap models in and out as your engineering team experiments to find the best choice for your application's needs. As the industry frontier evolves, adapt quickly; LangChain's abstractions keep you moving without losing momentum.
- **Rapid prototyping**. Quickly build and iterate on LLM applications with LangChain's modular, component-based architecture. Test different approaches and workflows without rebuilding from scratch, accelerating your development cycle.
- **Production-ready features**. Deploy reliable applications with built-in support for monitoring, evaluation, and debugging through integrations like LangSmith. Scale with confidence using battle-tested patterns and best practices.
- **Vibrant community and ecosystem**. Leverage a rich ecosystem of integrations, templates, and community-contributed components. Benefit from continuous improvements and stay up-to-date with the latest AI developments through an active open-source community.
- **Flexible abstraction layers**. Work at the level of abstraction that suits your needs - from high-level chains for quick starts to low-level components for fine-grained control. LangChain grows with your application's complexity.
- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/chatbots.html)
- End-to-end Example: [Chat-LangChain](https://github.com/hwchase17/chat-langchain)
## LangChain ecosystem
**🤖 Agents**
While the LangChain framework can be used standalone, it also integrates seamlessly with any LangChain product, giving developers a full suite of tools when building LLM applications.
- [Documentation](https://langchain.readthedocs.io/en/latest/modules/agents.html)
- End-to-end Example: [GPT+WolframAlpha](https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain)
To improve your LLM application development, pair LangChain with:
## 📖 Documentation
- [LangGraph](https://docs.langchain.com/oss/python/langgraph/overview): Build agents that can reliably handle complex tasks with LangGraph, our low-level agent orchestration framework. LangGraph offers customizable architecture, long-term memory, and human-in-the-loop workflows and is trusted in production by companies like LinkedIn, Uber, Klarna, and GitLab.
- [Integrations](https://docs.langchain.com/oss/python/integrations/providers/overview): List of LangChain integrations, including chat & embedding models, tools & toolkits, and more
- [LangSmith](https://www.langchain.com/langsmith): Helpful for agent evals and observability. Debug poor-performing LLM app runs, evaluate agent trajectories, gain visibility in production, and improve performance over time.
- [LangSmith Deployment](https://docs.langchain.com/langsmith/deployments): Deploy and scale agents effortlessly with a purpose-built deployment platform for long-running, stateful workflows. Discover, reuse, configure, and share agents across teams and iterate quickly with visual prototyping in [LangSmith Studio](https://docs.langchain.com/langsmith/studio).
- [Deep Agents](https://github.com/langchain-ai/deepagents) *(new!)*: Build agents that can plan, use subagents, and leverage file systems for complex tasks
Please see [here](https://langchain.readthedocs.io/en/latest/?) for full documentation on:
## Additional resources
- Getting started (installation, setting up the environment, simple examples)
- How-To examples (demos, integrations, helper functions)
- Reference (full API docs)
- Resources (high-level explanation of core concepts)
- [API Reference](https://reference.langchain.com/python): Detailed reference on navigating base packages and integrations for LangChain.
- [Contributing Guide](https://docs.langchain.com/oss/python/contributing/overview): Learn how to contribute to LangChain projects and find good first issues.
- [Code of Conduct](https://github.com/langchain-ai/langchain/blob/master/.github/CODE_OF_CONDUCT.md): Our community guidelines and standards for participation.
## 🚀 What can this help with?
There are six main areas that LangChain is designed to help with.
These are, in increasing order of complexity:
**📃 LLMs and Prompts:**
This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.
**🔗 Chains:**
Chains go beyond a single LLM call and involve sequences of calls (whether to an LLM or a different utility). LangChain provides a standard interface for chains, lots of integrations with other tools, and end-to-end chains for common applications.
**📚 Data Augmented Generation:**
Data Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.
**🤖 Agents:**
Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.
**🧠 Memory:**
Memory refers to persisting state between calls of a chain/agent. LangChain provides a standard interface for memory, a collection of memory implementations, and examples of chains/agents that use memory.
**🧐 Evaluation:**
[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.
For more information on these concepts, please see our [full documentation](https://langchain.readthedocs.io/en/latest/).
## 💁 Contributing
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.
For detailed information on how to contribute, see [here](.github/CONTRIBUTING.md).

View File

@@ -1,80 +0,0 @@
# Security Policy
LangChain has a large ecosystem of integrations with various external resources like local and remote file systems, APIs and databases. These integrations allow developers to create versatile applications that combine the power of LLMs with the ability to access, interact with and manipulate external resources.
## Best practices
When building such applications, developers should remember to follow good security practices:
* [**Limit Permissions**](https://en.wikipedia.org/wiki/Principle_of_least_privilege): Scope permissions specifically to the application's need. Granting broad or excessive permissions can introduce significant security vulnerabilities. To avoid such vulnerabilities, consider using read-only credentials, disallowing access to sensitive resources, using sandboxing techniques (such as running inside a container), specifying proxy configurations to control external requests, etc., as appropriate for your application.
* **Anticipate Potential Misuse**: Just as humans can err, so can Large Language Models (LLMs). Always assume that any system access or credentials may be used in any way allowed by the permissions they are assigned. For example, if a pair of database credentials allows deleting data, it's safest to assume that any LLM able to use those credentials may in fact delete data.
* [**Defense in Depth**](https://en.wikipedia.org/wiki/Defense_in_depth_(computing)): No security technique is perfect. Fine-tuning and good chain design can reduce, but not eliminate, the odds that a Large Language Model (LLM) may make a mistake. It's best to combine multiple layered security approaches rather than relying on any single layer of defense to ensure security. For example: use both read-only permissions and sandboxing to ensure that LLMs are only able to access data that is explicitly meant for them to use.
Risks of not doing so include, but are not limited to:
* Data corruption or loss.
* Unauthorized access to confidential information.
* Compromised performance or availability of critical resources.
Example scenarios with mitigation strategies:
* A user may ask an agent with access to the file system to delete files that should not be deleted or read the content of files that contain sensitive information. To mitigate, limit the agent to a specific directory and only allow it to read or write files that are safe to read or write (see the sketch after these examples). Consider further sandboxing the agent by running it in a container.
* A user may ask an agent with write access to an external API to write malicious data to the API, or delete data from that API. To mitigate, give the agent read-only API keys, or limit it to only use endpoints that are already resistant to such misuse.
* A user may ask an agent with access to a database to drop a table or mutate the schema. To mitigate, scope the credentials to only the tables that the agent needs to access and consider issuing READ-ONLY credentials.
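A minimal sketch of the directory-scoping idea from the first scenario above (the sandbox root and helper are hypothetical, not a LangChain API):

```python
from pathlib import Path

ALLOWED_ROOT = Path("/srv/agent-data").resolve()  # hypothetical sandbox directory


def safe_read_text(relative_path: str) -> str:
    """Read a file only if it resolves inside the allowed root."""
    target = (ALLOWED_ROOT / relative_path).resolve()
    if not target.is_relative_to(ALLOWED_ROOT):
        msg = f"access outside {ALLOWED_ROOT} is not allowed"
        raise PermissionError(msg)
    return target.read_text()
```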
If you're building applications that access external resources like file systems, APIs or databases, consider speaking with your company's security team to determine how to best design and secure your applications.
## Reporting OSS Vulnerabilities
LangChain is partnered with [huntr by Protect AI](https://huntr.com/) to provide
a bounty program for our open source projects.
Please report security vulnerabilities associated with the LangChain
open source projects at [huntr](https://huntr.com/bounties/disclose/?target=https%3A%2F%2Fgithub.com%2Flangchain-ai%2Flangchain&validSearch=true).
Before reporting a vulnerability, please review:
1) In-Scope Targets and Out-of-Scope Targets below.
2) The [langchain-ai/langchain](https://docs.langchain.com/oss/python/contributing/code#repository-structure) monorepo structure.
3) The [Best Practices](#best-practices) above to understand what we consider to be a security vulnerability vs. developer responsibility.
### In-Scope Targets
The following packages and repositories are eligible for bug bounties:
* langchain-core
* langchain (see exceptions)
* langchain-community (see exceptions)
* langgraph
* langserve
### Out of Scope Targets
All out of scope targets defined by huntr as well as:
* **langchain-experimental**: This repository is for experimental code and is not
eligible for bug bounties (see [package warning](https://pypi.org/project/langchain-experimental/)); bug reports against it will be marked
as either interesting or a waste of time and published with no bounty attached.
* **tools**: Tools in either `langchain` or `langchain-community` are not eligible for bug
bounties. This includes the following directories
* `libs/langchain/langchain/tools`
* `libs/community/langchain_community/tools`
* Please review the [Best Practices](#best-practices)
for more details, but generally tools interact with the real world. Developers are
expected to understand the security implications of their code and are responsible
for the security of their tools.
* Code documented with security notices. This will be decided on a case-by-case basis, but likely will not be eligible for a bounty as the code is already
documented with guidelines for developers that should be followed for making their
application secure.
* Any LangSmith related repositories or APIs (see [Reporting LangSmith Vulnerabilities](#reporting-langsmith-vulnerabilities)).
## Reporting LangSmith Vulnerabilities
Please report security vulnerabilities associated with LangSmith by email to `security@langchain.dev`.
* LangSmith site: [https://smith.langchain.com](https://smith.langchain.com)
* SDK client: [https://github.com/langchain-ai/langsmith-sdk](https://github.com/langchain-ai/langsmith-sdk)
### Other Security Concerns
For any other security concerns, please contact us at `security@langchain.dev`.

21
docs/Makefile Normal file
View File

@@ -0,0 +1,21 @@
# Minimal makefile for Sphinx documentation
#
# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SPHINXAUTOBUILD ?= sphinx-autobuild
SOURCEDIR = .
BUILDDIR = _build
# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
.PHONY: help Makefile
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
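This mirrors the root Makefile's `docs_build` target; the equivalent manual invocation is:

```bash
cd docs && poetry run make html
```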

BIN docs/_static/ApifyActors.png (new file, 559 KiB)
BIN docs/_static/DataberryDashboard.png (new file, 157 KiB)
BIN docs/_static/HeliconeDashboard.png (new file, 235 KiB)
BIN docs/_static/HeliconeKeys.png (new file, 148 KiB)
BIN docs/_static/MetalDash.png (new file, 3.5 MiB)

21
docs/_static/css/custom.css vendored Normal file
View File

@@ -0,0 +1,21 @@
pre {
white-space: break-spaces;
}
@media (min-width: 1200px) {
.container,
.container-lg,
.container-md,
.container-sm,
.container-xl {
max-width: 2560px !important;
}
}
#my-component-root *, #headlessui-portal-root * {
z-index: 10000;
}
.content-container p {
margin: revert;
}

56
docs/_static/js/mendablesearch.js vendored Normal file
View File

@@ -0,0 +1,56 @@
document.addEventListener('DOMContentLoaded', () => {
// Load the external dependencies
function loadScript(src, onLoadCallback) {
const script = document.createElement('script');
script.src = src;
script.onload = onLoadCallback;
document.head.appendChild(script);
}
function createRootElement() {
const rootElement = document.createElement('div');
rootElement.id = 'my-component-root';
document.body.appendChild(rootElement);
return rootElement;
}
function initializeMendable() {
const rootElement = createRootElement();
const { MendableFloatingButton } = Mendable;
const iconSpan1 = React.createElement('span', {
}, '🦜');
const iconSpan2 = React.createElement('span', {
}, '🔗');
const icon = React.createElement('p', {
style: { color: '#ffffff', fontSize: '22px',width: '48px', height: '48px', margin: '0px', padding: '0px', display: 'flex', alignItems: 'center', justifyContent: 'center', textAlign: 'center' },
}, [iconSpan1, iconSpan2]);
const mendableFloatingButton = React.createElement(
MendableFloatingButton,
{
style: { darkMode: false, accentColor: '#010810' },
floatingButtonStyle: { color: '#ffffff', backgroundColor: '#010810' },
anon_key: '82842b36-3ea6-49b2-9fb8-52cfc4bde6bf', // Mendable Search Public ANON key, ok to be public
messageSettings: {
openSourcesInNewTab: false,
prettySources: true // Prettify the sources displayed now
},
icon: icon,
}
);
ReactDOM.render(mendableFloatingButton, rootElement);
}
loadScript('https://unpkg.com/react@17/umd/react.production.min.js', () => {
loadScript('https://unpkg.com/react-dom@17/umd/react-dom.production.min.js', () => {
loadScript('https://unpkg.com/@mendable/search@0.0.102/dist/umd/mendable.min.js', initializeMendable);
});
});
});

View File

@@ -0,0 +1,256 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "920a3c1a",
"metadata": {},
"source": [
"# Model Comparison\n",
"\n",
"Constructing your language model application will likely involved choosing between many different options of prompts, models, and even chains to use. When doing so, you will want to compare these different options on different inputs in an easy, flexible, and intuitive way. \n",
"\n",
"LangChain provides the concept of a ModelLaboratory to test out and try different models."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "ab9e95ad",
"metadata": {},
"outputs": [],
"source": [
"from langchain import LLMChain, OpenAI, Cohere, HuggingFaceHub, PromptTemplate\n",
"from langchain.model_laboratory import ModelLaboratory"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "32cb94e6",
"metadata": {},
"outputs": [],
"source": [
"llms = [\n",
" OpenAI(temperature=0), \n",
" Cohere(model=\"command-xlarge-20221108\", max_tokens=20, temperature=0), \n",
" HuggingFaceHub(repo_id=\"google/flan-t5-xl\", model_kwargs={\"temperature\":1})\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "14cde09d",
"metadata": {},
"outputs": [],
"source": [
"model_lab = ModelLaboratory.from_llms(llms)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "f186c741",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What color is a flamingo?\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[36;1m\u001b[1;3m\n",
"\n",
"Flamingos are pink.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[33;1m\u001b[1;3m\n",
"\n",
"Pink\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[38;5;200m\u001b[1;3mpink\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What color is a flamingo?\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "248b652a",
"metadata": {},
"outputs": [],
"source": [
"prompt = PromptTemplate(template=\"What is the capital of {state}?\", input_variables=[\"state\"])\n",
"model_lab_with_prompt = ModelLaboratory.from_llms(llms, prompt=prompt)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "f64377ac",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"New York\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\u001b[36;1m\u001b[1;3m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 20, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\u001b[33;1m\u001b[1;3m\n",
"\n",
"The capital of New York is Albany.\u001b[0m\n",
"\n",
"\u001b[1mHuggingFaceHub\u001b[0m\n",
"Params: {'repo_id': 'google/flan-t5-xl', 'temperature': 1}\n",
"\u001b[38;5;200m\u001b[1;3mst john s\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab_with_prompt.compare(\"New York\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "54336dbf",
"metadata": {},
"outputs": [],
"source": [
"from langchain import SelfAskWithSearchChain, SerpAPIWrapper\n",
"\n",
"open_ai_llm = OpenAI(temperature=0)\n",
"search = SerpAPIWrapper()\n",
"self_ask_with_search_openai = SelfAskWithSearchChain(llm=open_ai_llm, search_chain=search, verbose=True)\n",
"\n",
"cohere_llm = Cohere(temperature=0, model=\"command-xlarge-20221108\")\n",
"search = SerpAPIWrapper()\n",
"self_ask_with_search_cohere = SelfAskWithSearchChain(llm=cohere_llm, search_chain=search, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "6a50a9f1",
"metadata": {},
"outputs": [],
"source": [
"chains = [self_ask_with_search_openai, self_ask_with_search_cohere]\n",
"names = [str(open_ai_llm), str(cohere_llm)]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "d3549e99",
"metadata": {},
"outputs": [],
"source": [
"model_lab = ModelLaboratory(chains, names=names)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "362f7f57",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[1mInput:\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"\n",
"\u001b[1mOpenAI\u001b[0m\n",
"Params: {'model': 'text-davinci-002', 'temperature': 0.0, 'max_tokens': 256, 'top_p': 1, 'frequency_penalty': 0, 'presence_penalty': 0, 'n': 1, 'best_of': 1}\n",
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"Are follow up questions needed here:\u001b[32;1m\u001b[1;3m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mCarlos Alcaraz.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"Follow up: Where is Carlos Alcaraz from?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mEl Palmar, Spain.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"So the final answer is: El Palmar, Spain\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[36;1m\u001b[1;3m\n",
"So the final answer is: El Palmar, Spain\u001b[0m\n",
"\n",
"\u001b[1mCohere\u001b[0m\n",
"Params: {'model': 'command-xlarge-20221108', 'max_tokens': 256, 'temperature': 0.0, 'k': 0, 'p': 1, 'frequency_penalty': 0, 'presence_penalty': 0}\n",
"\n",
"\n",
"\u001b[1m> Entering new chain...\u001b[0m\n",
"What is the hometown of the reigning men's U.S. Open champion?\n",
"Are follow up questions needed here:\u001b[32;1m\u001b[1;3m Yes.\n",
"Follow up: Who is the reigning men's U.S. Open champion?\u001b[0m\n",
"Intermediate answer: \u001b[33;1m\u001b[1;3mCarlos Alcaraz.\u001b[0m\u001b[32;1m\u001b[1;3m\n",
"So the final answer is:\n",
"\n",
"Carlos Alcaraz\u001b[0m\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001b[33;1m\u001b[1;3m\n",
"So the final answer is:\n",
"\n",
"Carlos Alcaraz\u001b[0m\n",
"\n"
]
}
],
"source": [
"model_lab.compare(\"What is the hometown of the reigning men's U.S. Open champion?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "94159131",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}


@@ -0,0 +1,57 @@
# Tracing
By enabling tracing in your LangChain runs, you'll be able to more effectively visualize, step through, and debug your chains and agents.
First, you should install tracing and set up your environment properly.
You can use either a locally hosted version (which uses Docker) or a cloud-hosted version (currently in closed alpha).
If you're interested in using the hosted platform, please fill out the form [here](https://forms.gle/tRCEMSeopZf6TE3b6).
- [Locally Hosted Setup](../tracing/local_installation.md)
- [Cloud Hosted Setup](../tracing/hosted_installation.md)
## Tracing Walkthrough
When you first access the UI, you should see a page with your tracing sessions.
An initial session, "default", should already be created for you.
A session is just a way to group traces together.
If you click on a session, it will take you to a page with no recorded traces that says "No Runs."
You can create a new session with the new session form.
![](../tracing/homepage.png)
If we click on the `default` session, we can see that to start we have no traces stored.
![](../tracing/default_empty.png)
If we now start running chains and agents with tracing enabled, we will see data show up here.
To do so, we can run [this notebook](../tracing/agent_with_tracing.ipynb) as an example.
After running it, we will see an initial trace show up.
![](../tracing/first_trace.png)
From here we can explore the trace at a high level by clicking on the arrow to show nested runs.
We can keep on clicking further and further down to explore deeper and deeper.
![](../tracing/explore.png)
We can also click on the "Explore" button of the top level run to dive even deeper.
Here, we can see the inputs and outputs in full, as well as all the nested traces.
![](../tracing/explore_trace.png)
We can keep on exploring each of these nested traces in more detail.
For example, here is the lowest level trace with the exact inputs/outputs to the LLM.
![](../tracing/explore_llm.png)
## Changing Sessions
1. To initially record traces to a session other than `"default"`, you can set the `LANGCHAIN_SESSION` environment variable to the name of the session you want to record to:
```python
import os
os.environ["LANGCHAIN_TRACING"] = "true"
os.environ["LANGCHAIN_SESSION"] = "my_session" # Make sure this session actually exists. You can create a new session in the UI.
```
2. To switch sessions mid-script or mid-notebook, do NOT set the `LANGCHAIN_SESSION` environment variable. Instead, call `langchain.set_tracing_callback_manager(session_name="my_session")`.
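As a minimal sketch, switching sessions mid-notebook might look like this (the session name is just an example, and the session must already exist in the UI):
```python
import langchain

# Record all subsequent runs to a different, pre-existing session.
langchain.set_tracing_callback_manager(session_name="my_session")
```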


@@ -0,0 +1,90 @@
# YouTube
This is a collection of `LangChain` videos on `YouTube`.
### ⛓️[Official LangChain YouTube channel](https://www.youtube.com/@LangChain)⛓️
### Introduction to LangChain with Harrison Chase, creator of LangChain
- [Building the Future with LLMs, `LangChain`, & `Pinecone`](https://youtu.be/nMniwlGyX-c) by [Pinecone](https://www.youtube.com/@pinecone-io)
- [LangChain and Weaviate with Harrison Chase and Bob van Luijt - Weaviate Podcast #36](https://youtu.be/lhby7Ql7hbk) by [Weaviate • Vector Database](https://www.youtube.com/@Weaviate)
- [LangChain Demo + Q&A with Harrison Chase](https://youtu.be/zaYTXQFR0_s?t=788) by [Full Stack Deep Learning](https://www.youtube.com/@FullStackDeepLearning)
- [LangChain Agents: Build Personal Assistants For Your Data (Q&A with Harrison Chase and Mayo Oshin)](https://youtu.be/gVkF8cwfBLI) by [Chat with data](https://www.youtube.com/@chatwithdata)
- ⛓️ [LangChain "Agents in Production" Webinar](https://youtu.be/k8GNCCs16F4) by [LangChain](https://www.youtube.com/@LangChain)
## Videos (sorted by views)
- [Building AI LLM Apps with LangChain (and more?) - LIVE STREAM](https://www.youtube.com/live/M-2Cj_2fzWI?feature=share) by [Nicholas Renotte](https://www.youtube.com/@NicholasRenotte)
- [First look - `ChatGPT` + `WolframAlpha` (`GPT-3.5` and Wolfram|Alpha via LangChain by James Weaver)](https://youtu.be/wYGbY811oMo) by [Dr Alan D. Thompson](https://www.youtube.com/@DrAlanDThompson)
- [LangChain explained - The hottest new Python framework](https://youtu.be/RoR4XJw8wIc) by [AssemblyAI](https://www.youtube.com/@AssemblyAI)
- [Chatbot with INFINITE MEMORY using `OpenAI` & `Pinecone` - `GPT-3`, `Embeddings`, `ADA`, `Vector DB`, `Semantic`](https://youtu.be/2xNzB7xq8nk) by [David Shapiro ~ AI](https://www.youtube.com/@DavidShapiroAutomator)
- [LangChain for LLMs is... basically just an Ansible playbook](https://youtu.be/X51N9C-OhlE) by [David Shapiro ~ AI](https://www.youtube.com/@DavidShapiroAutomator)
- [Build your own LLM Apps with LangChain & `GPT-Index`](https://youtu.be/-75p09zFUJY) by [1littlecoder](https://www.youtube.com/@1littlecoder)
- [`BabyAGI` - New System of Autonomous AI Agents with LangChain](https://youtu.be/lg3kJvf1kXo) by [1littlecoder](https://www.youtube.com/@1littlecoder)
- [Run `BabyAGI` with Langchain Agents (with Python Code)](https://youtu.be/WosPGHPObx8) by [1littlecoder](https://www.youtube.com/@1littlecoder)
- [How to Use Langchain With `Zapier` | Write and Send Email with GPT-3 | OpenAI API Tutorial](https://youtu.be/p9v2-xEa9A0) by [StarMorph AI](https://www.youtube.com/@starmorph)
- [Use Your Locally Stored Files To Get Response From GPT - `OpenAI` | Langchain | Python](https://youtu.be/NC1Ni9KS-rk) by [Shweta Lodha](https://www.youtube.com/@shweta-lodha)
- [`Langchain JS` | How to Use GPT-3, GPT-4 to Reference your own Data | `OpenAI Embeddings` Intro](https://youtu.be/veV2I-NEjaM) by [StarMorph AI](https://www.youtube.com/@starmorph)
- [The easiest way to work with large language models | Learn LangChain in 10min](https://youtu.be/kmbS6FDQh7c) by [Sophia Yang](https://www.youtube.com/@SophiaYangDS)
- [4 Autonomous AI Agents: “Westworld” simulation `BabyAGI`, `AutoGPT`, `Camel`, `LangChain`](https://youtu.be/yWbnH6inT_U) by [Sophia Yang](https://www.youtube.com/@SophiaYangDS)
- [AI CAN SEARCH THE INTERNET? Langchain Agents + OpenAI ChatGPT](https://youtu.be/J-GL0htqda8) by [tylerwhatsgood](https://www.youtube.com/@tylerwhatsgood)
- [Query Your Data with GPT-4 | Embeddings, Vector Databases | Langchain JS Knowledgebase](https://youtu.be/jRnUPUTkZmU) by [StarMorph AI](https://www.youtube.com/@starmorph)
- [`Weaviate` + LangChain for LLM apps presented by Erika Cardenas](https://youtu.be/7AGj4Td5Lgw) by [`Weaviate` • Vector Database](https://www.youtube.com/@Weaviate)
- [Langchain Overview — How to Use Langchain & `ChatGPT`](https://youtu.be/oYVYIq0lOtI) by [Python In Office](https://www.youtube.com/@pythoninoffice6568)
- [Custom langchain Agent & Tools with memory. Turn any `Python function` into langchain tool with Gpt 3](https://youtu.be/NIG8lXk0ULg) by [echohive](https://www.youtube.com/@echohive)
- [LangChain: Run Language Models Locally - `Hugging Face Models`](https://youtu.be/Xxxuw4_iCzw) by [Prompt Engineering](https://www.youtube.com/@engineerprompt)
- [`ChatGPT` with any `YouTube` video using langchain and `chromadb`](https://youtu.be/TQZfB2bzVwU) by [echohive](https://www.youtube.com/@echohive)
- [How to Talk to a `PDF` using LangChain and `ChatGPT`](https://youtu.be/v2i1YDtrIwk) by [Automata Learning Lab](https://www.youtube.com/@automatalearninglab)
- [Langchain Document Loaders Part 1: Unstructured Files](https://youtu.be/O5C0wfsen98) by [Merk](https://www.youtube.com/@merksworld)
- [LangChain - Prompt Templates (what all the best prompt engineers use)](https://youtu.be/1aRu8b0XNOQ) by [Nick Daigler](https://www.youtube.com/@nick_daigs)
- [LangChain. Crear aplicaciones Python impulsadas por GPT](https://youtu.be/DkW_rDndts8) by [Jesús Conde](https://www.youtube.com/@0utKast)
- [Easiest Way to Use GPT In Your Products | LangChain Basics Tutorial](https://youtu.be/fLy0VenZyGc) by [Rachel Woods](https://www.youtube.com/@therachelwoods)
- [`BabyAGI` + `GPT-4` Langchain Agent with Internet Access](https://youtu.be/wx1z_hs5P6E) by [tylerwhatsgood](https://www.youtube.com/@tylerwhatsgood)
- [Learning LLM Agents. How does it actually work? LangChain, AutoGPT & OpenAI](https://youtu.be/mb_YAABSplk) by [Arnoldas Kemeklis](https://www.youtube.com/@processusAI)
- [Get Started with LangChain in `Node.js`](https://youtu.be/Wxx1KUWJFv4) by [Developers Digest](https://www.youtube.com/@DevelopersDigest)
- [LangChain + `OpenAI` tutorial: Building a Q&A system w/ own text data](https://youtu.be/DYOU_Z0hAwo) by [Samuel Chan](https://www.youtube.com/@SamuelChan)
- [Langchain + `Zapier` Agent](https://youtu.be/yribLAb-pxA) by [Merk](https://www.youtube.com/@merksworld)
- [Connecting the Internet with `ChatGPT` (LLMs) using Langchain And Answers Your Questions](https://youtu.be/9Y0TBC63yZg) by [Kamalraj M M](https://www.youtube.com/@insightbuilder)
- [Build More Powerful LLM Applications for Business with LangChain (Beginners Guide)](https://youtu.be/sp3-WLKEcBg) by [No Code Blackbox](https://www.youtube.com/@nocodeblackbox)
- ⛓️ [LangFlow LLM Agent Demo for 🦜🔗LangChain](https://youtu.be/zJxDHaWt-6o) by [Cobus Greyling](https://www.youtube.com/@CobusGreylingZA)
- ⛓️ [Chatbot Factory: Streamline Python Chatbot Creation with LLMs and Langchain](https://youtu.be/eYer3uzrcuM) by [Finxter](https://www.youtube.com/@CobusGreylingZA)
- ⛓️ [LangChain Tutorial - ChatGPT mit eigenen Daten](https://youtu.be/0XDLyY90E2c) by [Coding Crashkurse](https://www.youtube.com/@codingcrashkurse6429)
- ⛓️ [Chat with a `CSV` | LangChain Agents Tutorial (Beginners)](https://youtu.be/tjeti5vXWOU) by [GoDataProf](https://www.youtube.com/@godataprof)
- ⛓️ [Introdução ao Langchain - #Cortes - Live DataHackers](https://youtu.be/fw8y5VRei5Y) by [Prof. João Gabriel Lima](https://www.youtube.com/@profjoaogabriellima)
- ⛓️ [LangChain: Level up `ChatGPT` !? | LangChain Tutorial Part 1](https://youtu.be/vxUGx8aZpDE) by [Code Affinity](https://www.youtube.com/@codeaffinitydev)
- ⛓️ [KI schreibt krasses Youtube Skript 😲😳 | LangChain Tutorial Deutsch](https://youtu.be/QpTiXyK1jus) by [SimpleKI](https://www.youtube.com/@simpleki)
- ⛓️ [Chat with Audio: Langchain, `Chroma DB`, OpenAI, and `Assembly AI`](https://youtu.be/Kjy7cx1r75g) by [AI Anytime](https://www.youtube.com/@AIAnytime)
- ⛓️ [QA over documents with Auto vector index selection with Langchain router chains](https://youtu.be/9G05qybShv8) by [echohive](https://www.youtube.com/@echohive)
- ⛓️ [Build your own custom LLM application with `Bubble.io` & Langchain (No Code & Beginner friendly)](https://youtu.be/O7NhQGu1m6c) by [No Code Blackbox](https://www.youtube.com/@nocodeblackbox)
- ⛓️ [Simple App to Question Your Docs: Leveraging `Streamlit`, `Hugging Face Spaces`, LangChain, and `Claude`!](https://youtu.be/X4YbNECRr7o) by [Chris Alexiuk](https://www.youtube.com/@chrisalexiuk)
- ⛓️ [LANGCHAIN AI- `ConstitutionalChainAI` + Databutton AI ASSISTANT Web App](https://youtu.be/5zIU6_rdJCU) by [Avra](https://www.youtube.com/@Avra_b)
- ⛓️ [LANGCHAIN AI AUTONOMOUS AGENT WEB APP - 👶 `BABY AGI` 🤖 with EMAIL AUTOMATION using `DATABUTTON`](https://youtu.be/cvAwOGfeHgw) by [Avra](https://www.youtube.com/@Avra_b)
- ⛓️ [The Future of Data Analysis: Using A.I. Models in Data Analysis (LangChain)](https://youtu.be/v_LIcVyg5dk) by [Absent Data](https://www.youtube.com/@absentdata)
- ⛓️ [Memory in LangChain | Deep dive (python)](https://youtu.be/70lqvTFh_Yg) by [Eden Marco](https://www.youtube.com/@EdenMarco)
- ⛓️ [9 LangChain UseCases | Beginner's Guide | 2023](https://youtu.be/zS8_qosHNMw) by [Data Science Basics](https://www.youtube.com/@datasciencebasics)
- ⛓️ [Use Large Language Models in Jupyter Notebook | LangChain | Agents & Indexes](https://youtu.be/JSe11L1a_QQ) by [Abhinaw Tiwari](https://www.youtube.com/@AbhinawTiwariAT)
- ⛓️ [How to Talk to Your Langchain Agent | `11 Labs` + `Whisper`](https://youtu.be/N4k459Zw2PU) by [VRSEN](https://www.youtube.com/@vrsen)
- ⛓️ [LangChain Deep Dive: 5 FUN AI App Ideas To Build Quickly and Easily](https://youtu.be/mPYEPzLkeks) by [James NoCode](https://www.youtube.com/@jamesnocode)
- ⛓️ [BEST OPEN Alternative to OPENAI's EMBEDDINGs for Retrieval QA: LangChain](https://youtu.be/ogEalPMUCSY) by [Prompt Engineering](https://www.youtube.com/@engineerprompt)
- ⛓️ [LangChain 101: Models](https://youtu.be/T6c_XsyaNSQ) by [Mckay Wrigley](https://www.youtube.com/@realmckaywrigley)
- ⛓️ [LangChain with JavaScript Tutorial #1 | Setup & Using LLMs](https://youtu.be/W3AoeMrg27o) by [Leon van Zyl](https://www.youtube.com/@leonvanzyl)
- ⛓️ [LangChain Overview & Tutorial for Beginners: Build Powerful AI Apps Quickly & Easily (ZERO CODE)](https://youtu.be/iI84yym473Q) by [James NoCode](https://www.youtube.com/@jamesnocode)
- ⛓️ [LangChain In Action: Real-World Use Case With Step-by-Step Tutorial](https://youtu.be/UO699Szp82M) by [Rabbitmetrics](https://www.youtube.com/@rabbitmetrics)
- ⛓️ [Summarizing and Querying Multiple Papers with LangChain](https://youtu.be/p_MQRWH5Y6k) by [Automata Learning Lab](https://www.youtube.com/@automatalearninglab)
- ⛓️ [Using Langchain (and `Replit`) through `Tana`, ask `Google`/`Wikipedia`/`Wolfram Alpha` to fill out a table](https://youtu.be/Webau9lEzoI) by [Stian Håklev](https://www.youtube.com/@StianHaklev)
- ⛓️ [Langchain PDF App (GUI) | Create a ChatGPT For Your `PDF` in Python](https://youtu.be/wUAUdEw5oxM) by [Alejandro AO - Software & Ai](https://www.youtube.com/@alejandro_ao)
- ⛓️ [Auto-GPT with LangChain 🔥 | Create Your Own Personal AI Assistant](https://youtu.be/imDfPmMKEjM) by [Data Science Basics](https://www.youtube.com/@datasciencebasics)
- ⛓️ [Create Your OWN Slack AI Assistant with Python & LangChain](https://youtu.be/3jFXRNn2Bu8) by [Dave Ebbelaar](https://www.youtube.com/@daveebbelaar)
- ⛓️ [How to Create LOCAL Chatbots with GPT4All and LangChain [Full Guide]](https://youtu.be/4p1Fojur8Zw) by [Liam Ottley](https://www.youtube.com/@LiamOttley)
- ⛓️ [Build a `Multilingual PDF` Search App with LangChain, `Cohere` and `Bubble`](https://youtu.be/hOrtuumOrv8) by [Menlo Park Lab](https://www.youtube.com/@menloparklab)
- ⛓️ [Building a LangChain Agent (code-free!) Using `Bubble` and `Flowise`](https://youtu.be/jDJIIVWTZDE) by [Menlo Park Lab](https://www.youtube.com/@menloparklab)
- ⛓️ [Build a LangChain-based Semantic PDF Search App with No-Code Tools Bubble and Flowise](https://youtu.be/s33v5cIeqA4) by [Menlo Park Lab](https://www.youtube.com/@menloparklab)
- ⛓️ [LangChain Memory Tutorial | Building a ChatGPT Clone in Python](https://youtu.be/Cwq91cj2Pnc) by [Alejandro AO - Software & Ai](https://www.youtube.com/@alejandro_ao)
- ⛓️ [ChatGPT For Your DATA | Chat with Multiple Documents Using LangChain](https://youtu.be/TeDgIDqQmzs) by [Data Science Basics](https://www.youtube.com/@datasciencebasics)
- ⛓️ [`Llama Index`: Chat with Documentation using URL Loader](https://youtu.be/XJRoDEctAwA) by [Merk](https://www.youtube.com/@merksworld)
- ⛓️ [Using OpenAI, LangChain, and `Gradio` to Build Custom GenAI Applications](https://youtu.be/1MsmqMg3yUc) by [David Hundley](https://www.youtube.com/@dkhundley)
---------------------
⛓ icon marks a new video [last update 2023-05-15]

docs/conf.py Normal file

@@ -0,0 +1,113 @@
"""Configuration file for the Sphinx documentation builder."""
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
# -- Path setup --------------------------------------------------------------
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
#
# import os
# import sys
# sys.path.insert(0, os.path.abspath('.'))
import toml
with open("../pyproject.toml") as f:
    data = toml.load(f)
# -- Project information -----------------------------------------------------
project = "🦜🔗 LangChain"
copyright = "2023, Harrison Chase"
author = "Harrison Chase"
version = data["tool"]["poetry"]["version"]
release = version
html_title = project + " " + version
html_last_updated_fmt = "%b %d, %Y"
# -- General configuration ---------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.autodoc.typehints",
    "sphinx.ext.autosummary",
    "sphinx.ext.napoleon",
    "sphinx.ext.viewcode",
    "sphinxcontrib.autodoc_pydantic",
    "myst_nb",
    "sphinx_copybutton",
    "sphinx_panels",
    "IPython.sphinxext.ipython_console_highlighting",
    "sphinx_tabs.tabs",
]
source_suffix = [".ipynb", ".html", ".md", ".rst"]
autodoc_pydantic_model_show_json = False
autodoc_pydantic_field_list_validators = False
autodoc_pydantic_config_members = False
autodoc_pydantic_model_show_config_summary = False
autodoc_pydantic_model_show_validator_members = False
autodoc_pydantic_model_show_field_summary = False
autodoc_pydantic_model_members = False
autodoc_pydantic_model_undoc_members = False
# autodoc_typehints = "signature"
# autodoc_typehints = "description"
# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = "sphinx_book_theme"
html_theme_options = {
    "path_to_docs": "docs",
    "repository_url": "https://github.com/hwchase17/langchain",
    "use_repository_button": True,
}
html_context = {
    "display_github": True,  # Integrate GitHub
    "github_user": "hwchase17",  # Username
    "github_repo": "langchain",  # Repo name
    "github_version": "master",  # Version
    "conf_py_path": "/docs/",  # Path in the checkout to the docs root
}
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ["_static"]
# These paths are either relative to html_static_path
# or fully qualified paths (eg. https://...)
html_css_files = [
    "css/custom.css",
]
html_js_files = [
    "js/mendablesearch.js",
]
nb_execution_mode = "off"
myst_enable_extensions = ["colon_fence"]

docs/dependents.md Normal file

@@ -0,0 +1,192 @@
# Dependents
Dependents stats for `hwchase17/langchain`
[![](https://img.shields.io/static/v1?label=Used%20by&message=5152&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(public)&message=172&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(private)&message=4980&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[![](https://img.shields.io/static/v1?label=Used%20by%20(stars)&message=17239&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)
[update: 2023-05-17; only dependent repositories with Stars > 100]
| Repository | Stars |
| :-------- | -----: |
|[openai/openai-cookbook](https://github.com/openai/openai-cookbook) | 35401 |
|[LAION-AI/Open-Assistant](https://github.com/LAION-AI/Open-Assistant) | 32861 |
|[microsoft/TaskMatrix](https://github.com/microsoft/TaskMatrix) | 32766 |
|[hpcaitech/ColossalAI](https://github.com/hpcaitech/ColossalAI) | 29560 |
|[reworkd/AgentGPT](https://github.com/reworkd/AgentGPT) | 22315 |
|[imartinez/privateGPT](https://github.com/imartinez/privateGPT) | 17474 |
|[openai/chatgpt-retrieval-plugin](https://github.com/openai/chatgpt-retrieval-plugin) | 16923 |
|[mindsdb/mindsdb](https://github.com/mindsdb/mindsdb) | 16112 |
|[jerryjliu/llama_index](https://github.com/jerryjliu/llama_index) | 15407 |
|[mlflow/mlflow](https://github.com/mlflow/mlflow) | 14345 |
|[GaiZhenbiao/ChuanhuChatGPT](https://github.com/GaiZhenbiao/ChuanhuChatGPT) | 10372 |
|[databrickslabs/dolly](https://github.com/databrickslabs/dolly) | 9919 |
|[AIGC-Audio/AudioGPT](https://github.com/AIGC-Audio/AudioGPT) | 8177 |
|[logspace-ai/langflow](https://github.com/logspace-ai/langflow) | 6807 |
|[imClumsyPanda/langchain-ChatGLM](https://github.com/imClumsyPanda/langchain-ChatGLM) | 6087 |
|[arc53/DocsGPT](https://github.com/arc53/DocsGPT) | 5292 |
|[e2b-dev/e2b](https://github.com/e2b-dev/e2b) | 4622 |
|[nsarrazin/serge](https://github.com/nsarrazin/serge) | 4076 |
|[madawei2699/myGPTReader](https://github.com/madawei2699/myGPTReader) | 3952 |
|[zauberzeug/nicegui](https://github.com/zauberzeug/nicegui) | 3952 |
|[go-skynet/LocalAI](https://github.com/go-skynet/LocalAI) | 3762 |
|[GreyDGL/PentestGPT](https://github.com/GreyDGL/PentestGPT) | 3388 |
|[mmabrouk/chatgpt-wrapper](https://github.com/mmabrouk/chatgpt-wrapper) | 3243 |
|[zilliztech/GPTCache](https://github.com/zilliztech/GPTCache) | 3189 |
|[wenda-LLM/wenda](https://github.com/wenda-LLM/wenda) | 3050 |
|[marqo-ai/marqo](https://github.com/marqo-ai/marqo) | 2930 |
|[gkamradt/langchain-tutorials](https://github.com/gkamradt/langchain-tutorials) | 2710 |
|[PrefectHQ/marvin](https://github.com/PrefectHQ/marvin) | 2545 |
|[project-baize/baize-chatbot](https://github.com/project-baize/baize-chatbot) | 2479 |
|[whitead/paper-qa](https://github.com/whitead/paper-qa) | 2399 |
|[langgenius/dify](https://github.com/langgenius/dify) | 2344 |
|[GerevAI/gerev](https://github.com/GerevAI/gerev) | 2283 |
|[hwchase17/chat-langchain](https://github.com/hwchase17/chat-langchain) | 2266 |
|[guangzhengli/ChatFiles](https://github.com/guangzhengli/ChatFiles) | 1903 |
|[Azure-Samples/azure-search-openai-demo](https://github.com/Azure-Samples/azure-search-openai-demo) | 1884 |
|[OpenBMB/BMTools](https://github.com/OpenBMB/BMTools) | 1860 |
|[Farama-Foundation/PettingZoo](https://github.com/Farama-Foundation/PettingZoo) | 1813 |
|[OpenGVLab/Ask-Anything](https://github.com/OpenGVLab/Ask-Anything) | 1571 |
|[IntelligenzaArtificiale/Free-Auto-GPT](https://github.com/IntelligenzaArtificiale/Free-Auto-GPT) | 1480 |
|[hwchase17/notion-qa](https://github.com/hwchase17/notion-qa) | 1464 |
|[NVIDIA/NeMo-Guardrails](https://github.com/NVIDIA/NeMo-Guardrails) | 1419 |
|[Unstructured-IO/unstructured](https://github.com/Unstructured-IO/unstructured) | 1410 |
|[Kav-K/GPTDiscord](https://github.com/Kav-K/GPTDiscord) | 1363 |
|[paulpierre/RasaGPT](https://github.com/paulpierre/RasaGPT) | 1344 |
|[StanGirard/quivr](https://github.com/StanGirard/quivr) | 1330 |
|[lunasec-io/lunasec](https://github.com/lunasec-io/lunasec) | 1318 |
|[vocodedev/vocode-python](https://github.com/vocodedev/vocode-python) | 1286 |
|[agiresearch/OpenAGI](https://github.com/agiresearch/OpenAGI) | 1156 |
|[h2oai/h2ogpt](https://github.com/h2oai/h2ogpt) | 1141 |
|[jina-ai/thinkgpt](https://github.com/jina-ai/thinkgpt) | 1106 |
|[yanqiangmiffy/Chinese-LangChain](https://github.com/yanqiangmiffy/Chinese-LangChain) | 1072 |
|[ttengwang/Caption-Anything](https://github.com/ttengwang/Caption-Anything) | 1064 |
|[jina-ai/dev-gpt](https://github.com/jina-ai/dev-gpt) | 1057 |
|[juncongmoo/chatllama](https://github.com/juncongmoo/chatllama) | 1003 |
|[greshake/llm-security](https://github.com/greshake/llm-security) | 1002 |
|[visual-openllm/visual-openllm](https://github.com/visual-openllm/visual-openllm) | 957 |
|[richardyc/Chrome-GPT](https://github.com/richardyc/Chrome-GPT) | 918 |
|[irgolic/AutoPR](https://github.com/irgolic/AutoPR) | 886 |
|[mmz-001/knowledge_gpt](https://github.com/mmz-001/knowledge_gpt) | 867 |
|[thomas-yanxin/LangChain-ChatGLM-Webui](https://github.com/thomas-yanxin/LangChain-ChatGLM-Webui) | 850 |
|[microsoft/X-Decoder](https://github.com/microsoft/X-Decoder) | 837 |
|[peterw/Chat-with-Github-Repo](https://github.com/peterw/Chat-with-Github-Repo) | 826 |
|[cirediatpl/FigmaChain](https://github.com/cirediatpl/FigmaChain) | 782 |
|[hashintel/hash](https://github.com/hashintel/hash) | 778 |
|[seanpixel/Teenage-AGI](https://github.com/seanpixel/Teenage-AGI) | 773 |
|[jina-ai/langchain-serve](https://github.com/jina-ai/langchain-serve) | 738 |
|[corca-ai/EVAL](https://github.com/corca-ai/EVAL) | 737 |
|[ai-sidekick/sidekick](https://github.com/ai-sidekick/sidekick) | 717 |
|[rlancemartin/auto-evaluator](https://github.com/rlancemartin/auto-evaluator) | 703 |
|[poe-platform/api-bot-tutorial](https://github.com/poe-platform/api-bot-tutorial) | 689 |
|[SamurAIGPT/Camel-AutoGPT](https://github.com/SamurAIGPT/Camel-AutoGPT) | 666 |
|[eyurtsev/kor](https://github.com/eyurtsev/kor) | 608 |
|[run-llama/llama-lab](https://github.com/run-llama/llama-lab) | 559 |
|[namuan/dr-doc-search](https://github.com/namuan/dr-doc-search) | 544 |
|[pieroit/cheshire-cat](https://github.com/pieroit/cheshire-cat) | 520 |
|[griptape-ai/griptape](https://github.com/griptape-ai/griptape) | 514 |
|[getmetal/motorhead](https://github.com/getmetal/motorhead) | 481 |
|[hwchase17/chat-your-data](https://github.com/hwchase17/chat-your-data) | 462 |
|[langchain-ai/langchain-aiplugin](https://github.com/langchain-ai/langchain-aiplugin) | 452 |
|[jina-ai/agentchain](https://github.com/jina-ai/agentchain) | 439 |
|[SamurAIGPT/ChatGPT-Developer-Plugins](https://github.com/SamurAIGPT/ChatGPT-Developer-Plugins) | 437 |
|[alexanderatallah/window.ai](https://github.com/alexanderatallah/window.ai) | 433 |
|[michaelthwan/searchGPT](https://github.com/michaelthwan/searchGPT) | 427 |
|[mpaepper/content-chatbot](https://github.com/mpaepper/content-chatbot) | 425 |
|[mckaywrigley/repo-chat](https://github.com/mckaywrigley/repo-chat) | 422 |
|[whyiyhw/chatgpt-wechat](https://github.com/whyiyhw/chatgpt-wechat) | 421 |
|[freddyaboulton/gradio-tools](https://github.com/freddyaboulton/gradio-tools) | 407 |
|[jonra1993/fastapi-alembic-sqlmodel-async](https://github.com/jonra1993/fastapi-alembic-sqlmodel-async) | 395 |
|[yeagerai/yeagerai-agent](https://github.com/yeagerai/yeagerai-agent) | 383 |
|[akshata29/chatpdf](https://github.com/akshata29/chatpdf) | 374 |
|[OpenGVLab/InternGPT](https://github.com/OpenGVLab/InternGPT) | 368 |
|[ruoccofabrizio/azure-open-ai-embeddings-qna](https://github.com/ruoccofabrizio/azure-open-ai-embeddings-qna) | 358 |
|[101dotxyz/GPTeam](https://github.com/101dotxyz/GPTeam) | 357 |
|[mtenenholtz/chat-twitter](https://github.com/mtenenholtz/chat-twitter) | 354 |
|[amosjyng/langchain-visualizer](https://github.com/amosjyng/langchain-visualizer) | 343 |
|[msoedov/langcorn](https://github.com/msoedov/langcorn) | 334 |
|[showlab/VLog](https://github.com/showlab/VLog) | 330 |
|[continuum-llms/chatgpt-memory](https://github.com/continuum-llms/chatgpt-memory) | 324 |
|[steamship-core/steamship-langchain](https://github.com/steamship-core/steamship-langchain) | 323 |
|[daodao97/chatdoc](https://github.com/daodao97/chatdoc) | 320 |
|[xuwenhao/geektime-ai-course](https://github.com/xuwenhao/geektime-ai-course) | 308 |
|[StevenGrove/GPT4Tools](https://github.com/StevenGrove/GPT4Tools) | 301 |
|[logan-markewich/llama_index_starter_pack](https://github.com/logan-markewich/llama_index_starter_pack) | 300 |
|[andylokandy/gpt-4-search](https://github.com/andylokandy/gpt-4-search) | 299 |
|[Anil-matcha/ChatPDF](https://github.com/Anil-matcha/ChatPDF) | 287 |
|[itamargol/openai](https://github.com/itamargol/openai) | 273 |
|[BlackHC/llm-strategy](https://github.com/BlackHC/llm-strategy) | 267 |
|[momegas/megabots](https://github.com/momegas/megabots) | 259 |
|[bborn/howdoi.ai](https://github.com/bborn/howdoi.ai) | 238 |
|[Cheems-Seminar/grounded-segment-any-parts](https://github.com/Cheems-Seminar/grounded-segment-any-parts) | 232 |
|[ur-whitelab/exmol](https://github.com/ur-whitelab/exmol) | 227 |
|[sullivan-sean/chat-langchainjs](https://github.com/sullivan-sean/chat-langchainjs) | 227 |
|[explosion/spacy-llm](https://github.com/explosion/spacy-llm) | 226 |
|[recalign/RecAlign](https://github.com/recalign/RecAlign) | 218 |
|[jupyterlab/jupyter-ai](https://github.com/jupyterlab/jupyter-ai) | 218 |
|[alvarosevilla95/autolang](https://github.com/alvarosevilla95/autolang) | 215 |
|[conceptofmind/toolformer](https://github.com/conceptofmind/toolformer) | 213 |
|[MagnivOrg/prompt-layer-library](https://github.com/MagnivOrg/prompt-layer-library) | 209 |
|[JohnSnowLabs/nlptest](https://github.com/JohnSnowLabs/nlptest) | 208 |
|[airobotlab/KoChatGPT](https://github.com/airobotlab/KoChatGPT) | 197 |
|[langchain-ai/auto-evaluator](https://github.com/langchain-ai/auto-evaluator) | 195 |
|[yvann-hub/Robby-chatbot](https://github.com/yvann-hub/Robby-chatbot) | 195 |
|[alejandro-ao/langchain-ask-pdf](https://github.com/alejandro-ao/langchain-ask-pdf) | 192 |
|[daveebbelaar/langchain-experiments](https://github.com/daveebbelaar/langchain-experiments) | 189 |
|[NimbleBoxAI/ChainFury](https://github.com/NimbleBoxAI/ChainFury) | 187 |
|[kaleido-lab/dolphin](https://github.com/kaleido-lab/dolphin) | 184 |
|[Anil-matcha/Website-to-Chatbot](https://github.com/Anil-matcha/Website-to-Chatbot) | 183 |
|[plchld/InsightFlow](https://github.com/plchld/InsightFlow) | 180 |
|[OpenBMB/AgentVerse](https://github.com/OpenBMB/AgentVerse) | 166 |
|[benthecoder/ClassGPT](https://github.com/benthecoder/ClassGPT) | 166 |
|[jbrukh/gpt-jargon](https://github.com/jbrukh/gpt-jargon) | 161 |
|[hardbyte/qabot](https://github.com/hardbyte/qabot) | 160 |
|[shaman-ai/agent-actors](https://github.com/shaman-ai/agent-actors) | 153 |
|[radi-cho/datasetGPT](https://github.com/radi-cho/datasetGPT) | 153 |
|[poe-platform/poe-protocol](https://github.com/poe-platform/poe-protocol) | 152 |
|[paolorechia/learn-langchain](https://github.com/paolorechia/learn-langchain) | 149 |
|[ajndkr/lanarky](https://github.com/ajndkr/lanarky) | 149 |
|[fengyuli-dev/multimedia-gpt](https://github.com/fengyuli-dev/multimedia-gpt) | 147 |
|[yasyf/compress-gpt](https://github.com/yasyf/compress-gpt) | 144 |
|[homanp/superagent](https://github.com/homanp/superagent) | 143 |
|[realminchoi/babyagi-ui](https://github.com/realminchoi/babyagi-ui) | 141 |
|[ethanyanjiali/minChatGPT](https://github.com/ethanyanjiali/minChatGPT) | 141 |
|[ccurme/yolopandas](https://github.com/ccurme/yolopandas) | 139 |
|[hwchase17/langchain-streamlit-template](https://github.com/hwchase17/langchain-streamlit-template) | 138 |
|[Jaseci-Labs/jaseci](https://github.com/Jaseci-Labs/jaseci) | 136 |
|[hirokidaichi/wanna](https://github.com/hirokidaichi/wanna) | 135 |
|[Haste171/langchain-chatbot](https://github.com/Haste171/langchain-chatbot) | 134 |
|[jmpaz/promptlib](https://github.com/jmpaz/promptlib) | 130 |
|[Klingefjord/chatgpt-telegram](https://github.com/Klingefjord/chatgpt-telegram) | 130 |
|[filip-michalsky/SalesGPT](https://github.com/filip-michalsky/SalesGPT) | 128 |
|[handrew/browserpilot](https://github.com/handrew/browserpilot) | 128 |
|[shauryr/S2QA](https://github.com/shauryr/S2QA) | 127 |
|[steamship-core/vercel-examples](https://github.com/steamship-core/vercel-examples) | 127 |
|[yasyf/summ](https://github.com/yasyf/summ) | 127 |
|[gia-guar/JARVIS-ChatGPT](https://github.com/gia-guar/JARVIS-ChatGPT) | 126 |
|[jerlendds/osintbuddy](https://github.com/jerlendds/osintbuddy) | 125 |
|[ibiscp/LLM-IMDB](https://github.com/ibiscp/LLM-IMDB) | 124 |
|[Teahouse-Studios/akari-bot](https://github.com/Teahouse-Studios/akari-bot) | 124 |
|[hwchase17/chroma-langchain](https://github.com/hwchase17/chroma-langchain) | 124 |
|[menloparklab/langchain-cohere-qdrant-doc-retrieval](https://github.com/menloparklab/langchain-cohere-qdrant-doc-retrieval) | 123 |
|[peterw/StoryStorm](https://github.com/peterw/StoryStorm) | 123 |
|[chakkaradeep/pyCodeAGI](https://github.com/chakkaradeep/pyCodeAGI) | 123 |
|[petehunt/langchain-github-bot](https://github.com/petehunt/langchain-github-bot) | 115 |
|[su77ungr/CASALIOY](https://github.com/su77ungr/CASALIOY) | 113 |
|[eunomia-bpf/GPTtrace](https://github.com/eunomia-bpf/GPTtrace) | 113 |
|[zenml-io/zenml-projects](https://github.com/zenml-io/zenml-projects) | 112 |
|[pablomarin/GPT-Azure-Search-Engine](https://github.com/pablomarin/GPT-Azure-Search-Engine) | 111 |
|[shamspias/customizable-gpt-chatbot](https://github.com/shamspias/customizable-gpt-chatbot) | 109 |
|[WongSaang/chatgpt-ui-server](https://github.com/WongSaang/chatgpt-ui-server) | 108 |
|[davila7/file-gpt](https://github.com/davila7/file-gpt) | 104 |
|[enhancedocs/enhancedocs](https://github.com/enhancedocs/enhancedocs) | 102 |
|[aurelio-labs/arxiv-bot](https://github.com/aurelio-labs/arxiv-bot) | 101 |
_Generated by [github-dependents-info](https://github.com/nvuillam/github-dependents-info)_
`github-dependents-info --repo hwchase17/langchain --markdownfile dependents.md --minstars 100 --sort stars`


@@ -0,0 +1,66 @@
# Deployments
So, you've created a really cool chain - now what? How do you deploy it and make it easily shareable with the world?
This section covers several options for that. Note that these options are meant for quick deployment of prototypes and demos, not for production systems. If you need help with the deployment of a production system, please contact us directly.
What follows is a list of template GitHub repositories designed to be easily forked and modified to use your chain. This list is far from exhaustive, and we are EXTREMELY open to contributions here.
## [Streamlit](https://github.com/hwchase17/langchain-streamlit-template)
This repo serves as a template for how to deploy a LangChain app with Streamlit.
It implements a chatbot interface.
It also contains instructions for how to deploy this app on the Streamlit platform.
## [Gradio (on Hugging Face)](https://github.com/hwchase17/langchain-gradio-template)
This repo serves as a template for how to deploy a LangChain app with Gradio.
It implements a chatbot interface, with a "Bring-Your-Own-Token" approach (nice for not racking up big bills).
It also contains instructions for how to deploy this app on the Hugging Face platform.
This is heavily influenced by James Weaver's [excellent examples](https://huggingface.co/JavaFXpert).
## [Beam](https://github.com/slai-labs/get-beam/tree/main/examples/langchain-question-answering)
This repo serves as a template for how to deploy a LangChain app with [Beam](https://beam.cloud).
It implements a Question Answering app and contains instructions for deploying the app as a serverless REST API.
## [Vercel](https://github.com/homanp/vercel-langchain)
A minimal example of how to run LangChain on Vercel using Flask.
## [FastAPI + Vercel](https://github.com/msoedov/langcorn)
A minimal example of how to run LangChain on Vercel using FastAPI and LangCorn/Uvicorn.
## [Kinsta](https://github.com/kinsta/hello-world-langchain)
A minimal example of how to deploy LangChain to [Kinsta](https://kinsta.com) using Flask.
## [Fly.io](https://github.com/fly-apps/hello-fly-langchain)
A minimal example of how to deploy LangChain to [Fly.io](https://fly.io/) using Flask.
## [DigitalOcean App Platform](https://github.com/homanp/digitalocean-langchain)
A minimal example of how to deploy LangChain to DigitalOcean App Platform.
## [Google Cloud Run](https://github.com/homanp/gcp-langchain)
A minimal example of how to deploy LangChain to Google Cloud Run.
## [Steamship](https://github.com/steamship-core/steamship-langchain/)
This repository contains LangChain adapters for Steamship, enabling LangChain developers to rapidly deploy their apps on Steamship. This includes: production-ready endpoints, horizontal scaling across dependencies, persistent storage of app state, multi-tenancy support, etc.
## [Langchain-serve](https://github.com/jina-ai/langchain-serve)
This repository allows users to serve local chains and agents as RESTful, gRPC, or WebSocket APIs, thanks to [Jina](https://docs.jina.ai/). Deploy your chains & agents with ease and enjoy independent scaling, serverless and autoscaling APIs, as well as a Streamlit playground on Jina AI Cloud.
## [BentoML](https://github.com/ssheng/BentoChain)
This repository provides an example of how to deploy a LangChain application with [BentoML](https://github.com/bentoml/BentoML). BentoML is a framework that enables the containerization of machine learning applications as standard OCI images. BentoML also allows for the automatic generation of OpenAPI and gRPC endpoints. With BentoML, you can integrate models from all popular ML frameworks and deploy them as microservices running on the most optimal hardware and scaling independently.
## [Databutton](https://databutton.com/home?new-data-app=true)
These templates serve as examples of how to build, deploy, and share LangChain applications using Databutton. You can create user interfaces with Streamlit, automate tasks by scheduling Python code, and store files and data in the built-in store. Examples include a Chatbot interface with conversational memory, a Personal search engine, and a starter template for LangChain apps. Deploying and sharing is just one click away.


@@ -0,0 +1,20 @@
# ModelScope
This page covers how to use the ModelScope ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific ModelScope wrappers.
## Installation and Setup
* Install the Python SDK with `pip install modelscope`
## Wrappers
### Embeddings
There exists a modelscope Embeddings wrapper, which you can access with
```python
from langchain.embeddings import ModelScopeEmbeddings
```
For a more detailed walkthrough of this, see [this notebook](../modules/models/text_embedding/examples/modelscope_hub.ipynb)
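A minimal usage sketch (the `model_id` below is only an illustrative example; any ModelScope embedding model id should work):
```python
from langchain.embeddings import ModelScopeEmbeddings

# Example model id; substitute any ModelScope embedding model.
embeddings = ModelScopeEmbeddings(model_id="damo/nlp_corom_sentence-embedding_english-base")

# Embed a single query string, or a list of documents.
query_vector = embeddings.embed_query("What is ModelScope?")
doc_vectors = embeddings.embed_documents(["foo bar", "baz qux"])
```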


@@ -0,0 +1,75 @@
# Concepts
These are concepts and terminology commonly used when developing LLM applications.
Each entry contains references to the external papers or sources where the concept was first introduced,
as well as to the places in LangChain where the concept is used.
## Chain of Thought
`Chain of Thought (CoT)` is a prompting technique used to encourage the model to generate a series of intermediate reasoning steps.
A less formal way to induce this behavior is to include “Let's think step-by-step” in the prompt.
- [Chain-of-Thought Paper](https://arxiv.org/pdf/2201.11903.pdf)
- [Step-by-Step Paper](https://arxiv.org/abs/2112.00114)
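For example, a tiny sketch of a CoT-style prompt (the wording here is illustrative, not a LangChain API):
```python
# Appending a step-by-step cue encourages intermediate reasoning steps.
prompt = (
    "Q: A cafeteria had 23 apples. It used 20 for lunch and bought 6 more. "
    "How many apples does it have now?\n"
    "A: Let's think step-by-step."
)
```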
## Action Plan Generation
`Action Plan Generation` is a prompting technique that uses a language model to generate actions to take.
The results of these actions can then be fed back into the language model to generate a subsequent action.
- [WebGPT Paper](https://arxiv.org/pdf/2112.09332.pdf)
- [SayCan Paper](https://say-can.github.io/assets/palm_saycan.pdf)
## ReAct
`ReAct` is a prompting technique that combines Chain-of-Thought prompting with action plan generation.
This induces the model to think about what action to take, then take it.
- [Paper](https://arxiv.org/pdf/2210.03629.pdf)
- [LangChain Example](../modules/agents/agents/examples/react.ipynb)
## Self-ask
`Self-ask` is a prompting method that builds on top of chain-of-thought prompting.
In this method, the model explicitly asks itself follow-up questions, which are then answered by an external search engine.
- [Paper](https://ofir.io/self-ask.pdf)
- [LangChain Example](../modules/agents/agents/examples/self_ask_with_search.ipynb)
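As a rough sketch, a self-ask prompt scaffold looks like the following (the wording is illustrative; see the linked example for the actual prompt):
```python
# The model asks itself follow-up questions; an external search engine
# fills in each intermediate answer before the model continues.
prompt = (
    "Question: What is the hometown of the reigning men's U.S. Open champion?\n"
    "Are follow up questions needed here: Yes.\n"
    "Follow up: Who is the reigning men's U.S. Open champion?\n"
    "Intermediate answer: <filled in by the external search engine>\n"
    "So the final answer is:"
)
```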
## Prompt Chaining
`Prompt Chaining` is combining multiple LLM calls, with the output of one step being the input to the next.
- [PromptChainer Paper](https://arxiv.org/pdf/2203.06566.pdf)
- [Language Model Cascades](https://arxiv.org/abs/2207.10342)
- [ICE Primer Book](https://primer.ought.org/)
- [Socratic Models](https://socraticmodels.github.io/)
## Memetic Proxy
`Memetic Proxy` is encouraging the LLM to respond in a certain way by framing the discussion in a context that the model knows of and that will result in that type of response.
For example, as a conversation between a student and a teacher.
- [Paper](https://arxiv.org/pdf/2102.07350.pdf)
## Self Consistency
`Self Consistency` is a decoding strategy that samples a diverse set of reasoning paths and then selects the most consistent answer.
It is most effective when combined with Chain-of-Thought prompting.
- [Paper](https://arxiv.org/pdf/2203.11171.pdf)
## Inception
`Inception` is also called `First Person Instruction`.
It is encouraging the model to think a certain way by including the start of the model's response in the prompt.
- [Example](https://twitter.com/goodside/status/1583262455207460865?s=20&t=8Hz7XBnK1OF8siQrxxCIGQ)
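A hedged sketch of seeding the start of the model's response (the wording is ours, not from the linked example):
```python
# Ending the prompt with the opening of the answer nudges the model
# to continue in that voice and format.
prompt = (
    "Q: How do I reverse a list in Python?\n"
    "A: Sure, here is how you can do it:"
)
```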
## MemPrompt
`MemPrompt` maintains a memory of errors and user feedback, and uses them to prevent repetition of mistakes.
- [Paper](https://memprompt.com/)


@@ -0,0 +1,572 @@
======================
Quickstart
======================
In this quickstart tutorial you'll build a few simple language model applications with LangChain. Along the way you'll learn about:

- Models (LLMs and chat models)
- Prompt templates
- Chains
- Agents
- Memory
Installation
======================
To get started, install LangChain by running:

.. code-block:: bash

    pip install langchain

or if you prefer conda:

.. code-block:: bash

    conda install langchain -c conda-forge

Environment Setup
======================
Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc.
For this example, we will be using OpenAI's model APIs. First we'll need to install their Python package:

.. code-block:: bash

    pip install openai

Accessing the API requires an API key, which you can get by creating an account and heading [here](https://platform.openai.com/account/api-keys). Once we have a key we'll want to set it as an environment variable by running:

.. code-block:: bash

    export OPENAI_API_KEY="..."

Alternatively, you could do this from inside your Python script or notebook by adding:

.. code-block:: python

    import os
    os.environ["OPENAI_API_KEY"] = "..."

If you'd prefer not to set an environment variable you can pass the key in directly via the `openai_api_key` named parameter when initiating the OpenAI LLM class:

.. code-block:: python

    from langchain.llms import OpenAI

    llm = OpenAI(openai_api_key="...")
Building a Language Model Application
=====================================
Now we can start building our language model application. LangChain provides many modules that can be used to build language model applications. Modules can be used on their own in simple applications, or they can be combined to create more complex functionality.

LLMs
=====================
Get predictions from a language model
-------------------------------------
The basic building block of LangChain is the LLM, which takes in text and generates more text.
As an example, suppose we're building an application that generates a company name based on a company description. In order to do this, we first need to import the OpenAI LLM wrapper.

.. code-block:: python

    from langchain.llms import OpenAI

Now we can initialize the wrapper with relevant parameters. In this case, since we want the outputs to be MORE random, we'll initialize our model with a HIGH temperature.

.. code-block:: python

    llm = OpenAI(temperature=0.9)

And now we can pass in text and get predictions!

.. code-block:: python

    llm("What would be a good company name for a company that makes colorful socks?")

.. code-block:: pycon

    Feetful of Fun

For more details on how to use LLMs within LangChain, see the [LLM getting started guide](../modules/models/llms/getting_started.ipynb).
Chat Models
=====================
Get message completions from a chat model
-----------------------------------------
Chat models are a variation on language models. While chat models use language models under the hood, the interface they expose is a bit different: rather than expose a "text in, text out" API, they expose an interface where "chat messages" are the inputs and outputs.
Chat model APIs are fairly new, so we are still figuring out the correct abstractions.
You can get chat completions by passing one or more messages to the chat model. The response will be a message. The types of messages currently supported in LangChain are `AIMessage`, `HumanMessage`, `SystemMessage`, and `ChatMessage` -- `ChatMessage` takes in an arbitrary role parameter. Most of the time, you'll just be dealing with `HumanMessage`, `AIMessage`, and `SystemMessage`.

.. code-block:: python

    from langchain.chat_models import ChatOpenAI
    from langchain.schema import (
        AIMessage,
        HumanMessage,
        SystemMessage
    )

    chat = ChatOpenAI(temperature=0)

You can get completions by passing in a single message.

.. code-block:: python

    chat([HumanMessage(content="Translate this sentence from English to French. I love programming.")])
    # -> AIMessage(content="J'aime programmer.", additional_kwargs={})

You can also pass in multiple messages for OpenAI's gpt-3.5-turbo and gpt-4 models.

.. code-block:: python

    messages = [
        SystemMessage(content="You are a helpful assistant that translates English to French."),
        HumanMessage(content="I love programming.")
    ]
    chat(messages)
    # -> AIMessage(content="J'aime programmer.", additional_kwargs={})

You can go one step further and generate completions for multiple sets of messages using `generate`. This returns an `LLMResult` with an additional `message` parameter:

.. code-block:: python

    batch_messages = [
        [
            SystemMessage(content="You are a helpful assistant that translates English to French."),
            HumanMessage(content="I love programming.")
        ],
        [
            SystemMessage(content="You are a helpful assistant that translates English to French."),
            HumanMessage(content="I love artificial intelligence.")
        ],
    ]
    result = chat.generate(batch_messages)
    result
    # -> LLMResult(generations=[[ChatGeneration(text="J'aime programmer.", generation_info=None, message=AIMessage(content="J'aime programmer.", additional_kwargs={}))], [ChatGeneration(text="J'aime l'intelligence artificielle.", generation_info=None, message=AIMessage(content="J'aime l'intelligence artificielle.", additional_kwargs={}))]], llm_output={'token_usage': {'prompt_tokens': 57, 'completion_tokens': 20, 'total_tokens': 77}})

You can recover things like token usage from this LLMResult:

.. code-block:: python

    result.llm_output['token_usage']
    # -> {'prompt_tokens': 57, 'completion_tokens': 20, 'total_tokens': 77}
Prompt Templates
=====================
Manage model prompts
---------------------
.. tabs::

    .. group-tab:: LLMs

        Most LLM applications do not pass user input directly to an LLM. Usually they will add the user input to a larger piece of text, called a prompt, that provides additional context on the specific task at hand.
        In the previous example, the text we passed to the model contained instructions to generate a company name. For our application, it'd be great if the user only had to provide the description of a company/product, without having to worry about giving the model instructions.
        With PromptTemplates this is easy! In this case our template would be very simple:

        .. code-block:: python

            from langchain.prompts import PromptTemplate

            prompt = PromptTemplate.from_template("What is a good name for a company that makes {product}?")

        Now we can call the `.format` method with arguments corresponding to our string template to construct the full model input:

        .. code-block:: python

            prompt.format(product="colorful socks")

        .. code-block:: pycon

            What is a good name for a company that makes colorful socks?

        For more details, check out the [Prompts getting started guide](../modules/prompts/chat_prompt_template.ipynb).

    .. group-tab:: Chat models

        Similar to LLMs, you can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplate`s. You can use `ChatPromptTemplate`'s `format_prompt` method -- this returns a `PromptValue`, which you can convert to a string or `Message` object, depending on whether you want to use the formatted value as input to an LLM or a chat model.
        For convenience, there is a `from_template` method exposed on the template. If you were to use this template, this is what it would look like:

        .. code-block:: python

            from langchain.chat_models import ChatOpenAI
            from langchain.prompts.chat import (
                ChatPromptTemplate,
                SystemMessagePromptTemplate,
                HumanMessagePromptTemplate,
            )

            chat = ChatOpenAI(temperature=0)

            template = "You are a helpful assistant that translates {input_language} to {output_language}."
            system_message_prompt = SystemMessagePromptTemplate.from_template(template)
            human_template = "{text}"
            human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)

            chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

            # get a chat completion from the formatted messages
            chat(chat_prompt.format_prompt(input_language="English", output_language="French", text="I love programming.").to_messages())

        .. code-block:: pycon

            AIMessage(content="J'aime programmer.", additional_kwargs={})
Chains
=====================
Combine LLMs and prompts in multi-step workflows
------------------------------------------------
.. tabs::

    .. group-tab:: LLMs

        Now that we've got a model and a prompt template, we'll want to combine the two. Chains give us a way to link (or chain) together multiple primitives, like models, prompts, and other chains.
        The simplest and most common type of chain is an LLMChain, which passes an input first to a PromptTemplate and then to an LLM. We can construct an LLM chain from our existing model and prompt template:

        .. code-block:: python

            from langchain.chains import LLMChain

            chain = LLMChain(llm=llm, prompt=prompt)

        Using this chain we can now replace

        .. code-block:: python

            llm("What would be a good company name for a company that makes colorful socks?")

        with

        .. code-block:: python

            chain.run("colorful socks")

        and get the same output (model randomness aside)

        .. code-block:: pycon

            Feetful of Fun

        There we go, our first chain! Understanding how this simple chain works will set you up well for working with more complex chains.
        For more details, check out the [Chain getting started guide](../modules/chains/getting_started.ipynb).

    .. group-tab:: Chat models

        The `LLMChain` discussed in the above section can be used with chat models as well:

        .. code-block:: python

            from langchain.chat_models import ChatOpenAI
            from langchain import LLMChain
            from langchain.prompts.chat import (
                ChatPromptTemplate,
                SystemMessagePromptTemplate,
                HumanMessagePromptTemplate,
            )

            chat = ChatOpenAI(temperature=0)

            template = "You are a helpful assistant that translates {input_language} to {output_language}."
            system_message_prompt = SystemMessagePromptTemplate.from_template(template)
            human_template = "{text}"
            human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
            chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt, human_message_prompt])

            chain = LLMChain(llm=chat, prompt=chat_prompt)
            chain.run(input_language="English", output_language="French", text="I love programming.")

        .. code-block:: pycon

            "J'aime programmer."
Agents
======================
Dynamically call chains based on user input
-------------------------------------------
.. tabs::

    .. group-tab:: LLMs

        Our first chain ran a pre-determined sequence of steps. To handle complex workflows, we need to be able to dynamically choose actions based on inputs.
        Agents do just this: they use an LLM to determine which actions to take and in what order. Agents are given access to tools, and they repeatedly choose a tool, run the tool, and observe the output until they come up with a final answer.
        To load an agent, you need to choose a(n):

        - LLM: The language model powering the agent.
        - Tool(s): A function that performs a specific duty. This can be things like: Google Search, Database lookup, Python REPL, other chains. For a list of predefined tools and their specifications, see [here](../modules/agents/tools/getting_started.md).
        - Agent name: A string that references a supported agent class. Because this guide focuses on the simplest, highest level API, this only covers using the standard supported agents. If you want to implement a custom agent, see the documentation for custom agents (coming soon). For a list of supported agents and their specifications, see [here](../modules/agents/getting_started.ipynb).

        For this example, we'll be using SerpAPI to query a search engine. You'll need to install the SerpAPI Python package:

        .. code-block:: bash

            pip install google-search-results

        And set the corresponding environment variable.

        .. code-block:: python

            import os
            os.environ["SERPAPI_API_KEY"] = "..."

        Now we can get started!

        .. code-block:: python

            from langchain.agents import AgentType, initialize_agent, load_tools
            from langchain.llms import OpenAI

            # The language model we're going to use to control the agent.
            llm = OpenAI(temperature=0)

            # The tools we'll give the Agent access to. Note that the `llm-math` tool uses an LLM, so we need to pass that in.
            tools = load_tools(["serpapi", "llm-math"], llm=llm)

            # Finally, let's initialize an agent with the tools, the language model, and the type of agent we want to use.
            agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

            # Let's test it out!
            agent.run("What was the high temperature in SF yesterday in Fahrenheit? What is that number raised to the .023 power?")

        Looking at the trace (which is printed because of the `verbose` flag) we can see the sequence of observations and actions the agent took. First it decided to use the search engine to look up the temperature. It then extracted the temperature from the search result and used the math tool to exponentiate.

        .. code-block:: pycon

            > Entering new AgentExecutor chain...
            Thought: I need to find the temperature first, then use the calculator to raise it to the .023 power.
            Action: Search
            Action Input: "High temperature in SF yesterday"
            Observation: San Francisco Temperature Yesterday. Maximum temperature yesterday: 57 °F (at 1:56 pm) Minimum temperature yesterday: 49 °F (at 1:56 am) Average temperature ...
            Thought: I now have the temperature, so I can use the calculator to raise it to the .023 power.
            Action: Calculator
            Action Input: 57^.023
            Observation: Answer: 1.0974509573251117
            Thought: I now know the final answer
            Final Answer: The high temperature in SF yesterday in Fahrenheit raised to the .023 power is 1.0974509573251117.

            > Finished chain.

        .. code-block:: pycon

            The high temperature in SF yesterday in Fahrenheit raised to the .023 power is 1.0974509573251117.

    .. group-tab:: Chat models

        Agents can also be used with chat models. You can initialize one using `AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION` as the agent type.

        .. code-block:: python

            from langchain.agents import load_tools
            from langchain.agents import initialize_agent
            from langchain.agents import AgentType
            from langchain.chat_models import ChatOpenAI
            from langchain.llms import OpenAI

            # First, let's load the language model we're going to use to control the agent.
            chat = ChatOpenAI(temperature=0)

            # Next, let's load some tools to use. Note that the `llm-math` tool uses an LLM, so we need to pass that in.
            llm = OpenAI(temperature=0)
            tools = load_tools(["serpapi", "llm-math"], llm=llm)

            # Finally, let's initialize an agent with the tools, the language model, and the type of agent we want to use.
            agent = initialize_agent(tools, chat, agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, verbose=True)

            # Now let's test it out!
            agent.run("Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?")

        .. code-block:: pycon

            > Entering new AgentExecutor chain...
            Thought: I need to use a search engine to find Olivia Wilde's boyfriend and a calculator to raise his age to the 0.23 power.
            Action:
            {
              "action": "Search",
              "action_input": "Olivia Wilde boyfriend"
            }
            Observation: Sudeikis and Wilde's relationship ended in November 2020. Wilde was publicly served with court documents regarding child custody while she was presenting Don't Worry Darling at CinemaCon 2022. In January 2021, Wilde began dating singer Harry Styles after meeting during the filming of Don't Worry Darling.
            Thought: I need to use a search engine to find Harry Styles' current age.
            Action:
            {
              "action": "Search",
              "action_input": "Harry Styles age"
            }
            Observation: 29 years
            Thought: Now I need to calculate 29 raised to the 0.23 power.
            Action:
            {
              "action": "Calculator",
              "action_input": "29^0.23"
            }
            Observation: Answer: 2.169459462491557
            Thought: I now know the final answer.
            Final Answer: 2.169459462491557

            > Finished chain.
            '2.169459462491557'
Memory
======================
Add state to chains and agents
--------------------------------
.. tabs::
.. group-tab:: LLMs
The chains and agents we've looked at so far have been stateless, but for many applications it's necessary to reference past interactions. This is clearly the case with a chatbot, for example, where you want it to understand new messages in the context of past messages.
The Memory module gives you a way to maintain application state. The base Memory interface is simple: it lets you update state given the latest run inputs and outputs and it lets you modify (or contextualize) the next input using the stored state.
There are a number of built-in memory systems. The simplest of these are buffers that prepend the last few inputs/outputs to the current input.
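For example, here's a minimal sketch of that interface using the simplest buffer memory (relying on the `langchain.memory` import path that appears later in this guide):
.. code-block:: python
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
# Update the stored state with the latest run's inputs and outputs
memory.save_context({"input": "Hi there!"}, {"output": "Hello! How are you today?"})
# Read the state back to contextualize the next input
memory.load_memory_variables({})
# -> {'history': 'Human: Hi there!\nAI: Hello! How are you today?'}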
There are also a number of chains with built-in memory. This notebook walks through using one of those chains (the `ConversationChain`) with two different types of memory.
By default, the `ConversationChain` has a simple type of memory that remembers all previous inputs/outputs and adds as many of them to the prompt as it can. Let's take a look at using this chain (setting `verbose=True` so we can see the prompt).
.. code-block:: python
from langchain import OpenAI, ConversationChain
llm = OpenAI(temperature=0)
conversation = ConversationChain(llm=llm, verbose=True)
conversation.run("Hi there!")
Here's what's going on under the hood:
.. code-block:: pycon
> Entering new chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hi there!
AI:
> Finished chain.
And here's our final output:
.. code-block:: pycon
'Hello! How are you today?'
Now if we run the chain again
.. code-block:: python
conversation.run("I'm doing well! Just having a conversation with an AI.")
we'll see that the full prompt that's passed to the model contains the input and output of our first interaction, along with our latest input
.. code-block:: pycon
> Entering new chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.
Current conversation:
Human: Hi there!
AI: Hello! How are you today?
Human: I'm doing well! Just having a conversation with an AI.
AI:
> Finished chain.
.. code-block:: pycon
"That's great! What would you like to talk about?"
.. group-tab:: Chat models
You can use Memory with chains and agents initialized with chat models. The main difference between this and Memory for LLMs is that rather than trying to condense all previous messages into a string, we can keep them as their own unique memory objects.
.. code-block:: python
from langchain.prompts import (
ChatPromptTemplate,
MessagesPlaceholder,
SystemMessagePromptTemplate,
HumanMessagePromptTemplate
)
from langchain.chains import ConversationChain
from langchain.chat_models import ChatOpenAI
from langchain.memory import ConversationBufferMemory
prompt = ChatPromptTemplate.from_messages([
SystemMessagePromptTemplate.from_template(
"The following is a friendly conversation between a human and an AI. The AI is talkative and "
"provides lots of specific details from its context. If the AI does not know the answer to a "
"question, it truthfully says it does not know."
),
MessagesPlaceholder(variable_name="history"),
HumanMessagePromptTemplate.from_template("{input}")
])
llm = ChatOpenAI(temperature=0)
memory = ConversationBufferMemory(return_messages=True)
conversation = ConversationChain(memory=memory, prompt=prompt, llm=llm)
conversation.predict(input="Hi there!")
.. code-block:: pycon
'Hello! How can I assist you today?'
.. code-block:: python
conversation.predict(input="I'm doing well! Just having a conversation with an AI.")
.. code-block:: pycon
"That sounds like fun! I'm happy to chat with you. Is there anything specific you'd like to talk about?"
.. code-block:: python
conversation.predict(input="Tell me about yourself.")
# -> "Sure! I am an AI language model created by OpenAI. I was trained on a large dataset of text from the internet, which allows me to understand and generate human-like language. I can answer questions, provide information, and even have conversations like this one. Is there anything else you'd like to know about me?"
@@ -0,0 +1,108 @@
# Tutorials
This is a collection of `LangChain` tutorials on `YouTube`.
⛓ icon marks a new video [last update 2023-05-15]
### [LangChain Tutorials](https://www.youtube.com/watch?v=FuqdVNB_8c0&list=PL9V0lbeJ69brU-ojMpU1Y7Ic58Tap0Cw6) by [Edrick](https://www.youtube.com/@edrickdch):
- ⛓ [LangChain, Chroma DB, OpenAI Beginner Guide | ChatGPT with your PDF](https://youtu.be/FuqdVNB_8c0)
[LangChain Crash Course: Build an AutoGPT app in 25 minutes](https://youtu.be/MlK6SIjcjE8) by [Nicholas Renotte](https://www.youtube.com/@NicholasRenotte)
[LangChain Crash Course - Build apps with language models](https://youtu.be/LbT1yp6quS8) by [Patrick Loeber](https://www.youtube.com/@patloeber)
[LangChain Explained in 13 Minutes | QuickStart Tutorial for Beginners](https://youtu.be/aywZrzNaKjs) by [Rabbitmetrics](https://www.youtube.com/@rabbitmetrics)
### [LangChain for Gen AI and LLMs](https://www.youtube.com/playlist?list=PLIUOU7oqGTLieV9uTIFMm6_4PXg-hlN6F) by [James Briggs](https://www.youtube.com/@jamesbriggs):
- #1 [Getting Started with `GPT-3` vs. Open Source LLMs](https://youtu.be/nE2skSRWTTs)
- #2 [Prompt Templates for `GPT 3.5` and other LLMs](https://youtu.be/RflBcK0oDH0)
- #3 [LLM Chains using `GPT 3.5` and other LLMs](https://youtu.be/S8j9Tk0lZHU)
- #4 [Chatbot Memory for `Chat-GPT`, `Davinci` + other LLMs](https://youtu.be/X05uK0TZozM)
- #5 [Chat with OpenAI in LangChain](https://youtu.be/CnAgB3A5OlU)
- #6 [Fixing LLM Hallucinations with Retrieval Augmentation in LangChain](https://youtu.be/kvdVduIJsc8)
- #7 [LangChain Agents Deep Dive with GPT 3.5](https://youtu.be/jSP-gSEyVeI)
- #8 [Create Custom Tools for Chatbots in LangChain](https://youtu.be/q-HNphrWsDE)
- #9 [Build Conversational Agents with Vector DBs](https://youtu.be/H6bCqqw9xyI)
### [LangChain 101](https://www.youtube.com/playlist?list=PLqZXAkvF1bPNQER9mLmDbntNfSpzdDIU5) by [Data Independent](https://www.youtube.com/@DataIndependent):
- [What Is LangChain? - LangChain + `ChatGPT` Overview](https://youtu.be/_v_fgW2SkkQ)
- [Quickstart Guide](https://youtu.be/kYRB-vJFy38)
- [Beginner Guide To 7 Essential Concepts](https://youtu.be/2xxziIWmaSA)
- [`OpenAI` + `Wolfram Alpha`](https://youtu.be/UijbzCIJ99g)
- [Ask Questions On Your Custom (or Private) Files](https://youtu.be/EnT-ZTrcPrg)
- [Connect `Google Drive Files` To `OpenAI`](https://youtu.be/IqqHqDcXLww)
- [`YouTube Transcripts` + `OpenAI`](https://youtu.be/pNcQ5XXMgH4)
- [Question A 300 Page Book (w/ `OpenAI` + `Pinecone`)](https://youtu.be/h0DHDp1FbmQ)
- [Workaround `OpenAI's` Token Limit With Chain Types](https://youtu.be/f9_BWhCI4Zo)
- [Build Your Own OpenAI + LangChain Web App in 23 Minutes](https://youtu.be/U_eV8wfMkXU)
- [Working With The New `ChatGPT API`](https://youtu.be/e9P7FLi5Zy8)
- [OpenAI + LangChain Wrote Me 100 Custom Sales Emails](https://youtu.be/y1pyAQM-3Bo)
- [Structured Output From `OpenAI` (Clean Dirty Data)](https://youtu.be/KwAXfey-xQk)
- [Connect `OpenAI` To +5,000 Tools (LangChain + `Zapier`)](https://youtu.be/7tNm0yiDigU)
- [Use LLMs To Extract Data From Text (Expert Mode)](https://youtu.be/xZzvwR9jdPA)
- ⛓ [Extract Insights From Interview Transcripts Using LLMs](https://youtu.be/shkMOHwJ4SM)
- ⛓ [5 Levels Of LLM Summarizing: Novice to Expert](https://youtu.be/qaPMdcCqtWk)
### [LangChain How to and guides](https://www.youtube.com/playlist?list=PL8motc6AQftk1Bs42EW45kwYbyJ4jOdiZ) by [Sam Witteveen](https://www.youtube.com/@samwitteveenai):
- [LangChain Basics - LLMs & PromptTemplates with Colab](https://youtu.be/J_0qvRt4LNk)
- [LangChain Basics - Tools and Chains](https://youtu.be/hI2BY7yl_Ac)
- [`ChatGPT API` Announcement & Code Walkthrough with LangChain](https://youtu.be/phHqvLHCwH4)
- [Conversations with Memory (explanation & code walkthrough)](https://youtu.be/X550Zbz_ROE)
- [Chat with `Flan20B`](https://youtu.be/VW5LBavIfY4)
- [Using `Hugging Face Models` locally (code walkthrough)](https://youtu.be/Kn7SX2Mx_Jk)
- [`PAL` : Program-aided Language Models with LangChain code](https://youtu.be/dy7-LvDu-3s)
- [Building a Summarization System with LangChain and `GPT-3` - Part 1](https://youtu.be/LNq_2s_H01Y)
- [Building a Summarization System with LangChain and `GPT-3` - Part 2](https://youtu.be/d-yeHDLgKHw)
- [Microsoft's `Visual ChatGPT` using LangChain](https://youtu.be/7YEiEyfPF5U)
- [LangChain Agents - Joining Tools and Chains with Decisions](https://youtu.be/ziu87EXZVUE)
- [Comparing LLMs with LangChain](https://youtu.be/rFNG0MIEuW0)
- [Using `Constitutional AI` in LangChain](https://youtu.be/uoVqNFDwpX4)
- [Talking to `Alpaca` with LangChain - Creating an Alpaca Chatbot](https://youtu.be/v6sF8Ed3nTE)
- [Talk to your `CSV` & `Excel` with LangChain](https://youtu.be/xQ3mZhw69bc)
- [`BabyAGI`: Discover the Power of Task-Driven Autonomous Agents!](https://youtu.be/QBcDLSE2ERA)
- [Improve your `BabyAGI` with LangChain](https://youtu.be/DRgPyOXZ-oE)
- ⛓ [Master `PDF` Chat with LangChain - Your essential guide to queries on documents](https://youtu.be/ZzgUqFtxgXI)
- ⛓ [Using LangChain with `DuckDuckGO` `Wikipedia` & `PythonREPL` Tools](https://youtu.be/KerHlb8nuVc)
- ⛓ [Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)](https://youtu.be/biS8G8x8DdA)
- ⛓ [LangChain Retrieval QA Over Multiple Files with `ChromaDB`](https://youtu.be/3yPBVii7Ct0)
- ⛓ [LangChain Retrieval QA with Instructor Embeddings & `ChromaDB` for PDFs](https://youtu.be/cFCGUjc33aU)
- ⛓ [LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!](https://youtu.be/9ISVjh8mdlA)
### [LangChain](https://www.youtube.com/playlist?list=PLVEEucA9MYhOu89CX8H3MBZqayTbcCTMr) by [Prompt Engineering](https://www.youtube.com/@engineerprompt):
- [LangChain Crash Course — All You Need to Know to Build Powerful Apps with LLMs](https://youtu.be/5-fc4Tlgmro)
- [Working with MULTIPLE `PDF` Files in LangChain: `ChatGPT` for your Data](https://youtu.be/s5LhRdh5fu4)
- [`ChatGPT` for YOUR OWN `PDF` files with LangChain](https://youtu.be/TLf90ipMzfE)
- [Talk to YOUR DATA without OpenAI APIs: LangChain](https://youtu.be/wrD-fZvT6UI)
- ⛓️ [CHATGPT For WEBSITES: Custom ChatBOT](https://youtu.be/RBnuhhmD21U)
### LangChain by [Chat with data](https://www.youtube.com/@chatwithdata):
- [LangChain Beginner's Tutorial for `Typescript`/`Javascript`](https://youtu.be/bH722QgRlhQ)
- [`GPT-4` Tutorial: How to Chat With Multiple `PDF` Files (~1000 pages of Tesla's 10-K Annual Reports)](https://youtu.be/Ix9WIZpArm0)
- [`GPT-4` & LangChain Tutorial: How to Chat With A 56-Page `PDF` Document (w/`Pinecone`)](https://youtu.be/ih9PBGVVOO4)
- ⛓ [LangChain & Supabase Tutorial: How to Build a ChatGPT Chatbot For Your Website](https://youtu.be/R2FMzcsmQY8)
### [Get SH\*T Done with Prompt Engineering and LangChain](https://www.youtube.com/watch?v=muXbPpG_ys4&list=PLEJK-H61Xlwzm5FYLDdKt_6yibO33zoMW) by [Venelin Valkov](https://www.youtube.com/@venelin_valkov):
- [Getting Started with LangChain: Load Custom Data, Run OpenAI Models, Embeddings and `ChatGPT`](https://www.youtube.com/watch?v=muXbPpG_ys4)
- [Loaders, Indexes & Vectorstores in LangChain: Question Answering on `PDF` files with `ChatGPT`](https://www.youtube.com/watch?v=FQnvfR8Dmr0)
- [LangChain Models: `ChatGPT`, `Flan Alpaca`, `OpenAI Embeddings`, Prompt Templates & Streaming](https://www.youtube.com/watch?v=zy6LiK5F5-s)
- [LangChain Chains: Use `ChatGPT` to Build Conversational Agents, Summaries and Q&A on Text With LLMs](https://www.youtube.com/watch?v=h1tJZQPcimM)
- [Analyze Custom CSV Data with `GPT-4` using Langchain](https://www.youtube.com/watch?v=Ew3sGdX8at4)
- ⛓ [Build ChatGPT Chatbots with LangChain Memory: Understanding and Implementing Memory in Conversations](https://youtu.be/CyuUlf54wTs)
---------------------
⛓ icon marks a new video [last update 2023-05-15]
docs/index.rst Normal file
@@ -0,0 +1,202 @@
Welcome to LangChain
==========================
| **LangChain** is a framework for developing applications powered by language models. We believe that the most powerful and differentiated applications will not only call out to a language model, but will also be:
1. *Data-aware*: connect a language model to other sources of data
2. *Agentic*: allow a language model to interact with its environment
| The LangChain framework is designed around these principles.
| This is the Python specific portion of the documentation. For a purely conceptual guide to LangChain, see `here <https://docs.langchain.com/docs/>`_. For the JavaScript documentation, see `here <https://js.langchain.com/docs/>`_.
Getting Started
----------------
| How to get started using LangChain to create a Language Model application.
- `Quickstart Guide <./getting_started/getting_started.html>`_
| Concepts and terminology.
- `Concepts and terminology <./getting_started/concepts.html>`_
| Tutorials created by community experts and presented on YouTube.
- `Tutorials <./getting_started/tutorials.html>`_
.. toctree::
:maxdepth: 2
:caption: Getting Started
:name: getting_started
:hidden:
getting_started/getting_started.md
getting_started/concepts.md
getting_started/tutorials.md
Modules
-----------
| These modules are the core abstractions which we view as the building blocks of any LLM-powered application.
For each module, LangChain provides standard, extendable interfaces. LangChain also provides external integrations and even end-to-end implementations for off-the-shelf use.
| The docs for each module contain quickstart examples, how-to guides, reference docs, and conceptual guides.
| The modules are (from least to most complex):
- `Models <./modules/models.html>`_: Supported model types and integrations.
- `Prompts <./modules/prompts.html>`_: Prompt management, optimization, and serialization.
- `Memory <./modules/memory.html>`_: Memory refers to state that is persisted between calls of a chain/agent.
- `Indexes <./modules/indexes.html>`_: Language models become much more powerful when combined with application-specific data - this module contains interfaces and integrations for loading, querying and updating external data.
- `Chains <./modules/chains.html>`_: Chains are structured sequences of calls (to an LLM or to a different utility).
- `Agents <./modules/agents.html>`_: An agent is a Chain in which an LLM, given a high-level directive and a set of tools, repeatedly decides an action, executes the action and observes the outcome until the high-level directive is complete.
- `Callbacks <./modules/callbacks/getting_started.html>`_: Callbacks let you log and stream the intermediate steps of any chain, making it easy to observe, debug, and evaluate the internals of an application.
.. toctree::
:maxdepth: 1
:caption: Modules
:name: modules
:hidden:
./modules/models.rst
./modules/prompts.rst
./modules/memory.md
./modules/indexes.md
./modules/chains.md
./modules/agents.md
./modules/callbacks/getting_started.ipynb
Use Cases
----------
| Best practices and built-in implementations for common LangChain use cases:
- `Autonomous Agents <./use_cases/autonomous_agents.html>`_: Autonomous agents are long-running agents that take many steps in an attempt to accomplish an objective. Examples include AutoGPT and BabyAGI.
- `Agent Simulations <./use_cases/agent_simulations.html>`_: Putting agents in a sandbox and observing how they interact with each other and react to events can be an effective way to evaluate their long-range reasoning and planning abilities.
- `Personal Assistants <./use_cases/personal_assistants.html>`_: One of the primary LangChain use cases. Personal assistants need to take actions, remember interactions, and have knowledge about your data.
- `Question Answering <./use_cases/question_answering.html>`_: Another common LangChain use case. Answering questions over specific documents, only utilizing the information in those documents to construct an answer.
- `Chatbots <./use_cases/chatbots.html>`_: Language models love to chat, making this a very natural use of them.
- `Querying Tabular Data <./use_cases/tabular.html>`_: Recommended reading if you want to use language models to query structured data (CSVs, SQL, dataframes, etc.).
- `Code Understanding <./use_cases/code.html>`_: Recommended reading if you want to use language models to analyze code.
- `Interacting with APIs <./use_cases/apis.html>`_: Enabling language models to interact with APIs is extremely powerful. It gives them access to up-to-date information and allows them to take actions.
- `Extraction <./use_cases/extraction.html>`_: Extract structured information from text.
- `Summarization <./use_cases/summarization.html>`_: Compressing longer documents. A type of Data-Augmented Generation.
- `Evaluation <./use_cases/evaluation.html>`_: Generative models are hard to evaluate with traditional metrics. One promising approach is to use language models themselves to do the evaluation.
.. toctree::
:maxdepth: 1
:caption: Use Cases
:name: use_cases
:hidden:
./use_cases/autonomous_agents.md
./use_cases/agent_simulations.md
./use_cases/personal_assistants.md
./use_cases/question_answering.md
./use_cases/chatbots.md
./use_cases/tabular.rst
./use_cases/code.md
./use_cases/apis.md
./use_cases/extraction.md
./use_cases/summarization.md
./use_cases/evaluation.rst
Reference Docs
---------------
| Full documentation on all methods, classes, installation methods, and integration setups for LangChain.
- `LangChain Installation <./reference/installation.html>`_
- `Reference Documentation <./reference.html>`_
.. toctree::
:maxdepth: 1
:caption: Reference
:name: reference
:hidden:
./reference/installation.md
./reference.rst
Ecosystem
------------
| LangChain integrates with many different LLMs, systems, and products.
| In turn, many systems and products depend on LangChain.
| Together, they create a vibrant and thriving ecosystem.
- `Integrations <./integrations.html>`_: Guides for how other products can be used with LangChain.
- `Dependents <./dependents.html>`_: List of repositories that use LangChain.
- `Deployments <./ecosystem/deployments.html>`_: A collection of instructions, code snippets, and template repositories for deploying LangChain apps.
.. toctree::
:maxdepth: 2
:glob:
:caption: Ecosystem
:name: ecosystem
:hidden:
./integrations.rst
./dependents.md
./ecosystem/deployments.md
Additional Resources
---------------------
| Additional resources we think may be useful as you develop your application!
- `LangChainHub <https://github.com/hwchase17/langchain-hub>`_: The LangChainHub is a place to share and explore other prompts, chains, and agents.
- `Gallery <https://github.com/kyrolabs/awesome-langchain>`_: A collection of great projects that use LangChain, compiled by the folks at `Kyrolabs <https://kyrolabs.com>`_. Useful for finding inspiration and example implementations.
- `Tracing <./additional_resources/tracing.html>`_: A guide on using tracing in LangChain to visualize the execution of chains and agents.
- `Model Laboratory <./additional_resources/model_laboratory.html>`_: Experimenting with different prompts, models, and chains is a big part of developing the best possible application. The ModelLaboratory makes it easy to do so.
- `Discord <https://discord.gg/6adMQxSpJS>`_: Join us on our Discord to discuss all things LangChain!
- `YouTube <./additional_resources/youtube.html>`_: A collection of the LangChain tutorials and videos.
- `Production Support <https://forms.gle/57d8AmXBYp8PP8tZA>`_: As you move your LangChains into production, we'd love to offer more comprehensive support. Please fill out this form and we'll set up a dedicated support Slack channel.
.. toctree::
:maxdepth: 1
:caption: Additional Resources
:name: resources
:hidden:
LangChainHub <https://github.com/hwchase17/langchain-hub>
Gallery <https://github.com/kyrolabs/awesome-langchain>
./additional_resources/tracing.md
./additional_resources/model_laboratory.ipynb
Discord <https://discord.gg/6adMQxSpJS>
./additional_resources/youtube.md
Production Support <https://forms.gle/57d8AmXBYp8PP8tZA>
docs/integrations.rst Normal file
@@ -0,0 +1,33 @@
Integrations
===================
LangChain integrates with many LLMs, systems, and products.
Integrations by Module
--------------------------------
| Integrations grouped by the core LangChain module they map to:
- `LLM Providers <./modules/models/llms/integrations.html>`_
- `Chat Model Providers <./modules/models/chat/integrations.html>`_
- `Text Embedding Model Providers <./modules/models/text_embedding.html>`_
- `Document Loader Integrations <./modules/indexes/document_loaders.html>`_
- `Text Splitter Integrations <./modules/indexes/text_splitters.html>`_
- `Vectorstore Providers <./modules/indexes/vectorstores.html>`_
- `Retriever Providers <./modules/indexes/retrievers.html>`_
- `Tool Providers <./modules/agents/tools.html>`_
- `Toolkit Integrations <./modules/agents/toolkits.html>`_
All Integrations
-------------------------------------------
| A comprehensive list of LLMs, systems, and products integrated with LangChain:
.. toctree::
:maxdepth: 1
:glob:
integrations/*
docs/integrations/ai21.md Normal file
@@ -0,0 +1,16 @@
# AI21 Labs
This page covers how to use the AI21 ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific AI21 wrappers.
## Installation and Setup
- Get an AI21 api key and set it as an environment variable (`AI21_API_KEY`)
## Wrappers
### LLM
There exists an AI21 LLM wrapper, which you can access with
```python
from langchain.llms import AI21
```
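As a minimal usage sketch (assuming the `AI21_API_KEY` environment variable is set as described above):
```python
from langchain.llms import AI21

# Assumes AI21_API_KEY is set in the environment
llm = AI21()
print(llm("Tell me a joke"))
```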
@@ -0,0 +1,291 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Aim\n",
"\n",
"Aim makes it super easy to visualize and debug LangChain executions. Aim tracks inputs and outputs of LLMs and tools, as well as actions of agents. \n",
"\n",
"With Aim, you can easily debug and examine an individual execution:\n",
"\n",
"![](https://user-images.githubusercontent.com/13848158/227784778-06b806c7-74a1-4d15-ab85-9ece09b458aa.png)\n",
"\n",
"Additionally, you have the option to compare multiple executions side by side:\n",
"\n",
"![](https://user-images.githubusercontent.com/13848158/227784994-699b24b7-e69b-48f9-9ffa-e6a6142fd719.png)\n",
"\n",
"Aim is fully open source, [learn more](https://github.com/aimhubio/aim) about Aim on GitHub.\n",
"\n",
"Let's move forward and see how to enable and configure Aim callback."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h3>Tracking LangChain Executions with Aim</h3>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this notebook we will explore three usage scenarios. To start off, we will install the necessary packages and import certain modules. Subsequently, we will configure two environment variables that can be established either within the Python script or through the terminal."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "mf88kuCJhbVu"
},
"outputs": [],
"source": [
"!pip install aim\n",
"!pip install langchain\n",
"!pip install openai\n",
"!pip install google-search-results"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "g4eTuajwfl6L"
},
"outputs": [],
"source": [
"import os\n",
"from datetime import datetime\n",
"\n",
"from langchain.llms import OpenAI\n",
"from langchain.callbacks import AimCallbackHandler, StdOutCallbackHandler"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our examples use a GPT model as the LLM, and OpenAI offers an API for this purpose. You can obtain the key from the following link: https://platform.openai.com/account/api-keys .\n",
"\n",
"We will use the SerpApi to retrieve search results from Google. To acquire the SerpApi key, please go to https://serpapi.com/manage-api-key ."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "T1bSmKd6V2If"
},
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = \"...\"\n",
"os.environ[\"SERPAPI_API_KEY\"] = \"...\""
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "QenUYuBZjIzc"
},
"source": [
"The event methods of `AimCallbackHandler` accept the LangChain module or agent as input and log at least the prompts and generated results, as well as the serialized version of the LangChain module, to the designated Aim run."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "KAz8weWuUeXF"
},
"outputs": [],
"source": [
"session_group = datetime.now().strftime(\"%m.%d.%Y_%H.%M.%S\")\n",
"aim_callback = AimCallbackHandler(\n",
" repo=\".\",\n",
" experiment_name=\"scenario 1: OpenAI LLM\",\n",
")\n",
"\n",
"callbacks = [StdOutCallbackHandler(), aim_callback]\n",
"llm = OpenAI(temperature=0, callbacks=callbacks)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "b8WfByB4fl6N"
},
"source": [
"The `flush_tracker` function is used to record LangChain assets on Aim. By default, the session is reset rather than being terminated outright."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h3>Scenario 1</h3> In the first scenario, we will use OpenAI LLM."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "o_VmneyIUyx8"
},
"outputs": [],
"source": [
"# scenario 1 - LLM\n",
"llm_result = llm.generate([\"Tell me a joke\", \"Tell me a poem\"] * 3)\n",
"aim_callback.flush_tracker(\n",
" langchain_asset=llm,\n",
" experiment_name=\"scenario 2: Chain with multiple SubChains on multiple generations\",\n",
")\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h3>Scenario 2</h3> Scenario two involves chaining with multiple SubChains across multiple generations."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "trxslyb1U28Y"
},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain.chains import LLMChain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "uauQk10SUzF6"
},
"outputs": [],
"source": [
"# scenario 2 - Chain\n",
"template = \"\"\"You are a playwright. Given the title of play, it is your job to write a synopsis for that title.\n",
"Title: {title}\n",
"Playwright: This is a synopsis for the above play:\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"title\"], template=template)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callbacks=callbacks)\n",
"\n",
"test_prompts = [\n",
" {\"title\": \"documentary about good video games that push the boundary of game design\"},\n",
" {\"title\": \"the phenomenon behind the remarkable speed of cheetahs\"},\n",
" {\"title\": \"the best in class mlops tooling\"},\n",
"]\n",
"synopsis_chain.apply(test_prompts)\n",
"aim_callback.flush_tracker(\n",
" langchain_asset=synopsis_chain, experiment_name=\"scenario 3: Agent with Tools\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h3>Scenario 3</h3> The third scenario involves an agent with tools."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "_jN73xcPVEpI"
},
"outputs": [],
"source": [
"from langchain.agents import initialize_agent, load_tools\n",
"from langchain.agents import AgentType"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "Gpq4rk6VT9cu",
"outputId": "68ae261e-d0a2-4229-83c4-762562263b66"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Leo DiCaprio's girlfriend is and then calculate her age raised to the 0.43 power.\n",
"Action: Search\n",
"Action Input: \"Leo DiCaprio girlfriend\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mLeonardo DiCaprio seemed to prove a long-held theory about his love life right after splitting from girlfriend Camila Morrone just months ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to find out Camila Morrone's age\n",
"Action: Search\n",
"Action Input: \"Camila Morrone age\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3m25 years\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I need to calculate 25 raised to the 0.43 power\n",
"Action: Calculator\n",
"Action Input: 25^0.43\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mAnswer: 3.991298452658078\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Camila Morrone is Leo DiCaprio's girlfriend and her current age raised to the 0.43 power is 3.991298452658078.\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"# scenario 3 - Agent with Tools\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callbacks=callbacks)\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
" callbacks=callbacks,\n",
")\n",
"agent.run(\n",
" \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\"\n",
")\n",
"aim_callback.flush_tracker(langchain_asset=agent, reset=False, finish=True)"
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"provenance": []
},
"gpuClass": "standard",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
@@ -0,0 +1,15 @@
# AnalyticDB
This page covers how to use the AnalyticDB ecosystem within LangChain.
### VectorStore
There exists a wrapper around AnalyticDB, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain.vectorstores import AnalyticDB
```
For a more detailed walkthrough of the AnalyticDB wrapper, see [this notebook](../modules/indexes/vectorstores/examples/analyticdb.ipynb)
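As a rough sketch of how this might look (the `connection_string` parameter name and URI format here are assumptions; see the linked notebook for the exact signature):
```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import AnalyticDB

# Hypothetical connection string; check the walkthrough notebook for details
db = AnalyticDB.from_texts(
    ["AnalyticDB can serve as a LangChain vectorstore."],
    embedding=OpenAIEmbeddings(),
    connection_string="postgresql+psycopg2://user:password@host:5432/dbname",
)
docs = db.similarity_search("What can AnalyticDB do?")
```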
@@ -0,0 +1,17 @@
# Anyscale
This page covers how to use the Anyscale ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Anyscale wrappers.
## Installation and Setup
- Get an Anyscale Service URL, route, and API key and set them as environment variables (`ANYSCALE_SERVICE_URL`, `ANYSCALE_SERVICE_ROUTE`, `ANYSCALE_SERVICE_TOKEN`).
- Please see [the Anyscale docs](https://docs.anyscale.com/productionize/services-v2/get-started) for more details.
## Wrappers
### LLM
There exists an Anyscale LLM wrapper, which you can access with
```python
from langchain.llms import Anyscale
```
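A minimal sketch, assuming the three environment variables above are already set:
```python
from langchain.llms import Anyscale

# Assumes ANYSCALE_SERVICE_URL, ANYSCALE_SERVICE_ROUTE and
# ANYSCALE_SERVICE_TOKEN are set as described above
llm = Anyscale()
print(llm("When was George Washington president?"))
```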
@@ -0,0 +1,46 @@
# Apify
This page covers how to use [Apify](https://apify.com) within LangChain.
## Overview
Apify is a cloud platform for web scraping and data extraction,
which provides an [ecosystem](https://apify.com/store) of more than a thousand
ready-made apps called *Actors* for various scraping, crawling, and extraction use cases.
[![Apify Actors](../_static/ApifyActors.png)](https://apify.com/store)
This integration enables you to run Actors on the Apify platform and load their results into LangChain to feed your vector
indexes with documents and data from the web, e.g. to generate answers from websites with documentation,
blogs, or knowledge bases.
## Installation and Setup
- Install the Apify API client for Python with `pip install apify-client`
- Get your [Apify API token](https://console.apify.com/account/integrations) and either set it as
an environment variable (`APIFY_API_TOKEN`) or pass it to the `ApifyWrapper` as `apify_api_token` in the constructor.
## Wrappers
### Utility
You can use the `ApifyWrapper` to run Actors on the Apify platform.
```python
from langchain.utilities import ApifyWrapper
```
For a more detailed walkthrough of this wrapper, see [this notebook](../modules/agents/tools/examples/apify.ipynb).
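For example, here's a sketch that runs the `apify/website-content-crawler` Actor and maps each result item to a LangChain `Document` (the start URL is illustrative):
```python
from langchain.document_loaders.base import Document
from langchain.utilities import ApifyWrapper

apify = ApifyWrapper()  # reads APIFY_API_TOKEN from the environment

# Run an Actor and map each dataset item to a LangChain Document
loader = apify.call_actor(
    actor_id="apify/website-content-crawler",
    run_input={"startUrls": [{"url": "https://python.langchain.com/en/latest/"}]},
    dataset_mapping_function=lambda item: Document(
        page_content=item["text"] or "", metadata={"source": item["url"]}
    ),
)
docs = loader.load()
```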
### Loader
You can also use our `ApifyDatasetLoader` to get data from an Apify dataset.
```python
from langchain.document_loaders import ApifyDatasetLoader
```
For a more detailed walkthrough of this loader, see [this notebook](../modules/indexes/document_loaders/examples/apify_dataset.ipynb).
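And a corresponding sketch for loading an existing dataset (the dataset ID is a placeholder):
```python
from langchain.document_loaders import ApifyDatasetLoader
from langchain.document_loaders.base import Document

loader = ApifyDatasetLoader(
    dataset_id="your-dataset-id",  # placeholder
    dataset_mapping_function=lambda item: Document(
        page_content=item["text"] or "", metadata={"source": item["url"]}
    ),
)
docs = loader.load()
```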
@@ -0,0 +1,27 @@
# AtlasDB
This page covers how to use Nomic's Atlas ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Atlas wrappers.
## Installation and Setup
- Install the Python package with `pip install nomic`
- Nomic is also included in LangChain's Poetry extras: `poetry install -E all`
## Wrappers
### VectorStore
There exists a wrapper around the Atlas neural database, allowing you to use it as a vectorstore.
This vectorstore also gives you full access to the underlying AtlasProject object, which will allow you to use the full range of Atlas map interactions, such as bulk tagging and automatic topic modeling.
Please see [the Atlas docs](https://docs.nomic.ai/atlas_api.html) for more detailed information.
To import this vectorstore:
```python
from langchain.vectorstores import AtlasDB
```
For a more detailed walkthrough of the AtlasDB wrapper, see [this notebook](../modules/indexes/vectorstores/examples/atlas.ipynb)
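As a rough sketch (the `name` and `api_key` parameter names are assumptions; see the Atlas docs and the linked notebook for the exact signature):
```python
from langchain.vectorstores import AtlasDB

# Hypothetical parameters; consult the Atlas docs for the exact arguments
db = AtlasDB.from_texts(
    ["Atlas is a neural database built by Nomic."],
    name="my_atlas_project",
    api_key="...",
)
docs = db.similarity_search("What is Atlas?")
```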
@@ -0,0 +1,79 @@
# Banana
This page covers how to use the Banana ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Banana wrappers.
## Installation and Setup
- Install with `pip install banana-dev`
- Get a Banana API key and set it as an environment variable (`BANANA_API_KEY`)
## Define your Banana Template
If you want to use an available language model template, you can find one [here](https://app.banana.dev/templates/conceptofmind/serverless-template-palmyra-base).
This template uses the Palmyra-Base model by [Writer](https://writer.com/product/api/).
You can check out an example Banana repository [here](https://github.com/conceptofmind/serverless-template-palmyra-base).
## Build the Banana app
Banana apps must include the "output" key in the returned JSON.
The response structure is rigid:
```python
# Return the results as a dictionary
result = {'output': result}
```
An example inference function would be:
```python
def inference(model_inputs:dict) -> dict:
global model
global tokenizer
# Parse out your arguments
prompt = model_inputs.get('prompt', None)
if prompt == None:
return {'message': "No prompt provided"}
# Run the model
input_ids = tokenizer.encode(prompt, return_tensors='pt').cuda()
output = model.generate(
input_ids,
max_length=100,
do_sample=True,
top_k=50,
top_p=0.95,
num_return_sequences=1,
temperature=0.9,
early_stopping=True,
no_repeat_ngram_size=3,
num_beams=5,
length_penalty=1.5,
repetition_penalty=1.5,
bad_words_ids=[[tokenizer.encode(' ', add_prefix_space=True)[0]]]
)
result = tokenizer.decode(output[0], skip_special_tokens=True)
# Return the results as a dictionary
result = {'output': result}
return result
```
You can find a full example of a Banana app [here](https://github.com/conceptofmind/serverless-template-palmyra-base/blob/main/app.py).
## Wrappers
### LLM
There exists a Banana LLM wrapper, which you can access with
```python
from langchain.llms import Banana
```
You need to provide a model key located in the dashboard:
```python
llm = Banana(model_key="YOUR_MODEL_KEY")
```
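Once constructed, the wrapper can be called like any other LangChain LLM; a minimal sketch (the model key is a placeholder from your dashboard):
```python
from langchain.llms import Banana

# Assumes BANANA_API_KEY is set; the model key comes from your Banana dashboard
llm = Banana(model_key="YOUR_MODEL_KEY")
print(llm("Running machine learning on a serverless GPU"))
```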
docs/integrations/beam.md Normal file
@@ -0,0 +1,92 @@
# Beam
This page covers how to use Beam within LangChain.
It is broken into two parts: installation and setup, and then references to specific Beam wrappers.
## Installation and Setup
- [Create an account](https://www.beam.cloud/)
- Install the Beam CLI with `curl https://raw.githubusercontent.com/slai-labs/get-beam/main/get-beam.sh -sSfL | sh`
- Register API keys with `beam configure`
- Set the environment variables `BEAM_CLIENT_ID` and `BEAM_CLIENT_SECRET`
- Install the Beam SDK with `pip install beam-sdk`
## Wrappers
### LLM
There exists a Beam LLM wrapper, which you can access with
```python
from langchain.llms.beam import Beam
```
## Define your Beam app
This is the environment you'll be developing against once you start the app.
It's also used to define the maximum response length from the model.
```python
llm = Beam(model_name="gpt2",
name="langchain-gpt2-test",
cpu=8,
memory="32Gi",
gpu="A10G",
python_version="python3.8",
python_packages=[
"diffusers[torch]>=0.10",
"transformers",
"torch",
"pillow",
"accelerate",
"safetensors",
"xformers",],
max_length="50",
verbose=False)
```
## Deploy your Beam app
Once defined, you can deploy your Beam app by calling your model's `_deploy()` method.
```python
llm._deploy()
```
## Call your Beam app
Once a Beam model is deployed, you can call it with the model's `_call()` method.
This returns the GPT2 text response to your prompt.
```python
response = llm._call("Running machine learning on a remote GPU")
```
An example script which deploys the model and calls it would be:
```python
from langchain.llms.beam import Beam
import time
llm = Beam(model_name="gpt2",
name="langchain-gpt2-test",
cpu=8,
memory="32Gi",
gpu="A10G",
python_version="python3.8",
python_packages=[
"diffusers[torch]>=0.10",
"transformers",
"torch",
"pillow",
"accelerate",
"safetensors",
"xformers",],
max_length="50",
verbose=False)
llm._deploy()
response = llm._call("Running machine learning on a remote GPU")
print(response)
```
@@ -0,0 +1,17 @@
# CerebriumAI
This page covers how to use the CerebriumAI ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific CerebriumAI wrappers.
## Installation and Setup
- Install with `pip install cerebrium`
- Get an CerebriumAI api key and set it as an environment variable (`CEREBRIUMAI_API_KEY`)
## Wrappers
### LLM
There exists a CerebriumAI LLM wrapper, which you can access with
```python
from langchain.llms import CerebriumAI
```
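A minimal usage sketch; the `endpoint_url` value is a placeholder for the endpoint shown in your Cerebrium dashboard:
```python
from langchain.llms import CerebriumAI

# Assumes CEREBRIUMAI_API_KEY is set; the endpoint URL is a placeholder
llm = CerebriumAI(endpoint_url="https://run.cerebrium.ai/YOUR-ENDPOINT/predict")
print(llm("Tell me a joke"))
```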
@@ -0,0 +1,20 @@
# Chroma
This page covers how to use the Chroma ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Chroma wrappers.
## Installation and Setup
- Install the Python package with `pip install chromadb`
## Wrappers
### VectorStore
There exists a wrapper around Chroma vector databases, allowing you to use it as a vectorstore,
whether for semantic search or example selection.
To import this vectorstore:
```python
from langchain.vectorstores import Chroma
```
For a more detailed walkthrough of the Chroma wrapper, see [this notebook](../modules/indexes/vectorstores/getting_started.ipynb)
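For example, a minimal in-memory sketch (using `OpenAIEmbeddings`, which assumes `OPENAI_API_KEY` is set):
```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# Build an ephemeral collection from a few texts and run a similarity search
db = Chroma.from_texts(
    ["Chroma is an open-source embedding database."],
    embedding=OpenAIEmbeddings(),
)
docs = db.similarity_search("What is Chroma?")
```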
@@ -0,0 +1,587 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# ClearML Integration\n",
"\n",
"In order to properly keep track of your langchain experiments and their results, you can enable the ClearML integration. ClearML is an experiment manager that neatly tracks and organizes all your experiment runs.\n",
"\n",
"<a target=\"_blank\" href=\"https://colab.research.google.com/github/hwchase17/langchain/blob/master/docs/ecosystem/clearml_tracking.ipynb\">\n",
" <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
"</a>"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Getting API Credentials\n",
"\n",
"We'll be using several APIs in this notebook; here is a list of them and where to get the keys:\n",
"\n",
"- ClearML: https://app.clear.ml/settings/workspace-configuration\n",
"- OpenAI: https://platform.openai.com/account/api-keys\n",
"- SerpAPI (google search): https://serpapi.com/dashboard"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"os.environ[\"CLEARML_API_ACCESS_KEY\"] = \"\"\n",
"os.environ[\"CLEARML_API_SECRET_KEY\"] = \"\"\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"\"\n",
"os.environ[\"SERPAPI_API_KEY\"] = \"\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting Up"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install clearml\n",
"!pip install pandas\n",
"!pip install textstat\n",
"!pip install spacy\n",
"!python -m spacy download en_core_web_sm"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The clearml callback is currently in beta and is subject to change based on updates to `langchain`. Please report any issues to https://github.com/allegroai/clearml/issues with the tag `langchain`.\n"
]
}
],
"source": [
"from datetime import datetime\n",
"from langchain.callbacks import ClearMLCallbackHandler, StdOutCallbackHandler\n",
"from langchain.llms import OpenAI\n",
"\n",
"# Setup and use the ClearML Callback\n",
"clearml_callback = ClearMLCallbackHandler(\n",
" task_type=\"inference\",\n",
" project_name=\"langchain_callback_demo\",\n",
" task_name=\"llm\",\n",
" tags=[\"test\"],\n",
" # Change the following parameters based on the amount of detail you want tracked\n",
" visualize=True,\n",
" complexity_metrics=True,\n",
" stream_logs=True\n",
")\n",
"callbacks = [StdOutCallbackHandler(), clearml_callback]\n",
"# Get the OpenAI model ready to go\n",
"llm = OpenAI(temperature=0, callbacks=callbacks)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Scenario 1: Just an LLM\n",
"\n",
"First, let's just run a single LLM a few times and capture the resulting prompt-answer conversation in ClearML"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a joke'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a poem'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a joke'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a poem'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a joke'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Tell me a poem'}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nQ: What did the fish say when it hit the wall?\\nA: Dam!', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 109.04, 'flesch_kincaid_grade': 1.3, 'smog_index': 0.0, 'coleman_liau_index': -1.24, 'automated_readability_index': 0.3, 'dale_chall_readability_score': 5.5, 'difficult_words': 0, 'linsear_write_formula': 5.5, 'gunning_fog': 5.2, 'text_standard': '5th and 6th grade', 'fernandez_huerta': 133.58, 'szigriszt_pazos': 131.54, 'gutierrez_polini': 62.3, 'crawford': -0.2, 'gulpease_index': 79.8, 'osman': 116.91}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nRoses are red,\\nViolets are blue,\\nSugar is sweet,\\nAnd so are you.', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 83.66, 'flesch_kincaid_grade': 4.8, 'smog_index': 0.0, 'coleman_liau_index': 3.23, 'automated_readability_index': 3.9, 'dale_chall_readability_score': 6.71, 'difficult_words': 2, 'linsear_write_formula': 6.5, 'gunning_fog': 8.28, 'text_standard': '6th and 7th grade', 'fernandez_huerta': 115.58, 'szigriszt_pazos': 112.37, 'gutierrez_polini': 54.83, 'crawford': 1.4, 'gulpease_index': 72.1, 'osman': 100.17}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nQ: What did the fish say when it hit the wall?\\nA: Dam!', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 109.04, 'flesch_kincaid_grade': 1.3, 'smog_index': 0.0, 'coleman_liau_index': -1.24, 'automated_readability_index': 0.3, 'dale_chall_readability_score': 5.5, 'difficult_words': 0, 'linsear_write_formula': 5.5, 'gunning_fog': 5.2, 'text_standard': '5th and 6th grade', 'fernandez_huerta': 133.58, 'szigriszt_pazos': 131.54, 'gutierrez_polini': 62.3, 'crawford': -0.2, 'gulpease_index': 79.8, 'osman': 116.91}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nRoses are red,\\nViolets are blue,\\nSugar is sweet,\\nAnd so are you.', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 83.66, 'flesch_kincaid_grade': 4.8, 'smog_index': 0.0, 'coleman_liau_index': 3.23, 'automated_readability_index': 3.9, 'dale_chall_readability_score': 6.71, 'difficult_words': 2, 'linsear_write_formula': 6.5, 'gunning_fog': 8.28, 'text_standard': '6th and 7th grade', 'fernandez_huerta': 115.58, 'szigriszt_pazos': 112.37, 'gutierrez_polini': 54.83, 'crawford': 1.4, 'gulpease_index': 72.1, 'osman': 100.17}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nQ: What did the fish say when it hit the wall?\\nA: Dam!', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 109.04, 'flesch_kincaid_grade': 1.3, 'smog_index': 0.0, 'coleman_liau_index': -1.24, 'automated_readability_index': 0.3, 'dale_chall_readability_score': 5.5, 'difficult_words': 0, 'linsear_write_formula': 5.5, 'gunning_fog': 5.2, 'text_standard': '5th and 6th grade', 'fernandez_huerta': 133.58, 'szigriszt_pazos': 131.54, 'gutierrez_polini': 62.3, 'crawford': -0.2, 'gulpease_index': 79.8, 'osman': 116.91}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 24, 'token_usage_completion_tokens': 138, 'token_usage_total_tokens': 162, 'model_name': 'text-davinci-003', 'step': 4, 'starts': 2, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 0, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': '\\n\\nRoses are red,\\nViolets are blue,\\nSugar is sweet,\\nAnd so are you.', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 83.66, 'flesch_kincaid_grade': 4.8, 'smog_index': 0.0, 'coleman_liau_index': 3.23, 'automated_readability_index': 3.9, 'dale_chall_readability_score': 6.71, 'difficult_words': 2, 'linsear_write_formula': 6.5, 'gunning_fog': 8.28, 'text_standard': '6th and 7th grade', 'fernandez_huerta': 115.58, 'szigriszt_pazos': 112.37, 'gutierrez_polini': 54.83, 'crawford': 1.4, 'gulpease_index': 72.1, 'osman': 100.17}\n",
"{'action_records': action name step starts ends errors text_ctr chain_starts \\\n",
"0 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"1 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"2 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"3 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"4 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"5 on_llm_start OpenAI 1 1 0 0 0 0 \n",
"6 on_llm_end NaN 2 1 1 0 0 0 \n",
"7 on_llm_end NaN 2 1 1 0 0 0 \n",
"8 on_llm_end NaN 2 1 1 0 0 0 \n",
"9 on_llm_end NaN 2 1 1 0 0 0 \n",
"10 on_llm_end NaN 2 1 1 0 0 0 \n",
"11 on_llm_end NaN 2 1 1 0 0 0 \n",
"12 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"13 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"14 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"15 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"16 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"17 on_llm_start OpenAI 3 2 1 0 0 0 \n",
"18 on_llm_end NaN 4 2 2 0 0 0 \n",
"19 on_llm_end NaN 4 2 2 0 0 0 \n",
"20 on_llm_end NaN 4 2 2 0 0 0 \n",
"21 on_llm_end NaN 4 2 2 0 0 0 \n",
"22 on_llm_end NaN 4 2 2 0 0 0 \n",
"23 on_llm_end NaN 4 2 2 0 0 0 \n",
"\n",
" chain_ends llm_starts ... difficult_words linsear_write_formula \\\n",
"0 0 1 ... NaN NaN \n",
"1 0 1 ... NaN NaN \n",
"2 0 1 ... NaN NaN \n",
"3 0 1 ... NaN NaN \n",
"4 0 1 ... NaN NaN \n",
"5 0 1 ... NaN NaN \n",
"6 0 1 ... 0.0 5.5 \n",
"7 0 1 ... 2.0 6.5 \n",
"8 0 1 ... 0.0 5.5 \n",
"9 0 1 ... 2.0 6.5 \n",
"10 0 1 ... 0.0 5.5 \n",
"11 0 1 ... 2.0 6.5 \n",
"12 0 2 ... NaN NaN \n",
"13 0 2 ... NaN NaN \n",
"14 0 2 ... NaN NaN \n",
"15 0 2 ... NaN NaN \n",
"16 0 2 ... NaN NaN \n",
"17 0 2 ... NaN NaN \n",
"18 0 2 ... 0.0 5.5 \n",
"19 0 2 ... 2.0 6.5 \n",
"20 0 2 ... 0.0 5.5 \n",
"21 0 2 ... 2.0 6.5 \n",
"22 0 2 ... 0.0 5.5 \n",
"23 0 2 ... 2.0 6.5 \n",
"\n",
" gunning_fog text_standard fernandez_huerta szigriszt_pazos \\\n",
"0 NaN NaN NaN NaN \n",
"1 NaN NaN NaN NaN \n",
"2 NaN NaN NaN NaN \n",
"3 NaN NaN NaN NaN \n",
"4 NaN NaN NaN NaN \n",
"5 NaN NaN NaN NaN \n",
"6 5.20 5th and 6th grade 133.58 131.54 \n",
"7 8.28 6th and 7th grade 115.58 112.37 \n",
"8 5.20 5th and 6th grade 133.58 131.54 \n",
"9 8.28 6th and 7th grade 115.58 112.37 \n",
"10 5.20 5th and 6th grade 133.58 131.54 \n",
"11 8.28 6th and 7th grade 115.58 112.37 \n",
"12 NaN NaN NaN NaN \n",
"13 NaN NaN NaN NaN \n",
"14 NaN NaN NaN NaN \n",
"15 NaN NaN NaN NaN \n",
"16 NaN NaN NaN NaN \n",
"17 NaN NaN NaN NaN \n",
"18 5.20 5th and 6th grade 133.58 131.54 \n",
"19 8.28 6th and 7th grade 115.58 112.37 \n",
"20 5.20 5th and 6th grade 133.58 131.54 \n",
"21 8.28 6th and 7th grade 115.58 112.37 \n",
"22 5.20 5th and 6th grade 133.58 131.54 \n",
"23 8.28 6th and 7th grade 115.58 112.37 \n",
"\n",
" gutierrez_polini crawford gulpease_index osman \n",
"0 NaN NaN NaN NaN \n",
"1 NaN NaN NaN NaN \n",
"2 NaN NaN NaN NaN \n",
"3 NaN NaN NaN NaN \n",
"4 NaN NaN NaN NaN \n",
"5 NaN NaN NaN NaN \n",
"6 62.30 -0.2 79.8 116.91 \n",
"7 54.83 1.4 72.1 100.17 \n",
"8 62.30 -0.2 79.8 116.91 \n",
"9 54.83 1.4 72.1 100.17 \n",
"10 62.30 -0.2 79.8 116.91 \n",
"11 54.83 1.4 72.1 100.17 \n",
"12 NaN NaN NaN NaN \n",
"13 NaN NaN NaN NaN \n",
"14 NaN NaN NaN NaN \n",
"15 NaN NaN NaN NaN \n",
"16 NaN NaN NaN NaN \n",
"17 NaN NaN NaN NaN \n",
"18 62.30 -0.2 79.8 116.91 \n",
"19 54.83 1.4 72.1 100.17 \n",
"20 62.30 -0.2 79.8 116.91 \n",
"21 54.83 1.4 72.1 100.17 \n",
"22 62.30 -0.2 79.8 116.91 \n",
"23 54.83 1.4 72.1 100.17 \n",
"\n",
"[24 rows x 39 columns], 'session_analysis': prompt_step prompts name output_step \\\n",
"0 1 Tell me a joke OpenAI 2 \n",
"1 1 Tell me a poem OpenAI 2 \n",
"2 1 Tell me a joke OpenAI 2 \n",
"3 1 Tell me a poem OpenAI 2 \n",
"4 1 Tell me a joke OpenAI 2 \n",
"5 1 Tell me a poem OpenAI 2 \n",
"6 3 Tell me a joke OpenAI 4 \n",
"7 3 Tell me a poem OpenAI 4 \n",
"8 3 Tell me a joke OpenAI 4 \n",
"9 3 Tell me a poem OpenAI 4 \n",
"10 3 Tell me a joke OpenAI 4 \n",
"11 3 Tell me a poem OpenAI 4 \n",
"\n",
" output \\\n",
"0 \\n\\nQ: What did the fish say when it hit the w... \n",
"1 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"2 \\n\\nQ: What did the fish say when it hit the w... \n",
"3 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"4 \\n\\nQ: What did the fish say when it hit the w... \n",
"5 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"6 \\n\\nQ: What did the fish say when it hit the w... \n",
"7 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"8 \\n\\nQ: What did the fish say when it hit the w... \n",
"9 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"10 \\n\\nQ: What did the fish say when it hit the w... \n",
"11 \\n\\nRoses are red,\\nViolets are blue,\\nSugar i... \n",
"\n",
" token_usage_total_tokens token_usage_prompt_tokens \\\n",
"0 162 24 \n",
"1 162 24 \n",
"2 162 24 \n",
"3 162 24 \n",
"4 162 24 \n",
"5 162 24 \n",
"6 162 24 \n",
"7 162 24 \n",
"8 162 24 \n",
"9 162 24 \n",
"10 162 24 \n",
"11 162 24 \n",
"\n",
" token_usage_completion_tokens flesch_reading_ease flesch_kincaid_grade \\\n",
"0 138 109.04 1.3 \n",
"1 138 83.66 4.8 \n",
"2 138 109.04 1.3 \n",
"3 138 83.66 4.8 \n",
"4 138 109.04 1.3 \n",
"5 138 83.66 4.8 \n",
"6 138 109.04 1.3 \n",
"7 138 83.66 4.8 \n",
"8 138 109.04 1.3 \n",
"9 138 83.66 4.8 \n",
"10 138 109.04 1.3 \n",
"11 138 83.66 4.8 \n",
"\n",
" ... difficult_words linsear_write_formula gunning_fog \\\n",
"0 ... 0 5.5 5.20 \n",
"1 ... 2 6.5 8.28 \n",
"2 ... 0 5.5 5.20 \n",
"3 ... 2 6.5 8.28 \n",
"4 ... 0 5.5 5.20 \n",
"5 ... 2 6.5 8.28 \n",
"6 ... 0 5.5 5.20 \n",
"7 ... 2 6.5 8.28 \n",
"8 ... 0 5.5 5.20 \n",
"9 ... 2 6.5 8.28 \n",
"10 ... 0 5.5 5.20 \n",
"11 ... 2 6.5 8.28 \n",
"\n",
" text_standard fernandez_huerta szigriszt_pazos gutierrez_polini \\\n",
"0 5th and 6th grade 133.58 131.54 62.30 \n",
"1 6th and 7th grade 115.58 112.37 54.83 \n",
"2 5th and 6th grade 133.58 131.54 62.30 \n",
"3 6th and 7th grade 115.58 112.37 54.83 \n",
"4 5th and 6th grade 133.58 131.54 62.30 \n",
"5 6th and 7th grade 115.58 112.37 54.83 \n",
"6 5th and 6th grade 133.58 131.54 62.30 \n",
"7 6th and 7th grade 115.58 112.37 54.83 \n",
"8 5th and 6th grade 133.58 131.54 62.30 \n",
"9 6th and 7th grade 115.58 112.37 54.83 \n",
"10 5th and 6th grade 133.58 131.54 62.30 \n",
"11 6th and 7th grade 115.58 112.37 54.83 \n",
"\n",
" crawford gulpease_index osman \n",
"0 -0.2 79.8 116.91 \n",
"1 1.4 72.1 100.17 \n",
"2 -0.2 79.8 116.91 \n",
"3 1.4 72.1 100.17 \n",
"4 -0.2 79.8 116.91 \n",
"5 1.4 72.1 100.17 \n",
"6 -0.2 79.8 116.91 \n",
"7 1.4 72.1 100.17 \n",
"8 -0.2 79.8 116.91 \n",
"9 1.4 72.1 100.17 \n",
"10 -0.2 79.8 116.91 \n",
"11 1.4 72.1 100.17 \n",
"\n",
"[12 rows x 24 columns]}\n",
"2023-03-29 14:00:25,948 - clearml.Task - INFO - Completed model upload to https://files.clear.ml/langchain_callback_demo/llm.988bd727b0e94a29a3ac0ee526813545/models/simple_sequential\n"
]
}
],
"source": [
"# SCENARIO 1 - LLM\n",
"llm_result = llm.generate([\"Tell me a joke\", \"Tell me a poem\"] * 3)\n",
"# After every generation run, use flush to make sure all the metrics\n",
"# prompts and other output are properly saved separately\n",
"clearml_callback.flush_tracker(langchain_asset=llm, name=\"simple_sequential\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"At this point you can already go to https://app.clear.ml and take a look at the resulting ClearML Task that was created.\n",
"\n",
"Among others, you should see that this notebook is saved along with any git information. The model JSON that contains the used parameters is saved as an artifact, there are also console logs and under the plots section, you'll find tables that represent the flow of the chain.\n",
"\n",
"Finally, if you enabled visualizations, these are stored as HTML files under debug samples."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Scenario 2: Creating an agent with tools\n",
"\n",
"To show a more advanced workflow, let's create an agent with access to tools. The way ClearML tracks the results is not different though, only the table will look slightly different as there are other types of actions taken when compared to the earlier, simpler example.\n",
"\n",
"You can now also see the use of the `finish=True` keyword, which will fully close the ClearML Task, instead of just resetting the parameters and prompts for a new conversation."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"{'action': 'on_chain_start', 'name': 'AgentExecutor', 'step': 1, 'starts': 1, 'ends': 0, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 0, 'llm_ends': 0, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'input': 'Who is the wife of the person who sang summer of 69?'}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 2, 'starts': 2, 'ends': 0, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 1, 'llm_ends': 0, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'prompts': 'Answer the following questions as best you can. You have access to the following tools:\\n\\nSearch: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.\\nCalculator: Useful for when you need to answer questions about math.\\n\\nUse the following format:\\n\\nQuestion: the input question you must answer\\nThought: you should always think about what to do\\nAction: the action to take, should be one of [Search, Calculator]\\nAction Input: the input to the action\\nObservation: the result of the action\\n... (this Thought/Action/Action Input/Observation can repeat N times)\\nThought: I now know the final answer\\nFinal Answer: the final answer to the original input question\\n\\nBegin!\\n\\nQuestion: Who is the wife of the person who sang summer of 69?\\nThought:'}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 189, 'token_usage_completion_tokens': 34, 'token_usage_total_tokens': 223, 'model_name': 'text-davinci-003', 'step': 3, 'starts': 2, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 1, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 0, 'tool_ends': 0, 'agent_ends': 0, 'text': ' I need to find out who sang summer of 69 and then find out who their wife is.\\nAction: Search\\nAction Input: \"Who sang summer of 69\"', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 91.61, 'flesch_kincaid_grade': 3.8, 'smog_index': 0.0, 'coleman_liau_index': 3.41, 'automated_readability_index': 3.5, 'dale_chall_readability_score': 6.06, 'difficult_words': 2, 'linsear_write_formula': 5.75, 'gunning_fog': 5.4, 'text_standard': '3rd and 4th grade', 'fernandez_huerta': 121.07, 'szigriszt_pazos': 119.5, 'gutierrez_polini': 54.91, 'crawford': 0.9, 'gulpease_index': 72.7, 'osman': 92.16}\n",
"\u001b[32;1m\u001b[1;3m I need to find out who sang summer of 69 and then find out who their wife is.\n",
"Action: Search\n",
"Action Input: \"Who sang summer of 69\"\u001b[0m{'action': 'on_agent_action', 'tool': 'Search', 'tool_input': 'Who sang summer of 69', 'log': ' I need to find out who sang summer of 69 and then find out who their wife is.\\nAction: Search\\nAction Input: \"Who sang summer of 69\"', 'step': 4, 'starts': 3, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 1, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 1, 'tool_ends': 0, 'agent_ends': 0}\n",
"{'action': 'on_tool_start', 'input_str': 'Who sang summer of 69', 'name': 'Search', 'description': 'A search engine. Useful for when you need to answer questions about current events. Input should be a search query.', 'step': 5, 'starts': 4, 'ends': 1, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 1, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 2, 'tool_ends': 0, 'agent_ends': 0}\n",
"\n",
"Observation: \u001b[36;1m\u001b[1;3mBryan Adams - Summer Of 69 (Official Music Video).\u001b[0m\n",
"Thought:{'action': 'on_tool_end', 'output': 'Bryan Adams - Summer Of 69 (Official Music Video).', 'step': 6, 'starts': 4, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 1, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 2, 'tool_ends': 1, 'agent_ends': 0}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 7, 'starts': 5, 'ends': 2, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 1, 'llm_streams': 0, 'tool_starts': 2, 'tool_ends': 1, 'agent_ends': 0, 'prompts': 'Answer the following questions as best you can. You have access to the following tools:\\n\\nSearch: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.\\nCalculator: Useful for when you need to answer questions about math.\\n\\nUse the following format:\\n\\nQuestion: the input question you must answer\\nThought: you should always think about what to do\\nAction: the action to take, should be one of [Search, Calculator]\\nAction Input: the input to the action\\nObservation: the result of the action\\n... (this Thought/Action/Action Input/Observation can repeat N times)\\nThought: I now know the final answer\\nFinal Answer: the final answer to the original input question\\n\\nBegin!\\n\\nQuestion: Who is the wife of the person who sang summer of 69?\\nThought: I need to find out who sang summer of 69 and then find out who their wife is.\\nAction: Search\\nAction Input: \"Who sang summer of 69\"\\nObservation: Bryan Adams - Summer Of 69 (Official Music Video).\\nThought:'}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 242, 'token_usage_completion_tokens': 28, 'token_usage_total_tokens': 270, 'model_name': 'text-davinci-003', 'step': 8, 'starts': 5, 'ends': 3, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 2, 'tool_ends': 1, 'agent_ends': 0, 'text': ' I need to find out who Bryan Adams is married to.\\nAction: Search\\nAction Input: \"Who is Bryan Adams married to\"', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 94.66, 'flesch_kincaid_grade': 2.7, 'smog_index': 0.0, 'coleman_liau_index': 4.73, 'automated_readability_index': 4.0, 'dale_chall_readability_score': 7.16, 'difficult_words': 2, 'linsear_write_formula': 4.25, 'gunning_fog': 4.2, 'text_standard': '4th and 5th grade', 'fernandez_huerta': 124.13, 'szigriszt_pazos': 119.2, 'gutierrez_polini': 52.26, 'crawford': 0.7, 'gulpease_index': 74.7, 'osman': 84.2}\n",
"\u001b[32;1m\u001b[1;3m I need to find out who Bryan Adams is married to.\n",
"Action: Search\n",
"Action Input: \"Who is Bryan Adams married to\"\u001b[0m{'action': 'on_agent_action', 'tool': 'Search', 'tool_input': 'Who is Bryan Adams married to', 'log': ' I need to find out who Bryan Adams is married to.\\nAction: Search\\nAction Input: \"Who is Bryan Adams married to\"', 'step': 9, 'starts': 6, 'ends': 3, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 3, 'tool_ends': 1, 'agent_ends': 0}\n",
"{'action': 'on_tool_start', 'input_str': 'Who is Bryan Adams married to', 'name': 'Search', 'description': 'A search engine. Useful for when you need to answer questions about current events. Input should be a search query.', 'step': 10, 'starts': 7, 'ends': 3, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 1, 'agent_ends': 0}\n",
"\n",
"Observation: \u001b[36;1m\u001b[1;3mBryan Adams has never married. In the 1990s, he was in a relationship with Danish model Cecilie Thomsen. In 2011, Bryan and Alicia Grimaldi, his ...\u001b[0m\n",
"Thought:{'action': 'on_tool_end', 'output': 'Bryan Adams has never married. In the 1990s, he was in a relationship with Danish model Cecilie Thomsen. In 2011, Bryan and Alicia Grimaldi, his ...', 'step': 11, 'starts': 7, 'ends': 4, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 2, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 2, 'agent_ends': 0}\n",
"{'action': 'on_llm_start', 'name': 'OpenAI', 'step': 12, 'starts': 8, 'ends': 4, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 3, 'llm_ends': 2, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 2, 'agent_ends': 0, 'prompts': 'Answer the following questions as best you can. You have access to the following tools:\\n\\nSearch: A search engine. Useful for when you need to answer questions about current events. Input should be a search query.\\nCalculator: Useful for when you need to answer questions about math.\\n\\nUse the following format:\\n\\nQuestion: the input question you must answer\\nThought: you should always think about what to do\\nAction: the action to take, should be one of [Search, Calculator]\\nAction Input: the input to the action\\nObservation: the result of the action\\n... (this Thought/Action/Action Input/Observation can repeat N times)\\nThought: I now know the final answer\\nFinal Answer: the final answer to the original input question\\n\\nBegin!\\n\\nQuestion: Who is the wife of the person who sang summer of 69?\\nThought: I need to find out who sang summer of 69 and then find out who their wife is.\\nAction: Search\\nAction Input: \"Who sang summer of 69\"\\nObservation: Bryan Adams - Summer Of 69 (Official Music Video).\\nThought: I need to find out who Bryan Adams is married to.\\nAction: Search\\nAction Input: \"Who is Bryan Adams married to\"\\nObservation: Bryan Adams has never married. In the 1990s, he was in a relationship with Danish model Cecilie Thomsen. In 2011, Bryan and Alicia Grimaldi, his ...\\nThought:'}\n",
"{'action': 'on_llm_end', 'token_usage_prompt_tokens': 314, 'token_usage_completion_tokens': 18, 'token_usage_total_tokens': 332, 'model_name': 'text-davinci-003', 'step': 13, 'starts': 8, 'ends': 5, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 3, 'llm_ends': 3, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 2, 'agent_ends': 0, 'text': ' I now know the final answer.\\nFinal Answer: Bryan Adams has never been married.', 'generation_info_finish_reason': 'stop', 'generation_info_logprobs': None, 'flesch_reading_ease': 81.29, 'flesch_kincaid_grade': 3.7, 'smog_index': 0.0, 'coleman_liau_index': 5.75, 'automated_readability_index': 3.9, 'dale_chall_readability_score': 7.37, 'difficult_words': 1, 'linsear_write_formula': 2.5, 'gunning_fog': 2.8, 'text_standard': '3rd and 4th grade', 'fernandez_huerta': 115.7, 'szigriszt_pazos': 110.84, 'gutierrez_polini': 49.79, 'crawford': 0.7, 'gulpease_index': 85.4, 'osman': 83.14}\n",
"\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: Bryan Adams has never been married.\u001b[0m\n",
"{'action': 'on_agent_finish', 'output': 'Bryan Adams has never been married.', 'log': ' I now know the final answer.\\nFinal Answer: Bryan Adams has never been married.', 'step': 14, 'starts': 8, 'ends': 6, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 0, 'llm_starts': 3, 'llm_ends': 3, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 2, 'agent_ends': 1}\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"{'action': 'on_chain_end', 'outputs': 'Bryan Adams has never been married.', 'step': 15, 'starts': 8, 'ends': 7, 'errors': 0, 'text_ctr': 0, 'chain_starts': 1, 'chain_ends': 1, 'llm_starts': 3, 'llm_ends': 3, 'llm_streams': 0, 'tool_starts': 4, 'tool_ends': 2, 'agent_ends': 1}\n",
"{'action_records': action name step starts ends errors text_ctr \\\n",
"0 on_llm_start OpenAI 1 1 0 0 0 \n",
"1 on_llm_start OpenAI 1 1 0 0 0 \n",
"2 on_llm_start OpenAI 1 1 0 0 0 \n",
"3 on_llm_start OpenAI 1 1 0 0 0 \n",
"4 on_llm_start OpenAI 1 1 0 0 0 \n",
".. ... ... ... ... ... ... ... \n",
"66 on_tool_end NaN 11 7 4 0 0 \n",
"67 on_llm_start OpenAI 12 8 4 0 0 \n",
"68 on_llm_end NaN 13 8 5 0 0 \n",
"69 on_agent_finish NaN 14 8 6 0 0 \n",
"70 on_chain_end NaN 15 8 7 0 0 \n",
"\n",
" chain_starts chain_ends llm_starts ... gulpease_index osman input \\\n",
"0 0 0 1 ... NaN NaN NaN \n",
"1 0 0 1 ... NaN NaN NaN \n",
"2 0 0 1 ... NaN NaN NaN \n",
"3 0 0 1 ... NaN NaN NaN \n",
"4 0 0 1 ... NaN NaN NaN \n",
".. ... ... ... ... ... ... ... \n",
"66 1 0 2 ... NaN NaN NaN \n",
"67 1 0 3 ... NaN NaN NaN \n",
"68 1 0 3 ... 85.4 83.14 NaN \n",
"69 1 0 3 ... NaN NaN NaN \n",
"70 1 1 3 ... NaN NaN NaN \n",
"\n",
" tool tool_input log \\\n",
"0 NaN NaN NaN \n",
"1 NaN NaN NaN \n",
"2 NaN NaN NaN \n",
"3 NaN NaN NaN \n",
"4 NaN NaN NaN \n",
".. ... ... ... \n",
"66 NaN NaN NaN \n",
"67 NaN NaN NaN \n",
"68 NaN NaN NaN \n",
"69 NaN NaN I now know the final answer.\\nFinal Answer: B... \n",
"70 NaN NaN NaN \n",
"\n",
" input_str description output \\\n",
"0 NaN NaN NaN \n",
"1 NaN NaN NaN \n",
"2 NaN NaN NaN \n",
"3 NaN NaN NaN \n",
"4 NaN NaN NaN \n",
".. ... ... ... \n",
"66 NaN NaN Bryan Adams has never married. In the 1990s, h... \n",
"67 NaN NaN NaN \n",
"68 NaN NaN NaN \n",
"69 NaN NaN Bryan Adams has never been married. \n",
"70 NaN NaN NaN \n",
"\n",
" outputs \n",
"0 NaN \n",
"1 NaN \n",
"2 NaN \n",
"3 NaN \n",
"4 NaN \n",
".. ... \n",
"66 NaN \n",
"67 NaN \n",
"68 NaN \n",
"69 NaN \n",
"70 Bryan Adams has never been married. \n",
"\n",
"[71 rows x 47 columns], 'session_analysis': prompt_step prompts name \\\n",
"0 2 Answer the following questions as best you can... OpenAI \n",
"1 7 Answer the following questions as best you can... OpenAI \n",
"2 12 Answer the following questions as best you can... OpenAI \n",
"\n",
" output_step output \\\n",
"0 3 I need to find out who sang summer of 69 and ... \n",
"1 8 I need to find out who Bryan Adams is married... \n",
"2 13 I now know the final answer.\\nFinal Answer: B... \n",
"\n",
" token_usage_total_tokens token_usage_prompt_tokens \\\n",
"0 223 189 \n",
"1 270 242 \n",
"2 332 314 \n",
"\n",
" token_usage_completion_tokens flesch_reading_ease flesch_kincaid_grade \\\n",
"0 34 91.61 3.8 \n",
"1 28 94.66 2.7 \n",
"2 18 81.29 3.7 \n",
"\n",
" ... difficult_words linsear_write_formula gunning_fog \\\n",
"0 ... 2 5.75 5.4 \n",
"1 ... 2 4.25 4.2 \n",
"2 ... 1 2.50 2.8 \n",
"\n",
" text_standard fernandez_huerta szigriszt_pazos gutierrez_polini \\\n",
"0 3rd and 4th grade 121.07 119.50 54.91 \n",
"1 4th and 5th grade 124.13 119.20 52.26 \n",
"2 3rd and 4th grade 115.70 110.84 49.79 \n",
"\n",
" crawford gulpease_index osman \n",
"0 0.9 72.7 92.16 \n",
"1 0.7 74.7 84.20 \n",
"2 0.7 85.4 83.14 \n",
"\n",
"[3 rows x 24 columns]}\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Could not update last created model in Task 988bd727b0e94a29a3ac0ee526813545, Task status 'completed' cannot be updated\n"
]
}
],
"source": [
"from langchain.agents import initialize_agent, load_tools\n",
"from langchain.agents import AgentType\n",
"\n",
"# SCENARIO 2 - Agent with Tools\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callbacks=callbacks)\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,\n",
" callbacks=callbacks,\n",
")\n",
"agent.run(\n",
" \"Who is the wife of the person who sang summer of 69?\"\n",
")\n",
"clearml_callback.flush_tracker(langchain_asset=agent, name=\"Agent with Tools\", finish=True)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tips and Next Steps\n",
"\n",
"- Make sure you always use a unique `name` argument for the `clearml_callback.flush_tracker` function. If not, the model parameters used for a run will override the previous run!\n",
"\n",
"- If you close the ClearML Callback using `clearml_callback.flush_tracker(..., finish=True)` the Callback cannot be used anymore. Make a new one if you want to keep logging.\n",
"\n",
"- Check out the rest of the open source ClearML ecosystem, there is a data version manager, a remote execution agent, automated pipelines and much more!\n"
]
},
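{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the first two tips, assuming the imports, `llm`, and handler configuration from the setup above (the prompt and `name` here are illustrative):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical continuation: the previous callback was closed with finish=True,\n",
"# so instantiate a fresh handler (and an LLM that uses it) to keep logging\n",
"new_clearml_callback = ClearMLCallbackHandler(\n",
"    task_type=\"inference\", project_name=\"langchain_callback_demo\"\n",
")\n",
"llm = OpenAI(temperature=0, callbacks=[StdOutCallbackHandler(), new_clearml_callback])\n",
"llm.generate([\"Tell me one more joke\"])\n",
"# A unique name ensures this run does not override the previous one\n",
"new_clearml_callback.flush_tracker(langchain_asset=llm, name=\"simple_sequential_2\")"
]
},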
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "a53ebf4a859167383b364e7e7521d0add3c2dbbdecce4edf676e8c4634ff3fbb"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,25 @@
# Cohere
This page covers how to use the Cohere ecosystem within LangChain.
It is broken into two parts: installation and setup, and then references to specific Cohere wrappers.
## Installation and Setup
- Install the Python SDK with `pip install cohere`
- Get a Cohere API key and set it as an environment variable (`COHERE_API_KEY`)
## Wrappers
### LLM
There exists a Cohere LLM wrapper, which you can access with
```python
from langchain.llms import Cohere
```
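For example, a minimal usage sketch (the model name and prompt are illustrative, and the `COHERE_API_KEY` environment variable is assumed to be set):
```python
from langchain.llms import Cohere

# The wrapper reads COHERE_API_KEY from the environment
llm = Cohere(model="command", temperature=0.7)
print(llm("Say hello in one sentence."))
```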
### Embeddings
There exists a Cohere Embeddings wrapper, which you can access with
```python
from langchain.embeddings import CohereEmbeddings
```
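As a short sketch (the texts are illustrative; `COHERE_API_KEY` is assumed to be set), you can embed a single query or a batch of documents:
```python
from langchain.embeddings import CohereEmbeddings

embeddings = CohereEmbeddings()
# Embed a query string into a single vector
query_vector = embeddings.embed_query("What is LangChain?")
# Embed a list of documents into a list of vectors
doc_vectors = embeddings.embed_documents(["LangChain is a framework for LLM applications."])
```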
For a more detailed walkthrough of this, see [this notebook](../modules/models/text_embedding/examples/cohere.ipynb)

View File

@@ -0,0 +1,347 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Comet"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![](https://user-images.githubusercontent.com/7529846/230328046-a8b18c51-12e3-4617-9b39-97614a571a2d.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this guide we will demonstrate how to track your Langchain Experiments, Evaluation Metrics, and LLM Sessions with [Comet](https://www.comet.com/site/?utm_source=langchain&utm_medium=referral&utm_campaign=comet_notebook). \n",
"\n",
"<a target=\"_blank\" href=\"https://colab.research.google.com/github/hwchase17/langchain/blob/master/docs/ecosystem/comet_tracking.ipynb\">\n",
" <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
"</a>\n",
"\n",
"**Example Project:** [Comet with LangChain](https://www.comet.com/examples/comet-example-langchain/view/b5ZThK6OFdhKWVSP3fDfRtrNF/panels?utm_source=langchain&utm_medium=referral&utm_campaign=comet_notebook)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img width=\"1280\" alt=\"comet-langchain\" src=\"https://user-images.githubusercontent.com/7529846/230326720-a9711435-9c6f-4edb-a707-94b67271ab25.png\">\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install Comet and Dependencies"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install comet_ml langchain openai google-search-results spacy textstat pandas\n",
"\n",
"import sys\n",
"!{sys.executable} -m spacy download en_core_web_sm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialize Comet and Set your Credentials"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can grab your [Comet API Key here](https://www.comet.com/signup?utm_source=langchain&utm_medium=referral&utm_campaign=comet_notebook) or click the link after initializing Comet"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import comet_ml\n",
"\n",
"comet_ml.init(project_name=\"comet-example-langchain\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set OpenAI and SerpAPI credentials"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You will need an [OpenAI API Key](https://platform.openai.com/account/api-keys) and a [SerpAPI API Key](https://serpapi.com/dashboard) to run the following examples"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"...\"\n",
"#os.environ[\"OPENAI_ORGANIZATION\"] = \"...\"\n",
"os.environ[\"SERPAPI_API_KEY\"] = \"...\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Scenario 1: Using just an LLM"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from datetime import datetime\n",
"\n",
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.llms import OpenAI\n",
"\n",
"comet_callback = CometCallbackHandler(\n",
" project_name=\"comet-example-langchain\",\n",
" complexity_metrics=True,\n",
" stream_logs=True,\n",
" tags=[\"llm\"],\n",
" visualizations=[\"dep\"],\n",
")\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9, callbacks=callbacks, verbose=True)\n",
"\n",
"llm_result = llm.generate([\"Tell me a joke\", \"Tell me a poem\", \"Tell me a fact\"] * 3)\n",
"print(\"LLM result\", llm_result)\n",
"comet_callback.flush_tracker(llm, finish=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Scenario 2: Using an LLM in a Chain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.chains import LLMChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.prompts import PromptTemplate\n",
"\n",
"comet_callback = CometCallbackHandler(\n",
" complexity_metrics=True,\n",
" project_name=\"comet-example-langchain\",\n",
" stream_logs=True,\n",
" tags=[\"synopsis-chain\"],\n",
")\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9, callbacks=callbacks)\n",
"\n",
"template = \"\"\"You are a playwright. Given the title of play, it is your job to write a synopsis for that title.\n",
"Title: {title}\n",
"Playwright: This is a synopsis for the above play:\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"title\"], template=template)\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template, callbacks=callbacks)\n",
"\n",
"test_prompts = [{\"title\": \"Documentary about Bigfoot in Paris\"}]\n",
"print(synopsis_chain.apply(test_prompts))\n",
"comet_callback.flush_tracker(synopsis_chain, finish=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Scenario 3: Using An Agent with Tools "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import initialize_agent, load_tools\n",
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.llms import OpenAI\n",
"\n",
"comet_callback = CometCallbackHandler(\n",
" project_name=\"comet-example-langchain\",\n",
" complexity_metrics=True,\n",
" stream_logs=True,\n",
" tags=[\"agent\"],\n",
")\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9, callbacks=callbacks)\n",
"\n",
"tools = load_tools([\"serpapi\", \"llm-math\"], llm=llm, callbacks=callbacks)\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=\"zero-shot-react-description\",\n",
" callbacks=callbacks,\n",
" verbose=True,\n",
")\n",
"agent.run(\n",
" \"Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?\"\n",
")\n",
"comet_callback.flush_tracker(agent, finish=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Scenario 4: Using Custom Evaluation Metrics"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `CometCallbackManager` also allows you to define and use Custom Evaluation Metrics to assess generated outputs from your model. Let's take a look at how this works. \n",
"\n",
"\n",
"In the snippet below, we will use the [ROUGE](https://huggingface.co/spaces/evaluate-metric/rouge) metric to evaluate the quality of a generated summary of an input prompt. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install rouge-score"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from rouge_score import rouge_scorer\n",
"\n",
"from langchain.callbacks import CometCallbackHandler, StdOutCallbackHandler\n",
"from langchain.chains import LLMChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.prompts import PromptTemplate\n",
"\n",
"\n",
"class Rouge:\n",
" def __init__(self, reference):\n",
" self.reference = reference\n",
" self.scorer = rouge_scorer.RougeScorer([\"rougeLsum\"], use_stemmer=True)\n",
"\n",
" def compute_metric(self, generation, prompt_idx, gen_idx):\n",
" prediction = generation.text\n",
" results = self.scorer.score(target=self.reference, prediction=prediction)\n",
"\n",
" return {\n",
" \"rougeLsum_score\": results[\"rougeLsum\"].fmeasure,\n",
" \"reference\": self.reference,\n",
" }\n",
"\n",
"\n",
"reference = \"\"\"\n",
"The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building.\n",
"It was the first structure to reach a height of 300 metres.\n",
"\n",
"It is now taller than the Chrysler Building in New York City by 5.2 metres (17 ft)\n",
"Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France .\n",
"\"\"\"\n",
"rouge_score = Rouge(reference=reference)\n",
"\n",
"template = \"\"\"Given the following article, it is your job to write a summary.\n",
"Article:\n",
"{article}\n",
"Summary: This is the summary for the above article:\"\"\"\n",
"prompt_template = PromptTemplate(input_variables=[\"article\"], template=template)\n",
"\n",
"comet_callback = CometCallbackHandler(\n",
" project_name=\"comet-example-langchain\",\n",
" complexity_metrics=False,\n",
" stream_logs=True,\n",
" tags=[\"custom_metrics\"],\n",
" custom_metrics=rouge_score.compute_metric,\n",
")\n",
"callbacks = [StdOutCallbackHandler(), comet_callback]\n",
"llm = OpenAI(temperature=0.9)\n",
"\n",
"synopsis_chain = LLMChain(llm=llm, prompt=prompt_template)\n",
"\n",
"test_prompts = [\n",
" {\n",
" \"article\": \"\"\"\n",
" The tower is 324 metres (1,063 ft) tall, about the same height as\n",
" an 81-storey building, and the tallest structure in Paris. Its base is square,\n",
" measuring 125 metres (410 ft) on each side.\n",
" During its construction, the Eiffel Tower surpassed the\n",
" Washington Monument to become the tallest man-made structure in the world,\n",
" a title it held for 41 years until the Chrysler Building\n",
" in New York City was finished in 1930.\n",
"\n",
" It was the first structure to reach a height of 300 metres.\n",
" Due to the addition of a broadcasting aerial at the top of the tower in 1957,\n",
" it is now taller than the Chrysler Building by 5.2 metres (17 ft).\n",
"\n",
" Excluding transmitters, the Eiffel Tower is the second tallest\n",
" free-standing structure in France after the Millau Viaduct.\n",
" \"\"\"\n",
" }\n",
"]\n",
"print(synopsis_chain.apply(test_prompts, callbacks=callbacks))\n",
"comet_callback.flush_tracker(synopsis_chain, finish=True)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.15"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -0,0 +1,57 @@
# C Transformers
This page covers how to use the [C Transformers](https://github.com/marella/ctransformers) library within LangChain.
It is broken into two parts: installation and setup, and then references to specific C Transformers wrappers.
## Installation and Setup
- Install the Python package with `pip install ctransformers`
- Download a supported [GGML model](https://huggingface.co/TheBloke) (see [Supported Models](https://github.com/marella/ctransformers#supported-models))
## Wrappers
### LLM
There exists a CTransformers LLM wrapper, which you can access with:
```python
from langchain.llms import CTransformers
```
It provides a unified interface for all models:
```python
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2')
print(llm('AI is going to'))
```
If you are getting an `illegal instruction` error, try using `lib='avx'` or `lib='basic'`:
```python
llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2', lib='avx')
```
It can be used with models hosted on the Hugging Face Hub:
```python
llm = CTransformers(model='marella/gpt-2-ggml')
```
If a model repo has multiple model files (`.bin` files), specify a model file using:
```python
llm = CTransformers(model='marella/gpt-2-ggml', model_file='ggml-model.bin')
```
Additional parameters can be passed using the `config` parameter:
```python
config = {'max_new_tokens': 256, 'repetition_penalty': 1.1}
llm = CTransformers(model='marella/gpt-2-ggml', config=config)
```
See [Documentation](https://github.com/marella/ctransformers#config) for a list of available parameters.
For a more detailed walkthrough of this, see [this notebook](../modules/models/llms/integrations/ctransformers.ipynb).

View File

@@ -0,0 +1,25 @@
# Databerry
This page covers how to use [Databerry](https://databerry.ai) within LangChain.
## What is Databerry?
Databerry is an [open source](https://github.com/gmpetrov/databerry) document retrieval platform that helps connect your personal data to Large Language Models.
![Databerry](../_static/DataberryDashboard.png)
## Quick start
Retrieving documents stored in Databerry from LangChain is very easy!
```python
from langchain.retrievers import DataberryRetriever
retriever = DataberryRetriever(
datastore_url="https://api.databerry.ai/query/clg1xg2h80000l708dymr0fxc",
# api_key="DATABERRY_API_KEY", # optional if datastore is public
# top_k=10 # optional
)
docs = retriever.get_relevant_documents("What's Databerry?")
```

View File

@@ -0,0 +1,280 @@
{
"cells": [
{
"cell_type": "markdown",
"source": [
"# Databricks\n",
"\n",
"This notebook covers how to connect to the [Databricks runtimes](https://docs.databricks.com/runtime/index.html) and [Databricks SQL](https://www.databricks.com/product/databricks-sql) using the SQLDatabase wrapper of LangChain.\n",
"It is broken into 3 parts: installation and setup, connecting to Databricks, and examples."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"## Installation and Setup"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 1,
"outputs": [],
"source": [
"!pip install databricks-sql-connector"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"## Connecting to Databricks\n",
"\n",
"You can connect to [Databricks runtimes](https://docs.databricks.com/runtime/index.html) and [Databricks SQL](https://www.databricks.com/product/databricks-sql) using the `SQLDatabase.from_databricks()` method.\n",
"\n",
"### Syntax\n",
"```python\n",
"SQLDatabase.from_databricks(\n",
" catalog: str,\n",
" schema: str,\n",
" host: Optional[str] = None,\n",
" api_token: Optional[str] = None,\n",
" warehouse_id: Optional[str] = None,\n",
" cluster_id: Optional[str] = None,\n",
" engine_args: Optional[dict] = None,\n",
" **kwargs: Any)\n",
"```\n",
"### Required Parameters\n",
"* `catalog`: The catalog name in the Databricks database.\n",
"* `schema`: The schema name in the catalog.\n",
"\n",
"### Optional Parameters\n",
"There following parameters are optional. When executing the method in a Databricks notebook, you don't need to provide them in most of the cases.\n",
"* `host`: The Databricks workspace hostname, excluding 'https://' part. Defaults to 'DATABRICKS_HOST' environment variable or current workspace if in a Databricks notebook.\n",
"* `api_token`: The Databricks personal access token for accessing the Databricks SQL warehouse or the cluster. Defaults to 'DATABRICKS_API_TOKEN' environment variable or a temporary one is generated if in a Databricks notebook.\n",
"* `warehouse_id`: The warehouse ID in the Databricks SQL.\n",
"* `cluster_id`: The cluster ID in the Databricks Runtime. If running in a Databricks notebook and both 'warehouse_id' and 'cluster_id' are None, it uses the ID of the cluster the notebook is attached to.\n",
"* `engine_args`: The arguments to be used when connecting Databricks.\n",
"* `**kwargs`: Additional keyword arguments for the `SQLDatabase.from_uri` method."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"## Examples"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 2,
"outputs": [],
"source": [
"# Connecting to Databricks with SQLDatabase wrapper\n",
"from langchain import SQLDatabase\n",
"\n",
"db = SQLDatabase.from_databricks(catalog='samples', schema='nyctaxi')"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 3,
"outputs": [],
"source": [
"# Creating a OpenAI Chat LLM wrapper\n",
"from langchain.chat_models import ChatOpenAI\n",
"\n",
"llm = ChatOpenAI(temperature=0, model_name=\"gpt-4\")"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "markdown",
"source": [
"### SQL Chain example\n",
"\n",
"This example demonstrates the use of the [SQL Chain](https://python.langchain.com/en/latest/modules/chains/examples/sqlite.html) for answering a question over a Databricks database."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 4,
"id": "36f2270b",
"metadata": {},
"outputs": [],
"source": [
"from langchain import SQLDatabaseChain\n",
"\n",
"db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "4e2b5f25",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001B[1m> Entering new SQLDatabaseChain chain...\u001B[0m\n",
"What is the average duration of taxi rides that start between midnight and 6am?\n",
"SQLQuery:\u001B[32;1m\u001B[1;3mSELECT AVG(UNIX_TIMESTAMP(tpep_dropoff_datetime) - UNIX_TIMESTAMP(tpep_pickup_datetime)) as avg_duration\n",
"FROM trips\n",
"WHERE HOUR(tpep_pickup_datetime) >= 0 AND HOUR(tpep_pickup_datetime) < 6\u001B[0m\n",
"SQLResult: \u001B[33;1m\u001B[1;3m[(987.8122786304605,)]\u001B[0m\n",
"Answer:\u001B[32;1m\u001B[1;3mThe average duration of taxi rides that start between midnight and 6am is 987.81 seconds.\u001B[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
"data": {
"text/plain": "'The average duration of taxi rides that start between midnight and 6am is 987.81 seconds.'"
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"db_chain.run(\"What is the average duration of taxi rides that start between midnight and 6am?\")"
]
},
{
"cell_type": "markdown",
"source": [
"### SQL Database Agent example\n",
"\n",
"This example demonstrates the use of the [SQL Database Agent](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/sql_database.html) for answering questions over a Databricks database."
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": 7,
"id": "9918e86a",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import create_sql_agent\n",
"from langchain.agents.agent_toolkits import SQLDatabaseToolkit\n",
"\n",
"toolkit = SQLDatabaseToolkit(db=db, llm=llm)\n",
"agent = create_sql_agent(\n",
" llm=llm,\n",
" toolkit=toolkit,\n",
" verbose=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "c484a76e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mAction: list_tables_sql_db\n",
"Action Input: \u001B[0m\n",
"Observation: \u001B[38;5;200m\u001B[1;3mtrips\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3mI should check the schema of the trips table to see if it has the necessary columns for trip distance and duration.\n",
"Action: schema_sql_db\n",
"Action Input: trips\u001B[0m\n",
"Observation: \u001B[33;1m\u001B[1;3m\n",
"CREATE TABLE trips (\n",
"\ttpep_pickup_datetime TIMESTAMP, \n",
"\ttpep_dropoff_datetime TIMESTAMP, \n",
"\ttrip_distance FLOAT, \n",
"\tfare_amount FLOAT, \n",
"\tpickup_zip INT, \n",
"\tdropoff_zip INT\n",
") USING DELTA\n",
"\n",
"/*\n",
"3 rows from trips table:\n",
"tpep_pickup_datetime\ttpep_dropoff_datetime\ttrip_distance\tfare_amount\tpickup_zip\tdropoff_zip\n",
"2016-02-14 16:52:13+00:00\t2016-02-14 17:16:04+00:00\t4.94\t19.0\t10282\t10171\n",
"2016-02-04 18:44:19+00:00\t2016-02-04 18:46:00+00:00\t0.28\t3.5\t10110\t10110\n",
"2016-02-17 17:13:57+00:00\t2016-02-17 17:17:55+00:00\t0.7\t5.0\t10103\t10023\n",
"*/\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3mThe trips table has the necessary columns for trip distance and duration. I will write a query to find the longest trip distance and its duration.\n",
"Action: query_checker_sql_db\n",
"Action Input: SELECT trip_distance, tpep_dropoff_datetime - tpep_pickup_datetime as duration FROM trips ORDER BY trip_distance DESC LIMIT 1\u001B[0m\n",
"Observation: \u001B[31;1m\u001B[1;3mSELECT trip_distance, tpep_dropoff_datetime - tpep_pickup_datetime as duration FROM trips ORDER BY trip_distance DESC LIMIT 1\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3mThe query is correct. I will now execute it to find the longest trip distance and its duration.\n",
"Action: query_sql_db\n",
"Action Input: SELECT trip_distance, tpep_dropoff_datetime - tpep_pickup_datetime as duration FROM trips ORDER BY trip_distance DESC LIMIT 1\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3m[(30.6, '0 00:43:31.000000000')]\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3mI now know the final answer.\n",
"Final Answer: The longest trip distance is 30.6 miles and it took 43 minutes and 31 seconds.\u001B[0m\n",
"\n",
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
"data": {
"text/plain": "'The longest trip distance is 30.6 miles and it took 43 minutes and 31 seconds.'"
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent.run(\"What is the longest trip distance and how long did it take?\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

Some files were not shown because too many files have changed in this diff.