mirror of
https://github.com/hwchase17/langchain.git
synced 2026-02-13 14:21:27 +00:00
Compare commits
3 Commits
langchain-
...
langchain=
| Author | SHA1 | Date |
|---|---|---|
| | b67cd71d7d | |
| | e150b7c7e3 | |
| | ee3fc91e7a | |
132
.github/CODE_OF_CONDUCT.md
vendored
Normal file
@@ -0,0 +1,132 @@
# Contributor Covenant Code of Conduct

## Our Pledge

We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.

## Our Standards

Examples of behavior that contributes to a positive environment for our
community include:

* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
  and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
  community

Examples of unacceptable behavior include:

* The use of sexualized language or imagery, and sexual attention or advances of
  any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
  without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
  professional setting

## Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.

Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.

## Scope

This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
conduct@langchain.dev.
All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the
reporter of any incident.

## Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:

### 1. Correction

**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.

**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.

### 2. Warning

**Community Impact**: A violation through a single incident or series of
actions.

**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.

### 3. Temporary Ban

**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.

**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.

### 4. Permanent Ban

**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.

**Consequence**: A permanent ban from any sort of public interaction within the
community.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].

Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].

For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].

[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations

6
.github/CONTRIBUTING.md
vendored
Normal file
@@ -0,0 +1,6 @@
# Contributing to LangChain

Hi there! Thank you for even being interested in contributing to LangChain.
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether they involve new features, improved infrastructure, better documentation, or bug fixes.

To learn how to contribute to LangChain, please follow the [contribution guide here](https://docs.langchain.com/oss/python/contributing).

77
.github/ISSUE_TEMPLATE/bug-report.yml
vendored
@@ -8,15 +8,16 @@ body:
|
||||
value: |
|
||||
Thank you for taking the time to file a bug report.
|
||||
|
||||
For usage questions, feature requests and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
|
||||
Use this to report BUGS in LangChain. For usage questions, feature requests and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
|
||||
|
||||
Check these before submitting to see if your issue has already been reported, fixed or if there's another way to solve your problem:
|
||||
Relevant links to check before filing a bug report to see if your issue has already been reported, fixed or
|
||||
if there's another way to solve your problem:
|
||||
|
||||
* [Documentation](https://docs.langchain.com/oss/python/langchain/overview),
|
||||
* [API Reference Documentation](https://reference.langchain.com/python/),
|
||||
* [LangChain Forum](https://forum.langchain.com/),
|
||||
* [LangChain documentation with the integrated search](https://docs.langchain.com/oss/python/langchain/overview),
|
||||
* [API Reference](https://reference.langchain.com/python/),
|
||||
* [LangChain ChatBot](https://chat.langchain.com/)
|
||||
* [GitHub search](https://github.com/langchain-ai/langchain),
|
||||
* [LangChain Forum](https://forum.langchain.com/),
|
||||
- type: checkboxes
|
||||
id: checks
|
||||
attributes:
|
||||
@@ -35,48 +36,16 @@ body:
|
||||
required: true
|
||||
- label: This is not related to the langchain-community package.
|
||||
required: true
|
||||
- label: I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
|
||||
required: true
|
||||
- label: I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.
|
||||
required: true
|
||||
- type: checkboxes
|
||||
id: package
|
||||
attributes:
|
||||
label: Package (Required)
|
||||
description: |
|
||||
Which `langchain` package(s) is this bug related to? Select at least one.
|
||||
|
||||
Note that if the package you are reporting for is not listed here, it is not in this repository (e.g. `langchain-google-genai` is in [`langchain-ai/langchain-google`](https://github.com/langchain-ai/langchain-google/)).
|
||||
|
||||
Please report issues for other packages to their respective repositories.
|
||||
options:
|
||||
- label: langchain
|
||||
- label: langchain-openai
|
||||
- label: langchain-anthropic
|
||||
- label: langchain-classic
|
||||
- label: langchain-core
|
||||
- label: langchain-cli
|
||||
- label: langchain-model-profiles
|
||||
- label: langchain-tests
|
||||
- label: langchain-text-splitters
|
||||
- label: langchain-chroma
|
||||
- label: langchain-deepseek
|
||||
- label: langchain-exa
|
||||
- label: langchain-fireworks
|
||||
- label: langchain-groq
|
||||
- label: langchain-huggingface
|
||||
- label: langchain-mistralai
|
||||
- label: langchain-nomic
|
||||
- label: langchain-ollama
|
||||
- label: langchain-perplexity
|
||||
- label: langchain-prompty
|
||||
- label: langchain-qdrant
|
||||
- label: langchain-xai
|
||||
- label: Other / not sure / general
|
||||
- type: textarea
|
||||
id: reproduction
|
||||
validations:
|
||||
required: true
|
||||
attributes:
|
||||
label: Example Code (Python)
|
||||
label: Example Code
|
||||
description: |
|
||||
Please add a self-contained, [minimal, reproducible, example](https://stackoverflow.com/help/minimal-reproducible-example) with your use case.
|
||||
|
||||
@@ -84,12 +53,15 @@ body:
|
||||
|
||||
**Important!**
|
||||
|
||||
* Avoid screenshots, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
|
||||
* Reduce your code to the minimum required to reproduce the issue if possible.
|
||||
* Avoid screenshots when possible, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
|
||||
* Reduce your code to the minimum required to reproduce the issue if possible. This makes it much easier for others to help you.
|
||||
* Use code tags (e.g., ```python ... ```) to correctly [format your code](https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting).
|
||||
* INCLUDE the language label (e.g. `python`) after the first three backticks to enable syntax highlighting. (e.g., ```python rather than ```).
|
||||
|
||||
(This will be automatically formatted into code, so no need for backticks.)
|
||||
render: python
|
||||
placeholder: |
|
||||
The following code:
|
||||
|
||||
```python
|
||||
from langchain_core.runnables import RunnableLambda
|
||||
|
||||
def bad_code(inputs) -> int:
|
||||
@@ -97,14 +69,17 @@ body:
|
||||
|
||||
chain = RunnableLambda(bad_code)
|
||||
chain.invoke('Hello!')
|
||||
```
|
||||
- type: textarea
|
||||
id: error
|
||||
validations:
|
||||
required: false
|
||||
attributes:
|
||||
label: Error Message and Stack Trace (if applicable)
|
||||
description: |
|
||||
If you are reporting an error, please copy and paste the full error message and
|
||||
stack trace.
|
||||
(This will be automatically formatted into code, so no need for backticks.)
|
||||
render: shell
|
||||
If you are reporting an error, please include the full error message and stack trace.
|
||||
placeholder: |
|
||||
Exception + full stack trace
|
||||
- type: textarea
|
||||
id: description
|
||||
attributes:
|
||||
@@ -124,7 +99,9 @@ body:
|
||||
attributes:
|
||||
label: System Info
|
||||
description: |
|
||||
Please share your system info with us.
|
||||
Please share your system info with us. Do NOT skip this step and please don't trim
|
||||
the output. Most users don't include enough information here and it makes it harder
|
||||
for us to help you.
|
||||
|
||||
Run the following command in your terminal and paste the output here:
|
||||
|
||||
@@ -136,6 +113,8 @@ body:
|
||||
from langchain_core import sys_info
|
||||
sys_info.print_sys_info()
|
||||
```
|
||||
|
||||
Alternatively, put the entire output of `pip freeze` here.
|
||||
placeholder: |
|
||||
python -m langchain_core.sys_info
|
||||
validations:
|
||||
|
||||
8
.github/ISSUE_TEMPLATE/config.yml
vendored
@@ -1,15 +1,9 @@
|
||||
blank_issues_enabled: false
|
||||
version: 2.1
|
||||
contact_links:
|
||||
- name: 📚 Documentation issue
|
||||
- name: 📚 Documentation
|
||||
url: https://github.com/langchain-ai/docs/issues/new?template=01-langchain.yml
|
||||
about: Report an issue related to the LangChain documentation
|
||||
- name: 💬 LangChain Forum
|
||||
url: https://forum.langchain.com/
|
||||
about: General community discussions and support
|
||||
- name: 📚 LangChain Documentation
|
||||
url: https://docs.langchain.com/oss/python/langchain/overview
|
||||
about: View the official LangChain documentation
|
||||
- name: 📚 API Reference Documentation
|
||||
url: https://reference.langchain.com/python/
|
||||
about: View the official LangChain API reference documentation
|
||||
|
||||
40
.github/ISSUE_TEMPLATE/feature-request.yml
vendored
@@ -13,11 +13,11 @@ body:
|
||||
Relevant links to check before filing a feature request to see if your request has already been made or
|
||||
if there's another way to achieve what you want:
|
||||
|
||||
* [Documentation](https://docs.langchain.com/oss/python/langchain/overview),
|
||||
* [API Reference Documentation](https://reference.langchain.com/python/),
|
||||
* [LangChain Forum](https://forum.langchain.com/),
|
||||
* [LangChain documentation with the integrated search](https://docs.langchain.com/oss/python/langchain/overview),
|
||||
* [API Reference](https://reference.langchain.com/python/),
|
||||
* [LangChain ChatBot](https://chat.langchain.com/)
|
||||
* [GitHub search](https://github.com/langchain-ai/langchain),
|
||||
* [LangChain Forum](https://forum.langchain.com/),
|
||||
- type: checkboxes
|
||||
id: checks
|
||||
attributes:
|
||||
@@ -34,40 +34,6 @@ body:
|
||||
required: true
|
||||
- label: This is not related to the langchain-community package.
|
||||
required: true
|
||||
- type: checkboxes
|
||||
id: package
|
||||
attributes:
|
||||
label: Package (Required)
|
||||
description: |
|
||||
Which `langchain` package(s) is this request related to? Select at least one.
|
||||
|
||||
Note that if the package you are requesting for is not listed here, it is not in this repository (e.g. `langchain-google-genai` is in `langchain-ai/langchain-google`).
|
||||
|
||||
Please submit feature requests for other packages to their respective repositories.
|
||||
options:
|
||||
- label: langchain
|
||||
- label: langchain-openai
|
||||
- label: langchain-anthropic
|
||||
- label: langchain-classic
|
||||
- label: langchain-core
|
||||
- label: langchain-cli
|
||||
- label: langchain-model-profiles
|
||||
- label: langchain-tests
|
||||
- label: langchain-text-splitters
|
||||
- label: langchain-chroma
|
||||
- label: langchain-deepseek
|
||||
- label: langchain-exa
|
||||
- label: langchain-fireworks
|
||||
- label: langchain-groq
|
||||
- label: langchain-huggingface
|
||||
- label: langchain-mistralai
|
||||
- label: langchain-nomic
|
||||
- label: langchain-ollama
|
||||
- label: langchain-perplexity
|
||||
- label: langchain-prompty
|
||||
- label: langchain-qdrant
|
||||
- label: langchain-xai
|
||||
- label: Other / not sure / general
|
||||
- type: textarea
|
||||
id: feature-description
|
||||
validations:
|
||||
|
||||
30
.github/ISSUE_TEMPLATE/privileged.yml
vendored
@@ -18,33 +18,3 @@ body:
|
||||
attributes:
|
||||
label: Issue Content
|
||||
description: Add the content of the issue here.
|
||||
- type: checkboxes
|
||||
id: package
|
||||
attributes:
|
||||
label: Package (Required)
|
||||
description: |
|
||||
Please select package(s) that this issue is related to.
|
||||
options:
|
||||
- label: langchain
|
||||
- label: langchain-openai
|
||||
- label: langchain-anthropic
|
||||
- label: langchain-classic
|
||||
- label: langchain-core
|
||||
- label: langchain-cli
|
||||
- label: langchain-model-profiles
|
||||
- label: langchain-tests
|
||||
- label: langchain-text-splitters
|
||||
- label: langchain-chroma
|
||||
- label: langchain-deepseek
|
||||
- label: langchain-exa
|
||||
- label: langchain-fireworks
|
||||
- label: langchain-groq
|
||||
- label: langchain-huggingface
|
||||
- label: langchain-mistralai
|
||||
- label: langchain-nomic
|
||||
- label: langchain-ollama
|
||||
- label: langchain-perplexity
|
||||
- label: langchain-prompty
|
||||
- label: langchain-qdrant
|
||||
- label: langchain-xai
|
||||
- label: Other / not sure / general
|
||||
|
||||
48
.github/ISSUE_TEMPLATE/task.yml
vendored
@@ -25,13 +25,13 @@ body:
|
||||
label: Task Description
|
||||
description: |
|
||||
Provide a clear and detailed description of the task.
|
||||
|
||||
|
||||
What needs to be done? Be specific about the scope and requirements.
|
||||
placeholder: |
|
||||
This task involves...
|
||||
|
||||
|
||||
The goal is to...
|
||||
|
||||
|
||||
Specific requirements:
|
||||
- ...
|
||||
- ...
|
||||
@@ -43,7 +43,7 @@ body:
|
||||
label: Acceptance Criteria
|
||||
description: |
|
||||
Define the criteria that must be met for this task to be considered complete.
|
||||
|
||||
|
||||
What are the specific deliverables or outcomes expected?
|
||||
placeholder: |
|
||||
This task will be complete when:
|
||||
@@ -58,15 +58,15 @@ body:
|
||||
label: Context and Background
|
||||
description: |
|
||||
Provide any relevant context, background information, or links to related issues/PRs.
|
||||
|
||||
|
||||
Why is this task needed? What problem does it solve?
|
||||
placeholder: |
|
||||
Background:
|
||||
- ...
|
||||
|
||||
|
||||
Related issues/PRs:
|
||||
- #...
|
||||
|
||||
|
||||
Additional context:
|
||||
- ...
|
||||
validations:
|
||||
@@ -77,45 +77,15 @@ body:
|
||||
label: Dependencies
|
||||
description: |
|
||||
List any dependencies or blockers for this task.
|
||||
|
||||
|
||||
Are there other tasks, issues, or external factors that need to be completed first?
|
||||
placeholder: |
|
||||
This task depends on:
|
||||
- [ ] Issue #...
|
||||
- [ ] PR #...
|
||||
- [ ] External dependency: ...
|
||||
|
||||
|
||||
Blocked by:
|
||||
- ...
|
||||
validations:
|
||||
required: false
|
||||
- type: checkboxes
|
||||
id: package
|
||||
attributes:
|
||||
label: Package (Required)
|
||||
description: |
|
||||
Please select package(s) that this task is related to.
|
||||
options:
|
||||
- label: langchain
|
||||
- label: langchain-openai
|
||||
- label: langchain-anthropic
|
||||
- label: langchain-classic
|
||||
- label: langchain-core
|
||||
- label: langchain-cli
|
||||
- label: langchain-model-profiles
|
||||
- label: langchain-tests
|
||||
- label: langchain-text-splitters
|
||||
- label: langchain-chroma
|
||||
- label: langchain-deepseek
|
||||
- label: langchain-exa
|
||||
- label: langchain-fireworks
|
||||
- label: langchain-groq
|
||||
- label: langchain-huggingface
|
||||
- label: langchain-mistralai
|
||||
- label: langchain-nomic
|
||||
- label: langchain-ollama
|
||||
- label: langchain-perplexity
|
||||
- label: langchain-prompty
|
||||
- label: langchain-qdrant
|
||||
- label: langchain-xai
|
||||
- label: Other / not sure / general
|
||||
|
||||
38
.github/PULL_REQUEST_TEMPLATE.md
vendored
@@ -1,30 +1,28 @@
|
||||
(Replace this entire block of text)
|
||||
|
||||
Read the full contributing guidelines: https://docs.langchain.com/oss/python/contributing/overview
|
||||
|
||||
Thank you for contributing to LangChain! Follow these steps to have your pull request considered as ready for review.
|
||||
|
||||
1. PR title: Should follow the format: TYPE(SCOPE): DESCRIPTION
|
||||
Thank you for contributing to LangChain! Follow these steps to mark your pull request as ready for review. **If any of these steps are not completed, your PR will not be considered for review.**
|
||||
|
||||
- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
|
||||
- Examples:
|
||||
- fix(anthropic): resolve flag parsing error
|
||||
- feat(core): add multi-tenant support
|
||||
- test(openai): update API usage tests
|
||||
- Allowed TYPE and SCOPE values: https://github.com/langchain-ai/langchain/blob/master/.github/workflows/pr_lint.yml#L15-L33
|
||||
- fix(cli): resolve flag parsing error
|
||||
- docs(openai): update API usage examples
|
||||
- Allowed `{TYPE}` values:
|
||||
- feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert, release
|
||||
- Allowed `{SCOPE}` values (optional):
|
||||
- core, cli, langchain, standard-tests, text-splitters, docs, anthropic, chroma, deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama, openai, perplexity, prompty, qdrant, xai, infra
|
||||
- Once you've written the title, please delete this checklist item; do not include it in the PR.
|
||||
|
||||
2. PR description:
|
||||
- [ ] **PR message**: ***Delete this entire checklist*** and replace with
|
||||
- **Description:** a description of the change. Include a [closing keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) if applicable to a relevant issue.
|
||||
- **Issue:** the issue # it fixes, if applicable (e.g. Fixes #123)
|
||||
- **Dependencies:** any dependencies required for this change
|
||||
|
||||
- Write 1-2 sentences summarizing the change.
|
||||
- If this PR addresses a specific issue, please include "Fixes #ISSUE_NUMBER" in the description to automatically close the issue when the PR is merged.
|
||||
- If there are any breaking changes, please clearly describe them.
|
||||
- If this PR depends on another PR being merged first, please include "Depends on #PR_NUMBER" in the description.
|
||||
|
||||
3. Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified.
|
||||
|
||||
- We will not consider a PR unless these three are passing in CI.
|
||||
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. **We will not consider a PR unless these three are passing in CI.** See [contribution guidelines](https://docs.langchain.com/oss/python/contributing) for more.
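
As a rough illustration of the `TYPE(SCOPE): DESCRIPTION` title format described above, the following sketch checks a title locally. It is not the project's CI check: the regular expression, the `is_valid_pr_title` helper, and the rule used for scope names are assumptions made for this example, and only the `TYPE` values are taken from the template text.

```python
import re

# TYPE values as listed in the template text above.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "style", "refactor", "perf",
    "test", "build", "ci", "chore", "revert", "release",
)

# Hypothetical pattern: TYPE, optional (SCOPE), a colon and space, then DESCRIPTION.
TITLE_PATTERN = re.compile(
    rf"^(?P<type>{'|'.join(ALLOWED_TYPES)})"
    r"(?:\((?P<scope>[a-z][a-z0-9-]*)\))?"
    r": (?P<description>\S.*)$"
)


def is_valid_pr_title(title: str) -> bool:
    """Return True if the title matches TYPE(SCOPE): DESCRIPTION."""
    return TITLE_PATTERN.match(title) is not None
```

The authoritative list of allowed values lives in the repository's PR lint workflow, so a local check like this can only serve as a quick sanity test before opening the pull request.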
|
||||
|
||||
Additional guidelines:
|
||||
|
||||
- We ask that if you use generative AI for your contribution, you include a disclaimer.
|
||||
- PRs should not touch more than one package unless absolutely necessary.
|
||||
- Do not update the `uv.lock` files or add dependencies to `pyproject.toml` files (even optional ones) unless you have explicit permission from a maintainer to do so.
|
||||
- Most PRs should not touch more than one package.
|
||||
- Please do not add dependencies to `pyproject.toml` files (even optional ones) unless they are **required** for unit tests. Likewise, please do not update the `uv.lock` files unless you are adding a required dependency.
|
||||
- Changes should be backwards compatible.
|
||||
- Make sure optional dependencies are imported within a function.
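
The last guideline can be sketched as follows. This is a minimal example, not code from the repository: `summarize_scores` is a hypothetical function, and the standard-library `statistics` module stands in for a real optional dependency so that the snippet runs anywhere.

```python
def summarize_scores(scores: list[float]) -> float:
    """Return the mean score, importing the helper library only when called."""
    try:
        # `statistics` is stdlib and stands in for a real optional dependency.
        import statistics
    except ImportError as exc:
        msg = "Could not import `statistics`. Install it to use `summarize_scores`."
        raise ImportError(msg) from exc
    return statistics.mean(scores)
```

Because the import happens inside the function, importing the surrounding module never fails for users who have not installed the optional package; the error only appears when the feature is actually used.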
|
||||
|
||||
330
.github/copilot-instructions.md
vendored
Normal file
@@ -0,0 +1,330 @@
# Global Development Guidelines for LangChain Projects

## Core Development Principles

### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL

**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**

❌ **Bad - Breaking Change:**

```python
def get_user(id, verbose=False):  # Changed from `user_id`
    pass
```

✅ **Good - Stable Interface:**

```python
def get_user(user_id: str, verbose: bool = False) -> User:
    """Retrieve user by ID with optional verbose output."""
    pass
```

**Before making ANY changes to public APIs:**

- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
- Mark experimental features clearly with docstring admonitions (using MkDocs Material, like `!!! warning`)

🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?"
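
The keyword-only guideline can be sketched like this. The `include_inactive` option and the dictionary return value are hypothetical and exist only to make the example runnable; the point is that the existing positional arguments stay where they were.

```python
def get_user(
    user_id: str,
    verbose: bool = False,
    *,
    include_inactive: bool = False,  # hypothetical new option, keyword-only
) -> dict[str, object]:
    """Retrieve a user record without changing the existing positional arguments."""
    return {
        "user_id": user_id,
        "verbose": verbose,
        "include_inactive": include_inactive,
    }


# Existing call sites keep working unchanged.
legacy_call = get_user("u-123", True)
# The new option has to be named explicitly.
new_call = get_user("u-123", include_inactive=True)
```

Passing the new option positionally, as in `get_user("u-123", True, True)`, raises a `TypeError`, so callers cannot come to depend on its position by accident.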

### 2. Code Quality Standards

**All Python code MUST include type hints and return types.**

❌ **Bad:**

```python
def p(u, d):
    return [x for x in u if x not in d]
```

✅ **Good:**

```python
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
    """Return the users that are not in the known users set.

    Args:
        users: List of user identifiers to filter.
        known_users: Set of known/valid user identifiers.

    Returns:
        List of users that are not in the known_users set.
    """
    return [user for user in users if user not in known_users]
```

**Style Requirements:**

- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
- Avoid unnecessary abstraction or premature optimization
- Follow existing patterns in the codebase you're modifying

### 3. Testing Requirements

**Every new feature or bugfix MUST be covered by unit tests.**

**Test Organization:**

- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests: `tests/integration_tests/` (network calls permitted)
- Use `pytest` as the testing framework

**Test Quality Checklist:**

- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)

Checklist questions:

- [ ] Does the test suite fail if your new logic is broken?
- [ ] Are all expected behaviors exercised (happy path, invalid input, etc)?
- [ ] Do tests use fixtures or mocks where needed?

```python
def test_filter_unknown_users():
    """Test filtering unknown users from a list."""
    users = ["alice", "bob", "charlie"]
    known_users = {"alice", "bob"}

    result = filter_unknown_users(users, known_users)

    assert result == ["charlie"]
    assert len(result) == 1
```
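
For the "fixtures/mocks for external dependencies" item, a unit test might look roughly like the sketch below. `fetch_display_name` and its `client.get_user` call are hypothetical names invented for this example; `unittest.mock.Mock` is from the standard library.

```python
from typing import Any
from unittest.mock import Mock


def fetch_display_name(client: Any, user_id: str) -> str:
    """Hypothetical helper that asks a remote client for a user's name."""
    payload = client.get_user(user_id)
    return payload["name"].title()


def test_fetch_display_name_uses_client() -> None:
    """The client is mocked, so the test never touches the network."""
    client = Mock()
    client.get_user.return_value = {"name": "alice"}

    assert fetch_display_name(client, "u-1") == "Alice"
    client.get_user.assert_called_once_with("u-1")
```

Because the mock returns a fixed payload, the test is deterministic and can live under `tests/unit_tests/`, where network calls are not allowed.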

### 4. Security and Risk Assessment

**Security Checklist:**

- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
- Remove unreachable/commented code before committing
- Check for race conditions or resource leaks (file handles, sockets, threads)
- Ensure proper resource cleanup (file handles, connections)

❌ **Bad:**

```python
def load_config(path):
    with open(path) as f:
        return eval(f.read())  # ⚠️ Never eval config
```

✅ **Good:**

```python
import json

def load_config(path: str) -> dict:
    with open(path) as f:
        return json.load(f)
```
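
The exception-handling item in the checklist can be sketched as follows. `load_config_checked` is a hypothetical variant of the function above; the choice of `ValueError` for malformed JSON is an assumption made for this example, not a project convention.

```python
import json
from pathlib import Path


def load_config_checked(path: str) -> dict:
    """Load a JSON config file, raising a descriptive error on bad input."""
    try:
        # `read_text` opens and closes the file handle for us.
        raw = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError as exc:
        msg = f"Config file not found: {path}"
        raise FileNotFoundError(msg) from exc
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        msg = f"Config file {path} is not valid JSON: {exc.msg}"
        raise ValueError(msg) from exc
```

Each handler names the specific exception it expects, builds the message in a `msg` variable, and chains the original error with `from exc` so the traceback keeps the root cause.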

### 5. Documentation Standards

**Use Google-style docstrings with Args and Returns sections for all public functions.**

❌ **Insufficient Documentation:**

```python
def send_email(to, msg):
    """Send an email to a recipient."""
```

✅ **Complete Documentation:**

```python
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
    """
    Send an email to a recipient with specified priority.

    Args:
        to: The email address of the recipient.
        msg: The message body to send.
        priority: Email priority level.

    Returns:
        True if email was sent successfully, False otherwise.

    Raises:
        InvalidEmailError: If the email address format is invalid.
        SMTPConnectionError: If unable to connect to email server.
    """
```

**Documentation Guidelines:**

- Types go in function signatures, NOT in docstrings
- Focus on "why" rather than "what" in descriptions
- Document all parameters, return values, and exceptions
- Keep descriptions concise but clear

📌 *Tip:* Keep descriptions concise but clear. Only document return values if non-obvious.

### 6. Architectural Improvements
|
||||
|
||||
**When you encounter code that could be improved, suggest better designs:**
|
||||
|
||||
❌ **Poor Design:**
|
||||
|
||||
```python
|
||||
def process_data(data, db_conn, email_client, logger):
|
||||
# Function doing too many things
|
||||
validated = validate_data(data)
|
||||
result = db_conn.save(validated)
|
||||
email_client.send_notification(result)
|
||||
logger.log(f"Processed {len(data)} items")
|
||||
return result
|
||||
```
|
||||
|
||||
✅ **Better Design:**
|
||||
|
||||
```python
|
||||
@dataclass
|
||||
class ProcessingResult:
|
||||
"""Result of data processing operation."""
|
||||
items_processed: int
|
||||
success: bool
|
||||
errors: List[str] = field(default_factory=list)
|
||||
|
||||
class DataProcessor:
|
||||
"""Handles data validation, storage, and notification."""
|
||||
|
||||
def __init__(self, db_conn: Database, email_client: EmailClient):
|
||||
self.db = db_conn
|
||||
self.email = email_client
|
||||
|
||||
def process(self, data: List[dict]) -> ProcessingResult:
|
||||
"""Process and store data with notifications.
|
||||
|
||||
Args:
|
||||
data: List of data items to process.
|
||||
|
||||
Returns:
|
||||
ProcessingResult with details of the operation.
|
||||
"""
|
||||
validated = self._validate_data(data)
|
||||
result = self.db.save(validated)
|
||||
self._notify_completion(result)
|
||||
return result
|
||||
```
|
||||
|
||||
**Design Improvement Areas:**
|
||||
|
||||
If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would:
|
||||
|
||||
- Reduce code duplication through shared utilities
|
||||
- Improve separation of concerns (single responsibility)
|
||||
- Make unit testing easier through dependency injection
|
||||
- Add clarity without adding complexity
|
||||
- Prefer dataclasses for structured data
|
||||
|
||||
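As a minimal sketch of how dependency injection makes unit testing easier, the example below passes a storage backend into a processor so that a test can substitute an in-memory fake. The `Storage` protocol, `InMemoryStorage`, and `RecordProcessor` names are hypothetical and exist only for illustration; they are not part of LangChain.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Storage(Protocol):
    """Hypothetical storage interface; any object with a matching `save` works."""

    def save(self, items: list[dict]) -> int: ...


@dataclass
class InMemoryStorage:
    """Test double that records saved items instead of touching a database."""

    saved: list[dict] = field(default_factory=list)

    def save(self, items: list[dict]) -> int:
        self.saved.extend(items)
        return len(items)


@dataclass
class ProcessingResult:
    """Result of a processing run."""

    items_processed: int
    success: bool
    errors: list[str] = field(default_factory=list)


class RecordProcessor:
    """Validates records and stores the valid ones via an injected backend."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def process(self, data: list[dict]) -> ProcessingResult:
        # Assumption for this sketch: a record is valid if it has an "id" key.
        valid = [item for item in data if "id" in item]
        errors = [f"missing id: {item!r}" for item in data if "id" not in item]
        saved_count = self._storage.save(valid)
        return ProcessingResult(
            items_processed=saved_count,
            success=not errors,
            errors=errors,
        )
```

Because the processor only depends on the `save` method, a unit test can construct `RecordProcessor(InMemoryStorage())` and inspect `saved` afterwards, with no network or database involved.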
## Development Tools & Commands
|
||||
|
||||
### Package Management
|
||||
|
||||
```bash
|
||||
# Add package
|
||||
uv add package-name
|
||||
|
||||
# Sync project dependencies
|
||||
uv sync
|
||||
uv lock
|
||||
```
|
||||
|
||||
### Testing
|
||||
|
||||
```bash
|
||||
# Run unit tests (no network)
|
||||
make test
|
||||
|
||||
# Skip integration tests: they require API keys to be set
|
||||
|
||||
# Run specific test file
|
||||
uv run --group test pytest tests/unit_tests/test_specific.py
|
||||
```
|
||||
|
||||
### Code Quality
|
||||
|
||||
```bash
|
||||
# Lint code
|
||||
make lint
|
||||
|
||||
# Format code
|
||||
make format
|
||||
|
||||
# Type checking
|
||||
uv run --group lint mypy .
|
||||
```
|
||||
|
||||
### Dependency Management Patterns
|
||||
|
||||
**Local Development Dependencies:**
|
||||
|
||||
```toml
|
||||
[tool.uv.sources]
|
||||
langchain-core = { path = "../core", editable = true }
|
||||
langchain-tests = { path = "../standard-tests", editable = true }
|
||||
```
|
||||
|
||||
**For tools, use the `@tool` decorator from `langchain_core.tools`:**
|
||||
|
||||
```python
|
||||
from langchain_core.tools import tool
|
||||
|
||||
@tool
|
||||
def search_database(query: str) -> str:
|
||||
"""Search the database for relevant information.
|
||||
|
||||
Args:
|
||||
query: The search query string.
|
||||
"""
|
||||
# Implementation here
|
||||
return results
|
||||
```
|
||||
|
||||
## Commit Standards
|
||||
|
||||
**Use Conventional Commits format for PR titles:**
|
||||
|
||||
- `feat(core): add multi-tenant support`
|
||||
- `fix(cli)!: resolve flag parsing error` (a breaking change is marked with `!` immediately before the colon)
|
||||
- `docs: update API usage examples`
|
||||
- `docs(openai): update API usage examples`
|
||||
|
||||
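The title format can be checked mechanically. The sketch below is one possible validator, assuming the form described by the Conventional Commits specification: `type`, an optional `(scope)`, an optional `!` immediately before the colon, then a description. `ALLOWED_TYPES` is a hypothetical subset chosen for illustration; the project's real list of types and scopes lives in its CI configuration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical subset of commit types, for illustration only.
ALLOWED_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")

TITLE_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)"            # lowercase type, e.g. "feat"
    r"(?:\((?P<scope>[^()]+)\))?"   # optional scope in parentheses
    r"(?P<breaking>!)?"             # optional "!" marking a breaking change
    r": (?P<description>\S.*)$"     # colon, space, non-empty description
)


@dataclass(frozen=True)
class ParsedTitle:
    """Components of a Conventional Commits style title."""

    commit_type: str
    scope: Optional[str]
    breaking: bool
    description: str


def parse_pr_title(title: str) -> ParsedTitle:
    """Parse a PR title, raising `ValueError` if it is malformed."""
    match = TITLE_PATTERN.match(title)
    if match is None:
        msg = f"Title does not follow Conventional Commits: {title!r}"
        raise ValueError(msg)
    if match["type"] not in ALLOWED_TYPES:
        msg = f"Unknown commit type: {match['type']!r}"
        raise ValueError(msg)
    return ParsedTitle(
        commit_type=match["type"],
        scope=match["scope"],  # None when no scope was given
        breaking=match["breaking"] is not None,
        description=match["description"],
    )
```

For example, `parse_pr_title("feat(core): add multi-tenant support")` yields a `ParsedTitle` with `commit_type="feat"` and `scope="core"`, while a title such as `"Update stuff"` raises `ValueError`.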
## Framework-Specific Guidelines
|
||||
|
||||
- Follow the existing patterns in `langchain_core` for base abstractions
|
||||
- Implement proper streaming support where applicable
|
||||
- Avoid deprecated components
|
||||
|
||||
### Partner Integrations
|
||||
|
||||
- Follow the established patterns in existing partner libraries
|
||||
- Implement standard interfaces (`BaseChatModel`, `BaseEmbeddings`, etc.)
|
||||
- Include comprehensive integration tests
|
||||
- Document API key requirements and authentication
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference Checklist
|
||||
|
||||
Before submitting code changes:
|
||||
|
||||
- [ ] **Breaking Changes**: Verified no public API changes
|
||||
- [ ] **Type Hints**: All functions have complete type annotations
|
||||
- [ ] **Tests**: New functionality is fully tested
|
||||
- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.)
|
||||
- [ ] **Documentation**: Google-style docstrings for public functions
|
||||
- [ ] **Code Quality**: `make lint` and `make format` pass
|
||||
- [ ] **Architecture**: Suggested improvements where applicable
|
||||
- [ ] **Commit Message**: Follows Conventional Commits format
|
||||
11
.github/pr-file-labeler.yml
vendored
@@ -148,5 +148,16 @@ documentation:
|
||||
- changed-files:
|
||||
- any-glob-to-any-file:
|
||||
- "**/*.md"
|
||||
- "**/*.rst"
|
||||
- "**/README*"
|
||||
|
||||
# Security related changes
|
||||
security:
|
||||
- changed-files:
|
||||
- any-glob-to-any-file:
|
||||
- "**/*security*"
|
||||
- "**/*auth*"
|
||||
- "**/*credential*"
|
||||
- "**/*secret*"
|
||||
- "**/*token*"
|
||||
- ".github/workflows/security*"
|
||||
|
||||
@@ -35,7 +35,7 @@ jobs:
|
||||
timeout-minutes: 20
|
||||
name: "Python ${{ inputs.python-version }}"
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
|
||||
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
|
||||
uses: "./.github/actions/uv_setup"
|
||||
|
||||
8
.github/workflows/_lint.yml
vendored
@@ -38,7 +38,7 @@ jobs:
|
||||
timeout-minutes: 20
|
||||
steps:
|
||||
- name: "📋 Checkout Code"
|
||||
uses: actions/checkout@v6
|
||||
uses: actions/checkout@v5
|
||||
|
||||
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
|
||||
uses: "./.github/actions/uv_setup"
|
||||
@@ -47,12 +47,6 @@ jobs:
|
||||
cache-suffix: lint-${{ inputs.working-directory }}
|
||||
working-directory: ${{ inputs.working-directory }}
|
||||
|
||||
# - name: "🔒 Verify Lockfile is Up-to-Date"
|
||||
# working-directory: ${{ inputs.working-directory }}
|
||||
# run: |
|
||||
# unset UV_FROZEN
|
||||
# uv lock --check
|
||||
|
||||
- name: "📦 Install Lint & Typing Dependencies"
|
||||
working-directory: ${{ inputs.working-directory }}
|
||||
run: |
|
||||
|
||||
28
.github/workflows/_release.yml
vendored
@@ -19,7 +19,7 @@ on:
|
||||
required: true
|
||||
type: string
|
||||
description: "From which folder this pipeline executes"
|
||||
default: "libs/langchain_v1"
|
||||
default: "libs/langchain"
|
||||
release-version:
|
||||
required: true
|
||||
type: string
|
||||
@@ -54,7 +54,7 @@ jobs:
|
||||
version: ${{ steps.check-version.outputs.version }}
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
|
||||
- name: Set up Python + uv
|
||||
uses: "./.github/actions/uv_setup"
|
||||
@@ -77,7 +77,7 @@ jobs:
|
||||
working-directory: ${{ inputs.working-directory }}
|
||||
|
||||
- name: Upload build
|
||||
uses: actions/upload-artifact@v6
|
||||
uses: actions/upload-artifact@v5
|
||||
with:
|
||||
name: dist
|
||||
path: ${{ inputs.working-directory }}/dist/
|
||||
@@ -105,7 +105,7 @@ jobs:
|
||||
outputs:
|
||||
release-body: ${{ steps.generate-release-body.outputs.release-body }}
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
with:
|
||||
repository: langchain-ai/langchain
|
||||
path: langchain
|
||||
@@ -206,9 +206,9 @@ jobs:
|
||||
id-token: write
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
|
||||
- uses: actions/download-artifact@v7
|
||||
- uses: actions/download-artifact@v6
|
||||
with:
|
||||
name: dist
|
||||
path: ${{ inputs.working-directory }}/dist/
|
||||
@@ -237,7 +237,7 @@ jobs:
|
||||
contents: read
|
||||
timeout-minutes: 20
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
|
||||
# We explicitly *don't* set up caching here. This ensures our tests are
|
||||
# maximally sensitive to catching breakage.
|
||||
@@ -258,7 +258,7 @@ jobs:
|
||||
with:
|
||||
python-version: ${{ env.PYTHON_VERSION }}
|
||||
|
||||
- uses: actions/download-artifact@v7
|
||||
- uses: actions/download-artifact@v6
|
||||
with:
|
||||
name: dist
|
||||
path: ${{ inputs.working-directory }}/dist/
|
||||
@@ -412,7 +412,7 @@ jobs:
|
||||
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
|
||||
LANGCHAIN_TESTS_USER_AGENT: ${{ secrets.LANGCHAIN_TESTS_USER_AGENT }}
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
|
||||
# We implement this conditional as Github Actions does not have good support
|
||||
# for conditionally needing steps. https://github.com/actions/runner/issues/491
|
||||
@@ -430,7 +430,7 @@ jobs:
|
||||
with:
|
||||
python-version: ${{ env.PYTHON_VERSION }}
|
||||
|
||||
- uses: actions/download-artifact@v7
|
||||
- uses: actions/download-artifact@v6
|
||||
if: startsWith(inputs.working-directory, 'libs/core')
|
||||
with:
|
||||
name: dist
|
||||
@@ -492,14 +492,14 @@ jobs:
|
||||
working-directory: ${{ inputs.working-directory }}
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
|
||||
- name: Set up Python + uv
|
||||
uses: "./.github/actions/uv_setup"
|
||||
with:
|
||||
python-version: ${{ env.PYTHON_VERSION }}
|
||||
|
||||
- uses: actions/download-artifact@v7
|
||||
- uses: actions/download-artifact@v6
|
||||
with:
|
||||
name: dist
|
||||
path: ${{ inputs.working-directory }}/dist/
|
||||
@@ -532,14 +532,14 @@ jobs:
|
||||
working-directory: ${{ inputs.working-directory }}
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
|
||||
- name: Set up Python + uv
|
||||
uses: "./.github/actions/uv_setup"
|
||||
with:
|
||||
python-version: ${{ env.PYTHON_VERSION }}
|
||||
|
||||
- uses: actions/download-artifact@v7
|
||||
- uses: actions/download-artifact@v6
|
||||
with:
|
||||
name: dist
|
||||
path: ${{ inputs.working-directory }}/dist/
|
||||
|
||||
2
.github/workflows/_test.yml
vendored
@@ -33,7 +33,7 @@ jobs:
|
||||
name: "Python ${{ inputs.python-version }}"
|
||||
steps:
|
||||
- name: "📋 Checkout Code"
|
||||
uses: actions/checkout@v6
|
||||
uses: actions/checkout@v5
|
||||
|
||||
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
|
||||
uses: "./.github/actions/uv_setup"
|
||||
|
||||
2
.github/workflows/_test_pydantic.yml
vendored
@@ -36,7 +36,7 @@ jobs:
|
||||
name: "Pydantic ~=${{ inputs.pydantic-version }}"
|
||||
steps:
|
||||
- name: "📋 Checkout Code"
|
||||
uses: actions/checkout@v6
|
||||
uses: actions/checkout@v5
|
||||
|
||||
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
|
||||
uses: "./.github/actions/uv_setup"
|
||||
|
||||
107
.github/workflows/auto-label-by-package.yml
vendored
@@ -1,107 +0,0 @@
|
||||
name: Auto Label Issues by Package
|
||||
|
||||
on:
|
||||
issues:
|
||||
types: [opened, edited]
|
||||
|
||||
jobs:
|
||||
label-by-package:
|
||||
permissions:
|
||||
issues: write
|
||||
runs-on: ubuntu-latest
|
||||
|
||||
steps:
|
||||
- name: Sync package labels
|
||||
uses: actions/github-script@v8
|
||||
with:
|
||||
script: |
|
||||
const body = context.payload.issue.body || "";
|
||||
|
||||
// Extract text under "### Package"
|
||||
const match = body.match(/### Package\s+([\s\S]*?)\n###/i);
|
||||
if (!match) return;
|
||||
|
||||
const packageSection = match[1].trim();
|
||||
|
||||
// Mapping table for package names to labels
|
||||
const mapping = {
|
||||
"langchain": "langchain",
|
||||
"langchain-openai": "openai",
|
||||
"langchain-anthropic": "anthropic",
|
||||
"langchain-classic": "langchain-classic",
|
||||
"langchain-core": "core",
|
||||
"langchain-cli": "cli",
|
||||
"langchain-model-profiles": "model-profiles",
|
||||
"langchain-tests": "standard-tests",
|
||||
"langchain-text-splitters": "text-splitters",
|
||||
"langchain-chroma": "chroma",
|
||||
"langchain-deepseek": "deepseek",
|
||||
"langchain-exa": "exa",
|
||||
"langchain-fireworks": "fireworks",
|
||||
"langchain-groq": "groq",
|
||||
"langchain-huggingface": "huggingface",
|
||||
"langchain-mistralai": "mistralai",
|
||||
"langchain-nomic": "nomic",
|
||||
"langchain-ollama": "ollama",
|
||||
"langchain-perplexity": "perplexity",
|
||||
"langchain-prompty": "prompty",
|
||||
"langchain-qdrant": "qdrant",
|
||||
"langchain-xai": "xai",
|
||||
};
|
||||
|
||||
// All possible package labels we manage
|
||||
const allPackageLabels = Object.values(mapping);
|
||||
const selectedLabels = [];
|
||||
|
||||
// Check if this is checkbox format (multiple selection)
|
||||
const checkboxMatches = packageSection.match(/- \[x\]\s+([^\n\r]+)/gi);
|
||||
if (checkboxMatches) {
|
||||
// Handle checkbox format
|
||||
for (const match of checkboxMatches) {
|
||||
const packageName = match.replace(/- \[x\]\s+/i, '').trim();
|
||||
const label = mapping[packageName];
|
||||
if (label && !selectedLabels.includes(label)) {
|
||||
selectedLabels.push(label);
|
||||
}
|
||||
}
|
||||
} else {
|
||||
// Handle dropdown format (single selection)
|
||||
const label = mapping[packageSection];
|
||||
if (label) {
|
||||
selectedLabels.push(label);
|
||||
}
|
||||
}
|
||||
|
||||
// Get current issue labels
|
||||
const issue = await github.rest.issues.get({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
issue_number: context.issue.number
|
||||
});
|
||||
|
||||
const currentLabels = issue.data.labels.map(label => label.name);
|
||||
const currentPackageLabels = currentLabels.filter(label => allPackageLabels.includes(label));
|
||||
|
||||
// Determine labels to add and remove
|
||||
const labelsToAdd = selectedLabels.filter(label => !currentPackageLabels.includes(label));
|
||||
const labelsToRemove = currentPackageLabels.filter(label => !selectedLabels.includes(label));
|
||||
|
||||
// Add new labels
|
||||
if (labelsToAdd.length > 0) {
|
||||
await github.rest.issues.addLabels({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
issue_number: context.issue.number,
|
||||
labels: labelsToAdd
|
||||
});
|
||||
}
|
||||
|
||||
// Remove old labels
|
||||
for (const label of labelsToRemove) {
|
||||
await github.rest.issues.removeLabel({
|
||||
owner: context.repo.owner,
|
||||
repo: context.repo.repo,
|
||||
issue_number: context.issue.number,
|
||||
name: label
|
||||
});
|
||||
}
|
||||
6
.github/workflows/check_diffs.yml
vendored
@@ -47,7 +47,7 @@ jobs:
|
||||
if: ${{ !contains(github.event.pull_request.labels.*.name, 'ci-ignore') }}
|
||||
steps:
|
||||
- name: "📋 Checkout Code"
|
||||
uses: actions/checkout@v6
|
||||
uses: actions/checkout@v5
|
||||
- name: "🐍 Setup Python 3.11"
|
||||
uses: actions/setup-python@v6
|
||||
with:
|
||||
@@ -141,7 +141,7 @@ jobs:
|
||||
run:
|
||||
working-directory: ${{ matrix.job-configs.working-directory }}
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
|
||||
- name: "🐍 Set up Python ${{ matrix.job-configs.python-version }} + UV"
|
||||
uses: "./.github/actions/uv_setup"
|
||||
@@ -182,7 +182,7 @@ jobs:
|
||||
job-configs: ${{ fromJson(needs.build.outputs.codspeed) }}
|
||||
fail-fast: false
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
|
||||
- name: "📦 Install UV Package Manager"
|
||||
uses: astral-sh/setup-uv@v7
|
||||
|
||||
@@ -9,6 +9,8 @@ on:
|
||||
paths:
|
||||
- "libs/core/pyproject.toml"
|
||||
- "libs/core/langchain_core/version.py"
|
||||
- "libs/langchain_v1/pyproject.toml"
|
||||
- "libs/langchain_v1/langchain/__init__.py"
|
||||
|
||||
permissions:
|
||||
contents: read
|
||||
@@ -18,9 +20,9 @@ jobs:
|
||||
runs-on: ubuntu-latest
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
|
||||
- name: "✅ Verify pyproject.toml & version.py Match"
|
||||
- name: "✅ Verify pyproject.toml & version files Match"
|
||||
run: |
|
||||
# Check core versions
|
||||
CORE_PYPROJECT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' libs/core/pyproject.toml)
|
||||
6
.github/workflows/integration_tests.yml
vendored
@@ -71,14 +71,14 @@ jobs:
|
||||
working-directory: ${{ fromJSON(needs.compute-matrix.outputs.matrix).working-directory }}
|
||||
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
with:
|
||||
path: langchain
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
with:
|
||||
repository: langchain-ai/langchain-google
|
||||
path: langchain-google
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
with:
|
||||
repository: langchain-ai/langchain-aws
|
||||
path: langchain-aws
|
||||
|
||||
13
.github/workflows/pr_lint.yml
vendored
@@ -26,13 +26,11 @@
|
||||
# * revert — reverts a previous commit
|
||||
# * release — prepare a new release
|
||||
#
|
||||
# Allowed Scope(s) (optional):
|
||||
# core, cli, langchain, langchain_v1, langchain-classic, model-profiles,
|
||||
# standard-tests, text-splitters, docs, anthropic, chroma, deepseek, exa,
|
||||
# fireworks, groq, huggingface, mistralai, nomic, ollama, openai,
|
||||
# perplexity, prompty, qdrant, xai, infra, deps
|
||||
#
|
||||
# Multiple scopes can be used by separating them with a comma.
|
||||
# Allowed Scopes (optional):
|
||||
# core, cli, langchain, langchain_v1, langchain-classic, standard-tests,
|
||||
# text-splitters, docs, anthropic, chroma, deepseek, exa, fireworks, groq,
|
||||
# huggingface, mistralai, nomic, ollama, openai, perplexity, prompty, qdrant,
|
||||
# xai, infra, deps
|
||||
#
|
||||
# Rules:
|
||||
# 1. The 'Type' must start with a lowercase letter.
|
||||
@@ -102,7 +100,6 @@ jobs:
|
||||
qdrant
|
||||
xai
|
||||
infra
|
||||
deps
|
||||
requireScope: false
|
||||
disallowScopes: |
|
||||
release
|
||||
|
||||
4
.github/workflows/v03_api_doc_build.yml
vendored
@@ -23,12 +23,12 @@ jobs:
|
||||
permissions:
|
||||
contents: read
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
with:
|
||||
ref: v0.3
|
||||
path: langchain
|
||||
|
||||
- uses: actions/checkout@v6
|
||||
- uses: actions/checkout@v5
|
||||
with:
|
||||
repository: langchain-ai/langchain-api-docs-html
|
||||
path: langchain-api-docs-html
|
||||
|
||||
8
.github/workflows/v1_changes.md
vendored
Normal file
@@ -0,0 +1,8 @@
|
||||
With the deprecation of v0 docs, the following files will need to be migrated/supported
|
||||
in the new docs repo:
|
||||
|
||||
- run_notebooks.yml: New repo should run Integration tests on code snippets?
|
||||
- people.yml: Need to fix and somehow display on the new docs site
|
||||
- Subsequently, `.github/actions/people/`
|
||||
- _test_doc_imports.yml
|
||||
- check-broken-links.yml
|
||||
3
.gitignore
vendored
@@ -163,6 +163,3 @@ node_modules
|
||||
|
||||
prof
|
||||
virtualenv/
|
||||
scratch/
|
||||
|
||||
.langgraph_api/
|
||||
|
||||
2
.vscode/extensions.json
vendored
@@ -6,6 +6,8 @@
|
||||
"ms-toolsai.jupyter",
|
||||
"ms-toolsai.jupyter-keymap",
|
||||
"ms-toolsai.jupyter-renderers",
|
||||
"ms-toolsai.vscode-jupyter-cell-tags",
|
||||
"ms-toolsai.vscode-jupyter-slideshow",
|
||||
"yzhang.markdown-all-in-one",
|
||||
"davidanson.vscode-markdownlint",
|
||||
"bierner.markdown-mermaid",
|
||||
|
||||
419
AGENTS.md
@@ -1,58 +1,255 @@
|
||||
# Global development guidelines for the LangChain monorepo
|
||||
# Global Development Guidelines for LangChain Projects
|
||||
|
||||
This document provides context to understand the LangChain Python project and assist with development.
|
||||
## Core Development Principles
|
||||
|
||||
## Project architecture and context
|
||||
### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL
|
||||
|
||||
### Monorepo structure
|
||||
**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**
|
||||
|
||||
This is a Python monorepo with multiple independently versioned packages that use `uv`.
|
||||
❌ **Bad - Breaking Change:**
|
||||
|
||||
```txt
|
||||
langchain/
|
||||
├── libs/
|
||||
│ ├── core/ # `langchain-core` primitives and base abstractions
|
||||
│ ├── langchain/ # `langchain-classic` (legacy, no new features)
|
||||
│ ├── langchain_v1/ # Actively maintained `langchain` package
|
||||
│ ├── partners/ # Third-party integrations
|
||||
│ │ ├── openai/ # OpenAI models and embeddings
|
||||
│ │ ├── anthropic/ # Anthropic (Claude) integration
|
||||
│ │ ├── ollama/ # Local model support
|
||||
│ │ └── ... (other integrations maintained by the LangChain team)
|
||||
│ ├── text-splitters/ # Document chunking utilities
|
||||
│ ├── standard-tests/ # Shared test suite for integrations
|
||||
│ ├── model-profiles/ # Model configuration profiles
|
||||
│ └── cli/ # Command-line interface tools
|
||||
├── .github/ # CI/CD workflows and templates
|
||||
├── .vscode/ # VSCode IDE standard settings and recommended extensions
|
||||
└── README.md # Information about LangChain
|
||||
```python
|
||||
def get_user(id, verbose=False): # Changed from `user_id`
|
||||
pass
|
||||
```
|
||||
|
||||
- **Core layer** (`langchain-core`): Base abstractions, interfaces, and protocols. Users should not need to know about this layer directly.
|
||||
- **Implementation layer** (`langchain`): Concrete implementations and high-level public utilities
|
||||
- **Integration layer** (`partners/`): Third-party service integrations. Note that this monorepo is not exhaustive of all LangChain integrations; some are maintained in separate repos, such as `langchain-ai/langchain-google` and `langchain-ai/langchain-aws`. Usually these repos are cloned at the same level as this monorepo, so if needed, you can refer to their code directly by navigating to `../langchain-google/` from this monorepo.
|
||||
- **Testing layer** (`standard-tests/`): Standardized integration tests for partner integrations
|
||||
✅ **Good - Stable Interface:**
|
||||
|
||||
### Development tools & commands
|
||||
```python
|
||||
def get_user(user_id: str, verbose: bool = False) -> User:
|
||||
"""Retrieve user by ID with optional verbose output."""
|
||||
pass
|
||||
```
|
||||
|
||||
- `uv` – Fast Python package installer and resolver (replaces pip/poetry)
|
||||
- `make` – Task runner for common development commands. Feel free to look at the `Makefile` for available commands and usage patterns.
|
||||
- `ruff` – Fast Python linter and formatter
|
||||
- `mypy` – Static type checking
|
||||
- `pytest` – Testing framework
|
||||
**Before making ANY changes to public APIs:**
|
||||
|
||||
This monorepo uses `uv` for dependency management. Local development uses editable installs configured under `[tool.uv.sources]`.
|
||||
- Check if the function/class is exported in `__init__.py`
|
||||
- Look for existing usage patterns in tests and examples
|
||||
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
|
||||
- Mark experimental features clearly with docstring warnings (using MkDocs Material admonitions, like `!!! warning`)
|
||||
|
||||
Each package in `libs/` has its own `pyproject.toml` and `uv.lock`.
|
||||
🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?"
|
||||
|
||||
### 2. Code Quality Standards
|
||||
|
||||
**All Python code MUST include type hints and return types.**
|
||||
|
||||
❌ **Bad:**
|
||||
|
||||
```python
|
||||
def p(u, d):
|
||||
return [x for x in u if x not in d]
|
||||
```
|
||||
|
||||
✅ **Good:**
|
||||
|
||||
```python
|
||||
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
|
||||
"""Filter out users that are not in the known users set.
|
||||
|
||||
Args:
|
||||
users: List of user identifiers to filter.
|
||||
known_users: Set of known/valid user identifiers.
|
||||
|
||||
Returns:
|
||||
List of users that are not in the known_users set.
|
||||
"""
|
||||
return [user for user in users if user not in known_users]
|
||||
```
|
||||
|
||||
**Style Requirements:**
|
||||
|
||||
- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
|
||||
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
|
||||
- Avoid unnecessary abstraction or premature optimization
|
||||
- Follow existing patterns in the codebase you're modifying
|
||||
|
||||
### 3. Testing Requirements
|
||||
|
||||
**Every new feature or bugfix MUST be covered by unit tests.**
|
||||
|
||||
**Test Organization:**
|
||||
|
||||
- Unit tests: `tests/unit_tests/` (no network calls allowed)
|
||||
- Integration tests: `tests/integration_tests/` (network calls permitted)
|
||||
- Use `pytest` as the testing framework
|
||||
|
||||
**Test Quality Checklist:**
|
||||
|
||||
- [ ] Tests fail when your new logic is broken
|
||||
- [ ] Happy path is covered
|
||||
- [ ] Edge cases and error conditions are tested
|
||||
- [ ] Use fixtures/mocks for external dependencies
|
||||
- [ ] Tests are deterministic (no flaky tests)
|
||||
|
||||
Checklist questions:
|
||||
|
||||
- [ ] Does the test suite fail if your new logic is broken?
|
||||
- [ ] Are all expected behaviors exercised (happy path, invalid input, etc)?
|
||||
- [ ] Do tests use fixtures or mocks where needed?
|
||||
|
||||
```python
|
||||
def test_filter_unknown_users():
|
||||
"""Test filtering unknown users from a list."""
|
||||
users = ["alice", "bob", "charlie"]
|
||||
known_users = {"alice", "bob"}
|
||||
|
||||
result = filter_unknown_users(users, known_users)
|
||||
|
||||
assert result == ["charlie"]
|
||||
assert len(result) == 1
|
||||
```
|
||||
|
||||
### 4. Security and Risk Assessment
|
||||
|
||||
**Security Checklist:**
|
||||
|
||||
- No `eval()`, `exec()`, or `pickle` on user-controlled input
|
||||
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
|
||||
- Remove unreachable/commented code before committing
|
||||
- Check for race conditions and resource leaks (file handles, sockets, threads)
|
||||
- Ensure proper resource cleanup (file handles, connections)
|
||||
|
||||
❌ **Bad:**
|
||||
|
||||
```python
|
||||
def load_config(path):
|
||||
with open(path) as f:
|
||||
return eval(f.read()) # ⚠️ Never eval config
|
||||
```
|
||||
|
||||
✅ **Good:**
|
||||
|
||||
```python
|
||||
import json
|
||||
|
||||
def load_config(path: str) -> dict:
|
||||
with open(path) as f:
|
||||
return json.load(f)
|
||||
```
|
||||
|
||||
### 5. Documentation Standards
|
||||
|
||||
**Use Google-style docstrings with Args section for all public functions.**
|
||||
|
||||
❌ **Insufficient Documentation:**
|
||||
|
||||
```python
|
||||
def send_email(to, msg):
|
||||
"""Send an email to a recipient."""
|
||||
```
|
||||
|
||||
✅ **Complete Documentation:**
|
||||
|
||||
```python
|
||||
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
|
||||
"""
|
||||
Send an email to a recipient with specified priority.
|
||||
|
||||
Args:
|
||||
to: The email address of the recipient.
|
||||
msg: The message body to send.
|
||||
priority: Email priority level (`'low'`, `'normal'`, `'high'`).
|
||||
|
||||
Returns:
|
||||
`True` if email was sent successfully, `False` otherwise.
|
||||
|
||||
Raises:
|
||||
`InvalidEmailError`: If the email address format is invalid.
|
||||
`SMTPConnectionError`: If unable to connect to email server.
|
||||
"""
|
||||
```
|
||||
|
||||
**Documentation Guidelines:**
|
||||
|
||||
- Types go in function signatures, NOT in docstrings
|
||||
- If a default is present, DO NOT repeat it in the docstring unless there is post-processing or it is set conditionally.
|
||||
- Focus on "why" rather than "what" in descriptions
|
||||
- Document all parameters, return values, and exceptions
|
||||
- Keep descriptions concise but clear
|
||||
- Ensure American English spelling (e.g., "behavior", not "behaviour")
|
||||
|
||||
📌 *Tip:* A `Returns:` section can be omitted when the return value is obvious from the signature and summary line.
|
||||
|
||||
### 6. Architectural Improvements
|
||||
|
||||
**When you encounter code that could be improved, suggest better designs:**
|
||||
|
||||
❌ **Poor Design:**
|
||||
|
||||
```python
|
||||
def process_data(data, db_conn, email_client, logger):
|
||||
# Function doing too many things
|
||||
validated = validate_data(data)
|
||||
result = db_conn.save(validated)
|
||||
email_client.send_notification(result)
|
||||
logger.log(f"Processed {len(data)} items")
|
||||
return result
|
||||
```
|
||||
|
||||
✅ **Better Design:**
|
||||
|
||||
```python
|
||||
@dataclass
|
||||
class ProcessingResult:
|
||||
"""Result of data processing operation."""
|
||||
items_processed: int
|
||||
success: bool
|
||||
errors: List[str] = field(default_factory=list)
|
||||
|
||||
class DataProcessor:
|
||||
"""Handles data validation, storage, and notification."""
|
||||
|
||||
def __init__(self, db_conn: Database, email_client: EmailClient):
|
||||
self.db = db_conn
|
||||
self.email = email_client
|
||||
|
||||
def process(self, data: List[dict]) -> ProcessingResult:
|
||||
"""Process and store data with notifications."""
|
||||
validated = self._validate_data(data)
|
||||
result = self.db.save(validated)
|
||||
self._notify_completion(result)
|
||||
return result
|
||||
```
|
||||
|
||||
**Design Improvement Areas:**
|
||||
|
||||
If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would:
|
||||
|
||||
- Reduce code duplication through shared utilities
|
||||
- Improve separation of concerns (single responsibility)
|
||||
- Make unit testing easier through dependency injection
|
||||
- Add clarity without adding complexity
|
||||
- Prefer dataclasses for structured data
|
||||
|
||||
## Development Tools & Commands
|
||||
|
||||
### Package Management
|
||||
|
||||
```bash
|
||||
# Add package
|
||||
uv add package-name
|
||||
|
||||
# Sync project dependencies
|
||||
uv sync
|
||||
uv lock
|
||||
```
|
||||
|
||||
### Testing
|
||||
|
||||
```bash
|
||||
# Run unit tests (no network)
|
||||
make test
|
||||
|
||||
# Skip integration tests: they require API keys to be set
|
||||
|
||||
# Run specific test file
|
||||
uv run --group test pytest tests/unit_tests/test_specific.py
|
||||
```
|
||||
|
||||
### Code Quality
|
||||
|
||||
```bash
|
||||
# Lint code
|
||||
make lint
|
||||
@@ -64,118 +261,66 @@ make format
|
||||
uv run --group lint mypy .
|
||||
```
|
||||
|
||||
#### Key config files
|
||||
### Dependency Management Patterns
|
||||
|
||||
- pyproject.toml: Main workspace configuration with dependency groups
|
||||
- uv.lock: Locked dependencies for reproducible builds
|
||||
- Makefile: Development tasks
|
||||
**Local Development Dependencies:**
|
||||
|
||||
#### Commit standards
|
||||
|
||||
Suggest PR titles that follow the Conventional Commits format. Refer to `.github/workflows/pr_lint.yml` for allowed types and scopes.
|
||||
|
||||
#### Pull request guidelines
|
||||
|
||||
- Always add a disclaimer to the PR description mentioning how AI agents are involved with the contribution.
|
||||
- Describe the "why" of the changes and why the proposed solution is the right one. Limit prose.
|
||||
- Highlight areas of the proposed changes that require careful review.
|
||||
|
||||
## Core development principles
|
||||
|
||||
### Maintain stable public interfaces
|
||||
|
||||
CRITICAL: Always attempt to preserve function signatures, argument positions, and names for exported/public methods. Do not make breaking changes.
|
||||
|
||||
**Before making ANY changes to public APIs:**
|
||||
|
||||
- Check if the function/class is exported in `__init__.py`
|
||||
- Look for existing usage patterns in tests and examples
|
||||
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
|
||||
- Mark experimental features clearly with docstring warnings (using MkDocs Material admonitions, like `!!! warning`)
|
||||
|
||||
Ask: "Would this change break someone's code if they used it last week?"
|
||||
|
||||
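The keyword-only guideline above can be illustrated with a short sketch. Here `get_user` is a stand-in function and `include_inactive` is a hypothetical new parameter; placing it after `*` means existing positional call sites cannot accidentally bind to it.

```python
def get_user(
    user_id: str,
    verbose: bool = False,
    *,
    include_inactive: bool = False,  # hypothetical new parameter
) -> dict[str, object]:
    """Return a user record (stubbed here with an in-memory value)."""
    return {
        "user_id": user_id,
        "verbose": verbose,
        "include_inactive": include_inactive,
    }


# Existing positional call sites keep working unchanged.
get_user("u-123", True)

# New behavior is opt-in and must be named explicitly.
get_user("u-123", include_inactive=True)
```

A call such as `get_user("u-123", True, True)` raises `TypeError`, so the new parameter cannot silently change the meaning of existing calls, and later parameters can be added without disturbing positional order.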
### Code quality standards
|
||||
|
||||
All Python code MUST include type hints and return types.
|
||||
|
||||
```python title="Example"
|
||||
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
|
||||
"""Single line description of the function.
|
||||
|
||||
Any additional context about the function can go here.
|
||||
|
||||
Args:
|
||||
users: List of user identifiers to filter.
|
||||
known_users: Set of known/valid user identifiers.
|
||||
|
||||
Returns:
|
||||
List of users that are not in the known_users set.
|
||||
"""
|
||||
```toml
|
||||
[tool.uv.sources]
|
||||
langchain-core = { path = "../core", editable = true }
|
||||
langchain-tests = { path = "../standard-tests", editable = true }
|
||||
```
|
||||
|
||||
- Use descriptive, self-explanatory variable names.
|
||||
- Follow existing patterns in the codebase you're modifying
|
||||
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
|
||||
**For tools, use the `@tool` decorator from `langchain_core.tools`:**
|
||||
|
||||
### Testing requirements
|
||||
```python
|
||||
from langchain_core.tools import tool
|
||||
|
||||
Every new feature or bugfix MUST be covered by unit tests.
|
||||
|
||||
- Unit tests: `tests/unit_tests/` (no network calls allowed)
|
||||
- Integration tests: `tests/integration_tests/` (network calls permitted)
|
||||
- We use `pytest` as the testing framework; if in doubt, check other existing tests for examples.
|
||||
- The testing file structure should mirror the source code structure.
|
||||
|
||||
**Checklist:**
|
||||
|
||||
- [ ] Tests fail when your new logic is broken
|
||||
- [ ] Happy path is covered
|
||||
- [ ] Edge cases and error conditions are tested
|
||||
- [ ] Use fixtures/mocks for external dependencies
|
||||
- [ ] Tests are deterministic (no flaky tests)
|
||||
- [ ] Does the test suite fail if your new logic is broken?
|
||||
|
||||
### Security and risk assessment
|
||||
|
||||
- No `eval()`, `exec()`, or `pickle` on user-controlled input
|
||||
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
|
||||
- Remove unreachable/commented code before committing
|
||||
- Watch for race conditions and resource leaks (file handles, sockets, threads).
|
||||
- Ensure proper resource cleanup (file handles, connections)
|
||||
|
||||
### Documentation standards
|
||||
|
||||
Use Google-style docstrings with Args section for all public functions.
|
||||
|
||||
```python title="Example"
|
||||
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
|
||||
"""Send an email to a recipient with specified priority.
|
||||
|
||||
Any additional context about the function can go here.
|
||||
@tool
|
||||
def search_database(query: str) -> str:
|
||||
"""Search the database for relevant information.
|
||||
|
||||
Args:
|
||||
to: The email address of the recipient.
|
||||
msg: The message body to send.
|
||||
priority: Email priority level.
|
||||
|
||||
Returns:
|
||||
`True` if email was sent successfully, `False` otherwise.
|
||||
|
||||
Raises:
|
||||
InvalidEmailError: If the email address format is invalid.
|
||||
SMTPConnectionError: If unable to connect to email server.
|
||||
query: The search query string.
|
||||
"""
|
||||
# Implementation here
|
||||
return results
|
||||
```
|
||||
|
||||
- Types go in function signatures, NOT in docstrings
|
||||
- If a default is present, DO NOT repeat it in the docstring unless there is post-processing or it is set conditionally.
|
||||
- Focus on "why" rather than "what" in descriptions
|
||||
- Document all parameters, return values, and exceptions
|
||||
- Keep descriptions concise but clear
|
||||
- Ensure American English spelling (e.g., "behavior", not "behaviour")
|
||||
## Commit Standards
|
||||
|
||||
## Additional resources
|
||||
**Use Conventional Commits format for PR titles:**
|
||||
|
||||
- **Documentation:** https://docs.langchain.com/oss/python/langchain/overview and source at https://github.com/langchain-ai/docs or `../docs/`. Prefer the local install and use file search tools for best results. If needed, use the docs MCP server as defined in `.mcp.json` for programmatic access.
|
||||
- **Contributing Guide:** [`.github/CONTRIBUTING.md`](https://docs.langchain.com/oss/python/contributing/overview)
|
||||
- `feat(core): add multi-tenant support`
|
||||
- `fix(cli): resolve flag parsing error`
|
||||
- `docs: update API usage examples`
|
||||
- `docs(openai): update API usage examples`
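
The title format shown in the examples above can be checked mechanically. The following is a minimal sketch, not the project's actual lint rule: the `ASSUMED_TYPES` tuple and the `is_conventional_title` helper are hypothetical, and the authoritative list of types and scopes is whatever the PR lint workflow defines.

```python
import re

# Assumed subset of types, taken only from the examples above. The real
# allowed types and scopes are defined by the PR lint workflow.
ASSUMED_TYPES = ("feat", "fix", "docs")

# type, optional "(scope)", then ": " and a non-empty summary.
_TITLE_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[a-z0-9_-]+)\))?: (?P<summary>\S.*)$"
)


def is_conventional_title(title: str) -> bool:
    """Check whether a PR title looks like a Conventional Commits title.

    Args:
        title: The proposed PR title.

    Returns:
        `True` if the title has the `type(scope): summary` shape and uses one
        of the assumed types, `False` otherwise.
    """
    match = _TITLE_PATTERN.match(title)
    return match is not None and match.group("type") in ASSUMED_TYPES
```

A check like this only validates the shape of the title; whether a given scope is accepted still depends on the workflow configuration.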
|
||||
|
||||
## Framework-Specific Guidelines
|
||||
|
||||
- Follow the existing patterns in `langchain-core` for base abstractions
|
||||
- Use `langchain_core.callbacks` for execution tracking
|
||||
- Implement proper streaming support where applicable
|
||||
- Avoid deprecated components like legacy `LLMChain`
|
||||
|
||||
### Partner Integrations
|
||||
|
||||
- Follow the established patterns in existing partner libraries
|
||||
- Implement standard interfaces (`BaseChatModel`, `BaseEmbeddings`, etc.)
|
||||
- Include comprehensive integration tests
|
||||
- Document API key requirements and authentication
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference Checklist
|
||||
|
||||
Before submitting code changes:
|
||||
|
||||
- [ ] **Breaking Changes**: Verified no public API changes
|
||||
- [ ] **Type Hints**: All functions have complete type annotations
|
||||
- [ ] **Tests**: New functionality is fully tested
|
||||
- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.)
|
||||
- [ ] **Documentation**: Google-style docstrings for public functions
|
||||
- [ ] **Code Quality**: `make lint` and `make format` pass
|
||||
- [ ] **Architecture**: Suggested improvements where applicable
|
||||
- [ ] **Commit Message**: Follows Conventional Commits format
|
||||
|
||||
419
CLAUDE.md
@@ -1,58 +1,255 @@
|
||||
# Global development guidelines for the LangChain monorepo
|
||||
# Global Development Guidelines for LangChain Projects
|
||||
|
||||
This document provides context to understand the LangChain Python project and assist with development.
|
||||
## Core Development Principles
|
||||
|
||||
## Project architecture and context
|
||||
### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL
|
||||
|
||||
### Monorepo structure
|
||||
**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**
|
||||
|
||||
This is a Python monorepo with multiple independently versioned packages that use `uv`.
|
||||
❌ **Bad - Breaking Change:**
|
||||
|
||||
```txt
|
||||
langchain/
|
||||
├── libs/
|
||||
│ ├── core/ # `langchain-core` primitives and base abstractions
|
||||
│ ├── langchain/ # `langchain-classic` (legacy, no new features)
|
||||
│ ├── langchain_v1/ # Actively maintained `langchain` package
|
||||
│ ├── partners/ # Third-party integrations
|
||||
│ │ ├── openai/ # OpenAI models and embeddings
|
||||
│ │ ├── anthropic/ # Anthropic (Claude) integration
|
||||
│ │ ├── ollama/ # Local model support
|
||||
│ │ └── ... (other integrations maintained by the LangChain team)
|
||||
│ ├── text-splitters/ # Document chunking utilities
|
||||
│ ├── standard-tests/ # Shared test suite for integrations
|
||||
│ ├── model-profiles/ # Model configuration profiles
|
||||
│ └── cli/ # Command-line interface tools
|
||||
├── .github/ # CI/CD workflows and templates
|
||||
├── .vscode/ # VSCode IDE standard settings and recommended extensions
|
||||
└── README.md # Information about LangChain
|
||||
```python
|
||||
def get_user(id, verbose=False): # Changed from `user_id`
|
||||
pass
|
||||
```
|
||||
|
||||
- **Core layer** (`langchain-core`): Base abstractions, interfaces, and protocols. Users should not need to know about this layer directly.
|
||||
- **Implementation layer** (`langchain`): Concrete implementations and high-level public utilities
|
||||
- **Integration layer** (`partners/`): Third-party service integrations. Note that this monorepo is not exhaustive of all LangChain integrations; some are maintained in separate repos, such as `langchain-ai/langchain-google` and `langchain-ai/langchain-aws`. Usually these repos are cloned at the same level as this monorepo, so if needed, you can refer to their code directly by navigating to `../langchain-google/` from this monorepo.
|
||||
- **Testing layer** (`standard-tests/`): Standardized integration tests for partner integrations
|
||||
✅ **Good - Stable Interface:**
|
||||
|
||||
### Development tools & commands
|
||||
```python
|
||||
def get_user(user_id: str, verbose: bool = False) -> User:
|
||||
"""Retrieve user by ID with optional verbose output."""
|
||||
pass
|
||||
```
|
||||
|
||||
- `uv` – Fast Python package installer and resolver (replaces pip/poetry)
|
||||
- `make` – Task runner for common development commands. Feel free to look at the `Makefile` for available commands and usage patterns.
|
||||
- `ruff` – Fast Python linter and formatter
|
||||
- `mypy` – Static type checking
|
||||
- `pytest` – Testing framework
|
||||
**Before making ANY changes to public APIs:**
|
||||
|
||||
This monorepo uses `uv` for dependency management. Local development uses editable installs: `[tool.uv.sources]`
|
||||
- Check if the function/class is exported in `__init__.py`
|
||||
- Look for existing usage patterns in tests and examples
|
||||
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
|
||||
- Mark experimental features clearly with docstring warnings (using MkDocs Material admonitions, like `!!! warning`)
|
||||
|
||||
Each package in `libs/` has its own `pyproject.toml` and `uv.lock`.
|
||||
🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?"
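
As a minimal sketch of the keyword-only guidance above, a new option can be added after a bare `*` so that existing positional calls keep working. The `User` class, the `_USERS` store, and the `include_inactive` parameter are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    """Hypothetical user record, used only for this illustration."""

    user_id: str
    active: bool = True


# Hypothetical in-memory store standing in for a real backend.
_USERS = {"u1": User("u1"), "u2": User("u2", active=False)}


def get_user(
    user_id: str,
    verbose: bool = False,
    *,
    include_inactive: bool = False,  # New parameter: keyword-only, with a default.
) -> Optional[User]:
    """Retrieve a user by ID.

    Args:
        user_id: Identifier of the user to look up.
        verbose: Whether to print the lookup result.
        include_inactive: Whether inactive users may be returned.

    Returns:
        The matching user, or `None` if no eligible user exists.
    """
    user = _USERS.get(user_id)
    if user is not None and not user.active and not include_inactive:
        user = None
    if verbose:
        print(f"lookup {user_id!r} -> {user}")
    return user
```

Callers written before the change, such as `get_user("u1", True)`, behave as before, and the new option cannot be passed positionally by accident.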
|
||||
|
||||
### 2. Code Quality Standards
|
||||
|
||||
**All Python code MUST include type hints and return types.**
|
||||
|
||||
❌ **Bad:**
|
||||
|
||||
```python
|
||||
def p(u, d):
|
||||
return [x for x in u if x not in d]
|
||||
```
|
||||
|
||||
✅ **Good:**
|
||||
|
||||
```python
|
||||
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
|
||||
"""Filter out users that are not in the known users set.
|
||||
|
||||
Args:
|
||||
users: List of user identifiers to filter.
|
||||
known_users: Set of known/valid user identifiers.
|
||||
|
||||
Returns:
|
||||
List of users that are not in the known_users set.
|
||||
"""
|
||||
return [user for user in users if user not in known_users]
|
||||
```
|
||||
|
||||
**Style Requirements:**
|
||||
|
||||
- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
|
||||
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
|
||||
- Avoid unnecessary abstraction or premature optimization
|
||||
- Follow existing patterns in the codebase you're modifying
|
||||
|
||||
### 3. Testing Requirements
|
||||
|
||||
**Every new feature or bugfix MUST be covered by unit tests.**
|
||||
|
||||
**Test Organization:**
|
||||
|
||||
- Unit tests: `tests/unit_tests/` (no network calls allowed)
|
||||
- Integration tests: `tests/integration_tests/` (network calls permitted)
|
||||
- Use `pytest` as the testing framework
|
||||
|
||||
**Test Quality Checklist:**
|
||||
|
||||
- [ ] Tests fail when your new logic is broken
|
||||
- [ ] Happy path is covered
|
||||
- [ ] Edge cases and error conditions are tested
|
||||
- [ ] Use fixtures/mocks for external dependencies
|
||||
- [ ] Tests are deterministic (no flaky tests)
|
||||
|
||||
Checklist questions:
|
||||
|
||||
- [ ] Does the test suite fail if your new logic is broken?
|
||||
- [ ] Are all expected behaviors exercised (happy path, invalid input, etc.)?
|
||||
- [ ] Do tests use fixtures or mocks where needed?
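
For the fixtures-or-mocks item above, one possible shape of a unit test that avoids network calls is sketched here. The `UserClient` protocol and `fetch_display_name` function are hypothetical names invented for the example; only `unittest.mock.Mock` comes from the standard library.

```python
from typing import Protocol
from unittest.mock import Mock


class UserClient(Protocol):
    """Hypothetical client interface; a real one would call a remote API."""

    def get_name(self, user_id: str) -> str: ...


def fetch_display_name(client: UserClient, user_id: str) -> str:
    """Return a display-ready name for a user.

    Args:
        client: Client used to look up the raw name.
        user_id: Identifier of the user.

    Returns:
        The user's name with surrounding whitespace removed and title-cased.
    """
    return client.get_name(user_id).strip().title()


def test_fetch_display_name_uses_client() -> None:
    """The lookup goes through the client and the result is normalized."""
    # The mock replaces the external dependency, so no network call is made.
    client = Mock()
    client.get_name.return_value = "  ada lovelace "

    result = fetch_display_name(client, "u1")

    assert result == "Ada Lovelace"
    client.get_name.assert_called_once_with("u1")
```

Because the client is passed in as an argument, the test stays deterministic and fails if the normalization logic is removed.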
|
||||
|
||||
```python
|
||||
def test_filter_unknown_users():
|
||||
"""Test filtering unknown users from a list."""
|
||||
users = ["alice", "bob", "charlie"]
|
||||
known_users = {"alice", "bob"}
|
||||
|
||||
result = filter_unknown_users(users, known_users)
|
||||
|
||||
assert result == ["charlie"]
|
||||
assert len(result) == 1
|
||||
```
|
||||
|
||||
### 4. Security and Risk Assessment
|
||||
|
||||
**Security Checklist:**
|
||||
|
||||
- No `eval()`, `exec()`, or `pickle` on user-controlled input
|
||||
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
|
||||
- Remove unreachable/commented code before committing
|
||||
- Watch for race conditions and resource leaks (file handles, sockets, threads).
|
||||
- Ensure proper resource cleanup (file handles, connections)
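
The exception-handling and cleanup items above could look roughly like the following sketch. The `read_json_config` function is hypothetical; the pattern it illustrates is catching specific exceptions, building the message in a `msg` variable, and letting a context manager release the file handle.

```python
import json
from pathlib import Path
from typing import Any


def read_json_config(path: Path) -> dict[str, Any]:
    """Read a JSON configuration file.

    Args:
        path: Location of the configuration file.

    Returns:
        The parsed configuration.

    Raises:
        ValueError: If the file is missing, is not valid JSON, or does not
            contain a JSON object.
    """
    try:
        # The context manager closes the file handle even if parsing fails.
        with path.open(encoding="utf-8") as config_file:
            config = json.load(config_file)
    except FileNotFoundError as e:
        msg = f"Config file not found: {path}"
        raise ValueError(msg) from e
    except json.JSONDecodeError as e:
        msg = f"Config file is not valid JSON: {path}"
        raise ValueError(msg) from e
    if not isinstance(config, dict):
        msg = f"Expected a JSON object in {path}, got {type(config).__name__}"
        raise ValueError(msg)
    return config
```

Raising with `from e` keeps the original traceback attached, which helps when the failure needs to be diagnosed later.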
|
||||
|
||||
❌ **Bad:**
|
||||
|
||||
```python
|
||||
def load_config(path):
|
||||
with open(path) as f:
|
||||
return eval(f.read()) # ⚠️ Never eval config
|
||||
```
|
||||
|
||||
✅ **Good:**
|
||||
|
||||
```python
|
||||
import json
|
||||
|
||||
def load_config(path: str) -> dict:
|
||||
with open(path) as f:
|
||||
return json.load(f)
|
||||
```
|
||||
|
||||
### 5. Documentation Standards
|
||||
|
||||
**Use Google-style docstrings with Args section for all public functions.**
|
||||
|
||||
❌ **Insufficient Documentation:**
|
||||
|
||||
```python
|
||||
def send_email(to, msg):
|
||||
"""Send an email to a recipient."""
|
||||
```
|
||||
|
||||
✅ **Complete Documentation:**
|
||||
|
||||
```python
|
||||
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
|
||||
"""
|
||||
Send an email to a recipient with specified priority.
|
||||
|
||||
Args:
|
||||
to: The email address of the recipient.
|
||||
msg: The message body to send.
|
||||
priority: Email priority level (`'low'`, `'normal'`, `'high'`).
|
||||
|
||||
Returns:
|
||||
`True` if email was sent successfully, `False` otherwise.
|
||||
|
||||
Raises:
|
||||
`InvalidEmailError`: If the email address format is invalid.
|
||||
`SMTPConnectionError`: If unable to connect to email server.
|
||||
"""
|
||||
```
|
||||
|
||||
**Documentation Guidelines:**
|
||||
|
||||
- Types go in function signatures, NOT in docstrings
|
||||
- If a default is present, DO NOT repeat it in the docstring unless there is post-processing or it is set conditionally.
|
||||
- Focus on "why" rather than "what" in descriptions
|
||||
- Document all parameters, return values, and exceptions
|
||||
- Keep descriptions concise but clear
|
||||
- Ensure American English spelling (e.g., "behavior", not "behaviour")
|
||||
|
||||
📌 *Tip:* Keep descriptions concise but clear. Only document return values if non-obvious.
|
||||
|
||||
### 6. Architectural Improvements
|
||||
|
||||
**When you encounter code that could be improved, suggest better designs:**
|
||||
|
||||
❌ **Poor Design:**
|
||||
|
||||
```python
|
||||
def process_data(data, db_conn, email_client, logger):
|
||||
# Function doing too many things
|
||||
validated = validate_data(data)
|
||||
result = db_conn.save(validated)
|
||||
email_client.send_notification(result)
|
||||
logger.log(f"Processed {len(data)} items")
|
||||
return result
|
||||
```
|
||||
|
||||
✅ **Better Design:**
|
||||
|
||||
```python
|
||||
@dataclass
|
||||
class ProcessingResult:
|
||||
"""Result of data processing operation."""
|
||||
items_processed: int
|
||||
success: bool
|
||||
errors: List[str] = field(default_factory=list)
|
||||
|
||||
class DataProcessor:
|
||||
"""Handles data validation, storage, and notification."""
|
||||
|
||||
def __init__(self, db_conn: Database, email_client: EmailClient):
|
||||
self.db = db_conn
|
||||
self.email = email_client
|
||||
|
||||
def process(self, data: List[dict]) -> ProcessingResult:
|
||||
"""Process and store data with notifications."""
|
||||
validated = self._validate_data(data)
|
||||
result = self.db.save(validated)
|
||||
self._notify_completion(result)
|
||||
return result
|
||||
```
|
||||
|
||||
**Design Improvement Areas:**
|
||||
|
||||
If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would:
|
||||
|
||||
- Reduce code duplication through shared utilities
|
||||
- Make unit testing easier
|
||||
- Improve separation of concerns (single responsibility)
|
||||
- Make unit testing easier through dependency injection
|
||||
- Add clarity without adding complexity
|
||||
- Prefer dataclasses for structured data
|
||||
|
||||
## Development Tools & Commands
|
||||
|
||||
### Package Management
|
||||
|
||||
```bash
|
||||
# Add package
|
||||
uv add package-name
|
||||
|
||||
# Sync project dependencies
|
||||
uv sync
|
||||
uv lock
|
||||
```
|
||||
|
||||
### Testing
|
||||
|
||||
```bash
|
||||
# Run unit tests (no network)
|
||||
make test
|
||||
|
||||
# Don't run integration tests, as API keys must be set
|
||||
|
||||
# Run specific test file
|
||||
uv run --group test pytest tests/unit_tests/test_specific.py
|
||||
```
|
||||
|
||||
### Code Quality
|
||||
|
||||
```bash
|
||||
# Lint code
|
||||
make lint
|
||||
@@ -64,118 +261,66 @@ make format
|
||||
uv run --group lint mypy .
|
||||
```
|
||||
|
||||
#### Key config files
|
||||
### Dependency Management Patterns
|
||||
|
||||
- pyproject.toml: Main workspace configuration with dependency groups
|
||||
- uv.lock: Locked dependencies for reproducible builds
|
||||
- Makefile: Development tasks
|
||||
**Local Development Dependencies:**
|
||||
|
||||
#### Commit standards
|
||||
|
||||
Suggest PR titles that follow Conventional Commits format. Refer to .github/workflows/pr_lint for allowed types and scopes.
|
||||
|
||||
#### Pull request guidelines
|
||||
|
||||
- Always add a disclaimer to the PR description mentioning how AI agents are involved with the contribution.
|
||||
- Describe the "why" of the changes and why the proposed solution is the right one. Limit prose.
|
||||
- Highlight areas of the proposed changes that require careful review.
|
||||
|
||||
## Core development principles
|
||||
|
||||
### Maintain stable public interfaces
|
||||
|
||||
CRITICAL: Always attempt to preserve function signatures, argument positions, and names for exported/public methods. Do not make breaking changes.
|
||||
|
||||
**Before making ANY changes to public APIs:**
|
||||
|
||||
- Check if the function/class is exported in `__init__.py`
|
||||
- Look for existing usage patterns in tests and examples
|
||||
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
|
||||
- Mark experimental features clearly with docstring warnings (using MkDocs Material admonitions, like `!!! warning`)
|
||||
|
||||
Ask: "Would this change break someone's code if they used it last week?"
|
||||
|
||||
### Code quality standards
|
||||
|
||||
All Python code MUST include type hints and return types.
|
||||
|
||||
```python title="Example"
|
||||
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
|
||||
"""Single line description of the function.
|
||||
|
||||
Any additional context about the function can go here.
|
||||
|
||||
Args:
|
||||
users: List of user identifiers to filter.
|
||||
known_users: Set of known/valid user identifiers.
|
||||
|
||||
Returns:
|
||||
List of users that are not in the known_users set.
|
||||
"""
|
||||
```toml
|
||||
[tool.uv.sources]
|
||||
langchain-core = { path = "../core", editable = true }
|
||||
langchain-tests = { path = "../standard-tests", editable = true }
|
||||
```
|
||||
|
||||
- Use descriptive, self-explanatory variable names.
|
||||
- Follow existing patterns in the codebase you're modifying
|
||||
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
|
||||
**For tools, use the `@tool` decorator from `langchain_core.tools`:**
|
||||
|
||||
### Testing requirements
|
||||
```python
|
||||
from langchain_core.tools import tool
|
||||
|
||||
Every new feature or bugfix MUST be covered by unit tests.
|
||||
|
||||
- Unit tests: `tests/unit_tests/` (no network calls allowed)
|
||||
- Integration tests: `tests/integration_tests/` (network calls permitted)
|
||||
- We use `pytest` as the testing framework; if in doubt, check other existing tests for examples.
|
||||
- The testing file structure should mirror the source code structure.
|
||||
|
||||
**Checklist:**
|
||||
|
||||
- [ ] Tests fail when your new logic is broken
|
||||
- [ ] Happy path is covered
|
||||
- [ ] Edge cases and error conditions are tested
|
||||
- [ ] Use fixtures/mocks for external dependencies
|
||||
- [ ] Tests are deterministic (no flaky tests)
|
||||
- [ ] Does the test suite fail if your new logic is broken?
|
||||
|
||||
### Security and risk assessment
|
||||
|
||||
- No `eval()`, `exec()`, or `pickle` on user-controlled input
|
||||
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
|
||||
- Remove unreachable/commented code before committing
|
||||
- Watch for race conditions and resource leaks (file handles, sockets, threads).
|
||||
- Ensure proper resource cleanup (file handles, connections)
|
||||
|
||||
### Documentation standards
|
||||
|
||||
Use Google-style docstrings with Args section for all public functions.
|
||||
|
||||
```python title="Example"
|
||||
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
|
||||
"""Send an email to a recipient with specified priority.
|
||||
|
||||
Any additional context about the function can go here.
|
||||
@tool
|
||||
def search_database(query: str) -> str:
|
||||
"""Search the database for relevant information.
|
||||
|
||||
Args:
|
||||
to: The email address of the recipient.
|
||||
msg: The message body to send.
|
||||
priority: Email priority level.
|
||||
|
||||
Returns:
|
||||
`True` if email was sent successfully, `False` otherwise.
|
||||
|
||||
Raises:
|
||||
InvalidEmailError: If the email address format is invalid.
|
||||
SMTPConnectionError: If unable to connect to email server.
|
||||
query: The search query string.
|
||||
"""
|
||||
# Implementation here
|
||||
return results
|
||||
```
|
||||
|
||||
- Types go in function signatures, NOT in docstrings
|
||||
- If a default is present, DO NOT repeat it in the docstring unless there is post-processing or it is set conditionally.
|
||||
- Focus on "why" rather than "what" in descriptions
|
||||
- Document all parameters, return values, and exceptions
|
||||
- Keep descriptions concise but clear
|
||||
- Ensure American English spelling (e.g., "behavior", not "behaviour")
|
||||
## Commit Standards
|
||||
|
||||
## Additional resources
|
||||
**Use Conventional Commits format for PR titles:**
|
||||
|
||||
- **Documentation:** https://docs.langchain.com/oss/python/langchain/overview and source at https://github.com/langchain-ai/docs or `../docs/`. Prefer the local install and use file search tools for best results. If needed, use the docs MCP server as defined in `.mcp.json` for programmatic access.
|
||||
- **Contributing Guide:** [`.github/CONTRIBUTING.md`](https://docs.langchain.com/oss/python/contributing/overview)
|
||||
- `feat(core): add multi-tenant support`
|
||||
- `fix(cli): resolve flag parsing error`
|
||||
- `docs: update API usage examples`
|
||||
- `docs(openai): update API usage examples`
|
||||
|
||||
## Framework-Specific Guidelines
|
||||
|
||||
- Follow the existing patterns in `langchain-core` for base abstractions
|
||||
- Use `langchain_core.callbacks` for execution tracking
|
||||
- Implement proper streaming support where applicable
|
||||
- Avoid deprecated components like legacy `LLMChain`
|
||||
|
||||
### Partner Integrations
|
||||
|
||||
- Follow the established patterns in existing partner libraries
|
||||
- Implement standard interfaces (`BaseChatModel`, `BaseEmbeddings`, etc.)
|
||||
- Include comprehensive integration tests
|
||||
- Document API key requirements and authentication
|
||||
|
||||
---
|
||||
|
||||
## Quick Reference Checklist
|
||||
|
||||
Before submitting code changes:
|
||||
|
||||
- [ ] **Breaking Changes**: Verified no public API changes
|
||||
- [ ] **Type Hints**: All functions have complete type annotations
|
||||
- [ ] **Tests**: New functionality is fully tested
|
||||
- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.)
|
||||
- [ ] **Documentation**: Google-style docstrings for public functions
|
||||
- [ ] **Code Quality**: `make lint` and `make format` pass
|
||||
- [ ] **Architecture**: Suggested improvements where applicable
|
||||
- [ ] **Commit Message**: Follows Conventional Commits format
|
||||
|
||||
9
MIGRATE.md
Normal file
@@ -0,0 +1,9 @@
|
||||
# Migrating
|
||||
|
||||
Please see the following guides for migrating LangChain code:
|
||||
|
||||
* Migrate to [LangChain v1.0](https://docs.langchain.com/oss/python/migrate/langchain-v1)
|
||||
* Migrate to [LangChain v0.3](https://python.langchain.com/docs/versions/v0_3/)
|
||||
* Migrate to [LangChain v0.2](https://python.langchain.com/docs/versions/v0_2/)
|
||||
* Migrating from [LangChain 0.0.x Chains](https://python.langchain.com/docs/versions/migrating_chains/)
|
||||
* Upgrade to [LangGraph Memory](https://python.langchain.com/docs/versions/migrating_memory/)
|
||||
80
SECURITY.md
Normal file
@@ -0,0 +1,80 @@
|
||||
# Security Policy
|
||||
|
||||
LangChain has a large ecosystem of integrations with various external resources like local and remote file systems, APIs and databases. These integrations allow developers to create versatile applications that combine the power of LLMs with the ability to access, interact with and manipulate external resources.
|
||||
|
||||
## Best practices
|
||||
|
||||
When building such applications, developers should remember to follow good security practices:
|
||||
|
||||
* [**Limit Permissions**](https://en.wikipedia.org/wiki/Principle_of_least_privilege): Scope permissions specifically to the application's need. Granting broad or excessive permissions can introduce significant security vulnerabilities. To avoid such vulnerabilities, consider using read-only credentials, disallowing access to sensitive resources, using sandboxing techniques (such as running inside a container), specifying proxy configurations to control external requests, etc., as appropriate for your application.
|
||||
* **Anticipate Potential Misuse**: Just as humans can err, so can Large Language Models (LLMs). Always assume that any system access or credentials may be used in any way allowed by the permissions they are assigned. For example, if a pair of database credentials allows deleting data, it's safest to assume that any LLM able to use those credentials may in fact delete data.
|
||||
* [**Defense in Depth**](https://en.wikipedia.org/wiki/Defense_in_depth_(computing)): No security technique is perfect. Fine-tuning and good chain design can reduce, but not eliminate, the odds that a Large Language Model (LLM) may make a mistake. It's best to combine multiple layered security approaches rather than relying on any single layer of defense to ensure security. For example: use both read-only permissions and sandboxing to ensure that LLMs are only able to access data that is explicitly meant for them to use.
|
||||
|
||||
Risks of not doing so include, but are not limited to:
|
||||
|
||||
* Data corruption or loss.
|
||||
* Unauthorized access to confidential information.
|
||||
* Compromised performance or availability of critical resources.
|
||||
|
||||
Example scenarios with mitigation strategies:
|
||||
|
||||
* A user may ask an agent with access to the file system to delete files that should not be deleted or read the content of files that contain sensitive information. To mitigate, limit the agent to only use a specific directory and only allow it to read or write files that are safe to read or write. Consider further sandboxing the agent by running it in a container.
|
||||
* A user may ask an agent with write access to an external API to write malicious data to the API, or delete data from that API. To mitigate, give the agent read-only API keys, or limit it to only use endpoints that are already resistant to such misuse.
|
||||
* A user may ask an agent with access to a database to drop a table or mutate the schema. To mitigate, scope the credentials to only the tables that the agent needs to access and consider issuing READ-ONLY credentials.
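
The first mitigation above, limiting an agent to a specific directory, might be approached along these lines. This is a sketch under stated assumptions rather than a LangChain API: `resolve_in_sandbox` is a hypothetical helper, and it is meant as one layer alongside OS-level sandboxing such as a container, not a replacement for it.

```python
from pathlib import Path


def resolve_in_sandbox(root: Path, requested: str) -> Path:
    """Resolve a requested path and ensure it stays inside `root`.

    Args:
        root: Directory the agent is allowed to access.
        requested: Path supplied by the agent, relative to `root`.

    Returns:
        The resolved absolute path inside `root`.

    Raises:
        PermissionError: If the resolved path escapes `root`.
    """
    root_resolved = root.resolve()
    candidate = (root_resolved / requested).resolve()
    # `resolve()` collapses `..` segments and follows symlinks, so the check
    # below applies to the real target rather than the literal string.
    if candidate != root_resolved and root_resolved not in candidate.parents:
        msg = f"Path {requested!r} is outside the allowed directory"
        raise PermissionError(msg)
    return candidate
```

A file tool would call this helper before every read or write. The check happens before the file is opened, so it does not by itself protect against the directory contents changing between the check and the access.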
|
||||
|
||||
If you're building applications that access external resources like file systems, APIs or databases, consider speaking with your company's security team to determine how to best design and secure your applications.
|
||||
|
||||
## Reporting OSS Vulnerabilities
|
||||
|
||||
LangChain is partnered with [huntr by Protect AI](https://huntr.com/) to provide
|
||||
a bounty program for our open source projects.
|
||||
|
||||
Please report security vulnerabilities associated with the LangChain
|
||||
open source projects at [huntr](https://huntr.com/bounties/disclose/?target=https%3A%2F%2Fgithub.com%2Flangchain-ai%2Flangchain&validSearch=true).
|
||||
|
||||
Before reporting a vulnerability, please review:
|
||||
|
||||
1) In-Scope Targets and Out-of-Scope Targets below.
|
||||
2) The [langchain-ai/langchain](https://docs.langchain.com/oss/python/contributing/code#repository-structure) monorepo structure.
|
||||
3) The [Best Practices](#best-practices) above to understand what we consider to be a security vulnerability vs. developer responsibility.
|
||||
|
||||
### In-Scope Targets
|
||||
|
||||
The following packages and repositories are eligible for bug bounties:
|
||||
|
||||
* langchain-core
|
||||
* langchain (see exceptions)
|
||||
* langchain-community (see exceptions)
|
||||
* langgraph
|
||||
* langserve
|
||||
|
||||
### Out of Scope Targets
|
||||
|
||||
All out of scope targets defined by huntr as well as:
|
||||
|
||||
* **langchain-experimental**: This repository is for experimental code and is not
|
||||
eligible for bug bounties (see [package warning](https://pypi.org/project/langchain-experimental/)). Bug reports to it will be marked as interesting or a waste of
|
||||
time and published with no bounty attached.
|
||||
* **tools**: Tools in either `langchain` or `langchain-community` are not eligible for bug
|
||||
bounties. This includes the following directories:
|
||||
* `libs/langchain/langchain/tools`
|
||||
* `libs/community/langchain_community/tools`
|
||||
* Please review the [Best Practices](#best-practices)
|
||||
for more details, but generally tools interact with the real world. Developers are
|
||||
expected to understand the security implications of their code and are responsible
|
||||
for the security of their tools.
|
||||
* Code documented with security notices. This will be decided on a case-by-case basis, but likely will not be eligible for a bounty as the code is already
|
||||
documented with guidelines for developers that should be followed for making their
|
||||
application secure.
|
||||
* Any LangSmith related repositories or APIs (see [Reporting LangSmith Vulnerabilities](#reporting-langsmith-vulnerabilities)).
|
||||
|
||||
## Reporting LangSmith Vulnerabilities
|
||||
|
||||
Please report security vulnerabilities associated with LangSmith by email to `security@langchain.dev`.
|
||||
|
||||
* LangSmith site: [https://smith.langchain.com](https://smith.langchain.com)
|
||||
* SDK client: [https://github.com/langchain-ai/langsmith-sdk](https://github.com/langchain-ai/langsmith-sdk)
|
||||
|
||||
### Other Security Concerns
|
||||
|
||||
For any other security concerns, please contact us at `security@langchain.dev`.
|
||||
@@ -1,20 +0,0 @@
|
||||
# Makefile for libs/ directory
|
||||
# Contains targets that operate across multiple packages
|
||||
|
||||
LANGCHAIN_DIRS = core text-splitters langchain langchain_v1 model-profiles
|
||||
|
||||
.PHONY: lock check-lock
|
||||
|
||||
# Regenerate lockfiles for all core packages
|
||||
lock:
|
||||
@for dir in $(LANGCHAIN_DIRS); do \
|
||||
echo "=== Locking $$dir ==="; \
|
||||
(cd $$dir && uv lock); \
|
||||
done
|
||||
|
||||
# Verify all lockfiles are up-to-date
|
||||
check-lock:
|
||||
@for dir in $(LANGCHAIN_DIRS); do \
|
||||
echo "=== Checking $$dir ==="; \
|
||||
(cd $$dir && uv lock --check) || exit 1; \
|
||||
done
|
||||
@@ -6,8 +6,9 @@ import hashlib
|
||||
import logging
|
||||
import re
|
||||
import shutil
|
||||
from collections.abc import Sequence
|
||||
from pathlib import Path
|
||||
from typing import TYPE_CHECKING, Any, TypedDict
|
||||
from typing import Any, TypedDict
|
||||
|
||||
from git import Repo
|
||||
|
||||
@@ -17,9 +18,6 @@ from langchain_cli.constants import (
|
||||
DEFAULT_GIT_SUBDIRECTORY,
|
||||
)
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Sequence
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
|
||||
@@ -1,11 +1,9 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from dataclasses import dataclass
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from .file import File
|
||||
from .folder import Folder
|
||||
from .file import File
|
||||
from .folder import Folder
|
||||
|
||||
|
||||
@dataclass
|
||||
|
||||
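Several hunks in this comparison move imports into or out of an `if TYPE_CHECKING:` block. The sketch below illustrates the general Python pattern only; the `describe` helper is hypothetical and not part of the repository. It assumes the annotation is quoted (or that `from __future__ import annotations` is in effect), so the guarded name is never evaluated at runtime.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only evaluated by static type checkers; skipped at runtime.
    from pathlib import Path


def describe(path: "Path") -> str:
    """Hypothetical helper; the quoted annotation is never evaluated at runtime."""
    return f"path={path}"


# The function works even though Path was never imported at runtime.
print(describe("docs/readme.md"))
# The annotation is stored as a plain string.
print(describe.__annotations__["path"])
```

An import generally has to stay unconditional when the name is needed at runtime, for example in an `isinstance` check or when a library resolves annotations while the class is being built.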
@@ -1,12 +1,9 @@
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import TYPE_CHECKING
|
||||
from pathlib import Path
|
||||
|
||||
from .file import File
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from pathlib import Path
|
||||
|
||||
|
||||
class Folder:
|
||||
def __init__(self, name: str, *files: Folder | File) -> None:
|
||||
|
||||
@@ -34,7 +34,7 @@ The LangChain ecosystem is built on top of `langchain-core`. Some of the benefit
|
||||
|
||||
## 📖 Documentation
|
||||
|
||||
For full documentation, see the [API reference](https://reference.langchain.com/python/langchain_core/). For conceptual guides, tutorials, and examples on using LangChain, see the [LangChain Docs](https://docs.langchain.com/oss/python/langchain/overview).
|
||||
For full documentation, see the [API reference](https://reference.langchain.com/python/langchain_core/).
|
||||
|
||||
## 📕 Releases & Versioning
|
||||
|
||||
|
||||
@@ -28,27 +28,6 @@ from pydantic.v1.fields import FieldInfo as FieldInfoV1
|
||||
from langchain_core._api.internal import is_caller_internal
|
||||
|
||||
|
||||
def _build_deprecation_message(
|
||||
*,
|
||||
alternative: str = "",
|
||||
alternative_import: str = "",
|
||||
) -> str:
|
||||
"""Build a simple deprecation message for `__deprecated__` attribute.
|
||||
|
||||
Args:
|
||||
alternative: An alternative API name.
|
||||
alternative_import: A fully qualified import path for the alternative.
|
||||
|
||||
Returns:
|
||||
A deprecation message string for IDE/type checker display.
|
||||
"""
|
||||
if alternative_import:
|
||||
return f"Use {alternative_import} instead."
|
||||
if alternative:
|
||||
return f"Use {alternative} instead."
|
||||
return "Deprecated."
|
||||
|
||||
|
||||
class LangChainDeprecationWarning(DeprecationWarning):
|
||||
"""A class for issuing deprecation warnings for LangChain users."""
|
||||
|
||||
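The `_build_deprecation_message` helper touched by this hunk picks its message by a simple precedence: the import path wins over the plain alternative name, and a generic text is the fallback. The standalone sketch below mirrors that precedence and shows how such a string could be attached as a `__deprecated__` attribute (the PEP 702 convention mentioned in the code comments). The names `build_message`, `mark_deprecated`, and `old_api` are hypothetical stand-ins, not LangChain APIs.

```python
def build_message(*, alternative: str = "", alternative_import: str = "") -> str:
    """Mirror the precedence shown in the hunk: import path, then name, then fallback."""
    if alternative_import:
        return f"Use {alternative_import} instead."
    if alternative:
        return f"Use {alternative} instead."
    return "Deprecated."


def mark_deprecated(obj, **kwargs):
    """Hypothetical helper: attach the message where tools that honor PEP 702 can read it."""
    obj.__deprecated__ = build_message(**kwargs)
    return obj


def old_api():
    return 1


mark_deprecated(old_api, alternative="new_api", alternative_import="pkg.new_api")
print(old_api.__deprecated__)
```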
@@ -102,57 +81,60 @@ def deprecated(
|
||||
) -> Callable[[T], T]:
|
||||
"""Decorator to mark a function, a class, or a property as deprecated.
|
||||
|
||||
When deprecating a classmethod, a staticmethod, or a property, the `@deprecated`
|
||||
decorator should go *under* `@classmethod` and `@staticmethod` (i.e., `deprecated`
|
||||
should directly decorate the underlying callable), but *over* `@property`.
|
||||
When deprecating a classmethod, a staticmethod, or a property, the
|
||||
`@deprecated` decorator should go *under* `@classmethod` and
|
||||
`@staticmethod` (i.e., `deprecated` should directly decorate the
|
||||
underlying callable), but *over* `@property`.
|
||||
|
||||
When deprecating a class `C` intended to be used as a base class in a multiple
|
||||
inheritance hierarchy, `C` *must* define an `__init__` method (if `C` instead
|
||||
inherited its `__init__` from its own base class, then `@deprecated` would mess up
|
||||
`__init__` inheritance when installing its own (deprecation-emitting) `C.__init__`).
|
||||
When deprecating a class `C` intended to be used as a base class in a
|
||||
multiple inheritance hierarchy, `C` *must* define an `__init__` method
|
||||
(if `C` instead inherited its `__init__` from its own base class, then
|
||||
`@deprecated` would mess up `__init__` inheritance when installing its
|
||||
own (deprecation-emitting) `C.__init__`).
|
||||
|
||||
Parameters are the same as for `warn_deprecated`, except that *obj_type* defaults to
|
||||
'class' if decorating a class, 'attribute' if decorating a property, and 'function'
|
||||
otherwise.
|
||||
Parameters are the same as for `warn_deprecated`, except that *obj_type*
|
||||
defaults to 'class' if decorating a class, 'attribute' if decorating a
|
||||
property, and 'function' otherwise.
|
||||
|
||||
Args:
|
||||
since: The release at which this API became deprecated.
|
||||
message: Override the default deprecation message.
|
||||
|
||||
The `%(since)s`, `%(name)s`, `%(alternative)s`, `%(obj_type)s`,
|
||||
`%(addendum)s`, and `%(removal)s` format specifiers will be replaced by the
|
||||
since:
|
||||
The release at which this API became deprecated.
|
||||
message:
|
||||
Override the default deprecation message. The %(since)s,
|
||||
%(name)s, %(alternative)s, %(obj_type)s, %(addendum)s,
|
||||
and %(removal)s format specifiers will be replaced by the
|
||||
values of the respective arguments passed to this function.
|
||||
name: The name of the deprecated object.
|
||||
alternative: An alternative API that the user may use in place of the deprecated
|
||||
API.
|
||||
|
||||
The deprecation warning will tell the user about this alternative if
|
||||
provided.
|
||||
alternative_import: An alternative import that the user may use instead.
|
||||
pending: If `True`, uses a `PendingDeprecationWarning` instead of a
|
||||
`DeprecationWarning`.
|
||||
|
||||
Cannot be used together with removal.
|
||||
obj_type: The object type being deprecated.
|
||||
addendum: Additional text appended directly to the final message.
|
||||
removal: The expected removal version.
|
||||
|
||||
With the default (an empty string), a removal version is automatically
|
||||
computed from since. Set to other Falsy values to not schedule a removal
|
||||
date.
|
||||
|
||||
Cannot be used together with pending.
|
||||
package: The package of the deprecated object.
|
||||
name:
|
||||
The name of the deprecated object.
|
||||
alternative:
|
||||
An alternative API that the user may use in place of the
|
||||
deprecated API. The deprecation warning will tell the user
|
||||
about this alternative if provided.
|
||||
alternative_import:
|
||||
An alternative import that the user may use instead.
|
||||
pending:
|
||||
If `True`, uses a `PendingDeprecationWarning` instead of a
|
||||
DeprecationWarning. Cannot be used together with removal.
|
||||
obj_type:
|
||||
The object type being deprecated.
|
||||
addendum:
|
||||
Additional text appended directly to the final message.
|
||||
removal:
|
||||
The expected removal version. With the default (an empty
|
||||
string), a removal version is automatically computed from
|
||||
since. Set to other Falsy values to not schedule a removal
|
||||
date. Cannot be used together with pending.
|
||||
package:
|
||||
The package of the deprecated object.
|
||||
|
||||
Returns:
|
||||
A decorator to mark a function or class as deprecated.
|
||||
|
||||
Example:
|
||||
```python
|
||||
@deprecated("1.4.0")
|
||||
def the_function_to_deprecate():
|
||||
pass
|
||||
```
|
||||
```python
|
||||
@deprecated("1.4.0")
|
||||
def the_function_to_deprecate():
|
||||
pass
|
||||
```
|
||||
"""
|
||||
_validate_deprecation_params(
|
||||
removal, alternative, alternative_import, pending=pending
|
||||
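The docstring above says the decorator should sit *under* `@classmethod` and `@staticmethod`, so that it wraps the underlying callable. The sketch below demonstrates that ordering with a deliberately minimal, hypothetical `simple_deprecated` decorator built on the standard `warnings` module; it is not the `langchain_core` implementation and omits the message templating, `pending`, and `removal` handling described in the docstring.

```python
import functools
import warnings


def simple_deprecated(since: str):
    """Hypothetical stand-in for a deprecation decorator (stdlib only)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__qualname__} is deprecated since {since}.",
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


class Example:
    # The deprecation decorator wraps the plain function first;
    # @classmethod / @staticmethod are applied on top of the wrapper.
    @classmethod
    @simple_deprecated("1.4.0")
    def build(cls):
        return cls()

    @staticmethod
    @simple_deprecated("1.4.0")
    def double(x):
        return x * 2
```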
@@ -241,11 +223,6 @@ def deprecated(
|
||||
obj.__init__ = functools.wraps(obj.__init__)( # type: ignore[misc]
|
||||
warn_if_direct_instance
|
||||
)
|
||||
# Set __deprecated__ for PEP 702 (IDE/type checker support)
|
||||
obj.__deprecated__ = _build_deprecation_message( # type: ignore[attr-defined]
|
||||
alternative=alternative,
|
||||
alternative_import=alternative_import,
|
||||
)
|
||||
return obj
|
||||
|
||||
elif isinstance(obj, FieldInfoV1):
|
||||
@@ -338,15 +315,12 @@ def deprecated(
|
||||
|
||||
def finalize(wrapper: Callable[..., Any], new_doc: str) -> T: # noqa: ARG001
|
||||
"""Finalize the property."""
|
||||
prop = _DeprecatedProperty(
|
||||
fget=obj.fget, fset=obj.fset, fdel=obj.fdel, doc=new_doc
|
||||
return cast(
|
||||
"T",
|
||||
_DeprecatedProperty(
|
||||
fget=obj.fget, fset=obj.fset, fdel=obj.fdel, doc=new_doc
|
||||
),
|
||||
)
|
||||
# Set __deprecated__ for PEP 702 (IDE/type checker support)
|
||||
prop.__deprecated__ = _build_deprecation_message( # type: ignore[attr-defined]
|
||||
alternative=alternative,
|
||||
alternative_import=alternative_import,
|
||||
)
|
||||
return cast("T", prop)
|
||||
|
||||
else:
|
||||
_name = _name or cast("type | Callable", obj).__qualname__
|
||||
@@ -369,11 +343,6 @@ def deprecated(
|
||||
"""
|
||||
wrapper = functools.wraps(wrapped)(wrapper)
|
||||
wrapper.__doc__ = new_doc
|
||||
# Set __deprecated__ for PEP 702 (IDE/type checker support)
|
||||
wrapper.__deprecated__ = _build_deprecation_message( # type: ignore[attr-defined]
|
||||
alternative=alternative,
|
||||
alternative_import=alternative_import,
|
||||
)
|
||||
return cast("T", wrapper)
|
||||
|
||||
old_doc = inspect.cleandoc(old_doc or "").strip("\n")
|
||||
@@ -429,7 +398,7 @@ def deprecated(
|
||||
|
||||
@contextlib.contextmanager
|
||||
def suppress_langchain_deprecation_warning() -> Generator[None, None, None]:
|
||||
"""Context manager to suppress `LangChainDeprecationWarning`."""
|
||||
"""Context manager to suppress LangChainDeprecationWarning."""
|
||||
with warnings.catch_warnings():
|
||||
warnings.simplefilter("ignore", LangChainDeprecationWarning)
|
||||
warnings.simplefilter("ignore", LangChainPendingDeprecationWarning)
|
||||
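The context manager in this hunk relies on `warnings.catch_warnings()` to scope an "ignore" filter to a block. The sketch below reproduces the same pattern with a hypothetical `DemoDeprecationWarning` class so it runs without LangChain installed; `count_warnings` exists only to make the effect observable.

```python
import contextlib
import warnings


class DemoDeprecationWarning(DeprecationWarning):
    """Hypothetical warning class standing in for a library-specific one."""


@contextlib.contextmanager
def suppress_demo_deprecation_warning():
    # Filters installed here are discarded when the block exits.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DemoDeprecationWarning)
        yield


def count_warnings(suppress: bool) -> int:
    """Emit one warning, optionally inside the suppressing context, and count what escapes."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if suppress:
            with suppress_demo_deprecation_warning():
                warnings.warn("old API", DemoDeprecationWarning)
        else:
            warnings.warn("old API", DemoDeprecationWarning)
    return len(caught)
```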
@@ -452,33 +421,35 @@ def warn_deprecated(
|
||||
"""Display a standardized deprecation.
|
||||
|
||||
Args:
|
||||
since: The release at which this API became deprecated.
|
||||
message: Override the default deprecation message.
|
||||
|
||||
The `%(since)s`, `%(name)s`, `%(alternative)s`, `%(obj_type)s`,
|
||||
`%(addendum)s`, and `%(removal)s` format specifiers will be replaced by the
|
||||
since:
|
||||
The release at which this API became deprecated.
|
||||
message:
|
||||
Override the default deprecation message. The %(since)s,
|
||||
%(name)s, %(alternative)s, %(obj_type)s, %(addendum)s,
|
||||
and %(removal)s format specifiers will be replaced by the
|
||||
values of the respective arguments passed to this function.
|
||||
name: The name of the deprecated object.
|
||||
alternative: An alternative API that the user may use in place of the
|
||||
deprecated API.
|
||||
|
||||
The deprecation warning will tell the user about this alternative if
|
||||
provided.
|
||||
alternative_import: An alternative import that the user may use instead.
|
||||
pending: If `True`, uses a `PendingDeprecationWarning` instead of a
|
||||
`DeprecationWarning`.
|
||||
|
||||
Cannot be used together with removal.
|
||||
obj_type: The object type being deprecated.
|
||||
addendum: Additional text appended directly to the final message.
|
||||
removal: The expected removal version.
|
||||
|
||||
With the default (an empty string), a removal version is automatically
|
||||
computed from since. Set to other Falsy values to not schedule a removal
|
||||
date.
|
||||
|
||||
Cannot be used together with pending.
|
||||
package: The package of the deprecated object.
|
||||
name:
|
||||
The name of the deprecated object.
|
||||
alternative:
|
||||
An alternative API that the user may use in place of the
|
||||
deprecated API. The deprecation warning will tell the user
|
||||
about this alternative if provided.
|
||||
alternative_import:
|
||||
An alternative import that the user may use instead.
|
||||
pending:
|
||||
If `True`, uses a `PendingDeprecationWarning` instead of a
|
||||
DeprecationWarning. Cannot be used together with removal.
|
||||
obj_type:
|
||||
The object type being deprecated.
|
||||
addendum:
|
||||
Additional text appended directly to the final message.
|
||||
removal:
|
||||
The expected removal version. With the default (an empty
|
||||
string), a removal version is automatically computed from
|
||||
since. Set to other Falsy values to not schedule a removal
|
||||
date. Cannot be used together with pending.
|
||||
package:
|
||||
The package of the deprecated object.
|
||||
"""
|
||||
if not pending:
|
||||
if not removal:
|
||||
@@ -563,8 +534,8 @@ def rename_parameter(
|
||||
"""Decorator indicating that parameter *old* of *func* is renamed to *new*.
|
||||
|
||||
The actual implementation of *func* should use *new*, not *old*. If *old* is passed
|
||||
to *func*, a `DeprecationWarning` is emitted, and its value is used, even if *new*
|
||||
is also passed by keyword.
|
||||
to *func*, a DeprecationWarning is emitted, and its value is used, even if *new* is
|
||||
also passed by keyword.
|
||||
|
||||
Args:
|
||||
since: The version in which the parameter was renamed.
|
||||
|
||||
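As a rough illustration of the behavior the `rename_parameter` docstring describes, the sketch below forwards a value passed under the old keyword to the new one and emits a `DeprecationWarning`. It follows the docstring wording only (the old value wins if both are given); the decorator name `rename_parameter_sketch` and the `describe` function are hypothetical, and the real implementation may differ in its details.

```python
import functools
import warnings


def rename_parameter_sketch(*, since: str, old: str, new: str):
    """Hypothetical stdlib-only sketch of a parameter-renaming decorator."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if old in kwargs:
                warnings.warn(
                    f"The parameter `{old}` of `{func.__name__}` was renamed "
                    f"to `{new}` in {since}.",
                    DeprecationWarning,
                    stacklevel=2,
                )
                # Per the docstring, the value passed as *old* is the one used.
                kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)

        return wrapper

    return decorator


@rename_parameter_sketch(since="2.0", old="bad_name", new="good_name")
def describe(good_name):
    return f"value={good_name}"
```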
@@ -3,7 +3,6 @@
|
||||
Distinct from provider-based [prompt caching](https://docs.langchain.com/oss/python/langchain/models#prompt-caching).
|
||||
|
||||
!!! warning "Beta feature"
|
||||
|
||||
This is a beta feature. Please be wary of deploying experimental code to production
|
||||
unless you've taken appropriate precautions.
|
||||
|
||||
|
||||
@@ -5,12 +5,13 @@ from __future__ import annotations
|
||||
import logging
|
||||
from typing import TYPE_CHECKING, Any
|
||||
|
||||
from typing_extensions import Self
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Sequence
|
||||
from uuid import UUID
|
||||
|
||||
from tenacity import RetryCallState
|
||||
from typing_extensions import Self
|
||||
|
||||
from langchain_core.agents import AgentAction, AgentFinish
|
||||
from langchain_core.documents import Document
|
||||
|
||||
@@ -6,6 +6,7 @@ import asyncio
|
||||
import atexit
|
||||
import functools
|
||||
import logging
|
||||
import uuid
|
||||
from abc import ABC, abstractmethod
|
||||
from collections.abc import Callable
|
||||
from concurrent.futures import ThreadPoolExecutor
|
||||
@@ -38,9 +39,9 @@ from langchain_core.tracers.context import (
|
||||
tracing_v2_callback_var,
|
||||
)
|
||||
from langchain_core.tracers.langchain import LangChainTracer
|
||||
from langchain_core.tracers.schemas import Run
|
||||
from langchain_core.tracers.stdout import ConsoleCallbackHandler
|
||||
from langchain_core.utils.env import env_var_is_set
|
||||
from langchain_core.utils.uuid import uuid7
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import AsyncGenerator, Coroutine, Generator, Sequence
|
||||
@@ -51,7 +52,6 @@ if TYPE_CHECKING:
|
||||
from langchain_core.documents import Document
|
||||
from langchain_core.outputs import ChatGenerationChunk, GenerationChunk, LLMResult
|
||||
from langchain_core.runnables.config import RunnableConfig
|
||||
from langchain_core.tracers.schemas import Run
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -504,7 +504,7 @@ class BaseRunManager(RunManagerMixin):
|
||||
|
||||
"""
|
||||
return cls(
|
||||
run_id=uuid7(),
|
||||
run_id=uuid.uuid4(),
|
||||
handlers=[],
|
||||
inheritable_handlers=[],
|
||||
tags=[],
|
||||
@@ -1330,7 +1330,7 @@ class CallbackManager(BaseCallbackManager):
|
||||
managers = []
|
||||
for i, prompt in enumerate(prompts):
|
||||
# Can't have duplicate runs with the same run ID (if provided)
|
||||
run_id_ = run_id if i == 0 and run_id is not None else uuid7()
|
||||
run_id_ = run_id if i == 0 and run_id is not None else uuid.uuid4()
|
||||
handle_event(
|
||||
self.handlers,
|
||||
"on_llm_start",
|
||||
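The comment in this hunk notes that a caller-supplied run ID can only be used once, so it is applied to the first prompt and every later prompt gets a fresh ID. A minimal sketch of that rule, using `uuid.uuid4()` as in one side of the diff; the function name `assign_run_ids` is hypothetical.

```python
import uuid


def assign_run_ids(prompts, run_id=None):
    """Give the first prompt the provided run_id (if any); generate IDs for the rest."""
    ids = []
    for i, _prompt in enumerate(prompts):
        # Reusing the same ID for several runs would create duplicates.
        run_id_ = run_id if i == 0 and run_id is not None else uuid.uuid4()
        ids.append(run_id_)
    return ids


fixed = uuid.uuid4()
ids = assign_run_ids(["first", "second", "third"], run_id=fixed)
print(ids[0] == fixed, len(set(ids)))
```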
@@ -1384,7 +1384,7 @@ class CallbackManager(BaseCallbackManager):
|
||||
run_id_ = run_id
|
||||
run_id = None
|
||||
else:
|
||||
run_id_ = uuid7()
|
||||
run_id_ = uuid.uuid4()
|
||||
handle_event(
|
||||
self.handlers,
|
||||
"on_chat_model_start",
|
||||
@@ -1433,7 +1433,7 @@ class CallbackManager(BaseCallbackManager):
|
||||
|
||||
"""
|
||||
if run_id is None:
|
||||
run_id = uuid7()
|
||||
run_id = uuid.uuid4()
|
||||
handle_event(
|
||||
self.handlers,
|
||||
"on_chain_start",
|
||||
@@ -1488,7 +1488,7 @@ class CallbackManager(BaseCallbackManager):
|
||||
|
||||
"""
|
||||
if run_id is None:
|
||||
run_id = uuid7()
|
||||
run_id = uuid.uuid4()
|
||||
|
||||
handle_event(
|
||||
self.handlers,
|
||||
@@ -1537,7 +1537,7 @@ class CallbackManager(BaseCallbackManager):
|
||||
The callback manager for the retriever run.
|
||||
"""
|
||||
if run_id is None:
|
||||
run_id = uuid7()
|
||||
run_id = uuid.uuid4()
|
||||
|
||||
handle_event(
|
||||
self.handlers,
|
||||
@@ -1594,7 +1594,7 @@ class CallbackManager(BaseCallbackManager):
|
||||
)
|
||||
raise ValueError(msg)
|
||||
if run_id is None:
|
||||
run_id = uuid7()
|
||||
run_id = uuid.uuid4()
|
||||
|
||||
handle_event(
|
||||
self.handlers,
|
||||
@@ -1816,7 +1816,7 @@ class AsyncCallbackManager(BaseCallbackManager):
|
||||
run_id_ = run_id
|
||||
run_id = None
|
||||
else:
|
||||
run_id_ = uuid7()
|
||||
run_id_ = uuid.uuid4()
|
||||
|
||||
if inline_handlers:
|
||||
inline_tasks.append(
|
||||
@@ -1900,7 +1900,7 @@ class AsyncCallbackManager(BaseCallbackManager):
|
||||
run_id_ = run_id
|
||||
run_id = None
|
||||
else:
|
||||
run_id_ = uuid7()
|
||||
run_id_ = uuid.uuid4()
|
||||
|
||||
for handler in self.handlers:
|
||||
task = ahandle_event(
|
||||
@@ -1962,7 +1962,7 @@ class AsyncCallbackManager(BaseCallbackManager):
|
||||
The async callback manager for the chain run.
|
||||
"""
|
||||
if run_id is None:
|
||||
run_id = uuid7()
|
||||
run_id = uuid.uuid4()
|
||||
|
||||
await ahandle_event(
|
||||
self.handlers,
|
||||
@@ -2010,7 +2010,7 @@ class AsyncCallbackManager(BaseCallbackManager):
|
||||
The async callback manager for the tool run.
|
||||
"""
|
||||
if run_id is None:
|
||||
run_id = uuid7()
|
||||
run_id = uuid.uuid4()
|
||||
|
||||
await ahandle_event(
|
||||
self.handlers,
|
||||
@@ -2060,7 +2060,7 @@ class AsyncCallbackManager(BaseCallbackManager):
|
||||
if not self.handlers:
|
||||
return
|
||||
if run_id is None:
|
||||
run_id = uuid7()
|
||||
run_id = uuid.uuid4()
|
||||
|
||||
if kwargs:
|
||||
msg = (
|
||||
@@ -2102,7 +2102,7 @@ class AsyncCallbackManager(BaseCallbackManager):
|
||||
The async callback manager for the retriever run.
|
||||
"""
|
||||
if run_id is None:
|
||||
run_id = uuid7()
|
||||
run_id = uuid.uuid4()
|
||||
|
||||
await ahandle_event(
|
||||
self.handlers,
|
||||
|
||||
@@ -95,7 +95,7 @@ def get_usage_metadata_callback(
|
||||
"""Get usage metadata callback.
|
||||
|
||||
Get context manager for tracking usage metadata across chat model calls using
|
||||
[`AIMessage.usage_metadata`][langchain.messages.AIMessage.usage_metadata].
|
||||
`AIMessage.usage_metadata`.
|
||||
|
||||
Args:
|
||||
name: The name of the context variable.
|
||||
|
||||
@@ -11,7 +11,7 @@ from langchain_core.prompts.prompt import PromptTemplate
|
||||
|
||||
|
||||
def _get_length_based(text: str) -> int:
|
||||
return len(re.split(r"\n| ", text))
|
||||
return len(re.split("\n| ", text))
|
||||
|
||||
|
||||
class LengthBasedExampleSelector(BaseExampleSelector, BaseModel):
|
||||
|
||||
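The two variants of `_get_length_based` differ only in whether the pattern is a raw string. Both split on a newline or a single space, so the "length" is a rough word count. The sketch below, with a hypothetical public name `get_length_based`, shows the equivalence and one consequence of splitting on single spaces.

```python
import re


def get_length_based(text: str) -> int:
    """Count the pieces produced by splitting on a newline or a single space."""
    return len(re.split(r"\n| ", text))


sample = "one two\nthree"
# The raw pattern (regex escape \n) and the non-raw pattern (literal newline)
# match the same characters.
print(get_length_based(sample), len(re.split("\n| ", sample)))
# Consecutive spaces yield empty strings, which are counted as well.
print(re.split(r"\n| ", "a  b"))
```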
@@ -6,9 +6,16 @@ import hashlib
|
||||
import json
|
||||
import uuid
|
||||
import warnings
|
||||
from collections.abc import (
|
||||
AsyncIterable,
|
||||
AsyncIterator,
|
||||
Callable,
|
||||
Iterable,
|
||||
Iterator,
|
||||
Sequence,
|
||||
)
|
||||
from itertools import islice
|
||||
from typing import (
|
||||
TYPE_CHECKING,
|
||||
Any,
|
||||
Literal,
|
||||
TypedDict,
|
||||
@@ -22,16 +29,6 @@ from langchain_core.exceptions import LangChainException
|
||||
from langchain_core.indexing.base import DocumentIndex, RecordManager
|
||||
from langchain_core.vectorstores import VectorStore
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import (
|
||||
AsyncIterable,
|
||||
AsyncIterator,
|
||||
Callable,
|
||||
Iterable,
|
||||
Iterator,
|
||||
Sequence,
|
||||
)
|
||||
|
||||
# Magic UUID to use as a namespace for hashing.
|
||||
# Used to try and generate a unique UUID for each document
|
||||
# from hashing the document content and metadata.
|
||||
@@ -302,7 +299,6 @@ def index(
|
||||
are not able to specify the uid of the document.
|
||||
|
||||
!!! warning "Behavior changed in `langchain-core` 0.3.25"
|
||||
|
||||
Added `scoped_full` cleanup mode.
|
||||
|
||||
!!! warning
|
||||
@@ -641,7 +637,6 @@ async def aindex(
|
||||
are not able to specify the uid of the document.
|
||||
|
||||
!!! warning "Behavior changed in `langchain-core` 0.3.25"
|
||||
|
||||
Added `scoped_full` cleanup mode.
|
||||
|
||||
!!! warning
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
"""Core language model abstractions.
|
||||
"""Language models.
|
||||
|
||||
LangChain has two main classes to work with language models: chat models and
|
||||
"old-fashioned" LLMs (string-in, string-out).
|
||||
"old-fashioned" LLMs.
|
||||
|
||||
**Chat models**
|
||||
|
||||
@@ -11,16 +11,14 @@ as outputs (as opposed to using plain text).
|
||||
Chat models support the assignment of distinct roles to conversation messages, helping
|
||||
to distinguish messages from the AI, users, and instructions such as system messages.
|
||||
|
||||
The key abstraction for chat models is
|
||||
[`BaseChatModel`][langchain_core.language_models.BaseChatModel]. Implementations should
|
||||
inherit from this class.
|
||||
The key abstraction for chat models is `BaseChatModel`. Implementations should inherit
|
||||
from this class.
|
||||
|
||||
See existing [chat model integrations](https://docs.langchain.com/oss/python/integrations/chat).
|
||||
|
||||
**LLMs (legacy)**
|
||||
**LLMs**
|
||||
|
||||
Language models that take a string as input and return a string.
|
||||
|
||||
These are traditionally older models (newer models generally are chat models).
|
||||
|
||||
Although the underlying models are string in, string out, the LangChain wrappers also
|
||||
@@ -55,10 +53,6 @@ if TYPE_CHECKING:
|
||||
ParrotFakeChatModel,
|
||||
)
|
||||
from langchain_core.language_models.llms import LLM, BaseLLM
|
||||
from langchain_core.language_models.model_profile import (
|
||||
ModelProfile,
|
||||
ModelProfileRegistry,
|
||||
)
|
||||
|
||||
__all__ = (
|
||||
"LLM",
|
||||
@@ -74,8 +68,6 @@ __all__ = (
|
||||
"LanguageModelInput",
|
||||
"LanguageModelLike",
|
||||
"LanguageModelOutput",
|
||||
"ModelProfile",
|
||||
"ModelProfileRegistry",
|
||||
"ParrotFakeChatModel",
|
||||
"SimpleChatModel",
|
||||
"get_tokenizer",
|
||||
@@ -98,8 +90,6 @@ _dynamic_imports = {
|
||||
"GenericFakeChatModel": "fake_chat_models",
|
||||
"ParrotFakeChatModel": "fake_chat_models",
|
||||
"LLM": "llms",
|
||||
"ModelProfile": "model_profile",
|
||||
"ModelProfileRegistry": "model_profile",
|
||||
"BaseLLM": "llms",
|
||||
"is_openai_data_block": "_utils",
|
||||
}
|
||||
|
||||
@@ -140,7 +140,6 @@ def _normalize_messages(
|
||||
- LangChain v0 standard content blocks for backward compatibility
|
||||
|
||||
!!! warning "Behavior changed in `langchain-core` 1.0.0"
|
||||
|
||||
In previous versions, this function returned messages in LangChain v0 format.
|
||||
Now, it returns messages in LangChain v1 format, which upgraded chat models now
|
||||
expect to receive when passing back in message history. For backward
|
||||
|
||||
@@ -299,9 +299,6 @@ class BaseLanguageModel(
|
||||
|
||||
Useful for checking if an input fits in a model's context window.
|
||||
|
||||
This should be overridden by model-specific implementations to provide accurate
|
||||
token counts via model-specific tokenizers.
|
||||
|
||||
Args:
|
||||
text: The string input to tokenize.
|
||||
|
||||
@@ -320,17 +317,9 @@ class BaseLanguageModel(
|
||||
|
||||
Useful for checking if an input fits in a model's context window.
|
||||
|
||||
This should be overridden by model-specific implementations to provide accurate
|
||||
token counts via model-specific tokenizers.
|
||||
|
||||
!!! note
|
||||
|
||||
* The base implementation of `get_num_tokens_from_messages` ignores tool
|
||||
schemas.
|
||||
* The base implementation of `get_num_tokens_from_messages` adds additional
|
||||
prefixes to messages to represent user roles, which will add to the
|
||||
overall token count. Model-specific implementations may choose to
|
||||
handle this differently.
|
||||
The base implementation of `get_num_tokens_from_messages` ignores tool
|
||||
schemas.
|
||||
|
||||
Args:
|
||||
messages: The message inputs to tokenize.
|
||||
|
||||
@@ -15,6 +15,7 @@ from typing import TYPE_CHECKING, Any, Literal, cast
|
||||
from pydantic import BaseModel, ConfigDict, Field
|
||||
from typing_extensions import override
|
||||
|
||||
from langchain_core._api.beta_decorator import beta
|
||||
from langchain_core.caches import BaseCache
|
||||
from langchain_core.callbacks import (
|
||||
AsyncCallbackManager,
|
||||
@@ -33,7 +34,6 @@ from langchain_core.language_models.base import (
|
||||
LangSmithParams,
|
||||
LanguageModelInput,
|
||||
)
|
||||
from langchain_core.language_models.model_profile import ModelProfile
|
||||
from langchain_core.load import dumpd, dumps
|
||||
from langchain_core.messages import (
|
||||
AIMessage,
|
||||
@@ -76,6 +76,8 @@ from langchain_core.utils.utils import LC_ID_PREFIX, from_env
|
||||
if TYPE_CHECKING:
|
||||
import uuid
|
||||
|
||||
from langchain_model_profiles import ModelProfile # type: ignore[import-untyped]
|
||||
|
||||
from langchain_core.output_parsers.base import OutputParserLike
|
||||
from langchain_core.runnables import Runnable, RunnableConfig
|
||||
from langchain_core.tools import BaseTool
|
||||
@@ -89,10 +91,7 @@ def _generate_response_from_error(error: BaseException) -> list[ChatGeneration]:
|
||||
try:
|
||||
metadata["body"] = response.json()
|
||||
except Exception:
|
||||
try:
|
||||
metadata["body"] = getattr(response, "text", None)
|
||||
except Exception:
|
||||
metadata["body"] = None
|
||||
metadata["body"] = getattr(response, "text", None)
|
||||
if hasattr(response, "headers"):
|
||||
try:
|
||||
metadata["headers"] = dict(response.headers)
|
||||
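The two variants in this hunk differ in whether reading `response.text` is itself guarded. The sketch below shows the nested-fallback form with three hypothetical response classes; `extract_body` is an illustrative name, not a function in `langchain_core`.

```python
def extract_body(response):
    """Prefer parsed JSON, fall back to raw text, and finally to None."""
    try:
        return response.json()
    except Exception:
        try:
            return getattr(response, "text", None)
        except Exception:
            # Accessing .text can itself fail (e.g. an unread stream).
            return None


class JsonResponse:
    def json(self):
        return {"error": "rate limited"}


class TextOnlyResponse:
    text = "plain failure"

    def json(self):
        raise ValueError("not JSON")


class BrokenResponse:
    def json(self):
        raise ValueError("not JSON")

    @property
    def text(self):
        raise RuntimeError("body unavailable")
```

Note that `getattr(obj, name, default)` only swallows `AttributeError`, which is why the inner `try` matters for the `BrokenResponse` case.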
@@ -333,26 +332,10 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
|
||||
[`langchain-openai`](https://pypi.org/project/langchain-openai)) can also use this
|
||||
field to roll out new content formats in a backward-compatible way.
|
||||
|
||||
!!! version-added "Added in `langchain-core` 1.0.0"
|
||||
!!! version-added "Added in `langchain-core` 1.0"
|
||||
|
||||
"""
|
||||
|
||||
profile: ModelProfile | None = Field(default=None, exclude=True)
|
||||
"""Profile detailing model capabilities.
|
||||
|
||||
!!! warning "Beta feature"
|
||||
|
||||
This is a beta feature. The format of model profiles is subject to change.
|
||||
|
||||
If not specified, automatically loaded from the provider package on initialization
|
||||
if data is available.
|
||||
|
||||
Example profile data includes context window sizes, supported modalities, or support
|
||||
for tool calling, structured output, and other features.
|
||||
|
||||
!!! version-added "Added in `langchain-core` 1.1.0"
|
||||
"""
|
||||
|
||||
model_config = ConfigDict(
|
||||
arbitrary_types_allowed=True,
|
||||
)
|
||||
@@ -548,7 +531,7 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
|
||||
):
|
||||
if block["type"] != index_type:
|
||||
index_type = block["type"]
|
||||
index += 1
|
||||
index = index + 1
|
||||
if "index" not in block:
|
||||
block["index"] = index
|
||||
run_manager.on_llm_new_token(
|
||||
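The repeated `index` / `index_type` lines in these streaming hunks bump a counter each time the content-block type changes and stamp blocks that lack an `index`. A self-contained sketch of that loop follows; the starting values (`-1` and an empty string) and the function name `assign_block_indexes` are assumptions, since the hunks do not show the initialization.

```python
def assign_block_indexes(blocks, index=-1, index_type=""):
    """Assign an index to each block, advancing when the block type changes."""
    for block in blocks:
        if block["type"] != index_type:
            index_type = block["type"]
            index += 1
        # Blocks that already carry an index are left untouched.
        if "index" not in block:
            block["index"] = index
    return blocks


blocks = [
    {"type": "text"},
    {"type": "text"},
    {"type": "reasoning"},
    {"type": "text"},
]
print([b["index"] for b in assign_block_indexes(blocks)])
```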
@@ -680,7 +663,7 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
|
||||
):
|
||||
if block["type"] != index_type:
|
||||
index_type = block["type"]
|
||||
index += 1
|
||||
index = index + 1
|
||||
if "index" not in block:
|
||||
block["index"] = index
|
||||
await run_manager.on_llm_new_token(
|
||||
@@ -731,7 +714,7 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
|
||||
|
||||
# --- Custom methods ---
|
||||
|
||||
def _combine_llm_outputs(self, _llm_outputs: list[dict | None], /) -> dict:
|
||||
def _combine_llm_outputs(self, llm_outputs: list[dict | None]) -> dict: # noqa: ARG002
|
||||
return {}
|
||||
|
||||
def _convert_cached_generations(self, cache_val: list) -> list[ChatGeneration]:
|
||||
@@ -1188,7 +1171,7 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
|
||||
):
|
||||
if block["type"] != index_type:
|
||||
index_type = block["type"]
|
||||
index += 1
|
||||
index = index + 1
|
||||
if "index" not in block:
|
||||
block["index"] = index
|
||||
if run_manager:
|
||||
@@ -1306,7 +1289,7 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
|
||||
):
|
||||
if block["type"] != index_type:
|
||||
index_type = block["type"]
|
||||
index += 1
|
||||
index = index + 1
|
||||
if "index" not in block:
|
||||
block["index"] = index
|
||||
if run_manager:
|
||||
@@ -1579,89 +1562,88 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
|
||||
depends on the `schema` as described above.
|
||||
- `'parsing_error'`: `BaseException | None`
|
||||
|
||||
???+ example "Pydantic schema (`include_raw=False`)"
|
||||
Example: Pydantic schema (`include_raw=False`):
|
||||
|
||||
```python
|
||||
from pydantic import BaseModel
|
||||
```python
|
||||
from pydantic import BaseModel
|
||||
|
||||
|
||||
class AnswerWithJustification(BaseModel):
|
||||
'''An answer to the user question along with justification for the answer.'''
|
||||
class AnswerWithJustification(BaseModel):
|
||||
'''An answer to the user question along with justification for the answer.'''
|
||||
|
||||
answer: str
|
||||
justification: str
|
||||
answer: str
|
||||
justification: str
|
||||
|
||||
|
||||
model = ChatModel(model="model-name", temperature=0)
|
||||
structured_model = model.with_structured_output(AnswerWithJustification)
|
||||
model = ChatModel(model="model-name", temperature=0)
|
||||
structured_model = model.with_structured_output(AnswerWithJustification)
|
||||
|
||||
structured_model.invoke(
|
||||
"What weighs more a pound of bricks or a pound of feathers"
|
||||
)
|
||||
structured_model.invoke(
|
||||
"What weighs more a pound of bricks or a pound of feathers"
|
||||
)
|
||||
|
||||
# -> AnswerWithJustification(
|
||||
# answer='They weigh the same',
|
||||
# justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'
|
||||
# )
|
||||
```
|
||||
# -> AnswerWithJustification(
|
||||
# answer='They weigh the same',
|
||||
# justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'
|
||||
# )
|
||||
```
|
||||
|
||||
??? example "Pydantic schema (`include_raw=True`)"
|
||||
Example: Pydantic schema (`include_raw=True`):
|
||||
|
||||
```python
|
||||
from pydantic import BaseModel
|
||||
```python
|
||||
from pydantic import BaseModel
|
||||
|
||||
|
||||
class AnswerWithJustification(BaseModel):
|
||||
'''An answer to the user question along with justification for the answer.'''
|
||||
class AnswerWithJustification(BaseModel):
|
||||
'''An answer to the user question along with justification for the answer.'''
|
||||
|
||||
answer: str
|
||||
justification: str
|
||||
answer: str
|
||||
justification: str
|
||||
|
||||
|
||||
model = ChatModel(model="model-name", temperature=0)
|
||||
structured_model = model.with_structured_output(
|
||||
AnswerWithJustification, include_raw=True
|
||||
)
|
||||
model = ChatModel(model="model-name", temperature=0)
|
||||
structured_model = model.with_structured_output(
|
||||
AnswerWithJustification, include_raw=True
|
||||
)
|
||||
|
||||
structured_model.invoke(
|
||||
"What weighs more a pound of bricks or a pound of feathers"
|
||||
)
|
||||
# -> {
|
||||
# 'raw': AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_Ao02pnFYXD6GN1yzc0uXPsvF', 'function': {'arguments': '{"answer":"They weigh the same.","justification":"Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ."}', 'name': 'AnswerWithJustification'}, 'type': 'function'}]}),
|
||||
# 'parsed': AnswerWithJustification(answer='They weigh the same.', justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'),
|
||||
# 'parsing_error': None
|
||||
# }
|
||||
```
|
||||
structured_model.invoke(
|
||||
"What weighs more a pound of bricks or a pound of feathers"
|
||||
)
|
||||
# -> {
|
||||
# 'raw': AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_Ao02pnFYXD6GN1yzc0uXPsvF', 'function': {'arguments': '{"answer":"They weigh the same.","justification":"Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ."}', 'name': 'AnswerWithJustification'}, 'type': 'function'}]}),
|
||||
# 'parsed': AnswerWithJustification(answer='They weigh the same.', justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'),
|
||||
# 'parsing_error': None
|
||||
# }
|
||||
```
|
||||
|
||||
??? example "Dictionary schema (`include_raw=False`)"
|
||||
Example: `dict` schema (`include_raw=False`):
|
||||
|
||||
```python
|
||||
from pydantic import BaseModel
|
||||
from langchain_core.utils.function_calling import convert_to_openai_tool
|
||||
```python
|
||||
from pydantic import BaseModel
|
||||
from langchain_core.utils.function_calling import convert_to_openai_tool
|
||||
|
||||
|
||||
class AnswerWithJustification(BaseModel):
|
||||
'''An answer to the user question along with justification for the answer.'''
|
||||
class AnswerWithJustification(BaseModel):
|
||||
'''An answer to the user question along with justification for the answer.'''
|
||||
|
||||
answer: str
|
||||
justification: str
|
||||
answer: str
|
||||
justification: str
|
||||
|
||||
|
||||
dict_schema = convert_to_openai_tool(AnswerWithJustification)
|
||||
model = ChatModel(model="model-name", temperature=0)
|
||||
structured_model = model.with_structured_output(dict_schema)
|
||||
dict_schema = convert_to_openai_tool(AnswerWithJustification)
|
||||
model = ChatModel(model="model-name", temperature=0)
|
||||
structured_model = model.with_structured_output(dict_schema)
|
||||
|
||||
structured_model.invoke(
|
||||
"What weighs more a pound of bricks or a pound of feathers"
|
||||
)
|
||||
# -> {
|
||||
# 'answer': 'They weigh the same',
|
||||
# 'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume and density of the two substances differ.'
|
||||
# }
|
||||
```
|
||||
structured_model.invoke(
|
||||
"What weighs more a pound of bricks or a pound of feathers"
|
||||
)
|
||||
# -> {
|
||||
# 'answer': 'They weigh the same',
|
||||
# 'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume and density of the two substances differ.'
|
||||
# }
|
||||
```
|
||||
|
||||
!!! warning "Behavior changed in `langchain-core` 0.2.26"
|
||||
|
||||
Added support for `TypedDict` class.
|
||||
|
||||
""" # noqa: E501
|
||||
@@ -1703,6 +1685,40 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
|
||||
return RunnableMap(raw=llm) | parser_with_fallback
|
||||
return llm | output_parser
|
||||
|
||||
@property
|
||||
@beta()
|
||||
def profile(self) -> ModelProfile:
|
||||
"""Return profiling information for the model.
|
||||
|
||||
This property relies on the `langchain-model-profiles` package to retrieve chat
|
||||
model capabilities, such as context window sizes and supported features.
|
||||
|
||||
Raises:
|
||||
ImportError: If `langchain-model-profiles` is not installed.
|
||||
|
||||
Returns:
|
||||
A `ModelProfile` object containing profiling information for the model.
|
||||
"""
|
||||
try:
|
||||
from langchain_model_profiles import get_model_profile # noqa: PLC0415
|
||||
except ImportError as err:
|
||||
informative_error_message = (
|
||||
"To access model profiling information, please install the "
|
||||
"`langchain-model-profiles` package: "
|
||||
"`pip install langchain-model-profiles`."
|
||||
)
|
||||
raise ImportError(informative_error_message) from err
|
||||
|
||||
provider_id = self._llm_type
|
||||
model_name = (
|
||||
# Model name is not standardized across integrations. New integrations
|
||||
# should prefer `model`.
|
||||
getattr(self, "model", None)
|
||||
or getattr(self, "model_name", None)
|
||||
or getattr(self, "model_id", "")
|
||||
)
|
||||
return get_model_profile(provider_id, model_name) or {}
|
||||
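The model-name lookup in `profile` can be illustrated in isolation. The following is a minimal sketch, not the library's code: `resolve_model_name` and the three small classes are hypothetical names introduced only for illustration. It shows that the `or` chain skips attributes that are missing or falsy and ends on an empty string.

```python
from typing import Any


def resolve_model_name(obj: Any) -> str:
    """Return the first truthy value among `model`, `model_name`, `model_id`.

    Hypothetical helper mirroring the `getattr(...) or getattr(...)` chain
    shown in the hunk above; falls back to "" when nothing is set.
    """
    return (
        getattr(obj, "model", None)
        or getattr(obj, "model_name", None)
        or getattr(obj, "model_id", "")
    )


class NewStyle:
    # Newer integrations are expected to expose `model`.
    model = "model-a"


class Legacy:
    # An empty `model` is falsy, so the chain falls through to `model_name`.
    model = ""
    model_name = "model-b"


class Bare:
    # No model attribute at all.
    pass


names = [resolve_model_name(cls()) for cls in (NewStyle, Legacy, Bare)]
print(names)
```

Because the chain uses `or` rather than an explicit `is None` check, an integration that sets `model` to an empty string is treated the same as one that does not define it.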
|
||||
|
||||
class SimpleChatModel(BaseChatModel):
|
||||
"""Simplified implementation for a chat model to inherit from.
|
||||
|
||||
@@ -61,8 +61,6 @@ if TYPE_CHECKING:
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
_background_tasks: set[asyncio.Task] = set()
|
||||
|
||||
|
||||
@functools.lru_cache
|
||||
def _log_error_once(msg: str) -> None:
|
||||
@@ -102,9 +100,9 @@ def create_base_retry_decorator(
|
||||
asyncio.run(coro)
|
||||
else:
|
||||
if loop.is_running():
|
||||
task = loop.create_task(coro)
|
||||
_background_tasks.add(task)
|
||||
task.add_done_callback(_background_tasks.discard)
|
||||
# TODO: Fix RUF006 - this task should have a reference
|
||||
# and be awaited somewhere
|
||||
loop.create_task(coro) # noqa: RUF006
|
||||
else:
|
||||
asyncio.run(coro)
|
||||
except Exception as e:
|
||||
|
||||
@@ -1,85 +0,0 @@
|
||||
"""Model profile types and utilities."""
|
||||
|
||||
from typing_extensions import TypedDict
|
||||
|
||||
|
||||
class ModelProfile(TypedDict, total=False):
|
||||
"""Model profile.
|
||||
|
||||
!!! warning "Beta feature"
|
||||
|
||||
This is a beta feature. The format of model profiles is subject to change.
|
||||
|
||||
Provides information about chat model capabilities, such as context window sizes
|
||||
and supported features.
|
||||
"""
|
||||
|
||||
# --- Input constraints ---
|
||||
|
||||
max_input_tokens: int
|
||||
"""Maximum context window (tokens)"""
|
||||
|
||||
image_inputs: bool
|
||||
"""Whether image inputs are supported."""
|
||||
# TODO: add more detail about formats?
|
||||
|
||||
image_url_inputs: bool
|
||||
"""Whether [image URL inputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
|
||||
are supported."""
|
||||
|
||||
pdf_inputs: bool
|
||||
"""Whether [PDF inputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
|
||||
are supported."""
|
||||
# TODO: add more detail about formats? e.g. bytes or base64
|
||||
|
||||
audio_inputs: bool
|
||||
"""Whether [audio inputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
|
||||
are supported."""
|
||||
# TODO: add more detail about formats? e.g. bytes or base64
|
||||
|
||||
video_inputs: bool
|
||||
"""Whether [video inputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
|
||||
are supported."""
|
||||
# TODO: add more detail about formats? e.g. bytes or base64
|
||||
|
||||
image_tool_message: bool
|
||||
"""Whether images can be included in tool messages."""
|
||||
|
||||
pdf_tool_message: bool
|
||||
"""Whether PDFs can be included in tool messages."""
|
||||
|
||||
# --- Output constraints ---
|
||||
|
||||
max_output_tokens: int
|
||||
"""Maximum output tokens"""
|
||||
|
||||
reasoning_output: bool
|
||||
"""Whether the model supports [reasoning / chain-of-thought](https://docs.langchain.com/oss/python/langchain/models#reasoning)"""
|
||||
|
||||
image_outputs: bool
|
||||
"""Whether [image outputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
|
||||
are supported."""
|
||||
|
||||
audio_outputs: bool
|
||||
"""Whether [audio outputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
|
||||
are supported."""
|
||||
|
||||
video_outputs: bool
|
||||
"""Whether [video outputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
|
||||
are supported."""
|
||||
|
||||
# --- Tool calling ---
|
||||
tool_calling: bool
|
||||
"""Whether the model supports [tool calling](https://docs.langchain.com/oss/python/langchain/models#tool-calling)"""
|
||||
|
||||
tool_choice: bool
|
||||
"""Whether the model supports [tool choice](https://docs.langchain.com/oss/python/langchain/models#forcing-tool-calls)"""
|
||||
|
||||
# --- Structured output ---
|
||||
structured_output: bool
|
||||
"""Whether the model supports a native [structured output](https://docs.langchain.com/oss/python/langchain/models#structured-outputs)
|
||||
feature"""
|
||||
|
||||
|
||||
ModelProfileRegistry = dict[str, ModelProfile]
|
||||
"""Registry mapping model identifiers or names to their ModelProfile."""
|
||||
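Because `ModelProfile` is declared with `total=False`, every key is optional, so callers should read fields defensively. The sketch below uses a reduced, hypothetical `ProfileSketch` type (three of the fields listed above) and the standard-library `typing.TypedDict` instead of `typing_extensions`. Treating an unknown limit as "no known limit" is an assumption of this example, not documented behavior.

```python
from typing import TypedDict


class ProfileSketch(TypedDict, total=False):
    """Reduced stand-in for `ModelProfile`; all keys are optional."""

    max_input_tokens: int
    image_inputs: bool
    tool_calling: bool


def fits_context(profile: ProfileSketch, prompt_tokens: int) -> bool:
    """Return True if the prompt fits, or if the limit is unknown."""
    limit = profile.get("max_input_tokens")
    return limit is None or prompt_tokens <= limit


# A partial profile is valid: missing keys simply mean "not reported".
partial: ProfileSketch = {"max_input_tokens": 8192, "tool_calling": True}
empty: ProfileSketch = {}

print(fits_context(partial, 100), fits_context(partial, 9000))
print(partial.get("image_inputs", False))
```

Using `.get(...)` rather than indexing avoids a `KeyError` when a provider does not report a capability, which also matches the `or {}` fallback in the `profile` property.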
@@ -6,7 +6,7 @@ from langchain_core._import_utils import import_attr
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from langchain_core.load.dump import dumpd, dumps
|
||||
from langchain_core.load.load import InitValidator, loads
|
||||
from langchain_core.load.load import loads
|
||||
from langchain_core.load.serializable import Serializable
|
||||
|
||||
# Unfortunately, we have to eagerly import load from langchain_core/load/load.py
|
||||
@@ -15,19 +15,11 @@ if TYPE_CHECKING:
|
||||
# the `from langchain_core.load.load import load` absolute import should also work.
|
||||
from langchain_core.load.load import load
|
||||
|
||||
__all__ = (
|
||||
"InitValidator",
|
||||
"Serializable",
|
||||
"dumpd",
|
||||
"dumps",
|
||||
"load",
|
||||
"loads",
|
||||
)
|
||||
__all__ = ("Serializable", "dumpd", "dumps", "load", "loads")
|
||||
|
||||
_dynamic_imports = {
|
||||
"dumpd": "dump",
|
||||
"dumps": "dump",
|
||||
"InitValidator": "load",
|
||||
"loads": "load",
|
||||
"Serializable": "serializable",
|
||||
}
|
||||
|
||||
@@ -1,176 +0,0 @@
|
||||
"""Validation utilities for LangChain serialization.
|
||||
|
||||
Provides escape-based protection against injection attacks in serialized objects. The
|
||||
approach uses an allowlist design: only dicts explicitly produced by
|
||||
`Serializable.to_json()` are treated as LC objects during deserialization.
|
||||
|
||||
## How escaping works
|
||||
|
||||
During serialization, plain dicts (user data) that contain an `'lc'` key are wrapped:
|
||||
|
||||
```python
|
||||
{"lc": 1, ...} # user data that looks like LC object
|
||||
# becomes:
|
||||
{"__lc_escaped__": {"lc": 1, ...}}
|
||||
```
|
||||
|
||||
During deserialization, escaped dicts are unwrapped and returned as plain dicts,
|
||||
NOT instantiated as LC objects.
|
||||
"""
|
||||
|
||||
from typing import Any
|
||||
|
||||
_LC_ESCAPED_KEY = "__lc_escaped__"
|
||||
"""Sentinel key used to mark escaped user dicts during serialization.
|
||||
|
||||
When a plain dict contains an `'lc'` key (which could be confused with LC objects),
|
||||
we wrap it as {"__lc_escaped__": {...original...}}.
|
||||
"""
|
||||
|
||||
|
||||
def _needs_escaping(obj: dict[str, Any]) -> bool:
|
||||
"""Check if a dict needs escaping to prevent confusion with LC objects.
|
||||
|
||||
A dict needs escaping if:
|
||||
|
||||
1. It has an `'lc'` key (could be confused with LC serialization format)
|
||||
2. It has only the escape key (would be mistaken for an escaped dict)
|
||||
"""
|
||||
return "lc" in obj or (len(obj) == 1 and _LC_ESCAPED_KEY in obj)
|
||||
|
||||
|
||||
def _escape_dict(obj: dict[str, Any]) -> dict[str, Any]:
|
||||
"""Wrap a dict in the escape marker.
|
||||
|
||||
Example:
|
||||
```python
|
||||
{"key": "value"} # becomes {"__lc_escaped__": {"key": "value"}}
|
||||
```
|
||||
"""
|
||||
return {_LC_ESCAPED_KEY: obj}
|
||||
|
||||
|
||||
def _is_escaped_dict(obj: dict[str, Any]) -> bool:
|
||||
"""Check if a dict is an escaped user dict.
|
||||
|
||||
Example:
|
||||
```python
|
||||
{"__lc_escaped__": {...}} # is an escaped dict
|
||||
```
|
||||
"""
|
||||
return len(obj) == 1 and _LC_ESCAPED_KEY in obj
|
||||
|
||||
|
||||
def _serialize_value(obj: Any) -> Any:
|
||||
"""Serialize a value with escaping of user dicts.
|
||||
|
||||
Called recursively on kwarg values to escape any plain dicts that could be confused
|
||||
with LC objects.
|
||||
|
||||
Args:
|
||||
obj: The value to serialize.
|
||||
|
||||
Returns:
|
||||
The serialized value with user dicts escaped as needed.
|
||||
"""
|
||||
from langchain_core.load.serializable import ( # noqa: PLC0415
|
||||
Serializable,
|
||||
to_json_not_implemented,
|
||||
)
|
||||
|
||||
if isinstance(obj, Serializable):
|
||||
# This is an LC object - serialize it properly (not escaped)
|
||||
return _serialize_lc_object(obj)
|
||||
if isinstance(obj, dict):
|
||||
if not all(isinstance(k, (str, int, float, bool, type(None))) for k in obj):
|
||||
# if keys are not json serializable
|
||||
return to_json_not_implemented(obj)
|
||||
# Check if dict needs escaping BEFORE recursing into values.
|
||||
# If it needs escaping, wrap it as-is - the contents are user data that
|
||||
# will be returned as-is during deserialization (no instantiation).
|
||||
# This prevents re-escaping of already-escaped nested content.
|
||||
if _needs_escaping(obj):
|
||||
return _escape_dict(obj)
|
||||
# Safe dict (no 'lc' key) - recurse into values
|
||||
return {k: _serialize_value(v) for k, v in obj.items()}
|
||||
if isinstance(obj, (list, tuple)):
|
||||
return [_serialize_value(item) for item in obj]
|
||||
if isinstance(obj, (str, int, float, bool, type(None))):
|
||||
return obj
|
||||
|
||||
# Non-JSON-serializable object (datetime, custom objects, etc.)
|
||||
return to_json_not_implemented(obj)
|
||||
|
||||
|
||||
def _is_lc_secret(obj: Any) -> bool:
|
||||
"""Check if an object is a LangChain secret marker."""
|
||||
expected_num_keys = 3
|
||||
return (
|
||||
isinstance(obj, dict)
|
||||
and obj.get("lc") == 1
|
||||
and obj.get("type") == "secret"
|
||||
and "id" in obj
|
||||
and len(obj) == expected_num_keys
|
||||
)
|
||||
|
||||
|
||||
def _serialize_lc_object(obj: Any) -> dict[str, Any]:
|
||||
"""Serialize a `Serializable` object with escaping of user data in kwargs.
|
||||
|
||||
Args:
|
||||
obj: The `Serializable` object to serialize.
|
||||
|
||||
Returns:
|
||||
The serialized dict with user data in kwargs escaped as needed.
|
||||
|
||||
Note:
|
||||
Kwargs values are processed with `_serialize_value` to escape user data (like
|
||||
metadata) that contains `'lc'` keys. Secret fields (from `lc_secrets`) are
|
||||
skipped because `to_json()` replaces their values with secret markers.
|
||||
"""
|
||||
from langchain_core.load.serializable import Serializable # noqa: PLC0415
|
||||
|
||||
if not isinstance(obj, Serializable):
|
||||
msg = f"Expected Serializable, got {type(obj)}"
|
||||
raise TypeError(msg)
|
||||
|
||||
serialized: dict[str, Any] = dict(obj.to_json())
|
||||
|
||||
# Process kwargs to escape user data that could be confused with LC objects
|
||||
# Skip secret fields - to_json() already converted them to secret markers
|
||||
if serialized.get("type") == "constructor" and "kwargs" in serialized:
|
||||
serialized["kwargs"] = {
|
||||
k: v if _is_lc_secret(v) else _serialize_value(v)
|
||||
for k, v in serialized["kwargs"].items()
|
||||
}
|
||||
|
||||
return serialized
|
||||
|
||||
|
||||
def _unescape_value(obj: Any) -> Any:
|
||||
"""Unescape a value, processing escape markers in dict values and lists.
|
||||
|
||||
When an escaped dict is encountered (`{"__lc_escaped__": ...}`), it's
|
||||
unwrapped and the contents are returned AS-IS (no further processing).
|
||||
The contents represent user data that should not be modified.
|
||||
|
||||
For regular dicts and lists, we recurse to find any nested escape markers.
|
||||
|
||||
Args:
|
||||
obj: The value to unescape.
|
||||
|
||||
Returns:
|
||||
The unescaped value.
|
||||
"""
|
||||
if isinstance(obj, dict):
|
||||
if _is_escaped_dict(obj):
|
||||
# Unwrap and return the user data as-is (no further unescaping).
|
||||
# The contents are user data that may contain more escape keys,
|
||||
# but those are part of the user's actual data.
|
||||
return obj[_LC_ESCAPED_KEY]
|
||||
|
||||
# Regular dict - recurse into values to find nested escape markers
|
||||
return {k: _unescape_value(v) for k, v in obj.items()}
|
||||
if isinstance(obj, list):
|
||||
return [_unescape_value(item) for item in obj]
|
||||
return obj
|
||||
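The escape and unescape steps described in this module can be exercised on plain JSON-like data. The following is a simplified sketch that re-implements only the dict and list handling with public, hypothetical names (`escape_value`, `unescape_value`); it leaves out `Serializable` objects, secret markers, and non-JSON values, which the real helpers also handle.

```python
from typing import Any

ESCAPED_KEY = "__lc_escaped__"


def needs_escaping(obj: dict[str, Any]) -> bool:
    # Escape dicts that look like LC objects or like an escape wrapper.
    return "lc" in obj or (len(obj) == 1 and ESCAPED_KEY in obj)


def escape_value(obj: Any) -> Any:
    if isinstance(obj, dict):
        if needs_escaping(obj):
            # Wrap as-is; contents are user data and are not recursed into.
            return {ESCAPED_KEY: obj}
        return {k: escape_value(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [escape_value(item) for item in obj]
    return obj


def unescape_value(obj: Any) -> Any:
    if isinstance(obj, dict):
        if len(obj) == 1 and ESCAPED_KEY in obj:
            # Unwrap exactly one level and return the user data untouched.
            return obj[ESCAPED_KEY]
        return {k: unescape_value(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [unescape_value(item) for item in obj]
    return obj


# User data that imitates the LC constructor format.
user_data = {
    "metadata": {"lc": 1, "type": "constructor", "id": ["os", "system"]},
    "tags": ["a"],
}
escaped = escape_value(user_data)
restored = unescape_value(escaped)

# User data that already consists of only the escape key.
wrapper_like = {ESCAPED_KEY: {"x": 1}}
round_tripped = unescape_value(escape_value(wrapper_like))
```

The round trip returns the original data as plain dicts, and a user dict that already looks like an escape wrapper gains one extra wrapping level that is removed again on the way back.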
@@ -1,26 +1,10 @@
|
||||
"""Serialize LangChain objects to JSON.
|
||||
|
||||
Provides `dumps` (to JSON string) and `dumpd` (to dict) for serializing
|
||||
`Serializable` objects.
|
||||
|
||||
## Escaping
|
||||
|
||||
During serialization, plain dicts (user data) that contain an `'lc'` key are escaped
|
||||
by wrapping them: `{"__lc_escaped__": {...original...}}`. This prevents injection
|
||||
attacks where malicious data could trick the deserializer into instantiating
|
||||
arbitrary classes. The escape marker is removed during deserialization.
|
||||
|
||||
This is an allowlist approach: only dicts explicitly produced by
|
||||
`Serializable.to_json()` are treated as LC objects; everything else is escaped if it
|
||||
could be confused with the LC format.
|
||||
"""
|
||||
"""Dump objects to json."""
|
||||
|
||||
import json
|
||||
from typing import Any
|
||||
|
||||
from pydantic import BaseModel
|
||||
|
||||
from langchain_core.load._validation import _serialize_value
|
||||
from langchain_core.load.serializable import Serializable, to_json_not_implemented
|
||||
from langchain_core.messages import AIMessage
|
||||
from langchain_core.outputs import ChatGeneration
|
||||
@@ -41,20 +25,6 @@ def default(obj: Any) -> Any:
|
||||
|
||||
|
||||
def _dump_pydantic_models(obj: Any) -> Any:
|
||||
"""Convert nested Pydantic models to dicts for JSON serialization.
|
||||
|
||||
Handles the special case where a `ChatGeneration` contains an `AIMessage`
|
||||
with a parsed Pydantic model in `additional_kwargs["parsed"]`. Since
|
||||
Pydantic models aren't directly JSON serializable, this converts them to
|
||||
dicts.
|
||||
|
||||
Args:
|
||||
obj: The object to process.
|
||||
|
||||
Returns:
|
||||
A copy of the object with nested Pydantic models converted to dicts, or
|
||||
the original object unchanged if no conversion was needed.
|
||||
"""
|
||||
if (
|
||||
isinstance(obj, ChatGeneration)
|
||||
and isinstance(obj.message, AIMessage)
|
||||
@@ -70,17 +40,10 @@ def _dump_pydantic_models(obj: Any) -> Any:
|
||||
def dumps(obj: Any, *, pretty: bool = False, **kwargs: Any) -> str:
|
||||
"""Return a JSON string representation of an object.
|
||||
|
||||
Note:
|
||||
Plain dicts containing an `'lc'` key are automatically escaped to prevent
|
||||
confusion with LC serialization format. The escape marker is removed during
|
||||
deserialization.
|
||||
|
||||
Args:
|
||||
obj: The object to dump.
|
||||
pretty: Whether to pretty print the json.
|
||||
|
||||
If `True`, the json will be indented by either 2 spaces or the amount
|
||||
provided in the `indent` kwarg.
|
||||
pretty: Whether to pretty print the json. If `True`, the json will be
|
||||
indented with 2 spaces (if no indent is provided as part of `kwargs`).
|
||||
**kwargs: Additional arguments to pass to `json.dumps`
|
||||
|
||||
Returns:
|
||||
@@ -92,29 +55,28 @@ def dumps(obj: Any, *, pretty: bool = False, **kwargs: Any) -> str:
|
||||
if "default" in kwargs:
|
||||
msg = "`default` should not be passed to dumps"
|
||||
raise ValueError(msg)
|
||||
|
||||
obj = _dump_pydantic_models(obj)
|
||||
serialized = _serialize_value(obj)
|
||||
|
||||
if pretty:
|
||||
indent = kwargs.pop("indent", 2)
|
||||
return json.dumps(serialized, indent=indent, **kwargs)
|
||||
return json.dumps(serialized, **kwargs)
|
||||
try:
|
||||
obj = _dump_pydantic_models(obj)
|
||||
if pretty:
|
||||
indent = kwargs.pop("indent", 2)
|
||||
return json.dumps(obj, default=default, indent=indent, **kwargs)
|
||||
return json.dumps(obj, default=default, **kwargs)
|
||||
except TypeError:
|
||||
if pretty:
|
||||
indent = kwargs.pop("indent", 2)
|
||||
return json.dumps(to_json_not_implemented(obj), indent=indent, **kwargs)
|
||||
return json.dumps(to_json_not_implemented(obj), **kwargs)
|
||||
|
||||
|
||||
def dumpd(obj: Any) -> Any:
|
||||
"""Return a dict representation of an object.
|
||||
|
||||
Note:
|
||||
Plain dicts containing an `'lc'` key are automatically escaped to prevent
|
||||
confusion with LC serialization format. The escape marker is removed during
|
||||
deserialization.
|
||||
|
||||
Args:
|
||||
obj: The object to dump.
|
||||
|
||||
Returns:
|
||||
Dictionary that can be serialized to json using `json.dumps`.
|
||||
"""
|
||||
obj = _dump_pydantic_models(obj)
|
||||
return _serialize_value(obj)
|
||||
# Unfortunately this function is not as efficient as it could be because it first
|
||||
# dumps the object to a json string and then loads it back into a dictionary.
|
||||
return json.loads(dumps(obj))
|
||||
|
||||
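The `pretty` and `indent` handling in `dumps` can be shown with the standard library alone. This is a minimal sketch under the assumption that the input is already JSON-serializable; `dumps_sketch` is a hypothetical name and omits the Pydantic conversion and the LangChain-specific serialization steps.

```python
import json
from typing import Any


def dumps_sketch(obj: Any, *, pretty: bool = False, **kwargs: Any) -> str:
    """Serialize `obj`, indenting by 2 spaces when `pretty` is set.

    An explicit `indent` keyword overrides the default of 2.
    """
    if "default" in kwargs:
        # The serializer controls `default` itself, so callers may not pass it.
        msg = "`default` should not be passed to dumps"
        raise ValueError(msg)
    if pretty:
        indent = kwargs.pop("indent", 2)
        return json.dumps(obj, indent=indent, **kwargs)
    return json.dumps(obj, **kwargs)


compact = dumps_sketch({"a": 1})
two_spaces = dumps_sketch({"a": 1}, pretty=True)
four_spaces = dumps_sketch({"a": 1}, pretty=True, indent=4)
print(compact)
print(two_spaces)
print(four_spaces)
```

Popping `indent` from `kwargs` before forwarding them avoids passing the argument to `json.dumps` twice.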
@@ -1,83 +1,11 @@
|
||||
"""Load LangChain objects from JSON strings or objects.
|
||||
|
||||
## How it works
|
||||
|
||||
Each `Serializable` LangChain object has a unique identifier (its "class path"), which
|
||||
is a list of strings representing the module path and class name. For example:
|
||||
|
||||
- `AIMessage` -> `["langchain_core", "messages", "ai", "AIMessage"]`
|
||||
- `ChatPromptTemplate` -> `["langchain_core", "prompts", "chat", "ChatPromptTemplate"]`
|
||||
|
||||
When deserializing, the class path from the JSON `'id'` field is checked against an
|
||||
allowlist. If the class is not in the allowlist, deserialization raises a `ValueError`.
|
||||
|
||||
## Security model
|
||||
|
||||
The `allowed_objects` parameter controls which classes can be deserialized:
|
||||
|
||||
- **`'core'` (default)**: Allow classes defined in the serialization mappings for
|
||||
langchain_core.
|
||||
- **`'all'`**: Allow classes defined in the serialization mappings. This
|
||||
includes core LangChain types (messages, prompts, documents, etc.) and trusted
|
||||
partner integrations. See `langchain_core.load.mapping` for the full list.
|
||||
- **Explicit list of classes**: Only those specific classes are allowed.
|
||||
|
||||
For simple data types like messages and documents, the default allowlist is safe to use.
|
||||
These classes do not perform side effects during initialization.
|
||||
|
||||
!!! note "Side effects in allowed classes"
|
||||
|
||||
Deserialization calls `__init__` on allowed classes. If those classes perform side
|
||||
effects during initialization (network calls, file operations, etc.), those side
|
||||
effects will occur. The allowlist prevents instantiation of classes outside the
|
||||
allowlist, but does not sandbox the allowed classes themselves.
|
||||
|
||||
Import paths are also validated against trusted namespaces before any module is
|
||||
imported.
|
||||
|
||||
### Injection protection (escape-based)
|
||||
|
||||
During serialization, plain dicts that contain an `'lc'` key are escaped by wrapping
|
||||
them: `{"__lc_escaped__": {...}}`. During deserialization, escaped dicts are unwrapped
|
||||
and returned as plain dicts, NOT instantiated as LC objects.
|
||||
|
||||
This is an allowlist approach: only dicts explicitly produced by
|
||||
`Serializable.to_json()` (which are NOT escaped) are treated as LC objects;
|
||||
everything else is user data.
|
||||
|
||||
Even if an attacker's payload includes `__lc_escaped__` wrappers, it will be unwrapped
|
||||
to plain dicts and NOT instantiated as malicious objects.
|
||||
|
||||
## Examples
|
||||
|
||||
```python
|
||||
from langchain_core.load import load
|
||||
from langchain_core.prompts import ChatPromptTemplate
|
||||
from langchain_core.messages import AIMessage, HumanMessage
|
||||
|
||||
# Use default allowlist (classes from mappings) - recommended
|
||||
obj = load(data)
|
||||
|
||||
# Allow only specific classes (most restrictive)
|
||||
obj = load(
|
||||
data,
|
||||
allowed_objects=[
|
||||
ChatPromptTemplate,
|
||||
AIMessage,
|
||||
HumanMessage,
|
||||
],
|
||||
)
|
||||
```
|
||||
"""
|
||||
"""Load LangChain objects from JSON strings or objects."""
|
||||
|
||||
import importlib
|
||||
import json
|
||||
import os
|
||||
from collections.abc import Callable, Iterable
|
||||
from typing import Any, Literal, cast
|
||||
from typing import Any
|
||||
|
||||
from langchain_core._api import beta
|
||||
from langchain_core.load._validation import _is_escaped_dict, _unescape_value
|
||||
from langchain_core.load.mapping import (
|
||||
_JS_SERIALIZABLE_MAPPING,
|
||||
_OG_SERIALIZABLE_MAPPING,
|
||||
@@ -116,209 +44,32 @@ ALL_SERIALIZABLE_MAPPINGS = {
|
||||
**_JS_SERIALIZABLE_MAPPING,
|
||||
}
|
||||
|
||||
# Cache for the default allowed class paths computed from mappings
|
||||
# Maps mode ("all" or "core") to the cached set of paths
|
||||
_default_class_paths_cache: dict[str, set[tuple[str, ...]]] = {}
|
||||
|
||||
|
||||
def _get_default_allowed_class_paths(
|
||||
allowed_object_mode: Literal["all", "core"],
|
||||
) -> set[tuple[str, ...]]:
|
||||
"""Get the default allowed class paths from the serialization mappings.
|
||||
|
||||
This uses the mappings as the source of truth for what classes are allowed
|
||||
by default. Both the legacy paths (keys) and current paths (values) are included.
|
||||
|
||||
Args:
|
||||
allowed_object_mode: either `'all'` or `'core'`.
|
||||
|
||||
Returns:
|
||||
Set of class path tuples that are allowed by default.
|
||||
"""
|
||||
if allowed_object_mode in _default_class_paths_cache:
|
||||
return _default_class_paths_cache[allowed_object_mode]
|
||||
|
||||
allowed_paths: set[tuple[str, ...]] = set()
|
||||
for key, value in ALL_SERIALIZABLE_MAPPINGS.items():
|
||||
if allowed_object_mode == "core" and value[0] != "langchain_core":
|
||||
continue
|
||||
allowed_paths.add(key)
|
||||
allowed_paths.add(value)
|
||||
|
||||
_default_class_paths_cache[allowed_object_mode] = allowed_paths
|
||||
return _default_class_paths_cache[allowed_object_mode]
|
||||
|
||||
|
||||
def _block_jinja2_templates(
|
||||
class_path: tuple[str, ...],
|
||||
kwargs: dict[str, Any],
|
||||
) -> None:
|
||||
"""Block jinja2 templates during deserialization for security.
|
||||
|
||||
Jinja2 templates can execute arbitrary code, so they are blocked by default when
|
||||
deserializing objects with `template_format='jinja2'`.
|
||||
|
||||
Note:
|
||||
We intentionally do NOT check the `class_path` here to keep this simple and
|
||||
future-proof. If any new class is added that accepts `template_format='jinja2'`,
|
||||
it will be automatically blocked without needing to update this function.
|
||||
|
||||
Args:
|
||||
class_path: The class path tuple being deserialized (unused).
|
||||
kwargs: The kwargs dict for the class constructor.
|
||||
|
||||
Raises:
|
||||
ValueError: If `template_format` is `'jinja2'`.
|
||||
"""
|
||||
_ = class_path # Unused - see docstring for rationale. Kept to satisfy signature.
|
||||
if kwargs.get("template_format") == "jinja2":
|
||||
msg = (
|
||||
"Jinja2 templates are not allowed during deserialization for security "
|
||||
"reasons. Use 'f-string' template format instead, or explicitly allow "
|
||||
"jinja2 by providing a custom init_validator."
|
||||
)
|
||||
raise ValueError(msg)
|
||||
|
||||
|
||||
def default_init_validator(
|
||||
class_path: tuple[str, ...],
|
||||
kwargs: dict[str, Any],
|
||||
) -> None:
|
||||
"""Default init validator that blocks jinja2 templates.
|
||||
|
||||
This is the default validator used by `load()` and `loads()` when no custom
|
||||
validator is provided.
|
||||
|
||||
Args:
|
||||
class_path: The class path tuple being deserialized.
|
||||
kwargs: The kwargs dict for the class constructor.
|
||||
|
||||
Raises:
|
||||
ValueError: If template_format is `'jinja2'`.
|
||||
"""
|
||||
_block_jinja2_templates(class_path, kwargs)
|
||||
|
||||
|
||||
AllowedObject = type[Serializable]
|
||||
"""Type alias for classes that can be included in the `allowed_objects` parameter.
|
||||
|
||||
Must be a `Serializable` subclass (the class itself, not an instance).
|
||||
"""
|
||||
|
||||
InitValidator = Callable[[tuple[str, ...], dict[str, Any]], None]
|
||||
"""Type alias for a callable that validates kwargs during deserialization.
|
||||
|
||||
The callable receives:
|
||||
|
||||
- `class_path`: A tuple of strings identifying the class being instantiated
|
||||
(e.g., `('langchain', 'schema', 'messages', 'AIMessage')`).
|
||||
- `kwargs`: The kwargs dict that will be passed to the constructor.
|
||||
|
||||
The validator should raise an exception if the object should not be deserialized.
|
||||
"""
|
||||
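A validator matching the `InitValidator` signature is an ordinary callable that raises to reject an object. The sketch below is an illustration only: `block_jinja2` restates the jinja2 check from this diff, while `revive_sketch` is a hypothetical stand-in for the point where a reviver would run the validator before instantiating a class. How `Reviver` actually invokes the validator is not shown in this section.

```python
from collections.abc import Callable
from typing import Any, Optional

InitValidator = Callable[[tuple[str, ...], dict[str, Any]], None]


def block_jinja2(class_path: tuple[str, ...], kwargs: dict[str, Any]) -> None:
    """Reject constructor kwargs that request the jinja2 template format."""
    if kwargs.get("template_format") == "jinja2":
        msg = "Jinja2 templates are not allowed during deserialization."
        raise ValueError(msg)


def revive_sketch(
    class_path: tuple[str, ...],
    kwargs: dict[str, Any],
    init_validator: Optional[InitValidator] = block_jinja2,
) -> dict[str, Any]:
    """Hypothetical stand-in: validate, then 'instantiate' (here, echo)."""
    if init_validator is not None:
        init_validator(class_path, kwargs)
    return {"class_path": class_path, "kwargs": kwargs}


path = ("langchain_core", "prompts", "prompt", "PromptTemplate")
ok = revive_sketch(path, {"template": "{x}", "template_format": "f-string"})

try:
    revive_sketch(path, {"template": "{{ x }}", "template_format": "jinja2"})
    rejected = False
except ValueError:
    rejected = True

# Passing None disables validation in this sketch.
unchecked = revive_sketch(
    path, {"template_format": "jinja2"}, init_validator=None
)
```

Keeping the check independent of `class_path` means any class that accepts `template_format='jinja2'` is covered without further changes, as the docstring of `_block_jinja2_templates` notes.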
|
||||
|
||||
def _compute_allowed_class_paths(
|
||||
allowed_objects: Iterable[AllowedObject],
|
||||
import_mappings: dict[tuple[str, ...], tuple[str, ...]],
|
||||
) -> set[tuple[str, ...]]:
|
||||
"""Return allowed class paths from an explicit list of classes.
|
||||
|
||||
A class path is a tuple of strings identifying a serializable class, derived from
|
||||
`Serializable.lc_id()`. For example: `('langchain_core', 'messages', 'AIMessage')`.
|
||||
|
||||
Args:
|
||||
allowed_objects: Iterable of `Serializable` subclasses to allow.
|
||||
import_mappings: Mapping of legacy class paths to current class paths.
|
||||
|
||||
Returns:
|
||||
Set of allowed class paths.
|
||||
|
||||
Example:
|
||||
```python
|
||||
# Allow a specific class
|
||||
_compute_allowed_class_paths([MyPrompt], {}) ->
|
||||
{("langchain_core", "prompts", "MyPrompt")}
|
||||
|
||||
# Include legacy paths that map to the same class
|
||||
import_mappings = {("old", "Prompt"): ("langchain_core", "prompts", "MyPrompt")}
|
||||
_compute_allowed_class_paths([MyPrompt], import_mappings) ->
|
||||
{("langchain_core", "prompts", "MyPrompt"), ("old", "Prompt")}
|
||||
```
|
||||
"""
|
||||
allowed_objects_list = list(allowed_objects)
|
||||
|
||||
allowed_class_paths: set[tuple[str, ...]] = set()
|
||||
for allowed_obj in allowed_objects_list:
|
||||
if not isinstance(allowed_obj, type) or not issubclass(
|
||||
allowed_obj, Serializable
|
||||
):
|
||||
msg = "allowed_objects must contain Serializable subclasses."
|
||||
raise TypeError(msg)
|
||||
|
||||
class_path = tuple(allowed_obj.lc_id())
|
||||
allowed_class_paths.add(class_path)
|
||||
# Add legacy paths that map to the same class.
|
||||
for mapping_key, mapping_value in import_mappings.items():
|
||||
if tuple(mapping_value) == class_path:
|
||||
allowed_class_paths.add(mapping_key)
|
||||
return allowed_class_paths
|
||||
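The legacy-path expansion performed by `_compute_allowed_class_paths` can be sketched without LangChain types. In this simplified version the caller passes class paths directly rather than `Serializable` subclasses, which is an assumption made to keep the example self-contained; `compute_allowed_paths` is a hypothetical name.

```python
ClassPath = tuple[str, ...]


def compute_allowed_paths(
    class_paths: list[ClassPath],
    import_mappings: dict[ClassPath, ClassPath],
) -> set[ClassPath]:
    """Return the given paths plus any legacy paths that map onto them."""
    allowed: set[ClassPath] = set()
    for class_path in class_paths:
        class_path = tuple(class_path)
        allowed.add(class_path)
        # Add legacy keys whose current path is this class path.
        for legacy, current in import_mappings.items():
            if tuple(current) == class_path:
                allowed.add(legacy)
    return allowed


my_prompt = ("langchain_core", "prompts", "MyPrompt")
mappings = {
    ("old", "Prompt"): my_prompt,
    ("old", "Other"): ("langchain_core", "other", "Other"),
}

without_legacy = compute_allowed_paths([my_prompt], {})
with_legacy = compute_allowed_paths([my_prompt], mappings)
print(sorted(with_legacy))
```

Only legacy paths that resolve to an explicitly allowed class are added, so unrelated entries in the mapping stay outside the allowlist.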
|
||||
|
||||
class Reviver:
|
||||
"""Reviver for JSON objects.
|
||||
|
||||
Used as the `object_hook` for `json.loads` to reconstruct LangChain objects from
|
||||
their serialized JSON representation.
|
||||
|
||||
Only classes in the allowlist can be instantiated.
|
||||
"""
|
||||
"""Reviver for JSON objects."""
|
||||
|
||||
def __init__(
|
||||
self,
|
||||
allowed_objects: Iterable[AllowedObject] | Literal["all", "core"] = "core",
|
||||
secrets_map: dict[str, str] | None = None,
|
||||
valid_namespaces: list[str] | None = None,
|
||||
secrets_from_env: bool = False, # noqa: FBT001,FBT002
|
||||
secrets_from_env: bool = True, # noqa: FBT001,FBT002
|
||||
additional_import_mappings: dict[tuple[str, ...], tuple[str, ...]]
|
||||
| None = None,
|
||||
*,
|
||||
ignore_unserializable_fields: bool = False,
|
||||
init_validator: InitValidator | None = default_init_validator,
|
||||
) -> None:
|
||||
"""Initialize the reviver.
|
||||
|
||||
Args:
|
||||
allowed_objects: Allowlist of classes that can be deserialized.
|
||||
- `'core'` (default): Allow classes defined in the serialization
|
||||
mappings for `langchain_core`.
|
||||
- `'all'`: Allow classes defined in the serialization mappings.
|
||||
|
||||
This includes core LangChain types (messages, prompts, documents,
|
||||
etc.) and trusted partner integrations. See
|
||||
`langchain_core.load.mapping` for the full list.
|
||||
- Explicit list of classes: Only those specific classes are allowed.
|
||||
secrets_map: A map of secrets to load.
|
||||
If a secret is not found in the map, it will be loaded from the
|
||||
environment if `secrets_from_env` is `True`.
|
||||
valid_namespaces: Additional namespaces (modules) to allow during
|
||||
deserialization, beyond the default trusted namespaces.
|
||||
secrets_map: A map of secrets to load. If a secret is not found in
|
||||
the map, it will be loaded from the environment if `secrets_from_env`
|
||||
is True.
|
||||
valid_namespaces: A list of additional namespaces (modules)
|
||||
to allow to be deserialized.
|
||||
secrets_from_env: Whether to load secrets from the environment.
|
||||
additional_import_mappings: A dictionary of additional namespace mappings.
|
||||
|
||||
additional_import_mappings: A dictionary of additional namespace mappings
|
||||
You can use this to override default mappings or add new mappings.
|
||||
|
||||
When `allowed_objects` is `'core'` or `'all'` (using defaults), paths from these
|
||||
mappings are also added to the allowed class paths.
|
||||
ignore_unserializable_fields: Whether to ignore unserializable fields.
|
||||
init_validator: Optional callable to validate kwargs before instantiation.
|
||||
|
||||
If provided, this function is called with `(class_path, kwargs)` where
|
||||
`class_path` is the class path tuple and `kwargs` is the kwargs dict.
|
||||
The validator should raise an exception if the object should not be
|
||||
deserialized, otherwise return `None`.
|
||||
|
||||
Defaults to `default_init_validator` which blocks jinja2 templates.
|
||||
"""
|
||||
self.secrets_from_env = secrets_from_env
|
||||
self.secrets_map = secrets_map or {}
|
||||
@@ -337,26 +88,7 @@ class Reviver:
|
||||
if self.additional_import_mappings
|
||||
else ALL_SERIALIZABLE_MAPPINGS
|
||||
)
|
||||
# Compute allowed class paths:
|
||||
# - "all" or "core" -> use default paths from mappings (+ additional_import_mappings)
|
||||
# - Explicit list -> compute from those classes
|
||||
if allowed_objects in ("all", "core"):
|
||||
self.allowed_class_paths: set[tuple[str, ...]] | None = (
|
||||
_get_default_allowed_class_paths(
|
||||
cast("Literal['all', 'core']", allowed_objects)
|
||||
).copy()
|
||||
)
|
||||
# Add paths from additional_import_mappings to the defaults
|
||||
if self.additional_import_mappings:
|
||||
for key, value in self.additional_import_mappings.items():
|
||||
self.allowed_class_paths.add(key)
|
||||
self.allowed_class_paths.add(value)
|
||||
else:
|
||||
self.allowed_class_paths = _compute_allowed_class_paths(
|
||||
cast("Iterable[AllowedObject]", allowed_objects), self.import_mappings
|
||||
)
|
||||
self.ignore_unserializable_fields = ignore_unserializable_fields
|
||||
self.init_validator = init_validator
|
||||
|
||||
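The `init_validator` hook described in the docstring receives `(class_path, kwargs)` and raises to reject an object. A rough sketch of such a callable is below. The name `block_jinja2` and the `template_format` key are assumptions made for illustration; the excerpt only states that the default validator blocks jinja2 templates, not how it detects them.

```python
from typing import Any


def block_jinja2(class_path: tuple[str, ...], kwargs: dict[str, Any]) -> None:
    """Hypothetical validator: reject kwargs that request jinja2 templating."""
    # Assumption: the template engine is selected by a "template_format" kwarg.
    if kwargs.get("template_format") == "jinja2":
        msg = f"jinja2 templates are not allowed for {class_path!r}"
        raise ValueError(msg)


# Returns None when the kwargs are acceptable.
result = block_jinja2(("my_pkg", "MyPrompt"), {"template": "Hello {name}"})
```

Because the validator only raises or returns `None`, it can run right before `cls(**kwargs)` without changing the data that is passed through.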
def __call__(self, value: dict[str, Any]) -> Any:
|
||||
"""Revive the value.
|
||||
@@ -407,20 +139,6 @@ class Reviver:
|
||||
[*namespace, name] = value["id"]
|
||||
mapping_key = tuple(value["id"])
|
||||
|
||||
if (
|
||||
self.allowed_class_paths is not None
|
||||
and mapping_key not in self.allowed_class_paths
|
||||
):
|
||||
msg = (
|
||||
f"Deserialization of {mapping_key!r} is not allowed. "
|
||||
"The default (allowed_objects='core') only permits core "
|
||||
"langchain-core classes. To allow trusted partner integrations, "
|
||||
"use allowed_objects='all'. Alternatively, pass an explicit list "
|
||||
"of allowed classes via allowed_objects=[...]. "
|
||||
"See langchain_core.load.mapping for the full allowlist."
|
||||
)
|
||||
raise ValueError(msg)
|
||||
|
||||
if (
|
||||
namespace[0] not in self.valid_namespaces
|
||||
# The root namespace ["langchain"] is not a valid identifier.
|
||||
@@ -428,11 +146,13 @@ class Reviver:
|
||||
):
|
||||
msg = f"Invalid namespace: {value}"
|
||||
raise ValueError(msg)
|
||||
# Determine explicit import path
|
||||
# Has explicit import path.
|
||||
if mapping_key in self.import_mappings:
|
||||
import_path = self.import_mappings[mapping_key]
|
||||
# Split into module and name
|
||||
import_dir, name = import_path[:-1], import_path[-1]
|
||||
# Import module
|
||||
mod = importlib.import_module(".".join(import_dir))
|
||||
elif namespace[0] in DISALLOW_LOAD_FROM_PATH:
|
||||
msg = (
|
||||
"Trying to deserialize something that cannot "
|
||||
@@ -440,16 +160,9 @@ class Reviver:
|
||||
f"{mapping_key}."
|
||||
)
|
||||
raise ValueError(msg)
|
||||
# Otherwise, treat namespace as path.
|
||||
else:
|
||||
# Otherwise, treat namespace as path.
|
||||
import_dir = namespace
|
||||
|
||||
# Validate import path is in trusted namespaces before importing
|
||||
if import_dir[0] not in self.valid_namespaces:
|
||||
msg = f"Invalid namespace: {value}"
|
||||
raise ValueError(msg)
|
||||
|
||||
mod = importlib.import_module(".".join(import_dir))
|
||||
mod = importlib.import_module(".".join(namespace))
|
||||
|
||||
cls = getattr(mod, name)
|
||||
|
||||
@@ -461,10 +174,6 @@ class Reviver:
|
||||
# We don't need to recurse on kwargs
|
||||
# as json.loads will do that for us.
|
||||
kwargs = value.get("kwargs", {})
|
||||
|
||||
if self.init_validator is not None:
|
||||
self.init_validator(mapping_key, kwargs)
|
||||
|
||||
return cls(**kwargs)
|
||||
|
||||
return value
|
||||
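The allowlist check in `__call__` can be illustrated with a standalone `object_hook`. This is a sketch under stated assumptions, not the `Reviver` implementation: the `ALLOWED` table and the `revive` function are hypothetical, the serialized shape (`lc`, `type`, `id`, `kwargs`) is assumed from the fields referenced in the hunks, and `fractions.Fraction` stands in for a LangChain class so the example runs with the standard library alone.

```python
import json
from fractions import Fraction
from typing import Any

# Hypothetical allowlist: serialized id tuple -> constructor.
ALLOWED = {("fractions", "Fraction"): Fraction}


def revive(value: dict[str, Any]) -> Any:
    """object_hook that only instantiates allowlisted ids."""
    if value.get("lc") == 1 and value.get("type") == "constructor":
        mapping_key = tuple(value["id"])
        if mapping_key not in ALLOWED:
            msg = f"Deserialization of {mapping_key!r} is not allowed."
            raise ValueError(msg)
        # json.loads has already applied the hook to nested dicts.
        return ALLOWED[mapping_key](**value.get("kwargs", {}))
    return value


text = (
    '{"lc": 1, "type": "constructor", "id": ["fractions", "Fraction"], '
    '"kwargs": {"numerator": 3, "denominator": 4}}'
)
obj = json.loads(text, object_hook=revive)
```

Checking the id before any import or instantiation means a payload naming an unlisted class fails without touching that class at all.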
@@ -474,81 +183,40 @@ class Reviver:
|
||||
def loads(
|
||||
text: str,
|
||||
*,
|
||||
allowed_objects: Iterable[AllowedObject] | Literal["all", "core"] = "core",
|
||||
secrets_map: dict[str, str] | None = None,
|
||||
valid_namespaces: list[str] | None = None,
|
||||
secrets_from_env: bool = False,
|
||||
secrets_from_env: bool = True,
|
||||
additional_import_mappings: dict[tuple[str, ...], tuple[str, ...]] | None = None,
|
||||
ignore_unserializable_fields: bool = False,
|
||||
init_validator: InitValidator | None = default_init_validator,
|
||||
) -> Any:
|
||||
"""Revive a LangChain class from a JSON string.
|
||||
|
||||
Equivalent to `load(json.loads(text))`.
|
||||
|
||||
Only classes in the allowlist can be instantiated. The default allowlist includes
|
||||
core LangChain types (messages, prompts, documents, etc.). See
|
||||
`langchain_core.load.mapping` for the full list.
|
||||
|
||||
!!! warning "Beta feature"
|
||||
|
||||
This is a beta feature. Please be wary of deploying experimental code to
|
||||
production unless you've taken appropriate precautions.
|
||||
|
||||
Args:
|
||||
text: The string to load.
|
||||
allowed_objects: Allowlist of classes that can be deserialized.
|
||||
|
||||
- `'core'` (default): Allow classes defined in the serialization mappings
|
||||
for `langchain_core`.
|
||||
- `'all'`: Allow classes defined in the serialization mappings.
|
||||
|
||||
This includes core LangChain types (messages, prompts, documents, etc.)
|
||||
and trusted partner integrations. See `langchain_core.load.mapping` for
|
||||
the full list.
|
||||
|
||||
- Explicit list of classes: Only those specific classes are allowed.
|
||||
- `[]`: Disallow all deserialization (will raise on any object).
|
||||
secrets_map: A map of secrets to load.
|
||||
|
||||
If a secret is not found in the map, it will be loaded from the environment
|
||||
if `secrets_from_env` is `True`.
|
||||
valid_namespaces: Additional namespaces (modules) to allow during
|
||||
deserialization, beyond the default trusted namespaces.
|
||||
secrets_map: A map of secrets to load. If a secret is not found in
|
||||
the map, it will be loaded from the environment if `secrets_from_env`
|
||||
is True.
|
||||
valid_namespaces: A list of additional namespaces (modules)
|
||||
to allow to be deserialized.
|
||||
secrets_from_env: Whether to load secrets from the environment.
|
||||
additional_import_mappings: A dictionary of additional namespace mappings.
|
||||
|
||||
additional_import_mappings: A dictionary of additional namespace mappings
|
||||
You can use this to override default mappings or add new mappings.
|
||||
|
||||
When `allowed_objects` is `'core'` or `'all'` (using defaults), paths from these
|
||||
mappings are also added to the allowed class paths.
|
||||
ignore_unserializable_fields: Whether to ignore unserializable fields.
|
||||
init_validator: Optional callable to validate kwargs before instantiation.
|
||||
|
||||
If provided, this function is called with `(class_path, kwargs)` where
|
||||
`class_path` is the class path tuple and `kwargs` is the kwargs dict.
|
||||
The validator should raise an exception if the object should not be
|
||||
deserialized, otherwise return `None`.
|
||||
|
||||
Defaults to `default_init_validator` which blocks jinja2 templates.
|
||||
|
||||
Returns:
|
||||
Revived LangChain objects.
|
||||
|
||||
Raises:
|
||||
ValueError: If an object's class path is not in the `allowed_objects` allowlist.
|
||||
"""
|
||||
# Parse JSON and delegate to load() for proper escape handling
|
||||
raw_obj = json.loads(text)
|
||||
return load(
|
||||
raw_obj,
|
||||
allowed_objects=allowed_objects,
|
||||
secrets_map=secrets_map,
|
||||
valid_namespaces=valid_namespaces,
|
||||
secrets_from_env=secrets_from_env,
|
||||
additional_import_mappings=additional_import_mappings,
|
||||
ignore_unserializable_fields=ignore_unserializable_fields,
|
||||
init_validator=init_validator,
|
||||
return json.loads(
|
||||
text,
|
||||
object_hook=Reviver(
|
||||
secrets_map,
|
||||
valid_namespaces,
|
||||
secrets_from_env,
|
||||
additional_import_mappings,
|
||||
ignore_unserializable_fields=ignore_unserializable_fields,
|
||||
),
|
||||
)
|
||||
|
||||
|
||||
@@ -556,116 +224,49 @@ def loads(
|
||||
def load(
|
||||
obj: Any,
|
||||
*,
|
||||
allowed_objects: Iterable[AllowedObject] | Literal["all", "core"] = "core",
|
||||
secrets_map: dict[str, str] | None = None,
|
||||
valid_namespaces: list[str] | None = None,
|
||||
secrets_from_env: bool = False,
|
||||
secrets_from_env: bool = True,
|
||||
additional_import_mappings: dict[tuple[str, ...], tuple[str, ...]] | None = None,
|
||||
ignore_unserializable_fields: bool = False,
|
||||
init_validator: InitValidator | None = default_init_validator,
|
||||
) -> Any:
|
||||
"""Revive a LangChain class from a JSON object.
|
||||
|
||||
Use this if you already have a parsed JSON object, e.g. from `json.load` or
|
||||
`orjson.loads`.
|
||||
|
||||
Only classes in the allowlist can be instantiated. The default allowlist includes
|
||||
core LangChain types (messages, prompts, documents, etc.). See
|
||||
`langchain_core.load.mapping` for the full list.
|
||||
|
||||
!!! warning "Beta feature"
|
||||
|
||||
This is a beta feature. Please be wary of deploying experimental code to
|
||||
production unless you've taken appropriate precautions.
|
||||
Use this if you already have a parsed JSON object,
|
||||
e.g. from `json.load` or `orjson.loads`.
|
||||
|
||||
Args:
|
||||
obj: The object to load.
|
||||
allowed_objects: Allowlist of classes that can be deserialized.
|
||||
|
||||
- `'core'` (default): Allow classes defined in the serialization mappings
|
||||
for `langchain_core`.
|
||||
- `'all'`: Allow classes defined in the serialization mappings.
|
||||
|
||||
This includes core LangChain types (messages, prompts, documents, etc.)
|
||||
and trusted partner integrations. See `langchain_core.load.mapping` for
|
||||
the full list.
|
||||
|
||||
- Explicit list of classes: Only those specific classes are allowed.
|
||||
- `[]`: Disallow all deserialization (will raise on any object).
|
||||
secrets_map: A map of secrets to load.
|
||||
|
||||
If a secret is not found in the map, it will be loaded from the environment
|
||||
if `secrets_from_env` is `True`.
|
||||
valid_namespaces: Additional namespaces (modules) to allow during
|
||||
deserialization, beyond the default trusted namespaces.
|
||||
secrets_map: A map of secrets to load. If a secret is not found in
|
||||
the map, it will be loaded from the environment if `secrets_from_env`
|
||||
is True.
|
||||
valid_namespaces: A list of additional namespaces (modules)
|
||||
to allow to be deserialized.
|
||||
secrets_from_env: Whether to load secrets from the environment.
|
||||
additional_import_mappings: A dictionary of additional namespace mappings.
|
||||
|
||||
additional_import_mappings: A dictionary of additional namespace mappings
|
||||
You can use this to override default mappings or add new mappings.
|
||||
|
||||
When `allowed_objects` is `'core'` or `'all'` (using defaults), paths from these
|
||||
mappings are also added to the allowed class paths.
|
||||
ignore_unserializable_fields: Whether to ignore unserializable fields.
|
||||
init_validator: Optional callable to validate kwargs before instantiation.
|
||||
|
||||
If provided, this function is called with `(class_path, kwargs)` where
|
||||
`class_path` is the class path tuple and `kwargs` is the kwargs dict.
|
||||
The validator should raise an exception if the object should not be
|
||||
deserialized, otherwise return `None`.
|
||||
|
||||
Defaults to `default_init_validator` which blocks jinja2 templates.
|
||||
|
||||
Returns:
|
||||
Revived LangChain objects.
|
||||
|
||||
Raises:
|
||||
ValueError: If an object's class path is not in the `allowed_objects` allowlist.
|
||||
|
||||
Example:
|
||||
```python
|
||||
from langchain_core.load import load, dumpd
|
||||
from langchain_core.messages import AIMessage
|
||||
|
||||
msg = AIMessage(content="Hello")
|
||||
data = dumpd(msg)
|
||||
|
||||
# Deserialize using default allowlist
|
||||
loaded = load(data)
|
||||
|
||||
# Or with explicit allowlist
|
||||
loaded = load(data, allowed_objects=[AIMessage])
|
||||
|
||||
# Or extend defaults with additional mappings
|
||||
loaded = load(
|
||||
data,
|
||||
additional_import_mappings={
|
||||
("my_pkg", "MyClass"): ("my_pkg", "module", "MyClass"),
|
||||
},
|
||||
)
|
||||
```
|
||||
"""
|
||||
reviver = Reviver(
|
||||
allowed_objects,
|
||||
secrets_map,
|
||||
valid_namespaces,
|
||||
secrets_from_env,
|
||||
additional_import_mappings,
|
||||
ignore_unserializable_fields=ignore_unserializable_fields,
|
||||
init_validator=init_validator,
|
||||
)
|
||||
|
||||
def _load(obj: Any) -> Any:
|
||||
if isinstance(obj, dict):
|
||||
# Check for escaped dict FIRST (before recursing).
|
||||
# Escaped dicts are user data that should NOT be processed as LC objects.
|
||||
if _is_escaped_dict(obj):
|
||||
return _unescape_value(obj)
|
||||
|
||||
# Not escaped - recurse into children then apply reviver
|
||||
# Need to revive leaf nodes before reviving this node
|
||||
loaded_obj = {k: _load(v) for k, v in obj.items()}
|
||||
return reviver(loaded_obj)
|
||||
if isinstance(obj, list):
|
||||
return [_load(o) for o in obj]
|
||||
if isinstance(obj, str) and obj in reviver.secrets_map:
|
||||
return reviver.secrets_map[obj]
|
||||
return obj
|
||||
|
||||
return _load(obj)
|
||||
|
||||
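The inner `_load` helper revives children before their parent and substitutes secrets for matching strings. The traversal can be sketched on its own as follows. `load_tree` and `to_point` are hypothetical names, and the escaped-dict branch is left out because `_is_escaped_dict` and `_unescape_value` are not shown in this excerpt.

```python
from typing import Any, Callable


def load_tree(
    obj: Any,
    reviver: Callable[[dict], Any],
    secrets_map: dict[str, str],
) -> Any:
    """Revive leaf nodes first, then the enclosing container."""
    if isinstance(obj, dict):
        loaded = {k: load_tree(v, reviver, secrets_map) for k, v in obj.items()}
        return reviver(loaded)
    if isinstance(obj, list):
        return [load_tree(o, reviver, secrets_map) for o in obj]
    if isinstance(obj, str) and obj in secrets_map:
        return secrets_map[obj]
    return obj


def to_point(d: dict) -> Any:
    # Hypothetical reviver: turn {"x": ..., "y": ...} into a tuple.
    if set(d) == {"x", "y"}:
        return (d["x"], d["y"])
    return d


data = {"points": [{"x": 1, "y": 2}], "key": "API_KEY"}
result = load_tree(data, to_point, {"API_KEY": "s3cret"})
```

Working bottom-up means the reviver for an outer object always sees already-revived children, which mirrors what `json.loads` does with an `object_hook`.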
@@ -1,19 +1,21 @@
|
||||
"""Serialization mapping.
|
||||
|
||||
This file contains a mapping between the `lc_namespace` path for a given
|
||||
subclass that inherits from `Serializable` to the namespace
|
||||
This file contains a mapping between the lc_namespace path for a given
|
||||
subclass that inherits from Serializable to the namespace
|
||||
where that class is actually located.
|
||||
|
||||
This mapping helps maintain the ability to serialize and deserialize
|
||||
well-known LangChain objects even if they are moved around in the codebase
|
||||
across different LangChain versions.
|
||||
|
||||
For example, the code for the `AIMessage` class is located in
|
||||
`langchain_core.messages.ai.AIMessage`. This message is associated with the
|
||||
`lc_namespace` of `["langchain", "schema", "messages", "AIMessage"]`,
|
||||
because this code was originally in `langchain.schema.messages.AIMessage`.
|
||||
For example,
|
||||
|
||||
The mapping allows us to deserialize an `AIMessage` created with an older
|
||||
The code for AIMessage class is located in langchain_core.messages.ai.AIMessage,
|
||||
This message is associated with the lc_namespace
|
||||
["langchain", "schema", "messages", "AIMessage"],
|
||||
because this code was originally in langchain.schema.messages.AIMessage.
|
||||
|
||||
The mapping allows us to deserialize an AIMessage created with an older
|
||||
version of LangChain where the code was in a different location.
|
||||
"""
|
||||
|
||||
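The docstring above explains that a serialized id may point at an old location while the class now lives elsewhere. A small sketch of resolving such an id through a mapping is shown here. `IMPORT_MAPPINGS`, `resolve`, and the `legacy_pkg` path are hypothetical, and `collections.OrderedDict` stands in for a relocated class so the example needs nothing beyond the standard library.

```python
import importlib
from typing import Any

# Hypothetical mapping from a legacy id to the current import path.
IMPORT_MAPPINGS: dict[tuple[str, ...], tuple[str, ...]] = {
    ("legacy_pkg", "containers", "OrderedDict"): ("collections", "OrderedDict"),
}


def resolve(class_id: list[str]) -> Any:
    """Look up the current location for a serialized id and import it."""
    mapping_key = tuple(class_id)
    # Unknown ids raise KeyError instead of being imported blindly.
    import_path = IMPORT_MAPPINGS[mapping_key]
    # Split into module and name.
    import_dir, name = import_path[:-1], import_path[-1]
    mod = importlib.import_module(".".join(import_dir))
    return getattr(mod, name)


cls = resolve(["legacy_pkg", "containers", "OrderedDict"])
```

Only ids present in the mapping are ever imported in this sketch, which keeps the import step behind the same table that documents the relocation.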
@@ -273,11 +275,6 @@ SERIALIZABLE_MAPPING: dict[tuple[str, ...], tuple[str, ...]] = {
|
||||
"chat_models",
|
||||
"ChatGroq",
|
||||
),
|
||||
("langchain_xai", "chat_models", "ChatXAI"): (
|
||||
"langchain_xai",
|
||||
"chat_models",
|
||||
"ChatXAI",
|
||||
),
|
||||
("langchain", "chat_models", "fireworks", "ChatFireworks"): (
|
||||
"langchain_fireworks",
|
||||
"chat_models",
|
||||
@@ -532,6 +529,16 @@ SERIALIZABLE_MAPPING: dict[tuple[str, ...], tuple[str, ...]] = {
|
||||
"structured",
|
||||
"StructuredPrompt",
|
||||
),
|
||||
("langchain_sambanova", "chat_models", "ChatSambaNovaCloud"): (
|
||||
"langchain_sambanova",
|
||||
"chat_models",
|
||||
"ChatSambaNovaCloud",
|
||||
),
|
||||
("langchain_sambanova", "chat_models", "ChatSambaStudio"): (
|
||||
"langchain_sambanova",
|
||||
"chat_models",
|
||||
"ChatSambaStudio",
|
||||
),
|
||||
("langchain_core", "prompts", "message", "_DictMessagePromptTemplate"): (
|
||||
"langchain_core",
|
||||
"prompts",
|
||||
|
||||
@@ -92,12 +92,11 @@ class Serializable(BaseModel, ABC):
|
||||
|
||||
It relies on the following methods and properties:
|
||||
|
||||
- [`is_lc_serializable`][langchain_core.load.serializable.Serializable.is_lc_serializable]: Is this class serializable?
|
||||
|
||||
- `is_lc_serializable`: Is this class serializable?
|
||||
By design, even if a class inherits from `Serializable`, it is not serializable
|
||||
by default. This is to prevent accidental serialization of objects that should
|
||||
not be serialized.
|
||||
- [`get_lc_namespace`][langchain_core.load.serializable.Serializable.get_lc_namespace]: Get the namespace of the LangChain object.
|
||||
- `get_lc_namespace`: Get the namespace of the LangChain object.
|
||||
|
||||
During deserialization, this namespace is used to identify
|
||||
the correct class to instantiate.
|
||||
@@ -106,10 +105,10 @@ class Serializable(BaseModel, ABC):
|
||||
During deserialization, an additional mapping is used to handle classes that have moved
|
||||
or been renamed across package versions.
|
||||
|
||||
- [`lc_secrets`][langchain_core.load.serializable.Serializable.lc_secrets]: A map of constructor argument names to secret ids.
|
||||
- [`lc_attributes`][langchain_core.load.serializable.Serializable.lc_attributes]: List of additional attribute names that should be included
|
||||
- `lc_secrets`: A map of constructor argument names to secret ids.
|
||||
- `lc_attributes`: List of additional attribute names that should be included
|
||||
as part of the serialized representation.
|
||||
""" # noqa: E501
|
||||
"""
|
||||
|
||||
# Remove default BaseModel init docstring.
|
||||
def __init__(self, *args: Any, **kwargs: Any) -> None:
|
||||
@@ -133,12 +132,12 @@ class Serializable(BaseModel, ABC):
|
||||
def get_lc_namespace(cls) -> list[str]:
|
||||
"""Get the namespace of the LangChain object.
|
||||
|
||||
For example, if the class is [`langchain.llms.openai.OpenAI`][langchain_openai.OpenAI],
|
||||
then the namespace is `["langchain", "llms", "openai"]`
|
||||
For example, if the class is `langchain.llms.openai.OpenAI`, then the
|
||||
namespace is `["langchain", "llms", "openai"]`
|
||||
|
||||
Returns:
|
||||
The namespace.
|
||||
""" # noqa: E501
|
||||
"""
|
||||
return cls.__module__.split(".")
|
||||
|
||||
@property
|
||||
|
||||
@@ -51,22 +51,22 @@ class InputTokenDetails(TypedDict, total=False):
|
||||
May also hold extra provider-specific keys.
|
||||
|
||||
!!! version-added "Added in `langchain-core` 0.3.9"
|
||||
|
||||
"""
|
||||
|
||||
audio: int
|
||||
"""Audio input tokens."""
|
||||
|
||||
cache_creation: int
|
||||
"""Input tokens that were cached and there was a cache miss.
|
||||
|
||||
Since there was a cache miss, the cache was created from these tokens.
|
||||
"""
|
||||
|
||||
cache_read: int
|
||||
"""Input tokens that were cached and there was a cache hit.
|
||||
|
||||
Since there was a cache hit, the tokens were read from the cache. More precisely,
|
||||
the model state given these tokens was read from the cache.
|
||||
|
||||
"""
|
||||
|
||||
|
||||
@@ -91,12 +91,12 @@ class OutputTokenDetails(TypedDict, total=False):
|
||||
|
||||
audio: int
|
||||
"""Audio output tokens."""
|
||||
|
||||
reasoning: int
|
||||
"""Reasoning output tokens.
|
||||
|
||||
Tokens generated by the model in a chain of thought process (i.e. by OpenAI's o1
|
||||
models) that are not returned as part of model output.
|
||||
|
||||
"""
|
||||
|
||||
|
||||
@@ -124,11 +124,9 @@ class UsageMetadata(TypedDict):
|
||||
```
|
||||
|
||||
!!! warning "Behavior changed in `langchain-core` 0.3.9"
|
||||
|
||||
Added `input_token_details` and `output_token_details`.
|
||||
|
||||
!!! note "LangSmith SDK"
|
||||
|
||||
The LangSmith SDK also has a `UsageMetadata` class. While the two share fields,
|
||||
LangSmith's `UsageMetadata` has additional fields to capture cost information
|
||||
used by the LangSmith platform.
|
||||
@@ -136,19 +134,15 @@ class UsageMetadata(TypedDict):
|
||||
|
||||
input_tokens: int
|
||||
"""Count of input (or prompt) tokens. Sum of all input token types."""
|
||||
|
||||
output_tokens: int
|
||||
"""Count of output (or completion) tokens. Sum of all output token types."""
|
||||
|
||||
total_tokens: int
|
||||
"""Total token count. Sum of `input_tokens` + `output_tokens`."""
|
||||
|
||||
input_token_details: NotRequired[InputTokenDetails]
|
||||
"""Breakdown of input token counts.
|
||||
|
||||
Does *not* need to sum to full input token count. Does *not* need to have all keys.
|
||||
"""
|
||||
|
||||
output_token_details: NotRequired[OutputTokenDetails]
|
||||
"""Breakdown of output token counts.
|
||||
|
||||
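The field docstrings above state that `total_tokens` is the sum of `input_tokens` and `output_tokens`, while the detail breakdowns need not add up to the full counts. A simplified stand-in makes that concrete; `Usage`, `UsageWithDetails`, and `make_usage` are hypothetical names and not the library's `UsageMetadata` type.

```python
from typing import TypedDict


class Usage(TypedDict):
    """Hypothetical stand-in for the three required counters."""

    input_tokens: int
    output_tokens: int
    total_tokens: int


class UsageWithDetails(Usage, total=False):
    """Adds an optional breakdown that may cover only part of the input."""

    input_token_details: dict[str, int]


def make_usage(input_tokens: int, output_tokens: int) -> UsageWithDetails:
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
    }


usage = make_usage(350, 240)
# The breakdown lists only some token types and does not sum to 350.
usage["input_token_details"] = {"audio": 10, "cache_read": 200}
```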
@@ -168,10 +162,8 @@ class AIMessage(BaseMessage):
|
||||
|
||||
tool_calls: list[ToolCall] = []
|
||||
"""If present, tool calls associated with the message."""
|
||||
|
||||
invalid_tool_calls: list[InvalidToolCall] = []
|
||||
"""If present, tool calls with parsing errors associated with the message."""
|
||||
|
||||
usage_metadata: UsageMetadata | None = None
|
||||
"""If present, usage metadata for a message, such as token counts.
|
||||
|
||||
@@ -326,7 +318,7 @@ class AIMessage(BaseMessage):
|
||||
if tool_calls := values.get("tool_calls"):
|
||||
values["tool_calls"] = [
|
||||
create_tool_call(
|
||||
**{k: v for k, v in tc.items() if k not in {"type", "extras"}}
|
||||
**{k: v for k, v in tc.items() if k not in ("type", "extras")}
|
||||
)
|
||||
for tc in tool_calls
|
||||
]
|
||||
@@ -442,7 +434,7 @@ class AIMessageChunk(AIMessage, BaseMessageChunk):
|
||||
blocks = [
|
||||
block
|
||||
for block in blocks
|
||||
if block["type"] not in {"tool_call", "invalid_tool_call"}
|
||||
if block["type"] not in ("tool_call", "invalid_tool_call")
|
||||
]
|
||||
for tool_call_chunk in self.tool_call_chunks:
|
||||
tc: types.ToolCallChunk = {
|
||||
@@ -563,7 +555,7 @@ class AIMessageChunk(AIMessage, BaseMessageChunk):
|
||||
|
||||
@model_validator(mode="after")
|
||||
def init_server_tool_calls(self) -> Self:
|
||||
"""Parse `server_tool_call_chunks` from [`ServerToolCallChunk`][langchain.messages.ServerToolCallChunk] objects.""" # noqa: E501
|
||||
"""Parse `server_tool_call_chunks`."""
|
||||
if (
|
||||
self.chunk_position == "last"
|
||||
and self.response_metadata.get("output_version") == "v1"
|
||||
@@ -573,7 +565,7 @@ class AIMessageChunk(AIMessage, BaseMessageChunk):
|
||||
if (
|
||||
isinstance(block, dict)
|
||||
and block.get("type")
|
||||
in {"server_tool_call", "server_tool_call_chunk"}
|
||||
in ("server_tool_call", "server_tool_call_chunk")
|
||||
and (args_str := block.get("args"))
|
||||
and isinstance(args_str, str)
|
||||
):
|
||||
|
||||
@@ -5,9 +5,11 @@ from __future__ import annotations
|
||||
from typing import TYPE_CHECKING, Any, cast, overload
|
||||
|
||||
from pydantic import ConfigDict, Field
|
||||
from typing_extensions import Self
|
||||
|
||||
from langchain_core._api.deprecation import warn_deprecated
|
||||
from langchain_core.load.serializable import Serializable
|
||||
from langchain_core.messages import content as types
|
||||
from langchain_core.utils import get_bolded_text
|
||||
from langchain_core.utils._merge import merge_dicts, merge_lists
|
||||
from langchain_core.utils.interactive_env import is_interactive_env
|
||||
@@ -15,9 +17,6 @@ from langchain_core.utils.interactive_env import is_interactive_env
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Sequence
|
||||
|
||||
from typing_extensions import Self
|
||||
|
||||
from langchain_core.messages import content as types
|
||||
from langchain_core.prompts.chat import ChatPromptTemplate
|
||||
|
||||
|
||||
@@ -391,12 +390,12 @@ class BaseMessageChunk(BaseMessage):
|
||||
Raises:
|
||||
TypeError: If the other object is not a message chunk.
|
||||
|
||||
Example:
|
||||
```txt
|
||||
AIMessageChunk(content="Hello", ...)
|
||||
+ AIMessageChunk(content=" World", ...)
|
||||
= AIMessageChunk(content="Hello World", ...)
|
||||
```
|
||||
For example,
|
||||
|
||||
`AIMessageChunk(content="Hello") + AIMessageChunk(content=" World")`
|
||||
|
||||
will give `AIMessageChunk(content="Hello World")`
|
||||
|
||||
"""
|
||||
if isinstance(other, BaseMessageChunk):
|
||||
# If both are (subclasses of) BaseMessageChunk,
|
||||
|
||||
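The docstring in this hunk describes chunk addition as concatenating content and raising `TypeError` for non-chunk operands. A toy version of that behaviour is sketched below. `TextChunk` is a hypothetical class and carries only `content`; the real chunk types merge further fields that are not shown here.

```python
from dataclasses import dataclass


@dataclass
class TextChunk:
    """Hypothetical chunk holding only text content."""

    content: str

    def __add__(self, other: object) -> "TextChunk":
        if not isinstance(other, TextChunk):
            msg = f"unsupported operand type for +: {type(other).__name__}"
            raise TypeError(msg)
        # Concatenate the two pieces in order.
        return TextChunk(content=self.content + other.content)


merged = TextChunk(content="Hello") + TextChunk(content=" World")
```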
@@ -12,11 +12,10 @@ the implementation in `BaseMessage`.
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from collections.abc import Callable
|
||||
from typing import TYPE_CHECKING
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Callable
|
||||
|
||||
from langchain_core.messages import AIMessage, AIMessageChunk
|
||||
from langchain_core.messages import content as types
|
||||
|
||||
|
||||
@@ -159,12 +159,12 @@ def _convert_citation_to_v1(citation: dict[str, Any]) -> types.Annotation:
|
||||
|
||||
return url_citation
|
||||
|
||||
if citation_type in {
|
||||
if citation_type in (
|
||||
"char_location",
|
||||
"content_block_location",
|
||||
"page_location",
|
||||
"search_result_location",
|
||||
}:
|
||||
):
|
||||
document_citation: types.Citation = {
|
||||
"type": "citation",
|
||||
"cited_text": citation["cited_text"],
|
||||
@@ -173,6 +173,8 @@ def _convert_citation_to_v1(citation: dict[str, Any]) -> types.Annotation:
|
||||
document_citation["title"] = citation["document_title"]
|
||||
elif title := citation.get("title"):
|
||||
document_citation["title"] = title
|
||||
else:
|
||||
pass
|
||||
known_fields = {
|
||||
"type",
|
||||
"cited_text",
|
||||
@@ -243,20 +245,11 @@ def _convert_to_v1_from_anthropic(message: AIMessage) -> list[types.ContentBlock
|
||||
and message.chunk_position != "last"
|
||||
):
|
||||
# Isolated chunk
|
||||
chunk = message.tool_call_chunks[0]
|
||||
|
||||
tool_call_chunk = types.ToolCallChunk(
|
||||
name=chunk.get("name"),
|
||||
id=chunk.get("id"),
|
||||
args=chunk.get("args"),
|
||||
type="tool_call_chunk",
|
||||
tool_call_chunk: types.ToolCallChunk = (
|
||||
message.tool_call_chunks[0].copy() # type: ignore[assignment]
|
||||
)
|
||||
if "caller" in block:
|
||||
tool_call_chunk["extras"] = {"caller": block["caller"]}
|
||||
|
||||
index = chunk.get("index")
|
||||
if index is not None:
|
||||
tool_call_chunk["index"] = index
|
||||
if "type" not in tool_call_chunk:
|
||||
tool_call_chunk["type"] = "tool_call_chunk"
|
||||
yield tool_call_chunk
|
||||
else:
|
||||
tool_call_block: types.ToolCall | None = None
|
||||
@@ -278,6 +271,8 @@ def _convert_to_v1_from_anthropic(message: AIMessage) -> list[types.ContentBlock
|
||||
"id": tc.get("id"),
|
||||
}
|
||||
break
|
||||
else:
|
||||
pass
|
||||
if not tool_call_block:
|
||||
tool_call_block = {
|
||||
"type": "tool_call",
|
||||
@@ -287,27 +282,17 @@ def _convert_to_v1_from_anthropic(message: AIMessage) -> list[types.ContentBlock
|
||||
}
|
||||
if "index" in block:
|
||||
tool_call_block["index"] = block["index"]
|
||||
if "caller" in block:
|
||||
if "extras" not in tool_call_block:
|
||||
tool_call_block["extras"] = {}
|
||||
tool_call_block["extras"]["caller"] = block["caller"]
|
||||
|
||||
yield tool_call_block
|
||||
|
||||
elif block_type == "input_json_delta" and isinstance(
|
||||
message, AIMessageChunk
|
||||
):
|
||||
if len(message.tool_call_chunks) == 1:
|
||||
chunk = message.tool_call_chunks[0]
|
||||
tool_call_chunk = types.ToolCallChunk(
|
||||
name=chunk.get("name"),
|
||||
id=chunk.get("id"),
|
||||
args=chunk.get("args"),
|
||||
type="tool_call_chunk",
|
||||
tool_call_chunk = (
|
||||
message.tool_call_chunks[0].copy() # type: ignore[assignment]
|
||||
)
|
||||
index = chunk.get("index")
|
||||
if index is not None:
|
||||
tool_call_chunk["index"] = index
|
||||
if "type" not in tool_call_chunk:
|
||||
tool_call_chunk["type"] = "tool_call_chunk"
|
||||
yield tool_call_chunk
|
||||
|
||||
else:
|
||||
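One variant in the hunks above builds the tool call chunk field by field and attaches `index` only when it is present, instead of copying the source chunk wholesale. That normalization can be sketched with plain dicts. `normalize_chunk` is a hypothetical helper, and plain `dict` is used in place of the `types.ToolCallChunk` typed dict.

```python
from typing import Any


def normalize_chunk(chunk: dict[str, Any]) -> dict[str, Any]:
    """Build a tool call chunk from known fields only."""
    out: dict[str, Any] = {
        "name": chunk.get("name"),
        "id": chunk.get("id"),
        "args": chunk.get("args"),
        "type": "tool_call_chunk",
    }
    # Attach the index only when the source chunk carries one.
    index = chunk.get("index")
    if index is not None:
        out["index"] = index
    return out


raw = {"name": "search", "args": '{"q": ', "index": 0, "unexpected": True}
normalized = normalize_chunk(raw)
```

Compared with copying, explicit construction drops keys the target type does not declare and always sets `type`, at the cost of having to list each supported field.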
@@ -461,26 +446,12 @@ def _convert_to_v1_from_anthropic(message: AIMessage) -> list[types.ContentBlock
|
||||
|
||||
|
||||
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a message with Anthropic content.
|
||||
|
||||
Args:
|
||||
message: The message to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a message with Anthropic content."""
|
||||
return _convert_to_v1_from_anthropic(message)
|
||||
|
||||
|
||||
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a message chunk with Anthropic content.
|
||||
|
||||
Args:
|
||||
message: The message chunk to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a message chunk with Anthropic content."""
|
||||
return _convert_to_v1_from_anthropic(message)
|
||||
|
||||
|
||||
|
||||
@@ -65,28 +65,14 @@ def _convert_to_v1_from_bedrock_chunk(
|
||||
|
||||
|
||||
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a message with Bedrock content.
|
||||
|
||||
Args:
|
||||
message: The message to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a message with Bedrock content."""
|
||||
if "claude" not in message.response_metadata.get("model_name", "").lower():
|
||||
raise NotImplementedError # fall back to best-effort parsing
|
||||
return _convert_to_v1_from_bedrock(message)
|
||||
|
||||
|
||||
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a message chunk with Bedrock content.
|
||||
|
||||
Args:
|
||||
message: The message chunk to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a message chunk with Bedrock content."""
|
||||
# TODO: add model_name to all Bedrock chunks and update core merging logic
|
||||
# to not append during aggregation. Then raise NotImplementedError here if
|
||||
# not an Anthropic model to fall back to best-effort parsing.
|
||||
|
||||
@@ -209,16 +209,11 @@ def _convert_to_v1_from_converse(message: AIMessage) -> list[types.ContentBlock]
|
||||
and message.chunk_position != "last"
|
||||
):
|
||||
# Isolated chunk
|
||||
chunk = message.tool_call_chunks[0]
|
||||
tool_call_chunk = types.ToolCallChunk(
|
||||
name=chunk.get("name"),
|
||||
id=chunk.get("id"),
|
||||
args=chunk.get("args"),
|
||||
type="tool_call_chunk",
|
||||
tool_call_chunk: types.ToolCallChunk = (
|
||||
message.tool_call_chunks[0].copy() # type: ignore[assignment]
|
||||
)
|
||||
index = chunk.get("index")
|
||||
if index is not None:
|
||||
tool_call_chunk["index"] = index
|
||||
if "type" not in tool_call_chunk:
|
||||
tool_call_chunk["type"] = "tool_call_chunk"
|
||||
yield tool_call_chunk
|
||||
else:
|
||||
tool_call_block: types.ToolCall | None = None
|
||||
@@ -240,6 +235,8 @@ def _convert_to_v1_from_converse(message: AIMessage) -> list[types.ContentBlock]
|
||||
"id": tc.get("id"),
|
||||
}
|
||||
break
|
||||
else:
|
||||
pass
|
||||
if not tool_call_block:
|
||||
tool_call_block = {
|
||||
"type": "tool_call",
|
||||
@@ -256,16 +253,11 @@ def _convert_to_v1_from_converse(message: AIMessage) -> list[types.ContentBlock]
|
||||
and isinstance(message, AIMessageChunk)
|
||||
and len(message.tool_call_chunks) == 1
|
||||
):
|
||||
chunk = message.tool_call_chunks[0]
|
||||
tool_call_chunk = types.ToolCallChunk(
|
||||
name=chunk.get("name"),
|
||||
id=chunk.get("id"),
|
||||
args=chunk.get("args"),
|
||||
type="tool_call_chunk",
|
||||
tool_call_chunk = (
|
||||
message.tool_call_chunks[0].copy() # type: ignore[assignment]
|
||||
)
|
||||
index = chunk.get("index")
|
||||
if index is not None:
|
||||
tool_call_chunk["index"] = index
|
||||
if "type" not in tool_call_chunk:
|
||||
tool_call_chunk["type"] = "tool_call_chunk"
|
||||
yield tool_call_chunk
|
||||
|
||||
else:
|
||||
@@ -281,26 +273,12 @@ def _convert_to_v1_from_converse(message: AIMessage) -> list[types.ContentBlock]
|
||||
|
||||
|
||||
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a message with Bedrock Converse content.
|
||||
|
||||
Args:
|
||||
message: The message to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a message with Bedrock Converse content."""
|
||||
return _convert_to_v1_from_converse(message)
|
||||
|
||||
|
||||
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a chunk with Bedrock Converse content.
|
||||
|
||||
Args:
|
||||
message: The message chunk to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a chunk with Bedrock Converse content."""
|
||||
return _convert_to_v1_from_converse(message)
|
||||
|
||||
|
||||
|
||||
@@ -76,36 +76,21 @@ def translate_grounding_metadata_to_citations(
|
||||
for chunk_index in chunk_indices:
|
||||
if chunk_index < len(grounding_chunks):
|
||||
chunk = grounding_chunks[chunk_index]
|
||||
|
||||
# Handle web and maps grounding
|
||||
web_info = chunk.get("web") or {}
|
||||
maps_info = chunk.get("maps") or {}
|
||||
|
||||
# Extract citation info depending on source
|
||||
url = maps_info.get("uri") or web_info.get("uri")
|
||||
title = maps_info.get("title") or web_info.get("title")
|
||||
|
||||
# Note: confidence_scores is a legacy field from Gemini 2.0 and earlier
|
||||
# that indicated confidence (0.0-1.0) for each grounding chunk.
|
||||
#
|
||||
# In Gemini 2.5+, this field is always None/empty and should be ignored.
|
||||
extras_metadata = {
|
||||
"web_search_queries": web_search_queries,
|
||||
"grounding_chunk_index": chunk_index,
|
||||
"confidence_scores": support.get("confidence_scores") or [],
|
||||
}
|
||||
|
||||
# Add maps-specific metadata if present
|
||||
if maps_info.get("placeId"):
|
||||
extras_metadata["place_id"] = maps_info["placeId"]
|
||||
web_info = chunk.get("web", {})
|
||||
|
||||
citation = create_citation(
|
||||
url=url,
|
||||
title=title,
|
||||
url=web_info.get("uri"),
|
||||
title=web_info.get("title"),
|
||||
start_index=start_index,
|
||||
end_index=end_index,
|
||||
cited_text=cited_text,
|
||||
google_ai_metadata=extras_metadata,
|
||||
extras={
|
||||
"google_ai_metadata": {
|
||||
"web_search_queries": web_search_queries,
|
||||
"grounding_chunk_index": chunk_index,
|
||||
"confidence_scores": support.get("confidence_scores", []),
|
||||
}
|
||||
},
|
||||
)
|
||||
citations.append(citation)
|
||||
|
||||
@@ -411,10 +396,7 @@ def _convert_to_v1_from_genai(message: AIMessage) -> list[types.ContentBlock]:
|
||||
except Exception:
|
||||
# Not valid base64, treat as non-standard
|
||||
converted_blocks.append(
|
||||
{
|
||||
"type": "non_standard",
|
||||
"value": item,
|
||||
}
|
||||
{"type": "non_standard", "value": item}
|
||||
)
|
||||
else:
|
||||
# This likely won't be reached according to previous implementations
|
||||
@@ -526,26 +508,12 @@ def _convert_to_v1_from_genai(message: AIMessage) -> list[types.ContentBlock]:
|
||||
|
||||
|
||||
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a message with Google (GenAI) content.
|
||||
|
||||
Args:
|
||||
message: The message to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a message with Google (GenAI) content."""
|
||||
return _convert_to_v1_from_genai(message)
|
||||
|
||||
|
||||
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a chunk with Google (GenAI) content.
|
||||
|
||||
Args:
|
||||
message: The message chunk to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a chunk with Google (GenAI) content."""
|
||||
return _convert_to_v1_from_genai(message)
|
||||
|
||||
|
||||
|
||||
@@ -119,26 +119,12 @@ def _convert_to_v1_from_groq(message: AIMessage) -> list[types.ContentBlock]:
|
||||
|
||||
|
||||
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a message with groq content.
|
||||
|
||||
Args:
|
||||
message: The message to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a message with groq content."""
|
||||
return _convert_to_v1_from_groq(message)
|
||||
|
||||
|
||||
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a message chunk with groq content.
|
||||
|
||||
Args:
|
||||
message: The message chunk to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a message chunk with groq content."""
|
||||
return _convert_to_v1_from_groq(message)
|
||||
|
||||
|
||||
|
||||
@@ -4,6 +4,7 @@ from __future__ import annotations
|
||||
|
||||
import json
|
||||
import warnings
|
||||
from collections.abc import Iterable
|
||||
from typing import TYPE_CHECKING, Any, Literal, cast
|
||||
|
||||
from langchain_core.language_models._utils import (
|
||||
@@ -13,24 +14,11 @@ from langchain_core.language_models._utils import (
|
||||
from langchain_core.messages import content as types
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Iterable
|
||||
|
||||
from langchain_core.messages import AIMessage, AIMessageChunk
|
||||
|
||||
|
||||
def convert_to_openai_image_block(block: dict[str, Any]) -> dict:
|
||||
"""Convert `ImageContentBlock` to format expected by OpenAI Chat Completions.
|
||||
|
||||
Args:
|
||||
block: The image content block to convert.
|
||||
|
||||
Raises:
|
||||
ValueError: If required keys are missing.
|
||||
ValueError: If source type is unsupported.
|
||||
|
||||
Returns:
|
||||
The formatted image content block.
|
||||
"""
|
||||
"""Convert `ImageContentBlock` to format expected by OpenAI Chat Completions."""
|
||||
if "url" in block:
|
||||
return {
|
||||
"type": "image_url",
|
||||
@@ -61,18 +49,6 @@ def convert_to_openai_data_block(
|
||||
|
||||
"Standard data content block" can include old-style LangChain v0 blocks
|
||||
(URLContentBlock, Base64ContentBlock, IDContentBlock) or new ones.
|
||||
|
||||
Args:
|
||||
block: The content block to convert.
|
||||
api: The OpenAI API being targeted. Either "chat/completions" or "responses".
|
||||
|
||||
Raises:
|
||||
ValueError: If required keys are missing.
|
||||
ValueError: If file URLs are used with Chat Completions API.
|
||||
ValueError: If block type is unsupported.
|
||||
|
||||
Returns:
|
||||
The formatted content block.
|
||||
"""
|
||||
if block["type"] == "image":
|
||||
chat_completions_block = convert_to_openai_image_block(block)
|
||||
@@ -271,7 +247,7 @@ def _convert_from_v1_to_chat_completions(message: AIMessage) -> AIMessage:
|
||||
if block_type == "text":
|
||||
# Strip annotations
|
||||
new_content.append({"type": "text", "text": block["text"]})
|
||||
elif block_type in {"reasoning", "tool_call"}:
|
||||
elif block_type in ("reasoning", "tool_call"):
|
||||
pass
|
||||
else:
|
||||
new_content.append(block)
|
||||
@@ -729,6 +705,8 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
|
||||
if invalid_tool_call.get("id") == call_id:
|
||||
tool_call_block = invalid_tool_call.copy()
|
||||
break
|
||||
else:
|
||||
pass
|
||||
if tool_call_block:
|
||||
if "id" in block:
|
||||
if "extras" not in tool_call_block:
|
||||
@@ -756,7 +734,7 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
|
||||
k: v for k, v in block["action"].items() if k != "sources"
|
||||
}
|
||||
for key in block:
|
||||
if key not in {"type", "id", "action", "status", "index"}:
|
||||
if key not in ("type", "id", "action", "status", "index"):
|
||||
web_search_call[key] = block[key]
|
||||
|
||||
yield cast("types.ServerToolCall", web_search_call)
|
||||
@@ -782,6 +760,8 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
|
||||
web_search_result["status"] = "success"
|
||||
elif status:
|
||||
web_search_result["extras"] = {"status": status}
|
||||
else:
|
||||
pass
|
||||
if "index" in block and isinstance(block["index"], int):
|
||||
web_search_result["index"] = f"lc_wsr_{block['index'] + 1}"
|
||||
yield cast("types.ServerToolResult", web_search_result)
|
||||
@@ -797,14 +777,14 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
|
||||
file_search_call["index"] = f"lc_fsc_{block['index']}"
|
||||
|
||||
for key in block:
|
||||
if key not in {
|
||||
if key not in (
|
||||
"type",
|
||||
"id",
|
||||
"queries",
|
||||
"results",
|
||||
"status",
|
||||
"index",
|
||||
}:
|
||||
):
|
||||
file_search_call[key] = block[key]
|
||||
|
||||
yield cast("types.ServerToolCall", file_search_call)
|
||||
@@ -823,6 +803,8 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
|
||||
file_search_result["status"] = "success"
|
||||
elif status:
|
||||
file_search_result["extras"] = {"status": status}
|
||||
else:
|
||||
pass
|
||||
if "index" in block and isinstance(block["index"], int):
|
||||
file_search_result["index"] = f"lc_fsr_{block['index'] + 1}"
|
||||
yield cast("types.ServerToolResult", file_search_result)
|
||||
@@ -866,6 +848,8 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
|
||||
code_interpreter_result["status"] = "success"
|
||||
elif status:
|
||||
code_interpreter_result["extras"] = {"status": status}
|
||||
else:
|
||||
pass
|
||||
if "index" in block and isinstance(block["index"], int):
|
||||
code_interpreter_result["index"] = f"lc_cir_{block['index'] + 1}"
|
||||
|
||||
@@ -996,14 +980,7 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
|
||||
|
||||
|
||||
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a message with OpenAI content.
|
||||
|
||||
Args:
|
||||
message: The message to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a message with OpenAI content."""
|
||||
if isinstance(message.content, str):
|
||||
return _convert_to_v1_from_chat_completions(message)
|
||||
message = _convert_from_v03_ai_message(message)
|
||||
@@ -1011,14 +988,7 @@ def translate_content(message: AIMessage) -> list[types.ContentBlock]:
|
||||
|
||||
|
||||
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
|
||||
"""Derive standard content blocks from a message chunk with OpenAI content.
|
||||
|
||||
Args:
|
||||
message: The message chunk to translate.
|
||||
|
||||
Returns:
|
||||
The derived content blocks.
|
||||
"""
|
||||
"""Derive standard content blocks from a message chunk with OpenAI content."""
|
||||
if isinstance(message.content, str):
|
||||
return _convert_to_v1_from_chat_completions_chunk(message)
|
||||
message = _convert_from_v03_ai_message(message) # type: ignore[assignment]
|
||||
|
||||
@@ -654,7 +654,7 @@ class PlainTextContentBlock(TypedDict):
|
||||
|
||||
!!! note
|
||||
Title and context are optional fields that may be passed to the model. See
|
||||
Anthropic [example](https://platform.claude.com/docs/en/build-with-claude/citations#citable-vs-non-citable-content).
|
||||
Anthropic [example](https://docs.claude.com/en/docs/build-with-claude/citations#citable-vs-non-citable-content).
|
||||
|
||||
!!! note "Factory function"
|
||||
`create_plaintext_block` may also be used as a factory to create a
|
||||
|
||||
@@ -29,39 +29,38 @@ class ToolMessage(BaseMessage, ToolOutputMixin):
|
||||
`ToolMessage` objects contain the result of a tool invocation. Typically, the result
|
||||
is encoded inside the `content` field.
|
||||
|
||||
`tool_call_id` is used to associate the tool call request with the tool call
|
||||
response. Useful in situations where a chat model is able to request multiple tool
|
||||
calls in parallel.
|
||||
Example: A `ToolMessage` representing a result of `42` from a tool call with id
|
||||
|
||||
Example:
|
||||
A `ToolMessage` representing a result of `42` from a tool call with id
|
||||
```python
|
||||
from langchain_core.messages import ToolMessage
|
||||
|
||||
```python
|
||||
from langchain_core.messages import ToolMessage
|
||||
ToolMessage(content="42", tool_call_id="call_Jja7J89XsjrOLA5r!MEOW!SL")
|
||||
```
|
||||
|
||||
ToolMessage(content="42", tool_call_id="call_Jja7J89XsjrOLA5r!MEOW!SL")
|
||||
```
|
||||
Example: A `ToolMessage` where only part of the tool output is sent to the model
|
||||
and the full output is passed in to artifact.
|
||||
|
||||
Example:
|
||||
A `ToolMessage` where only part of the tool output is sent to the model
|
||||
and the full output is passed in to artifact.
|
||||
```python
|
||||
from langchain_core.messages import ToolMessage
|
||||
|
||||
```python
|
||||
from langchain_core.messages import ToolMessage
|
||||
tool_output = {
|
||||
"stdout": "From the graph we can see that the correlation between "
|
||||
"x and y is ...",
|
||||
"stderr": None,
|
||||
"artifacts": {"type": "image", "base64_data": "/9j/4gIcSU..."},
|
||||
}
|
||||
|
||||
tool_output = {
|
||||
"stdout": "From the graph we can see that the correlation between "
|
||||
"x and y is ...",
|
||||
"stderr": None,
|
||||
"artifacts": {"type": "image", "base64_data": "/9j/4gIcSU..."},
|
||||
}
|
||||
ToolMessage(
|
||||
content=tool_output["stdout"],
|
||||
artifact=tool_output,
|
||||
tool_call_id="call_Jja7J89XsjrOLA5r!MEOW!SL",
|
||||
)
|
||||
```
|
||||
|
||||
The `tool_call_id` field is used to associate the tool call request with the
|
||||
tool call response. Useful in situations where a chat model is able
|
||||
to request multiple tool calls in parallel.
|
||||
|
||||
ToolMessage(
|
||||
content=tool_output["stdout"],
|
||||
artifact=tool_output,
|
||||
tool_call_id="call_Jja7J89XsjrOLA5r!MEOW!SL",
|
||||
)
|
||||
```
|
||||
"""
|
||||
|
||||
tool_call_id: str
|
||||
|
||||
@@ -15,16 +15,12 @@ import json
|
||||
import logging
|
||||
import math
|
||||
from collections.abc import Callable, Iterable, Sequence
|
||||
from functools import partial, wraps
|
||||
from functools import partial
|
||||
from typing import (
|
||||
TYPE_CHECKING,
|
||||
Annotated,
|
||||
Any,
|
||||
Concatenate,
|
||||
Literal,
|
||||
ParamSpec,
|
||||
Protocol,
|
||||
TypeVar,
|
||||
cast,
|
||||
overload,
|
||||
)
|
||||
@@ -109,11 +105,6 @@ def get_buffer_string(
|
||||
Raises:
|
||||
ValueError: If an unsupported message type is encountered.
|
||||
|
||||
Note:
|
||||
If a message is an `AIMessage` and contains both tool calls under `tool_calls`
|
||||
and a function call under `additional_kwargs["function_call"]`, only the tool
|
||||
calls will be appended to the string representation.
|
||||
|
||||
Example:
|
||||
```python
|
||||
from langchain_core.messages import AIMessage, HumanMessage
|
||||
@@ -144,12 +135,8 @@ def get_buffer_string(
|
||||
msg = f"Got unsupported message type: {m}"
|
||||
raise ValueError(msg) # noqa: TRY004
|
||||
message = f"{role}: {m.text}"
|
||||
if isinstance(m, AIMessage):
|
||||
if m.tool_calls:
|
||||
message += f"{m.tool_calls}"
|
||||
elif "function_call" in m.additional_kwargs:
|
||||
# Legacy behavior assumes only one function call per message
|
||||
message += f"{m.additional_kwargs['function_call']}"
|
||||
if isinstance(m, AIMessage) and "function_call" in m.additional_kwargs:
|
||||
message += f"{m.additional_kwargs['function_call']}"
|
||||
string_messages.append(message)
|
||||
|
||||
return "\n".join(string_messages)
|
||||
@@ -397,54 +384,33 @@ def convert_to_messages(
|
||||
return [_convert_to_message(m) for m in messages]
|
||||
|
||||
|
||||
_P = ParamSpec("_P")
|
||||
_R_co = TypeVar("_R_co", covariant=True)
|
||||
|
||||
|
||||
class _RunnableSupportCallable(Protocol[_P, _R_co]):
|
||||
def _runnable_support(func: Callable) -> Callable:
|
||||
@overload
|
||||
def __call__(
|
||||
self,
|
||||
messages: None = None,
|
||||
*args: _P.args,
|
||||
**kwargs: _P.kwargs,
|
||||
) -> Runnable[Sequence[MessageLikeRepresentation], _R_co]: ...
|
||||
|
||||
@overload
|
||||
def __call__(
|
||||
self,
|
||||
messages: Sequence[MessageLikeRepresentation] | PromptValue,
|
||||
*args: _P.args,
|
||||
**kwargs: _P.kwargs,
|
||||
) -> _R_co: ...
|
||||
|
||||
def __call__(
|
||||
self,
|
||||
messages: Sequence[MessageLikeRepresentation] | PromptValue | None = None,
|
||||
*args: _P.args,
|
||||
**kwargs: _P.kwargs,
|
||||
) -> _R_co | Runnable[Sequence[MessageLikeRepresentation], _R_co]: ...
|
||||
|
||||
|
||||
def _runnable_support(
|
||||
func: Callable[
|
||||
Concatenate[Sequence[MessageLikeRepresentation] | PromptValue, _P], _R_co
|
||||
],
|
||||
) -> _RunnableSupportCallable[_P, _R_co]:
|
||||
@wraps(func)
|
||||
def wrapped(
|
||||
messages: Sequence[MessageLikeRepresentation] | PromptValue | None = None,
|
||||
*args: _P.args,
|
||||
**kwargs: _P.kwargs,
|
||||
) -> _R_co | Runnable[Sequence[MessageLikeRepresentation], _R_co]:
|
||||
messages: None = None, **kwargs: Any
|
||||
) -> Runnable[Sequence[MessageLikeRepresentation], list[BaseMessage]]: ...
|
||||
|
||||
@overload
|
||||
def wrapped(
|
||||
messages: Sequence[MessageLikeRepresentation], **kwargs: Any
|
||||
) -> list[BaseMessage]: ...
|
||||
|
||||
def wrapped(
|
||||
messages: Sequence[MessageLikeRepresentation] | None = None,
|
||||
**kwargs: Any,
|
||||
) -> (
|
||||
list[BaseMessage]
|
||||
| Runnable[Sequence[MessageLikeRepresentation], list[BaseMessage]]
|
||||
):
|
||||
# Import locally to prevent circular import.
|
||||
from langchain_core.runnables.base import RunnableLambda # noqa: PLC0415
|
||||
|
||||
if messages is not None:
|
||||
return func(messages, *args, **kwargs)
|
||||
return func(messages, **kwargs)
|
||||
return RunnableLambda(partial(func, **kwargs), name=func.__name__)
|
||||
|
||||
return cast("_RunnableSupportCallable[_P, _R_co]", wrapped)
|
||||
wrapped.__doc__ = func.__doc__
|
||||
return wrapped
|
||||
|
||||
|
||||
@_runnable_support
|
||||
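The `_runnable_support` hunk above wraps a message utility so that calling it with messages runs it immediately, while calling it without messages returns a deferred object bound to the keyword arguments. A minimal sketch of that dispatch pattern, assuming `functools.partial` as a stand-in for `RunnableLambda` so it runs without LangChain installed; the names `deferred_support` and `take_last` are hypothetical and not part of the library:

```python
from functools import partial, wraps
from typing import Any, Callable


def deferred_support(func: Callable[..., Any]) -> Callable[..., Any]:
    """Call `func` now if messages are given, else return a deferred callable."""

    @wraps(func)  # copies __name__ and __doc__ from the wrapped function
    def wrapped(messages: Any = None, **kwargs: Any) -> Any:
        if messages is not None:
            return func(messages, **kwargs)
        # No messages yet: bind the keyword arguments and wait for the input.
        # (The real decorator returns a RunnableLambda here; partial is a stand-in.)
        return partial(func, **kwargs)

    return wrapped


@deferred_support
def take_last(messages: list, *, n: int = 1) -> list:
    """Keep the last `n` messages."""
    return messages[-n:]


direct = take_last(["a", "b", "c"], n=2)   # runs immediately
deferred = take_last(n=2)                  # returns a callable bound to n=2
later = deferred(["a", "b", "c"])          # runs once the input arrives
```

Both call styles produce the same result. `functools.wraps` carries over the docstring, which one side of the hunk does by hand with `wrapped.__doc__ = func.__doc__`.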
@@ -729,8 +695,7 @@ def trim_messages(
|
||||
max_tokens: int,
|
||||
token_counter: Callable[[list[BaseMessage]], int]
|
||||
| Callable[[BaseMessage], int]
|
||||
| BaseLanguageModel
|
||||
| Literal["approximate"],
|
||||
| BaseLanguageModel,
|
||||
strategy: Literal["first", "last"] = "last",
|
||||
allow_partial: bool = False,
|
||||
end_on: str | type[BaseMessage] | Sequence[str | type[BaseMessage]] | None = None,
|
||||
@@ -768,65 +733,51 @@ def trim_messages(
|
||||
messages: Sequence of Message-like objects to trim.
|
||||
max_tokens: Max token count of trimmed messages.
|
||||
token_counter: Function or llm for counting tokens in a `BaseMessage` or a
|
||||
list of `BaseMessage`.
|
||||
|
||||
If a `BaseLanguageModel` is passed in then
|
||||
`BaseLanguageModel.get_num_tokens_from_messages()` will be used. Set to
|
||||
`len` to count the number of **messages** in the chat history.
|
||||
|
||||
You can also use string shortcuts for convenience:
|
||||
|
||||
- `'approximate'`: Uses `count_tokens_approximately` for fast, approximate
|
||||
token counts.
|
||||
list of `BaseMessage`. If a `BaseLanguageModel` is passed in then
|
||||
`BaseLanguageModel.get_num_tokens_from_messages()` will be used.
|
||||
Set to `len` to count the number of **messages** in the chat history.
|
||||
|
||||
!!! note
|
||||
|
||||
`count_tokens_approximately` (or the shortcut `'approximate'`) is
|
||||
recommended for using `trim_messages` on the hot path, where exact token
|
||||
counting is not necessary.
|
||||
Use `count_tokens_approximately` to get fast, approximate token
|
||||
counts.
|
||||
This is recommended for using `trim_messages` on the hot path, where
|
||||
exact token counting is not necessary.
|
||||
|
||||
strategy: Strategy for trimming.
|
||||
|
||||
- `'first'`: Keep the first `<= max_tokens` tokens of the messages.
|
||||
- `'last'`: Keep the last `<= max_tokens` tokens of the messages.
|
||||
allow_partial: Whether to split a message if only part of the message can be
|
||||
included.
|
||||
|
||||
If `strategy='last'` then the last partial contents of a message are
|
||||
included. If `strategy='first'` then the first partial contents of a
|
||||
included. If `strategy='last'` then the last partial contents of a message
|
||||
are included. If `strategy='first'` then the first partial contents of a
|
||||
message are included.
|
||||
end_on: The message type to end on.
|
||||
end_on: The message type to end on. If specified then every message after the
|
||||
last occurrence of this type is ignored. If `strategy='last'` then this
|
||||
is done before we attempt to get the last `max_tokens`. If
|
||||
`strategy='first'` then this is done after we get the first
|
||||
`max_tokens`. Can be specified as string names (e.g. `'system'`,
|
||||
`'human'`, `'ai'`, ...) or as `BaseMessage` classes (e.g.
|
||||
`SystemMessage`, `HumanMessage`, `AIMessage`, ...). Can be a single
|
||||
type or a list of types.
|
||||
|
||||
If specified then every message after the last occurrence of this type is
|
||||
ignored. If `strategy='last'` then this is done before we attempt to get the
|
||||
last `max_tokens`. If `strategy='first'` then this is done after we get the
|
||||
first `max_tokens`. Can be specified as string names (e.g. `'system'`,
|
||||
`'human'`, `'ai'`, ...) or as `BaseMessage` classes (e.g. `SystemMessage`,
|
||||
`HumanMessage`, `AIMessage`, ...). Can be a single type or a list of types.
|
||||
|
||||
start_on: The message type to start on.
|
||||
|
||||
Should only be specified if `strategy='last'`. If specified then every
|
||||
message before the first occurrence of this type is ignored. This is done
|
||||
after we trim the initial messages to the last `max_tokens`. Does not apply
|
||||
to a `SystemMessage` at index 0 if `include_system=True`. Can be specified
|
||||
as string names (e.g. `'system'`, `'human'`, `'ai'`, ...) or as
|
||||
`BaseMessage` classes (e.g. `SystemMessage`, `HumanMessage`, `AIMessage`,
|
||||
...). Can be a single type or a list of types.
|
||||
start_on: The message type to start on. Should only be specified if
|
||||
`strategy='last'`. If specified then every message before
|
||||
the first occurrence of this type is ignored. This is done after we trim
|
||||
the initial messages to the last `max_tokens`. Does not
|
||||
apply to a `SystemMessage` at index 0 if `include_system=True`. Can be
|
||||
specified as string names (e.g. `'system'`, `'human'`, `'ai'`, ...) or
|
||||
as `BaseMessage` classes (e.g. `SystemMessage`, `HumanMessage`,
|
||||
`AIMessage`, ...). Can be a single type or a list of types.
|
||||
|
||||
include_system: Whether to keep the `SystemMessage` if there is one at index
|
||||
`0`.
|
||||
|
||||
Should only be specified if `strategy="last"`.
|
||||
`0`. Should only be specified if `strategy="last"`.
|
||||
text_splitter: Function or `langchain_text_splitters.TextSplitter` for
|
||||
splitting the string contents of a message.
|
||||
|
||||
Only used if `allow_partial=True`. If `strategy='last'` then the last split
|
||||
tokens from a partial message will be included. If `strategy='first'` then
|
||||
the first split tokens from a partial message will be included. Token
|
||||
splitter assumes that separators are kept, so that split contents can be
|
||||
directly concatenated to recreate the original text. Defaults to splitting
|
||||
on newlines.
|
||||
splitting the string contents of a message. Only used if
|
||||
`allow_partial=True`. If `strategy='last'` then the last split tokens
|
||||
from a partial message will be included. If `strategy='first'` then the
|
||||
first split tokens from a partial message will be included. Token splitter
|
||||
assumes that separators are kept, so that split contents can be directly
|
||||
concatenated to recreate the original text. Defaults to splitting on
|
||||
newlines.
|
||||
|
||||
Returns:
|
||||
List of trimmed `BaseMessage`.
|
||||
@@ -837,8 +788,8 @@ def trim_messages(
|
||||
|
||||
Example:
|
||||
Trim chat history based on token count, keeping the `SystemMessage` if
|
||||
present, and ensuring that the chat history starts with a `HumanMessage` (or a
|
||||
`SystemMessage` followed by a `HumanMessage`).
|
||||
present, and ensuring that the chat history starts with a `HumanMessage` (
|
||||
or a `SystemMessage` followed by a `HumanMessage`).
|
||||
|
||||
```python
|
||||
from langchain_core.messages import (
|
||||
@@ -891,34 +842,8 @@ def trim_messages(
|
||||
]
|
||||
```
|
||||
|
||||
Trim chat history using approximate token counting with `'approximate'`:
|
||||
|
||||
```python
|
||||
trim_messages(
|
||||
messages,
|
||||
max_tokens=45,
|
||||
strategy="last",
|
||||
# Using the "approximate" shortcut for fast token counting
|
||||
token_counter="approximate",
|
||||
start_on="human",
|
||||
include_system=True,
|
||||
)
|
||||
|
||||
# This is equivalent to using `count_tokens_approximately` directly
|
||||
from langchain_core.messages.utils import count_tokens_approximately
|
||||
|
||||
trim_messages(
|
||||
messages,
|
||||
max_tokens=45,
|
||||
strategy="last",
|
||||
token_counter=count_tokens_approximately,
|
||||
start_on="human",
|
||||
include_system=True,
|
||||
)
|
||||
```
|
||||
|
||||
Trim chat history based on the message count, keeping the `SystemMessage` if
|
||||
present, and ensuring that the chat history starts with a HumanMessage (
|
||||
present, and ensuring that the chat history starts with a `HumanMessage` (
|
||||
or a `SystemMessage` followed by a `HumanMessage`).
|
||||
|
||||
trim_messages(
|
||||
@@ -1040,44 +965,24 @@ def trim_messages(
|
||||
raise ValueError(msg)
|
||||
|
||||
messages = convert_to_messages(messages)
|
||||
|
||||
# Handle string shortcuts for token counter
|
||||
if isinstance(token_counter, str):
|
||||
if token_counter in _TOKEN_COUNTER_SHORTCUTS:
|
||||
actual_token_counter = _TOKEN_COUNTER_SHORTCUTS[token_counter]
|
||||
else:
|
||||
available_shortcuts = ", ".join(
|
||||
f"'{key}'" for key in _TOKEN_COUNTER_SHORTCUTS
|
||||
)
|
||||
msg = (
|
||||
f"Invalid token_counter shortcut '{token_counter}'. "
|
||||
f"Available shortcuts: {available_shortcuts}."
|
||||
)
|
||||
raise ValueError(msg)
|
||||
else:
|
||||
# Type narrowing: at this point token_counter is not a str
|
||||
actual_token_counter = token_counter # type: ignore[assignment]
|
||||
|
||||
if hasattr(actual_token_counter, "get_num_tokens_from_messages"):
|
||||
list_token_counter = actual_token_counter.get_num_tokens_from_messages
|
||||
elif callable(actual_token_counter):
|
||||
if hasattr(token_counter, "get_num_tokens_from_messages"):
|
||||
list_token_counter = token_counter.get_num_tokens_from_messages
|
||||
elif callable(token_counter):
|
||||
if (
|
||||
next(
|
||||
iter(inspect.signature(actual_token_counter).parameters.values())
|
||||
).annotation
|
||||
next(iter(inspect.signature(token_counter).parameters.values())).annotation
|
||||
is BaseMessage
|
||||
):
|
||||
|
||||
def list_token_counter(messages: Sequence[BaseMessage]) -> int:
|
||||
return sum(actual_token_counter(msg) for msg in messages) # type: ignore[arg-type, misc]
|
||||
return sum(token_counter(msg) for msg in messages) # type: ignore[arg-type, misc]
|
||||
|
||||
else:
|
||||
list_token_counter = actual_token_counter
|
||||
list_token_counter = token_counter
|
||||
else:
|
||||
msg = (
|
||||
f"'token_counter' expected to be a model that implements "
|
||||
f"'get_num_tokens_from_messages()' or a function. Received object of type "
|
||||
f"{type(actual_token_counter)}."
|
||||
f"{type(token_counter)}."
|
||||
)
|
||||
raise ValueError(msg)
|
||||
|
||||
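The hunk above normalizes several accepted `token_counter` forms (a string shortcut, a model exposing `get_num_tokens_from_messages`, a per-message function, or a list-level function) into one list-level counter. A self-contained sketch of that resolution order follows. `Message`, `approx_count`, `SHORTCUTS`, and `resolve_counter` are hypothetical stand-ins, and the four-characters-per-token estimate is an assumption made only for illustration:

```python
import inspect
from typing import Any, Callable, Dict, Sequence


class Message:
    """Stand-in for `BaseMessage`; only carries text."""

    def __init__(self, text: str) -> None:
        self.text = text


def approx_count(messages: Sequence[Message]) -> int:
    """Rough estimate: one token per four characters, rounded up per message."""
    return sum(-(-len(m.text) // 4) for m in messages)


SHORTCUTS: Dict[str, Callable[[Sequence[Message]], int]] = {
    "approximate": approx_count,
}


def resolve_counter(token_counter: Any) -> Callable[[Sequence[Message]], int]:
    """Normalize the accepted `token_counter` forms to a list-level counter."""
    if isinstance(token_counter, str):
        if token_counter not in SHORTCUTS:
            available = ", ".join(f"'{key}'" for key in SHORTCUTS)
            msg = (
                f"Invalid token_counter shortcut '{token_counter}'. "
                f"Available shortcuts: {available}."
            )
            raise ValueError(msg)
        token_counter = SHORTCUTS[token_counter]
    if hasattr(token_counter, "get_num_tokens_from_messages"):
        return token_counter.get_num_tokens_from_messages
    if not callable(token_counter):
        msg = f"Unsupported token_counter of type {type(token_counter)}."
        raise ValueError(msg)
    first_param = next(iter(inspect.signature(token_counter).parameters.values()))
    if first_param.annotation is Message:
        # Per-message counter: sum its result over the whole list.
        return lambda messages: sum(token_counter(m) for m in messages)
    return token_counter


def per_message(message: Message) -> int:
    """Per-message counter: number of whitespace-separated words."""
    return len(message.text.split())


history = [Message("hello there"), Message("general kenobi, you are bold")]
by_shortcut = resolve_counter("approximate")(history)
by_message = resolve_counter(per_message)(history)
by_len = resolve_counter(len)(history)  # `len` counts messages, not tokens
```

The per-message branch depends on the first parameter's annotation being the class object itself, so it would not trigger if annotations were stored as strings (for example under `from __future__ import annotations`).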
@@ -1117,7 +1022,6 @@ def convert_to_openai_messages(
|
||||
*,
|
||||
text_format: Literal["string", "block"] = "string",
|
||||
include_id: bool = False,
|
||||
pass_through_unknown_blocks: bool = True,
|
||||
) -> dict | list[dict]:
|
||||
"""Convert LangChain messages into OpenAI message dicts.
|
||||
|
||||
@@ -1137,9 +1041,6 @@ def convert_to_openai_messages(
|
||||
content blocks these are left as is.
|
||||
include_id: Whether to include message IDs in the openai messages, if they
|
||||
are present in the source messages.
|
||||
pass_through_unknown_blocks: Whether to include content blocks with unknown
|
||||
formats in the output. If `False`, an error is raised if an unknown
|
||||
content block is encountered.
|
||||
|
||||
Raises:
|
||||
ValueError: if an unrecognized `text_format` is specified, or if a message
|
||||
@@ -1389,36 +1290,6 @@ def convert_to_openai_messages(
|
||||
},
|
||||
}
|
||||
)
|
||||
elif block.get("type") == "function_call": # OpenAI Responses
|
||||
if not any(
|
||||
tool_call["id"] == block.get("call_id")
|
||||
for tool_call in cast("AIMessage", message).tool_calls
|
||||
):
|
||||
if missing := [
|
||||
k
|
||||
for k in ("call_id", "name", "arguments")
|
||||
if k not in block
|
||||
]:
|
||||
err = (
|
||||
f"Unrecognized content block at "
|
||||
f"messages[{i}].content[{j}] has 'type': "
|
||||
f"'function_call', but is missing expected key(s) "
|
||||
f"{missing}. Full content block:\n\n{block}"
|
||||
)
|
||||
raise ValueError(err)
|
||||
oai_msg["tool_calls"] = oai_msg.get("tool_calls", [])
|
||||
oai_msg["tool_calls"].append(
|
||||
{
|
||||
"type": "function",
|
||||
"id": block.get("call_id"),
|
||||
"function": {
|
||||
"name": block.get("name"),
|
||||
"arguments": block.get("arguments"),
|
||||
},
|
||||
}
|
||||
)
|
||||
if pass_through_unknown_blocks:
|
||||
content.append(block)
|
||||
elif block.get("type") == "tool_result":
|
||||
if missing := [
|
||||
k for k in ("content", "tool_use_id") if k not in block
|
||||
@@ -1499,10 +1370,7 @@ def convert_to_openai_messages(
|
||||
},
|
||||
}
|
||||
)
|
||||
elif (
|
||||
block.get("type") in {"thinking", "reasoning"}
|
||||
or pass_through_unknown_blocks
|
||||
):
|
||||
elif block.get("type") in ["thinking", "reasoning"]:
|
||||
content.append(block)
|
||||
else:
|
||||
err = (
|
||||
@@ -1875,14 +1743,3 @@ def count_tokens_approximately(
|
||||
|
||||
# round up one more time in case extra_tokens_per_message is a float
|
||||
return math.ceil(token_count)
|
||||
|
||||
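The grounding-metadata hunk above differs in how a grounding chunk becomes citation fields: one side reads both the `web` and `maps` entries and prefers the maps values, the other reads only `web`. A minimal sketch of the web-plus-maps variant, using plain dicts rather than the real chunk and citation types; the helper name `citation_fields` is hypothetical:

```python
from typing import Any, Dict


def citation_fields(chunk: Dict[str, Any]) -> Dict[str, Any]:
    """Pick url/title from a grounding chunk, preferring maps over web."""
    # `or {}` also covers a key that is present but holds None,
    # which `chunk.get("web", {})` would pass through unchanged.
    web_info = chunk.get("web") or {}
    maps_info = chunk.get("maps") or {}

    fields: Dict[str, Any] = {
        "url": maps_info.get("uri") or web_info.get("uri"),
        "title": maps_info.get("title") or web_info.get("title"),
    }
    # Maps-specific metadata is attached only when present.
    if maps_info.get("placeId"):
        fields["place_id"] = maps_info["placeId"]
    return fields


web_only = citation_fields(
    {"web": {"uri": "https://example.com", "title": "Example"}}
)
null_web = citation_fields(
    {"web": None, "maps": {"uri": "https://maps.example/p", "placeId": "abc"}}
)
```

The `null_web` case is the one where the two sides of the hunk behave differently: with `chunk.get("web", {})`, a `None` value would reach `.get("uri")` and raise `AttributeError`.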
|
||||
# Mapping from string shortcuts to token counter functions
|
||||
def _approximate_token_counter(messages: Sequence[BaseMessage]) -> int:
|
||||
"""Wrapper for `count_tokens_approximately` that matches expected signature."""
|
||||
return count_tokens_approximately(messages)
|
||||
|
||||
|
||||
_TOKEN_COUNTER_SHORTCUTS = {
|
||||
"approximate": _approximate_token_counter,
|
||||
}
|
||||
|
||||
@@ -37,7 +37,7 @@ class OutputFunctionsParser(BaseGenerationOutputParser[Any]):
|
||||
The parsed JSON object.
|
||||
|
||||
Raises:
|
||||
OutputParserException: If the output is not valid JSON.
|
||||
`OutputParserException`: If the output is not valid JSON.
|
||||
"""
|
||||
generation = result[0]
|
||||
if not isinstance(generation, ChatGeneration):
|
||||
@@ -88,7 +88,7 @@ class JsonOutputFunctionsParser(BaseCumulativeTransformOutputParser[Any]):
|
||||
The parsed JSON object.
|
||||
|
||||
Raises:
|
||||
OutputParserException: If the output is not valid JSON.
|
||||
`OutputParserException`: If the output is not valid JSON.
|
||||
"""
|
||||
if len(result) != 1:
|
||||
msg = f"Expected exactly one result, but got {len(result)}"
|
||||
|
||||
@@ -47,24 +47,22 @@ def parse_tool_call(
|
||||
"""
|
||||
if "function" not in raw_tool_call:
|
||||
return None
|
||||
|
||||
arguments = raw_tool_call["function"]["arguments"]
|
||||
|
||||
if partial:
|
||||
try:
|
||||
function_args = parse_partial_json(arguments, strict=strict)
|
||||
function_args = parse_partial_json(
|
||||
raw_tool_call["function"]["arguments"], strict=strict
|
||||
)
|
||||
except (JSONDecodeError, TypeError): # None args raise TypeError
|
||||
return None
|
||||
# Handle None or empty string arguments for parameter-less tools
|
||||
elif not arguments:
|
||||
function_args = {}
|
||||
else:
|
||||
try:
|
||||
function_args = json.loads(arguments, strict=strict)
|
||||
function_args = json.loads(
|
||||
raw_tool_call["function"]["arguments"], strict=strict
|
||||
)
|
||||
except JSONDecodeError as e:
|
||||
msg = (
|
||||
f"Function {raw_tool_call['function']['name']} arguments:\n\n"
|
||||
f"{arguments}\n\nare not valid JSON. "
|
||||
f"{raw_tool_call['function']['arguments']}\n\nare not valid JSON. "
|
||||
f"Received JSONDecodeError {e}"
|
||||
)
|
||||
raise OutputParserException(msg) from e
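Reviewer note: the non-partial branch of this hunk can be sketched in isolation as below. This is a simplified illustration using only the standard library. `parse_arguments` is a hypothetical helper name, and `ValueError` stands in for `OutputParserException` so the snippet has no third-party imports.

```python
import json
from json import JSONDecodeError
from typing import Any, Dict, Optional


def parse_arguments(arguments: Optional[str], *, strict: bool = False) -> Dict[str, Any]:
    """Hypothetical helper: decode a tool call's JSON-encoded arguments."""
    # None or an empty string is treated as "no arguments" (parameter-less tools).
    if not arguments:
        return {}
    try:
        # strict=False lets the decoder accept control characters inside strings.
        return json.loads(arguments, strict=strict)
    except JSONDecodeError as e:
        msg = (
            f"Arguments:\n\n{arguments}\n\nare not valid JSON. "
            f"Received JSONDecodeError {e}"
        )
        # ValueError is a stand-in for OutputParserException in this sketch.
        raise ValueError(msg) from e
```

Binding the raw string to a local name once, as one side of the diff does with `arguments`, avoids repeating the nested lookup and makes the empty-arguments check straightforward.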
|
||||
|
||||
@@ -37,7 +37,7 @@ class PydanticOutputParser(JsonOutputParser, Generic[TBaseModel]):
|
||||
def _parser_exception(
|
||||
self, e: Exception, json_object: dict
|
||||
) -> OutputParserException:
|
||||
json_string = json.dumps(json_object, ensure_ascii=False)
|
||||
json_string = json.dumps(json_object)
|
||||
name = self.pydantic_object.__name__
|
||||
msg = f"Failed to parse {name} from completion {json_string}. Got: {e}"
|
||||
return OutputParserException(msg, llm_output=json_string)
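Reviewer note: the only difference between the two `json.dumps` calls in this hunk is `ensure_ascii`. A small standalone illustration of its effect on an error message follows; the payload is made up for the example.

```python
import json

payload = {"city": "Zürich"}  # illustrative data, not from the diff

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(payload)

# With ensure_ascii=False the characters are kept as-is, which reads better in logs.
readable = json.dumps(payload, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode to the same object; the choice only affects how readable the failed completion is when it is embedded in the exception message.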
|
||||
@@ -54,7 +54,7 @@ class PydanticOutputParser(JsonOutputParser, Generic[TBaseModel]):
|
||||
all the keys that have been returned so far.
|
||||
|
||||
Raises:
|
||||
OutputParserException: If the result is not valid JSON
|
||||
`OutputParserException`: If the result is not valid JSON
|
||||
or does not conform to the Pydantic model.
|
||||
|
||||
Returns:
|
||||
|
||||
@@ -6,33 +6,7 @@ from langchain_core.output_parsers.transform import BaseTransformOutputParser
|
||||
|
||||
|
||||
class StrOutputParser(BaseTransformOutputParser[str]):
|
||||
"""Extract text content from model outputs as a string.
|
||||
|
||||
Converts model outputs (such as `AIMessage` or `AIMessageChunk` objects) into plain
|
||||
text strings. It's the simplest output parser and is useful when you need string
|
||||
responses for downstream processing, display, or storage.
|
||||
|
||||
Supports streaming, yielding text chunks as they're generated by the model.
|
||||
|
||||
Example:
|
||||
```python
|
||||
from langchain_core.output_parsers import StrOutputParser
|
||||
from langchain_openai import ChatOpenAI
|
||||
|
||||
model = ChatOpenAI(model="gpt-4o")
|
||||
parser = StrOutputParser()
|
||||
|
||||
# Get string output from a model
|
||||
message = model.invoke("Tell me a joke")
|
||||
result = parser.invoke(message)
|
||||
print(result) # plain string
|
||||
|
||||
# With streaming - use transform() to process a stream
|
||||
stream = model.stream("Tell me a story")
|
||||
for chunk in parser.transform(stream):
|
||||
print(chunk, end="", flush=True)
|
||||
```
|
||||
"""
|
||||
"""OutputParser that parses `LLMResult` into the top likely string."""
|
||||
|
||||
@classmethod
|
||||
def is_lc_serializable(cls) -> bool:
|
||||
|
||||
@@ -2,17 +2,15 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from typing import TYPE_CHECKING, Literal
|
||||
from typing import Literal
|
||||
|
||||
from pydantic import model_validator
|
||||
from typing_extensions import Self
|
||||
|
||||
from langchain_core.messages import BaseMessage, BaseMessageChunk
|
||||
from langchain_core.outputs.generation import Generation
|
||||
from langchain_core.utils._merge import merge_dicts
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from typing_extensions import Self
|
||||
|
||||
|
||||
class ChatGeneration(Generation):
|
||||
"""A single chat generation output.
|
||||
|
||||
@@ -6,7 +6,7 @@ import contextlib
|
||||
import json
|
||||
import typing
|
||||
from abc import ABC, abstractmethod
|
||||
from collections.abc import Mapping
|
||||
from collections.abc import Callable, Mapping
|
||||
from functools import cached_property
|
||||
from pathlib import Path
|
||||
from typing import (
|
||||
@@ -33,8 +33,6 @@ from langchain_core.runnables.config import ensure_config
|
||||
from langchain_core.utils.pydantic import create_model_v2
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Callable
|
||||
|
||||
from langchain_core.documents import Document
|
||||
|
||||
|
||||
|
||||
@@ -903,28 +903,23 @@ class ChatPromptTemplate(BaseChatPromptTemplate):
|
||||
5. A string which is shorthand for `("human", template)`; e.g.,
|
||||
`"{user_input}"`
|
||||
template_format: Format of the template.
|
||||
**kwargs: Additional keyword arguments passed to `BasePromptTemplate`,
|
||||
including (but not limited to):
|
||||
input_variables: A list of the names of the variables whose values are
|
||||
required as inputs to the prompt.
|
||||
optional_variables: A list of the names of the variables for placeholder
|
||||
or MessagePlaceholder that are optional.
|
||||
|
||||
- `input_variables`: A list of the names of the variables whose values
|
||||
are required as inputs to the prompt.
|
||||
- `optional_variables`: A list of the names of the variables for
|
||||
placeholder or `MessagePlaceholder` that are optional.
|
||||
These variables are auto inferred from the prompt and user need not
|
||||
provide them.
|
||||
partial_variables: A dictionary of the partial variables the prompt
|
||||
template carries.
|
||||
|
||||
These variables are auto inferred from the prompt and user need not
|
||||
provide them.
|
||||
Partial variables populate the template so that you don't need to pass
|
||||
them in every time you call the prompt.
|
||||
validate_template: Whether to validate the template.
|
||||
input_types: A dictionary of the types of the variables the prompt template
|
||||
expects.
|
||||
|
||||
- `partial_variables`: A dictionary of the partial variables the prompt
|
||||
template carries.
|
||||
|
||||
Partial variables populate the template so that you don't need to
|
||||
pass them in every time you call the prompt.
|
||||
|
||||
- `validate_template`: Whether to validate the template.
|
||||
- `input_types`: A dictionary of the types of the variables the prompt
|
||||
template expects.
|
||||
|
||||
If not provided, all variables are assumed to be strings.
|
||||
If not provided, all variables are assumed to be strings.
|
||||
|
||||
Examples:
|
||||
Instantiation from a list of message templates:
|
||||
|
||||
@@ -6,10 +6,10 @@ from abc import ABC, abstractmethod
|
||||
from typing import TYPE_CHECKING, Any
|
||||
|
||||
from langchain_core.load import Serializable
|
||||
from langchain_core.messages import BaseMessage
|
||||
from langchain_core.utils.interactive_env import is_interactive_env
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from langchain_core.messages import BaseMessage
|
||||
from langchain_core.prompts.chat import ChatPromptTemplate
|
||||
|
||||
|
||||
|
||||
@@ -4,8 +4,9 @@ from __future__ import annotations
|
||||
|
||||
import warnings
|
||||
from abc import ABC
|
||||
from collections.abc import Callable, Sequence
|
||||
from string import Formatter
|
||||
from typing import TYPE_CHECKING, Any, Literal
|
||||
from typing import Any, Literal
|
||||
|
||||
from pydantic import BaseModel, create_model
|
||||
|
||||
@@ -15,11 +16,8 @@ from langchain_core.utils import get_colored_text, mustache
|
||||
from langchain_core.utils.formatting import formatter
|
||||
from langchain_core.utils.interactive_env import is_interactive_env
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Callable, Sequence
|
||||
|
||||
try:
|
||||
from jinja2 import meta
|
||||
from jinja2 import Environment, meta
|
||||
from jinja2.sandbox import SandboxedEnvironment
|
||||
|
||||
_HAS_JINJA2 = True
|
||||
@@ -61,9 +59,13 @@ def jinja2_formatter(template: str, /, **kwargs: Any) -> str:
|
||||
)
|
||||
raise ImportError(msg)
|
||||
|
||||
# Use a restricted sandbox that blocks ALL attribute/method access
|
||||
# Only simple variable lookups like {{variable}} are allowed
|
||||
# Attribute access like {{variable.attr}} or {{variable.method()}} is blocked
|
||||
# This uses a sandboxed environment to prevent arbitrary code execution.
|
||||
# Jinja2 uses an opt-out rather than opt-in approach for sand-boxing.
|
||||
# Please treat this sand-boxing as a best-effort approach rather than
|
||||
# a guarantee of security.
|
||||
# We recommend to never use jinja2 templates with untrusted inputs.
|
||||
# https://jinja.palletsprojects.com/en/3.1.x/sandbox/
|
||||
# approach not a guarantee of security.
|
||||
return SandboxedEnvironment().from_string(template).render(**kwargs)
|
||||
|
||||
|
||||
@@ -99,7 +101,7 @@ def _get_jinja2_variables_from_template(template: str) -> set[str]:
|
||||
"Please install it with `pip install jinja2`."
|
||||
)
|
||||
raise ImportError(msg)
|
||||
env = SandboxedEnvironment()
|
||||
env = Environment() # noqa: S701
|
||||
ast = env.parse(template)
|
||||
return meta.find_undeclared_variables(ast)
|
||||
|
||||
@@ -269,30 +271,6 @@ def get_template_variables(template: str, template_format: str) -> list[str]:
|
||||
msg = f"Unsupported template format: {template_format}"
|
||||
raise ValueError(msg)
|
||||
|
||||
# For f-strings, block attribute access and indexing syntax
|
||||
# This prevents template injection attacks via accessing dangerous attributes
|
||||
if template_format == "f-string":
|
||||
for var in input_variables:
|
||||
# Formatter().parse() returns field names with dots/brackets if present
|
||||
# e.g., "obj.attr" or "obj[0]" - we need to block these
|
||||
if "." in var or "[" in var or "]" in var:
|
||||
msg = (
|
||||
f"Invalid variable name {var!r} in f-string template. "
|
||||
f"Variable names cannot contain attribute "
|
||||
f"access (.) or indexing ([])."
|
||||
)
|
||||
raise ValueError(msg)
|
||||
|
||||
# Block variable names that are all digits (e.g., "0", "100")
|
||||
# These are interpreted as positional arguments, not keyword arguments
|
||||
if var.isdigit():
|
||||
msg = (
|
||||
f"Invalid variable name {var!r} in f-string template. "
|
||||
f"Variable names cannot be all digits as they are interpreted "
|
||||
f"as positional arguments."
|
||||
)
|
||||
raise ValueError(msg)
|
||||
|
||||
return sorted(input_variables)
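Reviewer note: the f-string checks in this hunk rely on `string.Formatter().parse()` returning field names with any attribute or index syntax still attached. A self-contained sketch of the same idea follows. `checked_fstring_variables` is a hypothetical name, and the extraction step is an assumption about how `input_variables` is built, since that part is outside the hunk.

```python
from string import Formatter
from typing import List


def checked_fstring_variables(template: str) -> List[str]:
    """Hypothetical helper: list template variables, rejecting risky field names."""
    # parse() yields (literal_text, field_name, format_spec, conversion) tuples;
    # field_name is None for chunks that are pure literal text.
    names = {field for _, field, _, _ in Formatter().parse(template) if field is not None}
    for var in names:
        # "obj.attr" or "obj[0]" would let a template reach into object internals.
        if "." in var or "[" in var or "]" in var:
            msg = f"Invalid variable name {var!r}: attribute access and indexing are blocked."
            raise ValueError(msg)
        # "{0}" is a positional argument, not a named variable.
        if var.isdigit():
            msg = f"Invalid variable name {var!r}: all-digit names are positional arguments."
            raise ValueError(msg)
    return sorted(names)
```

For instance, `checked_fstring_variables("Hi {name}, you are {age}")` returns the two names in sorted order, while `"{user.password}"`, `"{items[0]}"`, and `"{0}"` each raise `ValueError`.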
|
||||
|
||||
|
||||
|
||||
@@ -48,17 +48,8 @@ class StructuredPrompt(ChatPromptTemplate):
|
||||
schema_: schema for the structured prompt.
|
||||
structured_output_kwargs: additional kwargs for structured output.
|
||||
template_format: template format for the prompt.
|
||||
|
||||
Raises:
|
||||
ValueError: if schema is not provided.
|
||||
"""
|
||||
schema_ = schema_ or kwargs.pop("schema", None)
|
||||
if not schema_:
|
||||
err_msg = (
|
||||
"Must pass in a non-empty structured output schema. Received: "
|
||||
f"{schema_}"
|
||||
)
|
||||
raise ValueError(err_msg)
|
||||
schema_ = schema_ or kwargs.pop("schema")
|
||||
structured_output_kwargs = structured_output_kwargs or {}
|
||||
for k in set(kwargs).difference(get_pydantic_field_names(self.__class__)):
|
||||
structured_output_kwargs[k] = kwargs.pop(k)
|
||||
|
||||
@@ -94,7 +94,7 @@ from langchain_core.tracers.root_listeners import (
|
||||
AsyncRootListenersTracer,
|
||||
RootListenersTracer,
|
||||
)
|
||||
from langchain_core.utils.aiter import aclosing, atee
|
||||
from langchain_core.utils.aiter import aclosing, atee, py_anext
|
||||
from langchain_core.utils.iter import safetee
|
||||
from langchain_core.utils.pydantic import create_model_v2
|
||||
|
||||
@@ -127,10 +127,10 @@ class Runnable(ABC, Generic[Input, Output]):
|
||||
Key Methods
|
||||
===========
|
||||
|
||||
- `invoke`/`ainvoke`: Transforms a single input into an output.
|
||||
- `batch`/`abatch`: Efficiently transforms multiple inputs into outputs.
|
||||
- `stream`/`astream`: Streams output from a single input as it's produced.
|
||||
- `astream_log`: Streams output and selected intermediate results from an
|
||||
- **`invoke`/`ainvoke`**: Transforms a single input into an output.
|
||||
- **`batch`/`abatch`**: Efficiently transforms multiple inputs into outputs.
|
||||
- **`stream`/`astream`**: Streams output from a single input as it's produced.
|
||||
- **`astream_log`**: Streams output and selected intermediate results from an
|
||||
input.
|
||||
|
||||
Built-in optimizations:
|
||||
@@ -707,53 +707,51 @@ class Runnable(ABC, Generic[Input, Output]):
|
||||
def pick(self, keys: str | list[str]) -> RunnableSerializable[Any, Any]:
|
||||
"""Pick keys from the output `dict` of this `Runnable`.
|
||||
|
||||
!!! example "Pick a single key"
|
||||
Pick a single key:
|
||||
|
||||
```python
|
||||
import json
|
||||
```python
|
||||
import json
|
||||
|
||||
from langchain_core.runnables import RunnableLambda, RunnableMap
|
||||
from langchain_core.runnables import RunnableLambda, RunnableMap
|
||||
|
||||
as_str = RunnableLambda(str)
|
||||
as_json = RunnableLambda(json.loads)
|
||||
chain = RunnableMap(str=as_str, json=as_json)
|
||||
as_str = RunnableLambda(str)
|
||||
as_json = RunnableLambda(json.loads)
|
||||
chain = RunnableMap(str=as_str, json=as_json)
|
||||
|
||||
chain.invoke("[1, 2, 3]")
|
||||
# -> {"str": "[1, 2, 3]", "json": [1, 2, 3]}
|
||||
chain.invoke("[1, 2, 3]")
|
||||
# -> {"str": "[1, 2, 3]", "json": [1, 2, 3]}
|
||||
|
||||
json_only_chain = chain.pick("json")
|
||||
json_only_chain.invoke("[1, 2, 3]")
|
||||
# -> [1, 2, 3]
|
||||
```
|
||||
json_only_chain = chain.pick("json")
|
||||
json_only_chain.invoke("[1, 2, 3]")
|
||||
# -> [1, 2, 3]
|
||||
```
|
||||
|
||||
!!! example "Pick a list of keys"
|
||||
Pick a list of keys:
|
||||
|
||||
```python
|
||||
from typing import Any
|
||||
```python
|
||||
from typing import Any
|
||||
|
||||
import json
|
||||
import json
|
||||
|
||||
from langchain_core.runnables import RunnableLambda, RunnableMap
|
||||
from langchain_core.runnables import RunnableLambda, RunnableMap
|
||||
|
||||
as_str = RunnableLambda(str)
|
||||
as_json = RunnableLambda(json.loads)
|
||||
as_str = RunnableLambda(str)
|
||||
as_json = RunnableLambda(json.loads)
|
||||
|
||||
|
||||
def as_bytes(x: Any) -> bytes:
|
||||
return bytes(x, "utf-8")
|
||||
def as_bytes(x: Any) -> bytes:
|
||||
return bytes(x, "utf-8")
|
||||
|
||||
|
||||
chain = RunnableMap(
|
||||
str=as_str, json=as_json, bytes=RunnableLambda(as_bytes)
|
||||
)
|
||||
chain = RunnableMap(str=as_str, json=as_json, bytes=RunnableLambda(as_bytes))
|
||||
|
||||
chain.invoke("[1, 2, 3]")
|
||||
# -> {"str": "[1, 2, 3]", "json": [1, 2, 3], "bytes": b"[1, 2, 3]"}
|
||||
chain.invoke("[1, 2, 3]")
|
||||
# -> {"str": "[1, 2, 3]", "json": [1, 2, 3], "bytes": b"[1, 2, 3]"}
|
||||
|
||||
json_and_bytes_chain = chain.pick(["json", "bytes"])
|
||||
json_and_bytes_chain.invoke("[1, 2, 3]")
|
||||
# -> {"json": [1, 2, 3], "bytes": b"[1, 2, 3]"}
|
||||
```
|
||||
json_and_bytes_chain = chain.pick(["json", "bytes"])
|
||||
json_and_bytes_chain.invoke("[1, 2, 3]")
|
||||
# -> {"json": [1, 2, 3], "bytes": b"[1, 2, 3]"}
|
||||
```
|
||||
|
||||
Args:
|
||||
keys: A key or list of keys to pick from the output dict.
|
||||
@@ -1374,50 +1372,48 @@ class Runnable(ABC, Generic[Input, Output]):
|
||||
).with_config({"run_name": "my_template", "tags": ["my_template"]})
|
||||
```
|
||||
|
||||
!!! example
|
||||
For instance:
|
||||
|
||||
```python
|
||||
from langchain_core.runnables import RunnableLambda
|
||||
```python
|
||||
from langchain_core.runnables import RunnableLambda
|
||||
|
||||
|
||||
async def reverse(s: str) -> str:
|
||||
return s[::-1]
|
||||
async def reverse(s: str) -> str:
|
||||
return s[::-1]
|
||||
|
||||
|
||||
chain = RunnableLambda(func=reverse)
|
||||
chain = RunnableLambda(func=reverse)
|
||||
|
||||
events = [
|
||||
event async for event in chain.astream_events("hello", version="v2")
|
||||
]
|
||||
events = [event async for event in chain.astream_events("hello", version="v2")]
|
||||
|
||||
# Will produce the following events
|
||||
# (run_id and parent_ids have been omitted for brevity):
|
||||
[
|
||||
{
|
||||
"data": {"input": "hello"},
|
||||
"event": "on_chain_start",
|
||||
"metadata": {},
|
||||
"name": "reverse",
|
||||
"tags": [],
|
||||
},
|
||||
{
|
||||
"data": {"chunk": "olleh"},
|
||||
"event": "on_chain_stream",
|
||||
"metadata": {},
|
||||
"name": "reverse",
|
||||
"tags": [],
|
||||
},
|
||||
{
|
||||
"data": {"output": "olleh"},
|
||||
"event": "on_chain_end",
|
||||
"metadata": {},
|
||||
"name": "reverse",
|
||||
"tags": [],
|
||||
},
|
||||
]
|
||||
```
|
||||
# Will produce the following events
|
||||
# (run_id and parent_ids have been omitted for brevity):
|
||||
[
|
||||
{
|
||||
"data": {"input": "hello"},
|
||||
"event": "on_chain_start",
|
||||
"metadata": {},
|
||||
"name": "reverse",
|
||||
"tags": [],
|
||||
},
|
||||
{
|
||||
"data": {"chunk": "olleh"},
|
||||
"event": "on_chain_stream",
|
||||
"metadata": {},
|
||||
"name": "reverse",
|
||||
"tags": [],
|
||||
},
|
||||
{
|
||||
"data": {"output": "olleh"},
|
||||
"event": "on_chain_end",
|
||||
"metadata": {},
|
||||
"name": "reverse",
|
||||
"tags": [],
|
||||
},
|
||||
]
|
||||
```
|
||||
|
||||
```python title="Dispatch custom event"
|
||||
```python title="Example: Dispatch Custom Event"
|
||||
from langchain_core.callbacks.manager import (
|
||||
adispatch_custom_event,
|
||||
)
|
||||
@@ -1451,13 +1447,10 @@ class Runnable(ABC, Generic[Input, Output]):
|
||||
Args:
|
||||
input: The input to the `Runnable`.
|
||||
config: The config to use for the `Runnable`.
|
||||
version: The version of the schema to use, either `'v2'` or `'v1'`.
|
||||
|
||||
version: The version of the schema to use either `'v2'` or `'v1'`.
|
||||
Users should use `'v2'`.
|
||||
|
||||
`'v1'` is for backwards compatibility and will be deprecated
|
||||
in `0.4.0`.
|
||||
|
||||
No default will be assigned until the API is stabilized.
|
||||
custom events will only be surfaced in `'v2'`.
|
||||
include_names: Only include events from `Runnable` objects with matching names.
|
||||
@@ -1467,7 +1460,6 @@ class Runnable(ABC, Generic[Input, Output]):
|
||||
exclude_types: Exclude events from `Runnable` objects with matching types.
|
||||
exclude_tags: Exclude events from `Runnable` objects with matching tags.
|
||||
**kwargs: Additional keyword arguments to pass to the `Runnable`.
|
||||
|
||||
These will be passed to `astream_log` as this implementation
|
||||
of `astream_events` is built on top of `astream_log`.
|
||||
|
||||
@@ -2277,9 +2269,6 @@ class Runnable(ABC, Generic[Input, Output]):
|
||||
Use this to implement `stream` or `transform` in `Runnable` subclasses.
|
||||
|
||||
"""
|
||||
# Extract defers_inputs from kwargs if present
|
||||
defers_inputs = kwargs.pop("defers_inputs", False)
|
||||
|
||||
# tee the input so we can iterate over it twice
|
||||
input_for_tracing, input_for_transform = tee(inputs, 2)
|
||||
# Start the input iterator to ensure the input Runnable starts before this one
|
||||
@@ -2296,7 +2285,6 @@ class Runnable(ABC, Generic[Input, Output]):
|
||||
run_type=run_type,
|
||||
name=config.get("run_name") or self.get_name(),
|
||||
run_id=config.pop("run_id", None),
|
||||
defers_inputs=defers_inputs,
|
||||
)
|
||||
try:
|
||||
child_config = patch_config(config, callbacks=run_manager.get_child())
|
||||
@@ -2378,13 +2366,10 @@ class Runnable(ABC, Generic[Input, Output]):
|
||||
Use this to implement `astream` or `atransform` in `Runnable` subclasses.
|
||||
|
||||
"""
|
||||
# Extract defers_inputs from kwargs if present
|
||||
defers_inputs = kwargs.pop("defers_inputs", False)
|
||||
|
||||
# tee the input so we can iterate over it twice
|
||||
input_for_tracing, input_for_transform = atee(inputs, 2)
|
||||
# Start the input iterator to ensure the input Runnable starts before this one
|
||||
final_input: Input | None = await anext(input_for_tracing, None)
|
||||
final_input: Input | None = await py_anext(input_for_tracing, None)
|
||||
final_input_supported = True
|
||||
final_output: Output | None = None
|
||||
final_output_supported = True
|
||||
@@ -2397,7 +2382,6 @@ class Runnable(ABC, Generic[Input, Output]):
|
||||
run_type=run_type,
|
||||
name=config.get("run_name") or self.get_name(),
|
||||
run_id=config.pop("run_id", None),
|
||||
defers_inputs=defers_inputs,
|
||||
)
|
||||
try:
|
||||
child_config = patch_config(config, callbacks=run_manager.get_child())
|
||||
@@ -2425,7 +2409,7 @@ class Runnable(ABC, Generic[Input, Output]):
|
||||
iterator = iterator_
|
||||
try:
|
||||
while True:
|
||||
chunk = await coro_with_context(anext(iterator), context)
|
||||
chunk = await coro_with_context(py_anext(iterator), context)
|
||||
yield chunk
|
||||
if final_output_supported:
|
||||
if final_output is None:
|
||||
@@ -2492,82 +2476,82 @@ class Runnable(ABC, Generic[Input, Output]):
|
||||
Returns:
|
||||
A `BaseTool` instance.
|
||||
|
||||
!!! example "`TypedDict` input"
|
||||
Typed dict input:
|
||||
|
||||
```python
|
||||
from typing_extensions import TypedDict
|
||||
from langchain_core.runnables import RunnableLambda
|
||||
```python
|
||||
from typing_extensions import TypedDict
|
||||
from langchain_core.runnables import RunnableLambda
|
||||
|
||||
|
||||
class Args(TypedDict):
|
||||
a: int
|
||||
b: list[int]
|
||||
class Args(TypedDict):
|
||||
a: int
|
||||
b: list[int]
|
||||
|
||||
|
||||
def f(x: Args) -> str:
|
||||
return str(x["a"] * max(x["b"]))
|
||||
def f(x: Args) -> str:
|
||||
return str(x["a"] * max(x["b"]))
|
||||
|
||||
|
||||
runnable = RunnableLambda(f)
|
||||
as_tool = runnable.as_tool()
|
||||
as_tool.invoke({"a": 3, "b": [1, 2]})
|
||||
```
|
||||
runnable = RunnableLambda(f)
|
||||
as_tool = runnable.as_tool()
|
||||
as_tool.invoke({"a": 3, "b": [1, 2]})
|
||||
```
|
||||
|
||||
!!! example "`dict` input, specifying schema via `args_schema`"
|
||||
`dict` input, specifying schema via `args_schema`:
|
||||
|
||||
```python
|
||||
from typing import Any
|
||||
from pydantic import BaseModel, Field
|
||||
from langchain_core.runnables import RunnableLambda
|
||||
```python
|
||||
from typing import Any
|
||||
from pydantic import BaseModel, Field
|
||||
from langchain_core.runnables import RunnableLambda
|
||||
|
||||
def f(x: dict[str, Any]) -> str:
|
||||
return str(x["a"] * max(x["b"]))
|
||||
def f(x: dict[str, Any]) -> str:
|
||||
return str(x["a"] * max(x["b"]))
|
||||
|
||||
class FSchema(BaseModel):
|
||||
\"\"\"Apply a function to an integer and list of integers.\"\"\"
|
||||
class FSchema(BaseModel):
|
||||
\"\"\"Apply a function to an integer and list of integers.\"\"\"
|
||||
|
||||
a: int = Field(..., description="Integer")
|
||||
b: list[int] = Field(..., description="List of ints")
|
||||
a: int = Field(..., description="Integer")
|
||||
b: list[int] = Field(..., description="List of ints")
|
||||
|
||||
runnable = RunnableLambda(f)
|
||||
as_tool = runnable.as_tool(FSchema)
|
||||
as_tool.invoke({"a": 3, "b": [1, 2]})
|
||||
```
|
||||
runnable = RunnableLambda(f)
|
||||
as_tool = runnable.as_tool(FSchema)
|
||||
as_tool.invoke({"a": 3, "b": [1, 2]})
|
||||
```
|
||||
|
||||
!!! example "`dict` input, specifying schema via `arg_types`"
|
||||
`dict` input, specifying schema via `arg_types`:
|
||||
|
||||
```python
|
||||
from typing import Any
|
||||
from langchain_core.runnables import RunnableLambda
|
||||
```python
|
||||
from typing import Any
|
||||
from langchain_core.runnables import RunnableLambda
|
||||
|
||||
|
||||
def f(x: dict[str, Any]) -> str:
|
||||
return str(x["a"] * max(x["b"]))
|
||||
def f(x: dict[str, Any]) -> str:
|
||||
return str(x["a"] * max(x["b"]))
|
||||
|
||||
|
||||
runnable = RunnableLambda(f)
|
||||
as_tool = runnable.as_tool(arg_types={"a": int, "b": list[int]})
|
||||
as_tool.invoke({"a": 3, "b": [1, 2]})
|
||||
```
|
||||
runnable = RunnableLambda(f)
|
||||
as_tool = runnable.as_tool(arg_types={"a": int, "b": list[int]})
|
||||
as_tool.invoke({"a": 3, "b": [1, 2]})
|
||||
```
|
||||
|
||||
!!! example "`str` input"
|
||||
`str` input:
|
||||
|
||||
```python
|
||||
from langchain_core.runnables import RunnableLambda
|
||||
```python
|
||||
from langchain_core.runnables import RunnableLambda
|
||||
|
||||
|
||||
def f(x: str) -> str:
|
||||
return x + "a"
|
||||
def f(x: str) -> str:
|
||||
return x + "a"
|
||||
|
||||
|
||||
def g(x: str) -> str:
|
||||
return x + "z"
|
||||
def g(x: str) -> str:
|
||||
return x + "z"
|
||||
|
||||
|
||||
runnable = RunnableLambda(f) | g
|
||||
as_tool = runnable.as_tool()
|
||||
as_tool.invoke("b")
|
||||
```
|
||||
runnable = RunnableLambda(f) | g
|
||||
as_tool = runnable.as_tool()
|
||||
as_tool.invoke("b")
|
||||
```
|
||||
"""
|
||||
# Avoid circular import
|
||||
from langchain_core.tools import convert_runnable_to_tool # noqa: PLC0415
|
||||
@@ -2619,33 +2603,29 @@ class RunnableSerializable(Serializable, Runnable[Input, Output]):
|
||||
Returns:
|
||||
A new `Runnable` with the fields configured.
|
||||
|
||||
!!! example
|
||||
```python
|
||||
from langchain_core.runnables import ConfigurableField
|
||||
from langchain_openai import ChatOpenAI
|
||||
|
||||
```python
|
||||
from langchain_core.runnables import ConfigurableField
|
||||
from langchain_openai import ChatOpenAI
|
||||
|
||||
model = ChatOpenAI(max_tokens=20).configurable_fields(
|
||||
max_tokens=ConfigurableField(
|
||||
id="output_token_number",
|
||||
name="Max tokens in the output",
|
||||
description="The maximum number of tokens in the output",
|
||||
)
|
||||
model = ChatOpenAI(max_tokens=20).configurable_fields(
|
||||
max_tokens=ConfigurableField(
|
||||
id="output_token_number",
|
||||
name="Max tokens in the output",
|
||||
description="The maximum number of tokens in the output",
|
||||
)
|
||||
)
|
||||
|
||||
# max_tokens = 20
|
||||
print(
|
||||
"max_tokens_20: ", model.invoke("tell me something about chess").content
|
||||
)
|
||||
# max_tokens = 20
|
||||
print("max_tokens_20: ", model.invoke("tell me something about chess").content)
|
||||
|
||||
# max_tokens = 200
|
||||
print(
|
||||
"max_tokens_200: ",
|
||||
model.with_config(configurable={"output_token_number": 200})
|
||||
.invoke("tell me something about chess")
|
||||
.content,
|
||||
)
|
||||
```
|
||||
# max_tokens = 200
|
||||
print(
|
||||
"max_tokens_200: ",
|
||||
model.with_config(configurable={"output_token_number": 200})
|
||||
.invoke("tell me something about chess")
|
||||
.content,
|
||||
)
|
||||
```
|
||||
"""
|
||||
# Import locally to prevent circular import
|
||||
from langchain_core.runnables.configurable import ( # noqa: PLC0415
|
||||
@@ -2684,31 +2664,29 @@ class RunnableSerializable(Serializable, Runnable[Input, Output]):
|
||||
Returns:
|
||||
A new `Runnable` with the alternatives configured.
|
||||
|
||||
!!! example
|
||||
```python
|
||||
from langchain_anthropic import ChatAnthropic
|
||||
from langchain_core.runnables.utils import ConfigurableField
|
||||
from langchain_openai import ChatOpenAI
|
||||
|
||||
```python
|
||||
from langchain_anthropic import ChatAnthropic
|
||||
from langchain_core.runnables.utils import ConfigurableField
|
||||
from langchain_openai import ChatOpenAI
|
||||
model = ChatAnthropic(
|
||||
model_name="claude-sonnet-4-5-20250929"
|
||||
).configurable_alternatives(
|
||||
ConfigurableField(id="llm"),
|
||||
default_key="anthropic",
|
||||
openai=ChatOpenAI(),
|
||||
)
|
||||
|
||||
model = ChatAnthropic(
|
||||
model_name="claude-sonnet-4-5-20250929"
|
||||
).configurable_alternatives(
|
||||
ConfigurableField(id="llm"),
|
||||
default_key="anthropic",
|
||||
openai=ChatOpenAI(),
|
||||
)
|
||||
# uses the default model ChatAnthropic
|
||||
print(model.invoke("which organization created you?").content)
|
||||
|
||||
# uses the default model ChatAnthropic
|
||||
print(model.invoke("which organization created you?").content)
|
||||
|
||||
# uses ChatOpenAI
|
||||
print(
|
||||
model.with_config(configurable={"llm": "openai"})
|
||||
.invoke("which organization created you?")
|
||||
.content
|
||||
)
|
||||
```
|
||||
# uses ChatOpenAI
|
||||
print(
|
||||
model.with_config(configurable={"llm": "openai"})
|
||||
.invoke("which organization created you?")
|
||||
.content
|
||||
)
|
||||
```
|
||||
"""
|
||||
# Import locally to prevent circular import
|
||||
from langchain_core.runnables.configurable import ( # noqa: PLC0415
|
||||
@@ -4033,7 +4011,7 @@ class RunnableParallel(RunnableSerializable[Input, dict[str, Any]]):
|
||||
|
||||
# Wrap in a coroutine to satisfy linter
|
||||
async def get_next_chunk(generator: AsyncIterator) -> Output | None:
|
||||
return await anext(generator)
|
||||
return await py_anext(generator)
|
||||
|
||||
# Start the first iteration of each generator
|
||||
tasks = {
|
||||
@@ -4331,7 +4309,6 @@ class RunnableGenerator(Runnable[Input, Output]):
|
||||
input,
|
||||
self._transform, # type: ignore[arg-type]
|
||||
config,
|
||||
defers_inputs=True,
|
||||
**kwargs,
|
||||
)
|
||||
|
||||
@@ -4365,7 +4342,7 @@ class RunnableGenerator(Runnable[Input, Output]):
|
||||
raise NotImplementedError(msg)
|
||||
|
||||
return self._atransform_stream_with_config(
|
||||
input, self._atransform, config, defers_inputs=True, **kwargs
|
||||
input, self._atransform, config, **kwargs
|
||||
)
|
||||
|
||||
@override
|
||||
|
||||
@@ -303,7 +303,7 @@ class RunnableBranch(RunnableSerializable[Input, Output]):
|
||||
|
||||
Args:
|
||||
input: The input to the `Runnable`.
|
||||
config: The configuration for the `Runnable`.
|
||||
config: The configuration for the `Runnable`.
|
||||
**kwargs: Additional keyword arguments to pass to the `Runnable`.
|
||||
|
||||
Yields:
|
||||
|
||||
@@ -47,59 +47,54 @@ class EmptyDict(TypedDict, total=False):
|
||||
|
||||
|
||||
class RunnableConfig(TypedDict, total=False):
|
||||
"""Configuration for a `Runnable`.
|
||||
|
||||
See the [reference docs](https://reference.langchain.com/python/langchain_core/runnables/#langchain_core.runnables.RunnableConfig)
|
||||
for more details.
|
||||
"""
|
||||
"""Configuration for a Runnable."""
|
||||
|
||||
tags: list[str]
|
||||
"""Tags for this call and any sub-calls (e.g. a Chain calling an LLM).
|
||||
|
||||
"""
|
||||
Tags for this call and any sub-calls (eg. a Chain calling an LLM).
|
||||
You can use these to filter calls.
|
||||
"""
|
||||
|
||||
metadata: dict[str, Any]
|
||||
"""Metadata for this call and any sub-calls (e.g. a Chain calling an LLM).
|
||||
|
||||
"""
|
||||
Metadata for this call and any sub-calls (eg. a Chain calling an LLM).
|
||||
Keys should be strings, values should be JSON-serializable.
|
||||
"""
|
||||
|
||||
callbacks: Callbacks
|
||||
"""Callbacks for this call and any sub-calls (e.g. a Chain calling an LLM).
|
||||
|
||||
"""
|
||||
Callbacks for this call and any sub-calls (eg. a Chain calling an LLM).
|
||||
Tags are passed to all callbacks, metadata is passed to handle*Start callbacks.
|
||||
"""
|
||||
|
||||
run_name: str
|
||||
"""Name for the tracer run for this call.
|
||||
|
||||
Defaults to the name of the class."""
|
||||
"""
|
||||
Name for the tracer run for this call. Defaults to the name of the class.
|
||||
"""
|
||||
|
||||
max_concurrency: int | None
|
||||
"""Maximum number of parallel calls to make.
|
||||
|
||||
If not provided, defaults to `ThreadPoolExecutor`'s default.
|
||||
"""
|
||||
Maximum number of parallel calls to make. If not provided, defaults to
|
||||
`ThreadPoolExecutor`'s default.
|
||||
"""
|
||||
|
||||
recursion_limit: int
|
||||
"""Maximum number of times a call can recurse.
|
||||
|
||||
If not provided, defaults to `25`.
|
||||
"""
|
||||
Maximum number of times a call can recurse. If not provided, defaults to `25`.
|
||||
"""
|
||||
|
||||
configurable: dict[str, Any]
|
||||
"""Runtime values for attributes previously made configurable on this `Runnable`,
|
||||
"""
|
||||
Runtime values for attributes previously made configurable on this `Runnable`,
|
||||
or sub-Runnables, through `configurable_fields` or `configurable_alternatives`.
|
||||
|
||||
Check `output_schema` for a description of the attributes that have been made
|
||||
configurable.
|
||||
"""
|
||||
|
||||
run_id: uuid.UUID | None
|
||||
"""Unique identifier for the tracer run for this call.
|
||||
|
||||
If not provided, a new UUID will be generated.
|
||||
"""
|
||||
Unique identifier for the tracer run for this call. If not provided, a new UUID
|
||||
will be generated.
|
||||
"""
|
||||
|
||||
|
||||
|
||||
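Note on the hunk above: the docstrings describe the keys of `RunnableConfig` and one documented default (`recursion_limit` falls back to `25` when not provided). A minimal stdlib-only sketch of that shape follows. `SketchConfig` and `with_defaults` are hypothetical names used only for illustration and are not part of `langchain_core`; the real defaulting logic may differ.

```python
import uuid
from typing import Any, Optional, TypedDict


class SketchConfig(TypedDict, total=False):
    """Hypothetical stand-in mirroring some keys described in the diff."""

    tags: list[str]
    metadata: dict[str, Any]
    run_name: str
    max_concurrency: Optional[int]
    recursion_limit: int
    configurable: dict[str, Any]
    run_id: Optional[uuid.UUID]


def with_defaults(config: Optional[SketchConfig] = None) -> SketchConfig:
    """Fill in the one default the docstrings state explicitly (assumed helper)."""
    merged: SketchConfig = {"recursion_limit": 25}
    merged.update(config or {})
    return merged


cfg = with_defaults({"tags": ["demo"], "run_name": "my_run"})
```

Because the `TypedDict` is declared with `total=False`, every key is optional, so a caller can pass only the keys it cares about.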
@@ -28,6 +28,7 @@ from langchain_core.runnables.utils import (
|
||||
coro_with_context,
|
||||
get_unique_config_specs,
|
||||
)
|
||||
from langchain_core.utils.aiter import py_anext
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from langchain_core.callbacks.manager import AsyncCallbackManagerForChainRun
|
||||
@@ -562,7 +563,7 @@ class RunnableWithFallbacks(RunnableSerializable[Input, Output]):
|
||||
child_config,
|
||||
**kwargs,
|
||||
)
|
||||
chunk = await coro_with_context(anext(stream), context)
|
||||
chunk = await coro_with_context(py_anext(stream), context)
|
||||
except self.exceptions_to_handle as e:
|
||||
first_error = e if first_error is None else first_error
|
||||
last_error = e
|
||||
|
||||
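Note on the hunk above: the two versions differ only in whether the first chunk is pulled with the builtin `anext` or with `py_anext` from `langchain_core.utils.aiter`. The sketch below shows the general "next item or default" pattern for async iterators using only the standard library. `next_or_default` is a hypothetical helper written for this note, not the implementation of `py_anext`.

```python
import asyncio
from collections.abc import AsyncIterator
from typing import Any

_MISSING = object()  # marks "no default supplied"


async def next_or_default(iterator: AsyncIterator[Any], default: Any = _MISSING) -> Any:
    """Return the next item, or `default` once the iterator is exhausted."""
    try:
        return await iterator.__anext__()
    except StopAsyncIteration:
        if default is _MISSING:
            raise
        return default


async def _numbers() -> AsyncIterator[int]:
    yield 1


async def _demo() -> list:
    stream = _numbers()
    sentinel = object()
    first = await next_or_default(stream, sentinel)
    second = await next_or_default(stream, sentinel)
    return [first, second is sentinel]


result = asyncio.run(_demo())
```

A private sentinel object, rather than `None`, lets the caller tell "the stream ended" apart from "the stream yielded `None`", which is the same check the `first is sentinel` lines in this compare rely on.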
@@ -4,6 +4,7 @@ from __future__ import annotations
|
||||
|
||||
import inspect
|
||||
from collections import defaultdict
|
||||
from collections.abc import Callable
|
||||
from dataclasses import dataclass, field
|
||||
from enum import Enum
|
||||
from typing import (
|
||||
@@ -21,7 +22,7 @@ from langchain_core.runnables.base import Runnable, RunnableSerializable
|
||||
from langchain_core.utils.pydantic import _IgnoreUnserializable, is_basemodel_subclass
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Callable, Sequence
|
||||
from collections.abc import Sequence
|
||||
|
||||
from pydantic import BaseModel
|
||||
|
||||
@@ -641,7 +642,6 @@ class Graph:
|
||||
retry_delay: float = 1.0,
|
||||
frontmatter_config: dict[str, Any] | None = None,
|
||||
base_url: str | None = None,
|
||||
proxies: dict[str, str] | None = None,
|
||||
) -> bytes:
|
||||
"""Draw the graph as a PNG image using Mermaid.
|
||||
|
||||
@@ -674,10 +674,11 @@ class Graph:
|
||||
}
|
||||
```
|
||||
base_url: The base URL of the Mermaid server for rendering via API.
|
||||
proxies: HTTP/HTTPS proxies for requests (e.g. `{"http": "http://127.0.0.1:7890"}`).
|
||||
|
||||
|
||||
Returns:
|
||||
The PNG image as bytes.
|
||||
|
||||
"""
|
||||
# Import locally to prevent circular import
|
||||
from langchain_core.runnables.graph_mermaid import ( # noqa: PLC0415
|
||||
@@ -698,7 +699,6 @@ class Graph:
|
||||
padding=padding,
|
||||
max_retries=max_retries,
|
||||
retry_delay=retry_delay,
|
||||
proxies=proxies,
|
||||
base_url=base_url,
|
||||
)
|
||||
|
||||
|
||||
@@ -7,6 +7,7 @@ from __future__ import annotations
|
||||
|
||||
import math
|
||||
import os
|
||||
from collections.abc import Mapping, Sequence
|
||||
from typing import TYPE_CHECKING, Any
|
||||
|
||||
try:
|
||||
@@ -19,8 +20,6 @@ except ImportError:
|
||||
_HAS_GRANDALF = False
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import Mapping, Sequence
|
||||
|
||||
from langchain_core.runnables.graph import Edge as LangEdge
|
||||
|
||||
|
||||
@@ -165,9 +164,6 @@ class AsciiCanvas:
|
||||
y0: y coordinate of the box corner.
|
||||
width: box width.
|
||||
height: box height.
|
||||
|
||||
Raises:
|
||||
ValueError: if box dimensions are invalid.
|
||||
"""
|
||||
if width <= 1 or height <= 1:
|
||||
msg = "Box dimensions should be > 1"
|
||||
|
||||
@@ -81,7 +81,6 @@ def draw_mermaid(
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Returns:
|
||||
Mermaid graph syntax.
|
||||
|
||||
@@ -282,7 +281,6 @@ def draw_mermaid_png(
|
||||
max_retries: int = 1,
|
||||
retry_delay: float = 1.0,
|
||||
base_url: str | None = None,
|
||||
proxies: dict[str, str] | None = None,
|
||||
) -> bytes:
|
||||
"""Draws a Mermaid graph as PNG using provided syntax.
|
||||
|
||||
@@ -295,7 +293,6 @@ def draw_mermaid_png(
|
||||
max_retries: Maximum number of retries (MermaidDrawMethod.API).
|
||||
retry_delay: Delay between retries (MermaidDrawMethod.API).
|
||||
base_url: Base URL for the Mermaid.ink API.
|
||||
proxies: HTTP/HTTPS proxies for requests (e.g. `{"http": "http://127.0.0.1:7890"}`).
|
||||
|
||||
Returns:
|
||||
PNG image bytes.
|
||||
@@ -317,7 +314,6 @@ def draw_mermaid_png(
|
||||
max_retries=max_retries,
|
||||
retry_delay=retry_delay,
|
||||
base_url=base_url,
|
||||
proxies=proxies,
|
||||
)
|
||||
else:
|
||||
supported_methods = ", ".join([m.value for m in MermaidDrawMethod])
|
||||
@@ -409,7 +405,6 @@ def _render_mermaid_using_api(
|
||||
file_type: Literal["jpeg", "png", "webp"] | None = "png",
|
||||
max_retries: int = 1,
|
||||
retry_delay: float = 1.0,
|
||||
proxies: dict[str, str] | None = None,
|
||||
base_url: str | None = None,
|
||||
) -> bytes:
|
||||
"""Renders Mermaid graph using the Mermaid.INK API."""
|
||||
@@ -450,7 +445,7 @@ def _render_mermaid_using_api(
|
||||
|
||||
for attempt in range(max_retries + 1):
|
||||
try:
|
||||
response = requests.get(image_url, timeout=10, proxies=proxies)
|
||||
response = requests.get(image_url, timeout=10)
|
||||
if response.status_code == requests.codes.ok:
|
||||
img_bytes = response.content
|
||||
if output_file_path is not None:
|
||||
|
||||
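Note on the hunk above: the loop `for attempt in range(max_retries + 1)` makes one initial attempt plus up to `max_retries` retries, and the two versions differ only in whether `proxies` is forwarded to `requests.get`. A minimal sketch of that retry shape follows, with the network call replaced by an injected callable so it runs offline. `fetch_with_retries`, `fetch`, and `sleep` are hypothetical names, and the error handling is an assumption rather than a copy of `_render_mermaid_using_api`.

```python
import time
from collections.abc import Callable


def fetch_with_retries(
    fetch: Callable[[], bytes],
    max_retries: int = 1,
    retry_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call `fetch` up to `max_retries + 1` times, pausing between failures."""
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except OSError as exc:  # ConnectionError and TimeoutError are subclasses
            last_error = exc
            if attempt < max_retries:
                sleep(retry_delay)
    msg = f"failed after {max_retries + 1} attempts"
    raise RuntimeError(msg) from last_error


calls = []


def flaky() -> bytes:
    """Fail on the first call, succeed on the second."""
    calls.append(1)
    if len(calls) < 2:
        raise ConnectionError("temporary failure")
    return b"png-bytes"


image = fetch_with_retries(flaky, max_retries=1, retry_delay=0.0, sleep=lambda _s: None)
```

In real use, `fetch` would wrap the HTTP request, for example a `requests.get(...)` call that passes `timeout` and, where needed, `proxies`.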
@@ -201,8 +201,7 @@ class PngDrawer:
|
||||
viz, start, end, str(data) if data is not None else None, cond
|
||||
)
|
||||
|
||||
@staticmethod
|
||||
def update_styles(viz: Any, graph: Graph) -> None:
|
||||
def update_styles(self, viz: Any, graph: Graph) -> None:
|
||||
"""Update the styles of the entrypoint and END nodes.
|
||||
|
||||
Args:
|
||||
|
||||
@@ -539,7 +539,7 @@ class RunnableWithMessageHistory(RunnableBindingBase): # type: ignore[no-redef]
|
||||
hist: BaseChatMessageHistory = config["configurable"]["message_history"]
|
||||
|
||||
# Get the input messages
|
||||
inputs = load(run.inputs, allowed_objects="all")
|
||||
inputs = load(run.inputs)
|
||||
input_messages = self._get_input_messages(inputs)
|
||||
# If historic messages were prepended to the input messages, remove them to
|
||||
# avoid adding duplicate messages to history.
|
||||
@@ -548,7 +548,7 @@ class RunnableWithMessageHistory(RunnableBindingBase): # type: ignore[no-redef]
|
||||
input_messages = input_messages[len(historic_messages) :]
|
||||
|
||||
# Get the output messages
|
||||
output_val = load(run.outputs, allowed_objects="all")
|
||||
output_val = load(run.outputs)
|
||||
output_messages = self._get_output_messages(output_val)
|
||||
hist.add_messages(input_messages + output_messages)
|
||||
|
||||
@@ -556,7 +556,7 @@ class RunnableWithMessageHistory(RunnableBindingBase): # type: ignore[no-redef]
|
||||
hist: BaseChatMessageHistory = config["configurable"]["message_history"]
|
||||
|
||||
# Get the input messages
|
||||
inputs = load(run.inputs, allowed_objects="all")
|
||||
inputs = load(run.inputs)
|
||||
input_messages = self._get_input_messages(inputs)
|
||||
# If historic messages were prepended to the input messages, remove them to
|
||||
# avoid adding duplicate messages to history.
|
||||
@@ -565,7 +565,7 @@ class RunnableWithMessageHistory(RunnableBindingBase): # type: ignore[no-redef]
|
||||
input_messages = input_messages[len(historic_messages) :]
|
||||
|
||||
# Get the output messages
|
||||
output_val = load(run.outputs, allowed_objects="all")
|
||||
output_val = load(run.outputs)
|
||||
output_messages = self._get_output_messages(output_val)
|
||||
await hist.aadd_messages(input_messages + output_messages)
|
||||
|
||||
|
||||
@@ -33,7 +33,7 @@ from langchain_core.runnables.utils import (
|
||||
AddableDict,
|
||||
ConfigurableFieldSpec,
|
||||
)
|
||||
from langchain_core.utils.aiter import atee
|
||||
from langchain_core.utils.aiter import atee, py_anext
|
||||
from langchain_core.utils.iter import safetee
|
||||
from langchain_core.utils.pydantic import create_model_v2
|
||||
|
||||
@@ -614,7 +614,7 @@ class RunnableAssign(RunnableSerializable[dict[str, Any], dict[str, Any]]):
|
||||
)
|
||||
# start map output stream
|
||||
first_map_chunk_task: asyncio.Task = asyncio.create_task(
|
||||
anext(map_output, None),
|
||||
py_anext(map_output, None), # type: ignore[arg-type]
|
||||
)
|
||||
# consume passthrough stream
|
||||
async for chunk in for_passthrough:
|
||||
@@ -753,19 +753,25 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
|
||||
return AddableDict(picked)
|
||||
return None
|
||||
|
||||
def _invoke(
|
||||
self,
|
||||
value: dict[str, Any],
|
||||
) -> dict[str, Any]:
|
||||
return self._pick(value)
|
||||
|
||||
@override
|
||||
def invoke(
|
||||
self,
|
||||
input: dict[str, Any],
|
||||
config: RunnableConfig | None = None,
|
||||
**kwargs: Any,
|
||||
) -> Any:
|
||||
return self._call_with_config(self._pick, input, config, **kwargs)
|
||||
) -> dict[str, Any]:
|
||||
return self._call_with_config(self._invoke, input, config, **kwargs)
|
||||
|
||||
async def _ainvoke(
|
||||
self,
|
||||
value: dict[str, Any],
|
||||
) -> Any:
|
||||
) -> dict[str, Any]:
|
||||
return self._pick(value)
|
||||
|
||||
@override
|
||||
@@ -774,13 +780,13 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
|
||||
input: dict[str, Any],
|
||||
config: RunnableConfig | None = None,
|
||||
**kwargs: Any,
|
||||
) -> Any:
|
||||
) -> dict[str, Any]:
|
||||
return await self._acall_with_config(self._ainvoke, input, config, **kwargs)
|
||||
|
||||
def _transform(
|
||||
self,
|
||||
chunks: Iterator[dict[str, Any]],
|
||||
) -> Iterator[Any]:
|
||||
) -> Iterator[dict[str, Any]]:
|
||||
for chunk in chunks:
|
||||
picked = self._pick(chunk)
|
||||
if picked is not None:
|
||||
@@ -792,7 +798,7 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
|
||||
input: Iterator[dict[str, Any]],
|
||||
config: RunnableConfig | None = None,
|
||||
**kwargs: Any,
|
||||
) -> Iterator[Any]:
|
||||
) -> Iterator[dict[str, Any]]:
|
||||
yield from self._transform_stream_with_config(
|
||||
input, self._transform, config, **kwargs
|
||||
)
|
||||
@@ -800,7 +806,7 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
|
||||
async def _atransform(
|
||||
self,
|
||||
chunks: AsyncIterator[dict[str, Any]],
|
||||
) -> AsyncIterator[Any]:
|
||||
) -> AsyncIterator[dict[str, Any]]:
|
||||
async for chunk in chunks:
|
||||
picked = self._pick(chunk)
|
||||
if picked is not None:
|
||||
@@ -812,7 +818,7 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
|
||||
input: AsyncIterator[dict[str, Any]],
|
||||
config: RunnableConfig | None = None,
|
||||
**kwargs: Any,
|
||||
) -> AsyncIterator[Any]:
|
||||
) -> AsyncIterator[dict[str, Any]]:
|
||||
async for chunk in self._atransform_stream_with_config(
|
||||
input, self._atransform, config, **kwargs
|
||||
):
|
||||
@@ -824,7 +830,7 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
|
||||
input: dict[str, Any],
|
||||
config: RunnableConfig | None = None,
|
||||
**kwargs: Any,
|
||||
) -> Iterator[Any]:
|
||||
) -> Iterator[dict[str, Any]]:
|
||||
return self.transform(iter([input]), config, **kwargs)
|
||||
|
||||
@override
|
||||
@@ -833,7 +839,7 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
|
||||
input: dict[str, Any],
|
||||
config: RunnableConfig | None = None,
|
||||
**kwargs: Any,
|
||||
) -> AsyncIterator[Any]:
|
||||
) -> AsyncIterator[dict[str, Any]]:
|
||||
async def input_aiter() -> AsyncIterator[dict[str, Any]]:
|
||||
yield input
|
||||
|
||||
|
||||
@@ -7,7 +7,8 @@ import asyncio
|
||||
import inspect
|
||||
import sys
|
||||
import textwrap
|
||||
from collections.abc import Mapping, Sequence
|
||||
from collections.abc import Callable, Mapping, Sequence
|
||||
from contextvars import Context
|
||||
from functools import lru_cache
|
||||
from inspect import signature
|
||||
from itertools import groupby
|
||||
@@ -30,11 +31,9 @@ if TYPE_CHECKING:
|
||||
AsyncIterable,
|
||||
AsyncIterator,
|
||||
Awaitable,
|
||||
Callable,
|
||||
Coroutine,
|
||||
Iterable,
|
||||
)
|
||||
from contextvars import Context
|
||||
|
||||
from langchain_core.runnables.schema import StreamEvent
|
||||
|
||||
|
||||
@@ -22,7 +22,6 @@ from typing import (
|
||||
get_type_hints,
|
||||
)
|
||||
|
||||
import typing_extensions
|
||||
from pydantic import (
|
||||
BaseModel,
|
||||
ConfigDict,
|
||||
@@ -32,7 +31,6 @@ from pydantic import (
|
||||
ValidationError,
|
||||
validate_arguments,
|
||||
)
|
||||
from pydantic.fields import FieldInfo
|
||||
from pydantic.v1 import BaseModel as BaseModelV1
|
||||
from pydantic.v1 import ValidationError as ValidationErrorV1
|
||||
from pydantic.v1 import validate_arguments as validate_arguments_v1
|
||||
@@ -96,14 +94,12 @@ def _is_annotated_type(typ: type[Any]) -> bool:
|
||||
Returns:
|
||||
`True` if the type is an Annotated type, `False` otherwise.
|
||||
"""
|
||||
return get_origin(typ) in {typing.Annotated, typing_extensions.Annotated}
|
||||
return get_origin(typ) is typing.Annotated
|
||||
|
||||
|
||||
def _get_annotation_description(arg_type: type) -> str | None:
|
||||
"""Extract description from an Annotated type.
|
||||
|
||||
Checks for string annotations and `FieldInfo` objects with descriptions.
|
||||
|
||||
Args:
|
||||
arg_type: The type to extract description from.
|
||||
|
||||
@@ -115,8 +111,6 @@ def _get_annotation_description(arg_type: type) -> str | None:
|
||||
for annotation in annotated_args[1:]:
|
||||
if isinstance(annotation, str):
|
||||
return annotation
|
||||
if isinstance(annotation, FieldInfo) and annotation.description:
|
||||
return annotation.description
|
||||
return None
|
||||
|
||||
|
||||
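Note on the hunk above: `_is_annotated_type` checks the result of `get_origin`, and `_get_annotation_description` walks the metadata that follows the base type. The sketch below reproduces that idea with `typing` only. The function names are hypothetical, and it covers only string metadata, not the `FieldInfo` branch shown in one side of the diff.

```python
from typing import Annotated, Any, Optional, get_args, get_origin


def is_annotated(tp: Any) -> bool:
    """True when `tp` is an `Annotated[...]` alias."""
    return get_origin(tp) is Annotated


def annotation_description(tp: Any) -> Optional[str]:
    """Return the first string found in the Annotated metadata, if any."""
    if not is_annotated(tp):
        return None
    # get_args returns (base_type, *metadata); skip the base type.
    for extra in get_args(tp)[1:]:
        if isinstance(extra, str):
            return extra
    return None


described = annotation_description(Annotated[int, "user id"])
plain = annotation_description(int)
non_string = annotation_description(Annotated[int, 5])
```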
@@ -392,8 +386,6 @@ class ToolException(Exception): # noqa: N818
|
||||
|
||||
ArgsSchema = TypeBaseModel | dict[str, Any]
|
||||
|
||||
_EMPTY_SET: frozenset[str] = frozenset()
|
||||
|
||||
|
||||
class BaseTool(RunnableSerializable[str | dict | ToolCall, Any]):
|
||||
"""Base class for all LangChain tools.
|
||||
@@ -502,24 +494,6 @@ class ChildTool(BaseTool):
|
||||
two-tuple corresponding to the `(content, artifact)` of a `ToolMessage`.
|
||||
"""
|
||||
|
||||
extras: dict[str, Any] | None = None
|
||||
"""Optional provider-specific extra fields for the tool.
|
||||
|
||||
This is used to pass provider-specific configuration that doesn't fit into
|
||||
standard tool fields.
|
||||
|
||||
Example:
|
||||
Anthropic-specific fields like [`cache_control`](https://docs.langchain.com/oss/python/integrations/chat/anthropic#prompt-caching),
|
||||
[`defer_loading`](https://docs.langchain.com/oss/python/integrations/chat/anthropic#tool-search),
|
||||
or `input_examples`.
|
||||
|
||||
```python
|
||||
@tool(extras={"defer_loading": True, "cache_control": {"type": "ephemeral"}})
|
||||
def my_tool(x: str) -> str:
|
||||
return x
|
||||
```
|
||||
"""
|
||||
|
||||
def __init__(self, **kwargs: Any) -> None:
|
||||
"""Initialize the tool.
|
||||
|
||||
@@ -595,11 +569,6 @@ class ChildTool(BaseTool):
|
||||
self.name, full_schema, fields, fn_description=self.description
|
||||
)
|
||||
|
||||
@functools.cached_property
|
||||
def _injected_args_keys(self) -> frozenset[str]:
|
||||
# base implementation doesn't manage injected args
|
||||
return _EMPTY_SET
|
||||
|
||||
# --- Runnable ---
|
||||
|
||||
@override
|
||||
@@ -659,7 +628,6 @@ class ChildTool(BaseTool):
|
||||
TypeError: If `args_schema` is not a Pydantic `BaseModel` or dict.
|
||||
"""
|
||||
input_args = self.args_schema
|
||||
|
||||
if isinstance(tool_input, str):
|
||||
if input_args is not None:
|
||||
if isinstance(input_args, dict):
|
||||
@@ -677,12 +645,10 @@ class ChildTool(BaseTool):
|
||||
msg = f"args_schema must be a Pydantic BaseModel, got {input_args}"
|
||||
raise TypeError(msg)
|
||||
return tool_input
|
||||
|
||||
if input_args is not None:
|
||||
if isinstance(input_args, dict):
|
||||
return tool_input
|
||||
if issubclass(input_args, BaseModel):
|
||||
# Check args_schema for InjectedToolCallId
|
||||
for k, v in get_all_basemodel_annotations(input_args).items():
|
||||
if _is_injected_arg_type(v, injected_type=InjectedToolCallId):
|
||||
if tool_call_id is None:
|
||||
@@ -698,7 +664,6 @@ class ChildTool(BaseTool):
|
||||
result = input_args.model_validate(tool_input)
|
||||
result_dict = result.model_dump()
|
||||
elif issubclass(input_args, BaseModelV1):
|
||||
# Check args_schema for InjectedToolCallId
|
||||
for k, v in get_all_basemodel_annotations(input_args).items():
|
||||
if _is_injected_arg_type(v, injected_type=InjectedToolCallId):
|
||||
if tool_call_id is None:
|
||||
@@ -718,47 +683,9 @@ class ChildTool(BaseTool):
|
||||
f"args_schema must be a Pydantic BaseModel, got {self.args_schema}"
|
||||
)
|
||||
raise NotImplementedError(msg)
|
||||
|
||||
# Include fields from tool_input, plus fields with explicit defaults.
|
||||
# This applies Pydantic defaults (like Field(default=1)) while excluding
|
||||
# synthetic "args"/"kwargs" fields that Pydantic creates for *args/**kwargs.
|
||||
field_info = get_fields(input_args)
|
||||
validated_input = {}
|
||||
for k in result_dict:
|
||||
if k in tool_input:
|
||||
# Field was provided in input - include it (validated)
|
||||
validated_input[k] = getattr(result, k)
|
||||
elif k in field_info and k not in ("args", "kwargs"):
|
||||
# Check if field has an explicit default defined in the schema.
|
||||
# Exclude "args"/"kwargs" as these are synthetic fields for variadic
|
||||
# parameters that should not be passed as keyword arguments.
|
||||
fi = field_info[k]
|
||||
# Pydantic v2 uses is_required() method, v1 uses required attribute
|
||||
has_default = (
|
||||
not fi.is_required()
|
||||
if hasattr(fi, "is_required")
|
||||
else not getattr(fi, "required", True)
|
||||
)
|
||||
if has_default:
|
||||
validated_input[k] = getattr(result, k)
|
||||
|
||||
for k in self._injected_args_keys:
|
||||
if k in tool_input:
|
||||
validated_input[k] = tool_input[k]
|
||||
elif k == "tool_call_id":
|
||||
if tool_call_id is None:
|
||||
msg = (
|
||||
"When tool includes an InjectedToolCallId "
|
||||
"argument, tool must always be invoked with a full "
|
||||
"model ToolCall of the form: {'args': {...}, "
|
||||
"'name': '...', 'type': 'tool_call', "
|
||||
"'tool_call_id': '...'}"
|
||||
)
|
||||
raise ValueError(msg)
|
||||
validated_input[k] = tool_call_id
|
||||
|
||||
return validated_input
|
||||
|
||||
return {
|
||||
k: getattr(result, k) for k, v in result_dict.items() if k in tool_input
|
||||
}
|
||||
return tool_input
|
||||
|
||||
@abstractmethod
|
||||
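Note on the hunk above: one version returns only the validated fields that were present in `tool_input`, while the other also keeps fields that have an explicit schema default and skips the synthetic `args` / `kwargs` fields. The difference can be sketched with plain dictionaries. `select_provided`, `select_with_defaults`, and the sample data are hypothetical, and no Pydantic model is involved here.

```python
from typing import Any


def select_provided(validated: dict[str, Any], provided: dict[str, Any]) -> dict[str, Any]:
    """Keep only the keys the caller actually passed."""
    return {k: v for k, v in validated.items() if k in provided}


def select_with_defaults(
    validated: dict[str, Any],
    provided: dict[str, Any],
    defaulted: set[str],
) -> dict[str, Any]:
    """Keep passed keys plus keys with schema defaults, minus args/kwargs."""
    selected: dict[str, Any] = {}
    for key, value in validated.items():
        if key in provided:
            selected[key] = value
        elif key in defaulted and key not in ("args", "kwargs"):
            selected[key] = value
    return selected


validated = {"query": "cats", "limit": 10, "kwargs": {}}
provided = {"query": "cats"}
only_provided = select_provided(validated, provided)
plus_defaults = select_with_defaults(validated, provided, {"limit", "kwargs"})
```

With the second variant, a default such as `limit=10` reaches the tool function even when the caller omits it.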
@@ -926,7 +853,6 @@ class ChildTool(BaseTool):
|
||||
name=run_name,
|
||||
run_id=run_id,
|
||||
inputs=filtered_tool_input,
|
||||
tool_call_id=tool_call_id,
|
||||
**kwargs,
|
||||
)
|
||||
|
||||
@@ -1054,7 +980,6 @@ class ChildTool(BaseTool):
|
||||
name=run_name,
|
||||
run_id=run_id,
|
||||
inputs=filtered_tool_input,
|
||||
tool_call_id=tool_call_id,
|
||||
**kwargs,
|
||||
)
|
||||
content = None
|
||||
|
||||
@@ -23,7 +23,6 @@ def tool(
|
||||
response_format: Literal["content", "content_and_artifact"] = "content",
|
||||
parse_docstring: bool = False,
|
||||
error_on_invalid_docstring: bool = True,
|
||||
extras: dict[str, Any] | None = None,
|
||||
) -> Callable[[Callable | Runnable], BaseTool]: ...
|
||||
|
||||
|
||||
@@ -39,7 +38,6 @@ def tool(
|
||||
response_format: Literal["content", "content_and_artifact"] = "content",
|
||||
parse_docstring: bool = False,
|
||||
error_on_invalid_docstring: bool = True,
|
||||
extras: dict[str, Any] | None = None,
|
||||
) -> BaseTool: ...
|
||||
|
||||
|
||||
@@ -54,7 +52,6 @@ def tool(
|
||||
response_format: Literal["content", "content_and_artifact"] = "content",
|
||||
parse_docstring: bool = False,
|
||||
error_on_invalid_docstring: bool = True,
|
||||
extras: dict[str, Any] | None = None,
|
||||
) -> BaseTool: ...
|
||||
|
||||
|
||||
@@ -69,7 +66,6 @@ def tool(
|
||||
response_format: Literal["content", "content_and_artifact"] = "content",
|
||||
parse_docstring: bool = False,
|
||||
error_on_invalid_docstring: bool = True,
|
||||
extras: dict[str, Any] | None = None,
|
||||
) -> Callable[[Callable | Runnable], BaseTool]: ...
|
||||
|
||||
|
||||
@@ -84,7 +80,6 @@ def tool(
|
||||
response_format: Literal["content", "content_and_artifact"] = "content",
|
||||
parse_docstring: bool = False,
|
||||
error_on_invalid_docstring: bool = True,
|
||||
extras: dict[str, Any] | None = None,
|
||||
) -> BaseTool | Callable[[Callable | Runnable], BaseTool]:
|
||||
"""Convert Python functions and `Runnables` to LangChain tools.
|
||||
|
||||
@@ -135,15 +130,6 @@ def tool(
|
||||
parse parameter descriptions from Google Style function docstrings.
|
||||
error_on_invalid_docstring: If `parse_docstring` is provided, configure
|
||||
whether to raise `ValueError` on invalid Google Style docstrings.
|
||||
extras: Optional provider-specific extra fields for the tool.
|
||||
|
||||
Used to pass configuration that doesn't fit into standard tool fields.
|
||||
Chat models should process known extras when constructing model payloads.
|
||||
|
||||
!!! example
|
||||
|
||||
For example, Anthropic-specific fields like `cache_control`,
|
||||
`defer_loading`, or `input_examples`.
|
||||
|
||||
Raises:
|
||||
ValueError: If too many positional arguments are provided (e.g. violating the
|
||||
@@ -306,7 +292,6 @@ def tool(
|
||||
response_format=response_format,
|
||||
parse_docstring=parse_docstring,
|
||||
error_on_invalid_docstring=error_on_invalid_docstring,
|
||||
extras=extras,
|
||||
)
|
||||
# If someone doesn't want a schema applied, we must treat it as
|
||||
# a simple string->string function
|
||||
@@ -323,7 +308,6 @@ def tool(
|
||||
return_direct=return_direct,
|
||||
coroutine=coroutine,
|
||||
response_format=response_format,
|
||||
extras=extras,
|
||||
)
|
||||
|
||||
return _tool_factory
|
||||
|
||||
@@ -2,21 +2,22 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
from functools import partial
|
||||
from typing import TYPE_CHECKING, Literal
|
||||
|
||||
from pydantic import BaseModel, Field
|
||||
|
||||
from langchain_core.callbacks import Callbacks
|
||||
from langchain_core.documents import Document
|
||||
from langchain_core.prompts import (
|
||||
BasePromptTemplate,
|
||||
PromptTemplate,
|
||||
aformat_document,
|
||||
format_document,
|
||||
)
|
||||
from langchain_core.tools.structured import StructuredTool
|
||||
from langchain_core.tools.simple import Tool
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from langchain_core.callbacks import Callbacks
|
||||
from langchain_core.documents import Document
|
||||
from langchain_core.retrievers import BaseRetriever
|
||||
|
||||
|
||||
@@ -26,6 +27,43 @@ class RetrieverInput(BaseModel):
|
||||
query: str = Field(description="query to look up in retriever")
|
||||
|
||||
|
||||
def _get_relevant_documents(
|
||||
query: str,
|
||||
retriever: BaseRetriever,
|
||||
document_prompt: BasePromptTemplate,
|
||||
document_separator: str,
|
||||
callbacks: Callbacks = None,
|
||||
response_format: Literal["content", "content_and_artifact"] = "content",
|
||||
) -> str | tuple[str, list[Document]]:
|
||||
docs = retriever.invoke(query, config={"callbacks": callbacks})
|
||||
content = document_separator.join(
|
||||
format_document(doc, document_prompt) for doc in docs
|
||||
)
|
||||
if response_format == "content_and_artifact":
|
||||
return (content, docs)
|
||||
|
||||
return content
|
||||
|
||||
|
||||
async def _aget_relevant_documents(
|
||||
query: str,
|
||||
retriever: BaseRetriever,
|
||||
document_prompt: BasePromptTemplate,
|
||||
document_separator: str,
|
||||
callbacks: Callbacks = None,
|
||||
response_format: Literal["content", "content_and_artifact"] = "content",
|
||||
) -> str | tuple[str, list[Document]]:
|
||||
docs = await retriever.ainvoke(query, config={"callbacks": callbacks})
|
||||
content = document_separator.join(
|
||||
[await aformat_document(doc, document_prompt) for doc in docs]
|
||||
)
|
||||
|
||||
if response_format == "content_and_artifact":
|
||||
return (content, docs)
|
||||
|
||||
return content
|
||||
|
||||
|
||||
def create_retriever_tool(
|
||||
retriever: BaseRetriever,
|
||||
name: str,
|
||||
@@ -34,7 +72,7 @@ def create_retriever_tool(
|
||||
document_prompt: BasePromptTemplate | None = None,
|
||||
document_separator: str = "\n\n",
|
||||
response_format: Literal["content", "content_and_artifact"] = "content",
|
||||
) -> StructuredTool:
|
||||
) -> Tool:
|
||||
r"""Create a tool to do retrieval of documents.
|
||||
|
||||
Args:
|
||||
@@ -55,31 +93,22 @@ def create_retriever_tool(
|
||||
Returns:
|
||||
Tool class to pass to an agent.
|
||||
"""
|
||||
document_prompt_ = document_prompt or PromptTemplate.from_template("{page_content}")
|
||||
|
||||
def func(
|
||||
query: str, callbacks: Callbacks = None
|
||||
) -> str | tuple[str, list[Document]]:
|
||||
docs = retriever.invoke(query, config={"callbacks": callbacks})
|
||||
content = document_separator.join(
|
||||
format_document(doc, document_prompt_) for doc in docs
|
||||
)
|
||||
if response_format == "content_and_artifact":
|
||||
return (content, docs)
|
||||
return content
|
||||
|
||||
async def afunc(
|
||||
query: str, callbacks: Callbacks = None
|
||||
) -> str | tuple[str, list[Document]]:
|
||||
docs = await retriever.ainvoke(query, config={"callbacks": callbacks})
|
||||
content = document_separator.join(
|
||||
[await aformat_document(doc, document_prompt_) for doc in docs]
|
||||
)
|
||||
if response_format == "content_and_artifact":
|
||||
return (content, docs)
|
||||
return content
|
||||
|
||||
return StructuredTool(
|
||||
document_prompt = document_prompt or PromptTemplate.from_template("{page_content}")
|
||||
func = partial(
|
||||
_get_relevant_documents,
|
||||
retriever=retriever,
|
||||
document_prompt=document_prompt,
|
||||
document_separator=document_separator,
|
||||
response_format=response_format,
|
||||
)
|
||||
afunc = partial(
|
||||
_aget_relevant_documents,
|
||||
retriever=retriever,
|
||||
document_prompt=document_prompt,
|
||||
document_separator=document_separator,
|
||||
response_format=response_format,
|
||||
)
|
||||
return Tool(
|
||||
name=name,
|
||||
description=description,
|
||||
func=func,
|
||||
|
||||
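Note on the hunk above: one version defines `func` / `afunc` as closures inside `create_retriever_tool`, while the other binds module-level helpers with `functools.partial`. A minimal sketch of the `partial` approach follows, using a hypothetical `join_documents` function over plain strings instead of a retriever.

```python
from functools import partial


def join_documents(query, documents, separator, response_format="content"):
    """Return matching documents joined by `separator` (toy stand-in for retrieval)."""
    matches = [doc for doc in documents if query.lower() in doc.lower()]
    content = separator.join(matches)
    if response_format == "content_and_artifact":
        return content, matches
    return content


# Bind everything except `query`, the way the diff binds retriever settings.
search = partial(
    join_documents,
    documents=[
        "functools.partial binds arguments",
        "closures capture variables",
        "a partial keeps its keywords",
    ],
    separator="\n\n",
    response_format="content_and_artifact",
)

content, artifact = search("partial")
```

Both styles give a callable that takes only the query. The `partial` form keeps the helper at module level, so it can be imported and tested on its own.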
@@ -2,7 +2,6 @@
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import functools
|
||||
import textwrap
|
||||
from collections.abc import Awaitable, Callable
|
||||
from inspect import signature
|
||||
@@ -22,12 +21,10 @@ from langchain_core.callbacks import (
|
||||
)
|
||||
from langchain_core.runnables import RunnableConfig, run_in_executor
|
||||
from langchain_core.tools.base import (
|
||||
_EMPTY_SET,
|
||||
FILTERED_ARGS,
|
||||
ArgsSchema,
|
||||
BaseTool,
|
||||
_get_runnable_config_param,
|
||||
_is_injected_arg_type,
|
||||
create_schema_from_function,
|
||||
)
|
||||
from langchain_core.utils.pydantic import is_basemodel_subclass
|
||||
@@ -244,17 +241,6 @@ class StructuredTool(BaseTool):
|
||||
**kwargs,
|
||||
)
|
||||
|
||||
@functools.cached_property
|
||||
def _injected_args_keys(self) -> frozenset[str]:
|
||||
fn = self.func or self.coroutine
|
||||
if fn is None:
|
||||
return _EMPTY_SET
|
||||
return frozenset(
|
||||
k
|
||||
for k, v in signature(fn).parameters.items()
|
||||
if _is_injected_arg_type(v.annotation)
|
||||
)
|
||||
|
||||
|
||||
def _filter_schema_args(func: Callable) -> list[str]:
|
||||
filter_args = list(FILTERED_ARGS)
|
||||
|
||||
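Note on the hunk above: `_injected_args_keys` collects parameter names from `signature(fn).parameters` whose annotation marks them as injected. The sketch below shows the same inspection technique with a hypothetical `Injected` marker class. It does not reproduce `_is_injected_arg_type`, whose exact rules are not shown in this compare.

```python
from inspect import signature
from typing import Annotated, get_args, get_origin


class Injected:
    """Hypothetical marker placed in Annotated metadata."""


def injected_keys(fn) -> frozenset:
    """Names of parameters whose annotation carries the Injected marker."""
    keys = set()
    for name, param in signature(fn).parameters.items():
        annotation = param.annotation
        if get_origin(annotation) is not Annotated:
            continue
        metadata = get_args(annotation)[1:]
        if any(m is Injected or isinstance(m, Injected) for m in metadata):
            keys.add(name)
    return frozenset(keys)


def lookup(query: str, user_id: Annotated[str, Injected]) -> str:
    return f"{user_id}:{query}"


keys = injected_keys(lookup)
```

Returning a `frozenset` matches the diff's return type and makes the result safe to cache, as the `functools.cached_property` decorator in the hunk does.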
@@ -15,6 +15,12 @@ from typing import (
|
||||
|
||||
from langchain_core.exceptions import TracerException
|
||||
from langchain_core.load import dumpd
|
||||
from langchain_core.outputs import (
|
||||
ChatGeneration,
|
||||
ChatGenerationChunk,
|
||||
GenerationChunk,
|
||||
LLMResult,
|
||||
)
|
||||
from langchain_core.tracers.schemas import Run
|
||||
|
||||
if TYPE_CHECKING:
|
||||
@@ -25,12 +31,6 @@ if TYPE_CHECKING:
|
||||
|
||||
from langchain_core.documents import Document
|
||||
from langchain_core.messages import BaseMessage
|
||||
from langchain_core.outputs import (
|
||||
ChatGeneration,
|
||||
ChatGenerationChunk,
|
||||
GenerationChunk,
|
||||
LLMResult,
|
||||
)
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -284,16 +284,6 @@ class _TracerCore(ABC):
|
||||
llm_run.end_time = datetime.now(timezone.utc)
|
||||
llm_run.events.append({"name": "end", "time": llm_run.end_time})
|
||||
|
||||
tool_call_count = 0
|
||||
for generations in response.generations:
|
||||
for generation in generations:
|
||||
if hasattr(generation, "message"):
|
||||
msg = generation.message
|
||||
if hasattr(msg, "tool_calls") and msg.tool_calls:
|
||||
tool_call_count += len(msg.tool_calls)
|
||||
if tool_call_count > 0:
|
||||
llm_run.extra["tool_call_count"] = tool_call_count
|
||||
|
||||
return llm_run
|
||||
|
||||
def _errored_llm_run(
|
||||
|
||||
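Note on the hunk above: the block that differs between the two versions walks `response.generations`, sums `len(msg.tool_calls)` for every generation that carries a message, and stores the total only when it is positive. The counting step can be sketched with `SimpleNamespace` objects standing in for generations and messages. `count_tool_calls` is a hypothetical name.

```python
from types import SimpleNamespace


def count_tool_calls(generations) -> int:
    """Sum tool calls across all generations that carry a message."""
    total = 0
    for batch in generations:
        for generation in batch:
            message = getattr(generation, "message", None)
            tool_calls = getattr(message, "tool_calls", None)
            if tool_calls:
                total += len(tool_calls)
    return total


generations = [
    [
        SimpleNamespace(message=SimpleNamespace(tool_calls=[{"name": "a"}, {"name": "b"}])),
        SimpleNamespace(text="plain completion without a message"),
    ],
    [SimpleNamespace(message=SimpleNamespace(tool_calls=[]))],
]

extra = {}
tool_call_count = count_tool_calls(generations)
if tool_call_count > 0:
    extra["tool_call_count"] = tool_call_count
```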
@@ -154,8 +154,8 @@ class EvaluatorCallbackHandler(BaseTracer):
|
||||
res
|
||||
)
|
||||
|
||||
@staticmethod
|
||||
def _select_eval_results(
|
||||
self,
|
||||
results: EvaluationResult | EvaluationResults,
|
||||
) -> list[EvaluationResult]:
|
||||
if isinstance(results, EvaluationResult):
|
||||
|
||||
@@ -12,7 +12,7 @@ from typing import (
|
||||
TypeVar,
|
||||
cast,
|
||||
)
|
||||
from uuid import UUID
|
||||
from uuid import UUID, uuid4
|
||||
|
||||
from typing_extensions import NotRequired, override
|
||||
|
||||
@@ -42,8 +42,7 @@ from langchain_core.tracers.log_stream import (
|
||||
_astream_log_implementation,
|
||||
)
|
||||
from langchain_core.tracers.memory_stream import _MemoryStream
|
||||
from langchain_core.utils.aiter import aclosing
|
||||
from langchain_core.utils.uuid import uuid7
|
||||
from langchain_core.utils.aiter import aclosing, py_anext
|
||||
|
||||
if TYPE_CHECKING:
|
||||
from collections.abc import AsyncIterator, Iterator, Sequence
|
||||
@@ -189,7 +188,7 @@ class _AstreamEventsCallbackHandler(AsyncCallbackHandler, _StreamingCallbackHand
|
||||
# atomic check and set
|
||||
tap = self.is_tapped.setdefault(run_id, sentinel)
|
||||
# wait for first chunk
|
||||
first = await anext(output, sentinel)
|
||||
first = await py_anext(output, default=sentinel)
|
||||
if first is sentinel:
|
||||
return
|
||||
# get run info
|
||||
@@ -426,10 +425,6 @@ class _AstreamEventsCallbackHandler(AsyncCallbackHandler, _StreamingCallbackHand
|
||||
"""Run on new output token. Only available when streaming is enabled.
|
||||
|
||||
For both chat models and non-chat models (legacy LLMs).
|
||||
|
||||
Raises:
|
||||
ValueError: If the run type is not `llm` or `chat_model`.
|
||||
AssertionError: If the run ID is not found in the run map.
|
||||
"""
|
||||
run_info = self.run_map.get(run_id)
|
||||
chunk_: GenerationChunk | BaseMessageChunk
|
||||
@@ -711,7 +706,11 @@ class _AstreamEventsCallbackHandler(AsyncCallbackHandler, _StreamingCallbackHand

@override
async def on_tool_end(self, output: Any, *, run_id: UUID, **kwargs: Any) -> None:
"""End a trace for a tool run."""
"""End a trace for a tool run.

Raises:
AssertionError: If the run ID is a tool call and does not have inputs
"""
run_info, inputs = self._get_tool_run_info_with_inputs(run_id)

self._send(
@@ -1007,11 +1006,7 @@ async def _astream_events_implementation_v2(

# Assign the stream handler to the config
config = ensure_config(config)
if "run_id" in config:
run_id = cast("UUID", config["run_id"])
else:
run_id = uuid7()
config["run_id"] = run_id
run_id = cast("UUID", config.setdefault("run_id", uuid4()))
callbacks = config.get("callbacks")
if callbacks is None:
config["callbacks"] = [event_streamer]

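Review note on the `run_id` hunk above: both the branching form and the `setdefault` form reuse a caller-supplied ID and store a generated one otherwise. The sketch below illustrates that with a plain `dict` standing in for the config. The function names `ensure_run_id_branching` and `ensure_run_id_setdefault` are hypothetical and not part of `langchain_core`, and `uuid4` is used in both only to keep the example self-contained.

```python
from uuid import UUID, uuid4


def ensure_run_id_branching(config: dict) -> UUID:
    """Explicit form: generate an ID only when none was supplied."""
    if "run_id" in config:
        run_id = config["run_id"]
    else:
        run_id = uuid4()
        config["run_id"] = run_id
    return run_id


def ensure_run_id_setdefault(config: dict) -> UUID:
    """Compact form: dict.setdefault stores the default only if the key is missing.

    uuid4() is still evaluated on every call, even when the key is
    already present and the fresh value is thrown away.
    """
    return config.setdefault("run_id", uuid4())


supplied = uuid4()
config_with_id = {"run_id": supplied}
config_without_id: dict = {}

kept = ensure_run_id_setdefault(config_with_id)        # caller's ID is reused
generated = ensure_run_id_setdefault(config_without_id)  # new ID is stored
```

The observable difference between the two forms is limited to the eager evaluation of the default argument.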
@@ -21,7 +21,6 @@ from typing_extensions import override

from langchain_core.env import get_runtime_environment
from langchain_core.load import dumpd
from langchain_core.messages.ai import UsageMetadata, add_usage
from langchain_core.tracers.base import BaseTracer
from langchain_core.tracers.schemas import Run

@@ -70,32 +69,6 @@ def _get_executor() -> ThreadPoolExecutor:
return _EXECUTOR


def _get_usage_metadata_from_generations(
generations: list[list[dict[str, Any]]],
) -> UsageMetadata | None:
"""Extract and aggregate `usage_metadata` from generations.

Iterates through generations to find and aggregate all `usage_metadata` found in
messages. This is typically present in chat model outputs.

Args:
generations: List of generation batches, where each batch is a list
of generation dicts that may contain a `'message'` key with
`'usage_metadata'`.

Returns:
The aggregated `usage_metadata` dict if found, otherwise `None`.
"""
output: UsageMetadata | None = None
for generation_batch in generations:
for generation in generation_batch:
if isinstance(generation, dict) and "message" in generation:
message = generation["message"]
if isinstance(message, dict) and "usage_metadata" in message:
output = add_usage(output, message["usage_metadata"])
return output


class LangChainTracer(BaseTracer):
"""Implementation of the SharedTracer that POSTS to the LangChain endpoint."""

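Review note on `_get_usage_metadata_from_generations` in the hunk above: the aggregation it describes can be sketched without `langchain_core`. This is a simplified illustration under stated assumptions: `add_counts` is a hypothetical stand-in for `add_usage` that only sums flat integer counters, and the keys `input_tokens`, `output_tokens`, and `total_tokens` are used as sample data.

```python
from typing import Any, Optional


def add_counts(left: Optional[dict], right: dict) -> dict:
    """Sum integer counters key by key (simplified stand-in for add_usage)."""
    if left is None:
        return dict(right)
    keys = set(left) | set(right)
    return {key: left.get(key, 0) + right.get(key, 0) for key in keys}


def usage_from_generations(generations: list) -> Optional[dict]:
    """Walk every batch and aggregate usage_metadata found on message dicts."""
    total: Optional[dict] = None
    for batch in generations:
        for generation in batch:
            if not isinstance(generation, dict):
                continue
            message: Any = generation.get("message")
            if isinstance(message, dict) and "usage_metadata" in message:
                total = add_counts(total, message["usage_metadata"])
    return total


generations = [
    [
        {"message": {"usage_metadata": {"input_tokens": 3, "output_tokens": 2, "total_tokens": 5}}},
        {"text": "legacy LLM output without a message"},
    ],
    [
        {"message": {"usage_metadata": {"input_tokens": 1, "output_tokens": 1, "total_tokens": 2}}},
    ],
]
usage = usage_from_generations(generations)
```

Generations without a `message` dict are skipped, so the function returns `None` when no usage data is present at all.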
@@ -247,8 +220,7 @@ class LangChainTracer(BaseTracer):
log_error_once("post", e)
raise

@staticmethod
def _update_run_single(run: Run) -> None:
def _update_run_single(self, run: Run) -> None:
"""Update a run."""
if run.extra.get("__disabled"):
return
@@ -294,15 +266,6 @@ class LangChainTracer(BaseTracer):

def _on_llm_end(self, run: Run) -> None:
"""Process the LLM Run."""
# Extract usage_metadata from outputs and store in extra.metadata
if run.outputs and "generations" in run.outputs:
usage_metadata = _get_usage_metadata_from_generations(
run.outputs["generations"]
)
if usage_metadata is not None:
if "metadata" not in run.extra:
run.extra["metadata"] = {}
run.extra["metadata"]["usage_metadata"] = usage_metadata
self._update_run_single(run)

def _on_llm_error(self, run: Run) -> None:
@@ -313,28 +276,15 @@ class LangChainTracer(BaseTracer):
"""Process the Chain Run upon start."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
# Skip persisting if inputs are deferred (e.g., iterator/generator inputs).
# The run will be posted when _on_chain_end is called with realized inputs.
if not run.extra.get("defers_inputs"):
self._persist_run_single(run)
self._persist_run_single(run)

def _on_chain_end(self, run: Run) -> None:
"""Process the Chain Run."""
# If inputs were deferred, persist (POST) the run now that inputs are realized.
# Otherwise, update (PATCH) the existing run.
if run.extra.get("defers_inputs"):
self._persist_run_single(run)
else:
self._update_run_single(run)
self._update_run_single(run)

def _on_chain_error(self, run: Run) -> None:
"""Process the Chain Run upon error."""
# If inputs were deferred, persist (POST) the run now that inputs are realized.
# Otherwise, update (PATCH) the existing run.
if run.extra.get("defers_inputs"):
self._persist_run_single(run)
else:
self._update_run_single(run)
self._update_run_single(run)

def _on_tool_start(self, run: Run) -> None:
"""Process the Tool Run upon start."""

@@ -563,7 +563,7 @@ def _get_standardized_inputs(
)
raise NotImplementedError(msg)

inputs = load(run.inputs, allowed_objects="all")
inputs = load(run.inputs)

if run.run_type in {"retriever", "llm", "chat_model"}:
return inputs
@@ -595,7 +595,7 @@ def _get_standardized_outputs(
Returns:
An output if returned, otherwise a None
"""
outputs = load(run.outputs, allowed_objects="all")
outputs = load(run.outputs)
if schema_format == "original":
if run.run_type == "prompt" and "output" in outputs:
# These were previously dumped before the tracer.

@@ -58,7 +58,7 @@ def merge_dicts(left: dict[str, Any], *others: dict[str, Any]) -> dict[str, Any]
# "all dicts."
# )
if (right_k == "index" and merged[right_k].startswith("lc_")) or (
right_k in {"id", "output_version", "model_provider"}
right_k in ("id", "output_version", "model_provider")
and merged[right_k] == right_v
):
continue

@@ -26,15 +26,13 @@ from typing import (

from typing_extensions import override

from langchain_core._api.deprecation import deprecated

T = TypeVar("T")

_no_default = object()


# https://github.com/python/cpython/blob/main/Lib/test/test_asyncgen.py#L54
@deprecated(since="1.1.2", removal="2.0.0")
# before 3.10, the builtin anext() was not available
def py_anext(
iterator: AsyncIterator[T], default: T | Any = _no_default
) -> Awaitable[T | Any | None]:
@@ -130,7 +128,7 @@ async def tee_peer(
if buffer:
continue
try:
item = await anext(iterator)
item = await iterator.__anext__()
except StopAsyncIteration:
break
else:

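Review note on the `py_anext` hunks above: the comment says the builtin `anext()` was unavailable before Python 3.10, which is why a helper with an optional default exists. The sketch below shows the general idea only. `anext_compat` and `_MISSING` are hypothetical names, and unlike the `py_anext` signature in the diff (a regular function returning an `Awaitable`), this version is a plain coroutine function.

```python
import asyncio
from typing import Any, AsyncIterator

_MISSING = object()  # marks "no default supplied"


async def anext_compat(iterator: AsyncIterator, default: Any = _MISSING) -> Any:
    """Return the next item, or `default` once the iterator is exhausted."""
    try:
        return await iterator.__anext__()
    except StopAsyncIteration:
        if default is _MISSING:
            raise
        return default


async def _demo() -> list:
    async def numbers():
        yield 1

    stream = numbers()
    sentinel = object()
    first = await anext_compat(stream, sentinel)   # the single yielded item
    second = await anext_compat(stream, sentinel)  # exhausted, so the sentinel
    return [first, second is sentinel]


result = asyncio.run(_demo())
```

This mirrors the "wait for first chunk" pattern in the event-stream hunk, where a sentinel default signals an empty stream instead of an exception.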
@@ -8,7 +8,7 @@ import logging
import types
import typing
import uuid
from collections.abc import Mapping
from collections.abc import Callable
from typing import (
TYPE_CHECKING,
Annotated,
@@ -18,10 +18,8 @@ from typing import (
cast,
get_args,
get_origin,
get_type_hints,
)

import typing_extensions
from pydantic import BaseModel
from pydantic.v1 import BaseModel as BaseModelV1
from pydantic.v1 import Field as Field_v1
@@ -35,8 +33,6 @@ from langchain_core.utils.json_schema import dereference_refs
from langchain_core.utils.pydantic import is_basemodel_subclass

if TYPE_CHECKING:
from collections.abc import Callable

from langchain_core.tools import BaseTool

logger = logging.getLogger(__name__)
@@ -234,20 +230,13 @@ def _convert_any_typed_dicts_to_pydantic(
if is_typeddict(type_):
typed_dict = type_
docstring = inspect.getdoc(typed_dict)
# Use get_type_hints to properly resolve forward references and
# string annotations in Python 3.14+ (PEP 649 deferred annotations).
# include_extras=True preserves Annotated metadata.
try:
annotations_ = get_type_hints(typed_dict, include_extras=True)
except Exception:
# Fallback for edge cases where get_type_hints might fail
annotations_ = typed_dict.__annotations__
annotations_ = typed_dict.__annotations__
description, arg_descriptions = _parse_google_docstring(
docstring, list(annotations_)
)
fields: dict = {}
for arg, arg_type in annotations_.items():
if get_origin(arg_type) in {Annotated, typing_extensions.Annotated}:
if get_origin(arg_type) is Annotated:  # type: ignore[comparison-overlap]
annotated_args = get_args(arg_type)
new_arg_type = _convert_any_typed_dicts_to_pydantic(
annotated_args[0], depth=depth + 1, visited=visited
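Review note on the `get_type_hints` hunk above: the comment states that `include_extras=True` preserves `Annotated` metadata, and the loop then checks `get_origin(arg_type)` against `Annotated`. A minimal standard-library sketch of that detection follows. `Movie` and `split_annotations` are hypothetical names for illustration and do not reproduce the conversion logic of `_convert_any_typed_dicts_to_pydantic`.

```python
from typing import Annotated, TypedDict, get_args, get_origin, get_type_hints


class Movie(TypedDict):
    """Hypothetical example type."""

    title: Annotated[str, "Title of the movie"]
    year: int


def split_annotations(typed_dict: type) -> dict:
    """Map each field name to (base_type, metadata_tuple)."""
    # include_extras=True keeps the Annotated wrapper; without it the
    # metadata is stripped and only the base type is returned.
    hints = get_type_hints(typed_dict, include_extras=True)
    fields = {}
    for name, hint in hints.items():
        if get_origin(hint) is Annotated:
            base, *metadata = get_args(hint)
            fields[name] = (base, tuple(metadata))
        else:
            fields[name] = (hint, ())
    return fields


fields = split_annotations(Movie)
plain_hints = get_type_hints(Movie)  # metadata stripped by default
```

Calling `get_type_hints` without `include_extras` would lose the description string, which is why the flag matters when argument descriptions are carried in `Annotated`.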
@@ -337,7 +326,7 @@ def _format_tool_to_openai_function(tool: BaseTool) -> FunctionDescription:


def convert_to_openai_function(
function: Mapping[str, Any] | type | Callable | BaseTool,
function: dict[str, Any] | type | Callable | BaseTool,
*,
strict: bool | None = None,
) -> dict[str, Any]:
@@ -363,7 +352,6 @@ def convert_to_openai_function(
ValueError: If function is not in a supported format.

!!! warning "Behavior changed in `langchain-core` 0.3.16"

`description` and `parameters` keys are now optional. Only `name` is
required and guaranteed to be part of the output.
"""
@@ -464,7 +452,7 @@ _WellKnownOpenAITools = (


def convert_to_openai_tool(
tool: Mapping[str, Any] | type[BaseModel] | Callable | BaseTool,
tool: dict[str, Any] | type[BaseModel] | Callable | BaseTool,
*,
strict: bool | None = None,
) -> dict[str, Any]:
@@ -488,18 +476,15 @@ def convert_to_openai_tool(
OpenAI tool-calling API.

!!! warning "Behavior changed in `langchain-core` 0.3.16"

`description` and `parameters` keys are now optional. Only `name` is
required and guaranteed to be part of the output.

!!! warning "Behavior changed in `langchain-core` 0.3.44"

Return OpenAI Responses API-style tools unchanged. This includes
any dict with `"type"` in `"file_search"`, `"function"`,
`"computer_use_preview"`, `"web_search_preview"`.

!!! warning "Behavior changed in `langchain-core` 0.3.63"

Added support for OpenAI's image generation built-in tool.
"""
# Import locally to prevent circular import

@@ -22,9 +22,6 @@ def get_color_mapping(

Returns:
The mapping of items to colors.

Raises:
ValueError: If no colors are available after applying exclusions.
"""
colors = list(_TEXT_COLOR_MAPPING.keys())
if excluded_colors is not None:

@@ -4,13 +4,11 @@ from __future__ import annotations

import json
import re
from typing import TYPE_CHECKING, Any
from collections.abc import Callable
from typing import Any

from langchain_core.exceptions import OutputParserException

if TYPE_CHECKING:
from collections.abc import Callable


def _replace_new_line(match: re.Match[str]) -> str:
value = match.group(2)

@@ -170,33 +170,28 @@ def dereference_refs(
full_schema: dict | None = None,
skip_keys: Sequence[str] | None = None,
) -> dict:
"""Resolve and inline JSON Schema `$ref` references in a schema object.
"""Resolve and inline JSON Schema $ref references in a schema object.

This function processes a JSON Schema and resolves all `$ref` references by
replacing them with the actual referenced content.

Handles both simple references and complex cases like circular references and mixed
`$ref` objects that contain additional properties alongside the `$ref`.
This function processes a JSON Schema and resolves all $ref references by replacing
them with the actual referenced content. It handles both simple references and
complex cases like circular references and mixed $ref objects that contain
additional properties alongside the $ref.

Args:
schema_obj: The JSON Schema object or fragment to process.

This can be a complete schema or just a portion of one.
full_schema: The complete schema containing all definitions that `$refs` might
point to.

If not provided, defaults to `schema_obj` (useful when the schema is
self-contained).
skip_keys: Controls recursion behavior and reference resolution depth.

- If `None` (Default): Only recurse under `'$defs'` and use shallow
reference resolution (break cycles but don't deep-inline nested refs)
- If provided (even as `[]`): Recurse under all keys and use deep reference
resolution (fully inline all nested references)
schema_obj: The JSON Schema object or fragment to process. This can be a
complete schema or just a portion of one.
full_schema: The complete schema containing all definitions that $refs might
point to. If not provided, defaults to schema_obj (useful when the
schema is self-contained).
skip_keys: Controls recursion behavior and reference resolution depth:
- If `None` (Default): Only recurse under '$defs' and use shallow reference
resolution (break cycles but don't deep-inline nested refs)
- If provided (even as []): Recurse under all keys and use deep reference
resolution (fully inline all nested references)

Returns:
A new dictionary with all $ref references resolved and inlined.
The original `schema_obj` is not modified.
A new dictionary with all $ref references resolved and inlined. The original
schema_obj is not modified.

Examples:
Basic reference resolution:
@@ -208,8 +203,7 @@ def dereference_refs(
>>> result = dereference_refs(schema)
>>> result["properties"]["name"] # {"type": "string"}

Mixed `$ref` with additional properties:

Mixed $ref with additional properties:
>>> schema = {
... "properties": {
... "name": {"$ref": "#/$defs/base", "description": "User name"}
@@ -221,7 +215,6 @@ def dereference_refs(
# {"type": "string", "minLength": 1, "description": "User name"}

Handling circular references:

>>> schema = {
... "properties": {"user": {"$ref": "#/$defs/User"}},
... "$defs": {
@@ -234,11 +227,10 @@ def dereference_refs(
>>> result = dereference_refs(schema) # Won't cause infinite recursion

!!! note

- Circular references are handled gracefully by breaking cycles
- Mixed `$ref` objects (with both `$ref` and other properties) are supported
- Additional properties in mixed `$refs` override resolved properties
- The `$defs` section is preserved in the output by default
- Mixed $ref objects (with both $ref and other properties) are supported
- Additional properties in mixed $refs override resolved properties
- The $defs section is preserved in the output by default
"""
full = full_schema or schema_obj
keys_to_skip = list(skip_keys) if skip_keys is not None else ["$defs"]

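Review note on the `dereference_refs` docstring above: the behaviors it lists (inlining `$ref` targets, letting sibling properties override resolved ones, breaking cycles) can be illustrated with a small standalone resolver. This is a sketch under stated assumptions, not the `langchain_core` implementation: `resolve_pointer` and `inline_refs` are hypothetical names, only local `#/...` pointers are supported, there is no `skip_keys` handling, and a repeated reference inside a cycle is simply dropped, which may differ from the real function.

```python
from typing import Any


def resolve_pointer(full_schema: dict, ref: str) -> Any:
    """Follow a local JSON pointer such as '#/$defs/base'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are supported: {ref!r}")
    node: Any = full_schema
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def inline_refs(obj: Any, full_schema: dict, seen: tuple = ()) -> Any:
    """Return a copy of obj with $ref entries replaced by their targets."""
    if isinstance(obj, list):
        return [inline_refs(item, full_schema, seen) for item in obj]
    if not isinstance(obj, dict):
        return obj
    ref = obj.get("$ref")
    if isinstance(ref, str):
        extras = {
            key: inline_refs(value, full_schema, seen)
            for key, value in obj.items()
            if key != "$ref"
        }
        if ref in seen:
            # Already expanding this target: break the cycle here.
            return extras
        target = resolve_pointer(full_schema, ref)
        resolved = inline_refs(target, full_schema, seen + (ref,))
        # Sibling properties override what the reference resolved to.
        return {**resolved, **extras} if isinstance(resolved, dict) else resolved
    return {key: inline_refs(value, full_schema, seen) for key, value in obj.items()}


mixed = {
    "properties": {"name": {"$ref": "#/$defs/base", "description": "User name"}},
    "$defs": {"base": {"type": "string", "minLength": 1}},
}
mixed_result = inline_refs(mixed, mixed)

cyclic = {
    "properties": {"user": {"$ref": "#/$defs/User"}},
    "$defs": {
        "User": {"type": "object", "properties": {"friend": {"$ref": "#/$defs/User"}}}
    },
}
cyclic_result = inline_refs(cyclic, cyclic)  # terminates despite the cycle
```

Tracking the chain of references currently being expanded (`seen`) is one simple way to guarantee termination; every dict in the result is newly built, so the input schema is left unmodified.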
Some files were not shown because too many files have changed in this diff