Compare commits

..

3 Commits

Author SHA1 Message Date
Sydney Runkle
b67cd71d7d release: langchain 1.0.8 (#34019) 2025-11-19 09:12:37 -05:00
Sydney Runkle
e150b7c7e3 release: langchain 1.0.7 (#33979)
support resumable (ensure or create) resources for shell middleware
2025-11-14 15:50:11 -05:00
Sydney Runkle
ee3fc91e7a fix: cherry picking fixes for langchain + langchain-anthropic releases (#33975)
Co-authored-by: ccurme <chester.curme@gmail.com>
2025-11-14 13:28:30 -05:00
431 changed files with 30675 additions and 31083 deletions

132
.github/CODE_OF_CONDUCT.md vendored Normal file
View File

@@ -0,0 +1,132 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
conduct@langchain.dev.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations

6
.github/CONTRIBUTING.md vendored Normal file
View File

@@ -0,0 +1,6 @@
# Contributing to LangChain
Hi there! Thank you for even being interested in contributing to LangChain.
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether they involve new features, improved infrastructure, better documentation, or bug fixes.
To learn how to contribute to LangChain, please follow the [contribution guide here](https://docs.langchain.com/oss/python/contributing).

View File

@@ -8,15 +8,16 @@ body:
value: |
Thank you for taking the time to file a bug report.
For usage questions, feature requests and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
Use this to report BUGS in LangChain. For usage questions, feature requests and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
Check these before submitting to see if your issue has already been reported, fixed or if there's another way to solve your problem:
Relevant links to check before filing a bug report to see if your issue has already been reported, fixed or
if there's another way to solve your problem:
* [Documentation](https://docs.langchain.com/oss/python/langchain/overview),
* [API Reference Documentation](https://reference.langchain.com/python/),
* [LangChain Forum](https://forum.langchain.com/),
* [LangChain documentation with the integrated search](https://docs.langchain.com/oss/python/langchain/overview),
* [API Reference](https://reference.langchain.com/python/),
* [LangChain ChatBot](https://chat.langchain.com/)
* [GitHub search](https://github.com/langchain-ai/langchain),
* [LangChain Forum](https://forum.langchain.com/),
- type: checkboxes
id: checks
attributes:
@@ -35,48 +36,16 @@ body:
required: true
- label: This is not related to the langchain-community package.
required: true
- label: I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
required: true
- label: I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.
required: true
- type: checkboxes
id: package
attributes:
label: Package (Required)
description: |
Which `langchain` package(s) is this bug related to? Select at least one.
Note that if the package you are reporting for is not listed here, it is not in this repository (e.g. `langchain-google-genai` is in [`langchain-ai/langchain-google`](https://github.com/langchain-ai/langchain-google/)).
Please report issues for other packages to their respective repositories.
options:
- label: langchain
- label: langchain-openai
- label: langchain-anthropic
- label: langchain-classic
- label: langchain-core
- label: langchain-cli
- label: langchain-model-profiles
- label: langchain-tests
- label: langchain-text-splitters
- label: langchain-chroma
- label: langchain-deepseek
- label: langchain-exa
- label: langchain-fireworks
- label: langchain-groq
- label: langchain-huggingface
- label: langchain-mistralai
- label: langchain-nomic
- label: langchain-ollama
- label: langchain-perplexity
- label: langchain-prompty
- label: langchain-qdrant
- label: langchain-xai
- label: Other / not sure / general
- type: textarea
id: reproduction
validations:
required: true
attributes:
label: Example Code (Python)
label: Example Code
description: |
Please add a self-contained, [minimal, reproducible, example](https://stackoverflow.com/help/minimal-reproducible-example) with your use case.
@@ -84,12 +53,15 @@ body:
**Important!**
* Avoid screenshots, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
* Reduce your code to the minimum required to reproduce the issue if possible.
* Avoid screenshots when possible, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
* Reduce your code to the minimum required to reproduce the issue if possible. This makes it much easier for others to help you.
* Use code tags (e.g., ```python ... ```) to correctly [format your code](https://help.github.com/en/github/writing-on-github/creating-and-highlighting-code-blocks#syntax-highlighting).
* INCLUDE the language label (e.g. `python`) after the first three backticks to enable syntax highlighting. (e.g., ```python rather than ```).
(This will be automatically formatted into code, so no need for backticks.)
render: python
placeholder: |
The following code:
```python
from langchain_core.runnables import RunnableLambda
def bad_code(inputs) -> int:
@@ -97,14 +69,17 @@ body:
chain = RunnableLambda(bad_code)
chain.invoke('Hello!')
```
- type: textarea
id: error
validations:
required: false
attributes:
label: Error Message and Stack Trace (if applicable)
description: |
If you are reporting an error, please copy and paste the full error message and
stack trace.
(This will be automatically formatted into code, so no need for backticks.)
render: shell
If you are reporting an error, please include the full error message and stack trace.
placeholder: |
Exception + full stack trace
- type: textarea
id: description
attributes:
@@ -124,7 +99,9 @@ body:
attributes:
label: System Info
description: |
Please share your system info with us.
Please share your system info with us. Do NOT skip this step and please don't trim
the output. Most users don't include enough information here and it makes it harder
for us to help you.
Run the following command in your terminal and paste the output here:
@@ -136,6 +113,8 @@ body:
from langchain_core import sys_info
sys_info.print_sys_info()
```
alternatively, put the entire output of `pip freeze` here.
placeholder: |
python -m langchain_core.sys_info
validations:

View File

@@ -1,15 +1,9 @@
blank_issues_enabled: false
version: 2.1
contact_links:
- name: 📚 Documentation issue
- name: 📚 Documentation
url: https://github.com/langchain-ai/docs/issues/new?template=01-langchain.yml
about: Report an issue related to the LangChain documentation
- name: 💬 LangChain Forum
url: https://forum.langchain.com/
about: General community discussions and support
- name: 📚 LangChain Documentation
url: https://docs.langchain.com/oss/python/langchain/overview
about: View the official LangChain documentation
- name: 📚 API Reference Documentation
url: https://reference.langchain.com/python/
about: View the official LangChain API reference documentation

View File

@@ -13,11 +13,11 @@ body:
Relevant links to check before filing a feature request to see if your request has already been made or
if there's another way to achieve what you want:
* [Documentation](https://docs.langchain.com/oss/python/langchain/overview),
* [API Reference Documentation](https://reference.langchain.com/python/),
* [LangChain Forum](https://forum.langchain.com/),
* [LangChain documentation with the integrated search](https://docs.langchain.com/oss/python/langchain/overview),
* [API Reference](https://reference.langchain.com/python/),
* [LangChain ChatBot](https://chat.langchain.com/)
* [GitHub search](https://github.com/langchain-ai/langchain),
* [LangChain Forum](https://forum.langchain.com/),
- type: checkboxes
id: checks
attributes:
@@ -34,40 +34,6 @@ body:
required: true
- label: This is not related to the langchain-community package.
required: true
- type: checkboxes
id: package
attributes:
label: Package (Required)
description: |
Which `langchain` package(s) is this request related to? Select at least one.
Note that if the package you are requesting for is not listed here, it is not in this repository (e.g. `langchain-google-genai` is in `langchain-ai/langchain`).
Please submit feature requests for other packages to their respective repositories.
options:
- label: langchain
- label: langchain-openai
- label: langchain-anthropic
- label: langchain-classic
- label: langchain-core
- label: langchain-cli
- label: langchain-model-profiles
- label: langchain-tests
- label: langchain-text-splitters
- label: langchain-chroma
- label: langchain-deepseek
- label: langchain-exa
- label: langchain-fireworks
- label: langchain-groq
- label: langchain-huggingface
- label: langchain-mistralai
- label: langchain-nomic
- label: langchain-ollama
- label: langchain-perplexity
- label: langchain-prompty
- label: langchain-qdrant
- label: langchain-xai
- label: Other / not sure / general
- type: textarea
id: feature-description
validations:

View File

@@ -18,33 +18,3 @@ body:
attributes:
label: Issue Content
description: Add the content of the issue here.
- type: checkboxes
id: package
attributes:
label: Package (Required)
description: |
Please select package(s) that this issue is related to.
options:
- label: langchain
- label: langchain-openai
- label: langchain-anthropic
- label: langchain-classic
- label: langchain-core
- label: langchain-cli
- label: langchain-model-profiles
- label: langchain-tests
- label: langchain-text-splitters
- label: langchain-chroma
- label: langchain-deepseek
- label: langchain-exa
- label: langchain-fireworks
- label: langchain-groq
- label: langchain-huggingface
- label: langchain-mistralai
- label: langchain-nomic
- label: langchain-ollama
- label: langchain-perplexity
- label: langchain-prompty
- label: langchain-qdrant
- label: langchain-xai
- label: Other / not sure / general

View File

@@ -25,13 +25,13 @@ body:
label: Task Description
description: |
Provide a clear and detailed description of the task.
What needs to be done? Be specific about the scope and requirements.
placeholder: |
This task involves...
The goal is to...
Specific requirements:
- ...
- ...
@@ -43,7 +43,7 @@ body:
label: Acceptance Criteria
description: |
Define the criteria that must be met for this task to be considered complete.
What are the specific deliverables or outcomes expected?
placeholder: |
This task will be complete when:
@@ -58,15 +58,15 @@ body:
label: Context and Background
description: |
Provide any relevant context, background information, or links to related issues/PRs.
Why is this task needed? What problem does it solve?
placeholder: |
Background:
- ...
Related issues/PRs:
- #...
Additional context:
- ...
validations:
@@ -77,45 +77,15 @@ body:
label: Dependencies
description: |
List any dependencies or blockers for this task.
Are there other tasks, issues, or external factors that need to be completed first?
placeholder: |
This task depends on:
- [ ] Issue #...
- [ ] PR #...
- [ ] External dependency: ...
Blocked by:
- ...
validations:
required: false
- type: checkboxes
id: package
attributes:
label: Package (Required)
description: |
Please select package(s) that this task is related to.
options:
- label: langchain
- label: langchain-openai
- label: langchain-anthropic
- label: langchain-classic
- label: langchain-core
- label: langchain-cli
- label: langchain-model-profiles
- label: langchain-tests
- label: langchain-text-splitters
- label: langchain-chroma
- label: langchain-deepseek
- label: langchain-exa
- label: langchain-fireworks
- label: langchain-groq
- label: langchain-huggingface
- label: langchain-mistralai
- label: langchain-nomic
- label: langchain-ollama
- label: langchain-perplexity
- label: langchain-prompty
- label: langchain-qdrant
- label: langchain-xai
- label: Other / not sure / general

View File

@@ -1,30 +1,28 @@
(Replace this entire block of text)
Read the full contributing guidelines: https://docs.langchain.com/oss/python/contributing/overview
Thank you for contributing to LangChain! Follow these steps to have your pull request considered as ready for review.
1. PR title: Should follow the format: TYPE(SCOPE): DESCRIPTION
Thank you for contributing to LangChain! Follow these steps to mark your pull request as ready for review. **If any of these steps are not completed, your PR will not be considered for review.**
- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
- Examples:
- fix(anthropic): resolve flag parsing error
- feat(core): add multi-tenant support
- test(openai): update API usage tests
- Allowed TYPE and SCOPE values: https://github.com/langchain-ai/langchain/blob/master/.github/workflows/pr_lint.yml#L15-L33
- fix(cli): resolve flag parsing error
- docs(openai): update API usage examples
- Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert, release
- Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, text-splitters, docs, anthropic, chroma, deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama, openai, perplexity, prompty, qdrant, xai, infra
- Once you've written the title, please delete this checklist item; do not include it in the PR.
2. PR description:
- [ ] **PR message**: ***Delete this entire checklist*** and replace with
- **Description:** a description of the change. Include a [closing keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) if applicable to a relevant issue.
- **Issue:** the issue # it fixes, if applicable (e.g. Fixes #123)
- **Dependencies:** any dependencies required for this change
- Write 1-2 sentences summarizing the change.
- If this PR addresses a specific issue, please include "Fixes #ISSUE_NUMBER" in the description to automatically close the issue when the PR is merged.
- If there are any breaking changes, please clearly describe them.
- If this PR depends on another PR being merged first, please include "Depends on #PR_NUMBER" inthe description.
3. Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified.
- We will not consider a PR unless these three are passing in CI.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. **We will not consider a PR unless these three are passing in CI.** See [contribution guidelines](https://docs.langchain.com/oss/python/contributing) for more.
Additional guidelines:
- We ask that if you use generative AI for your contribution, you include a disclaimer.
- PRs should not touch more than one package unless absolutely necessary.
- Do not update the `uv.lock` files unless or add dependencies to `pyproject.toml` files (even optional ones) unless you have explicit permission to do so by a maintainer.
- Most PRs should not touch more than one package.
- Please do not add dependencies to `pyproject.toml` files (even optional ones) unless they are **required** for unit tests. Likewise, please do not update the `uv.lock` files unless you are adding a required dependency.
- Changes should be backwards compatible.
- Make sure optional dependencies are imported within a function.

330
.github/copilot-instructions.md vendored Normal file
View File

@@ -0,0 +1,330 @@
# Global Development Guidelines for LangChain Projects
## Core Development Principles
### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL
**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**
**Bad - Breaking Change:**
```python
def get_user(id, verbose=False): # Changed from `user_id`
pass
```
**Good - Stable Interface:**
```python
def get_user(user_id: str, verbose: bool = False) -> User:
"""Retrieve user by ID with optional verbose output."""
pass
```
**Before making ANY changes to public APIs:**
- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
- Mark experimental features clearly with docstring admonitions (using MkDocs Material, like `!!! warning`)
🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?"
### 2. Code Quality Standards
**All Python code MUST include type hints and return types.**
**Bad:**
```python
def p(u, d):
return [x for x in u if x not in d]
```
**Good:**
```python
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
"""Filter out users that are not in the known users set.
Args:
users: List of user identifiers to filter.
known_users: Set of known/valid user identifiers.
Returns:
List of users that are not in the known_users set.
"""
return [user for user in users if user not in known_users]
```
**Style Requirements:**
- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
- Avoid unnecessary abstraction or premature optimization
- Follow existing patterns in the codebase you're modifying
### 3. Testing Requirements
**Every new feature or bugfix MUST be covered by unit tests.**
**Test Organization:**
- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests: `tests/integration_tests/` (network calls permitted)
- Use `pytest` as the testing framework
**Test Quality Checklist:**
- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)
Checklist questions:
- [ ] Does the test suite fail if your new logic is broken?
- [ ] Are all expected behaviors exercised (happy path, invalid input, etc)?
- [ ] Do tests use fixtures or mocks where needed?
```python
def test_filter_unknown_users():
"""Test filtering unknown users from a list."""
users = ["alice", "bob", "charlie"]
known_users = {"alice", "bob"}
result = filter_unknown_users(users, known_users)
assert result == ["charlie"]
assert len(result) == 1
```
### 4. Security and Risk Assessment
**Security Checklist:**
- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
- Remove unreachable/commented code before committing
- Race conditions or resource leaks (file handles, sockets, threads).
- Ensure proper resource cleanup (file handles, connections)
**Bad:**
```python
def load_config(path):
with open(path) as f:
return eval(f.read()) # ⚠️ Never eval config
```
**Good:**
```python
import json
def load_config(path: str) -> dict:
with open(path) as f:
return json.load(f)
```
### 5. Documentation Standards
**Use Google-style docstrings with Args and Returns sections for all public functions.**
**Insufficient Documentation:**
```python
def send_email(to, msg):
"""Send an email to a recipient."""
```
**Complete Documentation:**
```python
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
"""
Send an email to a recipient with specified priority.
Args:
to: The email address of the recipient.
msg: The message body to send.
priority: Email priority level.
Returns:
True if email was sent successfully, False otherwise.
Raises:
InvalidEmailError: If the email address format is invalid.
SMTPConnectionError: If unable to connect to email server.
"""
```
**Documentation Guidelines:**
- Types go in function signatures, NOT in docstrings
- Focus on "why" rather than "what" in descriptions
- Document all parameters, return values, and exceptions
- Keep descriptions concise but clear
📌 *Tip:* Keep descriptions concise but clear. Only document return values if non-obvious.
### 6. Architectural Improvements
**When you encounter code that could be improved, suggest better designs:**
**Poor Design:**
```python
def process_data(data, db_conn, email_client, logger):
# Function doing too many things
validated = validate_data(data)
result = db_conn.save(validated)
email_client.send_notification(result)
logger.log(f"Processed {len(data)} items")
return result
```
**Better Design:**
```python
@dataclass
class ProcessingResult:
"""Result of data processing operation."""
items_processed: int
success: bool
errors: List[str] = field(default_factory=list)
class DataProcessor:
"""Handles data validation, storage, and notification."""
def __init__(self, db_conn: Database, email_client: EmailClient):
self.db = db_conn
self.email = email_client
def process(self, data: List[dict]) -> ProcessingResult:
"""Process and store data with notifications.
Args:
data: List of data items to process.
Returns:
ProcessingResult with details of the operation.
"""
validated = self._validate_data(data)
result = self.db.save(validated)
self._notify_completion(result)
return result
```
**Design Improvement Areas:**
If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would:
- Reduce code duplication through shared utilities
- Make unit testing easier
- Improve separation of concerns (single responsibility)
- Make unit testing easier through dependency injection
- Add clarity without adding complexity
- Prefer dataclasses for structured data
## Development Tools & Commands
### Package Management
```bash
# Add package
uv add package-name
# Sync project dependencies
uv sync
uv lock
```
### Testing
```bash
# Run unit tests (no network)
make test
# Don't run integration tests, as API keys must be set
# Run specific test file
uv run --group test pytest tests/unit_tests/test_specific.py
```
### Code Quality
```bash
# Lint code
make lint
# Format code
make format
# Type checking
uv run --group lint mypy .
```
### Dependency Management Patterns
**Local Development Dependencies:**
```toml
[tool.uv.sources]
langchain-core = { path = "../core", editable = true }
langchain-tests = { path = "../standard-tests", editable = true }
```
**For tools, use the `@tool` decorator from `langchain_core.tools`:**
```python
from langchain_core.tools import tool
@tool
def search_database(query: str) -> str:
"""Search the database for relevant information.
Args:
query: The search query string.
"""
# Implementation here
return results
```
## Commit Standards
**Use Conventional Commits format for PR titles:**
- `feat(core): add multi-tenant support`
- `!fix(cli): resolve flag parsing error` (breaking change uses exclamation mark)
- `docs: update API usage examples`
- `docs(openai): update API usage examples`
## Framework-Specific Guidelines
- Follow the existing patterns in `langchain_core` for base abstractions
- Implement proper streaming support where applicable
- Avoid deprecated components
### Partner Integrations
- Follow the established patterns in existing partner libraries
- Implement standard interfaces (`BaseChatModel`, `BaseEmbeddings`, etc.)
- Include comprehensive integration tests
- Document API key requirements and authentication
---
## Quick Reference Checklist
Before submitting code changes:
- [ ] **Breaking Changes**: Verified no public API changes
- [ ] **Type Hints**: All functions have complete type annotations
- [ ] **Tests**: New functionality is fully tested
- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.)
- [ ] **Documentation**: Google-style docstrings for public functions
- [ ] **Code Quality**: `make lint` and `make format` pass
- [ ] **Architecture**: Suggested improvements where applicable
- [ ] **Commit Message**: Follows Conventional Commits format

View File

@@ -148,5 +148,16 @@ documentation:
- changed-files:
- any-glob-to-any-file:
- "**/*.md"
- "**/*.rst"
- "**/README*"
# Security related changes
security:
- changed-files:
- any-glob-to-any-file:
- "**/*security*"
- "**/*auth*"
- "**/*credential*"
- "**/*secret*"
- "**/*token*"
- ".github/workflows/security*"

View File

@@ -35,7 +35,7 @@ jobs:
timeout-minutes: 20
name: "Python ${{ inputs.python-version }}"
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
uses: "./.github/actions/uv_setup"

View File

@@ -38,7 +38,7 @@ jobs:
timeout-minutes: 20
steps:
- name: "📋 Checkout Code"
uses: actions/checkout@v6
uses: actions/checkout@v5
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
uses: "./.github/actions/uv_setup"
@@ -47,12 +47,6 @@ jobs:
cache-suffix: lint-${{ inputs.working-directory }}
working-directory: ${{ inputs.working-directory }}
# - name: "🔒 Verify Lockfile is Up-to-Date"
# working-directory: ${{ inputs.working-directory }}
# run: |
# unset UV_FROZEN
# uv lock --check
- name: "📦 Install Lint & Typing Dependencies"
working-directory: ${{ inputs.working-directory }}
run: |

View File

@@ -19,7 +19,7 @@ on:
required: true
type: string
description: "From which folder this pipeline executes"
default: "libs/langchain_v1"
default: "libs/langchain"
release-version:
required: true
type: string
@@ -54,7 +54,7 @@ jobs:
version: ${{ steps.check-version.outputs.version }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
- name: Set up Python + uv
uses: "./.github/actions/uv_setup"
@@ -77,7 +77,7 @@ jobs:
working-directory: ${{ inputs.working-directory }}
- name: Upload build
uses: actions/upload-artifact@v6
uses: actions/upload-artifact@v5
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
@@ -105,7 +105,7 @@ jobs:
outputs:
release-body: ${{ steps.generate-release-body.outputs.release-body }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain
path: langchain
@@ -206,9 +206,9 @@ jobs:
id-token: write
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
- uses: actions/download-artifact@v7
- uses: actions/download-artifact@v6
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
@@ -237,7 +237,7 @@ jobs:
contents: read
timeout-minutes: 20
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
# We explicitly *don't* set up caching here. This ensures our tests are
# maximally sensitive to catching breakage.
@@ -258,7 +258,7 @@ jobs:
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v7
- uses: actions/download-artifact@v6
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
@@ -412,7 +412,7 @@ jobs:
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
LANGCHAIN_TESTS_USER_AGENT: ${{ secrets.LANGCHAIN_TESTS_USER_AGENT }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
# We implement this conditional as Github Actions does not have good support
# for conditionally needing steps. https://github.com/actions/runner/issues/491
@@ -430,7 +430,7 @@ jobs:
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v7
- uses: actions/download-artifact@v6
if: startsWith(inputs.working-directory, 'libs/core')
with:
name: dist
@@ -492,14 +492,14 @@ jobs:
working-directory: ${{ inputs.working-directory }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
- name: Set up Python + uv
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v7
- uses: actions/download-artifact@v6
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
@@ -532,14 +532,14 @@ jobs:
working-directory: ${{ inputs.working-directory }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
- name: Set up Python + uv
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v7
- uses: actions/download-artifact@v6
with:
name: dist
path: ${{ inputs.working-directory }}/dist/

View File

@@ -33,7 +33,7 @@ jobs:
name: "Python ${{ inputs.python-version }}"
steps:
- name: "📋 Checkout Code"
uses: actions/checkout@v6
uses: actions/checkout@v5
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
uses: "./.github/actions/uv_setup"

View File

@@ -36,7 +36,7 @@ jobs:
name: "Pydantic ~=${{ inputs.pydantic-version }}"
steps:
- name: "📋 Checkout Code"
uses: actions/checkout@v6
uses: actions/checkout@v5
- name: "🐍 Set up Python ${{ inputs.python-version }} + UV"
uses: "./.github/actions/uv_setup"

View File

@@ -1,107 +0,0 @@
name: Auto Label Issues by Package
on:
issues:
types: [opened, edited]
jobs:
label-by-package:
permissions:
issues: write
runs-on: ubuntu-latest
steps:
- name: Sync package labels
uses: actions/github-script@v8
with:
script: |
const body = context.payload.issue.body || "";
// Extract text under "### Package"
const match = body.match(/### Package\s+([\s\S]*?)\n###/i);
if (!match) return;
const packageSection = match[1].trim();
// Mapping table for package names to labels
const mapping = {
"langchain": "langchain",
"langchain-openai": "openai",
"langchain-anthropic": "anthropic",
"langchain-classic": "langchain-classic",
"langchain-core": "core",
"langchain-cli": "cli",
"langchain-model-profiles": "model-profiles",
"langchain-tests": "standard-tests",
"langchain-text-splitters": "text-splitters",
"langchain-chroma": "chroma",
"langchain-deepseek": "deepseek",
"langchain-exa": "exa",
"langchain-fireworks": "fireworks",
"langchain-groq": "groq",
"langchain-huggingface": "huggingface",
"langchain-mistralai": "mistralai",
"langchain-nomic": "nomic",
"langchain-ollama": "ollama",
"langchain-perplexity": "perplexity",
"langchain-prompty": "prompty",
"langchain-qdrant": "qdrant",
"langchain-xai": "xai",
};
// All possible package labels we manage
const allPackageLabels = Object.values(mapping);
const selectedLabels = [];
// Check if this is checkbox format (multiple selection)
const checkboxMatches = packageSection.match(/- \[x\]\s+([^\n\r]+)/gi);
if (checkboxMatches) {
// Handle checkbox format
for (const match of checkboxMatches) {
const packageName = match.replace(/- \[x\]\s+/i, '').trim();
const label = mapping[packageName];
if (label && !selectedLabels.includes(label)) {
selectedLabels.push(label);
}
}
} else {
// Handle dropdown format (single selection)
const label = mapping[packageSection];
if (label) {
selectedLabels.push(label);
}
}
// Get current issue labels
const issue = await github.rest.issues.get({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number
});
const currentLabels = issue.data.labels.map(label => label.name);
const currentPackageLabels = currentLabels.filter(label => allPackageLabels.includes(label));
// Determine labels to add and remove
const labelsToAdd = selectedLabels.filter(label => !currentPackageLabels.includes(label));
const labelsToRemove = currentPackageLabels.filter(label => !selectedLabels.includes(label));
// Add new labels
if (labelsToAdd.length > 0) {
await github.rest.issues.addLabels({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
labels: labelsToAdd
});
}
// Remove old labels
for (const label of labelsToRemove) {
await github.rest.issues.removeLabel({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
name: label
});
}

View File

@@ -47,7 +47,7 @@ jobs:
if: ${{ !contains(github.event.pull_request.labels.*.name, 'ci-ignore') }}
steps:
- name: "📋 Checkout Code"
uses: actions/checkout@v6
uses: actions/checkout@v5
- name: "🐍 Setup Python 3.11"
uses: actions/setup-python@v6
with:
@@ -141,7 +141,7 @@ jobs:
run:
working-directory: ${{ matrix.job-configs.working-directory }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
- name: "🐍 Set up Python ${{ matrix.job-configs.python-version }} + UV"
uses: "./.github/actions/uv_setup"
@@ -182,7 +182,7 @@ jobs:
job-configs: ${{ fromJson(needs.build.outputs.codspeed) }}
fail-fast: false
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
- name: "📦 Install UV Package Manager"
uses: astral-sh/setup-uv@v7

View File

@@ -9,6 +9,8 @@ on:
paths:
- "libs/core/pyproject.toml"
- "libs/core/langchain_core/version.py"
- "libs/langchain_v1/pyproject.toml"
- "libs/langchain_v1/langchain/__init__.py"
permissions:
contents: read
@@ -18,9 +20,9 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
- name: "✅ Verify pyproject.toml & version.py Match"
- name: "✅ Verify pyproject.toml & version files Match"
run: |
# Check core versions
CORE_PYPROJECT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' libs/core/pyproject.toml)

View File

@@ -71,14 +71,14 @@ jobs:
working-directory: ${{ fromJSON(needs.compute-matrix.outputs.matrix).working-directory }}
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
with:
path: langchain
- uses: actions/checkout@v6
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain-google
path: langchain-google
- uses: actions/checkout@v6
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain-aws
path: langchain-aws

View File

@@ -26,13 +26,11 @@
# * revert — reverts a previous commit
# * release — prepare a new release
#
# Allowed Scope(s) (optional):
# core, cli, langchain, langchain_v1, langchain-classic, model-profiles,
# standard-tests, text-splitters, docs, anthropic, chroma, deepseek, exa,
# fireworks, groq, huggingface, mistralai, nomic, ollama, openai,
# perplexity, prompty, qdrant, xai, infra, deps
#
# Multiple scopes can be used by separating them with a comma.
# Allowed Scopes (optional):
# core, cli, langchain, langchain_v1, langchain-classic, standard-tests,
# text-splitters, docs, anthropic, chroma, deepseek, exa, fireworks, groq,
# huggingface, mistralai, nomic, ollama, openai, perplexity, prompty, qdrant,
# xai, infra, deps
#
# Rules:
# 1. The 'Type' must start with a lowercase letter.
@@ -102,7 +100,6 @@ jobs:
qdrant
xai
infra
deps
requireScope: false
disallowScopes: |
release

View File

@@ -23,12 +23,12 @@ jobs:
permissions:
contents: read
steps:
- uses: actions/checkout@v6
- uses: actions/checkout@v5
with:
ref: v0.3
path: langchain
- uses: actions/checkout@v6
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain-api-docs-html
path: langchain-api-docs-html

8
.github/workflows/v1_changes.md vendored Normal file
View File

@@ -0,0 +1,8 @@
With the deprecation of v0 docs, the following files will need to be migrated/supported
in the new docs repo:
- run_notebooks.yml: New repo should run Integration tests on code snippets?
- people.yml: Need to fix and somehow display on the new docs site
- Subsequently, `.github/actions/people/`
- _test_doc_imports.yml
- check-broken-links.yml

3
.gitignore vendored
View File

@@ -163,6 +163,3 @@ node_modules
prof
virtualenv/
scratch/
.langgraph_api/

View File

@@ -6,6 +6,8 @@
"ms-toolsai.jupyter",
"ms-toolsai.jupyter-keymap",
"ms-toolsai.jupyter-renderers",
"ms-toolsai.vscode-jupyter-cell-tags",
"ms-toolsai.vscode-jupyter-slideshow",
"yzhang.markdown-all-in-one",
"davidanson.vscode-markdownlint",
"bierner.markdown-mermaid",

419
AGENTS.md
View File

@@ -1,58 +1,255 @@
# Global development guidelines for the LangChain monorepo
# Global Development Guidelines for LangChain Projects
This document provides context to understand the LangChain Python project and assist with development.
## Core Development Principles
## Project architecture and context
### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL
### Monorepo structure
**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**
This is a Python monorepo with multiple independently versioned packages that use `uv`.
**Bad - Breaking Change:**
```txt
langchain/
├── libs/
│ ├── core/ # `langchain-core` primitives and base abstractions
│ ├── langchain/ # `langchain-classic` (legacy, no new features)
│ ├── langchain_v1/ # Actively maintained `langchain` package
│ ├── partners/ # Third-party integrations
│ │ ├── openai/ # OpenAI models and embeddings
│ │ ├── anthropic/ # Anthropic (Claude) integration
│ │ ├── ollama/ # Local model support
│ │ └── ... (other integrations maintained by the LangChain team)
│ ├── text-splitters/ # Document chunking utilities
│ ├── standard-tests/ # Shared test suite for integrations
│ ├── model-profiles/ # Model configuration profiles
│ └── cli/ # Command-line interface tools
├── .github/ # CI/CD workflows and templates
├── .vscode/ # VSCode IDE standard settings and recommended extensions
└── README.md # Information about LangChain
```python
def get_user(id, verbose=False): # Changed from `user_id`
pass
```
- **Core layer** (`langchain-core`): Base abstractions, interfaces, and protocols. Users should not need to know about this layer directly.
- **Implementation layer** (`langchain`): Concrete implementations and high-level public utilities
- **Integration layer** (`partners/`): Third-party service integrations. Note that this monorepo is not exhaustive of all LangChain integrations; some are maintained in separate repos, such as `langchain-ai/langchain-google` and `langchain-ai/langchain-aws`. Usually these repos are cloned at the same level as this monorepo, so if needed, you can refer to their code directly by navigating to `../langchain-google/` from this monorepo.
- **Testing layer** (`standard-tests/`): Standardized integration tests for partner integrations
**Good - Stable Interface:**
### Development tools & commands**
```python
def get_user(user_id: str, verbose: bool = False) -> User:
"""Retrieve user by ID with optional verbose output."""
pass
```
- `uv` Fast Python package installer and resolver (replaces pip/poetry)
- `make` Task runner for common development commands. Feel free to look at the `Makefile` for available commands and usage patterns.
- `ruff` Fast Python linter and formatter
- `mypy` Static type checking
- `pytest` Testing framework
**Before making ANY changes to public APIs:**
This monorepo uses `uv` for dependency management. Local development uses editable installs: `[tool.uv.sources]`
- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
- Mark experimental features clearly with docstring warnings (using MkDocs Material admonitions, like `!!! warning`)
Each package in `libs/` has its own `pyproject.toml` and `uv.lock`.
🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?"
### 2. Code Quality Standards
**All Python code MUST include type hints and return types.**
**Bad:**
```python
def p(u, d):
return [x for x in u if x not in d]
```
**Good:**
```python
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
"""Filter out users that are not in the known users set.
Args:
users: List of user identifiers to filter.
known_users: Set of known/valid user identifiers.
Returns:
List of users that are not in the known_users set.
"""
return [user for user in users if user not in known_users]
```
**Style Requirements:**
- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
- Avoid unnecessary abstraction or premature optimization
- Follow existing patterns in the codebase you're modifying
### 3. Testing Requirements
**Every new feature or bugfix MUST be covered by unit tests.**
**Test Organization:**
- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests: `tests/integration_tests/` (network calls permitted)
- Use `pytest` as the testing framework
**Test Quality Checklist:**
- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)
Checklist questions:
- [ ] Does the test suite fail if your new logic is broken?
- [ ] Are all expected behaviors exercised (happy path, invalid input, etc)?
- [ ] Do tests use fixtures or mocks where needed?
```python
def test_filter_unknown_users():
"""Test filtering unknown users from a list."""
users = ["alice", "bob", "charlie"]
known_users = {"alice", "bob"}
result = filter_unknown_users(users, known_users)
assert result == ["charlie"]
assert len(result) == 1
```
### 4. Security and Risk Assessment
**Security Checklist:**
- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
- Remove unreachable/commented code before committing
- Race conditions or resource leaks (file handles, sockets, threads).
- Ensure proper resource cleanup (file handles, connections)
**Bad:**
```python
def load_config(path):
with open(path) as f:
return eval(f.read()) # ⚠️ Never eval config
```
**Good:**
```python
import json
def load_config(path: str) -> dict:
with open(path) as f:
return json.load(f)
```
### 5. Documentation Standards
**Use Google-style docstrings with Args section for all public functions.**
**Insufficient Documentation:**
```python
def send_email(to, msg):
"""Send an email to a recipient."""
```
**Complete Documentation:**
```python
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
"""
Send an email to a recipient with specified priority.
Args:
to: The email address of the recipient.
msg: The message body to send.
priority: Email priority level (`'low'`, `'normal'`, `'high'`).
Returns:
`True` if email was sent successfully, `False` otherwise.
Raises:
`InvalidEmailError`: If the email address format is invalid.
`SMTPConnectionError`: If unable to connect to email server.
"""
```
**Documentation Guidelines:**
- Types go in function signatures, NOT in docstrings
- If a default is present, DO NOT repeat it in the docstring unless there is post-processing or it is set conditionally.
- Focus on "why" rather than "what" in descriptions
- Document all parameters, return values, and exceptions
- Keep descriptions concise but clear
- Ensure American English spelling (e.g., "behavior", not "behaviour")
📌 *Tip:* Keep descriptions concise but clear. Only document return values if non-obvious.
### 6. Architectural Improvements
**When you encounter code that could be improved, suggest better designs:**
**Poor Design:**
```python
def process_data(data, db_conn, email_client, logger):
# Function doing too many things
validated = validate_data(data)
result = db_conn.save(validated)
email_client.send_notification(result)
logger.log(f"Processed {len(data)} items")
return result
```
**Better Design:**
```python
@dataclass
class ProcessingResult:
"""Result of data processing operation."""
items_processed: int
success: bool
errors: List[str] = field(default_factory=list)
class DataProcessor:
"""Handles data validation, storage, and notification."""
def __init__(self, db_conn: Database, email_client: EmailClient):
self.db = db_conn
self.email = email_client
def process(self, data: List[dict]) -> ProcessingResult:
"""Process and store data with notifications."""
validated = self._validate_data(data)
result = self.db.save(validated)
self._notify_completion(result)
return result
```
**Design Improvement Areas:**
If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would:
- Reduce code duplication through shared utilities
- Make unit testing easier
- Improve separation of concerns (single responsibility)
- Make unit testing easier through dependency injection
- Add clarity without adding complexity
- Prefer dataclasses for structured data
## Development Tools & Commands
### Package Management
```bash
# Add package
uv add package-name
# Sync project dependencies
uv sync
uv lock
```
### Testing
```bash
# Run unit tests (no network)
make test
# Don't run integration tests, as API keys must be set
# Run specific test file
uv run --group test pytest tests/unit_tests/test_specific.py
```
### Code Quality
```bash
# Lint code
make lint
@@ -64,118 +261,66 @@ make format
uv run --group lint mypy .
```
#### Key config files
### Dependency Management Patterns
- pyproject.toml: Main workspace configuration with dependency groups
- uv.lock: Locked dependencies for reproducible builds
- Makefile: Development tasks
**Local Development Dependencies:**
#### Commit standards
Suggest PR titles that follow Conventional Commits format. Refer to .github/workflows/pr_lint for allowed types and scopes.
#### Pull request guidelines
- Always add a disclaimer to the PR description mentioning how AI agents are involved with the contribution.
- Describe the "why" of the changes, why the proposed solution is the right one. Limit prose.
- Highlight areas of the proposed changes that require careful review.
## Core development principles
### Maintain stable public interfaces
CRITICAL: Always attempt to preserve function signatures, argument positions, and names for exported/public methods. Do not make breaking changes.
**Before making ANY changes to public APIs:**
- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
- Mark experimental features clearly with docstring warnings (using MkDocs Material admonitions, like `!!! warning`)
Ask: "Would this change break someone's code if they used it last week?"
### Code quality standards
All Python code MUST include type hints and return types.
```python title="Example"
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
"""Single line description of the function.
Any additional context about the function can go here.
Args:
users: List of user identifiers to filter.
known_users: Set of known/valid user identifiers.
Returns:
List of users that are not in the known_users set.
"""
```toml
[tool.uv.sources]
langchain-core = { path = "../core", editable = true }
langchain-tests = { path = "../standard-tests", editable = true }
```
- Use descriptive, self-explanatory variable names.
- Follow existing patterns in the codebase you're modifying
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
**For tools, use the `@tool` decorator from `langchain_core.tools`:**
### Testing requirements
```python
from langchain_core.tools import tool
Every new feature or bugfix MUST be covered by unit tests.
- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests: `tests/integration_tests/` (network calls permitted)
- We use `pytest` as the testing framework; if in doubt, check other existing tests for examples.
- The testing file structure should mirror the source code structure.
**Checklist:**
- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)
- [ ] Does the test suite fail if your new logic is broken?
### Security and risk assessment
- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
- Remove unreachable/commented code before committing
- Race conditions or resource leaks (file handles, sockets, threads).
- Ensure proper resource cleanup (file handles, connections)
### Documentation standards
Use Google-style docstrings with Args section for all public functions.
```python title="Example"
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
"""Send an email to a recipient with specified priority.
Any additional context about the function can go here.
@tool
def search_database(query: str) -> str:
"""Search the database for relevant information.
Args:
to: The email address of the recipient.
msg: The message body to send.
priority: Email priority level.
Returns:
`True` if email was sent successfully, `False` otherwise.
Raises:
InvalidEmailError: If the email address format is invalid.
SMTPConnectionError: If unable to connect to email server.
query: The search query string.
"""
# Implementation here
return results
```
- Types go in function signatures, NOT in docstrings
- If a default is present, DO NOT repeat it in the docstring unless there is post-processing or it is set conditionally.
- Focus on "why" rather than "what" in descriptions
- Document all parameters, return values, and exceptions
- Keep descriptions concise but clear
- Ensure American English spelling (e.g., "behavior", not "behaviour")
## Commit Standards
## Additional resources
**Use Conventional Commits format for PR titles:**
- **Documentation:** https://docs.langchain.com/oss/python/langchain/overview and source at https://github.com/langchain-ai/docs or `../docs/`. Prefer the local install and use file search tools for best results. If needed, use the docs MCP server as defined in `.mcp.json` for programmatic access.
- **Contributing Guide:** [`.github/CONTRIBUTING.md`](https://docs.langchain.com/oss/python/contributing/overview)
- `feat(core): add multi-tenant support`
- `fix(cli): resolve flag parsing error`
- `docs: update API usage examples`
- `docs(openai): update API usage examples`
## Framework-Specific Guidelines
- Follow the existing patterns in `langchain-core` for base abstractions
- Use `langchain_core.callbacks` for execution tracking
- Implement proper streaming support where applicable
- Avoid deprecated components like legacy `LLMChain`
### Partner Integrations
- Follow the established patterns in existing partner libraries
- Implement standard interfaces (`BaseChatModel`, `BaseEmbeddings`, etc.)
- Include comprehensive integration tests
- Document API key requirements and authentication
---
## Quick Reference Checklist
Before submitting code changes:
- [ ] **Breaking Changes**: Verified no public API changes
- [ ] **Type Hints**: All functions have complete type annotations
- [ ] **Tests**: New functionality is fully tested
- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.)
- [ ] **Documentation**: Google-style docstrings for public functions
- [ ] **Code Quality**: `make lint` and `make format` pass
- [ ] **Architecture**: Suggested improvements where applicable
- [ ] **Commit Message**: Follows Conventional Commits format

419
CLAUDE.md
View File

@@ -1,58 +1,255 @@
# Global development guidelines for the LangChain monorepo
# Global Development Guidelines for LangChain Projects
This document provides context to understand the LangChain Python project and assist with development.
## Core Development Principles
## Project architecture and context
### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL
### Monorepo structure
**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**
This is a Python monorepo with multiple independently versioned packages that use `uv`.
**Bad - Breaking Change:**
```txt
langchain/
├── libs/
│ ├── core/ # `langchain-core` primitives and base abstractions
│ ├── langchain/ # `langchain-classic` (legacy, no new features)
│ ├── langchain_v1/ # Actively maintained `langchain` package
│ ├── partners/ # Third-party integrations
│ │ ├── openai/ # OpenAI models and embeddings
│ │ ├── anthropic/ # Anthropic (Claude) integration
│ │ ├── ollama/ # Local model support
│ │ └── ... (other integrations maintained by the LangChain team)
│ ├── text-splitters/ # Document chunking utilities
│ ├── standard-tests/ # Shared test suite for integrations
│ ├── model-profiles/ # Model configuration profiles
│ └── cli/ # Command-line interface tools
├── .github/ # CI/CD workflows and templates
├── .vscode/ # VSCode IDE standard settings and recommended extensions
└── README.md # Information about LangChain
```python
def get_user(id, verbose=False): # Changed from `user_id`
pass
```
- **Core layer** (`langchain-core`): Base abstractions, interfaces, and protocols. Users should not need to know about this layer directly.
- **Implementation layer** (`langchain`): Concrete implementations and high-level public utilities
- **Integration layer** (`partners/`): Third-party service integrations. Note that this monorepo is not exhaustive of all LangChain integrations; some are maintained in separate repos, such as `langchain-ai/langchain-google` and `langchain-ai/langchain-aws`. Usually these repos are cloned at the same level as this monorepo, so if needed, you can refer to their code directly by navigating to `../langchain-google/` from this monorepo.
- **Testing layer** (`standard-tests/`): Standardized integration tests for partner integrations
**Good - Stable Interface:**
### Development tools & commands**
```python
def get_user(user_id: str, verbose: bool = False) -> User:
"""Retrieve user by ID with optional verbose output."""
pass
```
- `uv` Fast Python package installer and resolver (replaces pip/poetry)
- `make` Task runner for common development commands. Feel free to look at the `Makefile` for available commands and usage patterns.
- `ruff` Fast Python linter and formatter
- `mypy` Static type checking
- `pytest` Testing framework
**Before making ANY changes to public APIs:**
This monorepo uses `uv` for dependency management. Local development uses editable installs: `[tool.uv.sources]`
- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
- Mark experimental features clearly with docstring warnings (using MkDocs Material admonitions, like `!!! warning`)
Each package in `libs/` has its own `pyproject.toml` and `uv.lock`.
🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?"
### 2. Code Quality Standards
**All Python code MUST include type hints and return types.**
**Bad:**
```python
def p(u, d):
return [x for x in u if x not in d]
```
**Good:**
```python
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
"""Filter out users that are not in the known users set.
Args:
users: List of user identifiers to filter.
known_users: Set of known/valid user identifiers.
Returns:
List of users that are not in the known_users set.
"""
return [user for user in users if user not in known_users]
```
**Style Requirements:**
- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
- Avoid unnecessary abstraction or premature optimization
- Follow existing patterns in the codebase you're modifying
### 3. Testing Requirements
**Every new feature or bugfix MUST be covered by unit tests.**
**Test Organization:**
- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests: `tests/integration_tests/` (network calls permitted)
- Use `pytest` as the testing framework
**Test Quality Checklist:**
- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)
Checklist questions:
- [ ] Does the test suite fail if your new logic is broken?
- [ ] Are all expected behaviors exercised (happy path, invalid input, etc)?
- [ ] Do tests use fixtures or mocks where needed?
```python
def test_filter_unknown_users():
"""Test filtering unknown users from a list."""
users = ["alice", "bob", "charlie"]
known_users = {"alice", "bob"}
result = filter_unknown_users(users, known_users)
assert result == ["charlie"]
assert len(result) == 1
```
### 4. Security and Risk Assessment
**Security Checklist:**
- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
- Remove unreachable/commented code before committing
- Race conditions or resource leaks (file handles, sockets, threads).
- Ensure proper resource cleanup (file handles, connections)
**Bad:**
```python
def load_config(path):
with open(path) as f:
return eval(f.read()) # ⚠️ Never eval config
```
**Good:**
```python
import json
def load_config(path: str) -> dict:
with open(path) as f:
return json.load(f)
```
### 5. Documentation Standards
**Use Google-style docstrings with Args section for all public functions.**
**Insufficient Documentation:**
```python
def send_email(to, msg):
"""Send an email to a recipient."""
```
**Complete Documentation:**
```python
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
"""
Send an email to a recipient with specified priority.
Args:
to: The email address of the recipient.
msg: The message body to send.
priority: Email priority level (`'low'`, `'normal'`, `'high'`).
Returns:
`True` if email was sent successfully, `False` otherwise.
Raises:
`InvalidEmailError`: If the email address format is invalid.
`SMTPConnectionError`: If unable to connect to email server.
"""
```
**Documentation Guidelines:**
- Types go in function signatures, NOT in docstrings
- If a default is present, DO NOT repeat it in the docstring unless there is post-processing or it is set conditionally.
- Focus on "why" rather than "what" in descriptions
- Document all parameters, return values, and exceptions
- Keep descriptions concise but clear
- Ensure American English spelling (e.g., "behavior", not "behaviour")
📌 *Tip:* Keep descriptions concise but clear. Only document return values if non-obvious.
### 6. Architectural Improvements
**When you encounter code that could be improved, suggest better designs:**
**Poor Design:**
```python
def process_data(data, db_conn, email_client, logger):
# Function doing too many things
validated = validate_data(data)
result = db_conn.save(validated)
email_client.send_notification(result)
logger.log(f"Processed {len(data)} items")
return result
```
**Better Design:**
```python
@dataclass
class ProcessingResult:
"""Result of data processing operation."""
items_processed: int
success: bool
errors: List[str] = field(default_factory=list)
class DataProcessor:
"""Handles data validation, storage, and notification."""
def __init__(self, db_conn: Database, email_client: EmailClient):
self.db = db_conn
self.email = email_client
def process(self, data: List[dict]) -> ProcessingResult:
"""Process and store data with notifications."""
validated = self._validate_data(data)
result = self.db.save(validated)
self._notify_completion(result)
return result
```
**Design Improvement Areas:**
If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would:
- Reduce code duplication through shared utilities
- Make unit testing easier
- Improve separation of concerns (single responsibility)
- Make unit testing easier through dependency injection
- Add clarity without adding complexity
- Prefer dataclasses for structured data
## Development Tools & Commands
### Package Management
```bash
# Add package
uv add package-name
# Sync project dependencies
uv sync
uv lock
```
### Testing
```bash
# Run unit tests (no network)
make test
# Don't run integration tests, as API keys must be set
# Run specific test file
uv run --group test pytest tests/unit_tests/test_specific.py
```
### Code Quality
```bash
# Lint code
make lint
@@ -64,118 +261,66 @@ make format
uv run --group lint mypy .
```
#### Key config files
### Dependency Management Patterns
- pyproject.toml: Main workspace configuration with dependency groups
- uv.lock: Locked dependencies for reproducible builds
- Makefile: Development tasks
**Local Development Dependencies:**
#### Commit standards
Suggest PR titles that follow Conventional Commits format. Refer to .github/workflows/pr_lint for allowed types and scopes.
#### Pull request guidelines
- Always add a disclaimer to the PR description mentioning how AI agents are involved with the contribution.
- Describe the "why" of the changes, why the proposed solution is the right one. Limit prose.
- Highlight areas of the proposed changes that require careful review.
## Core development principles
### Maintain stable public interfaces
CRITICAL: Always attempt to preserve function signatures, argument positions, and names for exported/public methods. Do not make breaking changes.
**Before making ANY changes to public APIs:**
- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
- Mark experimental features clearly with docstring warnings (using MkDocs Material admonitions, like `!!! warning`)
Ask: "Would this change break someone's code if they used it last week?"
### Code quality standards
All Python code MUST include type hints and return types.
```python title="Example"
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
"""Single line description of the function.
Any additional context about the function can go here.
Args:
users: List of user identifiers to filter.
known_users: Set of known/valid user identifiers.
Returns:
List of users that are not in the known_users set.
"""
```toml
[tool.uv.sources]
langchain-core = { path = "../core", editable = true }
langchain-tests = { path = "../standard-tests", editable = true }
```
- Use descriptive, self-explanatory variable names.
- Follow existing patterns in the codebase you're modifying
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
**For tools, use the `@tool` decorator from `langchain_core.tools`:**
### Testing requirements
```python
from langchain_core.tools import tool
Every new feature or bugfix MUST be covered by unit tests.
- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests: `tests/integration_tests/` (network calls permitted)
- We use `pytest` as the testing framework; if in doubt, check other existing tests for examples.
- The testing file structure should mirror the source code structure.
**Checklist:**
- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)
- [ ] Does the test suite fail if your new logic is broken?
### Security and risk assessment
- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
- Remove unreachable/commented code before committing
- Race conditions or resource leaks (file handles, sockets, threads).
- Ensure proper resource cleanup (file handles, connections)
### Documentation standards
Use Google-style docstrings with Args section for all public functions.
```python title="Example"
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
"""Send an email to a recipient with specified priority.
Any additional context about the function can go here.
@tool
def search_database(query: str) -> str:
"""Search the database for relevant information.
Args:
to: The email address of the recipient.
msg: The message body to send.
priority: Email priority level.
Returns:
`True` if email was sent successfully, `False` otherwise.
Raises:
InvalidEmailError: If the email address format is invalid.
SMTPConnectionError: If unable to connect to email server.
query: The search query string.
"""
# Implementation here
return results
```
- Types go in function signatures, NOT in docstrings
- If a default is present, DO NOT repeat it in the docstring unless there is post-processing or it is set conditionally.
- Focus on "why" rather than "what" in descriptions
- Document all parameters, return values, and exceptions
- Keep descriptions concise but clear
- Ensure American English spelling (e.g., "behavior", not "behaviour")
## Commit Standards
## Additional resources
**Use Conventional Commits format for PR titles:**
- **Documentation:** https://docs.langchain.com/oss/python/langchain/overview and source at https://github.com/langchain-ai/docs or `../docs/`. Prefer the local install and use file search tools for best results. If needed, use the docs MCP server as defined in `.mcp.json` for programmatic access.
- **Contributing Guide:** [`.github/CONTRIBUTING.md`](https://docs.langchain.com/oss/python/contributing/overview)
- `feat(core): add multi-tenant support`
- `fix(cli): resolve flag parsing error`
- `docs: update API usage examples`
- `docs(openai): update API usage examples`
## Framework-Specific Guidelines
- Follow the existing patterns in `langchain-core` for base abstractions
- Use `langchain_core.callbacks` for execution tracking
- Implement proper streaming support where applicable
- Avoid deprecated components like legacy `LLMChain`
### Partner Integrations
- Follow the established patterns in existing partner libraries
- Implement standard interfaces (`BaseChatModel`, `BaseEmbeddings`, etc.)
- Include comprehensive integration tests
- Document API key requirements and authentication
---
## Quick Reference Checklist
Before submitting code changes:
- [ ] **Breaking Changes**: Verified no public API changes
- [ ] **Type Hints**: All functions have complete type annotations
- [ ] **Tests**: New functionality is fully tested
- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.)
- [ ] **Documentation**: Google-style docstrings for public functions
- [ ] **Code Quality**: `make lint` and `make format` pass
- [ ] **Architecture**: Suggested improvements where applicable
- [ ] **Commit Message**: Follows Conventional Commits format

9
MIGRATE.md Normal file
View File

@@ -0,0 +1,9 @@
# Migrating
Please see the following guides for migrating LangChain code:
* Migrate to [LangChain v1.0](https://docs.langchain.com/oss/python/migrate/langchain-v1)
* Migrate to [LangChain v0.3](https://python.langchain.com/docs/versions/v0_3/)
* Migrate to [LangChain v0.2](https://python.langchain.com/docs/versions/v0_2/)
* Migrating from [LangChain 0.0.x Chains](https://python.langchain.com/docs/versions/migrating_chains/)
* Upgrade to [LangGraph Memory](https://python.langchain.com/docs/versions/migrating_memory/)

80
SECURITY.md Normal file
View File

@@ -0,0 +1,80 @@
# Security Policy
LangChain has a large ecosystem of integrations with various external resources like local and remote file systems, APIs and databases. These integrations allow developers to create versatile applications that combine the power of LLMs with the ability to access, interact with and manipulate external resources.
## Best practices
When building such applications, developers should remember to follow good security practices:
* [**Limit Permissions**](https://en.wikipedia.org/wiki/Principle_of_least_privilege): Scope permissions specifically to the application's need. Granting broad or excessive permissions can introduce significant security vulnerabilities. To avoid such vulnerabilities, consider using read-only credentials, disallowing access to sensitive resources, using sandboxing techniques (such as running inside a container), specifying proxy configurations to control external requests, etc., as appropriate for your application.
* **Anticipate Potential Misuse**: Just as humans can err, so can Large Language Models (LLMs). Always assume that any system access or credentials may be used in any way allowed by the permissions they are assigned. For example, if a pair of database credentials allows deleting data, it's safest to assume that any LLM able to use those credentials may in fact delete data.
* [**Defense in Depth**](https://en.wikipedia.org/wiki/Defense_in_depth_(computing)): No security technique is perfect. Fine-tuning and good chain design can reduce, but not eliminate, the odds that a Large Language Model (LLM) may make a mistake. It's best to combine multiple layered security approaches rather than relying on any single layer of defense to ensure security. For example: use both read-only permissions and sandboxing to ensure that LLMs are only able to access data that is explicitly meant for them to use.
Risks of not doing so include, but are not limited to:
* Data corruption or loss.
* Unauthorized access to confidential information.
* Compromised performance or availability of critical resources.
Example scenarios with mitigation strategies:
* A user may ask an agent with access to the file system to delete files that should not be deleted or read the content of files that contain sensitive information. To mitigate, limit the agent to only use a specific directory and only allow it to read or write files that are safe to read or write. Consider further sandboxing the agent by running it in a container.
* A user may ask an agent with write access to an external API to write malicious data to the API, or delete data from that API. To mitigate, give the agent read-only API keys, or limit it to only use endpoints that are already resistant to such misuse.
* A user may ask an agent with access to a database to drop a table or mutate the schema. To mitigate, scope the credentials to only the tables that the agent needs to access and consider issuing READ-ONLY credentials.
If you're building applications that access external resources like file systems, APIs or databases, consider speaking with your company's security team to determine how to best design and secure your applications.
## Reporting OSS Vulnerabilities
LangChain is partnered with [huntr by Protect AI](https://huntr.com/) to provide
a bounty program for our open source projects.
Please report security vulnerabilities associated with the LangChain
open source projects at [huntr](https://huntr.com/bounties/disclose/?target=https%3A%2F%2Fgithub.com%2Flangchain-ai%2Flangchain&validSearch=true).
Before reporting a vulnerability, please review:
1) In-Scope Targets and Out-of-Scope Targets below.
2) The [langchain-ai/langchain](https://docs.langchain.com/oss/python/contributing/code#repository-structure) monorepo structure.
3) The [Best Practices](#best-practices) above to understand what we consider to be a security vulnerability vs. developer responsibility.
### In-Scope Targets
The following packages and repositories are eligible for bug bounties:
* langchain-core
* langchain (see exceptions)
* langchain-community (see exceptions)
* langgraph
* langserve
### Out of Scope Targets
All out of scope targets defined by huntr as well as:
* **langchain-experimental**: This repository is for experimental code and is not
eligible for bug bounties (see [package warning](https://pypi.org/project/langchain-experimental/)), bug reports to it will be marked as interesting or waste of
time and published with no bounty attached.
* **tools**: Tools in either `langchain` or `langchain-community` are not eligible for bug
bounties. This includes the following directories
* `libs/langchain/langchain/tools`
* `libs/community/langchain_community/tools`
* Please review the [Best Practices](#best-practices)
for more details, but generally tools interact with the real world. Developers are
expected to understand the security implications of their code and are responsible
for the security of their tools.
* Code documented with security notices. This will be decided on a case-by-case basis, but likely will not be eligible for a bounty as the code is already
documented with guidelines for developers that should be followed for making their
application secure.
* Any LangSmith related repositories or APIs (see [Reporting LangSmith Vulnerabilities](#reporting-langsmith-vulnerabilities)).
## Reporting LangSmith Vulnerabilities
Please report security vulnerabilities associated with LangSmith by email to `security@langchain.dev`.
* LangSmith site: [https://smith.langchain.com](https://smith.langchain.com)
* SDK client: [https://github.com/langchain-ai/langsmith-sdk](https://github.com/langchain-ai/langsmith-sdk)
### Other Security Concerns
For any other security concerns, please contact us at `security@langchain.dev`.

View File

@@ -1,20 +0,0 @@
# Makefile for libs/ directory
# Contains targets that operate across multiple packages
LANGCHAIN_DIRS = core text-splitters langchain langchain_v1 model-profiles
.PHONY: lock check-lock
# Regenerate lockfiles for all core packages
lock:
@for dir in $(LANGCHAIN_DIRS); do \
echo "=== Locking $$dir ==="; \
(cd $$dir && uv lock); \
done
# Verify all lockfiles are up-to-date
check-lock:
@for dir in $(LANGCHAIN_DIRS); do \
echo "=== Checking $$dir ==="; \
(cd $$dir && uv lock --check) || exit 1; \
done

View File

@@ -6,8 +6,9 @@ import hashlib
import logging
import re
import shutil
from collections.abc import Sequence
from pathlib import Path
from typing import TYPE_CHECKING, Any, TypedDict
from typing import Any, TypedDict
from git import Repo
@@ -17,9 +18,6 @@ from langchain_cli.constants import (
DEFAULT_GIT_SUBDIRECTORY,
)
if TYPE_CHECKING:
from collections.abc import Sequence
logger = logging.getLogger(__name__)

View File

@@ -1,11 +1,9 @@
from __future__ import annotations
from dataclasses import dataclass
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from .file import File
from .folder import Folder
from .file import File
from .folder import Folder
@dataclass

View File

@@ -1,12 +1,9 @@
from __future__ import annotations
from typing import TYPE_CHECKING
from pathlib import Path
from .file import File
if TYPE_CHECKING:
from pathlib import Path
class Folder:
def __init__(self, name: str, *files: Folder | File) -> None:

View File

@@ -34,7 +34,7 @@ The LangChain ecosystem is built on top of `langchain-core`. Some of the benefit
## 📖 Documentation
For full documentation, see the [API reference](https://reference.langchain.com/python/langchain_core/). For conceptual guides, tutorials, and examples on using LangChain, see the [LangChain Docs](https://docs.langchain.com/oss/python/langchain/overview).
For full documentation, see the [API reference](https://reference.langchain.com/python/langchain_core/).
## 📕 Releases & Versioning

View File

@@ -28,27 +28,6 @@ from pydantic.v1.fields import FieldInfo as FieldInfoV1
from langchain_core._api.internal import is_caller_internal
def _build_deprecation_message(
*,
alternative: str = "",
alternative_import: str = "",
) -> str:
"""Build a simple deprecation message for `__deprecated__` attribute.
Args:
alternative: An alternative API name.
alternative_import: A fully qualified import path for the alternative.
Returns:
A deprecation message string for IDE/type checker display.
"""
if alternative_import:
return f"Use {alternative_import} instead."
if alternative:
return f"Use {alternative} instead."
return "Deprecated."
class LangChainDeprecationWarning(DeprecationWarning):
"""A class for issuing deprecation warnings for LangChain users."""
@@ -102,57 +81,60 @@ def deprecated(
) -> Callable[[T], T]:
"""Decorator to mark a function, a class, or a property as deprecated.
When deprecating a classmethod, a staticmethod, or a property, the `@deprecated`
decorator should go *under* `@classmethod` and `@staticmethod` (i.e., `deprecated`
should directly decorate the underlying callable), but *over* `@property`.
When deprecating a classmethod, a staticmethod, or a property, the
`@deprecated` decorator should go *under* `@classmethod` and
`@staticmethod` (i.e., `deprecated` should directly decorate the
underlying callable), but *over* `@property`.
When deprecating a class `C` intended to be used as a base class in a multiple
inheritance hierarchy, `C` *must* define an `__init__` method (if `C` instead
inherited its `__init__` from its own base class, then `@deprecated` would mess up
`__init__` inheritance when installing its own (deprecation-emitting) `C.__init__`).
When deprecating a class `C` intended to be used as a base class in a
multiple inheritance hierarchy, `C` *must* define an `__init__` method
(if `C` instead inherited its `__init__` from its own base class, then
`@deprecated` would mess up `__init__` inheritance when installing its
own (deprecation-emitting) `C.__init__`).
Parameters are the same as for `warn_deprecated`, except that *obj_type* defaults to
'class' if decorating a class, 'attribute' if decorating a property, and 'function'
otherwise.
Parameters are the same as for `warn_deprecated`, except that *obj_type*
defaults to 'class' if decorating a class, 'attribute' if decorating a
property, and 'function' otherwise.
Args:
since: The release at which this API became deprecated.
message: Override the default deprecation message.
The `%(since)s`, `%(name)s`, `%(alternative)s`, `%(obj_type)s`,
`%(addendum)s`, and `%(removal)s` format specifiers will be replaced by the
since:
The release at which this API became deprecated.
message:
Override the default deprecation message. The %(since)s,
%(name)s, %(alternative)s, %(obj_type)s, %(addendum)s,
and %(removal)s format specifiers will be replaced by the
values of the respective arguments passed to this function.
name: The name of the deprecated object.
alternative: An alternative API that the user may use in place of the deprecated
API.
The deprecation warning will tell the user about this alternative if
provided.
alternative_import: An alternative import that the user may use instead.
pending: If `True`, uses a `PendingDeprecationWarning` instead of a
`DeprecationWarning`.
Cannot be used together with removal.
obj_type: The object type being deprecated.
addendum: Additional text appended directly to the final message.
removal: The expected removal version.
With the default (an empty string), a removal version is automatically
computed from since. Set to other Falsy values to not schedule a removal
date.
Cannot be used together with pending.
package: The package of the deprecated object.
name:
The name of the deprecated object.
alternative:
An alternative API that the user may use in place of the
deprecated API. The deprecation warning will tell the user
about this alternative if provided.
alternative_import:
An alternative import that the user may use instead.
pending:
If `True`, uses a `PendingDeprecationWarning` instead of a
DeprecationWarning. Cannot be used together with removal.
obj_type:
The object type being deprecated.
addendum:
Additional text appended directly to the final message.
removal:
The expected removal version. With the default (an empty
string), a removal version is automatically computed from
since. Set to other Falsy values to not schedule a removal
date. Cannot be used together with pending.
package:
The package of the deprecated object.
Returns:
A decorator to mark a function or class as deprecated.
Example:
```python
@deprecated("1.4.0")
def the_function_to_deprecate():
pass
```
```python
@deprecated("1.4.0")
def the_function_to_deprecate():
pass
```
"""
_validate_deprecation_params(
removal, alternative, alternative_import, pending=pending
@@ -241,11 +223,6 @@ def deprecated(
obj.__init__ = functools.wraps(obj.__init__)( # type: ignore[misc]
warn_if_direct_instance
)
# Set __deprecated__ for PEP 702 (IDE/type checker support)
obj.__deprecated__ = _build_deprecation_message( # type: ignore[attr-defined]
alternative=alternative,
alternative_import=alternative_import,
)
return obj
elif isinstance(obj, FieldInfoV1):
@@ -338,15 +315,12 @@ def deprecated(
def finalize(wrapper: Callable[..., Any], new_doc: str) -> T: # noqa: ARG001
"""Finalize the property."""
prop = _DeprecatedProperty(
fget=obj.fget, fset=obj.fset, fdel=obj.fdel, doc=new_doc
return cast(
"T",
_DeprecatedProperty(
fget=obj.fget, fset=obj.fset, fdel=obj.fdel, doc=new_doc
),
)
# Set __deprecated__ for PEP 702 (IDE/type checker support)
prop.__deprecated__ = _build_deprecation_message( # type: ignore[attr-defined]
alternative=alternative,
alternative_import=alternative_import,
)
return cast("T", prop)
else:
_name = _name or cast("type | Callable", obj).__qualname__
@@ -369,11 +343,6 @@ def deprecated(
"""
wrapper = functools.wraps(wrapped)(wrapper)
wrapper.__doc__ = new_doc
# Set __deprecated__ for PEP 702 (IDE/type checker support)
wrapper.__deprecated__ = _build_deprecation_message( # type: ignore[attr-defined]
alternative=alternative,
alternative_import=alternative_import,
)
return cast("T", wrapper)
old_doc = inspect.cleandoc(old_doc or "").strip("\n")
@@ -429,7 +398,7 @@ def deprecated(
@contextlib.contextmanager
def suppress_langchain_deprecation_warning() -> Generator[None, None, None]:
"""Context manager to suppress `LangChainDeprecationWarning`."""
"""Context manager to suppress LangChainDeprecationWarning."""
with warnings.catch_warnings():
warnings.simplefilter("ignore", LangChainDeprecationWarning)
warnings.simplefilter("ignore", LangChainPendingDeprecationWarning)
@@ -452,33 +421,35 @@ def warn_deprecated(
"""Display a standardized deprecation.
Args:
since: The release at which this API became deprecated.
message: Override the default deprecation message.
The `%(since)s`, `%(name)s`, `%(alternative)s`, `%(obj_type)s`,
`%(addendum)s`, and `%(removal)s` format specifiers will be replaced by the
since:
The release at which this API became deprecated.
message:
Override the default deprecation message. The %(since)s,
%(name)s, %(alternative)s, %(obj_type)s, %(addendum)s,
and %(removal)s format specifiers will be replaced by the
values of the respective arguments passed to this function.
name: The name of the deprecated object.
alternative: An alternative API that the user may use in place of the
deprecated API.
The deprecation warning will tell the user about this alternative if
provided.
alternative_import: An alternative import that the user may use instead.
pending: If `True`, uses a `PendingDeprecationWarning` instead of a
`DeprecationWarning`.
Cannot be used together with removal.
obj_type: The object type being deprecated.
addendum: Additional text appended directly to the final message.
removal: The expected removal version.
With the default (an empty string), a removal version is automatically
computed from since. Set to other Falsy values to not schedule a removal
date.
Cannot be used together with pending.
package: The package of the deprecated object.
name:
The name of the deprecated object.
alternative:
An alternative API that the user may use in place of the
deprecated API. The deprecation warning will tell the user
about this alternative if provided.
alternative_import:
An alternative import that the user may use instead.
pending:
If `True`, uses a `PendingDeprecationWarning` instead of a
DeprecationWarning. Cannot be used together with removal.
obj_type:
The object type being deprecated.
addendum:
Additional text appended directly to the final message.
removal:
The expected removal version. With the default (an empty
string), a removal version is automatically computed from
since. Set to other Falsy values to not schedule a removal
date. Cannot be used together with pending.
package:
The package of the deprecated object.
"""
if not pending:
if not removal:
@@ -563,8 +534,8 @@ def rename_parameter(
"""Decorator indicating that parameter *old* of *func* is renamed to *new*.
The actual implementation of *func* should use *new*, not *old*. If *old* is passed
to *func*, a `DeprecationWarning` is emitted, and its value is used, even if *new*
is also passed by keyword.
to *func*, a DeprecationWarning is emitted, and its value is used, even if *new* is
also passed by keyword.
Args:
since: The version in which the parameter was renamed.

View File

@@ -3,7 +3,6 @@
Distinct from provider-based [prompt caching](https://docs.langchain.com/oss/python/langchain/models#prompt-caching).
!!! warning "Beta feature"
This is a beta feature. Please be wary of deploying experimental code to production
unless you've taken appropriate precautions.

View File

@@ -5,12 +5,13 @@ from __future__ import annotations
import logging
from typing import TYPE_CHECKING, Any
from typing_extensions import Self
if TYPE_CHECKING:
from collections.abc import Sequence
from uuid import UUID
from tenacity import RetryCallState
from typing_extensions import Self
from langchain_core.agents import AgentAction, AgentFinish
from langchain_core.documents import Document

View File

@@ -6,6 +6,7 @@ import asyncio
import atexit
import functools
import logging
import uuid
from abc import ABC, abstractmethod
from collections.abc import Callable
from concurrent.futures import ThreadPoolExecutor
@@ -38,9 +39,9 @@ from langchain_core.tracers.context import (
tracing_v2_callback_var,
)
from langchain_core.tracers.langchain import LangChainTracer
from langchain_core.tracers.schemas import Run
from langchain_core.tracers.stdout import ConsoleCallbackHandler
from langchain_core.utils.env import env_var_is_set
from langchain_core.utils.uuid import uuid7
if TYPE_CHECKING:
from collections.abc import AsyncGenerator, Coroutine, Generator, Sequence
@@ -51,7 +52,6 @@ if TYPE_CHECKING:
from langchain_core.documents import Document
from langchain_core.outputs import ChatGenerationChunk, GenerationChunk, LLMResult
from langchain_core.runnables.config import RunnableConfig
from langchain_core.tracers.schemas import Run
logger = logging.getLogger(__name__)
@@ -504,7 +504,7 @@ class BaseRunManager(RunManagerMixin):
"""
return cls(
run_id=uuid7(),
run_id=uuid.uuid4(),
handlers=[],
inheritable_handlers=[],
tags=[],
@@ -1330,7 +1330,7 @@ class CallbackManager(BaseCallbackManager):
managers = []
for i, prompt in enumerate(prompts):
# Can't have duplicate runs with the same run ID (if provided)
run_id_ = run_id if i == 0 and run_id is not None else uuid7()
run_id_ = run_id if i == 0 and run_id is not None else uuid.uuid4()
handle_event(
self.handlers,
"on_llm_start",
@@ -1384,7 +1384,7 @@ class CallbackManager(BaseCallbackManager):
run_id_ = run_id
run_id = None
else:
run_id_ = uuid7()
run_id_ = uuid.uuid4()
handle_event(
self.handlers,
"on_chat_model_start",
@@ -1433,7 +1433,7 @@ class CallbackManager(BaseCallbackManager):
"""
if run_id is None:
run_id = uuid7()
run_id = uuid.uuid4()
handle_event(
self.handlers,
"on_chain_start",
@@ -1488,7 +1488,7 @@ class CallbackManager(BaseCallbackManager):
"""
if run_id is None:
run_id = uuid7()
run_id = uuid.uuid4()
handle_event(
self.handlers,
@@ -1537,7 +1537,7 @@ class CallbackManager(BaseCallbackManager):
The callback manager for the retriever run.
"""
if run_id is None:
run_id = uuid7()
run_id = uuid.uuid4()
handle_event(
self.handlers,
@@ -1594,7 +1594,7 @@ class CallbackManager(BaseCallbackManager):
)
raise ValueError(msg)
if run_id is None:
run_id = uuid7()
run_id = uuid.uuid4()
handle_event(
self.handlers,
@@ -1816,7 +1816,7 @@ class AsyncCallbackManager(BaseCallbackManager):
run_id_ = run_id
run_id = None
else:
run_id_ = uuid7()
run_id_ = uuid.uuid4()
if inline_handlers:
inline_tasks.append(
@@ -1900,7 +1900,7 @@ class AsyncCallbackManager(BaseCallbackManager):
run_id_ = run_id
run_id = None
else:
run_id_ = uuid7()
run_id_ = uuid.uuid4()
for handler in self.handlers:
task = ahandle_event(
@@ -1962,7 +1962,7 @@ class AsyncCallbackManager(BaseCallbackManager):
The async callback manager for the chain run.
"""
if run_id is None:
run_id = uuid7()
run_id = uuid.uuid4()
await ahandle_event(
self.handlers,
@@ -2010,7 +2010,7 @@ class AsyncCallbackManager(BaseCallbackManager):
The async callback manager for the tool run.
"""
if run_id is None:
run_id = uuid7()
run_id = uuid.uuid4()
await ahandle_event(
self.handlers,
@@ -2060,7 +2060,7 @@ class AsyncCallbackManager(BaseCallbackManager):
if not self.handlers:
return
if run_id is None:
run_id = uuid7()
run_id = uuid.uuid4()
if kwargs:
msg = (
@@ -2102,7 +2102,7 @@ class AsyncCallbackManager(BaseCallbackManager):
The async callback manager for the retriever run.
"""
if run_id is None:
run_id = uuid7()
run_id = uuid.uuid4()
await ahandle_event(
self.handlers,

View File

@@ -95,7 +95,7 @@ def get_usage_metadata_callback(
"""Get usage metadata callback.
Get context manager for tracking usage metadata across chat model calls using
[`AIMessage.usage_metadata`][langchain.messages.AIMessage.usage_metadata].
`AIMessage.usage_metadata`.
Args:
name: The name of the context variable.

View File

@@ -11,7 +11,7 @@ from langchain_core.prompts.prompt import PromptTemplate
def _get_length_based(text: str) -> int:
return len(re.split(r"\n| ", text))
return len(re.split("\n| ", text))
class LengthBasedExampleSelector(BaseExampleSelector, BaseModel):

View File

@@ -6,9 +6,16 @@ import hashlib
import json
import uuid
import warnings
from collections.abc import (
AsyncIterable,
AsyncIterator,
Callable,
Iterable,
Iterator,
Sequence,
)
from itertools import islice
from typing import (
TYPE_CHECKING,
Any,
Literal,
TypedDict,
@@ -22,16 +29,6 @@ from langchain_core.exceptions import LangChainException
from langchain_core.indexing.base import DocumentIndex, RecordManager
from langchain_core.vectorstores import VectorStore
if TYPE_CHECKING:
from collections.abc import (
AsyncIterable,
AsyncIterator,
Callable,
Iterable,
Iterator,
Sequence,
)
# Magic UUID to use as a namespace for hashing.
# Used to try and generate a unique UUID for each document
# from hashing the document content and metadata.
@@ -302,7 +299,6 @@ def index(
are not able to specify the uid of the document.
!!! warning "Behavior changed in `langchain-core` 0.3.25"
Added `scoped_full` cleanup mode.
!!! warning
@@ -641,7 +637,6 @@ async def aindex(
are not able to specify the uid of the document.
!!! warning "Behavior changed in `langchain-core` 0.3.25"
Added `scoped_full` cleanup mode.
!!! warning

View File

@@ -1,7 +1,7 @@
"""Core language model abstractions.
"""Language models.
LangChain has two main classes to work with language models: chat models and
"old-fashioned" LLMs (string-in, string-out).
"old-fashioned" LLMs.
**Chat models**
@@ -11,16 +11,14 @@ as outputs (as opposed to using plain text).
Chat models support the assignment of distinct roles to conversation messages, helping
to distinguish messages from the AI, users, and instructions such as system messages.
The key abstraction for chat models is
[`BaseChatModel`][langchain_core.language_models.BaseChatModel]. Implementations should
inherit from this class.
The key abstraction for chat models is `BaseChatModel`. Implementations should inherit
from this class.
See existing [chat model integrations](https://docs.langchain.com/oss/python/integrations/chat).
**LLMs (legacy)**
**LLMs**
Language models that takes a string as input and returns a string.
These are traditionally older models (newer models generally are chat models).
Although the underlying models are string in, string out, the LangChain wrappers also
@@ -55,10 +53,6 @@ if TYPE_CHECKING:
ParrotFakeChatModel,
)
from langchain_core.language_models.llms import LLM, BaseLLM
from langchain_core.language_models.model_profile import (
ModelProfile,
ModelProfileRegistry,
)
__all__ = (
"LLM",
@@ -74,8 +68,6 @@ __all__ = (
"LanguageModelInput",
"LanguageModelLike",
"LanguageModelOutput",
"ModelProfile",
"ModelProfileRegistry",
"ParrotFakeChatModel",
"SimpleChatModel",
"get_tokenizer",
@@ -98,8 +90,6 @@ _dynamic_imports = {
"GenericFakeChatModel": "fake_chat_models",
"ParrotFakeChatModel": "fake_chat_models",
"LLM": "llms",
"ModelProfile": "model_profile",
"ModelProfileRegistry": "model_profile",
"BaseLLM": "llms",
"is_openai_data_block": "_utils",
}

View File

@@ -140,7 +140,6 @@ def _normalize_messages(
- LangChain v0 standard content blocks for backward compatibility
!!! warning "Behavior changed in `langchain-core` 1.0.0"
In previous versions, this function returned messages in LangChain v0 format.
Now, it returns messages in LangChain v1 format, which upgraded chat models now
expect to receive when passing back in message history. For backward

View File

@@ -299,9 +299,6 @@ class BaseLanguageModel(
Useful for checking if an input fits in a model's context window.
This should be overridden by model-specific implementations to provide accurate
token counts via model-specific tokenizers.
Args:
text: The string input to tokenize.
@@ -320,17 +317,9 @@ class BaseLanguageModel(
Useful for checking if an input fits in a model's context window.
This should be overridden by model-specific implementations to provide accurate
token counts via model-specific tokenizers.
!!! note
* The base implementation of `get_num_tokens_from_messages` ignores tool
schemas.
* The base implementation of `get_num_tokens_from_messages` adds additional
prefixes to messages in represent user roles, which will add to the
overall token count. Model-specific implementations may choose to
handle this differently.
The base implementation of `get_num_tokens_from_messages` ignores tool
schemas.
Args:
messages: The message inputs to tokenize.

View File

@@ -15,6 +15,7 @@ from typing import TYPE_CHECKING, Any, Literal, cast
from pydantic import BaseModel, ConfigDict, Field
from typing_extensions import override
from langchain_core._api.beta_decorator import beta
from langchain_core.caches import BaseCache
from langchain_core.callbacks import (
AsyncCallbackManager,
@@ -33,7 +34,6 @@ from langchain_core.language_models.base import (
LangSmithParams,
LanguageModelInput,
)
from langchain_core.language_models.model_profile import ModelProfile
from langchain_core.load import dumpd, dumps
from langchain_core.messages import (
AIMessage,
@@ -76,6 +76,8 @@ from langchain_core.utils.utils import LC_ID_PREFIX, from_env
if TYPE_CHECKING:
import uuid
from langchain_model_profiles import ModelProfile # type: ignore[import-untyped]
from langchain_core.output_parsers.base import OutputParserLike
from langchain_core.runnables import Runnable, RunnableConfig
from langchain_core.tools import BaseTool
@@ -89,10 +91,7 @@ def _generate_response_from_error(error: BaseException) -> list[ChatGeneration]:
try:
metadata["body"] = response.json()
except Exception:
try:
metadata["body"] = getattr(response, "text", None)
except Exception:
metadata["body"] = None
metadata["body"] = getattr(response, "text", None)
if hasattr(response, "headers"):
try:
metadata["headers"] = dict(response.headers)
@@ -333,26 +332,10 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
[`langchain-openai`](https://pypi.org/project/langchain-openai)) can also use this
field to roll out new content formats in a backward-compatible way.
!!! version-added "Added in `langchain-core` 1.0.0"
!!! version-added "Added in `langchain-core` 1.0"
"""
profile: ModelProfile | None = Field(default=None, exclude=True)
"""Profile detailing model capabilities.
!!! warning "Beta feature"
This is a beta feature. The format of model profiles is subject to change.
If not specified, automatically loaded from the provider package on initialization
if data is available.
Example profile data includes context window sizes, supported modalities, or support
for tool calling, structured output, and other features.
!!! version-added "Added in `langchain-core` 1.1.0"
"""
model_config = ConfigDict(
arbitrary_types_allowed=True,
)
@@ -548,7 +531,7 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
):
if block["type"] != index_type:
index_type = block["type"]
index += 1
index = index + 1
if "index" not in block:
block["index"] = index
run_manager.on_llm_new_token(
@@ -680,7 +663,7 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
):
if block["type"] != index_type:
index_type = block["type"]
index += 1
index = index + 1
if "index" not in block:
block["index"] = index
await run_manager.on_llm_new_token(
@@ -731,7 +714,7 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
# --- Custom methods ---
def _combine_llm_outputs(self, _llm_outputs: list[dict | None], /) -> dict:
def _combine_llm_outputs(self, llm_outputs: list[dict | None]) -> dict: # noqa: ARG002
return {}
def _convert_cached_generations(self, cache_val: list) -> list[ChatGeneration]:
@@ -1188,7 +1171,7 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
):
if block["type"] != index_type:
index_type = block["type"]
index += 1
index = index + 1
if "index" not in block:
block["index"] = index
if run_manager:
@@ -1306,7 +1289,7 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
):
if block["type"] != index_type:
index_type = block["type"]
index += 1
index = index + 1
if "index" not in block:
block["index"] = index
if run_manager:
@@ -1579,89 +1562,88 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
depends on the `schema` as described above.
- `'parsing_error'`: `BaseException | None`
???+ example "Pydantic schema (`include_raw=False`)"
Example: Pydantic schema (`include_raw=False`):
```python
from pydantic import BaseModel
```python
from pydantic import BaseModel
class AnswerWithJustification(BaseModel):
'''An answer to the user question along with justification for the answer.'''
class AnswerWithJustification(BaseModel):
'''An answer to the user question along with justification for the answer.'''
answer: str
justification: str
answer: str
justification: str
model = ChatModel(model="model-name", temperature=0)
structured_model = model.with_structured_output(AnswerWithJustification)
model = ChatModel(model="model-name", temperature=0)
structured_model = model.with_structured_output(AnswerWithJustification)
structured_model.invoke(
"What weighs more a pound of bricks or a pound of feathers"
)
structured_model.invoke(
"What weighs more a pound of bricks or a pound of feathers"
)
# -> AnswerWithJustification(
# answer='They weigh the same',
# justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'
# )
```
# -> AnswerWithJustification(
# answer='They weigh the same',
# justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'
# )
```
??? example "Pydantic schema (`include_raw=True`)"
Example: Pydantic schema (`include_raw=True`):
```python
from pydantic import BaseModel
```python
from pydantic import BaseModel
class AnswerWithJustification(BaseModel):
'''An answer to the user question along with justification for the answer.'''
class AnswerWithJustification(BaseModel):
'''An answer to the user question along with justification for the answer.'''
answer: str
justification: str
answer: str
justification: str
model = ChatModel(model="model-name", temperature=0)
structured_model = model.with_structured_output(
AnswerWithJustification, include_raw=True
)
model = ChatModel(model="model-name", temperature=0)
structured_model = model.with_structured_output(
AnswerWithJustification, include_raw=True
)
structured_model.invoke(
"What weighs more a pound of bricks or a pound of feathers"
)
# -> {
# 'raw': AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_Ao02pnFYXD6GN1yzc0uXPsvF', 'function': {'arguments': '{"answer":"They weigh the same.","justification":"Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ."}', 'name': 'AnswerWithJustification'}, 'type': 'function'}]}),
# 'parsed': AnswerWithJustification(answer='They weigh the same.', justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'),
# 'parsing_error': None
# }
```
structured_model.invoke(
"What weighs more a pound of bricks or a pound of feathers"
)
# -> {
# 'raw': AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_Ao02pnFYXD6GN1yzc0uXPsvF', 'function': {'arguments': '{"answer":"They weigh the same.","justification":"Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ."}', 'name': 'AnswerWithJustification'}, 'type': 'function'}]}),
# 'parsed': AnswerWithJustification(answer='They weigh the same.', justification='Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume or density of the objects may differ.'),
# 'parsing_error': None
# }
```
??? example "Dictionary schema (`include_raw=False`)"
Example: `dict` schema (`include_raw=False`):
```python
from pydantic import BaseModel
from langchain_core.utils.function_calling import convert_to_openai_tool
```python
from pydantic import BaseModel
from langchain_core.utils.function_calling import convert_to_openai_tool
class AnswerWithJustification(BaseModel):
'''An answer to the user question along with justification for the answer.'''
class AnswerWithJustification(BaseModel):
'''An answer to the user question along with justification for the answer.'''
answer: str
justification: str
answer: str
justification: str
dict_schema = convert_to_openai_tool(AnswerWithJustification)
model = ChatModel(model="model-name", temperature=0)
structured_model = model.with_structured_output(dict_schema)
dict_schema = convert_to_openai_tool(AnswerWithJustification)
model = ChatModel(model="model-name", temperature=0)
structured_model = model.with_structured_output(dict_schema)
structured_model.invoke(
"What weighs more a pound of bricks or a pound of feathers"
)
# -> {
# 'answer': 'They weigh the same',
# 'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume and density of the two substances differ.'
# }
```
structured_model.invoke(
"What weighs more a pound of bricks or a pound of feathers"
)
# -> {
# 'answer': 'They weigh the same',
# 'justification': 'Both a pound of bricks and a pound of feathers weigh one pound. The weight is the same, but the volume and density of the two substances differ.'
# }
```
!!! warning "Behavior changed in `langchain-core` 0.2.26"
Added support for `TypedDict` class.
""" # noqa: E501
@@ -1703,6 +1685,40 @@ class BaseChatModel(BaseLanguageModel[AIMessage], ABC):
return RunnableMap(raw=llm) | parser_with_fallback
return llm | output_parser
@property
@beta()
def profile(self) -> ModelProfile:
"""Return profiling information for the model.
This property relies on the `langchain-model-profiles` package to retrieve chat
model capabilities, such as context window sizes and supported features.
Raises:
ImportError: If `langchain-model-profiles` is not installed.
Returns:
A `ModelProfile` object containing profiling information for the model.
"""
try:
from langchain_model_profiles import get_model_profile # noqa: PLC0415
except ImportError as err:
informative_error_message = (
"To access model profiling information, please install the "
"`langchain-model-profiles` package: "
"`pip install langchain-model-profiles`."
)
raise ImportError(informative_error_message) from err
provider_id = self._llm_type
model_name = (
# Model name is not standardized across integrations. New integrations
# should prefer `model`.
getattr(self, "model", None)
or getattr(self, "model_name", None)
or getattr(self, "model_id", "")
)
return get_model_profile(provider_id, model_name) or {}
class SimpleChatModel(BaseChatModel):
"""Simplified implementation for a chat model to inherit from.

View File

@@ -61,8 +61,6 @@ if TYPE_CHECKING:
logger = logging.getLogger(__name__)
_background_tasks: set[asyncio.Task] = set()
@functools.lru_cache
def _log_error_once(msg: str) -> None:
@@ -102,9 +100,9 @@ def create_base_retry_decorator(
asyncio.run(coro)
else:
if loop.is_running():
task = loop.create_task(coro)
_background_tasks.add(task)
task.add_done_callback(_background_tasks.discard)
# TODO: Fix RUF006 - this task should have a reference
# and be awaited somewhere
loop.create_task(coro) # noqa: RUF006
else:
asyncio.run(coro)
except Exception as e:

View File

@@ -1,85 +0,0 @@
"""Model profile types and utilities."""
from typing_extensions import TypedDict
class ModelProfile(TypedDict, total=False):
"""Model profile.
!!! warning "Beta feature"
This is a beta feature. The format of model profiles is subject to change.
Provides information about chat model capabilities, such as context window sizes
and supported features.
"""
# --- Input constraints ---
max_input_tokens: int
"""Maximum context window (tokens)"""
image_inputs: bool
"""Whether image inputs are supported."""
# TODO: add more detail about formats?
image_url_inputs: bool
"""Whether [image URL inputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
are supported."""
pdf_inputs: bool
"""Whether [PDF inputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
are supported."""
# TODO: add more detail about formats? e.g. bytes or base64
audio_inputs: bool
"""Whether [audio inputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
are supported."""
# TODO: add more detail about formats? e.g. bytes or base64
video_inputs: bool
"""Whether [video inputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
are supported."""
# TODO: add more detail about formats? e.g. bytes or base64
image_tool_message: bool
"""Whether images can be included in tool messages."""
pdf_tool_message: bool
"""Whether PDFs can be included in tool messages."""
# --- Output constraints ---
max_output_tokens: int
"""Maximum output tokens"""
reasoning_output: bool
"""Whether the model supports [reasoning / chain-of-thought](https://docs.langchain.com/oss/python/langchain/models#reasoning)"""
image_outputs: bool
"""Whether [image outputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
are supported."""
audio_outputs: bool
"""Whether [audio outputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
are supported."""
video_outputs: bool
"""Whether [video outputs](https://docs.langchain.com/oss/python/langchain/models#multimodal)
are supported."""
# --- Tool calling ---
tool_calling: bool
"""Whether the model supports [tool calling](https://docs.langchain.com/oss/python/langchain/models#tool-calling)"""
tool_choice: bool
"""Whether the model supports [tool choice](https://docs.langchain.com/oss/python/langchain/models#forcing-tool-calls)"""
# --- Structured output ---
structured_output: bool
"""Whether the model supports a native [structured output](https://docs.langchain.com/oss/python/langchain/models#structured-outputs)
feature"""
ModelProfileRegistry = dict[str, ModelProfile]
"""Registry mapping model identifiers or names to their ModelProfile."""

View File

@@ -6,7 +6,7 @@ from langchain_core._import_utils import import_attr
if TYPE_CHECKING:
from langchain_core.load.dump import dumpd, dumps
from langchain_core.load.load import InitValidator, loads
from langchain_core.load.load import loads
from langchain_core.load.serializable import Serializable
# Unfortunately, we have to eagerly import load from langchain_core/load/load.py
@@ -15,19 +15,11 @@ if TYPE_CHECKING:
# the `from langchain_core.load.load import load` absolute import should also work.
from langchain_core.load.load import load
__all__ = (
"InitValidator",
"Serializable",
"dumpd",
"dumps",
"load",
"loads",
)
__all__ = ("Serializable", "dumpd", "dumps", "load", "loads")
_dynamic_imports = {
"dumpd": "dump",
"dumps": "dump",
"InitValidator": "load",
"loads": "load",
"Serializable": "serializable",
}

View File

@@ -1,176 +0,0 @@
"""Validation utilities for LangChain serialization.
Provides escape-based protection against injection attacks in serialized objects. The
approach uses an allowlist design: only dicts explicitly produced by
`Serializable.to_json()` are treated as LC objects during deserialization.
## How escaping works
During serialization, plain dicts (user data) that contain an `'lc'` key are wrapped:
```python
{"lc": 1, ...} # user data that looks like LC object
# becomes:
{"__lc_escaped__": {"lc": 1, ...}}
```
During deserialization, escaped dicts are unwrapped and returned as plain dicts,
NOT instantiated as LC objects.
"""
from typing import Any
_LC_ESCAPED_KEY = "__lc_escaped__"
"""Sentinel key used to mark escaped user dicts during serialization.
When a plain dict contains 'lc' key (which could be confused with LC objects),
we wrap it as {"__lc_escaped__": {...original...}}.
"""
def _needs_escaping(obj: dict[str, Any]) -> bool:
"""Check if a dict needs escaping to prevent confusion with LC objects.
A dict needs escaping if:
1. It has an `'lc'` key (could be confused with LC serialization format)
2. It has only the escape key (would be mistaken for an escaped dict)
"""
return "lc" in obj or (len(obj) == 1 and _LC_ESCAPED_KEY in obj)
def _escape_dict(obj: dict[str, Any]) -> dict[str, Any]:
"""Wrap a dict in the escape marker.
Example:
```python
{"key": "value"} # becomes {"__lc_escaped__": {"key": "value"}}
```
"""
return {_LC_ESCAPED_KEY: obj}
def _is_escaped_dict(obj: dict[str, Any]) -> bool:
"""Check if a dict is an escaped user dict.
Example:
```python
{"__lc_escaped__": {...}} # is an escaped dict
```
"""
return len(obj) == 1 and _LC_ESCAPED_KEY in obj
def _serialize_value(obj: Any) -> Any:
"""Serialize a value with escaping of user dicts.
Called recursively on kwarg values to escape any plain dicts that could be confused
with LC objects.
Args:
obj: The value to serialize.
Returns:
The serialized value with user dicts escaped as needed.
"""
from langchain_core.load.serializable import ( # noqa: PLC0415
Serializable,
to_json_not_implemented,
)
if isinstance(obj, Serializable):
# This is an LC object - serialize it properly (not escaped)
return _serialize_lc_object(obj)
if isinstance(obj, dict):
if not all(isinstance(k, (str, int, float, bool, type(None))) for k in obj):
# if keys are not json serializable
return to_json_not_implemented(obj)
# Check if dict needs escaping BEFORE recursing into values.
# If it needs escaping, wrap it as-is - the contents are user data that
# will be returned as-is during deserialization (no instantiation).
# This prevents re-escaping of already-escaped nested content.
if _needs_escaping(obj):
return _escape_dict(obj)
# Safe dict (no 'lc' key) - recurse into values
return {k: _serialize_value(v) for k, v in obj.items()}
if isinstance(obj, (list, tuple)):
return [_serialize_value(item) for item in obj]
if isinstance(obj, (str, int, float, bool, type(None))):
return obj
# Non-JSON-serializable object (datetime, custom objects, etc.)
return to_json_not_implemented(obj)
def _is_lc_secret(obj: Any) -> bool:
"""Check if an object is a LangChain secret marker."""
expected_num_keys = 3
return (
isinstance(obj, dict)
and obj.get("lc") == 1
and obj.get("type") == "secret"
and "id" in obj
and len(obj) == expected_num_keys
)
def _serialize_lc_object(obj: Any) -> dict[str, Any]:
"""Serialize a `Serializable` object with escaping of user data in kwargs.
Args:
obj: The `Serializable` object to serialize.
Returns:
The serialized dict with user data in kwargs escaped as needed.
Note:
Kwargs values are processed with `_serialize_value` to escape user data (like
metadata) that contains `'lc'` keys. Secret fields (from `lc_secrets`) are
skipped because `to_json()` replaces their values with secret markers.
"""
from langchain_core.load.serializable import Serializable # noqa: PLC0415
if not isinstance(obj, Serializable):
msg = f"Expected Serializable, got {type(obj)}"
raise TypeError(msg)
serialized: dict[str, Any] = dict(obj.to_json())
# Process kwargs to escape user data that could be confused with LC objects
# Skip secret fields - to_json() already converted them to secret markers
if serialized.get("type") == "constructor" and "kwargs" in serialized:
serialized["kwargs"] = {
k: v if _is_lc_secret(v) else _serialize_value(v)
for k, v in serialized["kwargs"].items()
}
return serialized
def _unescape_value(obj: Any) -> Any:
"""Unescape a value, processing escape markers in dict values and lists.
When an escaped dict is encountered (`{"__lc_escaped__": ...}`), it's
unwrapped and the contents are returned AS-IS (no further processing).
The contents represent user data that should not be modified.
For regular dicts and lists, we recurse to find any nested escape markers.
Args:
obj: The value to unescape.
Returns:
The unescaped value.
"""
if isinstance(obj, dict):
if _is_escaped_dict(obj):
# Unwrap and return the user data as-is (no further unescaping).
# The contents are user data that may contain more escape keys,
# but those are part of the user's actual data.
return obj[_LC_ESCAPED_KEY]
# Regular dict - recurse into values to find nested escape markers
return {k: _unescape_value(v) for k, v in obj.items()}
if isinstance(obj, list):
return [_unescape_value(item) for item in obj]
return obj

View File

@@ -1,26 +1,10 @@
"""Serialize LangChain objects to JSON.
Provides `dumps` (to JSON string) and `dumpd` (to dict) for serializing
`Serializable` objects.
## Escaping
During serialization, plain dicts (user data) that contain an `'lc'` key are escaped
by wrapping them: `{"__lc_escaped__": {...original...}}`. This prevents injection
attacks where malicious data could trick the deserializer into instantiating
arbitrary classes. The escape marker is removed during deserialization.
This is an allowlist approach: only dicts explicitly produced by
`Serializable.to_json()` are treated as LC objects; everything else is escaped if it
could be confused with the LC format.
"""
"""Dump objects to json."""
import json
from typing import Any
from pydantic import BaseModel
from langchain_core.load._validation import _serialize_value
from langchain_core.load.serializable import Serializable, to_json_not_implemented
from langchain_core.messages import AIMessage
from langchain_core.outputs import ChatGeneration
@@ -41,20 +25,6 @@ def default(obj: Any) -> Any:
def _dump_pydantic_models(obj: Any) -> Any:
"""Convert nested Pydantic models to dicts for JSON serialization.
Handles the special case where a `ChatGeneration` contains an `AIMessage`
with a parsed Pydantic model in `additional_kwargs["parsed"]`. Since
Pydantic models aren't directly JSON serializable, this converts them to
dicts.
Args:
obj: The object to process.
Returns:
A copy of the object with nested Pydantic models converted to dicts, or
the original object unchanged if no conversion was needed.
"""
if (
isinstance(obj, ChatGeneration)
and isinstance(obj.message, AIMessage)
@@ -70,17 +40,10 @@ def _dump_pydantic_models(obj: Any) -> Any:
def dumps(obj: Any, *, pretty: bool = False, **kwargs: Any) -> str:
"""Return a JSON string representation of an object.
Note:
Plain dicts containing an `'lc'` key are automatically escaped to prevent
confusion with LC serialization format. The escape marker is removed during
deserialization.
Args:
obj: The object to dump.
pretty: Whether to pretty print the json.
If `True`, the json will be indented by either 2 spaces or the amount
provided in the `indent` kwarg.
pretty: Whether to pretty print the json. If `True`, the json will be
indented with 2 spaces (if no indent is provided as part of `kwargs`).
**kwargs: Additional arguments to pass to `json.dumps`
Returns:
@@ -92,29 +55,28 @@ def dumps(obj: Any, *, pretty: bool = False, **kwargs: Any) -> str:
if "default" in kwargs:
msg = "`default` should not be passed to dumps"
raise ValueError(msg)
obj = _dump_pydantic_models(obj)
serialized = _serialize_value(obj)
if pretty:
indent = kwargs.pop("indent", 2)
return json.dumps(serialized, indent=indent, **kwargs)
return json.dumps(serialized, **kwargs)
try:
obj = _dump_pydantic_models(obj)
if pretty:
indent = kwargs.pop("indent", 2)
return json.dumps(obj, default=default, indent=indent, **kwargs)
return json.dumps(obj, default=default, **kwargs)
except TypeError:
if pretty:
indent = kwargs.pop("indent", 2)
return json.dumps(to_json_not_implemented(obj), indent=indent, **kwargs)
return json.dumps(to_json_not_implemented(obj), **kwargs)
def dumpd(obj: Any) -> Any:
"""Return a dict representation of an object.
Note:
Plain dicts containing an `'lc'` key are automatically escaped to prevent
confusion with LC serialization format. The escape marker is removed during
deserialization.
Args:
obj: The object to dump.
Returns:
Dictionary that can be serialized to json using `json.dumps`.
"""
obj = _dump_pydantic_models(obj)
return _serialize_value(obj)
# Unfortunately this function is not as efficient as it could be because it first
# dumps the object to a json string and then loads it back into a dictionary.
return json.loads(dumps(obj))

View File

@@ -1,83 +1,11 @@
"""Load LangChain objects from JSON strings or objects.
## How it works
Each `Serializable` LangChain object has a unique identifier (its "class path"), which
is a list of strings representing the module path and class name. For example:
- `AIMessage` -> `["langchain_core", "messages", "ai", "AIMessage"]`
- `ChatPromptTemplate` -> `["langchain_core", "prompts", "chat", "ChatPromptTemplate"]`
When deserializing, the class path from the JSON `'id'` field is checked against an
allowlist. If the class is not in the allowlist, deserialization raises a `ValueError`.
## Security model
The `allowed_objects` parameter controls which classes can be deserialized:
- **`'core'` (default)**: Allow classes defined in the serialization mappings for
langchain_core.
- **`'all'`**: Allow classes defined in the serialization mappings. This
includes core LangChain types (messages, prompts, documents, etc.) and trusted
partner integrations. See `langchain_core.load.mapping` for the full list.
- **Explicit list of classes**: Only those specific classes are allowed.
For simple data types like messages and documents, the default allowlist is safe to use.
These classes do not perform side effects during initialization.
!!! note "Side effects in allowed classes"
Deserialization calls `__init__` on allowed classes. If those classes perform side
effects during initialization (network calls, file operations, etc.), those side
effects will occur. The allowlist prevents instantiation of classes outside the
allowlist, but does not sandbox the allowed classes themselves.
Import paths are also validated against trusted namespaces before any module is
imported.
### Injection protection (escape-based)
During serialization, plain dicts that contain an `'lc'` key are escaped by wrapping
them: `{"__lc_escaped__": {...}}`. During deserialization, escaped dicts are unwrapped
and returned as plain dicts, NOT instantiated as LC objects.
This is an allowlist approach: only dicts explicitly produced by
`Serializable.to_json()` (which are NOT escaped) are treated as LC objects;
everything else is user data.
Even if an attacker's payload includes `__lc_escaped__` wrappers, it will be unwrapped
to plain dicts and NOT instantiated as malicious objects.
## Examples
```python
from langchain_core.load import load
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.messages import AIMessage, HumanMessage
# Use default allowlist (classes from mappings) - recommended
obj = load(data)
# Allow only specific classes (most restrictive)
obj = load(
data,
allowed_objects=[
ChatPromptTemplate,
AIMessage,
HumanMessage,
],
)
```
"""
"""Load LangChain objects from JSON strings or objects."""
import importlib
import json
import os
from collections.abc import Callable, Iterable
from typing import Any, Literal, cast
from typing import Any
from langchain_core._api import beta
from langchain_core.load._validation import _is_escaped_dict, _unescape_value
from langchain_core.load.mapping import (
_JS_SERIALIZABLE_MAPPING,
_OG_SERIALIZABLE_MAPPING,
@@ -116,209 +44,32 @@ ALL_SERIALIZABLE_MAPPINGS = {
**_JS_SERIALIZABLE_MAPPING,
}
# Cache for the default allowed class paths computed from mappings
# Maps mode ("all" or "core") to the cached set of paths
_default_class_paths_cache: dict[str, set[tuple[str, ...]]] = {}
def _get_default_allowed_class_paths(
allowed_object_mode: Literal["all", "core"],
) -> set[tuple[str, ...]]:
"""Get the default allowed class paths from the serialization mappings.
This uses the mappings as the source of truth for what classes are allowed
by default. Both the legacy paths (keys) and current paths (values) are included.
Args:
allowed_object_mode: either `'all'` or `'core'`.
Returns:
Set of class path tuples that are allowed by default.
"""
if allowed_object_mode in _default_class_paths_cache:
return _default_class_paths_cache[allowed_object_mode]
allowed_paths: set[tuple[str, ...]] = set()
for key, value in ALL_SERIALIZABLE_MAPPINGS.items():
if allowed_object_mode == "core" and value[0] != "langchain_core":
continue
allowed_paths.add(key)
allowed_paths.add(value)
_default_class_paths_cache[allowed_object_mode] = allowed_paths
return _default_class_paths_cache[allowed_object_mode]
def _block_jinja2_templates(
class_path: tuple[str, ...],
kwargs: dict[str, Any],
) -> None:
"""Block jinja2 templates during deserialization for security.
Jinja2 templates can execute arbitrary code, so they are blocked by default when
deserializing objects with `template_format='jinja2'`.
Note:
We intentionally do NOT check the `class_path` here to keep this simple and
future-proof. If any new class is added that accepts `template_format='jinja2'`,
it will be automatically blocked without needing to update this function.
Args:
class_path: The class path tuple being deserialized (unused).
kwargs: The kwargs dict for the class constructor.
Raises:
ValueError: If `template_format` is `'jinja2'`.
"""
_ = class_path # Unused - see docstring for rationale. Kept to satisfy signature.
if kwargs.get("template_format") == "jinja2":
msg = (
"Jinja2 templates are not allowed during deserialization for security "
"reasons. Use 'f-string' template format instead, or explicitly allow "
"jinja2 by providing a custom init_validator."
)
raise ValueError(msg)
def default_init_validator(
class_path: tuple[str, ...],
kwargs: dict[str, Any],
) -> None:
"""Default init validator that blocks jinja2 templates.
This is the default validator used by `load()` and `loads()` when no custom
validator is provided.
Args:
class_path: The class path tuple being deserialized.
kwargs: The kwargs dict for the class constructor.
Raises:
ValueError: If template_format is `'jinja2'`.
"""
_block_jinja2_templates(class_path, kwargs)
AllowedObject = type[Serializable]
"""Type alias for classes that can be included in the `allowed_objects` parameter.
Must be a `Serializable` subclass (the class itself, not an instance).
"""
InitValidator = Callable[[tuple[str, ...], dict[str, Any]], None]
"""Type alias for a callable that validates kwargs during deserialization.
The callable receives:
- `class_path`: A tuple of strings identifying the class being instantiated
(e.g., `('langchain', 'schema', 'messages', 'AIMessage')`).
- `kwargs`: The kwargs dict that will be passed to the constructor.
The validator should raise an exception if the object should not be deserialized.
"""
def _compute_allowed_class_paths(
allowed_objects: Iterable[AllowedObject],
import_mappings: dict[tuple[str, ...], tuple[str, ...]],
) -> set[tuple[str, ...]]:
"""Return allowed class paths from an explicit list of classes.
A class path is a tuple of strings identifying a serializable class, derived from
`Serializable.lc_id()`. For example: `('langchain_core', 'messages', 'AIMessage')`.
Args:
allowed_objects: Iterable of `Serializable` subclasses to allow.
import_mappings: Mapping of legacy class paths to current class paths.
Returns:
Set of allowed class paths.
Example:
```python
# Allow a specific class
_compute_allowed_class_paths([MyPrompt], {}) ->
{("langchain_core", "prompts", "MyPrompt")}
# Include legacy paths that map to the same class
import_mappings = {("old", "Prompt"): ("langchain_core", "prompts", "MyPrompt")}
_compute_allowed_class_paths([MyPrompt], import_mappings) ->
{("langchain_core", "prompts", "MyPrompt"), ("old", "Prompt")}
```
"""
allowed_objects_list = list(allowed_objects)
allowed_class_paths: set[tuple[str, ...]] = set()
for allowed_obj in allowed_objects_list:
if not isinstance(allowed_obj, type) or not issubclass(
allowed_obj, Serializable
):
msg = "allowed_objects must contain Serializable subclasses."
raise TypeError(msg)
class_path = tuple(allowed_obj.lc_id())
allowed_class_paths.add(class_path)
# Add legacy paths that map to the same class.
for mapping_key, mapping_value in import_mappings.items():
if tuple(mapping_value) == class_path:
allowed_class_paths.add(mapping_key)
return allowed_class_paths
class Reviver:
"""Reviver for JSON objects.
Used as the `object_hook` for `json.loads` to reconstruct LangChain objects from
their serialized JSON representation.
Only classes in the allowlist can be instantiated.
"""
"""Reviver for JSON objects."""
def __init__(
self,
allowed_objects: Iterable[AllowedObject] | Literal["all", "core"] = "core",
secrets_map: dict[str, str] | None = None,
valid_namespaces: list[str] | None = None,
secrets_from_env: bool = False, # noqa: FBT001,FBT002
secrets_from_env: bool = True, # noqa: FBT001,FBT002
additional_import_mappings: dict[tuple[str, ...], tuple[str, ...]]
| None = None,
*,
ignore_unserializable_fields: bool = False,
init_validator: InitValidator | None = default_init_validator,
) -> None:
"""Initialize the reviver.
Args:
allowed_objects: Allowlist of classes that can be deserialized.
- `'core'` (default): Allow classes defined in the serialization
mappings for `langchain_core`.
- `'all'`: Allow classes defined in the serialization mappings.
This includes core LangChain types (messages, prompts, documents,
etc.) and trusted partner integrations. See
`langchain_core.load.mapping` for the full list.
- Explicit list of classes: Only those specific classes are allowed.
secrets_map: A map of secrets to load.
If a secret is not found in the map, it will be loaded from the
environment if `secrets_from_env` is `True`.
valid_namespaces: Additional namespaces (modules) to allow during
deserialization, beyond the default trusted namespaces.
secrets_map: A map of secrets to load. If a secret is not found in
the map, it will be loaded from the environment if `secrets_from_env`
is True.
valid_namespaces: A list of additional namespaces (modules)
to allow to be deserialized.
secrets_from_env: Whether to load secrets from the environment.
additional_import_mappings: A dictionary of additional namespace mappings.
additional_import_mappings: A dictionary of additional namespace mappings
You can use this to override default mappings or add new mappings.
When `allowed_objects` is `None` (using defaults), paths from these
mappings are also added to the allowed class paths.
ignore_unserializable_fields: Whether to ignore unserializable fields.
init_validator: Optional callable to validate kwargs before instantiation.
If provided, this function is called with `(class_path, kwargs)` where
`class_path` is the class path tuple and `kwargs` is the kwargs dict.
The validator should raise an exception if the object should not be
deserialized, otherwise return `None`.
Defaults to `default_init_validator` which blocks jinja2 templates.
"""
self.secrets_from_env = secrets_from_env
self.secrets_map = secrets_map or {}
@@ -337,26 +88,7 @@ class Reviver:
if self.additional_import_mappings
else ALL_SERIALIZABLE_MAPPINGS
)
# Compute allowed class paths:
# - "all" -> use default paths from mappings (+ additional_import_mappings)
# - Explicit list -> compute from those classes
if allowed_objects in ("all", "core"):
self.allowed_class_paths: set[tuple[str, ...]] | None = (
_get_default_allowed_class_paths(
cast("Literal['all', 'core']", allowed_objects)
).copy()
)
# Add paths from additional_import_mappings to the defaults
if self.additional_import_mappings:
for key, value in self.additional_import_mappings.items():
self.allowed_class_paths.add(key)
self.allowed_class_paths.add(value)
else:
self.allowed_class_paths = _compute_allowed_class_paths(
cast("Iterable[AllowedObject]", allowed_objects), self.import_mappings
)
self.ignore_unserializable_fields = ignore_unserializable_fields
self.init_validator = init_validator
def __call__(self, value: dict[str, Any]) -> Any:
"""Revive the value.
@@ -407,20 +139,6 @@ class Reviver:
[*namespace, name] = value["id"]
mapping_key = tuple(value["id"])
if (
self.allowed_class_paths is not None
and mapping_key not in self.allowed_class_paths
):
msg = (
f"Deserialization of {mapping_key!r} is not allowed. "
"The default (allowed_objects='core') only permits core "
"langchain-core classes. To allow trusted partner integrations, "
"use allowed_objects='all'. Alternatively, pass an explicit list "
"of allowed classes via allowed_objects=[...]. "
"See langchain_core.load.mapping for the full allowlist."
)
raise ValueError(msg)
if (
namespace[0] not in self.valid_namespaces
# The root namespace ["langchain"] is not a valid identifier.
@@ -428,11 +146,13 @@ class Reviver:
):
msg = f"Invalid namespace: {value}"
raise ValueError(msg)
# Determine explicit import path
# Has explicit import path.
if mapping_key in self.import_mappings:
import_path = self.import_mappings[mapping_key]
# Split into module and name
import_dir, name = import_path[:-1], import_path[-1]
# Import module
mod = importlib.import_module(".".join(import_dir))
elif namespace[0] in DISALLOW_LOAD_FROM_PATH:
msg = (
"Trying to deserialize something that cannot "
@@ -440,16 +160,9 @@ class Reviver:
f"{mapping_key}."
)
raise ValueError(msg)
# Otherwise, treat namespace as path.
else:
# Otherwise, treat namespace as path.
import_dir = namespace
# Validate import path is in trusted namespaces before importing
if import_dir[0] not in self.valid_namespaces:
msg = f"Invalid namespace: {value}"
raise ValueError(msg)
mod = importlib.import_module(".".join(import_dir))
mod = importlib.import_module(".".join(namespace))
cls = getattr(mod, name)
@@ -461,10 +174,6 @@ class Reviver:
# We don't need to recurse on kwargs
# as json.loads will do that for us.
kwargs = value.get("kwargs", {})
if self.init_validator is not None:
self.init_validator(mapping_key, kwargs)
return cls(**kwargs)
return value
@@ -474,81 +183,40 @@ class Reviver:
def loads(
text: str,
*,
allowed_objects: Iterable[AllowedObject] | Literal["all", "core"] = "core",
secrets_map: dict[str, str] | None = None,
valid_namespaces: list[str] | None = None,
secrets_from_env: bool = False,
secrets_from_env: bool = True,
additional_import_mappings: dict[tuple[str, ...], tuple[str, ...]] | None = None,
ignore_unserializable_fields: bool = False,
init_validator: InitValidator | None = default_init_validator,
) -> Any:
"""Revive a LangChain class from a JSON string.
Equivalent to `load(json.loads(text))`.
Only classes in the allowlist can be instantiated. The default allowlist includes
core LangChain types (messages, prompts, documents, etc.). See
`langchain_core.load.mapping` for the full list.
!!! warning "Beta feature"
This is a beta feature. Please be wary of deploying experimental code to
production unless you've taken appropriate precautions.
Args:
text: The string to load.
allowed_objects: Allowlist of classes that can be deserialized.
- `'core'` (default): Allow classes defined in the serialization mappings
for `langchain_core`.
- `'all'`: Allow classes defined in the serialization mappings.
This includes core LangChain types (messages, prompts, documents, etc.)
and trusted partner integrations. See `langchain_core.load.mapping` for
the full list.
- Explicit list of classes: Only those specific classes are allowed.
- `[]`: Disallow all deserialization (will raise on any object).
secrets_map: A map of secrets to load.
If a secret is not found in the map, it will be loaded from the environment
if `secrets_from_env` is `True`.
valid_namespaces: Additional namespaces (modules) to allow during
deserialization, beyond the default trusted namespaces.
secrets_map: A map of secrets to load. If a secret is not found in
the map, it will be loaded from the environment if `secrets_from_env`
is True.
valid_namespaces: A list of additional namespaces (modules)
to allow to be deserialized.
secrets_from_env: Whether to load secrets from the environment.
additional_import_mappings: A dictionary of additional namespace mappings.
additional_import_mappings: A dictionary of additional namespace mappings
You can use this to override default mappings or add new mappings.
When `allowed_objects` is `None` (using defaults), paths from these
mappings are also added to the allowed class paths.
ignore_unserializable_fields: Whether to ignore unserializable fields.
init_validator: Optional callable to validate kwargs before instantiation.
If provided, this function is called with `(class_path, kwargs)` where
`class_path` is the class path tuple and `kwargs` is the kwargs dict.
The validator should raise an exception if the object should not be
deserialized, otherwise return `None`.
Defaults to `default_init_validator` which blocks jinja2 templates.
Returns:
Revived LangChain objects.
Raises:
ValueError: If an object's class path is not in the `allowed_objects` allowlist.
"""
# Parse JSON and delegate to load() for proper escape handling
raw_obj = json.loads(text)
return load(
raw_obj,
allowed_objects=allowed_objects,
secrets_map=secrets_map,
valid_namespaces=valid_namespaces,
secrets_from_env=secrets_from_env,
additional_import_mappings=additional_import_mappings,
ignore_unserializable_fields=ignore_unserializable_fields,
init_validator=init_validator,
return json.loads(
text,
object_hook=Reviver(
secrets_map,
valid_namespaces,
secrets_from_env,
additional_import_mappings,
ignore_unserializable_fields=ignore_unserializable_fields,
),
)
@@ -556,116 +224,49 @@ def loads(
def load(
obj: Any,
*,
allowed_objects: Iterable[AllowedObject] | Literal["all", "core"] = "core",
secrets_map: dict[str, str] | None = None,
valid_namespaces: list[str] | None = None,
secrets_from_env: bool = False,
secrets_from_env: bool = True,
additional_import_mappings: dict[tuple[str, ...], tuple[str, ...]] | None = None,
ignore_unserializable_fields: bool = False,
init_validator: InitValidator | None = default_init_validator,
) -> Any:
"""Revive a LangChain class from a JSON object.
Use this if you already have a parsed JSON object, eg. from `json.load` or
`orjson.loads`.
Only classes in the allowlist can be instantiated. The default allowlist includes
core LangChain types (messages, prompts, documents, etc.). See
`langchain_core.load.mapping` for the full list.
!!! warning "Beta feature"
This is a beta feature. Please be wary of deploying experimental code to
production unless you've taken appropriate precautions.
Use this if you already have a parsed JSON object,
eg. from `json.load` or `orjson.loads`.
Args:
obj: The object to load.
allowed_objects: Allowlist of classes that can be deserialized.
- `'core'` (default): Allow classes defined in the serialization mappings
for `langchain_core`.
- `'all'`: Allow classes defined in the serialization mappings.
This includes core LangChain types (messages, prompts, documents, etc.)
and trusted partner integrations. See `langchain_core.load.mapping` for
the full list.
- Explicit list of classes: Only those specific classes are allowed.
- `[]`: Disallow all deserialization (will raise on any object).
secrets_map: A map of secrets to load.
If a secret is not found in the map, it will be loaded from the environment
if `secrets_from_env` is `True`.
valid_namespaces: Additional namespaces (modules) to allow during
deserialization, beyond the default trusted namespaces.
secrets_map: A map of secrets to load. If a secret is not found in
the map, it will be loaded from the environment if `secrets_from_env`
is True.
valid_namespaces: A list of additional namespaces (modules)
to allow to be deserialized.
secrets_from_env: Whether to load secrets from the environment.
additional_import_mappings: A dictionary of additional namespace mappings.
additional_import_mappings: A dictionary of additional namespace mappings
You can use this to override default mappings or add new mappings.
When `allowed_objects` is `None` (using defaults), paths from these
mappings are also added to the allowed class paths.
ignore_unserializable_fields: Whether to ignore unserializable fields.
init_validator: Optional callable to validate kwargs before instantiation.
If provided, this function is called with `(class_path, kwargs)` where
`class_path` is the class path tuple and `kwargs` is the kwargs dict.
The validator should raise an exception if the object should not be
deserialized, otherwise return `None`.
Defaults to `default_init_validator` which blocks jinja2 templates.
Returns:
Revived LangChain objects.
Raises:
ValueError: If an object's class path is not in the `allowed_objects` allowlist.
Example:
```python
from langchain_core.load import load, dumpd
from langchain_core.messages import AIMessage
msg = AIMessage(content="Hello")
data = dumpd(msg)
# Deserialize using default allowlist
loaded = load(data)
# Or with explicit allowlist
loaded = load(data, allowed_objects=[AIMessage])
# Or extend defaults with additional mappings
loaded = load(
data,
additional_import_mappings={
("my_pkg", "MyClass"): ("my_pkg", "module", "MyClass"),
},
)
```
"""
reviver = Reviver(
allowed_objects,
secrets_map,
valid_namespaces,
secrets_from_env,
additional_import_mappings,
ignore_unserializable_fields=ignore_unserializable_fields,
init_validator=init_validator,
)
def _load(obj: Any) -> Any:
if isinstance(obj, dict):
# Check for escaped dict FIRST (before recursing).
# Escaped dicts are user data that should NOT be processed as LC objects.
if _is_escaped_dict(obj):
return _unescape_value(obj)
# Not escaped - recurse into children then apply reviver
# Need to revive leaf nodes before reviving this node
loaded_obj = {k: _load(v) for k, v in obj.items()}
return reviver(loaded_obj)
if isinstance(obj, list):
return [_load(o) for o in obj]
if isinstance(obj, str) and obj in reviver.secrets_map:
return reviver.secrets_map[obj]
return obj
return _load(obj)

View File

@@ -1,19 +1,21 @@
"""Serialization mapping.
This file contains a mapping between the `lc_namespace` path for a given
subclass that implements from `Serializable` to the namespace
This file contains a mapping between the lc_namespace path for a given
subclass that implements from Serializable to the namespace
where that class is actually located.
This mapping helps maintain the ability to serialize and deserialize
well-known LangChain objects even if they are moved around in the codebase
across different LangChain versions.
For example, the code for the `AIMessage` class is located in
`langchain_core.messages.ai.AIMessage`. This message is associated with the
`lc_namespace` of `["langchain", "schema", "messages", "AIMessage"]`,
because this code was originally in `langchain.schema.messages.AIMessage`.
For example,
The mapping allows us to deserialize an `AIMessage` created with an older
The code for AIMessage class is located in langchain_core.messages.ai.AIMessage,
This message is associated with the lc_namespace
["langchain", "schema", "messages", "AIMessage"],
because this code was originally in langchain.schema.messages.AIMessage.
The mapping allows us to deserialize an AIMessage created with an older
version of LangChain where the code was in a different location.
"""
@@ -273,11 +275,6 @@ SERIALIZABLE_MAPPING: dict[tuple[str, ...], tuple[str, ...]] = {
"chat_models",
"ChatGroq",
),
("langchain_xai", "chat_models", "ChatXAI"): (
"langchain_xai",
"chat_models",
"ChatXAI",
),
("langchain", "chat_models", "fireworks", "ChatFireworks"): (
"langchain_fireworks",
"chat_models",
@@ -532,6 +529,16 @@ SERIALIZABLE_MAPPING: dict[tuple[str, ...], tuple[str, ...]] = {
"structured",
"StructuredPrompt",
),
("langchain_sambanova", "chat_models", "ChatSambaNovaCloud"): (
"langchain_sambanova",
"chat_models",
"ChatSambaNovaCloud",
),
("langchain_sambanova", "chat_models", "ChatSambaStudio"): (
"langchain_sambanova",
"chat_models",
"ChatSambaStudio",
),
("langchain_core", "prompts", "message", "_DictMessagePromptTemplate"): (
"langchain_core",
"prompts",

View File

@@ -92,12 +92,11 @@ class Serializable(BaseModel, ABC):
It relies on the following methods and properties:
- [`is_lc_serializable`][langchain_core.load.serializable.Serializable.is_lc_serializable]: Is this class serializable?
- `is_lc_serializable`: Is this class serializable?
By design, even if a class inherits from `Serializable`, it is not serializable
by default. This is to prevent accidental serialization of objects that should
not be serialized.
- [`get_lc_namespace`][langchain_core.load.serializable.Serializable.get_lc_namespace]: Get the namespace of the LangChain object.
- `get_lc_namespace`: Get the namespace of the LangChain object.
During deserialization, this namespace is used to identify
the correct class to instantiate.
@@ -106,10 +105,10 @@ class Serializable(BaseModel, ABC):
During deserialization an additional mapping is handle classes that have moved
or been renamed across package versions.
- [`lc_secrets`][langchain_core.load.serializable.Serializable.lc_secrets]: A map of constructor argument names to secret ids.
- [`lc_attributes`][langchain_core.load.serializable.Serializable.lc_attributes]: List of additional attribute names that should be included
- `lc_secrets`: A map of constructor argument names to secret ids.
- `lc_attributes`: List of additional attribute names that should be included
as part of the serialized representation.
""" # noqa: E501
"""
# Remove default BaseModel init docstring.
def __init__(self, *args: Any, **kwargs: Any) -> None:
@@ -133,12 +132,12 @@ class Serializable(BaseModel, ABC):
def get_lc_namespace(cls) -> list[str]:
"""Get the namespace of the LangChain object.
For example, if the class is [`langchain.llms.openai.OpenAI`][langchain_openai.OpenAI],
then the namespace is `["langchain", "llms", "openai"]`
For example, if the class is `langchain.llms.openai.OpenAI`, then the
namespace is `["langchain", "llms", "openai"]`
Returns:
The namespace.
""" # noqa: E501
"""
return cls.__module__.split(".")
@property

View File

@@ -51,22 +51,22 @@ class InputTokenDetails(TypedDict, total=False):
May also hold extra provider-specific keys.
!!! version-added "Added in `langchain-core` 0.3.9"
"""
audio: int
"""Audio input tokens."""
cache_creation: int
"""Input tokens that were cached and there was a cache miss.
Since there was a cache miss, the cache was created from these tokens.
"""
cache_read: int
"""Input tokens that were cached and there was a cache hit.
Since there was a cache hit, the tokens were read from the cache. More precisely,
the model state given these tokens was read from the cache.
"""
@@ -91,12 +91,12 @@ class OutputTokenDetails(TypedDict, total=False):
audio: int
"""Audio output tokens."""
reasoning: int
"""Reasoning output tokens.
Tokens generated by the model in a chain of thought process (i.e. by OpenAI's o1
models) that are not returned as part of model output.
"""
@@ -124,11 +124,9 @@ class UsageMetadata(TypedDict):
```
!!! warning "Behavior changed in `langchain-core` 0.3.9"
Added `input_token_details` and `output_token_details`.
!!! note "LangSmith SDK"
The LangSmith SDK also has a `UsageMetadata` class. While the two share fields,
LangSmith's `UsageMetadata` has additional fields to capture cost information
used by the LangSmith platform.
@@ -136,19 +134,15 @@ class UsageMetadata(TypedDict):
input_tokens: int
"""Count of input (or prompt) tokens. Sum of all input token types."""
output_tokens: int
"""Count of output (or completion) tokens. Sum of all output token types."""
total_tokens: int
"""Total token count. Sum of `input_tokens` + `output_tokens`."""
input_token_details: NotRequired[InputTokenDetails]
"""Breakdown of input token counts.
Does *not* need to sum to full input token count. Does *not* need to have all keys.
"""
output_token_details: NotRequired[OutputTokenDetails]
"""Breakdown of output token counts.
@@ -168,10 +162,8 @@ class AIMessage(BaseMessage):
tool_calls: list[ToolCall] = []
"""If present, tool calls associated with the message."""
invalid_tool_calls: list[InvalidToolCall] = []
"""If present, tool calls with parsing errors associated with the message."""
usage_metadata: UsageMetadata | None = None
"""If present, usage metadata for a message, such as token counts.
@@ -326,7 +318,7 @@ class AIMessage(BaseMessage):
if tool_calls := values.get("tool_calls"):
values["tool_calls"] = [
create_tool_call(
**{k: v for k, v in tc.items() if k not in {"type", "extras"}}
**{k: v for k, v in tc.items() if k not in ("type", "extras")}
)
for tc in tool_calls
]
@@ -442,7 +434,7 @@ class AIMessageChunk(AIMessage, BaseMessageChunk):
blocks = [
block
for block in blocks
if block["type"] not in {"tool_call", "invalid_tool_call"}
if block["type"] not in ("tool_call", "invalid_tool_call")
]
for tool_call_chunk in self.tool_call_chunks:
tc: types.ToolCallChunk = {
@@ -563,7 +555,7 @@ class AIMessageChunk(AIMessage, BaseMessageChunk):
@model_validator(mode="after")
def init_server_tool_calls(self) -> Self:
"""Parse `server_tool_call_chunks` from [`ServerToolCallChunk`][langchain.messages.ServerToolCallChunk] objects.""" # noqa: E501
"""Parse `server_tool_call_chunks`."""
if (
self.chunk_position == "last"
and self.response_metadata.get("output_version") == "v1"
@@ -573,7 +565,7 @@ class AIMessageChunk(AIMessage, BaseMessageChunk):
if (
isinstance(block, dict)
and block.get("type")
in {"server_tool_call", "server_tool_call_chunk"}
in ("server_tool_call", "server_tool_call_chunk")
and (args_str := block.get("args"))
and isinstance(args_str, str)
):

View File

@@ -5,9 +5,11 @@ from __future__ import annotations
from typing import TYPE_CHECKING, Any, cast, overload
from pydantic import ConfigDict, Field
from typing_extensions import Self
from langchain_core._api.deprecation import warn_deprecated
from langchain_core.load.serializable import Serializable
from langchain_core.messages import content as types
from langchain_core.utils import get_bolded_text
from langchain_core.utils._merge import merge_dicts, merge_lists
from langchain_core.utils.interactive_env import is_interactive_env
@@ -15,9 +17,6 @@ from langchain_core.utils.interactive_env import is_interactive_env
if TYPE_CHECKING:
from collections.abc import Sequence
from typing_extensions import Self
from langchain_core.messages import content as types
from langchain_core.prompts.chat import ChatPromptTemplate
@@ -391,12 +390,12 @@ class BaseMessageChunk(BaseMessage):
Raises:
TypeError: If the other object is not a message chunk.
Example:
```txt
AIMessageChunk(content="Hello", ...)
+ AIMessageChunk(content=" World", ...)
= AIMessageChunk(content="Hello World", ...)
```
For example,
`AIMessageChunk(content="Hello") + AIMessageChunk(content=" World")`
will give `AIMessageChunk(content="Hello World")`
"""
if isinstance(other, BaseMessageChunk):
# If both are (subclasses of) BaseMessageChunk,

View File

@@ -12,11 +12,10 @@ the implementation in `BaseMessage`.
from __future__ import annotations
from collections.abc import Callable
from typing import TYPE_CHECKING
if TYPE_CHECKING:
from collections.abc import Callable
from langchain_core.messages import AIMessage, AIMessageChunk
from langchain_core.messages import content as types

View File

@@ -159,12 +159,12 @@ def _convert_citation_to_v1(citation: dict[str, Any]) -> types.Annotation:
return url_citation
if citation_type in {
if citation_type in (
"char_location",
"content_block_location",
"page_location",
"search_result_location",
}:
):
document_citation: types.Citation = {
"type": "citation",
"cited_text": citation["cited_text"],
@@ -173,6 +173,8 @@ def _convert_citation_to_v1(citation: dict[str, Any]) -> types.Annotation:
document_citation["title"] = citation["document_title"]
elif title := citation.get("title"):
document_citation["title"] = title
else:
pass
known_fields = {
"type",
"cited_text",
@@ -243,20 +245,11 @@ def _convert_to_v1_from_anthropic(message: AIMessage) -> list[types.ContentBlock
and message.chunk_position != "last"
):
# Isolated chunk
chunk = message.tool_call_chunks[0]
tool_call_chunk = types.ToolCallChunk(
name=chunk.get("name"),
id=chunk.get("id"),
args=chunk.get("args"),
type="tool_call_chunk",
tool_call_chunk: types.ToolCallChunk = (
message.tool_call_chunks[0].copy() # type: ignore[assignment]
)
if "caller" in block:
tool_call_chunk["extras"] = {"caller": block["caller"]}
index = chunk.get("index")
if index is not None:
tool_call_chunk["index"] = index
if "type" not in tool_call_chunk:
tool_call_chunk["type"] = "tool_call_chunk"
yield tool_call_chunk
else:
tool_call_block: types.ToolCall | None = None
@@ -278,6 +271,8 @@ def _convert_to_v1_from_anthropic(message: AIMessage) -> list[types.ContentBlock
"id": tc.get("id"),
}
break
else:
pass
if not tool_call_block:
tool_call_block = {
"type": "tool_call",
@@ -287,27 +282,17 @@ def _convert_to_v1_from_anthropic(message: AIMessage) -> list[types.ContentBlock
}
if "index" in block:
tool_call_block["index"] = block["index"]
if "caller" in block:
if "extras" not in tool_call_block:
tool_call_block["extras"] = {}
tool_call_block["extras"]["caller"] = block["caller"]
yield tool_call_block
elif block_type == "input_json_delta" and isinstance(
message, AIMessageChunk
):
if len(message.tool_call_chunks) == 1:
chunk = message.tool_call_chunks[0]
tool_call_chunk = types.ToolCallChunk(
name=chunk.get("name"),
id=chunk.get("id"),
args=chunk.get("args"),
type="tool_call_chunk",
tool_call_chunk = (
message.tool_call_chunks[0].copy() # type: ignore[assignment]
)
index = chunk.get("index")
if index is not None:
tool_call_chunk["index"] = index
if "type" not in tool_call_chunk:
tool_call_chunk["type"] = "tool_call_chunk"
yield tool_call_chunk
else:
@@ -461,26 +446,12 @@ def _convert_to_v1_from_anthropic(message: AIMessage) -> list[types.ContentBlock
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
"""Derive standard content blocks from a message with Anthropic content.
Args:
message: The message to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a message with Anthropic content."""
return _convert_to_v1_from_anthropic(message)
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
"""Derive standard content blocks from a message chunk with Anthropic content.
Args:
message: The message chunk to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a message chunk with Anthropic content."""
return _convert_to_v1_from_anthropic(message)

View File

@@ -65,28 +65,14 @@ def _convert_to_v1_from_bedrock_chunk(
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
"""Derive standard content blocks from a message with Bedrock content.
Args:
message: The message to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a message with Bedrock content."""
if "claude" not in message.response_metadata.get("model_name", "").lower():
raise NotImplementedError # fall back to best-effort parsing
return _convert_to_v1_from_bedrock(message)
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
"""Derive standard content blocks from a message chunk with Bedrock content.
Args:
message: The message chunk to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a message chunk with Bedrock content."""
# TODO: add model_name to all Bedrock chunks and update core merging logic
# to not append during aggregation. Then raise NotImplementedError here if
# not an Anthropic model to fall back to best-effort parsing.

View File

@@ -209,16 +209,11 @@ def _convert_to_v1_from_converse(message: AIMessage) -> list[types.ContentBlock]
and message.chunk_position != "last"
):
# Isolated chunk
chunk = message.tool_call_chunks[0]
tool_call_chunk = types.ToolCallChunk(
name=chunk.get("name"),
id=chunk.get("id"),
args=chunk.get("args"),
type="tool_call_chunk",
tool_call_chunk: types.ToolCallChunk = (
message.tool_call_chunks[0].copy() # type: ignore[assignment]
)
index = chunk.get("index")
if index is not None:
tool_call_chunk["index"] = index
if "type" not in tool_call_chunk:
tool_call_chunk["type"] = "tool_call_chunk"
yield tool_call_chunk
else:
tool_call_block: types.ToolCall | None = None
@@ -240,6 +235,8 @@ def _convert_to_v1_from_converse(message: AIMessage) -> list[types.ContentBlock]
"id": tc.get("id"),
}
break
else:
pass
if not tool_call_block:
tool_call_block = {
"type": "tool_call",
@@ -256,16 +253,11 @@ def _convert_to_v1_from_converse(message: AIMessage) -> list[types.ContentBlock]
and isinstance(message, AIMessageChunk)
and len(message.tool_call_chunks) == 1
):
chunk = message.tool_call_chunks[0]
tool_call_chunk = types.ToolCallChunk(
name=chunk.get("name"),
id=chunk.get("id"),
args=chunk.get("args"),
type="tool_call_chunk",
tool_call_chunk = (
message.tool_call_chunks[0].copy() # type: ignore[assignment]
)
index = chunk.get("index")
if index is not None:
tool_call_chunk["index"] = index
if "type" not in tool_call_chunk:
tool_call_chunk["type"] = "tool_call_chunk"
yield tool_call_chunk
else:
@@ -281,26 +273,12 @@ def _convert_to_v1_from_converse(message: AIMessage) -> list[types.ContentBlock]
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
"""Derive standard content blocks from a message with Bedrock Converse content.
Args:
message: The message to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a message with Bedrock Converse content."""
return _convert_to_v1_from_converse(message)
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
"""Derive standard content blocks from a chunk with Bedrock Converse content.
Args:
message: The message chunk to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a chunk with Bedrock Converse content."""
return _convert_to_v1_from_converse(message)

View File

@@ -76,36 +76,21 @@ def translate_grounding_metadata_to_citations(
for chunk_index in chunk_indices:
if chunk_index < len(grounding_chunks):
chunk = grounding_chunks[chunk_index]
# Handle web and maps grounding
web_info = chunk.get("web") or {}
maps_info = chunk.get("maps") or {}
# Extract citation info depending on source
url = maps_info.get("uri") or web_info.get("uri")
title = maps_info.get("title") or web_info.get("title")
# Note: confidence_scores is a legacy field from Gemini 2.0 and earlier
# that indicated confidence (0.0-1.0) for each grounding chunk.
#
# In Gemini 2.5+, this field is always None/empty and should be ignored.
extras_metadata = {
"web_search_queries": web_search_queries,
"grounding_chunk_index": chunk_index,
"confidence_scores": support.get("confidence_scores") or [],
}
# Add maps-specific metadata if present
if maps_info.get("placeId"):
extras_metadata["place_id"] = maps_info["placeId"]
web_info = chunk.get("web", {})
citation = create_citation(
url=url,
title=title,
url=web_info.get("uri"),
title=web_info.get("title"),
start_index=start_index,
end_index=end_index,
cited_text=cited_text,
google_ai_metadata=extras_metadata,
extras={
"google_ai_metadata": {
"web_search_queries": web_search_queries,
"grounding_chunk_index": chunk_index,
"confidence_scores": support.get("confidence_scores", []),
}
},
)
citations.append(citation)
@@ -411,10 +396,7 @@ def _convert_to_v1_from_genai(message: AIMessage) -> list[types.ContentBlock]:
except Exception:
# Not valid base64, treat as non-standard
converted_blocks.append(
{
"type": "non_standard",
"value": item,
}
{"type": "non_standard", "value": item}
)
else:
# This likely won't be reached according to previous implementations
@@ -526,26 +508,12 @@ def _convert_to_v1_from_genai(message: AIMessage) -> list[types.ContentBlock]:
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
"""Derive standard content blocks from a message with Google (GenAI) content.
Args:
message: The message to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a message with Google (GenAI) content."""
return _convert_to_v1_from_genai(message)
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
"""Derive standard content blocks from a chunk with Google (GenAI) content.
Args:
message: The message chunk to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a chunk with Google (GenAI) content."""
return _convert_to_v1_from_genai(message)

View File

@@ -119,26 +119,12 @@ def _convert_to_v1_from_groq(message: AIMessage) -> list[types.ContentBlock]:
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
"""Derive standard content blocks from a message with groq content.
Args:
message: The message to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a message with groq content."""
return _convert_to_v1_from_groq(message)
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
"""Derive standard content blocks from a message chunk with groq content.
Args:
message: The message chunk to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a message chunk with groq content."""
return _convert_to_v1_from_groq(message)

View File

@@ -4,6 +4,7 @@ from __future__ import annotations
import json
import warnings
from collections.abc import Iterable
from typing import TYPE_CHECKING, Any, Literal, cast
from langchain_core.language_models._utils import (
@@ -13,24 +14,11 @@ from langchain_core.language_models._utils import (
from langchain_core.messages import content as types
if TYPE_CHECKING:
from collections.abc import Iterable
from langchain_core.messages import AIMessage, AIMessageChunk
def convert_to_openai_image_block(block: dict[str, Any]) -> dict:
"""Convert `ImageContentBlock` to format expected by OpenAI Chat Completions.
Args:
block: The image content block to convert.
Raises:
ValueError: If required keys are missing.
ValueError: If source type is unsupported.
Returns:
The formatted image content block.
"""
"""Convert `ImageContentBlock` to format expected by OpenAI Chat Completions."""
if "url" in block:
return {
"type": "image_url",
@@ -61,18 +49,6 @@ def convert_to_openai_data_block(
"Standard data content block" can include old-style LangChain v0 blocks
(URLContentBlock, Base64ContentBlock, IDContentBlock) or new ones.
Args:
block: The content block to convert.
api: The OpenAI API being targeted. Either "chat/completions" or "responses".
Raises:
ValueError: If required keys are missing.
ValueError: If file URLs are used with Chat Completions API.
ValueError: If block type is unsupported.
Returns:
The formatted content block.
"""
if block["type"] == "image":
chat_completions_block = convert_to_openai_image_block(block)
@@ -271,7 +247,7 @@ def _convert_from_v1_to_chat_completions(message: AIMessage) -> AIMessage:
if block_type == "text":
# Strip annotations
new_content.append({"type": "text", "text": block["text"]})
elif block_type in {"reasoning", "tool_call"}:
elif block_type in ("reasoning", "tool_call"):
pass
else:
new_content.append(block)
@@ -729,6 +705,8 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
if invalid_tool_call.get("id") == call_id:
tool_call_block = invalid_tool_call.copy()
break
else:
pass
if tool_call_block:
if "id" in block:
if "extras" not in tool_call_block:
@@ -756,7 +734,7 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
k: v for k, v in block["action"].items() if k != "sources"
}
for key in block:
if key not in {"type", "id", "action", "status", "index"}:
if key not in ("type", "id", "action", "status", "index"):
web_search_call[key] = block[key]
yield cast("types.ServerToolCall", web_search_call)
@@ -782,6 +760,8 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
web_search_result["status"] = "success"
elif status:
web_search_result["extras"] = {"status": status}
else:
pass
if "index" in block and isinstance(block["index"], int):
web_search_result["index"] = f"lc_wsr_{block['index'] + 1}"
yield cast("types.ServerToolResult", web_search_result)
@@ -797,14 +777,14 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
file_search_call["index"] = f"lc_fsc_{block['index']}"
for key in block:
if key not in {
if key not in (
"type",
"id",
"queries",
"results",
"status",
"index",
}:
):
file_search_call[key] = block[key]
yield cast("types.ServerToolCall", file_search_call)
@@ -823,6 +803,8 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
file_search_result["status"] = "success"
elif status:
file_search_result["extras"] = {"status": status}
else:
pass
if "index" in block and isinstance(block["index"], int):
file_search_result["index"] = f"lc_fsr_{block['index'] + 1}"
yield cast("types.ServerToolResult", file_search_result)
@@ -866,6 +848,8 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
code_interpreter_result["status"] = "success"
elif status:
code_interpreter_result["extras"] = {"status": status}
else:
pass
if "index" in block and isinstance(block["index"], int):
code_interpreter_result["index"] = f"lc_cir_{block['index'] + 1}"
@@ -996,14 +980,7 @@ def _convert_to_v1_from_responses(message: AIMessage) -> list[types.ContentBlock
def translate_content(message: AIMessage) -> list[types.ContentBlock]:
"""Derive standard content blocks from a message with OpenAI content.
Args:
message: The message to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a message with OpenAI content."""
if isinstance(message.content, str):
return _convert_to_v1_from_chat_completions(message)
message = _convert_from_v03_ai_message(message)
@@ -1011,14 +988,7 @@ def translate_content(message: AIMessage) -> list[types.ContentBlock]:
def translate_content_chunk(message: AIMessageChunk) -> list[types.ContentBlock]:
"""Derive standard content blocks from a message chunk with OpenAI content.
Args:
message: The message chunk to translate.
Returns:
The derived content blocks.
"""
"""Derive standard content blocks from a message chunk with OpenAI content."""
if isinstance(message.content, str):
return _convert_to_v1_from_chat_completions_chunk(message)
message = _convert_from_v03_ai_message(message) # type: ignore[assignment]

View File

@@ -654,7 +654,7 @@ class PlainTextContentBlock(TypedDict):
!!! note
Title and context are optional fields that may be passed to the model. See
Anthropic [example](https://platform.claude.com/docs/en/build-with-claude/citations#citable-vs-non-citable-content).
Anthropic [example](https://docs.claude.com/en/docs/build-with-claude/citations#citable-vs-non-citable-content).
!!! note "Factory function"
`create_plaintext_block` may also be used as a factory to create a

View File

@@ -29,39 +29,38 @@ class ToolMessage(BaseMessage, ToolOutputMixin):
`ToolMessage` objects contain the result of a tool invocation. Typically, the result
is encoded inside the `content` field.
`tool_call_id` is used to associate the tool call request with the tool call
response. Useful in situations where a chat model is able to request multiple tool
calls in parallel.
Example: A `ToolMessage` representing a result of `42` from a tool call with id
Example:
A `ToolMessage` representing a result of `42` from a tool call with id
```python
from langchain_core.messages import ToolMessage
```python
from langchain_core.messages import ToolMessage
ToolMessage(content="42", tool_call_id="call_Jja7J89XsjrOLA5r!MEOW!SL")
```
ToolMessage(content="42", tool_call_id="call_Jja7J89XsjrOLA5r!MEOW!SL")
```
Example: A `ToolMessage` where only part of the tool output is sent to the model
and the full output is passed in to artifact.
Example:
A `ToolMessage` where only part of the tool output is sent to the model
and the full output is passed in to artifact.
```python
from langchain_core.messages import ToolMessage
```python
from langchain_core.messages import ToolMessage
tool_output = {
"stdout": "From the graph we can see that the correlation between "
"x and y is ...",
"stderr": None,
"artifacts": {"type": "image", "base64_data": "/9j/4gIcSU..."},
}
tool_output = {
"stdout": "From the graph we can see that the correlation between "
"x and y is ...",
"stderr": None,
"artifacts": {"type": "image", "base64_data": "/9j/4gIcSU..."},
}
ToolMessage(
content=tool_output["stdout"],
artifact=tool_output,
tool_call_id="call_Jja7J89XsjrOLA5r!MEOW!SL",
)
```
The `tool_call_id` field is used to associate the tool call request with the
tool call response. Useful in situations where a chat model is able
to request multiple tool calls in parallel.
ToolMessage(
content=tool_output["stdout"],
artifact=tool_output,
tool_call_id="call_Jja7J89XsjrOLA5r!MEOW!SL",
)
```
"""
tool_call_id: str

View File

@@ -15,16 +15,12 @@ import json
import logging
import math
from collections.abc import Callable, Iterable, Sequence
from functools import partial, wraps
from functools import partial
from typing import (
TYPE_CHECKING,
Annotated,
Any,
Concatenate,
Literal,
ParamSpec,
Protocol,
TypeVar,
cast,
overload,
)
@@ -109,11 +105,6 @@ def get_buffer_string(
Raises:
ValueError: If an unsupported message type is encountered.
Note:
If a message is an `AIMessage` and contains both tool calls under `tool_calls`
and a function call under `additional_kwargs["function_call"]`, only the tool
calls will be appended to the string representation.
Example:
```python
from langchain_core import AIMessage, HumanMessage
@@ -144,12 +135,8 @@ def get_buffer_string(
msg = f"Got unsupported message type: {m}"
raise ValueError(msg) # noqa: TRY004
message = f"{role}: {m.text}"
if isinstance(m, AIMessage):
if m.tool_calls:
message += f"{m.tool_calls}"
elif "function_call" in m.additional_kwargs:
# Legacy behavior assumes only one function call per message
message += f"{m.additional_kwargs['function_call']}"
if isinstance(m, AIMessage) and "function_call" in m.additional_kwargs:
message += f"{m.additional_kwargs['function_call']}"
string_messages.append(message)
return "\n".join(string_messages)
@@ -397,54 +384,33 @@ def convert_to_messages(
return [_convert_to_message(m) for m in messages]
_P = ParamSpec("_P")
_R_co = TypeVar("_R_co", covariant=True)
class _RunnableSupportCallable(Protocol[_P, _R_co]):
def _runnable_support(func: Callable) -> Callable:
@overload
def __call__(
self,
messages: None = None,
*args: _P.args,
**kwargs: _P.kwargs,
) -> Runnable[Sequence[MessageLikeRepresentation], _R_co]: ...
@overload
def __call__(
self,
messages: Sequence[MessageLikeRepresentation] | PromptValue,
*args: _P.args,
**kwargs: _P.kwargs,
) -> _R_co: ...
def __call__(
self,
messages: Sequence[MessageLikeRepresentation] | PromptValue | None = None,
*args: _P.args,
**kwargs: _P.kwargs,
) -> _R_co | Runnable[Sequence[MessageLikeRepresentation], _R_co]: ...
def _runnable_support(
func: Callable[
Concatenate[Sequence[MessageLikeRepresentation] | PromptValue, _P], _R_co
],
) -> _RunnableSupportCallable[_P, _R_co]:
@wraps(func)
def wrapped(
messages: Sequence[MessageLikeRepresentation] | PromptValue | None = None,
*args: _P.args,
**kwargs: _P.kwargs,
) -> _R_co | Runnable[Sequence[MessageLikeRepresentation], _R_co]:
messages: None = None, **kwargs: Any
) -> Runnable[Sequence[MessageLikeRepresentation], list[BaseMessage]]: ...
@overload
def wrapped(
messages: Sequence[MessageLikeRepresentation], **kwargs: Any
) -> list[BaseMessage]: ...
def wrapped(
messages: Sequence[MessageLikeRepresentation] | None = None,
**kwargs: Any,
) -> (
list[BaseMessage]
| Runnable[Sequence[MessageLikeRepresentation], list[BaseMessage]]
):
# Import locally to prevent circular import.
from langchain_core.runnables.base import RunnableLambda # noqa: PLC0415
if messages is not None:
return func(messages, *args, **kwargs)
return func(messages, **kwargs)
return RunnableLambda(partial(func, **kwargs), name=func.__name__)
return cast("_RunnableSupportCallable[_P, _R_co]", wrapped)
wrapped.__doc__ = func.__doc__
return wrapped
@_runnable_support
@@ -729,8 +695,7 @@ def trim_messages(
max_tokens: int,
token_counter: Callable[[list[BaseMessage]], int]
| Callable[[BaseMessage], int]
| BaseLanguageModel
| Literal["approximate"],
| BaseLanguageModel,
strategy: Literal["first", "last"] = "last",
allow_partial: bool = False,
end_on: str | type[BaseMessage] | Sequence[str | type[BaseMessage]] | None = None,
@@ -768,65 +733,51 @@ def trim_messages(
messages: Sequence of Message-like objects to trim.
max_tokens: Max token count of trimmed messages.
token_counter: Function or llm for counting tokens in a `BaseMessage` or a
list of `BaseMessage`.
If a `BaseLanguageModel` is passed in then
`BaseLanguageModel.get_num_tokens_from_messages()` will be used. Set to
`len` to count the number of **messages** in the chat history.
You can also use string shortcuts for convenience:
- `'approximate'`: Uses `count_tokens_approximately` for fast, approximate
token counts.
list of `BaseMessage`. If a `BaseLanguageModel` is passed in then
`BaseLanguageModel.get_num_tokens_from_messages()` will be used.
Set to `len` to count the number of **messages** in the chat history.
!!! note
`count_tokens_approximately` (or the shortcut `'approximate'`) is
recommended for using `trim_messages` on the hot path, where exact token
counting is not necessary.
Use `count_tokens_approximately` to get fast, approximate token
counts.
This is recommended for using `trim_messages` on the hot path, where
exact token counting is not necessary.
strategy: Strategy for trimming.
- `'first'`: Keep the first `<= n_count` tokens of the messages.
- `'last'`: Keep the last `<= n_count` tokens of the messages.
allow_partial: Whether to split a message if only part of the message can be
included.
If `strategy='last'` then the last partial contents of a message are
included. If `strategy='first'` then the first partial contents of a
included. If `strategy='last'` then the last partial contents of a message
are included. If `strategy='first'` then the first partial contents of a
message are included.
end_on: The message type to end on.
end_on: The message type to end on. If specified then every message after the
last occurrence of this type is ignored. If `strategy='last'` then this
is done before we attempt to get the last `max_tokens`. If
`strategy='first'` then this is done after we get the first
`max_tokens`. Can be specified as string names (e.g. `'system'`,
`'human'`, `'ai'`, ...) or as `BaseMessage` classes (e.g.
`SystemMessage`, `HumanMessage`, `AIMessage`, ...). Can be a single
type or a list of types.
If specified then every message after the last occurrence of this type is
ignored. If `strategy='last'` then this is done before we attempt to get the
last `max_tokens`. If `strategy='first'` then this is done after we get the
first `max_tokens`. Can be specified as string names (e.g. `'system'`,
`'human'`, `'ai'`, ...) or as `BaseMessage` classes (e.g. `SystemMessage`,
`HumanMessage`, `AIMessage`, ...). Can be a single type or a list of types.
start_on: The message type to start on.
Should only be specified if `strategy='last'`. If specified then every
message before the first occurrence of this type is ignored. This is done
after we trim the initial messages to the last `max_tokens`. Does not apply
to a `SystemMessage` at index 0 if `include_system=True`. Can be specified
as string names (e.g. `'system'`, `'human'`, `'ai'`, ...) or as
`BaseMessage` classes (e.g. `SystemMessage`, `HumanMessage`, `AIMessage`,
...). Can be a single type or a list of types.
start_on: The message type to start on. Should only be specified if
`strategy='last'`. If specified then every message before
the first occurrence of this type is ignored. This is done after we trim
the initial messages to the last `max_tokens`. Does not
apply to a `SystemMessage` at index 0 if `include_system=True`. Can be
specified as string names (e.g. `'system'`, `'human'`, `'ai'`, ...) or
as `BaseMessage` classes (e.g. `SystemMessage`, `HumanMessage`,
`AIMessage`, ...). Can be a single type or a list of types.
include_system: Whether to keep the `SystemMessage` if there is one at index
`0`.
Should only be specified if `strategy="last"`.
`0`. Should only be specified if `strategy="last"`.
text_splitter: Function or `langchain_text_splitters.TextSplitter` for
splitting the string contents of a message.
Only used if `allow_partial=True`. If `strategy='last'` then the last split
tokens from a partial message will be included. if `strategy='first'` then
the first split tokens from a partial message will be included. Token
splitter assumes that separators are kept, so that split contents can be
directly concatenated to recreate the original text. Defaults to splitting
on newlines.
splitting the string contents of a message. Only used if
`allow_partial=True`. If `strategy='last'` then the last split tokens
from a partial message will be included. if `strategy='first'` then the
first split tokens from a partial message will be included. Token splitter
assumes that separators are kept, so that split contents can be directly
concatenated to recreate the original text. Defaults to splitting on
newlines.
Returns:
List of trimmed `BaseMessage`.
@@ -837,8 +788,8 @@ def trim_messages(
Example:
Trim chat history based on token count, keeping the `SystemMessage` if
present, and ensuring that the chat history starts with a `HumanMessage` (or a
`SystemMessage` followed by a `HumanMessage`).
present, and ensuring that the chat history starts with a `HumanMessage` (
or a `SystemMessage` followed by a `HumanMessage`).
```python
from langchain_core.messages import (
@@ -891,34 +842,8 @@ def trim_messages(
]
```
Trim chat history using approximate token counting with `'approximate'`:
```python
trim_messages(
messages,
max_tokens=45,
strategy="last",
# Using the "approximate" shortcut for fast token counting
token_counter="approximate",
start_on="human",
include_system=True,
)
# This is equivalent to using `count_tokens_approximately` directly
from langchain_core.messages.utils import count_tokens_approximately
trim_messages(
messages,
max_tokens=45,
strategy="last",
token_counter=count_tokens_approximately,
start_on="human",
include_system=True,
)
```
Trim chat history based on the message count, keeping the `SystemMessage` if
present, and ensuring that the chat history starts with a HumanMessage (
present, and ensuring that the chat history starts with a `HumanMessage` (
or a `SystemMessage` followed by a `HumanMessage`).
trim_messages(
@@ -1040,44 +965,24 @@ def trim_messages(
raise ValueError(msg)
messages = convert_to_messages(messages)
# Handle string shortcuts for token counter
if isinstance(token_counter, str):
if token_counter in _TOKEN_COUNTER_SHORTCUTS:
actual_token_counter = _TOKEN_COUNTER_SHORTCUTS[token_counter]
else:
available_shortcuts = ", ".join(
f"'{key}'" for key in _TOKEN_COUNTER_SHORTCUTS
)
msg = (
f"Invalid token_counter shortcut '{token_counter}'. "
f"Available shortcuts: {available_shortcuts}."
)
raise ValueError(msg)
else:
# Type narrowing: at this point token_counter is not a str
actual_token_counter = token_counter # type: ignore[assignment]
if hasattr(actual_token_counter, "get_num_tokens_from_messages"):
list_token_counter = actual_token_counter.get_num_tokens_from_messages
elif callable(actual_token_counter):
if hasattr(token_counter, "get_num_tokens_from_messages"):
list_token_counter = token_counter.get_num_tokens_from_messages
elif callable(token_counter):
if (
next(
iter(inspect.signature(actual_token_counter).parameters.values())
).annotation
next(iter(inspect.signature(token_counter).parameters.values())).annotation
is BaseMessage
):
def list_token_counter(messages: Sequence[BaseMessage]) -> int:
return sum(actual_token_counter(msg) for msg in messages) # type: ignore[arg-type, misc]
return sum(token_counter(msg) for msg in messages) # type: ignore[arg-type, misc]
else:
list_token_counter = actual_token_counter
list_token_counter = token_counter
else:
msg = (
f"'token_counter' expected to be a model that implements "
f"'get_num_tokens_from_messages()' or a function. Received object of type "
f"{type(actual_token_counter)}."
f"{type(token_counter)}."
)
raise ValueError(msg)
@@ -1117,7 +1022,6 @@ def convert_to_openai_messages(
*,
text_format: Literal["string", "block"] = "string",
include_id: bool = False,
pass_through_unknown_blocks: bool = True,
) -> dict | list[dict]:
"""Convert LangChain messages into OpenAI message dicts.
@@ -1137,9 +1041,6 @@ def convert_to_openai_messages(
content blocks these are left as is.
include_id: Whether to include message IDs in the openai messages, if they
are present in the source messages.
pass_through_unknown_blocks: Whether to include content blocks with unknown
formats in the output. If `False`, an error is raised if an unknown
content block is encountered.
Raises:
ValueError: if an unrecognized `text_format` is specified, or if a message
@@ -1389,36 +1290,6 @@ def convert_to_openai_messages(
},
}
)
elif block.get("type") == "function_call": # OpenAI Responses
if not any(
tool_call["id"] == block.get("call_id")
for tool_call in cast("AIMessage", message).tool_calls
):
if missing := [
k
for k in ("call_id", "name", "arguments")
if k not in block
]:
err = (
f"Unrecognized content block at "
f"messages[{i}].content[{j}] has 'type': "
f"'tool_use', but is missing expected key(s) "
f"{missing}. Full content block:\n\n{block}"
)
raise ValueError(err)
oai_msg["tool_calls"] = oai_msg.get("tool_calls", [])
oai_msg["tool_calls"].append(
{
"type": "function",
"id": block.get("call_id"),
"function": {
"name": block.get("name"),
"arguments": block.get("arguments"),
},
}
)
if pass_through_unknown_blocks:
content.append(block)
elif block.get("type") == "tool_result":
if missing := [
k for k in ("content", "tool_use_id") if k not in block
@@ -1499,10 +1370,7 @@ def convert_to_openai_messages(
},
}
)
elif (
block.get("type") in {"thinking", "reasoning"}
or pass_through_unknown_blocks
):
elif block.get("type") in ["thinking", "reasoning"]:
content.append(block)
else:
err = (
@@ -1875,14 +1743,3 @@ def count_tokens_approximately(
# round up once more time in case extra_tokens_per_message is a float
return math.ceil(token_count)
# Mapping from string shortcuts to token counter functions
def _approximate_token_counter(messages: Sequence[BaseMessage]) -> int:
"""Wrapper for `count_tokens_approximately` that matches expected signature."""
return count_tokens_approximately(messages)
_TOKEN_COUNTER_SHORTCUTS = {
"approximate": _approximate_token_counter,
}

View File

@@ -37,7 +37,7 @@ class OutputFunctionsParser(BaseGenerationOutputParser[Any]):
The parsed JSON object.
Raises:
OutputParserException: If the output is not valid JSON.
`OutputParserException`: If the output is not valid JSON.
"""
generation = result[0]
if not isinstance(generation, ChatGeneration):
@@ -88,7 +88,7 @@ class JsonOutputFunctionsParser(BaseCumulativeTransformOutputParser[Any]):
The parsed JSON object.
Raises:
OutputParserException: If the output is not valid JSON.
OutputParserExcept`ion: If the output is not valid JSON.
"""
if len(result) != 1:
msg = f"Expected exactly one result, but got {len(result)}"

View File

@@ -47,24 +47,22 @@ def parse_tool_call(
"""
if "function" not in raw_tool_call:
return None
arguments = raw_tool_call["function"]["arguments"]
if partial:
try:
function_args = parse_partial_json(arguments, strict=strict)
function_args = parse_partial_json(
raw_tool_call["function"]["arguments"], strict=strict
)
except (JSONDecodeError, TypeError): # None args raise TypeError
return None
# Handle None or empty string arguments for parameter-less tools
elif not arguments:
function_args = {}
else:
try:
function_args = json.loads(arguments, strict=strict)
function_args = json.loads(
raw_tool_call["function"]["arguments"], strict=strict
)
except JSONDecodeError as e:
msg = (
f"Function {raw_tool_call['function']['name']} arguments:\n\n"
f"{arguments}\n\nare not valid JSON. "
f"{raw_tool_call['function']['arguments']}\n\nare not valid JSON. "
f"Received JSONDecodeError {e}"
)
raise OutputParserException(msg) from e

View File

@@ -37,7 +37,7 @@ class PydanticOutputParser(JsonOutputParser, Generic[TBaseModel]):
def _parser_exception(
self, e: Exception, json_object: dict
) -> OutputParserException:
json_string = json.dumps(json_object, ensure_ascii=False)
json_string = json.dumps(json_object)
name = self.pydantic_object.__name__
msg = f"Failed to parse {name} from completion {json_string}. Got: {e}"
return OutputParserException(msg, llm_output=json_string)
@@ -54,7 +54,7 @@ class PydanticOutputParser(JsonOutputParser, Generic[TBaseModel]):
all the keys that have been returned so far.
Raises:
OutputParserException: If the result is not valid JSON
`OutputParserException`: If the result is not valid JSON
or does not conform to the Pydantic model.
Returns:

View File

@@ -6,33 +6,7 @@ from langchain_core.output_parsers.transform import BaseTransformOutputParser
class StrOutputParser(BaseTransformOutputParser[str]):
"""Extract text content from model outputs as a string.
Converts model outputs (such as `AIMessage` or `AIMessageChunk` objects) into plain
text strings. It's the simplest output parser and is useful when you need string
responses for downstream processing, display, or storage.
Supports streaming, yielding text chunks as they're generated by the model.
Example:
```python
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o")
parser = StrOutputParser()
# Get string output from a model
message = model.invoke("Tell me a joke")
result = parser.invoke(message)
print(result) # plain string
# With streaming - use transform() to process a stream
stream = model.stream("Tell me a story")
for chunk in parser.transform(stream):
print(chunk, end="", flush=True)
```
"""
"""OutputParser that parses `LLMResult` into the top likely string."""
@classmethod
def is_lc_serializable(cls) -> bool:

View File

@@ -2,17 +2,15 @@
from __future__ import annotations
from typing import TYPE_CHECKING, Literal
from typing import Literal
from pydantic import model_validator
from typing_extensions import Self
from langchain_core.messages import BaseMessage, BaseMessageChunk
from langchain_core.outputs.generation import Generation
from langchain_core.utils._merge import merge_dicts
if TYPE_CHECKING:
from typing_extensions import Self
class ChatGeneration(Generation):
"""A single chat generation output.

View File

@@ -6,7 +6,7 @@ import contextlib
import json
import typing
from abc import ABC, abstractmethod
from collections.abc import Mapping
from collections.abc import Callable, Mapping
from functools import cached_property
from pathlib import Path
from typing import (
@@ -33,8 +33,6 @@ from langchain_core.runnables.config import ensure_config
from langchain_core.utils.pydantic import create_model_v2
if TYPE_CHECKING:
from collections.abc import Callable
from langchain_core.documents import Document

View File

@@ -903,28 +903,23 @@ class ChatPromptTemplate(BaseChatPromptTemplate):
5. A string which is shorthand for `("human", template)`; e.g.,
`"{user_input}"`
template_format: Format of the template.
**kwargs: Additional keyword arguments passed to `BasePromptTemplate`,
including (but not limited to):
input_variables: A list of the names of the variables whose values are
required as inputs to the prompt.
optional_variables: A list of the names of the variables for placeholder
or MessagePlaceholder that are optional.
- `input_variables`: A list of the names of the variables whose values
are required as inputs to the prompt.
- `optional_variables`: A list of the names of the variables for
placeholder or `MessagePlaceholder` that are optional.
These variables are auto inferred from the prompt and user need not
provide them.
partial_variables: A dictionary of the partial variables the prompt
template carries.
These variables are auto inferred from the prompt and user need not
provide them.
Partial variables populate the template so that you don't need to pass
them in every time you call the prompt.
validate_template: Whether to validate the template.
input_types: A dictionary of the types of the variables the prompt template
expects.
- `partial_variables`: A dictionary of the partial variables the prompt
template carries.
Partial variables populate the template so that you don't need to
pass them in every time you call the prompt.
- `validate_template`: Whether to validate the template.
- `input_types`: A dictionary of the types of the variables the prompt
template expects.
If not provided, all variables are assumed to be strings.
If not provided, all variables are assumed to be strings.
Examples:
Instantiation from a list of message templates:

View File

@@ -6,10 +6,10 @@ from abc import ABC, abstractmethod
from typing import TYPE_CHECKING, Any
from langchain_core.load import Serializable
from langchain_core.messages import BaseMessage
from langchain_core.utils.interactive_env import is_interactive_env
if TYPE_CHECKING:
from langchain_core.messages import BaseMessage
from langchain_core.prompts.chat import ChatPromptTemplate

View File

@@ -4,8 +4,9 @@ from __future__ import annotations
import warnings
from abc import ABC
from collections.abc import Callable, Sequence
from string import Formatter
from typing import TYPE_CHECKING, Any, Literal
from typing import Any, Literal
from pydantic import BaseModel, create_model
@@ -15,11 +16,8 @@ from langchain_core.utils import get_colored_text, mustache
from langchain_core.utils.formatting import formatter
from langchain_core.utils.interactive_env import is_interactive_env
if TYPE_CHECKING:
from collections.abc import Callable, Sequence
try:
from jinja2 import meta
from jinja2 import Environment, meta
from jinja2.sandbox import SandboxedEnvironment
_HAS_JINJA2 = True
@@ -61,9 +59,13 @@ def jinja2_formatter(template: str, /, **kwargs: Any) -> str:
)
raise ImportError(msg)
# Use a restricted sandbox that blocks ALL attribute/method access
# Only simple variable lookups like {{variable}} are allowed
# Attribute access like {{variable.attr}} or {{variable.method()}} is blocked
# This uses a sandboxed environment to prevent arbitrary code execution.
# Jinja2 uses an opt-out rather than opt-in approach for sand-boxing.
# Please treat this sand-boxing as a best-effort approach rather than
# a guarantee of security.
# We recommend to never use jinja2 templates with untrusted inputs.
# https://jinja.palletsprojects.com/en/3.1.x/sandbox/
# approach not a guarantee of security.
return SandboxedEnvironment().from_string(template).render(**kwargs)
@@ -99,7 +101,7 @@ def _get_jinja2_variables_from_template(template: str) -> set[str]:
"Please install it with `pip install jinja2`."
)
raise ImportError(msg)
env = SandboxedEnvironment()
env = Environment() # noqa: S701
ast = env.parse(template)
return meta.find_undeclared_variables(ast)
@@ -269,30 +271,6 @@ def get_template_variables(template: str, template_format: str) -> list[str]:
msg = f"Unsupported template format: {template_format}"
raise ValueError(msg)
# For f-strings, block attribute access and indexing syntax
# This prevents template injection attacks via accessing dangerous attributes
if template_format == "f-string":
for var in input_variables:
# Formatter().parse() returns field names with dots/brackets if present
# e.g., "obj.attr" or "obj[0]" - we need to block these
if "." in var or "[" in var or "]" in var:
msg = (
f"Invalid variable name {var!r} in f-string template. "
f"Variable names cannot contain attribute "
f"access (.) or indexing ([])."
)
raise ValueError(msg)
# Block variable names that are all digits (e.g., "0", "100")
# These are interpreted as positional arguments, not keyword arguments
if var.isdigit():
msg = (
f"Invalid variable name {var!r} in f-string template. "
f"Variable names cannot be all digits as they are interpreted "
f"as positional arguments."
)
raise ValueError(msg)
return sorted(input_variables)

View File

@@ -48,17 +48,8 @@ class StructuredPrompt(ChatPromptTemplate):
schema_: schema for the structured prompt.
structured_output_kwargs: additional kwargs for structured output.
template_format: template format for the prompt.
Raises:
ValueError: if schema is not provided.
"""
schema_ = schema_ or kwargs.pop("schema", None)
if not schema_:
err_msg = (
"Must pass in a non-empty structured output schema. Received: "
f"{schema_}"
)
raise ValueError(err_msg)
schema_ = schema_ or kwargs.pop("schema")
structured_output_kwargs = structured_output_kwargs or {}
for k in set(kwargs).difference(get_pydantic_field_names(self.__class__)):
structured_output_kwargs[k] = kwargs.pop(k)

View File

@@ -94,7 +94,7 @@ from langchain_core.tracers.root_listeners import (
AsyncRootListenersTracer,
RootListenersTracer,
)
from langchain_core.utils.aiter import aclosing, atee
from langchain_core.utils.aiter import aclosing, atee, py_anext
from langchain_core.utils.iter import safetee
from langchain_core.utils.pydantic import create_model_v2
@@ -127,10 +127,10 @@ class Runnable(ABC, Generic[Input, Output]):
Key Methods
===========
- `invoke`/`ainvoke`: Transforms a single input into an output.
- `batch`/`abatch`: Efficiently transforms multiple inputs into outputs.
- `stream`/`astream`: Streams output from a single input as it's produced.
- `astream_log`: Streams output and selected intermediate results from an
- **`invoke`/`ainvoke`**: Transforms a single input into an output.
- **`batch`/`abatch`**: Efficiently transforms multiple inputs into outputs.
- **`stream`/`astream`**: Streams output from a single input as it's produced.
- **`astream_log`**: Streams output and selected intermediate results from an
input.
Built-in optimizations:
@@ -707,53 +707,51 @@ class Runnable(ABC, Generic[Input, Output]):
def pick(self, keys: str | list[str]) -> RunnableSerializable[Any, Any]:
"""Pick keys from the output `dict` of this `Runnable`.
!!! example "Pick a single key"
Pick a single key:
```python
import json
```python
import json
from langchain_core.runnables import RunnableLambda, RunnableMap
from langchain_core.runnables import RunnableLambda, RunnableMap
as_str = RunnableLambda(str)
as_json = RunnableLambda(json.loads)
chain = RunnableMap(str=as_str, json=as_json)
as_str = RunnableLambda(str)
as_json = RunnableLambda(json.loads)
chain = RunnableMap(str=as_str, json=as_json)
chain.invoke("[1, 2, 3]")
# -> {"str": "[1, 2, 3]", "json": [1, 2, 3]}
chain.invoke("[1, 2, 3]")
# -> {"str": "[1, 2, 3]", "json": [1, 2, 3]}
json_only_chain = chain.pick("json")
json_only_chain.invoke("[1, 2, 3]")
# -> [1, 2, 3]
```
json_only_chain = chain.pick("json")
json_only_chain.invoke("[1, 2, 3]")
# -> [1, 2, 3]
```
!!! example "Pick a list of keys"
Pick a list of keys:
```python
from typing import Any
```python
from typing import Any
import json
import json
from langchain_core.runnables import RunnableLambda, RunnableMap
from langchain_core.runnables import RunnableLambda, RunnableMap
as_str = RunnableLambda(str)
as_json = RunnableLambda(json.loads)
as_str = RunnableLambda(str)
as_json = RunnableLambda(json.loads)
def as_bytes(x: Any) -> bytes:
return bytes(x, "utf-8")
def as_bytes(x: Any) -> bytes:
return bytes(x, "utf-8")
chain = RunnableMap(
str=as_str, json=as_json, bytes=RunnableLambda(as_bytes)
)
chain = RunnableMap(str=as_str, json=as_json, bytes=RunnableLambda(as_bytes))
chain.invoke("[1, 2, 3]")
# -> {"str": "[1, 2, 3]", "json": [1, 2, 3], "bytes": b"[1, 2, 3]"}
chain.invoke("[1, 2, 3]")
# -> {"str": "[1, 2, 3]", "json": [1, 2, 3], "bytes": b"[1, 2, 3]"}
json_and_bytes_chain = chain.pick(["json", "bytes"])
json_and_bytes_chain.invoke("[1, 2, 3]")
# -> {"json": [1, 2, 3], "bytes": b"[1, 2, 3]"}
```
json_and_bytes_chain = chain.pick(["json", "bytes"])
json_and_bytes_chain.invoke("[1, 2, 3]")
# -> {"json": [1, 2, 3], "bytes": b"[1, 2, 3]"}
```
Args:
keys: A key or list of keys to pick from the output dict.
@@ -1374,50 +1372,48 @@ class Runnable(ABC, Generic[Input, Output]):
).with_config({"run_name": "my_template", "tags": ["my_template"]})
```
!!! example
For instance:
```python
from langchain_core.runnables import RunnableLambda
```python
from langchain_core.runnables import RunnableLambda
async def reverse(s: str) -> str:
return s[::-1]
async def reverse(s: str) -> str:
return s[::-1]
chain = RunnableLambda(func=reverse)
chain = RunnableLambda(func=reverse)
events = [
event async for event in chain.astream_events("hello", version="v2")
]
events = [event async for event in chain.astream_events("hello", version="v2")]
# Will produce the following events
# (run_id, and parent_ids has been omitted for brevity):
[
{
"data": {"input": "hello"},
"event": "on_chain_start",
"metadata": {},
"name": "reverse",
"tags": [],
},
{
"data": {"chunk": "olleh"},
"event": "on_chain_stream",
"metadata": {},
"name": "reverse",
"tags": [],
},
{
"data": {"output": "olleh"},
"event": "on_chain_end",
"metadata": {},
"name": "reverse",
"tags": [],
},
]
```
# Will produce the following events
# (run_id, and parent_ids has been omitted for brevity):
[
{
"data": {"input": "hello"},
"event": "on_chain_start",
"metadata": {},
"name": "reverse",
"tags": [],
},
{
"data": {"chunk": "olleh"},
"event": "on_chain_stream",
"metadata": {},
"name": "reverse",
"tags": [],
},
{
"data": {"output": "olleh"},
"event": "on_chain_end",
"metadata": {},
"name": "reverse",
"tags": [],
},
]
```
```python title="Dispatch custom event"
```python title="Example: Dispatch Custom Event"
from langchain_core.callbacks.manager import (
adispatch_custom_event,
)
@@ -1451,13 +1447,10 @@ class Runnable(ABC, Generic[Input, Output]):
Args:
input: The input to the `Runnable`.
config: The config to use for the `Runnable`.
version: The version of the schema to use, either `'v2'` or `'v1'`.
version: The version of the schema to use either `'v2'` or `'v1'`.
Users should use `'v2'`.
`'v1'` is for backwards compatibility and will be deprecated
in `0.4.0`.
No default will be assigned until the API is stabilized.
custom events will only be surfaced in `'v2'`.
include_names: Only include events from `Runnable` objects with matching names.
@@ -1467,7 +1460,6 @@ class Runnable(ABC, Generic[Input, Output]):
exclude_types: Exclude events from `Runnable` objects with matching types.
exclude_tags: Exclude events from `Runnable` objects with matching tags.
**kwargs: Additional keyword arguments to pass to the `Runnable`.
These will be passed to `astream_log` as this implementation
of `astream_events` is built on top of `astream_log`.
@@ -2277,9 +2269,6 @@ class Runnable(ABC, Generic[Input, Output]):
Use this to implement `stream` or `transform` in `Runnable` subclasses.
"""
# Extract defers_inputs from kwargs if present
defers_inputs = kwargs.pop("defers_inputs", False)
# tee the input so we can iterate over it twice
input_for_tracing, input_for_transform = tee(inputs, 2)
# Start the input iterator to ensure the input Runnable starts before this one
@@ -2296,7 +2285,6 @@ class Runnable(ABC, Generic[Input, Output]):
run_type=run_type,
name=config.get("run_name") or self.get_name(),
run_id=config.pop("run_id", None),
defers_inputs=defers_inputs,
)
try:
child_config = patch_config(config, callbacks=run_manager.get_child())
@@ -2378,13 +2366,10 @@ class Runnable(ABC, Generic[Input, Output]):
Use this to implement `astream` or `atransform` in `Runnable` subclasses.
"""
# Extract defers_inputs from kwargs if present
defers_inputs = kwargs.pop("defers_inputs", False)
# tee the input so we can iterate over it twice
input_for_tracing, input_for_transform = atee(inputs, 2)
# Start the input iterator to ensure the input Runnable starts before this one
final_input: Input | None = await anext(input_for_tracing, None)
final_input: Input | None = await py_anext(input_for_tracing, None)
final_input_supported = True
final_output: Output | None = None
final_output_supported = True
@@ -2397,7 +2382,6 @@ class Runnable(ABC, Generic[Input, Output]):
run_type=run_type,
name=config.get("run_name") or self.get_name(),
run_id=config.pop("run_id", None),
defers_inputs=defers_inputs,
)
try:
child_config = patch_config(config, callbacks=run_manager.get_child())
@@ -2425,7 +2409,7 @@ class Runnable(ABC, Generic[Input, Output]):
iterator = iterator_
try:
while True:
chunk = await coro_with_context(anext(iterator), context)
chunk = await coro_with_context(py_anext(iterator), context)
yield chunk
if final_output_supported:
if final_output is None:
@@ -2492,82 +2476,82 @@ class Runnable(ABC, Generic[Input, Output]):
Returns:
A `BaseTool` instance.
!!! example "`TypedDict` input"
Typed dict input:
```python
from typing_extensions import TypedDict
from langchain_core.runnables import RunnableLambda
```python
from typing_extensions import TypedDict
from langchain_core.runnables import RunnableLambda
class Args(TypedDict):
a: int
b: list[int]
class Args(TypedDict):
a: int
b: list[int]
def f(x: Args) -> str:
return str(x["a"] * max(x["b"]))
def f(x: Args) -> str:
return str(x["a"] * max(x["b"]))
runnable = RunnableLambda(f)
as_tool = runnable.as_tool()
as_tool.invoke({"a": 3, "b": [1, 2]})
```
runnable = RunnableLambda(f)
as_tool = runnable.as_tool()
as_tool.invoke({"a": 3, "b": [1, 2]})
```
!!! example "`dict` input, specifying schema via `args_schema`"
`dict` input, specifying schema via `args_schema`:
```python
from typing import Any
from pydantic import BaseModel, Field
from langchain_core.runnables import RunnableLambda
```python
from typing import Any
from pydantic import BaseModel, Field
from langchain_core.runnables import RunnableLambda
def f(x: dict[str, Any]) -> str:
return str(x["a"] * max(x["b"]))
def f(x: dict[str, Any]) -> str:
return str(x["a"] * max(x["b"]))
class FSchema(BaseModel):
\"\"\"Apply a function to an integer and list of integers.\"\"\"
class FSchema(BaseModel):
\"\"\"Apply a function to an integer and list of integers.\"\"\"
a: int = Field(..., description="Integer")
b: list[int] = Field(..., description="List of ints")
a: int = Field(..., description="Integer")
b: list[int] = Field(..., description="List of ints")
runnable = RunnableLambda(f)
as_tool = runnable.as_tool(FSchema)
as_tool.invoke({"a": 3, "b": [1, 2]})
```
runnable = RunnableLambda(f)
as_tool = runnable.as_tool(FSchema)
as_tool.invoke({"a": 3, "b": [1, 2]})
```
!!! example "`dict` input, specifying schema via `arg_types`"
`dict` input, specifying schema via `arg_types`:
```python
from typing import Any
from langchain_core.runnables import RunnableLambda
```python
from typing import Any
from langchain_core.runnables import RunnableLambda
def f(x: dict[str, Any]) -> str:
return str(x["a"] * max(x["b"]))
def f(x: dict[str, Any]) -> str:
return str(x["a"] * max(x["b"]))
runnable = RunnableLambda(f)
as_tool = runnable.as_tool(arg_types={"a": int, "b": list[int]})
as_tool.invoke({"a": 3, "b": [1, 2]})
```
runnable = RunnableLambda(f)
as_tool = runnable.as_tool(arg_types={"a": int, "b": list[int]})
as_tool.invoke({"a": 3, "b": [1, 2]})
```
!!! example "`str` input"
`str` input:
```python
from langchain_core.runnables import RunnableLambda
```python
from langchain_core.runnables import RunnableLambda
def f(x: str) -> str:
return x + "a"
def f(x: str) -> str:
return x + "a"
def g(x: str) -> str:
return x + "z"
def g(x: str) -> str:
return x + "z"
runnable = RunnableLambda(f) | g
as_tool = runnable.as_tool()
as_tool.invoke("b")
```
runnable = RunnableLambda(f) | g
as_tool = runnable.as_tool()
as_tool.invoke("b")
```
"""
# Avoid circular import
from langchain_core.tools import convert_runnable_to_tool # noqa: PLC0415
@@ -2619,33 +2603,29 @@ class RunnableSerializable(Serializable, Runnable[Input, Output]):
Returns:
A new `Runnable` with the fields configured.
!!! example
```python
from langchain_core.runnables import ConfigurableField
from langchain_openai import ChatOpenAI
```python
from langchain_core.runnables import ConfigurableField
from langchain_openai import ChatOpenAI
model = ChatOpenAI(max_tokens=20).configurable_fields(
max_tokens=ConfigurableField(
id="output_token_number",
name="Max tokens in the output",
description="The maximum number of tokens in the output",
)
model = ChatOpenAI(max_tokens=20).configurable_fields(
max_tokens=ConfigurableField(
id="output_token_number",
name="Max tokens in the output",
description="The maximum number of tokens in the output",
)
)
# max_tokens = 20
print(
"max_tokens_20: ", model.invoke("tell me something about chess").content
)
# max_tokens = 20
print("max_tokens_20: ", model.invoke("tell me something about chess").content)
# max_tokens = 200
print(
"max_tokens_200: ",
model.with_config(configurable={"output_token_number": 200})
.invoke("tell me something about chess")
.content,
)
```
# max_tokens = 200
print(
"max_tokens_200: ",
model.with_config(configurable={"output_token_number": 200})
.invoke("tell me something about chess")
.content,
)
```
"""
# Import locally to prevent circular import
from langchain_core.runnables.configurable import ( # noqa: PLC0415
@@ -2684,31 +2664,29 @@ class RunnableSerializable(Serializable, Runnable[Input, Output]):
Returns:
A new `Runnable` with the alternatives configured.
!!! example
```python
from langchain_anthropic import ChatAnthropic
from langchain_core.runnables.utils import ConfigurableField
from langchain_openai import ChatOpenAI
```python
from langchain_anthropic import ChatAnthropic
from langchain_core.runnables.utils import ConfigurableField
from langchain_openai import ChatOpenAI
model = ChatAnthropic(
model_name="claude-sonnet-4-5-20250929"
).configurable_alternatives(
ConfigurableField(id="llm"),
default_key="anthropic",
openai=ChatOpenAI(),
)
model = ChatAnthropic(
model_name="claude-sonnet-4-5-20250929"
).configurable_alternatives(
ConfigurableField(id="llm"),
default_key="anthropic",
openai=ChatOpenAI(),
)
# uses the default model ChatAnthropic
print(model.invoke("which organization created you?").content)
# uses the default model ChatAnthropic
print(model.invoke("which organization created you?").content)
# uses ChatOpenAI
print(
model.with_config(configurable={"llm": "openai"})
.invoke("which organization created you?")
.content
)
```
# uses ChatOpenAI
print(
model.with_config(configurable={"llm": "openai"})
.invoke("which organization created you?")
.content
)
```
"""
# Import locally to prevent circular import
from langchain_core.runnables.configurable import ( # noqa: PLC0415
@@ -4033,7 +4011,7 @@ class RunnableParallel(RunnableSerializable[Input, dict[str, Any]]):
# Wrap in a coroutine to satisfy linter
async def get_next_chunk(generator: AsyncIterator) -> Output | None:
return await anext(generator)
return await py_anext(generator)
# Start the first iteration of each generator
tasks = {
@@ -4331,7 +4309,6 @@ class RunnableGenerator(Runnable[Input, Output]):
input,
self._transform, # type: ignore[arg-type]
config,
defers_inputs=True,
**kwargs,
)
@@ -4365,7 +4342,7 @@ class RunnableGenerator(Runnable[Input, Output]):
raise NotImplementedError(msg)
return self._atransform_stream_with_config(
input, self._atransform, config, defers_inputs=True, **kwargs
input, self._atransform, config, **kwargs
)
@override

View File

@@ -303,7 +303,7 @@ class RunnableBranch(RunnableSerializable[Input, Output]):
Args:
input: The input to the `Runnable`.
config: The configuration for the `Runnable`.
config: The configuration for the Runna`ble.
**kwargs: Additional keyword arguments to pass to the `Runnable`.
Yields:

View File

@@ -47,59 +47,54 @@ class EmptyDict(TypedDict, total=False):
class RunnableConfig(TypedDict, total=False):
"""Configuration for a `Runnable`.
See the [reference docs](https://reference.langchain.com/python/langchain_core/runnables/#langchain_core.runnables.RunnableConfig)
for more details.
"""
"""Configuration for a Runnable."""
tags: list[str]
"""Tags for this call and any sub-calls (e.g. a Chain calling an LLM).
"""
Tags for this call and any sub-calls (eg. a Chain calling an LLM).
You can use these to filter calls.
"""
metadata: dict[str, Any]
"""Metadata for this call and any sub-calls (e.g. a Chain calling an LLM).
"""
Metadata for this call and any sub-calls (eg. a Chain calling an LLM).
Keys should be strings, values should be JSON-serializable.
"""
callbacks: Callbacks
"""Callbacks for this call and any sub-calls (e.g. a Chain calling an LLM).
"""
Callbacks for this call and any sub-calls (eg. a Chain calling an LLM).
Tags are passed to all callbacks, metadata is passed to handle*Start callbacks.
"""
run_name: str
"""Name for the tracer run for this call.
Defaults to the name of the class."""
"""
Name for the tracer run for this call. Defaults to the name of the class.
"""
max_concurrency: int | None
"""Maximum number of parallel calls to make.
If not provided, defaults to `ThreadPoolExecutor`'s default.
"""
Maximum number of parallel calls to make. If not provided, defaults to
`ThreadPoolExecutor`'s default.
"""
recursion_limit: int
"""Maximum number of times a call can recurse.
If not provided, defaults to `25`.
"""
Maximum number of times a call can recurse. If not provided, defaults to `25`.
"""
configurable: dict[str, Any]
"""Runtime values for attributes previously made configurable on this `Runnable`,
"""
Runtime values for attributes previously made configurable on this `Runnable`,
or sub-Runnables, through `configurable_fields` or `configurable_alternatives`.
Check `output_schema` for a description of the attributes that have been made
configurable.
"""
run_id: uuid.UUID | None
"""Unique identifier for the tracer run for this call.
If not provided, a new UUID will be generated.
"""
Unique identifier for the tracer run for this call. If not provided, a new UUID
will be generated.
"""

View File

@@ -28,6 +28,7 @@ from langchain_core.runnables.utils import (
coro_with_context,
get_unique_config_specs,
)
from langchain_core.utils.aiter import py_anext
if TYPE_CHECKING:
from langchain_core.callbacks.manager import AsyncCallbackManagerForChainRun
@@ -562,7 +563,7 @@ class RunnableWithFallbacks(RunnableSerializable[Input, Output]):
child_config,
**kwargs,
)
chunk = await coro_with_context(anext(stream), context)
chunk = await coro_with_context(py_anext(stream), context)
except self.exceptions_to_handle as e:
first_error = e if first_error is None else first_error
last_error = e

View File

@@ -4,6 +4,7 @@ from __future__ import annotations
import inspect
from collections import defaultdict
from collections.abc import Callable
from dataclasses import dataclass, field
from enum import Enum
from typing import (
@@ -21,7 +22,7 @@ from langchain_core.runnables.base import Runnable, RunnableSerializable
from langchain_core.utils.pydantic import _IgnoreUnserializable, is_basemodel_subclass
if TYPE_CHECKING:
from collections.abc import Callable, Sequence
from collections.abc import Sequence
from pydantic import BaseModel
@@ -641,7 +642,6 @@ class Graph:
retry_delay: float = 1.0,
frontmatter_config: dict[str, Any] | None = None,
base_url: str | None = None,
proxies: dict[str, str] | None = None,
) -> bytes:
"""Draw the graph as a PNG image using Mermaid.
@@ -674,10 +674,11 @@ class Graph:
}
```
base_url: The base URL of the Mermaid server for rendering via API.
proxies: HTTP/HTTPS proxies for requests (e.g. `{"http": "http://127.0.0.1:7890"}`).
Returns:
The PNG image as bytes.
"""
# Import locally to prevent circular import
from langchain_core.runnables.graph_mermaid import ( # noqa: PLC0415
@@ -698,7 +699,6 @@ class Graph:
padding=padding,
max_retries=max_retries,
retry_delay=retry_delay,
proxies=proxies,
base_url=base_url,
)

View File

@@ -7,6 +7,7 @@ from __future__ import annotations
import math
import os
from collections.abc import Mapping, Sequence
from typing import TYPE_CHECKING, Any
try:
@@ -19,8 +20,6 @@ except ImportError:
_HAS_GRANDALF = False
if TYPE_CHECKING:
from collections.abc import Mapping, Sequence
from langchain_core.runnables.graph import Edge as LangEdge
@@ -165,9 +164,6 @@ class AsciiCanvas:
y0: y coordinate of the box corner.
width: box width.
height: box height.
Raises:
ValueError: if box dimensions are invalid.
"""
if width <= 1 or height <= 1:
msg = "Box dimensions should be > 1"

View File

@@ -81,7 +81,6 @@ def draw_mermaid(
}
}
```
Returns:
Mermaid graph syntax.
@@ -282,7 +281,6 @@ def draw_mermaid_png(
max_retries: int = 1,
retry_delay: float = 1.0,
base_url: str | None = None,
proxies: dict[str, str] | None = None,
) -> bytes:
"""Draws a Mermaid graph as PNG using provided syntax.
@@ -295,7 +293,6 @@ def draw_mermaid_png(
max_retries: Maximum number of retries (MermaidDrawMethod.API).
retry_delay: Delay between retries (MermaidDrawMethod.API).
base_url: Base URL for the Mermaid.ink API.
proxies: HTTP/HTTPS proxies for requests (e.g. `{"http": "http://127.0.0.1:7890"}`).
Returns:
PNG image bytes.
@@ -317,7 +314,6 @@ def draw_mermaid_png(
max_retries=max_retries,
retry_delay=retry_delay,
base_url=base_url,
proxies=proxies,
)
else:
supported_methods = ", ".join([m.value for m in MermaidDrawMethod])
@@ -409,7 +405,6 @@ def _render_mermaid_using_api(
file_type: Literal["jpeg", "png", "webp"] | None = "png",
max_retries: int = 1,
retry_delay: float = 1.0,
proxies: dict[str, str] | None = None,
base_url: str | None = None,
) -> bytes:
"""Renders Mermaid graph using the Mermaid.INK API."""
@@ -450,7 +445,7 @@ def _render_mermaid_using_api(
for attempt in range(max_retries + 1):
try:
response = requests.get(image_url, timeout=10, proxies=proxies)
response = requests.get(image_url, timeout=10)
if response.status_code == requests.codes.ok:
img_bytes = response.content
if output_file_path is not None:

View File

@@ -201,8 +201,7 @@ class PngDrawer:
viz, start, end, str(data) if data is not None else None, cond
)
@staticmethod
def update_styles(viz: Any, graph: Graph) -> None:
def update_styles(self, viz: Any, graph: Graph) -> None:
"""Update the styles of the entrypoint and END nodes.
Args:

View File

@@ -539,7 +539,7 @@ class RunnableWithMessageHistory(RunnableBindingBase): # type: ignore[no-redef]
hist: BaseChatMessageHistory = config["configurable"]["message_history"]
# Get the input messages
inputs = load(run.inputs, allowed_objects="all")
inputs = load(run.inputs)
input_messages = self._get_input_messages(inputs)
# If historic messages were prepended to the input messages, remove them to
# avoid adding duplicate messages to history.
@@ -548,7 +548,7 @@ class RunnableWithMessageHistory(RunnableBindingBase): # type: ignore[no-redef]
input_messages = input_messages[len(historic_messages) :]
# Get the output messages
output_val = load(run.outputs, allowed_objects="all")
output_val = load(run.outputs)
output_messages = self._get_output_messages(output_val)
hist.add_messages(input_messages + output_messages)
@@ -556,7 +556,7 @@ class RunnableWithMessageHistory(RunnableBindingBase): # type: ignore[no-redef]
hist: BaseChatMessageHistory = config["configurable"]["message_history"]
# Get the input messages
inputs = load(run.inputs, allowed_objects="all")
inputs = load(run.inputs)
input_messages = self._get_input_messages(inputs)
# If historic messages were prepended to the input messages, remove them to
# avoid adding duplicate messages to history.
@@ -565,7 +565,7 @@ class RunnableWithMessageHistory(RunnableBindingBase): # type: ignore[no-redef]
input_messages = input_messages[len(historic_messages) :]
# Get the output messages
output_val = load(run.outputs, allowed_objects="all")
output_val = load(run.outputs)
output_messages = self._get_output_messages(output_val)
await hist.aadd_messages(input_messages + output_messages)

View File

@@ -33,7 +33,7 @@ from langchain_core.runnables.utils import (
AddableDict,
ConfigurableFieldSpec,
)
from langchain_core.utils.aiter import atee
from langchain_core.utils.aiter import atee, py_anext
from langchain_core.utils.iter import safetee
from langchain_core.utils.pydantic import create_model_v2
@@ -614,7 +614,7 @@ class RunnableAssign(RunnableSerializable[dict[str, Any], dict[str, Any]]):
)
# start map output stream
first_map_chunk_task: asyncio.Task = asyncio.create_task(
anext(map_output, None),
py_anext(map_output, None), # type: ignore[arg-type]
)
# consume passthrough stream
async for chunk in for_passthrough:
@@ -753,19 +753,25 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
return AddableDict(picked)
return None
def _invoke(
self,
value: dict[str, Any],
) -> dict[str, Any]:
return self._pick(value)
@override
def invoke(
self,
input: dict[str, Any],
config: RunnableConfig | None = None,
**kwargs: Any,
) -> Any:
return self._call_with_config(self._pick, input, config, **kwargs)
) -> dict[str, Any]:
return self._call_with_config(self._invoke, input, config, **kwargs)
async def _ainvoke(
self,
value: dict[str, Any],
) -> Any:
) -> dict[str, Any]:
return self._pick(value)
@override
@@ -774,13 +780,13 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
input: dict[str, Any],
config: RunnableConfig | None = None,
**kwargs: Any,
) -> Any:
) -> dict[str, Any]:
return await self._acall_with_config(self._ainvoke, input, config, **kwargs)
def _transform(
self,
chunks: Iterator[dict[str, Any]],
) -> Iterator[Any]:
) -> Iterator[dict[str, Any]]:
for chunk in chunks:
picked = self._pick(chunk)
if picked is not None:
@@ -792,7 +798,7 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
input: Iterator[dict[str, Any]],
config: RunnableConfig | None = None,
**kwargs: Any,
) -> Iterator[Any]:
) -> Iterator[dict[str, Any]]:
yield from self._transform_stream_with_config(
input, self._transform, config, **kwargs
)
@@ -800,7 +806,7 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
async def _atransform(
self,
chunks: AsyncIterator[dict[str, Any]],
) -> AsyncIterator[Any]:
) -> AsyncIterator[dict[str, Any]]:
async for chunk in chunks:
picked = self._pick(chunk)
if picked is not None:
@@ -812,7 +818,7 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
input: AsyncIterator[dict[str, Any]],
config: RunnableConfig | None = None,
**kwargs: Any,
) -> AsyncIterator[Any]:
) -> AsyncIterator[dict[str, Any]]:
async for chunk in self._atransform_stream_with_config(
input, self._atransform, config, **kwargs
):
@@ -824,7 +830,7 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
input: dict[str, Any],
config: RunnableConfig | None = None,
**kwargs: Any,
) -> Iterator[Any]:
) -> Iterator[dict[str, Any]]:
return self.transform(iter([input]), config, **kwargs)
@override
@@ -833,7 +839,7 @@ class RunnablePick(RunnableSerializable[dict[str, Any], Any]):
input: dict[str, Any],
config: RunnableConfig | None = None,
**kwargs: Any,
) -> AsyncIterator[Any]:
) -> AsyncIterator[dict[str, Any]]:
async def input_aiter() -> AsyncIterator[dict[str, Any]]:
yield input

View File

@@ -7,7 +7,8 @@ import asyncio
import inspect
import sys
import textwrap
from collections.abc import Mapping, Sequence
from collections.abc import Callable, Mapping, Sequence
from contextvars import Context
from functools import lru_cache
from inspect import signature
from itertools import groupby
@@ -30,11 +31,9 @@ if TYPE_CHECKING:
AsyncIterable,
AsyncIterator,
Awaitable,
Callable,
Coroutine,
Iterable,
)
from contextvars import Context
from langchain_core.runnables.schema import StreamEvent

View File

@@ -22,7 +22,6 @@ from typing import (
get_type_hints,
)
import typing_extensions
from pydantic import (
BaseModel,
ConfigDict,
@@ -32,7 +31,6 @@ from pydantic import (
ValidationError,
validate_arguments,
)
from pydantic.fields import FieldInfo
from pydantic.v1 import BaseModel as BaseModelV1
from pydantic.v1 import ValidationError as ValidationErrorV1
from pydantic.v1 import validate_arguments as validate_arguments_v1
@@ -96,14 +94,12 @@ def _is_annotated_type(typ: type[Any]) -> bool:
Returns:
`True` if the type is an Annotated type, `False` otherwise.
"""
return get_origin(typ) in {typing.Annotated, typing_extensions.Annotated}
return get_origin(typ) is typing.Annotated
def _get_annotation_description(arg_type: type) -> str | None:
"""Extract description from an Annotated type.
Checks for string annotations and `FieldInfo` objects with descriptions.
Args:
arg_type: The type to extract description from.
@@ -115,8 +111,6 @@ def _get_annotation_description(arg_type: type) -> str | None:
for annotation in annotated_args[1:]:
if isinstance(annotation, str):
return annotation
if isinstance(annotation, FieldInfo) and annotation.description:
return annotation.description
return None
@@ -392,8 +386,6 @@ class ToolException(Exception): # noqa: N818
ArgsSchema = TypeBaseModel | dict[str, Any]
_EMPTY_SET: frozenset[str] = frozenset()
class BaseTool(RunnableSerializable[str | dict | ToolCall, Any]):
"""Base class for all LangChain tools.
@@ -502,24 +494,6 @@ class ChildTool(BaseTool):
two-tuple corresponding to the `(content, artifact)` of a `ToolMessage`.
"""
extras: dict[str, Any] | None = None
"""Optional provider-specific extra fields for the tool.
This is used to pass provider-specific configuration that doesn't fit into
standard tool fields.
Example:
Anthropic-specific fields like [`cache_control`](https://docs.langchain.com/oss/python/integrations/chat/anthropic#prompt-caching),
[`defer_loading`](https://docs.langchain.com/oss/python/integrations/chat/anthropic#tool-search),
or `input_examples`.
```python
@tool(extras={"defer_loading": True, "cache_control": {"type": "ephemeral"}})
def my_tool(x: str) -> str:
return x
```
"""
def __init__(self, **kwargs: Any) -> None:
"""Initialize the tool.
@@ -595,11 +569,6 @@ class ChildTool(BaseTool):
self.name, full_schema, fields, fn_description=self.description
)
@functools.cached_property
def _injected_args_keys(self) -> frozenset[str]:
# base implementation doesn't manage injected args
return _EMPTY_SET
# --- Runnable ---
@override
@@ -659,7 +628,6 @@ class ChildTool(BaseTool):
TypeError: If `args_schema` is not a Pydantic `BaseModel` or dict.
"""
input_args = self.args_schema
if isinstance(tool_input, str):
if input_args is not None:
if isinstance(input_args, dict):
@@ -677,12 +645,10 @@ class ChildTool(BaseTool):
msg = f"args_schema must be a Pydantic BaseModel, got {input_args}"
raise TypeError(msg)
return tool_input
if input_args is not None:
if isinstance(input_args, dict):
return tool_input
if issubclass(input_args, BaseModel):
# Check args_schema for InjectedToolCallId
for k, v in get_all_basemodel_annotations(input_args).items():
if _is_injected_arg_type(v, injected_type=InjectedToolCallId):
if tool_call_id is None:
@@ -698,7 +664,6 @@ class ChildTool(BaseTool):
result = input_args.model_validate(tool_input)
result_dict = result.model_dump()
elif issubclass(input_args, BaseModelV1):
# Check args_schema for InjectedToolCallId
for k, v in get_all_basemodel_annotations(input_args).items():
if _is_injected_arg_type(v, injected_type=InjectedToolCallId):
if tool_call_id is None:
@@ -718,47 +683,9 @@ class ChildTool(BaseTool):
f"args_schema must be a Pydantic BaseModel, got {self.args_schema}"
)
raise NotImplementedError(msg)
# Include fields from tool_input, plus fields with explicit defaults.
# This applies Pydantic defaults (like Field(default=1)) while excluding
# synthetic "args"/"kwargs" fields that Pydantic creates for *args/**kwargs.
field_info = get_fields(input_args)
validated_input = {}
for k in result_dict:
if k in tool_input:
# Field was provided in input - include it (validated)
validated_input[k] = getattr(result, k)
elif k in field_info and k not in ("args", "kwargs"):
# Check if field has an explicit default defined in the schema.
# Exclude "args"/"kwargs" as these are synthetic fields for variadic
# parameters that should not be passed as keyword arguments.
fi = field_info[k]
# Pydantic v2 uses is_required() method, v1 uses required attribute
has_default = (
not fi.is_required()
if hasattr(fi, "is_required")
else not getattr(fi, "required", True)
)
if has_default:
validated_input[k] = getattr(result, k)
for k in self._injected_args_keys:
if k in tool_input:
validated_input[k] = tool_input[k]
elif k == "tool_call_id":
if tool_call_id is None:
msg = (
"When tool includes an InjectedToolCallId "
"argument, tool must always be invoked with a full "
"model ToolCall of the form: {'args': {...}, "
"'name': '...', 'type': 'tool_call', "
"'tool_call_id': '...'}"
)
raise ValueError(msg)
validated_input[k] = tool_call_id
return validated_input
return {
k: getattr(result, k) for k, v in result_dict.items() if k in tool_input
}
return tool_input
@abstractmethod
@@ -926,7 +853,6 @@ class ChildTool(BaseTool):
name=run_name,
run_id=run_id,
inputs=filtered_tool_input,
tool_call_id=tool_call_id,
**kwargs,
)
@@ -1054,7 +980,6 @@ class ChildTool(BaseTool):
name=run_name,
run_id=run_id,
inputs=filtered_tool_input,
tool_call_id=tool_call_id,
**kwargs,
)
content = None

View File

@@ -23,7 +23,6 @@ def tool(
response_format: Literal["content", "content_and_artifact"] = "content",
parse_docstring: bool = False,
error_on_invalid_docstring: bool = True,
extras: dict[str, Any] | None = None,
) -> Callable[[Callable | Runnable], BaseTool]: ...
@@ -39,7 +38,6 @@ def tool(
response_format: Literal["content", "content_and_artifact"] = "content",
parse_docstring: bool = False,
error_on_invalid_docstring: bool = True,
extras: dict[str, Any] | None = None,
) -> BaseTool: ...
@@ -54,7 +52,6 @@ def tool(
response_format: Literal["content", "content_and_artifact"] = "content",
parse_docstring: bool = False,
error_on_invalid_docstring: bool = True,
extras: dict[str, Any] | None = None,
) -> BaseTool: ...
@@ -69,7 +66,6 @@ def tool(
response_format: Literal["content", "content_and_artifact"] = "content",
parse_docstring: bool = False,
error_on_invalid_docstring: bool = True,
extras: dict[str, Any] | None = None,
) -> Callable[[Callable | Runnable], BaseTool]: ...
@@ -84,7 +80,6 @@ def tool(
response_format: Literal["content", "content_and_artifact"] = "content",
parse_docstring: bool = False,
error_on_invalid_docstring: bool = True,
extras: dict[str, Any] | None = None,
) -> BaseTool | Callable[[Callable | Runnable], BaseTool]:
"""Convert Python functions and `Runnables` to LangChain tools.
@@ -135,15 +130,6 @@ def tool(
parse parameter descriptions from Google Style function docstrings.
error_on_invalid_docstring: If `parse_docstring` is provided, configure
whether to raise `ValueError` on invalid Google Style docstrings.
extras: Optional provider-specific extra fields for the tool.
Used to pass configuration that doesn't fit into standard tool fields.
Chat models should process known extras when constructing model payloads.
!!! example
For example, Anthropic-specific fields like `cache_control`,
`defer_loading`, or `input_examples`.
Raises:
ValueError: If too many positional arguments are provided (e.g. violating the
@@ -306,7 +292,6 @@ def tool(
response_format=response_format,
parse_docstring=parse_docstring,
error_on_invalid_docstring=error_on_invalid_docstring,
extras=extras,
)
# If someone doesn't want a schema applied, we must treat it as
# a simple string->string function
@@ -323,7 +308,6 @@ def tool(
return_direct=return_direct,
coroutine=coroutine,
response_format=response_format,
extras=extras,
)
return _tool_factory

View File

@@ -2,21 +2,22 @@
from __future__ import annotations
from functools import partial
from typing import TYPE_CHECKING, Literal
from pydantic import BaseModel, Field
from langchain_core.callbacks import Callbacks
from langchain_core.documents import Document
from langchain_core.prompts import (
BasePromptTemplate,
PromptTemplate,
aformat_document,
format_document,
)
from langchain_core.tools.structured import StructuredTool
from langchain_core.tools.simple import Tool
if TYPE_CHECKING:
from langchain_core.callbacks import Callbacks
from langchain_core.documents import Document
from langchain_core.retrievers import BaseRetriever
@@ -26,6 +27,43 @@ class RetrieverInput(BaseModel):
query: str = Field(description="query to look up in retriever")
def _get_relevant_documents(
query: str,
retriever: BaseRetriever,
document_prompt: BasePromptTemplate,
document_separator: str,
callbacks: Callbacks = None,
response_format: Literal["content", "content_and_artifact"] = "content",
) -> str | tuple[str, list[Document]]:
docs = retriever.invoke(query, config={"callbacks": callbacks})
content = document_separator.join(
format_document(doc, document_prompt) for doc in docs
)
if response_format == "content_and_artifact":
return (content, docs)
return content
async def _aget_relevant_documents(
query: str,
retriever: BaseRetriever,
document_prompt: BasePromptTemplate,
document_separator: str,
callbacks: Callbacks = None,
response_format: Literal["content", "content_and_artifact"] = "content",
) -> str | tuple[str, list[Document]]:
docs = await retriever.ainvoke(query, config={"callbacks": callbacks})
content = document_separator.join(
[await aformat_document(doc, document_prompt) for doc in docs]
)
if response_format == "content_and_artifact":
return (content, docs)
return content
def create_retriever_tool(
retriever: BaseRetriever,
name: str,
@@ -34,7 +72,7 @@ def create_retriever_tool(
document_prompt: BasePromptTemplate | None = None,
document_separator: str = "\n\n",
response_format: Literal["content", "content_and_artifact"] = "content",
) -> StructuredTool:
) -> Tool:
r"""Create a tool to do retrieval of documents.
Args:
@@ -55,31 +93,22 @@ def create_retriever_tool(
Returns:
Tool class to pass to an agent.
"""
document_prompt_ = document_prompt or PromptTemplate.from_template("{page_content}")
def func(
query: str, callbacks: Callbacks = None
) -> str | tuple[str, list[Document]]:
docs = retriever.invoke(query, config={"callbacks": callbacks})
content = document_separator.join(
format_document(doc, document_prompt_) for doc in docs
)
if response_format == "content_and_artifact":
return (content, docs)
return content
async def afunc(
query: str, callbacks: Callbacks = None
) -> str | tuple[str, list[Document]]:
docs = await retriever.ainvoke(query, config={"callbacks": callbacks})
content = document_separator.join(
[await aformat_document(doc, document_prompt_) for doc in docs]
)
if response_format == "content_and_artifact":
return (content, docs)
return content
return StructuredTool(
document_prompt = document_prompt or PromptTemplate.from_template("{page_content}")
func = partial(
_get_relevant_documents,
retriever=retriever,
document_prompt=document_prompt,
document_separator=document_separator,
response_format=response_format,
)
afunc = partial(
_aget_relevant_documents,
retriever=retriever,
document_prompt=document_prompt,
document_separator=document_separator,
response_format=response_format,
)
return Tool(
name=name,
description=description,
func=func,

View File

@@ -2,7 +2,6 @@
from __future__ import annotations
import functools
import textwrap
from collections.abc import Awaitable, Callable
from inspect import signature
@@ -22,12 +21,10 @@ from langchain_core.callbacks import (
)
from langchain_core.runnables import RunnableConfig, run_in_executor
from langchain_core.tools.base import (
_EMPTY_SET,
FILTERED_ARGS,
ArgsSchema,
BaseTool,
_get_runnable_config_param,
_is_injected_arg_type,
create_schema_from_function,
)
from langchain_core.utils.pydantic import is_basemodel_subclass
@@ -244,17 +241,6 @@ class StructuredTool(BaseTool):
**kwargs,
)
@functools.cached_property
def _injected_args_keys(self) -> frozenset[str]:
fn = self.func or self.coroutine
if fn is None:
return _EMPTY_SET
return frozenset(
k
for k, v in signature(fn).parameters.items()
if _is_injected_arg_type(v.annotation)
)
def _filter_schema_args(func: Callable) -> list[str]:
filter_args = list(FILTERED_ARGS)

View File

@@ -15,6 +15,12 @@ from typing import (
from langchain_core.exceptions import TracerException
from langchain_core.load import dumpd
from langchain_core.outputs import (
ChatGeneration,
ChatGenerationChunk,
GenerationChunk,
LLMResult,
)
from langchain_core.tracers.schemas import Run
if TYPE_CHECKING:
@@ -25,12 +31,6 @@ if TYPE_CHECKING:
from langchain_core.documents import Document
from langchain_core.messages import BaseMessage
from langchain_core.outputs import (
ChatGeneration,
ChatGenerationChunk,
GenerationChunk,
LLMResult,
)
logger = logging.getLogger(__name__)
@@ -284,16 +284,6 @@ class _TracerCore(ABC):
llm_run.end_time = datetime.now(timezone.utc)
llm_run.events.append({"name": "end", "time": llm_run.end_time})
tool_call_count = 0
for generations in response.generations:
for generation in generations:
if hasattr(generation, "message"):
msg = generation.message
if hasattr(msg, "tool_calls") and msg.tool_calls:
tool_call_count += len(msg.tool_calls)
if tool_call_count > 0:
llm_run.extra["tool_call_count"] = tool_call_count
return llm_run
def _errored_llm_run(

View File

@@ -154,8 +154,8 @@ class EvaluatorCallbackHandler(BaseTracer):
res
)
@staticmethod
def _select_eval_results(
self,
results: EvaluationResult | EvaluationResults,
) -> list[EvaluationResult]:
if isinstance(results, EvaluationResult):

View File

@@ -12,7 +12,7 @@ from typing import (
TypeVar,
cast,
)
from uuid import UUID
from uuid import UUID, uuid4
from typing_extensions import NotRequired, override
@@ -42,8 +42,7 @@ from langchain_core.tracers.log_stream import (
_astream_log_implementation,
)
from langchain_core.tracers.memory_stream import _MemoryStream
from langchain_core.utils.aiter import aclosing
from langchain_core.utils.uuid import uuid7
from langchain_core.utils.aiter import aclosing, py_anext
if TYPE_CHECKING:
from collections.abc import AsyncIterator, Iterator, Sequence
@@ -189,7 +188,7 @@ class _AstreamEventsCallbackHandler(AsyncCallbackHandler, _StreamingCallbackHand
# atomic check and set
tap = self.is_tapped.setdefault(run_id, sentinel)
# wait for first chunk
first = await anext(output, sentinel)
first = await py_anext(output, default=sentinel)
if first is sentinel:
return
# get run info
@@ -426,10 +425,6 @@ class _AstreamEventsCallbackHandler(AsyncCallbackHandler, _StreamingCallbackHand
"""Run on new output token. Only available when streaming is enabled.
For both chat models and non-chat models (legacy LLMs).
Raises:
ValueError: If the run type is not `llm` or `chat_model`.
AssertionError: If the run ID is not found in the run map.
"""
run_info = self.run_map.get(run_id)
chunk_: GenerationChunk | BaseMessageChunk
@@ -711,7 +706,11 @@ class _AstreamEventsCallbackHandler(AsyncCallbackHandler, _StreamingCallbackHand
@override
async def on_tool_end(self, output: Any, *, run_id: UUID, **kwargs: Any) -> None:
"""End a trace for a tool run."""
"""End a trace for a tool run.
Raises:
AssertionError: If the run ID is a tool call and does not have inputs
"""
run_info, inputs = self._get_tool_run_info_with_inputs(run_id)
self._send(
@@ -1007,11 +1006,7 @@ async def _astream_events_implementation_v2(
# Assign the stream handler to the config
config = ensure_config(config)
if "run_id" in config:
run_id = cast("UUID", config["run_id"])
else:
run_id = uuid7()
config["run_id"] = run_id
run_id = cast("UUID", config.setdefault("run_id", uuid4()))
callbacks = config.get("callbacks")
if callbacks is None:
config["callbacks"] = [event_streamer]

View File

@@ -21,7 +21,6 @@ from typing_extensions import override
from langchain_core.env import get_runtime_environment
from langchain_core.load import dumpd
from langchain_core.messages.ai import UsageMetadata, add_usage
from langchain_core.tracers.base import BaseTracer
from langchain_core.tracers.schemas import Run
@@ -70,32 +69,6 @@ def _get_executor() -> ThreadPoolExecutor:
return _EXECUTOR
def _get_usage_metadata_from_generations(
generations: list[list[dict[str, Any]]],
) -> UsageMetadata | None:
"""Extract and aggregate `usage_metadata` from generations.
Iterates through generations to find and aggregate all `usage_metadata` found in
messages. This is typically present in chat model outputs.
Args:
generations: List of generation batches, where each batch is a list
of generation dicts that may contain a `'message'` key with
`'usage_metadata'`.
Returns:
The aggregated `usage_metadata` dict if found, otherwise `None`.
"""
output: UsageMetadata | None = None
for generation_batch in generations:
for generation in generation_batch:
if isinstance(generation, dict) and "message" in generation:
message = generation["message"]
if isinstance(message, dict) and "usage_metadata" in message:
output = add_usage(output, message["usage_metadata"])
return output
class LangChainTracer(BaseTracer):
"""Implementation of the SharedTracer that POSTS to the LangChain endpoint."""
@@ -247,8 +220,7 @@ class LangChainTracer(BaseTracer):
log_error_once("post", e)
raise
@staticmethod
def _update_run_single(run: Run) -> None:
def _update_run_single(self, run: Run) -> None:
"""Update a run."""
if run.extra.get("__disabled"):
return
@@ -294,15 +266,6 @@ class LangChainTracer(BaseTracer):
def _on_llm_end(self, run: Run) -> None:
"""Process the LLM Run."""
# Extract usage_metadata from outputs and store in extra.metadata
if run.outputs and "generations" in run.outputs:
usage_metadata = _get_usage_metadata_from_generations(
run.outputs["generations"]
)
if usage_metadata is not None:
if "metadata" not in run.extra:
run.extra["metadata"] = {}
run.extra["metadata"]["usage_metadata"] = usage_metadata
self._update_run_single(run)
def _on_llm_error(self, run: Run) -> None:
@@ -313,28 +276,15 @@ class LangChainTracer(BaseTracer):
"""Process the Chain Run upon start."""
if run.parent_run_id is None:
run.reference_example_id = self.example_id
# Skip persisting if inputs are deferred (e.g., iterator/generator inputs).
# The run will be posted when _on_chain_end is called with realized inputs.
if not run.extra.get("defers_inputs"):
self._persist_run_single(run)
self._persist_run_single(run)
def _on_chain_end(self, run: Run) -> None:
"""Process the Chain Run."""
# If inputs were deferred, persist (POST) the run now that inputs are realized.
# Otherwise, update (PATCH) the existing run.
if run.extra.get("defers_inputs"):
self._persist_run_single(run)
else:
self._update_run_single(run)
self._update_run_single(run)
def _on_chain_error(self, run: Run) -> None:
"""Process the Chain Run upon error."""
# If inputs were deferred, persist (POST) the run now that inputs are realized.
# Otherwise, update (PATCH) the existing run.
if run.extra.get("defers_inputs"):
self._persist_run_single(run)
else:
self._update_run_single(run)
self._update_run_single(run)
def _on_tool_start(self, run: Run) -> None:
"""Process the Tool Run upon start."""

View File

@@ -563,7 +563,7 @@ def _get_standardized_inputs(
)
raise NotImplementedError(msg)
inputs = load(run.inputs, allowed_objects="all")
inputs = load(run.inputs)
if run.run_type in {"retriever", "llm", "chat_model"}:
return inputs
@@ -595,7 +595,7 @@ def _get_standardized_outputs(
Returns:
An output if returned, otherwise a None
"""
outputs = load(run.outputs, allowed_objects="all")
outputs = load(run.outputs)
if schema_format == "original":
if run.run_type == "prompt" and "output" in outputs:
# These were previously dumped before the tracer.

View File

@@ -58,7 +58,7 @@ def merge_dicts(left: dict[str, Any], *others: dict[str, Any]) -> dict[str, Any]
# "all dicts."
# )
if (right_k == "index" and merged[right_k].startswith("lc_")) or (
right_k in {"id", "output_version", "model_provider"}
right_k in ("id", "output_version", "model_provider")
and merged[right_k] == right_v
):
continue

View File

@@ -26,15 +26,13 @@ from typing import (
from typing_extensions import override
from langchain_core._api.deprecation import deprecated
T = TypeVar("T")
_no_default = object()
# https://github.com/python/cpython/blob/main/Lib/test/test_asyncgen.py#L54
@deprecated(since="1.1.2", removal="2.0.0")
# before 3.10, the builtin anext() was not available
def py_anext(
iterator: AsyncIterator[T], default: T | Any = _no_default
) -> Awaitable[T | Any | None]:
@@ -130,7 +128,7 @@ async def tee_peer(
if buffer:
continue
try:
item = await anext(iterator)
item = await iterator.__anext__()
except StopAsyncIteration:
break
else:

View File

@@ -8,7 +8,7 @@ import logging
import types
import typing
import uuid
from collections.abc import Mapping
from collections.abc import Callable
from typing import (
TYPE_CHECKING,
Annotated,
@@ -18,10 +18,8 @@ from typing import (
cast,
get_args,
get_origin,
get_type_hints,
)
import typing_extensions
from pydantic import BaseModel
from pydantic.v1 import BaseModel as BaseModelV1
from pydantic.v1 import Field as Field_v1
@@ -35,8 +33,6 @@ from langchain_core.utils.json_schema import dereference_refs
from langchain_core.utils.pydantic import is_basemodel_subclass
if TYPE_CHECKING:
from collections.abc import Callable
from langchain_core.tools import BaseTool
logger = logging.getLogger(__name__)
@@ -234,20 +230,13 @@ def _convert_any_typed_dicts_to_pydantic(
if is_typeddict(type_):
typed_dict = type_
docstring = inspect.getdoc(typed_dict)
# Use get_type_hints to properly resolve forward references and
# string annotations in Python 3.14+ (PEP 649 deferred annotations).
# include_extras=True preserves Annotated metadata.
try:
annotations_ = get_type_hints(typed_dict, include_extras=True)
except Exception:
# Fallback for edge cases where get_type_hints might fail
annotations_ = typed_dict.__annotations__
annotations_ = typed_dict.__annotations__
description, arg_descriptions = _parse_google_docstring(
docstring, list(annotations_)
)
fields: dict = {}
for arg, arg_type in annotations_.items():
if get_origin(arg_type) in {Annotated, typing_extensions.Annotated}:
if get_origin(arg_type) is Annotated: # type: ignore[comparison-overlap]
annotated_args = get_args(arg_type)
new_arg_type = _convert_any_typed_dicts_to_pydantic(
annotated_args[0], depth=depth + 1, visited=visited
@@ -337,7 +326,7 @@ def _format_tool_to_openai_function(tool: BaseTool) -> FunctionDescription:
def convert_to_openai_function(
function: Mapping[str, Any] | type | Callable | BaseTool,
function: dict[str, Any] | type | Callable | BaseTool,
*,
strict: bool | None = None,
) -> dict[str, Any]:
@@ -363,7 +352,6 @@ def convert_to_openai_function(
ValueError: If function is not in a supported format.
!!! warning "Behavior changed in `langchain-core` 0.3.16"
`description` and `parameters` keys are now optional. Only `name` is
required and guaranteed to be part of the output.
"""
@@ -464,7 +452,7 @@ _WellKnownOpenAITools = (
def convert_to_openai_tool(
tool: Mapping[str, Any] | type[BaseModel] | Callable | BaseTool,
tool: dict[str, Any] | type[BaseModel] | Callable | BaseTool,
*,
strict: bool | None = None,
) -> dict[str, Any]:
@@ -488,18 +476,15 @@ def convert_to_openai_tool(
OpenAI tool-calling API.
!!! warning "Behavior changed in `langchain-core` 0.3.16"
`description` and `parameters` keys are now optional. Only `name` is
required and guaranteed to be part of the output.
!!! warning "Behavior changed in `langchain-core` 0.3.44"
Return OpenAI Responses API-style tools unchanged. This includes
any dict with `"type"` in `"file_search"`, `"function"`,
`"computer_use_preview"`, `"web_search_preview"`.
!!! warning "Behavior changed in `langchain-core` 0.3.63"
Added support for OpenAI's image generation built-in tool.
"""
# Import locally to prevent circular import

View File

@@ -22,9 +22,6 @@ def get_color_mapping(
Returns:
The mapping of items to colors.
Raises:
ValueError: If no colors are available after applying exclusions.
"""
colors = list(_TEXT_COLOR_MAPPING.keys())
if excluded_colors is not None:

View File

@@ -4,13 +4,11 @@ from __future__ import annotations
import json
import re
from typing import TYPE_CHECKING, Any
from collections.abc import Callable
from typing import Any
from langchain_core.exceptions import OutputParserException
if TYPE_CHECKING:
from collections.abc import Callable
def _replace_new_line(match: re.Match[str]) -> str:
value = match.group(2)

View File

@@ -170,33 +170,28 @@ def dereference_refs(
full_schema: dict | None = None,
skip_keys: Sequence[str] | None = None,
) -> dict:
"""Resolve and inline JSON Schema `$ref` references in a schema object.
"""Resolve and inline JSON Schema $ref references in a schema object.
This function processes a JSON Schema and resolves all `$ref` references by
replacing them with the actual referenced content.
Handles both simple references and complex cases like circular references and mixed
`$ref` objects that contain additional properties alongside the `$ref`.
This function processes a JSON Schema and resolves all $ref references by replacing
them with the actual referenced content. It handles both simple references and
complex cases like circular references and mixed $ref objects that contain
additional properties alongside the $ref.
Args:
schema_obj: The JSON Schema object or fragment to process.
This can be a complete schema or just a portion of one.
full_schema: The complete schema containing all definitions that `$refs` might
point to.
If not provided, defaults to `schema_obj` (useful when the schema is
self-contained).
skip_keys: Controls recursion behavior and reference resolution depth.
- If `None` (Default): Only recurse under `'$defs'` and use shallow
reference resolution (break cycles but don't deep-inline nested refs)
- If provided (even as `[]`): Recurse under all keys and use deep reference
resolution (fully inline all nested references)
schema_obj: The JSON Schema object or fragment to process. This can be a
complete schema or just a portion of one.
full_schema: The complete schema containing all definitions that $refs might
point to. If not provided, defaults to schema_obj (useful when the
schema is self-contained).
skip_keys: Controls recursion behavior and reference resolution depth:
- If `None` (Default): Only recurse under '$defs' and use shallow reference
resolution (break cycles but don't deep-inline nested refs)
- If provided (even as []): Recurse under all keys and use deep reference
resolution (fully inline all nested references)
Returns:
A new dictionary with all $ref references resolved and inlined.
The original `schema_obj` is not modified.
A new dictionary with all $ref references resolved and inlined. The original
schema_obj is not modified.
Examples:
Basic reference resolution:
@@ -208,8 +203,7 @@ def dereference_refs(
>>> result = dereference_refs(schema)
>>> result["properties"]["name"] # {"type": "string"}
Mixed `$ref` with additional properties:
Mixed $ref with additional properties:
>>> schema = {
... "properties": {
... "name": {"$ref": "#/$defs/base", "description": "User name"}
@@ -221,7 +215,6 @@ def dereference_refs(
# {"type": "string", "minLength": 1, "description": "User name"}
Handling circular references:
>>> schema = {
... "properties": {"user": {"$ref": "#/$defs/User"}},
... "$defs": {
@@ -234,11 +227,10 @@ def dereference_refs(
>>> result = dereference_refs(schema) # Won't cause infinite recursion
!!! note
- Circular references are handled gracefully by breaking cycles
- Mixed `$ref` objects (with both `$ref` and other properties) are supported
- Additional properties in mixed `$refs` override resolved properties
- The `$defs` section is preserved in the output by default
- Mixed $ref objects (with both $ref and other properties) are supported
- Additional properties in mixed $refs override resolved properties
- The $defs section is preserved in the output by default
"""
full = full_schema or schema_obj
keys_to_skip = list(skip_keys) if skip_keys is not None else ["$defs"]

Some files were not shown because too many files have changed in this diff Show More