Compare commits

..

522 Commits

Author SHA1 Message Date
Lauren Hirata Singh
29c5e5970a fix 2025-10-16 23:50:15 -04:00
Lauren Hirata Singh
9b4c81d8d1 Merge v0.3 branch - keep all redirects from both branches 2025-10-16 23:27:03 -04:00
Lauren Hirata Singh
56a4cf001c v0.3 redirects 2025-10-16 22:47:15 -04:00
Lauren Hirata Singh
d273341249 chore(docs): add middleware to handle redirects (#33547)
still need to add v0.3 redirects
2025-10-16 21:12:08 -04:00
Lauren Hirata Singh
d9d819c7c1 chore(docs): add middleware to handle redirects 2025-10-16 20:47:18 -04:00
Lauren Hirata Singh
db49a14a34 chore(docs): Redirects v0.1/v0.2 (#33538) 2025-10-16 16:46:37 -04:00
Mason Daugherty
ab7eda236e fix: feature table for MongoDB (#33471) 2025-10-13 21:17:22 -04:00
Jib
d418cbdf44 docs: flag Multi Tenancy as a MongoDBAtlasVectorStore supported feature (#33469)
- **Description:** 
- Change the docs flag for v0.3 branch to list Multi-tenancy as a
MongoDBAtlasVectorStore supported feature
  - **Issue:** N/A
  - **Dependencies:** None

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://docs.langchain.com/oss/python/contributing) for
more.

Additional guidelines:

- Most PRs should not touch more than one package.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests. Likewise,
please do not update the `uv.lock` files unless you are adding a
required dependency.
- Changes should be backwards compatible.
- Make sure optional dependencies are imported within a function.
2025-10-13 16:57:40 -04:00
ccurme
b93d2f7f3a release(core): 0.3.79 (#33401) 2025-10-09 16:59:16 -04:00
ccurme
a763ebe86c release(anthropic): 0.3.22 (#33394) 2025-10-09 14:29:38 -04:00
ccurme
5fa1094451 fix(anthropic,standard-tests): carry over updates to v0.3 (#33393)
Cherry pick of https://github.com/langchain-ai/langchain/pull/33390 and
https://github.com/langchain-ai/langchain/pull/33391.
2025-10-09 14:25:34 -04:00
Anika
dd4de696b8 fix(core): handle parent/child mustache vars (#33346)
Description:

currently mustache_schema("{{x.y}} {{x}}") will error. pr fixes

Issue: na
**Dependencies:**na

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-10-09 08:52:10 -07:00
Mason Daugherty
809a0216a5 chore: update v0.3 ref homepage (#33338) 2025-10-07 12:56:07 -04:00
Mason Daugherty
44ec72fa0d fix(infra): allow prerelease installations for partner packages during api ref doc build (#33325) 2025-10-06 18:05:30 -04:00
Mason Daugherty
5459ff1ee3 chore: cap lib upper bounds, run sync (#33320) 2025-10-06 17:39:01 -04:00
ccurme
f95669aa0a release(openai): 0.3.35 (#33299) 2025-10-06 10:51:01 -04:00
ccurme
3a465d635b feat(openai): enable stream_usage when using default base URL and client (#33296) 2025-10-06 10:22:36 -04:00
ccurme
0b51de4cab release(core): 0.3.78 (#33253) 2025-10-03 12:40:15 -04:00
ccurme
5904cbea89 feat(core): add optional include_id param to convert_to_openai_messages function (#33248) 2025-10-03 11:37:16 -04:00
Mason Daugherty
c9590ef79d docs: fix infinite loop in vercel.json redirects (#33240) 2025-10-02 20:24:09 -04:00
Mason Daugherty
c972552c40 docs: work for freeze (#33239) 2025-10-02 20:01:26 -04:00
Mason Daugherty
e16feb93b9 release(ollama): 0.3.10 (#33210) 2025-10-02 11:35:24 -04:00
Mason Daugherty
2bb57d45d2 fix(ollama): exclude None parameters from options dictionary (#33208) (#33209)
fix #33206
2025-10-02 11:32:21 -04:00
ccurme
d7cce2f469 feat(langchain_v1): update messages namespace (#33207) 2025-10-02 10:35:00 -04:00
Mason Daugherty
48b77752d0 release(ollama): 0.3.9 (#33200) 2025-10-01 22:31:20 -04:00
Mason Daugherty
6f2d16e6be refactor(ollama): simplify options handling (#33199)
Fixes #32744

Don't restrict options; the client accepts any dict
2025-10-01 21:58:12 -04:00
Mason Daugherty
a9eda18e1e refactor(ollama): clean up tests (#33198) 2025-10-01 21:52:01 -04:00
Mason Daugherty
a89c549cb0 feat(ollama): add basic auth support (#32328)
support for URL authentication in the format
`https://user:password@host:port` for all LangChain Ollama clients.

Related to #32327 and #25055
2025-10-01 20:46:37 -04:00
Sydney Runkle
a336afaecd feat(langchain): use decorators for jumps instead (#33179)
The old `before_model_jump_to` classvar approach was quite clunky, this
is nicer imo and easier to document. Also moving from `jump_to` to
`can_jump_to` which is more idiomatic.

Before:

```py
class MyMiddleware(AgentMiddleware):
    before_model_jump_to: ClassVar[list[JumpTo]] = ["end"]

    def before_model(state, runtime) -> dict[str, Any]:
        return {"jump_to": "end"}
```

After

```py
class MyMiddleware(AgentMiddleware):

    @hook_config(can_jump_to=["end"])
    def before_model(state, runtime) -> dict[str, Any]:
        return {"jump_to": "end"}
```
2025-10-01 16:49:27 -07:00
Lauren Hirata Singh
af07949d13 fix(docs): Redirects (#33190) 2025-10-01 16:28:47 -04:00
Sydney Runkle
a10e880c00 feat(langchain_v1): add async support for create_agent (#33175)
This makes branching **much** more simple internally and helps greatly
w/ type safety for users. It just allows for one signature on hooks
instead of multiple.

Opened after https://github.com/langchain-ai/langchain/pull/33164
ballooned more than expected, w/ branching for:
* sync vs async
* runtime vs no runtime (this is self imposed)

**This also removes support for nodes w/o `runtime` in the signature.**
We can always go back and add support for nodes w/o `runtime`.

I think @christian-bromann's idea to re-export `runtime` from
langchain's agents might make sense due to the abundance of imports
here.

Check out the value of the change based on this diff:
https://github.com/langchain-ai/langchain/pull/33176
2025-10-01 19:15:39 +00:00
Eugene Yurtsev
7b5e839be3 chore(langchain_v1): use list[str] for modifyModelRequest (#33166)
Update model request to return tools by name. This will decrease the
odds of misusing the API.

We'll need to extend the type for built-in tools later.
2025-10-01 14:46:19 -04:00
ccurme
740842485c fix(openai): bump min core version (#33188)
Required for new tests added in
https://github.com/langchain-ai/langchain/pull/32541 and
https://github.com/langchain-ai/langchain/pull/33183.
2025-10-01 11:01:15 -04:00
noeliecherrier
08bb74f148 fix(mistralai): handle HTTP errors in async embed documents (#33187)
The async embed function does not properly handle HTTP errors.

For instance with large batches, Mistral AI returns `Too many inputs in
request, split into more batches.` in a 400 error.

This leads to a KeyError in `response.json()["data"]` l.288

This PR fixes the issue by:
- calling `response.raise_for_status()` before returning
- adding a retry similarly to what is done in the synchronous
counterpart `embed_documents`

I also added an integration test, but willing to move it to unit tests
if more relevant.
2025-10-01 10:57:47 -04:00
ccurme
7d78ed9b53 release(standard-tests): 0.3.22 (#33186) 2025-10-01 10:39:17 -04:00
ccurme
7ccff656eb release(core): 0.3.77 (#33185) 2025-10-01 10:24:07 -04:00
ccurme
002d623f2d feat: (core, standard-tests) support PDF inputs in ToolMessages (#33183) 2025-10-01 10:16:16 -04:00
Mohammad Mohtashim
34f8031bd9 feat(langchain): Using Structured Response as Key in Output Schema for Middleware Agent (#33159)
- **Description:** Changing the key from `response` to
`structured_response` for middleware agent to keep it sync with agent
without middleware. This a breaking change.
 - **Issue:** #33154
2025-10-01 03:24:59 +00:00
Mason Daugherty
a541b5bee1 chore(infra): rfc README.md for better presentation (#33172) 2025-09-30 17:44:42 -04:00
Mason Daugherty
3e970506ba chore(core): remove runnable section from README.md (#33171) 2025-09-30 17:15:31 -04:00
Mason Daugherty
d1b0196faa chore(infra): whitespace fix (#33170) 2025-09-30 17:14:55 -04:00
ccurme
aac69839a9 release(openai): 0.3.34 (#33169) 2025-09-30 16:48:39 -04:00
ccurme
64141072a3 feat(openai): support openai sdk 2.0 (#33168) 2025-09-30 16:34:00 -04:00
Mason Daugherty
0795be2a04 docs(core): remove non-existent param from as_tool docstring (#33165) 2025-09-30 19:43:34 +00:00
Eugene Yurtsev
9c97597175 chore(langchain_v1): expose middleware decorators and selected messages (#33163)
* Make it easy to improve the middleware shortcuts
* Export the messages that we're confident we'll expose
2025-09-30 14:14:57 -04:00
Sydney Runkle
eed0f6c289 feat(langchain): todo middleware (#33152)
Porting the [planning
middleware](39c0138d0f/src/deepagents/middleware.py (L21))
over from deepagents.

Also adding the ability to configure:
* System prompt
* Tool description

```py
from langchain.agents.middleware.planning import PlanningMiddleware
from langchain.agents import create_agent

agent = create_agent("openai:gpt-4o", middleware=[PlanningMiddleware()])

result = await agent.invoke({"messages": [HumanMessage("Help me refactor my codebase")]})

print(result["todos"])  # Array of todo items with status tracking
```
2025-09-30 02:23:26 +00:00
ccurme
729637a347 docs(anthropic): document support for memory tool and context management (#33149) 2025-09-29 16:38:01 -04:00
Mason Daugherty
3325196be1 fix(langchain): handle gpt-5 model name in init_chat_model (#33148)
expand to match any `gpt-*` model to openai
2025-09-29 16:16:17 -04:00
Mason Daugherty
f402fdcea3 fix(langchain): add context_management to Anthropic chat model init (#33150) 2025-09-29 16:13:47 -04:00
ccurme
ca9217c02d release(anthropic): 0.3.21 (#33147) 2025-09-29 19:56:28 +00:00
ccurme
f9bae40475 feat(anthropic): support memory and context management features (#33146)
https://docs.claude.com/en/docs/build-with-claude/context-editing

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-29 15:42:38 -04:00
ccurme
839a18e112 fix(openai): remove __future__.annotations import from test files (#33144)
Breaks schema conversion in places.
2025-09-29 16:23:32 +00:00
Mohammad Mohtashim
33a6def762 fix(core): Support of 'reasoning' type in 'convert_to_openai_messages' (#33050) 2025-09-29 09:17:05 -04:00
nhuang-lc
c456c8ae51 fix(langchain): fix response action for HITL (#33131)
Multiple improvements to HITL flow:

* On a `response` type resume, we should still append the tool call to
the last AIMessage (otherwise we have a ToolResult without a
corresponding ToolCall)
* When all interrupts have `response` types (so there's no pending tool
calls), we should jump back to the first node (instead of end) as we
enforced in the previous `post_model_hook_router`
* Added comments to `model_to_tools` router so clarify all of the
potential exit conditions

Additionally:
* Lockfile update to use latest LG alpha release
* Added test for `jump_to` behaving ephemerally, this was fixed in LG
but surfaced as a bug w/ `jump_to`.
* Bump version to v1.0.0a10 to prep for alpha release

---------

Co-authored-by: Sydney Runkle <sydneymarierunkle@gmail.com>
Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
2025-09-29 13:08:18 +00:00
Eugene Yurtsev
54ea62050b chore(langchain_v1): move tool node to tools namespace (#33132)
* Move ToolNode to tools namespace
* Expose injected variable as well in tools namespace
* Update doc-strings throughout
2025-09-26 15:23:57 -04:00
Mason Daugherty
986302322f docs: more standardization (#33124) 2025-09-25 20:46:20 -04:00
Mason Daugherty
a5137b0a3e refactor(langchain): resolve pydantic deprecation warnings (#33125) 2025-09-25 17:33:18 -04:00
Mason Daugherty
5bea28393d docs: standardize .. code-block directive usage (#33122)
and fix typos
2025-09-25 16:49:56 -04:00
Mason Daugherty
c3fed20940 docs: correct ported over directives (#33121)
Match rest of repo
2025-09-25 15:54:54 -04:00
Mason Daugherty
6d418ba983 test(mistralai): add xfail for structured output test (#33119)
In rare cases (difficult to reproduce), Mistral's API fails to return
valid bodies, leading to failures from `PydanticToolsParser`
2025-09-25 13:05:31 -04:00
Mason Daugherty
12daba63ff test(openai): raise token limit for o1 test (#33118)
`test_o1[False-False]` was sometimes failing because the OpenAI o1 model
was hitting a token limit with only 100 tokens
2025-09-25 12:57:33 -04:00
Christophe Bornet
eaf8dce7c2 chore: bump ruff version to 0.13 (#33043)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-25 12:27:39 -04:00
Mason Daugherty
f82de1a8d7 chore: bump locks (#33114) 2025-09-25 01:46:01 -04:00
Mason Daugherty
e3efd1e891 test(text-splitters): capture beta warnings (#33113) 2025-09-25 01:30:20 -04:00
Mason Daugherty
d6769cf032 test(text-splitters): resolve pytest marker warning (#33112)
#33111
2025-09-25 01:29:42 -04:00
Mason Daugherty
7ab2e0dd3b test(core): resolve pytest marker warning (#33111)
Remove redundant/outdated `@pytest.mark.requires("jinja2")` decorator

Pytest marks (like `@pytest.mark.requires(...)`) applied directly to
fixtures have no effect and are deprecated.
2025-09-25 01:08:54 -04:00
Mason Daugherty
81319ad3f0 test(core): resolve pydantic_v1 deprecation warning (#33110)
Excluded pydantic_v1 module from import testing

Acceptable since this pydantic_v1 is explicitly deprecated. Testing its
importability at this stage serves little purpose since users should
migrate away from it.
2025-09-25 01:08:03 -04:00
Mason Daugherty
e3f3c13b75 refactor(core): use aadd_documents in vectorstores unit tests (#33109)
Don't use the deprecated `upsert()` and `aupsert()`

Instead use the recommended alternatives
2025-09-25 00:57:08 -04:00
Mason Daugherty
c30844fce4 fix(core): use version agnostic get_fields (#33108)
Resolves a warning
2025-09-25 00:54:29 -04:00
Mason Daugherty
c9eb3bdb2d test(core): use secure hash algorithm in indexing test to eliminate SHA-1 warning (#33107)
Finish work from #33101
2025-09-25 00:49:11 -04:00
Mason Daugherty
e97baeb9a6 test(core): suppress pydantic_v1 deprecation warnings during import tests (#33106)
We intentionally import these. Hide warnings to reduce testing noise.
2025-09-25 00:37:40 -04:00
Mason Daugherty
3a6046b157 test(core): don't use deprecated input_variables param in from_file (#33105)
finish #33104
2025-09-25 04:29:31 +00:00
Mason Daugherty
8fdc619f75 refactor(core): don't use deprecated input_variables param in from_file (#33104)
Missed awhile back; causes warnings during tests
2025-09-25 00:14:17 -04:00
Ali Ismail
729bfe8369 test(core): enhance stringify_value test coverage for nested structures (#33099)
## Summary
Adds test coverage for the `stringify_value` utility function to handle
complex nested data structures that weren't previously tested.

## Changes
- Added `test_stringify_value_nested_structures()` to `test_strings.py`
- Tests nested dictionaries within lists
- Tests mixed-type lists with various data types
- Verifies proper stringification of complex nested structures

## Why This Matters
- Fills a gap in test coverage for edge cases
- Ensures `stringify_value` handles complex data structures correctly  
- Improves confidence in string utility functions used throughout the
codebase
- Low risk addition that strengthens existing test suite

## Testing
```bash
uv run --group test pytest libs/core/tests/unit_tests/utils/test_strings.py::test_stringify_value_nested_structures -v
```

This test addition follows the project's testing patterns and adds
meaningful coverage without introducing any breaking changes.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-25 00:04:47 -04:00
Mason Daugherty
9b624a79b2 test(core): suppress deprecation warnings in PipelinePromptTemplate (#33102)
We're intentionally testing this still so as not to regress. Reduce
warning noise.
2025-09-25 04:03:27 +00:00
Mason Daugherty
c60c5a91cb fix(core): use secure hash algorithm in indexing test to eliminate SHA-1 warning (#33101)
Use SHA-256 (collision-resistant) instead of the default SHA-1. No
functional changes to test behavior.
2025-09-25 00:02:11 -04:00
Mason Daugherty
d9e0c212e0 chore(infra): add tests to label mapping (#33103) 2025-09-25 00:01:53 -04:00
Sydney Runkle
f015526e42 release(langchain): v1.0.0a9 (#33098) 2025-09-24 21:02:53 +00:00
Sydney Runkle
57d931532f fix(langchain): extra arg for anthropic caching, __end__ -> end for jump_to (#33097)
Also updating `jump_to` to use `end` instead of `__end__`
2025-09-24 17:00:40 -04:00
Mason Daugherty
50012d95e2 chore: update pull_request_target types, harden (#33096)
Enhance the pull request workflows by updating the `pull_request_target`
types and ensuring safety by avoiding checkout of the PR's head. Update
the action to use a specific commit from the archived repository.
2025-09-24 16:37:16 -04:00
Mason Daugherty
33f06875cb fix(langchain_v1): version equality check (#33095) 2025-09-24 16:27:55 -04:00
dependabot[bot]
e5730307e7 chore: bump actions/setup-node from 4 to 5 (#32952)
Bumps [actions/setup-node](https://github.com/actions/setup-node) from 4
to 5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-node/releases">actions/setup-node's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<h3>Breaking Changes</h3>
<ul>
<li>Enhance caching in setup-node with automatic package manager
detection by <a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1348">actions/setup-node#1348</a></li>
</ul>
<p>This update, introduces automatic caching when a valid
<code>packageManager</code> field is present in your
<code>package.json</code>. This aims to improve workflow performance and
make dependency management more seamless.
To disable this automatic caching, set <code>package-manager-cache:
false</code></p>
<pre lang="yaml"><code>steps:
- uses: actions/checkout@v5
- uses: actions/setup-node@v5
  with:
    package-manager-cache: false
</code></pre>
<ul>
<li>Upgrade action to use node24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/1325">actions/setup-node#1325</a></li>
</ul>
<p>Make sure your runner is on version v2.327.1 or later to ensure
compatibility with this release. <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">See
Release Notes</a></p>
<h3>Dependency Upgrades</h3>
<ul>
<li>Upgrade <code>@​octokit/request-error</code> and
<code>@​actions/github</code> by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1227">actions/setup-node#1227</a></li>
<li>Upgrade uuid from 9.0.1 to 11.1.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1273">actions/setup-node#1273</a></li>
<li>Upgrade undici from 5.28.5 to 5.29.0 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1295">actions/setup-node#1295</a></li>
<li>Upgrade form-data to bring in fix for critical vulnerability by <a
href="https://github.com/gowridurgad"><code>@​gowridurgad</code></a> in
<a
href="https://redirect.github.com/actions/setup-node/pull/1332">actions/setup-node#1332</a></li>
<li>Upgrade actions/checkout from 4 to 5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-node/pull/1345">actions/setup-node#1345</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1348">actions/setup-node#1348</a></li>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1325">actions/setup-node#1325</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v4...v5.0.0">https://github.com/actions/setup-node/compare/v4...v5.0.0</a></p>
<h2>v4.4.0</h2>
<h2>What's Changed</h2>
<h3>Bug fixes:</h3>
<ul>
<li>Make eslint-compact matcher compatible with Stylelint by <a
href="https://github.com/FloEdelmann"><code>@​FloEdelmann</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/98">actions/setup-node#98</a></li>
<li>Add support for indented eslint output by <a
href="https://github.com/fregante"><code>@​fregante</code></a> in <a
href="https://redirect.github.com/actions/setup-node/pull/1245">actions/setup-node#1245</a></li>
</ul>
<h3>Enhancement:</h3>
<ul>
<li>Support private mirrors by <a
href="https://github.com/marco-ippolito"><code>@​marco-ippolito</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1240">actions/setup-node#1240</a></li>
</ul>
<h3>Dependency update:</h3>
<ul>
<li>Upgrade <code>@​action/cache</code> from 4.0.2 to 4.0.3 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-node/pull/1262">actions/setup-node#1262</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/FloEdelmann"><code>@​FloEdelmann</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/98">actions/setup-node#98</a></li>
<li><a href="https://github.com/fregante"><code>@​fregante</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1245">actions/setup-node#1245</a></li>
<li><a
href="https://github.com/marco-ippolito"><code>@​marco-ippolito</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-node/pull/1240">actions/setup-node#1240</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-node/compare/v4...v4.4.0">https://github.com/actions/setup-node/compare/v4...v4.4.0</a></p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="a0853c2454"><code>a0853c2</code></a>
Bump actions/checkout from 4 to 5 (<a
href="https://redirect.github.com/actions/setup-node/issues/1345">#1345</a>)</li>
<li><a
href="b7234cc9fe"><code>b7234cc</code></a>
Upgrade action to use node24 (<a
href="https://redirect.github.com/actions/setup-node/issues/1325">#1325</a>)</li>
<li><a
href="d7a11313b5"><code>d7a1131</code></a>
Enhance caching in setup-node with automatic package manager detection
(<a
href="https://redirect.github.com/actions/setup-node/issues/1348">#1348</a>)</li>
<li><a
href="5e2628c959"><code>5e2628c</code></a>
Bumps form-data (<a
href="https://redirect.github.com/actions/setup-node/issues/1332">#1332</a>)</li>
<li><a
href="65beceff8e"><code>65becef</code></a>
Bump undici from 5.28.5 to 5.29.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1295">#1295</a>)</li>
<li><a
href="7e24a656e1"><code>7e24a65</code></a>
Bump uuid from 9.0.1 to 11.1.0 (<a
href="https://redirect.github.com/actions/setup-node/issues/1273">#1273</a>)</li>
<li><a
href="08f58d1471"><code>08f58d1</code></a>
Bump <code>@​octokit/request-error</code> and
<code>@​actions/github</code> (<a
href="https://redirect.github.com/actions/setup-node/issues/1227">#1227</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/setup-node/compare/v4...v5">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-node&package-manager=github_actions&previous-version=4&new-version=5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-24 16:26:05 -04:00
Mason Daugherty
4783a9c18e style: update workflow name for version equality check (#33094) 2025-09-24 20:11:30 +00:00
Mason Daugherty
ee4d84de7c style(core): typo/docs lint pass (#33093) 2025-09-24 16:11:21 -04:00
Mason Daugherty
092dd5e174 chore: update link for monorepo structure (#33091) 2025-09-24 19:55:00 +00:00
Sydney Runkle
dd81e1c3fb release(langchain): 1.0.0a8 (#33090) 2025-09-24 15:31:29 -04:00
Sydney Runkle
135a5b97e6 feat(langchain): improvements to anthropic prompt caching (#33058)
Adding an `unsupported_model_behavior` arg that can be `'ignore'`,
`'warn'`, or `'raise'`. Defaults to `'warn'`.
2025-09-24 15:28:49 -04:00
Mason Daugherty
b92b394804 style: repo linting pass (#33089)
enable docstring-code-format
2025-09-24 15:25:55 -04:00
Sydney Runkle
083bb3cdd7 fix(langchain): need to inject all state for tools registered by middleware (#33087)
Type hints matter for conditional edges!
2025-09-24 15:25:51 -04:00
Mason Daugherty
2e9291cdd7 fix: lift openai version constraints across packages (#33088)
re: #33038 and https://github.com/openai/openai-python/issues/2644
2025-09-24 15:25:10 -04:00
Sydney Runkle
4f8a76b571 chore(langchain): renaming for HITL (#33067) 2025-09-24 07:19:44 -04:00
Mason Daugherty
05ba941230 style(cli): linting pass (#33078) 2025-09-24 01:24:52 -04:00
Mason Daugherty
ae4976896e chore: delete erroneous .readthedocs.yaml (#33079)
From the legacy docs/not needed here
2025-09-24 01:24:42 -04:00
Mason Daugherty
504ef96500 chore: add commit message generation instructions for VSCode (#33077) 2025-09-24 05:06:43 +00:00
Mason Daugherty
d99a02bb27 chore: add AGENTS.md (#33076)
it would be super cool if Anthropic supported this instead of
`CLAUDE.md` :/

https://agents.md/
2025-09-24 05:02:14 +00:00
Mason Daugherty
793de80429 chore: update label mapping in PR title labeler configuration (#33075) 2025-09-24 01:00:14 -04:00
Mason Daugherty
7d4e9d8cda revert(infra): put SECURITY.md at root (#33074) 2025-09-24 00:54:37 -04:00
Mason Daugherty
54dca494cf chore: delete erroneous poetry.toml configuration file (#33073)
- Not used by the current build system
- Potentially confusing for new contributors
- A leftover artifact from the Poetry to uv migration
2025-09-24 04:40:17 +00:00
Mason Daugherty
7b30e58386 chore: delete erroneous yarn.lock in root (#33072)
Appears to have had no purpose/was added by mistake and nobody
questioned it
2025-09-24 04:35:00 +00:00
Mason Daugherty
e62b541dfd chore(infra): move SECURITY.md to .github (#33071)
cleaning up top-level. `.github` folder placement will continue to show
on repo homepage:
https://docs.github.com/en/code-security/getting-started/adding-a-security-policy-to-your-repository#about-security-policies
2025-09-24 00:27:48 -04:00
Mason Daugherty
8699980d09 chore(scripts): remove obsolete release and mypy/ruff update scripts (#33070)
Outdated scripts related to release management and mypy/ruff updates

Cleaning up the root-level
2025-09-24 04:24:38 +00:00
Mason Daugherty
79e536b0d6 chore(infra): further docs build cleanup (#33057)
Reorganize the requirements for better clarity and consistency. Improve
documentation on scripts and workflows.
2025-09-23 17:29:58 -04:00
Sydney Runkle
b5720ff17a chore(langchain): simplifying HITL condition (#33065)
Simplifying condition
2025-09-23 21:24:14 +00:00
nhuang-lc
48b05224ad fix(langchain_v1): only interrupt if at least one ToolConfig value is True (#33064)
**Description:** Right now, we interrupt even if the provided ToolConfig
has all false values. We should ignore ToolConfigs which do not have at
least one value marked as true (just as we would if tool_name: False was
passed into the dict).
2025-09-23 17:20:34 -04:00
Sydney Runkle
89079ad411 feat(langchain): new decorator pattern for dynamically generated middleware (#33053)
# Main Changes

1. Adding decorator utilities for dynamically defining middleware with
single hook functions (see an example below for dynamic system prompt)
2. Adding better conditional edge drawing with jump configuration
attached to middleware. Can be registered w/ the decorator new
decorator!

## Decorator Utilities

```py
from langchain.agents.middleware_agent import create_agent, AgentState, ModelRequest
from langchain.agents.middleware.types import modify_model_request
from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import InMemorySaver


@modify_model_request
def modify_system_prompt(request: ModelRequest, state: AgentState) -> ModelRequest:
    request.system_prompt = (
        "You are a helpful assistant."
        f"Please record the number of previous messages in your response: {len(state['messages'])}"
    )
    return request

agent = create_agent(
    model="openai:gpt-4o-mini", 
    middleware=[modify_system_prompt]
).compile(checkpointer=InMemorySaver())
```

## Visualization and Routing improvements

We now require that middlewares define the valid jumps for each hook.

If using the new decorator syntax, this can be done with:

```py
@before_model(jump_to=["__end__"])
@after_model(jump_to=["tools", "__end__"])
```

If using the subclassing syntax, you can use these two class vars:

```py
class MyMiddlewareAgentMiddleware):
    before_model_jump_to = ["__end__"]
    after_model_jump_to = ["tools", "__end__"]
```

Open for debate if we want to bundle these in a single jump map / config
for a middleware. Easy to migrate later if we decide to add more hooks.

We will need to **really clearly document** that these must be
explicitly set in order to enable conditional edges.

Notice for the below case, `Middleware2` does actually enable jumps.

<table>
  <thead>
    <tr>
      <th>Before (broken), adding conditional edges unconditionally</th>
      <th>After (fixed), adding conditional edges sparingly</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>
<img width="619" height="508" alt="Screenshot 2025-09-23 at 10 23 23 AM"
src="https://github.com/user-attachments/assets/bba2d098-a839-4335-8e8c-b50dd8090959"
/>
      </td>
      <td>
<img width="469" height="490" alt="Screenshot 2025-09-23 at 10 23 13 AM"
src="https://github.com/user-attachments/assets/717abf0b-fc73-4d5f-9313-b81247d8fe26"
/>
      </td>
    </tr>
  </tbody>
</table>

<details>
<summary>Snippet for the above</summary>

```py
from typing import Any
from langchain.agents.tool_node import InjectedState
from langgraph.runtime import Runtime
from langchain.agents.middleware.types import AgentMiddleware, AgentState
from langchain.agents.middleware_agent import create_agent
from langchain_core.tools import tool
from typing import Annotated
from langchain_core.messages import HumanMessage
from typing_extensions import NotRequired

@tool
def simple_tool(input: str) -> str:
    """A simple tool."""
    return "successful tool call"


class Middleware1(AgentMiddleware):
    """Custom middleware that adds a simple tool."""

    tools = [simple_tool]

    def before_model(self, state: AgentState, runtime: Runtime) -> None:
        return None

    def after_model(self, state: AgentState, runtime: Runtime) -> None:
        return None

class Middleware2(AgentMiddleware):

    before_model_jump_to = ["tools", "__end__"]

    def before_model(self, state: AgentState, runtime: Runtime) -> None:
        return None

    def after_model(self, state: AgentState, runtime: Runtime) -> None:
        return None

class Middleware3(AgentMiddleware):

    def before_model(self, state: AgentState, runtime: Runtime) -> None:
        return None

    def after_model(self, state: AgentState, runtime: Runtime) -> None:
        return None

builder = create_agent(
    model="openai:gpt-4o-mini",
    middleware=[Middleware1(), Middleware2(), Middleware3()],
    system_prompt="You are a helpful assistant.",
)
agent = builder.compile()
```

</details>

## More Examples

### Guardrails `after_model`

<img width="379" height="335" alt="Screenshot 2025-09-23 at 10 40 09 AM"
src="https://github.com/user-attachments/assets/45bac7dd-398e-45d1-ae58-6ecfa27dfc87"
/>

<details>
<summary>Code</summary>

```py
from langchain.agents.middleware_agent import create_agent, AgentState, ModelRequest
from langchain.agents.middleware.types import after_model
from langchain_core.messages import HumanMessage, AIMessage
from langgraph.checkpoint.memory import InMemorySaver
from typing import cast, Any

@after_model(jump_to=["model", "__end__"])
def after_model_hook(state: AgentState) -> dict[str, Any]:
    """Check the last AI message for safety violations."""
    last_message_content = cast(AIMessage, state["messages"][-1]).content.lower()
    print(last_message_content)

    unsafe_keywords = ["pineapple"]
    if any(keyword in last_message_content for keyword in unsafe_keywords):

        # Jump back to model to regenerate response
        return {"jump_to": "model", "messages": [HumanMessage("Please regenerate your response, and don't talk about pineapples. You can talk about apples instead.")]}

    return {"jump_to": "__end__"}

# Create agent with guardrails middleware
agent = create_agent(
    model="openai:gpt-4o-mini",
    middleware=[after_model_hook],
    system_prompt="Keep your responses to one sentence please!"
).compile()

# Test with potentially unsafe input
result = agent.invoke(
    {"messages": [HumanMessage("Tell me something about pineapples")]},
)

for msg in result["messages"]:
    print(msg.pretty_print())

"""
================================ Human Message =================================

Tell me something about pineapples
None
================================== Ai Message ==================================

Pineapples are tropical fruits known for their sweet, tangy flavor and distinctive spiky exterior.
None
================================ Human Message =================================

Please regenerate your response, and don't talk about pineapples. You can talk about apples instead.
None
================================== Ai Message ==================================

Apples are popular fruits that come in various varieties, known for their crisp texture and sweetness, and are often used in cooking and baking.
None
"""
```

</details>
2025-09-23 13:25:55 -04:00
Mason Daugherty
2c95586f2a chore(infra): audit workflows, scripts (#33055)
Mostly adding a descriptive frontmatter to workflow files. Also address
some formatting and outdated artifacts

No functional changes outside of
[d5457c3](d5457c39ee),
[90708a0](90708a0d99),
and
[338c82d](338c82d21e)
2025-09-23 17:08:19 +00:00
Mason Daugherty
9c1285cf5b chore(infra): fix ping pong pr labeler config (#33054)
The title-based labeler was clearing all pre-existing labels (including
the file-based ones) before adding its semantic labels.
2025-09-22 21:19:53 -04:00
Sydney Runkle
c3be45bf14 fix(langchain): HITL bug causing dupe interrupt (#33052)
Need to find **last** AI msg (not first). Getting too creative w/
generators.
2025-09-22 20:09:12 -04:00
Arman Tsaturian
8f488d62b2 docs: fix stripe toolkit import in the guide (#33044)
**Description:**
Stripe tools integration guide incorrectly referenced the `crewai`
toolkit. Updated the import to use the correct `langchain` toolkit.

Stripe docs reference:
https://docs.stripe.com/agents?framework=langchain&lang=python
2025-09-22 15:17:09 -04:00
Mason Daugherty
cdae9e4942 fix(infra): prevent labeler workflow from adding/removing same labels (#33039)
The file-based and title-based labeler workflows were conflicting,
causing the bot to add and remove identical labels in the same
operation. Hopefully this fixes
2025-09-21 04:37:59 +00:00
Mason Daugherty
7ddc798f95 fix(openai): pin upper bound to prevent Pydantic 2.7.0 issues (#33038)
https://github.com/openai/openai-python/issues/2644
2025-09-21 00:27:03 -04:00
Mason Daugherty
7dcf6a515e fix: update method calls from dict to model_dump in Chain (#33035) 2025-09-20 23:47:44 -04:00
Mason Daugherty
043a7560a5 test: use .get() for safe ls_params access (#33034) 2025-09-20 23:46:37 -04:00
Mason Daugherty
5b418d3f26 feat(infra): add PR labeler configurations and workflows (#33031) 2025-09-20 22:33:08 -04:00
Mason Daugherty
6b4054c795 chore(infra): update pre-commit hooks to include linting (#33029) 2025-09-21 02:26:19 +00:00
Mason Daugherty
30fde5af38 chore(infra): remove couchbase formatting hook from pre-commit (#33030)
Should've been done when it was removed from the monorepo
2025-09-20 22:09:57 -04:00
Mason Daugherty
781db9d892 chore: update pyproject.toml files, remove codespell (#33028)
- Removes Codespell from deps, docs, and `Makefile`s
- Python version requirements in all `pyproject.toml` files now use the
`~=` (compatible release) specifier
- All dependency groups and main dependencies now use explicit lower and
upper bounds, reducing potential for breaking changes
2025-09-20 22:09:33 -04:00
Sydney Runkle
f2b0afd0b7 release(langchain): 1.0.0a6 (#33024)
w/ improvements to HITL, state schema merging, dynamic system prompt
2025-09-19 18:47:41 +00:00
Sydney Runkle
c3654202a3 fix(langchain): use state schema as input schema to middleware nodes (#33023)
We want state schema as the input schema to middleware nodes because the
conditional edges after these nodes need access to the full state.

Also, we just generally want all state passed to middleware nodes, so we
should be specifying this explicitly. If we don't, the state annotations
used by users in their node signatures are used (so they might be
missing fields).
2025-09-19 18:43:33 +00:00
Sydney Runkle
4d118777bc feat(langchain): dynamic system prompt middleware (#33006)
# Changes

## Adds support for `DynamicSystemPromptMiddleware`

```py
from langchain.agents.middleware import DynamicSystemPromptMiddleware
from langgraph.runtime import Runtime
from typing_extensions import TypedDict

class Context(TypedDict):
    user_name: str

def system_prompt(state: AgentState, runtime: Runtime[Context]) -> str:
    user_name = runtime.context.get("user_name", "n/a")
    return f"You are a helpful assistant. Always address the user by their name: {user_name}"

middleware = DynamicSystemPromptMiddleware(system_prompt)
```

## Adds support for `runtime` in middleware hooks

```py
class AgentMiddleware(Generic[StateT, ContextT]):
    def modify_model_request(
        self,
        request: ModelRequest,
        state: StateT,
        runtime: Runtime[ContextT],  # Optional runtime parameter
    ) -> ModelRequest:
        # upgrade model if runtime.context.subscription is `top-tier` or whatever
```

## Adds support for omitting state attributes from input / output
schemas

```py
from typing import Annotated, NotRequired
from langchain.agents.middleware.types import PrivateStateAttr, OmitFromInput, OmitFromOutput

class CustomState(AgentState):
    # Private field - not in input or output schemas
    internal_counter: NotRequired[Annotated[int, PrivateStateAttr]]
    
    # Input-only field - not in output schema
    user_input: NotRequired[Annotated[str, OmitFromOutput]]
    
    # Output-only field - not in input schema  
    computed_result: NotRequired[Annotated[str, OmitFromInput]]
```

## Additionally
* Removes filtering of state before passing into middleware hooks

Typing is not foolproof here, still need to figure out some of the
generics stuff w/ state and context schema extensions for middleware.

TODO:
* More docs for middleware, should hold off on this until other prios
like MCP and deepagents are met

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-09-18 16:07:16 -04:00
Mason Daugherty
f158cea1e8 release(mistralai): 0.2.12 (#33008) 2025-09-18 11:42:11 -04:00
Sadiq Khan
90280d1f58 docs(core): fix bugs and improve example code in chat_history.py (#32994)
## Summary

This PR fixes several bugs and improves the example code in
`BaseChatMessageHistory` docstring that would prevent it from working
correctly.

### Bugs Fixed
- **Critical bug**: Fixed `json.dump(messages, f)` →
`json.dump(serialized, f)` - was using wrong variable
- **NameError**: Fixed bare variable references to use
`self.storage_path` and `self.session_id`
- **Missing imports**: Added required imports (`json`, `os`, message
converters) to make example runnable

### Improvements
- Added missing type hints following project standards (`messages() ->
list[BaseMessage]`, `clear() -> None`)
- Added robust error handling with `FileNotFoundError` exception
handling
- Added directory creation with `os.makedirs(exist_ok=True)` to prevent
path errors
- Improved performance: `json.load(f)` instead of `json.loads(f.read())`
- Added explicit UTF-8 encoding to all file operations
- Updated stores.py to use modern union syntax (`int | None` vs
`Optional[int]`)

### Test Plan
- [x] Code passes linting (`ruff check`)
- [x] Example code now has all required imports and proper syntax
- [x] Fixed variable references prevent runtime errors
- [x] Follows project's type annotation standards

The example code in the docstring is now fully functional and follows
LangChain's coding standards.

---------

Co-authored-by: sadiqkhzn <sadiqkhzn@users.noreply.github.com>
2025-09-18 09:34:19 -04:00
Dushmanta
ee340e0a3b fix(docs): update dead link to docling github and docs (#33001)
- **Description:** Updated the dead/unreachable links to Docling from
the additional resources section of the langchain-docling docs
  - **Issue:** Fixes langchain-ai/docs/issues/574
  - **Dependencies:** None
2025-09-18 09:30:29 -04:00
Sydney Runkle
d5ba5d3511 feat(langchain): improved HITL patterns (#32996)
# Main changes / new features

## Better support for parallel tool calls

1. Support for multiple tool calls requiring human input
2. Support for combination of tool calls requiring human input + those
that are auto-approved
3. Support structured output w/ tool calls requiring human input
4. Support structured output w/ standard tool calls

## Shortcut for allowed actions

Adds a shortcut where tool config can be specified as a `bool`, meaning
"all actions allowed"

```py
HumanInTheLoopMiddleware(tool_configs={"expensive_tool": True})
```

## A few design decisions here
* We only raise one interrupt w/ all `HumanInterrupt`s, currently we
won't be able to execute all tools until all of these are resolved. This
isn't super blocking bc we can't re-invoke the model until all tools
have finished execution. That being said, if you have a long running
auto-approved tool, this could slow things down.

## TODOs

* Ideally, we would rename `accept` -> `approve`
* Ideally, we would rename `respond` -> `reject`
* Docs update (@sydney-runkle to own)
* In another PR I'd like to refactor testing to have one file for each
prebuilt middleware :)

Fast follow to https://github.com/langchain-ai/langchain/pull/32962
which was deemed as too breaking
2025-09-17 16:53:01 -04:00
Mason Daugherty
76d0758007 fix(docs): json_mode -> json_schema (#32993) 2025-09-17 18:21:34 +00:00
Mason Daugherty
8b3f74012c docs: update GenAI structured output section to include JSON mode details (#32992) 2025-09-17 17:40:34 +00:00
Mason Daugherty
54a9556f5c chore(cli): update lock (#32986) 2025-09-17 02:08:20 +00:00
Mason Daugherty
66041a2778 refactor(cli): target ruff 310 (#32985)
Use union types for optional parameters
2025-09-16 22:04:28 -04:00
Mason Daugherty
ab1b822523 chore: update PR title lint (#32983) 2025-09-16 22:04:19 -04:00
Chase Lean
543d90e108 docs: add langchain-scraperapi (#31973)
Adds documentation for the integration langchain-scraperapi, which
contains 3 tools using the ScraperAPI service.

The tools give AI agents the ability to

Scrape the web and return HTML/text/markdown
Perform Google search and return json output
Perform Amazon search and return json output

For reference, here is the official repo for langchain_scraperapi:
https://github.com/scraperapi/langchain-scraperapi
2025-09-16 21:46:20 -04:00
Adam Deedman
f8640630d8 docs: fix memory for agents (#32979)
Replaced `input_message` parameter with a directly called tuple, e.g.
`{"messages": [("user", "What is my name?")]}`

Before, the memory function wasn't working with the agent, using the
format of the input_message parameter.

Specifically, on page [Build an
Agent#adding-in-memory](https://python.langchain.com/docs/tutorials/agents/#adding-in-memory)

In the previous code, the query "What's my name?" wasn't working, as the
agent could not recall memory correctly.

<img width="860" height="679" alt="image"
src="https://github.com/user-attachments/assets/dfbca21e-ffe9-4645-a810-3be7a46d81d5"
/>
2025-09-16 15:46:15 -04:00
Mason Daugherty
f9605c7438 chore(infra): update contribution guide link in CONTRIBUTING.md (#32976) 2025-09-16 15:15:53 +00:00
Mason Daugherty
ebd6f7d8a3 chore(infra): update security guidelines formatting (#32975) 2025-09-16 15:12:10 +00:00
ccurme
e63c1d7171 chore(langchain): drop cap on python version (#32974) 2025-09-16 10:44:21 -04:00
Mason Daugherty
8180020b93 chore: restore commented out optional deps (#32971)
langchain & langchain_v1
2025-09-16 10:10:49 -04:00
Username46786
435194acf6 docs: add cross-links between summarization how-to pages (#32968)
This PR improves navigation in the summarization how-to section by
adding
cross-links from the single-call guide to the related map-reduce and
refine
guides. This mirrors the docs style guide’s emphasis on clear
cross-references
and should help readers discover the appropriate pattern for longer
texts.

- Source edited: docs/docs/how_to/summarize_stuff.ipynb
- Links added:
  - /docs/how_to/summarize_map_reduce/
  - /docs/how_to/summarize_refine/

Type: docs-only (no code changes)
2025-09-16 09:59:03 -04:00
Mason Daugherty
244c699551 refactor(cli): drop Python 3.9 (#32964) 2025-09-15 19:22:53 -04:00
Mason Daugherty
369858de19 chore(infra): fix codspeed (#32963)
Related to #32950

CodSpeed v4 runs pytest inside its own runner process, which does not
automatically inherit environment variables from the job
2025-09-15 21:52:46 +00:00
Ali Ismail
4ebce80fbb docs(langchain): add docstring for _load_map_reduce_chain (#32961)
Description:
Add a docstring to _load_map_reduce_chain in chains/summarize/ to
explain the purpose of the prompt argument and document function
parameters. This addresses an existing TODO in the codebase.

Issue:
N/A (documentation improvement only)

Dependencies:
None
2025-09-15 17:19:20 -04:00
Mason Daugherty
8670b24c8e test(groq): xfail tool integration test (#32960)
Groq models have known issues with tool calling consistency,
[particularly with complex tools derived from
runnables](https://github.com/langchain-ai/langchain/discussions/19990).
[(more)](https://github.com/langchain-ai/langchain/discussions/24309)

xfail until we can dedicate time to wrangling their API/model handling
2025-09-15 14:23:22 -04:00
Ademílson Tonato
8d60ddba3a docs: update installation command for firecrawl-py package (#32958) 2025-09-15 14:10:08 -04:00
Mason Daugherty
9f6431924f feat(openai): add max_tokens to AzureChatOpenAI (#32959)
Fixes #32949

This pattern is [present in
`ChatOpenAI`](https://github.com/langchain-ai/langchain/blob/master/libs/partners/openai/langchain_openai/chat_models/base.py#L2821)
but wasn't carried over to Azure.


[CI](https://github.com/langchain-ai/langchain/actions/runs/17741751797/job/50417180998)
2025-09-15 14:09:20 -04:00
Ali Ismail
569a3d9602 docs(langchain): add docstring for _load_stuff_chain (#32932)
**Description:**  
Add a docstring to `_load_stuff_chain` in `chains/summarize/` to explain
the purpose of the `prompt` argument and document function parameters.
This addresses an existing TODO in the codebase.

**Issue:**  
N/A (documentation improvement only)

**Dependencies:**  
None
2025-09-15 10:02:50 -04:00
dependabot[bot]
8ef4df903f chore(infra): bump CodSpeedHQ/action from 3 to 4 (#32950)
Bumps [CodSpeedHQ/action](https://github.com/codspeedhq/action) from 3
to 4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/codspeedhq/action/releases">CodSpeedHQ/action's
releases</a>.</em></p>
<blockquote>
<h2>v4.0.0</h2>
<h2>💥 BREAKING</h2>
<p>It's now required to explicitly set the runner mode to
<code>instrumentation</code> or <code>walltime</code> using either:</p>
<ul>
<li>the <code>mode</code> argument</li>
<li>or the <code>CODSPEED_RUNNER_MODE</code> environment variable</li>
</ul>
<blockquote>
<p>[!TIP]
Before, this variable was automatically set to
<code>instrumentation</code> on every runner except for <a
href="https://codspeed.io/docs/instruments/walltime">CodSpeed macro
runners</a> where it was set to <code>walltime</code> by default.</p>
</blockquote>
<p>Find more details in <a
href="https://codspeed.io/docs/instruments">the instruments
documentation</a>.</p>
<h2>Details</h2>
<h3><!-- raw HTML omitted -->🚀 Features</h3>
<ul>
<li>Make perf profiling enabled by default by <a
href="https://github.com/GuillaumeLagrange"><code>@​GuillaumeLagrange</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/110">#110</a></li>
<li>Make the runner mode argument required by <a
href="https://github.com/GuillaumeLagrange"><code>@​GuillaumeLagrange</code></a></li>
<li>Use introspected node in walltime mode by <a
href="https://github.com/GuillaumeLagrange"><code>@​GuillaumeLagrange</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/108">#108</a></li>
<li>Add instrumented go shell script by <a
href="https://github.com/not-matthias"><code>@​not-matthias</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/102">#102</a></li>
</ul>
<h3><!-- raw HTML omitted -->🐛 Bug Fixes</h3>
<ul>
<li>Compute proper load bias by <a
href="https://github.com/not-matthias"><code>@​not-matthias</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/107">#107</a></li>
<li>Increase timeout for first perf ping by <a
href="https://github.com/GuillaumeLagrange"><code>@​GuillaumeLagrange</code></a></li>
<li>Prevent running with valgrind by <a
href="https://github.com/not-matthias"><code>@​not-matthias</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/106">#106</a></li>
</ul>
<h3><!-- raw HTML omitted -->🏗️ Refactor</h3>
<ul>
<li>Change go-runner binary name by <a
href="https://github.com/not-matthias"><code>@​not-matthias</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/111">#111</a></li>
</ul>
<p><strong>Full Runner Changelog</strong>: <a
href="https://github.com/CodSpeedHQ/runner/blob/main/CHANGELOG.md">https://github.com/CodSpeedHQ/runner/blob/main/CHANGELOG.md</a></p>
<h2>v3.8.1</h2>
<h2>What's Changed</h2>
<h3><!-- raw HTML omitted -->🐛 Bug Fixes</h3>
<ul>
<li>Don't show error when libpython is not found by <a
href="https://github.com/not-matthias"><code>@​not-matthias</code></a></li>
</ul>
<h3><!-- raw HTML omitted -->🏗️ Refactor</h3>
<ul>
<li>Improve conditional compilation in
<code>get_pipe_open_options</code> by <a
href="https://github.com/art049"><code>@​art049</code></a> in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/100">#100</a></li>
</ul>
<h3><!-- raw HTML omitted -->⚙️ Internals</h3>
<ul>
<li>Change log level to warn for venv_compat error by <a
href="https://github.com/not-matthias"><code>@​not-matthias</code></a>
in <a
href="https://redirect.github.com/CodSpeedHQ/runner/pull/104">#104</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/CodSpeedHQ/action/compare/v3.8.0...v3.8.1">https://github.com/CodSpeedHQ/action/compare/v3.8.0...v3.8.1</a>
<strong>Full Runner Changelog</strong>: <a
href="https://github.com/CodSpeedHQ/runner/blob/main/CHANGELOG.md">https://github.com/CodSpeedHQ/runner/blob/main/CHANGELOG.md</a></p>
<h2>v3.8.0</h2>
<h2>What's Changed</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="653fdc30e6"><code>653fdc3</code></a>
Release v4.0.1 🚀</li>
<li><a
href="4da7be1bda"><code>4da7be1</code></a>
chore: bump runner version to 4.0.1</li>
<li><a
href="172d6c5630"><code>172d6c5</code></a>
chore: make the comment about input validation more discrete</li>
<li><a
href="d15e1ce813"><code>d15e1ce</code></a>
chore: improve the release script</li>
<li><a
href="6eeb021fd0"><code>6eeb021</code></a>
Release v4.0.0 🚀</li>
<li><a
href="74312dabbe"><code>74312da</code></a>
chore: improve the release script</li>
<li><a
href="8a17a350a8"><code>8a17a35</code></a>
ci: add modes to the matrix</li>
<li><a
href="8e3f02a649"><code>8e3f02a</code></a>
feat: make the mode argument required</li>
<li><a
href="97c7a6f5fc"><code>97c7a6f</code></a>
chore: bump runner version to 4.0.0</li>
<li><a
href="8a4cadd026"><code>8a4cadd</code></a>
chore: point the changelog to the runner</li>
<li>See full diff in <a
href="https://github.com/codspeedhq/action/compare/v3...v4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=CodSpeedHQ/action&package-manager=github_actions&previous-version=3&new-version=4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-15 13:56:46 +00:00
doubleinfinity
b944bbc766 docs: add ZeusDB vector store integration (#32822)
## Description

This PR adds documentation for the new ZeusDB vector store integration
with LangChain.

## Motivation

ZeusDB is a high-performance vector database (Python/Rust backend)
designed for AI applications that need fast similarity search and
real-time vector ops. This integration brings ZeusDB's capabilities to
the LangChain ecosystem, giving developers another production-oriented
option for vector storage and retrieval.

**Key Features:**
- **User-Friendly Python API**: Intuitive interface that integrates
seamlessly with Python ML workflows
- **High Performance**: Powered by a robust Rust backend for
lightning-fast vector operations
- **Enterprise Logging**: Comprehensive logging capabilities for
monitoring and debugging production systems
- **Advanced Features**: Includes product quantization and persistence
capabilities
- **AI-Optimized**: Purpose-built for modern AI applications and RAG
pipelines

## Changes

- Added provider documentation:
`docs/docs/integrations/providers/zeusdb.mdx` (installation, setup).

- Added vector store documentation:
`docs/docs/integrations/vectorstores/zeusdb.ipynb` (quickstart for
creating/querying a ZeusDBVectorStore).

- Registered langchain-zeusdb in `libs/packages.yml` for discovery.

## Target users

- AI/ML engineers building RAG pipelines

- Data scientists working with large document collections

- Developers needing high-throughput vector search

- Teams requiring near real-time vector operations

## Testing

- Followed LangChain's "How to add standard tests to an integration"
guidance.
- Code passes format, lint, and test checks locally.
- Tested with LangChain Core 0.3.74
- Works with Python 3.10 to 3.13

## Package Information
**PyPI:** https://pypi.org/project/langchain-zeusdb
**Github:** https://github.com/ZeusDB/langchain-zeusdb
2025-09-15 09:55:14 -04:00
Filip Makraduli
0be7515abc docs: add superlinked retriever integration (#32433)
# feat(superlinked): add superlinked retriever integration

**Description:** 
Add Superlinked as a custom retriever with full LangChain compatibility.
This integration enables users to leverage Superlinked's multi-modal
vector search capabilities including text similarity, categorical
similarity, recency, and numerical spaces with flexible weighting
strategies. The implementation provides a `SuperlinkedRetriever` class
that extends LangChain's `BaseRetriever` with comprehensive error
handling, parameter validation, and support for various vector databases
(in-memory, Qdrant, Redis, MongoDB).

**Key Features:**
- Full LangChain `BaseRetriever` compatibility with `k` parameter
support
- Multi-modal search spaces (text, categorical, numerical, recency)
- Flexible weighting strategies for complex search scenarios
- Vector database agnostic implementation
- Comprehensive validation and error handling
- Complete test coverage (unit tests, integration tests)
- Detailed documentation with 6 practical usage examples

**Issue:** N/A (new integration)

**Dependencies:** 
- `superlinked==33.5.1` (peer dependency, imported within functions)
- `pandas^2.2.0` (required by superlinked)

**Linkedin handle:** https://www.linkedin.com/in/filipmakraduli/

## Implementation Details

### Files Added/Modified:
- `libs/partners/superlinked/` - Complete package structure
- `libs/partners/superlinked/langchain_superlinked/retrievers.py` - Main
retriever implementation
- `libs/partners/superlinked/tests/unit_tests/test_retrievers.py` - unit
tests
- `libs/partners/superlinked/tests/integration_tests/test_retrievers.py`
- Integration tests with mocking
- `docs/docs/integrations/retrievers/superlinked.ipynb` - Documentation
a few usage examples

### Testing:
- `make format` - passing
- `make lint` - passing 
- `make test` - passing (16 unit tests, integration tests)
- Comprehensive test coverage including error handling, validation, and
edge cases

### Documentation:
- Example notebook with 6 practical scenarios:
  1. Simple text search
  2. Multi-space blog search (content + category + recency)
  3. E-commerce product search (price + brand + ratings)
  4. News article search (sentiment + topics + recency)
  5. LangChain RAG integration example
  6. Qdrant vector database integration

### Code Quality:
- Follows LangChain contribution guidelines
- Backwards compatible
- Optional dependencies imported within functions
- Comprehensive error handling and validation
- Type hints and docstrings throughout

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-15 13:54:04 +00:00
Sadiq Khan
cc9a97a477 docs(core): add type hints to BaseStore example code (#32946)
## Summary
- Add comprehensive type hints to the MyInMemoryStore example code in
BaseStore docstring
- Improve documentation quality and educational value for developers
- Align with LangChain's coding standards requiring type hints on all
Python code

## Changes Made
- Added return type annotations to all methods (__init__, mget, mset,
mdelete, yield_keys)
- Added parameter type annotations using proper generic types (Sequence,
Iterator)
- Added instance variable type annotation for the store attribute
- Used modern Python union syntax (str | None) for optional types

## Test Plan
- Verified Python syntax validity with ast.parse()
- No functional changes to actual code, only documentation improvements
- Example code now follows best practices and coding standards

This change improves the educational value of the example code and
ensures consistency with LangChain's requirement that "All Python code
MUST include type hints and return types" as specified in the
development guidelines.

---------

Co-authored-by: sadiqkhzn <sadiqkhzn@users.noreply.github.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-15 13:45:34 +00:00
Dmitry
ee17adb022 docs: add AI/ML API integration (#32430)
**Description:**
Introduces documentation notebooks for AI/ML API integration covering
the following use cases:
- Chat models (`ChatAimlapi`)
- Text completion models (`AimlapiLLM`)
- Provider usage examples
- Text embedding models (`AimlapiEmbeddings`)

Additionally, adds the `langchain-aimlapi` package entry to
`libs/packages.yml` for package management.

This PR aims to provide a comprehensive starting point for developers
integrating AI/ML API models with LangChain via the new
`langchain-aimlapi` package.

**Issue:** N/A

**Dependencies:** None

**Twitter handle:** @aimlapi

---

### **To-Do Before Submitting PR:**

* [x] Run `make format`
* [x] Run `make lint`
* [x] Confirm all documentation notebooks are in
`docs/docs/integrations/`
* [x] Double-check `libs/packages.yml` has the correct repo path
* [x] Confirm no `pyproject.toml` modifications were made unnecessarily

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-15 09:41:40 -04:00
Noraina
6a43f140bc docs: update SerpApi free searches amount in tool feature table (#32945)
**Description:** 
This PR updates the free searches per month from **100** to **250** and
renames SerpAPI to [SerpApi](https://serpapi.com/) to prevent confusion.
Add import API keys and enhance usage instructions in the Jupyter
notebook

**Issue:** N/A

**Dependencies:** N/A

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.
2025-09-14 21:42:59 -04:00
Youngho Kim
4619a2727f docs(anthropic): update documentation links (#32938)
**Description:**
This PR updated links to the latest Anthropic documentation. Changes
include revised links for model overview, tool usage, web search tool,
text editor tool, and more.

**Issue:**
N/A

**Dependencies:**
None

**Twitter handle:**
N/A
2025-09-14 21:38:51 -04:00
湛露先生
6487a7e2e5 chore(langchain): remove duplicate .pdf listing (#32929) 2025-09-14 21:33:40 -04:00
湛露先生
406ebc9141 chore(langchain): Fix typos in core docstrings (#32928)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2025-09-14 21:33:06 -04:00
Nikhil Chandrappa
e6b5ff213a docs: add YugabyteDB Distributed SQL database (#32571)
- **Description:** The `langchain-yugabytedb` package implementations of
core LangChain abstractions using `YugabyteDB` Distributed SQL Database.
  
YugabyteDB is a cloud-native distributed PostgreSQL-compatible database
that combines strong consistency with ultra-resilience, seamless
scalability, geo-distribution, and highly flexible data locality to
deliver business-critical, transactional applications.

[YugabyteDB](https://www.yugabyte.com/ai/) combines the power of the
`pgvector` PostgreSQL extension with an inherently distributed
architecture. This future-proofed foundation helps you build GenAI
applications using RAG retrieval that demands high-performance vector
search.

- [ ] **tests and docs**: 
1. `langchain-yugabytedb`
[github](https://github.com/yugabyte/langchain-yugabytedb) repo.
2. YugabyteDB VectorStore example notebook showing its use. It lives in
`langchain/docs/docs/integrations/vectorstores/yugabytedb.ipynb`
directory.
  3. Running `langchain-yugabytedb` unit tests 
  
- Setting up a Development Environment

This document details how to set up a local development environment that
will
allow you to contribute changes to the project.

Acquire sources and create virtualenv.
```shell
git clone https://github.com/yugabyte/langchain-yugabytedb
cd langchain-yugabytedb
uv venv --python=3.13
source .venv/bin/activate
```

Install package in editable mode.
```shell
uv pip install pipx  
pipx install poetry
poetry install
uv pip install pytest pytest_asyncio pytest-timeout langchain-core langchain_tests sqlalchemy psycopg psycopg-binary numpy pgvector
```

Start YugabyteDB RF-1 Universe.
```shell
docker run -d --name yugabyte_node01 --hostname yugabyte01 \
  -p 7000:7000 -p 9000:9000 -p 15433:15433 -p 5433:5433 -p 9042:9042 \
  yugabytedb/yugabyte:2.25.2.0-b359 bin/yugabyted start --background=false \
  --master_flags="allowed_preview_flags_csv=ysql_yb_enable_advisory_locks,ysql_yb_enable_advisory_locks=true" \
  --tserver_flags="allowed_preview_flags_csv=ysql_yb_enable_advisory_locks,ysql_yb_enable_advisory_locks=true"

docker exec -it yugabyte_node01 bin/ysqlsh -h yugabyte01 -c "CREATE extension vector;"
```

Invoke test cases.
```shell
pytest -vvv tests/unit_tests/yugabytedb_tests
```
2025-09-12 16:55:09 -04:00
Michael Yilma
03f0ebd93e docs: add Bigtable Key-value Store and Vector Store Docs (#32598)
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**

- [x] **feat(docs)**: add Bigtable Key-value store doc
- [X] **feat(docs)**: add Bigtable Vector store doc 

This PR adds a doc for Bigtable and LangChain Key-value store
integration. It contains guides on how to add, delete, get, and yield
key-value pairs from Bigtable Key-value Store for LangChain.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-12 16:53:59 -04:00
Bar Cohen
c9eed530ce docs: add Timbr tools integration (#32862)
# feat(integrations): Add Timbr tools integration

## DESCRIPTION

This PR adds comprehensive documentation and integration support for
Timbr's semantic layer tools in LangChain.

[Timbr](https://timbr.ai/) provides an ontology-driven semantic layer
that enables natural language querying of databases through
business-friendly concepts. It connects raw data to governed business
measures for consistent access across BI, APIs, and AI applications.

[`langchain-timbr`](https://pypi.org/project/langchain-timbr/) is a
Python SDK that extends
[LangChain](https://github.com/WPSemantix/Timbr-GenAI/tree/main/LangChain)
and
[LangGraph](https://github.com/WPSemantix/Timbr-GenAI/tree/main/LangGraph)
with custom agents, chains, and nodes for seamless integration with the
Timbr semantic layer. It enables converting natural language prompts
into optimized semantic-SQL queries and executing them directly against
your data.

**What's Added:**
- Complete integration documentation for `langchain-timbr` package
- Tool documentation page with usage examples and API reference

**Integration Components:**
- `IdentifyTimbrConceptChain` - Identify relevant concepts from user
prompts
- `GenerateTimbrSqlChain` - Generate SQL queries from natural language
- `ValidateTimbrSqlChain` - Validate queries against knowledge graph
schemas
- `ExecuteTimbrQueryChain` - Execute queries against semantic databases
- `GenerateAnswerChain` - Generate human-readable answers from results

## Documentation Added

- `/docs/integrations/providers/timbr.mdx` - Provider overview and
configuration
- `/docs/integrations/tools/timbr.ipynb` - Comprehensive tool usage
examples

## Links

- [PyPI Package](https://pypi.org/project/langchain-timbr/)
- [GitHub Repository](https://github.com/WPSemantix/langchain-timbr)
- [Official
Documentation](https://docs.timbr.ai/doc/docs/integration/langchain-sdk/)

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-12 16:51:42 -04:00
tbice
e6c38a043f docs: add Qwen integration guide and update qwq documentation (#32817)
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**

**Description:**  
Add documentation for Qwen integration in LangChain, including setup
instructions, usage examples, and configuration details. Update related
qwq documentation to reflect current best practices and improve clarity
for users.

This PR enhances the documentation ecosystem by:
- Adding a new guide for integrating Qwen models
- Updating outdated or incomplete qwq documentation
- Improving structure and readability of relevant sections

**Issue:** N/A  
**Dependencies:** None

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-12 16:49:20 -04:00
Elif Sema Balcioglu
dc47c2c598 docs: update langchain-oracledb documentation (#32805)
`Oracle AI Vector Search` integrations for LangChain have been moved to
a dedicated package, [langchain-oracledb
](https://pypi.org/project/langchain-oracledb/), and a new repository,
[langchain-oracle
](https://github.com/oracle/langchain-oracle/tree/main/libs/oracledb).
This PR updates the corresponding documentation, including installation
instructions and import statements, to reflect these changes.

This PR is complemented with:
https://github.com/langchain-ai/langchain-community/pull/283

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-12 16:47:10 -04:00
Yuvraj Chandra
3420ca1da2 docs: add ZenRows provider and tool integration docs (#31742)
**Description:** Adds documentation for ZenRows integration with
LangChain, including provider overview and detailed tool documentation.
ZenRows is an enterprise-grade web scraping solution that enables
LangChain agents to extract web content at scale with advanced features
like JavaScript rendering, anti-bot bypass, geo-targeting, and multiple
output formats.

This PR includes:
- Provider documentation
(`docs/docs/integrations/providers/zenrows.ipynb`)
- Tool documentation
(`docs/docs/integrations/tools/zenrows_universal_scraper.ipynb`)
- Complete usage examples and API reference links

**Issue:** N/A

**Dependencies:** 
- [langchain-zenrows](https://github.com/ZenRows-Hub/langchain-zenrows)
package (external, available on
[PyPI](https://pypi.org/project/langchain-zenrows/))
- No changes to core LangChain dependencies

**LinkedIn handle:** https://www.linkedin.com/company/zenrows/

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-12 16:37:49 -04:00
Vishal Karwande
f11dd177e9 docs: update oci documentation and examples. (#32749)
Adding Oracle Generative AI as one of the providers for langchain.
Updated the old examples in the documentation with the new working
examples.

---------

Co-authored-by: Vishal Karwande <vishalkarwande@Vishals-MacBook-Pro.local>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-12 16:28:03 -04:00
Ali Ismail
d5a4abf960 docs(core): remove duplicate 'the' in indexing/api.py (#32924)
**Description:** Fixes a small typo in `_get_document_with_hash` inside 
`libs/core/langchain_core/indexing/api.py`.

**Issue:** N/A (no related issue)

**Dependencies:** None
2025-09-12 15:49:54 -04:00
Eugene Yurtsev
b1497bcea1 chore(core): test that default values in tool calls are preserved in json schema representation (#32921)
Add unit test coverage for this issue:
https://github.com/langchain-ai/langchain/issues/32232
2025-09-12 12:50:54 -04:00
Sydney Runkle
84f9824cc9 chore: use uv caches (#32919)
Especially helpful for the text splitters tests where we're installing
pytorch (expensive and slow slow slow). Should speed up CI by 5-10 mins.

w/o caches, CI taking 20 minutes 😨 
w/ caches, CI taking 3 minutes
2025-09-12 10:29:35 -04:00
Sydney Runkle
0814bfe5ed ci: use partial runs w/ codspeed (#32920)
Taking advantage of [partial
runs](https://codspeed.io/docs/features/partial-runs)!

This should save us minutes on every CI job, we only run codspeed for
libs w/ changes and this doesn't affect benchmarking drops
2025-09-12 09:46:01 -04:00
Christophe Bornet
cbaf97ada4 chore: bump mypy version to 1.18 (#32914) 2025-09-12 09:19:23 -04:00
Sydney Runkle
dc2da95ac0 release(langchain): v1.0.0a5 (#32917) 2025-09-12 08:36:44 -04:00
Sydney Runkle
9e78ff19ab fix(langchain): use messages from model request (#32908)
Oversight when moving back to basic function call for
`modify_model_request` rather than implementation as its own node.

Basic test right now failing on main, passing on this branch

Revealed a gap in testing. Will write up a more robust test suite for
basic middleware features.
2025-09-12 08:18:02 -04:00
Mason Daugherty
649d8a8223 test(anthropic): enable VCR for web fetch test (#32913)
The API issues have been resolved; no longer xfailing
2025-09-12 03:19:55 +00:00
Mason Daugherty
338d3d2795 chore: remove infra tag from task issue template (#32912) 2025-09-11 22:02:14 -04:00
Mason Daugherty
31f641a11f chore(infra): issue template updates (#32911) 2025-09-11 22:00:44 -04:00
open-swe[bot]
91286b0b27 chore(infra): issue template updates (#32910)
Fixes: #32909

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-11 21:53:35 -04:00
dishaprakash
bea72bac3e docs: add hybrid search documentation to PGVectorStore (#32549)
Adding documentation for Hybrid Search in the PGVectorStore Notebook

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-11 21:12:58 -04:00
Caspar Broekhuizen
15d558ff16 fix(core): resolve mermaid node id collisions when special chars are used (#32857)
### Description

* Replace the Mermaid graph node label escaping logic
(`_escape_node_label`) with `_to_safe_id`, which converts a string into
a unique, Mermaid-compatible node id. Ensures nodes with special
characters always render correctly.

**Before**
* Invalid characters (e.g. `开`) replaced with `_`. Causes collisions
between nodes with names that are the same length and contain all
non-safe characters:
```python
_escape_node_label("开") # '_'
_escape_node_label("始") # '_'  same as above, but different character passed in. not a unique mapping.
```

**After**
```python
_to_safe_id("开") # \5f00
_to_safe_id("始") # \59cb  unique!
```

### Tests
* Rename `test_graph_mermaid_escape_node_label()` to
`test_graph_mermaid_to_safe_id()` and update function logic to use
`_to_safe_id`
* Add `test_graph_mermaid_special_chars()`

### Issue

Fixes langchain-ai/langgraph#6036
2025-09-11 14:15:17 -07:00
Hyunjoon Jeong
9cc85387d1 fix(text-splitters): add validation to prevent infinite loop and prevent empty token splitter (#32205)
### Description
1) Add validation to prevent infinite loop condition when
```tokenizer.tokens_per_chunk > tokenizer.chunk_overlap```
2) Avoid empty decoded chunk when splitter appends tokens

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-09-11 16:55:32 -04:00
Mason Daugherty
7e5180e2fa refactor: inline test release (#32903)
Reusable workflows are not currently supported by PyPI's Trusted
Publishing
functionality, and are subject to breakage. Users are strongly
encouraged
to avoid using reusable workflows for Trusted Publishing until support
becomes official. Please, do not report bugs if this breaks.
2025-09-11 16:20:07 -04:00
Mason Daugherty
bbb1b9085d release(prompty): 0.1.2 (#32907) 2025-09-11 16:19:07 -04:00
Vincent Min
ff9f17bc66 fix(core): preserve ordering in RunnableRetry batch/abatch results (#32526)
Description: Fixes a bug in RunnableRetry where .batch / .abatch could
return misordered outputs (e.g. inputs [0,1,2] yielding [1,1,2]) when
some items succeeded on an earlier attempt and others were retried. Root
cause: successful results were stored keyed by the index within the
shrinking “pending” subset rather than the original input index, causing
collisions and reordered/duplicated outputs after retries. Fix updates
_batch and _abatch to:

- Track remaining original indices explicitly.
- Call underlying batch/abatch only on remaining inputs.
- Map results back to original indices.
- Preserve final ordering by reconstructing outputs in original
positional order.

Issue: Fixes #21326

Tests:

- Added regression tests: test_retry_batch_preserves_order and
test_async_retry_batch_preserves_order asserting correct ordering after
a single controlled failure + retry.
- Existing retry tests still pass.

Dependencies:

- None added or changed.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-09-11 16:18:25 -04:00
Matthew Lapointe
b1f08467cd feat(core): allow overriding ls_model_name from kwargs (#32541) 2025-09-11 16:18:06 -04:00
Eugene Yurtsev
2903e08311 chore(docs): remove langchain_experimental from api reference (#32904)
This removes langchain-experimental from api reference.

We do not recommend it to users for production use cases, so let's also
deprecate it from documentation
2025-09-11 16:13:58 -04:00
Mason Daugherty
115e20a0bc release(ollama): 0.3.8 (#32906) 2025-09-11 16:00:41 -04:00
Mason Daugherty
0ea945d291 release(nomic): 0.1.5 (#32905) 2025-09-11 15:54:19 -04:00
Mason Daugherty
5795ec3c4d release(exa): 0.3.1 (#32902) 2025-09-11 15:53:13 -04:00
Mason Daugherty
bd765753ca release(chroma): 0.2.6 (#32901) 2025-09-11 15:52:19 -04:00
Christophe Bornet
5fd7962a78 fix(core): fix support of Pydantic v1 models in BaseTool.args (#32487)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-09-11 15:44:51 -04:00
Marcus Chia
c68796579e fix(core): resolve infinite recursion in _dereference_refs_helper with mixed $ref objects (#32578)
**Description:** Fixes infinite recursion issue in JSON schema
dereferencing when objects contain both $ref and other properties (e.g.,
nullable, description, additionalProperties). This was causing Apollo
MCP server schemas to hang indefinitely during tool binding.

**Problem:**
- Commit fb5da8384 changed the condition from `set(obj.keys()) ==
{"$ref"}` to `"$ref" in set(obj.keys())`
- This caused objects with $ref + other properties to be treated as pure
$ref nodes
- Result: other properties were lost and infinite recursion occurred
with complex schemas

**Solution:**
- Restore pure $ref detection for objects with only $ref key  
- Add proper handling for mixed $ref objects that preserves all
properties
- Merge resolved reference content with other properties
- Maintain cycle detection to prevent infinite recursion

**Impact:**
- Fixes Apollo MCP server schema integration
- Resolves tool binding infinite recursion with complex GraphQL schemas
- Preserves backward compatibility with existing functionality
- No performance impact - actually improves handling of complex schemas

**Issue:** Fixes #32511

**Dependencies:** None

**Testing:**
- Added comprehensive unit tests covering mixed $ref scenarios
- All existing tests pass (1326 passed, 0 failed)
- Tested with realistic Apollo GraphQL schemas
- Stress tested with 100 iterations of complex schemas

**Verification:**
-  `make format` - All files properly formatted
-  `make lint` - All linting checks pass  
-  `make test` - All 1326 unit tests pass
-  No breaking changes - full backwards compatibility maintained

---------

Co-authored-by: Marcus <marcus@Marcus-M4-MAX.local>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-09-11 15:21:31 -04:00
Mason Daugherty
255ad31955 release(anthropic): 0.3.20 (#32900) 2025-09-11 15:18:43 -04:00
Mason Daugherty
00e992a780 feat(anthropic): web fetch beta (#32894)
Note: citations are broken until Anthropic fixes their API
2025-09-11 15:14:06 -04:00
Mason Daugherty
83d938593b lint 2025-09-11 15:12:38 -04:00
Mason Daugherty
38afeddcb6 fix(groq): update docs due to model deprecation (#32899)
On Friday, October 10th, the moonshotai/kimi-k2-instruct model will be
decommissioned in favor of the latest version,
moonshotai/kimi-k2-instruct-0905.
 
Until then, requests to moonshotai/kimi-k2-instruct will automatically
be routed to moonshotai/kimi-k2-instruct-0905.
2025-09-11 15:00:24 -04:00
Yu Zhong
fca1aaa9b5 fix(core): force overwrite additionalProperties to False in strict mode (#32879)
# Description
This PR fixes a bug in _recursive_set_additional_properties_false used
in function_calling.convert_to_openai_function.

Previously, schemas with "additionalProperties=True" were not correctly
overridden when strict validation was expected, which could lead to
invalid OpenAI function schemas.

The updated implementation ensures that:
- Any schema with "additionalProperties" already set will now be forced
to False under strict mode.
- Recursive traversal of properties, items, and anyOf is preserved.
- Function signature remains unchanged for backward compatibility.

# Issue
When using tool calling in OpenAI structured output strict mode
(strict=True), 400: "Invalid schema for response_format XXXXX
'additionalProperties' is required to be supplied and to be false" error
raises for the parameter that contains dict type. OpenAI requires
additionalProperties to be set to False.
Some PRs try to resolved the issue.
- PR #25169 introduced _recursive_set_additional_properties_false to
recursively set additionalProperties=False.
- PR #26287 fixed handling of empty parameter tools for OpenAI function
generation.
- PR #30971 added support for Union type arguments in strict mode of
OpenAI function calling / structured output.

Despite these improvements, since Pydantic 2.11, it will always add
`additionalProperties: True` for arbitrary dictionary schemas dict or
Any (https://pydantic.dev/articles/pydantic-v2-11-release#changes).
Schemas that already had additionalProperties=True in such cases were
not being overridden, which this PR addresses to ensure strict mode
behaves correctly in all cases.

# Dependencies
No Changes

---------

Co-authored-by: Zhong, Yu <yzhong@freewheel.com>
2025-09-11 11:02:12 -04:00
Jonathan Paserman
af17774186 docs: add MLflow tracking and evaluation cookbook (#32667)
This PR adds a new cookbook demonstrating how to build a RAG pipeline
with LangChain and track + evaluate it using MLflow.
Currently not much documentation on LangChain MLflow integration, hope
this can help folks trying to monitor and evaluate their LangChain
applications.

- ArXiv document loader 
- In Memory vector store
- LCEL rag pipeline
- MLflow tracing
- MLflow evaluation

Issue:
N/A

Dependencies:
N/A
2025-09-10 22:55:28 -04:00
chen-assert
d72da29c0b docs: Fix classification notebook small mistake (#32636)
Fix some minor issues in the Classification Notebook.
While some code still using hardcoded OpenAI model instead of selected
chat model.

Specifically, on page [Classify Text into
Labels](https://python.langchain.com/docs/tutorials/classification/)

We selected chat model before and have init_chat_model with our chosen
mode.
<img width="1262" height="576" alt="image"
src="https://github.com/user-attachments/assets/14eb436b-d2ef-4074-96d8-71640a13c0f7"
/>

But the following sample code still uses the hard-coded OpenAI model,
which in my case is obviously unrunable (lack of openai api key)
<img width="1263" height="543" alt="image"
src="https://github.com/user-attachments/assets/d13846aa-1c4b-4dee-b9c1-c66570ba3461"
/>
2025-09-10 22:43:44 -04:00
Amit Biswas
653b0908af docs: update Confident callback integration and examples (#32458)
**Description:**
Updates the Confident AI integration documentation to use modern
patterns and improve code quality. This change:
- Replaces deprecated `DeepEvalCallbackHandler` with the new
`CallbackHandler` from `deepeval.integrations.langchain`
- Updates installation and authentication instructions to match current
best practices
- Adds modern integration examples using LangChain's latest patterns
- Removes deprecated metrics and outdated code examples
- Updates code samples to follow current best practices

The changes make the documentation more maintainable and ensure users
follow the recommended integration patterns.

**Issue:** Fixes #32444

**Dependencies:**
- deepeval
- langchain
- langchain-openai

**Twitter handle:** @Muwinuddin

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-09-10 22:43:31 -04:00
GDanksAnchor
eb77da7de5 docs: add name title for Anchor Browser (#32512)
# description
change the sidebar name to Anchor Browser from anchor_browser.

# Issue
Anchor Browser sidebar name looks unattractive.
2025-09-10 22:40:37 -04:00
Tianyu Chen
9c93439a01 docs: add Linux quick setup method for JaguarDB (#32520)
Description:
Added "Method Two: Quick Setup (Linux)" section to prerequisites,
providing a curl-based installation method for deploying JaguarDB
without Docker. Retained original Docker setup instructions for
flexibility.
2025-09-10 22:36:01 -04:00
Marco Vinciguerra
64fe1e9a80 docs: update scrapegraph.ipynb (#32617)
I updated ScrapeGraphAI for checking the new ScrapeGraphAI tool
2025-09-10 22:33:57 -04:00
chen-assert
e4a90490c3 docs: Fix agents tutorials parameter missing (#32639)
Fix a minor issue in the Agents Tutorials Notebook.
While a config parameter is missing.

Specifically, on page [Build an Agent#Streaming
tokens](https://python.langchain.com/docs/tutorials/agents/#streaming-tokens)

These pieces of code can not be run without the config parameter, which
seems to have been omitted by the author.
<img width="1318" height="691" alt="image"
src="https://github.com/user-attachments/assets/54ce2833-9499-41bb-9de0-d5f9beba9ef9"
/>
2025-09-10 22:27:24 -04:00
dwelch-spike
80776b80f0 docs: remove aerospike vector store (#32726)
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**

- **Description:** Aerospike Vector Store has been retired. It is no
longer supported so It should no longer be documented on the Langchain
site.

- **Add tests and docs**: Removes docs for retired Aerospike vector
store.

- **Lint and test**: NA
2025-09-10 22:19:43 -04:00
may
2c2bab93fc docs: add example for reusing an existing collection (#32774)
Added a short section to the Weaviate integration docs showing how to
connect to an existing collection (reuse an index) with
`WeaviateVectorStore`. This helps clarify required parameters
(`index_name`, `text_key`) when loading a pre-existing store, which was
previously missing.

Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**

### Description
Added a short section to the Weaviate integration docs showing how to
connect to an existing collection (reuse an index) with
`WeaviateVectorStore`. This helps clarify required parameters
(`index_name`, `text_key`) when loading a pre-existing store, which was
previously missing.

### Issue
Fixes langchain-ai/langchain-weaviate#197

### Dependencies
None
2025-09-10 22:16:46 -04:00
Mateusz Świtała
221c96e7b4 docs: fix import path in WatsonxToolkit after releasing langchain-ibm 0.3.17 (#32746)
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**

- [x] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
  - Examples:
    - feat(core): add multi-tenant support
    - fix(cli): resolve flag parsing error
    - docs(openai): update API usage examples
  - Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore,
revert, release
  - Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma,
deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama,
openai, perplexity, prompty, qdrant, xai
  - Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do
not include it in the PR.

- [x] **PR message**: 
- **Description:** Fixing the import path for `WatsonxToolkit` in
examples after releasing `lnagchain-ibm==0.3.17`

- [ ] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
2025-09-10 22:14:43 -04:00
yrk111222
364465bd11 docs: update modelscope.mdx (#32823)
### Description
This PR is primarily aimed at updating some usage methods in the
`modelscope.mdx` file.
Specifically, it changes from `ModelScopeLLM` to `ModelScopeEndpoint`.
### Relevant PR
The relevant PR link is:
https://github.com/langchain-ai/langchain/pull/28941
2025-09-10 22:07:19 -04:00
Mason Daugherty
7b874da9b2 fix(docs): text-embedding-004 -> gemini-embedding-001 (#32596)
`text-embedding-004` will be discontinued
2025-09-10 21:47:45 -04:00
Mason Daugherty
8e213c9f1a fix(core): AsyncCallbackHandler docstring cleanup (#32897)
plus IDE warning fixes
2025-09-10 21:31:45 -04:00
Yash Vishwanath Tobre
a8828b1bda fix(core): raise OutputParserException for non-dict JSON outputs (#32236)
**Description:**
Raise a more descriptive OutputParserException when JSON parsing results
in a non-dict type. This improves debugging and aligns behavior with
expectations when using expected_keys.

**Issue:**
Fixes #32233

**Twitter handle:**
@yashvtobre

**Testing:**

- Ran make format and make lint from the root directory; both passed
cleanly.
- Attempted make test but no such target exists in the root Makefile.
- Executed tests directly via pytest targeting the relevant test file,
confirming all tests pass except for unrelated async test failures
outside the scope of this change.

**Notes:**

- No additional dependencies introduced.
- Changes are backward compatible and isolated within the output parser
module.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-09-10 20:57:09 -04:00
Mason Daugherty
7a158c7f1c revert: "chore: remove ruff target-version" (#32895)
Reverts langchain-ai/langchain#32880

Not needed at the moment, will do when finishing v1
2025-09-10 20:56:48 -04:00
Daniel Barker
25c34bd9b2 feat(core): allow custom Mermaid URL (#32831)
- **Description:** Currently,
`langchain_core.runnables.graph_mermaid.py` is hardcoded to use
mermaid.ink to render graph diagrams. It would be nice to allow users to
specify a custom URL, e.g. for self-hosted instances of the Mermaid
server.
- **Issue:** [Langchain Forum: allow custom mermaid API
URL](https://forum.langchain.com/t/feature-request-allow-custom-mermaid-api-url/1472)
  - **Dependencies:** None

- [X] **Add tests and docs**: Added unit tests using mock requests.
- [X] **Lint and test**: Run `make format`, `make lint` and `make test`.

Minimal example using the feature:

```python
import os
import operator
from pathlib import Path
from typing import Any, Annotated, TypedDict

from langgraph.graph import StateGraph

class State(TypedDict):
    messages: Annotated[list[dict[str, Any]], operator.add]

def hello_node(state: State) -> State:
    return {"messages": [{"role": "assistant", "content": "pong!"}]}

builder = StateGraph(State)
builder.add_node("hello_node", hello_node)
builder.add_edge("__start__", "hello_node")
builder.add_edge("hello_node", "__end__")

graph = builder.compile()

# Run graph
output = graph.invoke({"messages": [{"role": "user", "content": "ping?"}]})

# Draw graph
Path("graph.png").write_bytes(graph.get_graph().draw_mermaid_png(base_url="https://custom-mermaid.ink"))
```

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-09-10 17:14:50 -04:00
Mason Daugherty
38001699d5 fix(anthropic): remove unneeded beta flags (#32893)
- Beta isn't needed for search result tests anymore
- Add TODO for other tests to come back when generally available
- Regenerate remote MCP snapshot after some testing (now the same, but
fresher)
- Bump deps
2025-09-10 20:47:13 +00:00
Mason Daugherty
3da0377c02 fix(anthropic): update ChatAnthropic model in tests/doc (#32892)
from `'claude-3-5-sonnet-latest'` to `'claude-3-5-haiku-latest'` since
sonnet is deprecated
2025-09-10 16:44:04 -04:00
JADAVA VINEETH KUMAR RAO
0abf82a45a fix(openai): ainvoke uses async _aget_response; add async tests (#32459) 2025-09-10 15:52:15 -04:00
Jonathan Hill
2fed177d0b fix(core): preserve ToolMessage.status field in convert_to_messages (#32840) 2025-09-10 15:49:39 -04:00
Aasish
9c7d262ff4 fix(openai): update AzureOpenAIEmbeddings validation logic for openai_api_base (#31782) 2025-09-10 14:53:30 -04:00
ccurme
67e651b592 fix(infra): fix min version check (#32891)
Should no longer require `langchain-core>=(version in monorepo)`
2025-09-10 14:04:26 -04:00
Shibayan003
f08dfb6f49 test: Add failing test for BaseCallbackManager.merge (#32040)
This pull request introduces a failing unit test to reproduce the bug
reported in issue #32028.
The test asserts the expected behavior: `BaseCallbackManager.merge()`
should combine `handlers` and `inheritable_handlers` independently,
without mixing them. This test will fail on the current codebase and is
intended to guide the fix and prevent future regressions.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-10 13:56:18 -04:00
ccurme
450870c9ac release(qdrant): 0.2.1 (#32889) 2025-09-10 13:21:16 -04:00
Zhou Jing
10dfeea110 fix(qdrant): allow as_retriever to work without embeddings in SPARSE mode (#32757) 2025-09-10 13:08:50 -04:00
ccurme
34ecb92178 release(openai): 0.3.33 (#32887) 2025-09-10 11:53:26 -04:00
ccurme
49b3918c26 fix(infra): update scheduled test workflow following uv migration in langchain-google (#32886) 2025-09-10 11:30:55 -04:00
Christophe Bornet
12921a94c5 test(core): reactivate commented tests in test_indexing (#32882)
* These tests now pass
* Commenting them is a [ruff
ERA](https://docs.astral.sh/ruff/rules/commented-out-code/) violation
2025-09-10 11:14:14 -04:00
Alexey Bondarenko
181bb91ce0 fix(ollama): Fix handling message content lists (#32881)
The Ollama chat model adapter does not support all of the possible
message content formats. That leads to Ollama model adapter crashing on
some messages from different models (e.g. Gemini 2.5 Flash).

These changes should fix one known scenario - when `content` is a list
containing a string.
2025-09-10 11:13:28 -04:00
Christophe Bornet
b274416441 chore: remove ruff target-version (#32880)
This is not needed anymore since `requires-python` was added when moving
to `uv`.
2025-09-10 11:12:30 -04:00
ccurme
389a781aa0 fix(infra): exclude pre-releases from latest version checks in core release workflow (#32883) 2025-09-10 10:35:24 -04:00
William FH
443f0ccb0e release(core): 0.3.76 (#32877) 2025-09-10 14:10:44 +00:00
Lauren Hirata Singh
00e547c311 docs: update banner with docs deprecation notice (#32871) 2025-09-10 00:35:43 +00:00
Sydney Runkle
d464d3089b chore: redirect docs template -> docs repo (#32872) 2025-09-09 18:24:22 -04:00
William FH
f1d44d0f9d fix(core): honor enabled=false in nested tracing (#31986)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-09-09 13:12:17 -07:00
Christophe Bornet
a35ee49f37 chore(langchain): enable ruff docstring-code-format in langchain (#32858) 2025-09-09 15:00:38 -04:00
Christophe Bornet
352ff363ca chore(cli): remove ruff exclusion of templates (#32864) 2025-09-09 14:56:47 -04:00
Christophe Bornet
256a0b5f2f chore(langchain): add ruff rule BLE (#32868)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-09 18:52:53 +00:00
ccurme
937087a29c release(groq): 0.3.8 (#32870) 2025-09-09 14:39:02 -04:00
Jan Z
08bf4c321f feat(groq): add support for json_schema (#32396) 2025-09-09 18:30:07 +00:00
Mason Daugherty
4c6af2d1b2 fix(openai): structured output (#32551) 2025-09-09 11:37:50 -04:00
Christophe Bornet
ee268db1c5 feat(standard-tests): add a property to skip relevant tests if the vector store doesn't support get_by_ids() (#32633) 2025-09-09 11:37:23 -04:00
Zhou Jing
dcc517b187 fix(core): ensure InjectedToolCallId always overrides LLM-generated values (#32766) 2025-09-09 11:25:52 -04:00
Mason Daugherty
c124e67325 chore(docs): update package READMEs (#32869)
- Fix badges
- Focus on agents
- Cut down fluff
2025-09-09 14:50:32 +00:00
Christophe Bornet
699a5d06d1 chore(langchain): add ruff rule ERA (#32867) 2025-09-09 10:13:18 -04:00
Christophe Bornet
00f699c60d chore(core): cleanup pyproject.toml (#32865) 2025-09-09 10:12:18 -04:00
Christophe Bornet
e36e25fe2f feat(langchain): support PEP604 ( | union) in tool node error handlers (#32861)
This allows to use PEP604 syntax for `ToolNode` error handlers
```python
def error_handler(e: ValueError | ToolException) -> str:
    return "error"

ToolNode(my_tool, handle_tool_errors=error_handler).invoke(...)
```
Without this change, this fails with `AttributeError: 'types.UnionType'
object has no attribute '__mro__'`
2025-09-09 10:11:12 -04:00
Christophe Bornet
cc3b5afe52 fix(huggingface): fix typing in test_standard (#32863) 2025-09-09 10:05:41 -04:00
Gal Bloch
428c2ee6c5 fix(langchain): preserve supplied llm in FlareChain.from_llm (#32847) 2025-09-09 13:41:23 +00:00
Christophe Bornet
714f74a847 refactor(core): improve beta decorator (#32505)
This is better than using a subclass as returning a `property` works
with `ClassWithBetaMethods.beta_property.__doc__`

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 18:06:48 -04:00
Christophe Bornet
c3b28c769a chore(langchain): add ruff rules D (except D100 and D104) (#31994)
See https://docs.astral.sh/ruff/rules/#pydocstyle-d

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 21:47:22 +00:00
Christophe Bornet
017348b27c chore(langchain): add ruff rule E501 in langchain_v1 (#32812)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 17:28:14 -04:00
Christophe Bornet
1e101ae9a2 chore(langchain): add ruff rules N (#32098)
See https://docs.astral.sh/ruff/rules/#pep8-naming-n

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 17:27:43 -04:00
Christophe Bornet
fe6c415c9f chore(langchain): add ruff rule UP007 in langchain_v1 (#32811)
Done by autofix
2025-09-08 17:26:00 -04:00
Christophe Bornet
54c2419a4e chore(langchain): enable ruff docstring-code-format in langchain_v1 (#32855) 2025-09-08 16:51:18 -04:00
Mason Daugherty
35e9d36b0e fix(standard-tests): ensure non-negative token counts in usage metadata assertions (#32593) 2025-09-08 16:49:26 -04:00
Christophe Bornet
8b90eae455 chore(text-splitters): enable ruff docstring-code-format (#32854) 2025-09-08 16:40:11 -04:00
Christophe Bornet
05d14775f2 chore(standard-tests): enable ruff docstring-code-format (#32852) 2025-09-08 16:39:53 -04:00
PieterKok-jaam
33c7f230e0 feat(core): add id field to Document passed to filter for InMemoryVectorStore similarity search (#32688)
Added an id field to the Document passed to filter for
InMemoryVectorStore similarity search. This allows filtering by Document
id and brings the input to the filter in line with the result returned
by the vector similarity search.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-09-08 20:39:18 +00:00
Mason Daugherty
97dd7628d2 chore: update badges (#32851)
- stars badge redundant (look at the top of the page)
- remove version badge since we have many pkgs (and it was only showing
core) -- also, just look at the releases tab to the right of the readme
2025-09-08 20:06:59 +00:00
Adithya1617
f5bd00d1f1 feat(core): support AWS Bedrock document content blocks in msg_content_output (#32799) 2025-09-08 19:40:28 +00:00
Sadra Barikbin
3486d6c74d feat(core): support for adding PromptTemplates with formats other than f-string (#32253)
Allow adding`PromptTemplate`s with formats other than `f-string`. Fixes
#32151

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-09-08 19:16:54 +00:00
Stefano Lottini
390606c155 fix(standard-tests): standard vectorstore tests accept out-of-order get_by_ids (#32821)
- **Description:** The vectorstore standard-test mistakenly assumes that
the store's `get_by_ids` respects the order of the provided `ids`. This
is not the case (as the base class docstring states). This PR fixes
those tests that would fail otherwise (see issue #32820 for details,
repro and all). Fixes #32820
- **Issue:** Fixes #32820
- **Dependencies:** none

Co-authored-by: Stefano Lottini <stefano.lottini@ibm.com>
2025-09-08 14:22:14 -04:00
Christophe Bornet
cc98fb9bee chore(core): add ruff rule PLC0415 (#32351)
See https://docs.astral.sh/ruff/rules/import-outside-top-level/

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 14:15:04 -04:00
Christophe Bornet
16420cad71 chore(core): fix some pydocs to use google-style (#32764)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 17:52:17 +00:00
Christophe Bornet
01fdeede50 chore(core): fix some ruff preview rules (#32785)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 15:55:20 +00:00
Christophe Bornet
f4e83e0ad8 chore(core): fix some docstrings (from DOC preview rule) (#32833)
* Add `Raises` sections
* Add `Returns` sections
* Add `Yields` sections

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 15:44:15 +00:00
dependabot[bot]
4024d47412 chore(infra): bump actions/setup-python from 5 to 6 (#32842)
Bumps [actions/setup-python](https://github.com/actions/setup-python)
from 5 to 6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/setup-python/releases">actions/setup-python's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2>What's Changed</h2>
<h3>Breaking Changes</h3>
<ul>
<li>Upgrade to node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1164">actions/setup-python#1164</a></li>
</ul>
<p>Make sure your runner is on version v2.327.1 or later to ensure
compatibility with this release. <a
href="https://github.com/actions/runner/releases/tag/v2.327.1">See
Release Notes</a></p>
<h3>Enhancements:</h3>
<ul>
<li>Add support for <code>pip-version</code> by <a
href="https://github.com/priyagupta108"><code>@​priyagupta108</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1129">actions/setup-python#1129</a></li>
<li>Enhance reading from .python-version by <a
href="https://github.com/krystof-k"><code>@​krystof-k</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/787">actions/setup-python#787</a></li>
<li>Add version parsing from Pipfile by <a
href="https://github.com/aradkdj"><code>@​aradkdj</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1067">actions/setup-python#1067</a></li>
</ul>
<h3>Bug fixes:</h3>
<ul>
<li>Clarify pythonLocation behaviour for PyPy and GraalPy in environment
variables by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1183">actions/setup-python#1183</a></li>
<li>Change missing cache directory error to warning by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1182">actions/setup-python#1182</a></li>
<li>Add Architecture-Specific PATH Management for Python with --user
Flag on Windows by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1122">actions/setup-python#1122</a></li>
<li>Include python version in PyPy python-version output by <a
href="https://github.com/cdce8p"><code>@​cdce8p</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1110">actions/setup-python#1110</a></li>
<li>Update docs: clarification on pip authentication with setup-python
by <a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1156">actions/setup-python#1156</a></li>
</ul>
<h3>Dependency updates:</h3>
<ul>
<li>Upgrade idna from 2.9 to 3.7 in /<strong>tests</strong>/data by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-python/pull/843">actions/setup-python#843</a></li>
<li>Upgrade form-data to fix critical vulnerabilities <a
href="https://redirect.github.com/actions/setup-python/issues/182">#182</a>
&amp; <a
href="https://redirect.github.com/actions/setup-python/issues/183">#183</a>
by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1163">actions/setup-python#1163</a></li>
<li>Upgrade setuptools to 78.1.1 to fix path traversal vulnerability in
PackageIndex.download by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1165">actions/setup-python#1165</a></li>
<li>Upgrade actions/checkout from 4 to 5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-python/pull/1181">actions/setup-python#1181</a></li>
<li>Upgrade <code>@​actions/tool-cache</code> from 2.0.1 to 2.0.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>[bot]
in <a
href="https://redirect.github.com/actions/setup-python/pull/1095">actions/setup-python#1095</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/krystof-k"><code>@​krystof-k</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/setup-python/pull/787">actions/setup-python#787</a></li>
<li><a href="https://github.com/cdce8p"><code>@​cdce8p</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/setup-python/pull/1110">actions/setup-python#1110</a></li>
<li><a href="https://github.com/aradkdj"><code>@​aradkdj</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/setup-python/pull/1067">actions/setup-python#1067</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-python/compare/v5...v6.0.0">https://github.com/actions/setup-python/compare/v5...v6.0.0</a></p>
<h2>v5.6.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Workflow updates related to Ubuntu 20.04 by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1065">actions/setup-python#1065</a></li>
<li>Fix for Candidate Not Iterable Error by <a
href="https://github.com/aparnajyothi-y"><code>@​aparnajyothi-y</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1082">actions/setup-python#1082</a></li>
<li>Upgrade semver and <code>@​types/semver</code> by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1091">actions/setup-python#1091</a></li>
<li>Upgrade prettier from 2.8.8 to 3.5.3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1046">actions/setup-python#1046</a></li>
<li>Upgrade ts-jest from 29.1.2 to 29.3.2 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1081">actions/setup-python#1081</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/setup-python/compare/v5...v5.6.0">https://github.com/actions/setup-python/compare/v5...v5.6.0</a></p>
<h2>v5.5.0</h2>
<h2>What's Changed</h2>
<h3>Enhancements:</h3>
<ul>
<li>Support free threaded Python versions like '3.13t' by <a
href="https://github.com/colesbury"><code>@​colesbury</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/973">actions/setup-python#973</a></li>
<li>Enhance Workflows: Include ubuntu-arm runners, Add e2e Testing for
free threaded and Upgrade <code>@​action/cache</code> from 4.0.0 to
4.0.3 by <a
href="https://github.com/priya-kinthali"><code>@​priya-kinthali</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1056">actions/setup-python#1056</a></li>
<li>Add support for .tool-versions file in setup-python by <a
href="https://github.com/mahabaleshwars"><code>@​mahabaleshwars</code></a>
in <a
href="https://redirect.github.com/actions/setup-python/pull/1043">actions/setup-python#1043</a></li>
</ul>
<h3>Bug fixes:</h3>
<ul>
<li>Fix architecture for pypy on Linux ARM64 by <a
href="https://github.com/mayeut"><code>@​mayeut</code></a> in <a
href="https://redirect.github.com/actions/setup-python/pull/1011">actions/setup-python#1011</a>
This update maps arm64 to aarch64 for Linux ARM64 PyPy
installations.</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="e797f83bcb"><code>e797f83</code></a>
Upgrade to node 24 (<a
href="https://redirect.github.com/actions/setup-python/issues/1164">#1164</a>)</li>
<li><a
href="3d1e2d2ca0"><code>3d1e2d2</code></a>
Revert &quot;Enhance cache-dependency-path handling to support files
outside the w...</li>
<li><a
href="65b071217a"><code>65b0712</code></a>
Clarify pythonLocation behavior for PyPy and GraalPy in environment
variables...</li>
<li><a
href="5b668cf765"><code>5b668cf</code></a>
Bump actions/checkout from 4 to 5 (<a
href="https://redirect.github.com/actions/setup-python/issues/1181">#1181</a>)</li>
<li><a
href="f62a0e252f"><code>f62a0e2</code></a>
Change missing cache directory error to warning (<a
href="https://redirect.github.com/actions/setup-python/issues/1182">#1182</a>)</li>
<li><a
href="9322b3ca74"><code>9322b3c</code></a>
Upgrade setuptools to 78.1.1 to fix path traversal vulnerability in
PackageIn...</li>
<li><a
href="fbeb884f69"><code>fbeb884</code></a>
Bump form-data to fix critical vulnerabilities <a
href="https://redirect.github.com/actions/setup-python/issues/182">#182</a>
&amp; <a
href="https://redirect.github.com/actions/setup-python/issues/183">#183</a>
(<a
href="https://redirect.github.com/actions/setup-python/issues/1163">#1163</a>)</li>
<li><a
href="03bb6152f4"><code>03bb615</code></a>
Bump idna from 2.9 to 3.7 in /<strong>tests</strong>/data (<a
href="https://redirect.github.com/actions/setup-python/issues/843">#843</a>)</li>
<li><a
href="36da51d563"><code>36da51d</code></a>
Add version parsing from Pipfile (<a
href="https://redirect.github.com/actions/setup-python/issues/1067">#1067</a>)</li>
<li><a
href="3c6f142cc0"><code>3c6f142</code></a>
update documentation (<a
href="https://redirect.github.com/actions/setup-python/issues/1156">#1156</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/actions/setup-python/compare/v5...v6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/setup-python&package-manager=github_actions&previous-version=5&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 11:27:54 -04:00
Christophe Bornet
f589168411 refactor(core): use pytest style in TestGetBufferString (#32786)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 15:16:13 +00:00
Christophe Bornet
5840dad40b chore(core): enable ruff docstring-code-format (#32834)
See https://docs.astral.sh/ruff/settings/#format_docstring-code-format

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 15:13:50 +00:00
Christophe Bornet
e3b6c9bb66 chore(core): fix some mypy warn_unreachable issues (#32560)
Found by setting `warn_unreachable: true` in mypy.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 15:02:08 +00:00
Christophe Bornet
c672590f42 chore(standard-tests): select ALL rules with exclusions (#31937)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 14:57:47 +00:00
Christophe Bornet
323729915a chore(standard-tests): add mypy strict checking (#32384)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 10:50:38 -04:00
Christophe Bornet
0c3e8ccd0e chore(text-splitters): select ALL rules with exclusions (#32325)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 14:46:09 +00:00
Christophe Bornet
20401df25d chore(cli): fix some DOC rules (preview) (#32839)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-08 14:36:22 +00:00
dependabot[bot]
e0aaaccb61 chore(infra): bump aws-actions/configure-aws-credentials from 4 to 5 (#32841)
Bumps
[aws-actions/configure-aws-credentials](https://github.com/aws-actions/configure-aws-credentials)
from 4 to 5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/aws-actions/configure-aws-credentials/releases">aws-actions/configure-aws-credentials's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<h2><a
href="https://github.com/aws-actions/configure-aws-credentials/compare/v4.3.1...v5.0.0">5.0.0</a>
(2025-09-03)</h2>
<h3>⚠ BREAKING CHANGES</h3>
<ul>
<li>Cleanup input handling. Changes invalid boolean input behavior (see
<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1445">#1445</a>)</li>
</ul>
<h3>Features</h3>
<ul>
<li>add skip OIDC option (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1458">#1458</a>)
(<a
href="8c45f6b081">8c45f6b</a>)</li>
<li>Cleanup input handling. Changes invalid boolean input behavior (see
<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1445">#1445</a>)
(<a
href="74b3e27aa8">74b3e27</a>)</li>
<li>support account id allowlist (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1456">#1456</a>)
(<a
href="c4be498953">c4be498</a>)</li>
</ul>
<h2>v4.3.1</h2>
<h2><a
href="https://github.com/aws-actions/configure-aws-credentials/compare/v4.3.0...v4.3.1">4.3.1</a>
(2025-08-04)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>update readme to 4.3.1 (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1424">#1424</a>)
(<a
href="be2e7ad815">be2e7ad</a>)</li>
</ul>
<h2>v4.3.0</h2>
<h2><a
href="https://github.com/aws-actions/configure-aws-credentials/compare/v4.3.0...v4.3.0">4.3.0</a>
(2025-08-04)</h2>
<p>NOTE: This release tag originally pointed to
59b441846ad109fa4a1549b73ef4e149c4bfb53b, but a critical bug was
discovered shortly after publishing. We updated this tag to
d0834ad3a60a024346910e522a81b0002bd37fea to prevent anyone using the
4.3.0 tag from encountering the bug, and we published 4.3.1 to allow
workflows to auto update correctly.</p>
<h3>Features</h3>
<ul>
<li>dependency update and feature cleanup (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1414">#1414</a>)
(<a
href="59489ba544">59489ba</a>),
closes <a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1062">#1062</a>
<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1191">#1191</a></li>
<li>Optional environment variable output (<a
href="c3b3ce61b0">c3b3ce6</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li><strong>docs:</strong> readme samples versioning (<a
href="5b3c895046">5b3c895</a>)</li>
<li>the wrong example region for China partition in README (<a
href="37fe9a740b">37fe9a7</a>)</li>
<li>properly set proxy environment variable (<a
href="cbea70821e">cbea708</a>)</li>
</ul>
<h3>Miscellaneous Chores</h3>
<ul>
<li>release 4.3.0 (<a
href="3f7c218721">3f7c218</a>)</li>
</ul>
<h2>v4.2.1</h2>
<h2><a
href="https://github.com/aws-actions/configure-aws-credentials/compare/v4.2.0...v4.2.1">4.2.1</a>
(2025-05-14)</h2>
<h3>Bug Fixes</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/aws-actions/configure-aws-credentials/blob/main/CHANGELOG.md">aws-actions/configure-aws-credentials's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/aws-actions/configure-aws-credentials/compare/v4.3.0...v4.3.1">4.3.1</a>
(2025-08-04)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>update readme to 4.3.1 (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1424">#1424</a>)
(<a
href="be2e7ad815">be2e7ad</a>)</li>
</ul>
<h2><a
href="https://github.com/aws-actions/configure-aws-credentials/compare/v4.2.1...v4.3.0">4.3.0</a>
(2025-08-04)</h2>
<h3>Features</h3>
<ul>
<li>depenency update and feature cleanup (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1414">#1414</a>)
(<a
href="59489ba544">59489ba</a>),
closes <a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1062">#1062</a>
<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1191">#1191</a></li>
<li>Optional environment variable output (<a
href="c3b3ce61b0">c3b3ce6</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li><strong>docs:</strong> readme samples versioning (<a
href="5b3c895046">5b3c895</a>)</li>
<li>the wrong example region for China partition in README (<a
href="37fe9a740b">37fe9a7</a>)</li>
<li>properly set proxy environment variable (<a
href="cbea70821e">cbea708</a>)</li>
</ul>
<h3>Miscellaneous Chores</h3>
<ul>
<li>release 4.3.0 (<a
href="3f7c218721">3f7c218</a>)</li>
</ul>
<h2><a
href="https://github.com/aws-actions/configure-aws-credentials/compare/v4.2.0...v4.2.1">4.2.1</a>
(2025-05-14)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>ensure explicit inputs take precedence over environment variables
(<a
href="e56e6c4038">e56e6c4</a>)</li>
<li>prioritize explicit inputs over environment variables (<a
href="df9c8fed6b">df9c8fe</a>)</li>
</ul>
<h2><a
href="https://github.com/aws-actions/configure-aws-credentials/compare/v4.1.0...v4.2.0">4.2.0</a>
(2025-05-06)</h2>
<h3>Features</h3>
<ul>
<li>add Expiration field to Outputs (<a
href="a4f326760c">a4f3267</a>)</li>
<li>Document role-duration-seconds range (<a
href="5a0cf0167f">5a0cf01</a>)</li>
<li>support action inputs as environment variables (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1338">#1338</a>)
(<a
href="2c168adcae">2c168ad</a>)</li>
</ul>
<h3>Bug Fixes</h3>
<ul>
<li>make sure action builds, also fix dependabot autoapprove (<a
href="c401b8a98c">c401b8a</a>)</li>
<li>role chaning on mulitple runs (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1340">#1340</a>)
(<a
href="9e38641911">9e38641</a>)</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="a03048d875"><code>a03048d</code></a>
chore(main): release 5.0.0 (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1451">#1451</a>)</li>
<li><a
href="337f510212"><code>337f510</code></a>
chore: Fix markdown link formatting in README.md (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1466">#1466</a>)</li>
<li><a
href="f001d79eaa"><code>f001d79</code></a>
chore: update README with versioning (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1465">#1465</a>)</li>
<li><a
href="cf5f2acba3"><code>cf5f2ac</code></a>
chore: Update dist</li>
<li><a
href="b394bdd9f0"><code>b394bdd</code></a>
chore(deps-dev): bump <code>@​aws-sdk/credential-provider-env</code> (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1463">#1463</a>)</li>
<li><a
href="b632c0b5e4"><code>b632c0b</code></a>
chore(deps-dev): bump memfs from 4.38.1 to 4.38.2 (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1462">#1462</a>)</li>
<li><a
href="978e44aa36"><code>978e44a</code></a>
chore: Update dist</li>
<li><a
href="c4be498953"><code>c4be498</code></a>
feat: support account id allowlist (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1456">#1456</a>)</li>
<li><a
href="c5a43c32e1"><code>c5a43c3</code></a>
chore: Update dist</li>
<li><a
href="8c45f6b081"><code>8c45f6b</code></a>
feat: add skip OIDC option (<a
href="https://redirect.github.com/aws-actions/configure-aws-credentials/issues/1458">#1458</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/aws-actions/configure-aws-credentials/compare/v4...v5">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=aws-actions/configure-aws-credentials&package-manager=github_actions&previous-version=4&new-version=5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 10:18:01 -04:00
dependabot[bot]
9368ce6b07 chore(infra): bump google-github-actions/auth from 2 to 3 (#32777)
Bumps
[google-github-actions/auth](https://github.com/google-github-actions/auth)
from 2 to 3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/google-github-actions/auth/releases">google-github-actions/auth's
releases</a>.</em></p>
<blockquote>
<h2>v3.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Bump to Node 24 and remove old parameters by <a
href="https://github.com/sethvargo"><code>@​sethvargo</code></a> in <a
href="https://redirect.github.com/google-github-actions/auth/pull/508">google-github-actions/auth#508</a></li>
<li>Remove hacky script by <a
href="https://github.com/sethvargo"><code>@​sethvargo</code></a> in <a
href="https://redirect.github.com/google-github-actions/auth/pull/509">google-github-actions/auth#509</a></li>
<li>Release: v3.0.0 by <a
href="https://github.com/google-github-actions-bot"><code>@​google-github-actions-bot</code></a>
in <a
href="https://redirect.github.com/google-github-actions/auth/pull/510">google-github-actions/auth#510</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/google-github-actions/auth/compare/v2...v3.0.0">https://github.com/google-github-actions/auth/compare/v2...v3.0.0</a></p>
<h2>v2.1.13</h2>
<h2>What's Changed</h2>
<ul>
<li>Update deps by <a
href="https://github.com/sethvargo"><code>@​sethvargo</code></a> in <a
href="https://redirect.github.com/google-github-actions/auth/pull/506">google-github-actions/auth#506</a></li>
<li>Release: v2.1.13 by <a
href="https://github.com/google-github-actions-bot"><code>@​google-github-actions-bot</code></a>
in <a
href="https://redirect.github.com/google-github-actions/auth/pull/507">google-github-actions/auth#507</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/google-github-actions/auth/compare/v2.1.12...v2.1.13">https://github.com/google-github-actions/auth/compare/v2.1.12...v2.1.13</a></p>
<h2>v2.1.12</h2>
<h2>What's Changed</h2>
<ul>
<li>Add retries for getIDToken by <a
href="https://github.com/sethvargo"><code>@​sethvargo</code></a> in <a
href="https://redirect.github.com/google-github-actions/auth/pull/502">google-github-actions/auth#502</a></li>
<li>Release: v2.1.12 by <a
href="https://github.com/google-github-actions-bot"><code>@​google-github-actions-bot</code></a>
in <a
href="https://redirect.github.com/google-github-actions/auth/pull/503">google-github-actions/auth#503</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/google-github-actions/auth/compare/v2.1.11...v2.1.12">https://github.com/google-github-actions/auth/compare/v2.1.11...v2.1.12</a></p>
<h2>v2.1.11</h2>
<h2>What's Changed</h2>
<ul>
<li>Update troubleshooting docs for Python by <a
href="https://github.com/sethvargo"><code>@​sethvargo</code></a> in <a
href="https://redirect.github.com/google-github-actions/auth/pull/488">google-github-actions/auth#488</a></li>
<li>Add linters by <a
href="https://github.com/sethvargo"><code>@​sethvargo</code></a> in <a
href="https://redirect.github.com/google-github-actions/auth/pull/499">google-github-actions/auth#499</a></li>
<li>Update deps by <a
href="https://github.com/sethvargo"><code>@​sethvargo</code></a> in <a
href="https://redirect.github.com/google-github-actions/auth/pull/500">google-github-actions/auth#500</a></li>
<li>Release: v2.1.11 by <a
href="https://github.com/google-github-actions-bot"><code>@​google-github-actions-bot</code></a>
in <a
href="https://redirect.github.com/google-github-actions/auth/pull/501">google-github-actions/auth#501</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/google-github-actions/auth/compare/v2.1.10...v2.1.11">https://github.com/google-github-actions/auth/compare/v2.1.10...v2.1.11</a></p>
<h2>v2.1.10</h2>
<h2>What's Changed</h2>
<ul>
<li>Declare workflow permissions by <a
href="https://github.com/sethvargo"><code>@​sethvargo</code></a> in <a
href="https://redirect.github.com/google-github-actions/auth/pull/482">google-github-actions/auth#482</a></li>
<li>Document that the OIDC token expires in 5min by <a
href="https://github.com/sethvargo"><code>@​sethvargo</code></a> in <a
href="https://redirect.github.com/google-github-actions/auth/pull/483">google-github-actions/auth#483</a></li>
<li>Release: v2.1.10 by <a
href="https://github.com/google-github-actions-bot"><code>@​google-github-actions-bot</code></a>
in <a
href="https://redirect.github.com/google-github-actions/auth/pull/484">google-github-actions/auth#484</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/google-github-actions/auth/compare/v2.1.9...v2.1.10">https://github.com/google-github-actions/auth/compare/v2.1.9...v2.1.10</a></p>
<h2>v2.1.9</h2>
<h2>What's Changed</h2>
<ul>
<li>Use our custom boolean parsing by <a
href="https://github.com/sethvargo"><code>@​sethvargo</code></a> in <a
href="https://redirect.github.com/google-github-actions/auth/pull/478">google-github-actions/auth#478</a></li>
<li>Update deps by <a
href="https://github.com/sethvargo"><code>@​sethvargo</code></a> in <a
href="https://redirect.github.com/google-github-actions/auth/pull/479">google-github-actions/auth#479</a></li>
<li>Release: v2.1.9 by <a
href="https://github.com/google-github-actions-bot"><code>@​google-github-actions-bot</code></a>
in <a
href="https://redirect.github.com/google-github-actions/auth/pull/480">google-github-actions/auth#480</a></li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="7c6bc770da"><code>7c6bc77</code></a>
Release: v3.0.0 (<a
href="https://redirect.github.com/google-github-actions/auth/issues/510">#510</a>)</li>
<li><a
href="42e4997ee3"><code>42e4997</code></a>
Remove hacky script (<a
href="https://redirect.github.com/google-github-actions/auth/issues/509">#509</a>)</li>
<li><a
href="5ea4dc1147"><code>5ea4dc1</code></a>
Bump to Node 24 and remove old parameters (<a
href="https://redirect.github.com/google-github-actions/auth/issues/508">#508</a>)</li>
<li>See full diff in <a
href="https://github.com/google-github-actions/auth/compare/v2...v3">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=google-github-actions/auth&package-manager=github_actions&previous-version=2&new-version=3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 10:16:59 -04:00
dependabot[bot]
f8bcc98362 chore(infra): bump amannn/action-semantic-pull-request from 5 to 6 (#32585)
Bumps
[amannn/action-semantic-pull-request](https://github.com/amannn/action-semantic-pull-request)
from 5 to 6.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/amannn/action-semantic-pull-request/releases">amannn/action-semantic-pull-request's
releases</a>.</em></p>
<blockquote>
<h2>v6.0.0</h2>
<h2><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5.5.3...v6.0.0">6.0.0</a>
(2025-08-13)</h2>
<h3>⚠ BREAKING CHANGES</h3>
<ul>
<li>Upgrade action to use Node.js 24 and ESM (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/287">#287</a>)</li>
</ul>
<h3>Features</h3>
<ul>
<li>Upgrade action to use Node.js 24 and ESM (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/287">#287</a>)
(<a
href="bc0c9a79ab">bc0c9a7</a>)</li>
</ul>
<h2>v5.5.3</h2>
<h2><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5.5.2...v5.5.3">5.5.3</a>
(2024-06-28)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>Bump <code>braces</code> dependency (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/269">#269</a>.
by <a href="https://github.com/EelcoLos"><code>@​EelcoLos</code></a>)
(<a
href="2d952a1bf9">2d952a1</a>)</li>
</ul>
<h2>v5.5.2</h2>
<h2><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5.5.1...v5.5.2">5.5.2</a>
(2024-04-24)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>Bump tar from 6.1.11 to 6.2.1 (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/262">#262</a>
by <a href="https://github.com/EelcoLos"><code>@​EelcoLos</code></a>)
(<a
href="9a90d5a5ac">9a90d5a</a>)</li>
</ul>
<h2>v5.5.1</h2>
<h2><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5.5.0...v5.5.1">5.5.1</a>
(2024-04-24)</h2>
<h3>Bug Fixes</h3>
<ul>
<li>Bump ip from 2.0.0 to 2.0.1 (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/263">#263</a>
by <a href="https://github.com/EelcoLos"><code>@​EelcoLos</code></a>)
(<a
href="5e7e9acca3">5e7e9ac</a>)</li>
</ul>
<h2>v5.5.0</h2>
<h2><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5.4.0...v5.5.0">5.5.0</a>
(2024-04-23)</h2>
<h3>Features</h3>
<ul>
<li>Add outputs for <code>type</code>, <code>scope</code> and
<code>subject</code> (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/261">#261</a>
by <a href="https://github.com/bcaurel"><code>@​bcaurel</code></a>) (<a
href="b05f5f6423">b05f5f6</a>)</li>
</ul>
<h2>v5.4.0</h2>
<h2><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5.3.0...v5.4.0">5.4.0</a>
(2023-11-03)</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/amannn/action-semantic-pull-request/blob/main/CHANGELOG.md">amannn/action-semantic-pull-request's
changelog</a>.</em></p>
<blockquote>
<h2><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5.2.0...v5.3.0">5.3.0</a>
(2023-09-25)</h2>
<h3>Features</h3>
<ul>
<li>Use Node.js 20 in action (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/240">#240</a>)
(<a
href="4c0d5a21fc">4c0d5a2</a>)</li>
</ul>
<h2><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5.1.0...v5.2.0">5.2.0</a>
(2023-03-16)</h2>
<h3>Features</h3>
<ul>
<li>Update dependencies by <a
href="https://github.com/EelcoLos"><code>@​EelcoLos</code></a> (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/229">#229</a>)
(<a
href="e797448a07">e797448</a>)</li>
</ul>
<h2><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5.0.2...v5.1.0">5.1.0</a>
(2023-02-10)</h2>
<h3>Features</h3>
<ul>
<li>Add regex support to <code>scope</code> and
<code>disallowScopes</code> configuration (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/226">#226</a>)
(<a
href="403a6f8924">403a6f8</a>)</li>
</ul>
<h3><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5.0.1...v5.0.2">5.0.2</a>
(2022-10-17)</h3>
<h3>Bug Fixes</h3>
<ul>
<li>Upgrade <code>@actions/core</code> to avoid deprecation warnings (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/208">#208</a>)
(<a
href="91f4126c9e">91f4126</a>)</li>
</ul>
<h3><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5.0.0...v5.0.1">5.0.1</a>
(2022-10-14)</h3>
<h3>Bug Fixes</h3>
<ul>
<li>Upgrade GitHub Action to use Node v16 (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/207">#207</a>)
(<a
href="6282ee339b">6282ee3</a>)</li>
</ul>
<h2><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v4.6.0...v5.0.0">5.0.0</a>
(2022-10-11)</h2>
<h3>⚠ BREAKING CHANGES</h3>
<ul>
<li>Enum options need to be newline delimited (to allow whitespace
within them) (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/205">#205</a>)</li>
</ul>
<h3>Features</h3>
<ul>
<li>Enum options need to be newline delimited (to allow whitespace
within them) (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/205">#205</a>)
(<a
href="c906fe1e5a">c906fe1</a>)</li>
</ul>
<h2><a
href="https://github.com/amannn/action-semantic-pull-request/compare/v4.5.0...v4.6.0">4.6.0</a>
(2022-09-26)</h2>
<h3>Features</h3>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="fdd4d3ddf6"><code>fdd4d3d</code></a>
chore: Release 6.0.1 [skip ci]</li>
<li><a
href="58e4ab40f5"><code>58e4ab4</code></a>
fix: Actually execute action (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/289">#289</a>)</li>
<li><a
href="04a8d177d9"><code>04a8d17</code></a>
chore: Release 6.0.0 [skip ci]</li>
<li><a
href="bc0c9a79ab"><code>bc0c9a7</code></a>
feat!: Upgrade action to use Node.js 24 and ESM (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/287">#287</a>)</li>
<li><a
href="631ffdc028"><code>631ffdc</code></a>
build(deps): bump the github-action-workflows group with 2 updates (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/286">#286</a>)</li>
<li><a
href="c1807ceb58"><code>c1807ce</code></a>
build: configure Dependabot (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/231">#231</a>)</li>
<li><a
href="3352882559"><code>3352882</code></a>
docs: Remove <code>synchronize</code> trigger (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/281">#281</a>)</li>
<li><a
href="04501d43b5"><code>04501d4</code></a>
docs: More restrictive permissions (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/280">#280</a>)</li>
<li><a
href="40166f0081"><code>40166f0</code></a>
chore: Update actions in release workflow (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/276">#276</a>)</li>
<li><a
href="80c0371c57"><code>80c0371</code></a>
docs: Mention <code>reopened</code> trigger in README (<a
href="https://redirect.github.com/amannn/action-semantic-pull-request/issues/272">#272</a>
by <a
href="https://github.com/garysassano"><code>@​garysassano</code></a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/amannn/action-semantic-pull-request/compare/v5...v6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=amannn/action-semantic-pull-request&package-manager=github_actions&previous-version=5&new-version=6)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 10:16:19 -04:00
dependabot[bot]
d8d93882f9 chore(infra): bump actions/checkout from 4 to 5 (#32584)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to
5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/releases">actions/checkout's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update actions checkout to use node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li>
<li>Prepare v5.0.0 release by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2238">actions/checkout#2238</a></li>
</ul>
<h2>⚠️ Minimum Compatible Runner Version</h2>
<p><strong>v2.327.1</strong><br />
<a
href="https://github.com/actions/runner/releases/tag/v2.327.1">Release
Notes</a></p>
<p>Make sure your runner is updated to this version or newer to use this
release.</p>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v4...v5.0.0">https://github.com/actions/checkout/compare/v4...v5.0.0</a></p>
<h2>v4.3.0</h2>
<h2>What's Changed</h2>
<ul>
<li>docs: update README.md by <a
href="https://github.com/motss"><code>@​motss</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li>Add internal repos for checking out multiple repositories by <a
href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li>Documentation update - add recommended permissions to Readme by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
<li>Adjust positioning of user email note and permissions heading by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li>
<li>Update CODEOWNERS for actions by <a
href="https://github.com/TingluoHuang"><code>@​TingluoHuang</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li>
<li>Update package dependencies by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li>
<li>Prepare release v4.3.0 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2237">actions/checkout#2237</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/motss"><code>@​motss</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li><a href="https://github.com/mouismail"><code>@​mouismail</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li><a href="https://github.com/benwells"><code>@​benwells</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
<li><a href="https://github.com/nebuk89"><code>@​nebuk89</code></a> made
their first contribution in <a
href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li>
<li><a href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v4...v4.3.0">https://github.com/actions/checkout/compare/v4...v4.3.0</a></p>
<h2>v4.2.2</h2>
<h2>What's Changed</h2>
<ul>
<li><code>url-helper.ts</code> now leverages well-known environment
variables by <a href="https://github.com/jww3"><code>@​jww3</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li>
<li>Expand unit test coverage for <code>isGhes</code> by <a
href="https://github.com/jww3"><code>@​jww3</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v4.2.1...v4.2.2">https://github.com/actions/checkout/compare/v4.2.1...v4.2.2</a></p>
<h2>v4.2.1</h2>
<h2>What's Changed</h2>
<ul>
<li>Check out other refs/* by commit if provided, fall back to ref by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/Jcambass"><code>@​Jcambass</code></a>
made their first contribution in <a
href="https://redirect.github.com/actions/checkout/pull/1919">actions/checkout#1919</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/actions/checkout/compare/v4.2.0...v4.2.1">https://github.com/actions/checkout/compare/v4.2.0...v4.2.1</a></p>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/actions/checkout/blob/main/CHANGELOG.md">actions/checkout's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<h2>V5.0.0</h2>
<ul>
<li>Update actions checkout to use node 24 by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2226">actions/checkout#2226</a></li>
</ul>
<h2>V4.3.0</h2>
<ul>
<li>docs: update README.md by <a
href="https://github.com/motss"><code>@​motss</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1971">actions/checkout#1971</a></li>
<li>Add internal repos for checking out multiple repositories by <a
href="https://github.com/mouismail"><code>@​mouismail</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1977">actions/checkout#1977</a></li>
<li>Documentation update - add recommended permissions to Readme by <a
href="https://github.com/benwells"><code>@​benwells</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2043">actions/checkout#2043</a></li>
<li>Adjust positioning of user email note and permissions heading by <a
href="https://github.com/joshmgross"><code>@​joshmgross</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2044">actions/checkout#2044</a></li>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2194">actions/checkout#2194</a></li>
<li>Update CODEOWNERS for actions by <a
href="https://github.com/TingluoHuang"><code>@​TingluoHuang</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/2224">actions/checkout#2224</a></li>
<li>Update package dependencies by <a
href="https://github.com/salmanmkc"><code>@​salmanmkc</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/2236">actions/checkout#2236</a></li>
</ul>
<h2>v4.2.2</h2>
<ul>
<li><code>url-helper.ts</code> now leverages well-known environment
variables by <a href="https://github.com/jww3"><code>@​jww3</code></a>
in <a
href="https://redirect.github.com/actions/checkout/pull/1941">actions/checkout#1941</a></li>
<li>Expand unit test coverage for <code>isGhes</code> by <a
href="https://github.com/jww3"><code>@​jww3</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1946">actions/checkout#1946</a></li>
</ul>
<h2>v4.2.1</h2>
<ul>
<li>Check out other refs/* by commit if provided, fall back to ref by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1924">actions/checkout#1924</a></li>
</ul>
<h2>v4.2.0</h2>
<ul>
<li>Add Ref and Commit outputs by <a
href="https://github.com/lucacome"><code>@​lucacome</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1180">actions/checkout#1180</a></li>
<li>Dependency updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a>- <a
href="https://redirect.github.com/actions/checkout/pull/1777">actions/checkout#1777</a>,
<a
href="https://redirect.github.com/actions/checkout/pull/1872">actions/checkout#1872</a></li>
</ul>
<h2>v4.1.7</h2>
<ul>
<li>Bump the minor-npm-dependencies group across 1 directory with 4
updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1739">actions/checkout#1739</a></li>
<li>Bump actions/checkout from 3 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1697">actions/checkout#1697</a></li>
<li>Check out other refs/* by commit by <a
href="https://github.com/orhantoy"><code>@​orhantoy</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1774">actions/checkout#1774</a></li>
<li>Pin actions/checkout's own workflows to a known, good, stable
version. by <a href="https://github.com/jww3"><code>@​jww3</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1776">actions/checkout#1776</a></li>
</ul>
<h2>v4.1.6</h2>
<ul>
<li>Check platform to set archive extension appropriately by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1732">actions/checkout#1732</a></li>
</ul>
<h2>v4.1.5</h2>
<ul>
<li>Update NPM dependencies by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1703">actions/checkout#1703</a></li>
<li>Bump github/codeql-action from 2 to 3 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1694">actions/checkout#1694</a></li>
<li>Bump actions/setup-node from 1 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1696">actions/checkout#1696</a></li>
<li>Bump actions/upload-artifact from 2 to 4 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1695">actions/checkout#1695</a></li>
<li>README: Suggest <code>user.email</code> to be
<code>41898282+github-actions[bot]@users.noreply.github.com</code> by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1707">actions/checkout#1707</a></li>
</ul>
<h2>v4.1.4</h2>
<ul>
<li>Disable <code>extensions.worktreeConfig</code> when disabling
<code>sparse-checkout</code> by <a
href="https://github.com/jww3"><code>@​jww3</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1692">actions/checkout#1692</a></li>
<li>Add dependabot config by <a
href="https://github.com/cory-miller"><code>@​cory-miller</code></a> in
<a
href="https://redirect.github.com/actions/checkout/pull/1688">actions/checkout#1688</a></li>
<li>Bump the minor-actions-dependencies group with 2 updates by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1693">actions/checkout#1693</a></li>
<li>Bump word-wrap from 1.2.3 to 1.2.5 by <a
href="https://github.com/dependabot"><code>@​dependabot</code></a> in <a
href="https://redirect.github.com/actions/checkout/pull/1643">actions/checkout#1643</a></li>
</ul>
<h2>v4.1.3</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="08c6903cd8"><code>08c6903</code></a>
Prepare v5.0.0 release (<a
href="https://redirect.github.com/actions/checkout/issues/2238">#2238</a>)</li>
<li><a
href="9f265659d3"><code>9f26565</code></a>
Update actions checkout to use node 24 (<a
href="https://redirect.github.com/actions/checkout/issues/2226">#2226</a>)</li>
<li>See full diff in <a
href="https://github.com/actions/checkout/compare/v4...v5">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/checkout&package-manager=github_actions&previous-version=4&new-version=5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-08 10:14:09 -04:00
Sadiq Khan
228fbac3a6 fix(openai): handle AIMessages without response_id in _get_last_messages (#32824) 2025-09-08 10:12:50 -04:00
JunHyungKang
6ea06ca972 fix(openai): Fix Azure OpenAI Responses API model field issue (#32649) 2025-09-08 10:08:35 -04:00
ccurme
5b0a55ad35 chore(openai): apply formatting changes to AzureChatOpenAI (#32848) 2025-09-08 09:54:20 -04:00
Sydney Runkle
6e2f46d04c feat(langchain): middleware support in create_agent (#32828)
## Overview

Adding new `AgentMiddleware` primitive that supports `before_model`,
`after_model`, and `prepare_model_request` hooks.

This is very exciting! It makes our `create_agent` prebuilt much more
extensible + capable. Still in alpha and subject to change.

This is different than the initial
[implementation](https://github.com/langchain-ai/langgraph/tree/nc/25aug/agent)
in that it:
* Fills in gaps w/ missing features, for ex -- new structured output,
optionality of tools + system prompt, sync and async model requests,
provider builtin tools
* Exposes private state extensions for middleware, enabling things like
model call tracking, etc
* Middleware can register tools
* Uses a `TypedDict` for `AgentState` -- dataclass subclassing is tricky
w/ required values + required decorators
* Addition of `model_settings` to `ModelRequest` so that we can pass
through things to bind (like cache kwargs for anthropic middleware)

## TODOs

### top prio
- [x] add middleware support to existing agent
- [x] top prio middlewares
  - [x] summarization node
  - [x] HITL
  - [x] prompt caching
 
other ones
- [x] model call limits
- [x] tool calling limits
- [ ] usage (requires output state)

### secondary prio
- [x] improve typing for state updates from middleware (not working
right now w/ simple `AgentUpdate` and `AgentJump`, at least in Python)
- [ ] add support for public state (input / output modifications via
pregel channel mods) -- to be tackled in another PR
- [x] testing!

### docs
See https://github.com/langchain-ai/docs/pull/390
- [x] high level docs about middleware
- [x] summarization node
- [x] HITL
- [x] prompt caching

## open questions

Lots of open questions right now, many of them inlined as comments for
the short term, will catalog some more significant ones here.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2025-09-08 01:10:57 +00:00
Christophe Bornet
5bf0b218c8 chore(cli): fix some ruff preview rules (#32803) 2025-09-07 16:53:19 -04:00
Mason Daugherty
4e39c164bb fix(anthropic): remove beta header warning for TTL (#32832)
No longer beta as of Aug 13
2025-09-05 14:28:58 -04:00
ScarletMercy
0b3af47335 fix(docs): resolve malformed character in tool_calling.ipynb (#32825)
**Description:**  
Remove a character in tool_calling.ipynb that causes a grammatical error
Verification: Local docs build passed after fix 
 
**Issue:**  
None (direct hotfix for rendering issue identified during documentation
review)
 
**Dependencies:**  
None
2025-09-05 11:28:56 -04:00
Mason Daugherty
bc91a4811c chore: update PR template (#32819) 2025-09-04 19:53:54 +00:00
Christophe Bornet
05a61f9508 fix(langchain): fix mypy versions in langchain_v1 (#32816) 2025-09-04 11:51:08 -04:00
Christophe Bornet
aa63de9366 chore(langchain): cleanup langchain_v1 mypy config (#32809)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-03 19:28:06 +00:00
Christophe Bornet
86fa34f3eb chore(langchain): add ruff rules D for langchain_v1 (#32808) 2025-09-03 15:26:17 -04:00
JING
36037c9251 fix(docs): update Anthropic model name and add version warnings (#32807)
**Description:** This PR fixes the broken Anthropic model example in the
documentation introduction page and adds a comment field to display
model version warnings in code blocks. The changes ensure that users can
successfully run the example code and are reminded to check for the
latest model versions.

**Issue:** https://github.com/langchain-ai/langchain/issues/32806

**Changes made:**
- Update Anthropic model from broken "claude-3-5-sonnet-latest" to
working "claude-3-7-sonnet-20250219"
- Add comment field to display model version warnings in code blocks
- Improve user experience by providing working examples and version
guidance

**Dependencies:** None required
2025-09-03 15:25:13 -04:00
Martin Meier-Zavodsky
ad26c892ea docs(langchain): update evaluation tutorial link (#32796)
**Description**
This PR updates the evaluation tutorial link for LangSmith to the new
official docs location.

**Issue**
N/A

**Dependencies**
None
2025-09-03 15:22:46 -04:00
Shahroz Ahmad
4828a85ab0 feat(core): add web_search in OpenAI tools list (#32738) 2025-09-02 21:57:25 +00:00
ccurme
b999f356e8 fix(langchain): update __init__ version (#32793) 2025-09-02 13:14:42 -04:00
Sydney Runkle
062196a7b3 release(langchain): v1.0.0a3 (#32791) 2025-09-02 12:29:14 -04:00
Sydney Runkle
dc9f941326 chore(langchain): rename create_react_agent -> create_agent (#32789) 2025-09-02 12:13:12 -04:00
Adithya1617
238ecd09e0 docs(langchain): update redirect url of "this langsmith conceptual guide" in tracing.mdx (#32776)
…ge (issue : #32775)

- **Description: updated the redirect url of "this langsmith conceptual
guide" in tracing.mdx
  - **Issue:** fixes #32775

---------

Co-authored-by: Adithya <adithya.vardhan1617@gmail.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-09-01 19:02:21 +00:00
Mason Daugherty
6b5fdfb804 release(text-splitters): 0.3.11 (#32770)
Fixes #32747

SpaCy integration test fixture was trying to use pip to download the
SpaCy language model (`en_core_web_sm`), but uv-managed environments
don't include pip by default. Fail test if not installed as opposed to
downloading.
2025-08-31 23:00:05 +00:00
Ravirajsingh Sodha
b42dac5fe6 docs: standardize OllamaLLM and BaseOpenAI docstrings (#32758)
- Add comprehensive docstring following LangChain standards
- Include Setup, Key init args, Instantiate, Invoke, Stream, and Async
sections
- Provide detailed parameter descriptions and code examples
- Fix linting issues for code formatting compliance

Contributes to #24803

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-31 17:45:56 -05:00
Christophe Bornet
e0a4af8d8b docs(text-splitters): fix some docstrings (#32767) 2025-08-31 13:46:11 -05:00
Rémy HUBSCHER
fcf7175392 chore(langchain): improve PostgreSQL Manager upsert SQLAlchemy API calls. (#32748)
- Make explicit the `constraint` parameter name to avoid mixing it with
`index_elements`
[[Documentation](https://docs.sqlalchemy.org/en/20/dialects/postgresql.html#sqlalchemy.dialects.postgresql.Insert.on_conflict_do_update)]
- ~Fallback on the existing `group_id` row value, to avoid setting it to
`None`.~
2025-08-30 14:13:24 -05:00
Kush Goswami
1f2ab17dff docs: fix typo and grammer in Conceptual guide (#32754)
fixed small typo and grammatical inconsistency in Conceptual guide
2025-08-30 13:48:55 -05:00
Mason Daugherty
2dc89a2ae7 release(cli): 0.0.37 (#32760)
It's been a minute. Final release prior to dropping Python 3.9 support.
2025-08-30 13:07:55 -05:00
Christophe Bornet
e3c4aeaea1 chore(cli): add mypy strict checking (#32386)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-30 13:02:45 -05:00
Vikas Shivpuriya
444939945a docs: fix punctuation in style guide (#32756)
Removed a period in bulleted list for consistency

Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**

- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
  - Examples:
    - feat(core): add multi-tenant support
    - fix(cli): resolve flag parsing error
    - docs(openai): update API usage examples
  - Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore,
revert, release
  - Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma,
deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama,
openai, perplexity, prompty, qdrant, xai
  - Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do
not include it in the PR.

- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change. Include a [closing
keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword)
if applicable to a relevant issue.
  - **Issue:** the issue # it fixes, if applicable (e.g. Fixes #123)
  - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!

- [ ] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
2025-08-30 12:56:17 -05:00
Vikas Shivpuriya
ae8db86486 docs: fixed typo in contributing guide (#32755)
Completed the sentence by adding a period ".", in sync with other points

>> Click "Propose changes"

to 

>> Click "Propose changes".

Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**

- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
  - Examples:
    - feat(core): add multi-tenant support
    - fix(cli): resolve flag parsing error
    - docs(openai): update API usage examples
  - Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore,
revert, release
  - Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma,
deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama,
openai, perplexity, prompty, qdrant, xai
  - Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do
not include it in the PR.

- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change. Include a [closing
keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword)
if applicable to a relevant issue.
  - **Issue:** the issue # it fixes, if applicable (e.g. Fixes #123)
  - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!

- [ ] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
2025-08-30 12:55:25 -05:00
Christophe Bornet
8a1419dad1 chore(cli): add ruff rules ANN401 and D1 (#32576) 2025-08-30 12:41:16 -05:00
Kush Goswami
840e4c8e9f docs: fix grammar and typo in Documentation style guide (#32741)
fixed grammer and one typo in the Documentation style guide
2025-08-29 14:22:54 -04:00
Caspar Broekhuizen
37aff0a153 chore: bump langchain-core minimum to 0.3.75 (#32753)
Update `langchain-core` dependency min from `>=0.3.63` to `>=0.3.75`.

### Motivation
- We located the `langchain-core` package locally in the monorepo and
need to align `langchain-tests` with the new minimum version.
2025-08-29 14:11:28 -04:00
Caspar Broekhuizen
a163d59988 chore(standard-tests): relax langchain-core bounds for langchain-tests 1.0.0a1 (#32752)
### Overview
Preparing the `1.0.0a1` release of `langchain-tests` to align with
`langchain-core` version `1.0.0a1`.

### Changes
- Bump package version to `1.0.0a1`
- Relax `langchain-core` requirement from `<1.0.0,>=0.3.63` to
`<2.0.0,>=0.3.63`

### Motivation
All main LangChain packages are now publishing `1.0.0a` prereleases.  
`langchain-tests` needs a matching prerelease so downstreams can install
tests alongside the 1.0 series without conflicts.

### Tests
- Verified installation and tests against both `0.3.75` and `1.0.0a1`.
2025-08-29 13:46:48 -04:00
Sydney Runkle
b26e52aa4d chore(text-splitters): bump version of core (#32740) 2025-08-28 13:14:57 -04:00
Sydney Runkle
38cdd7a2ec chore(text-splitters): relax max bound for langchain-core (#32739) 2025-08-28 13:05:47 -04:00
Sydney Runkle
26e5d1302b chore(langchain): remove upper bound at v1 for core (#32737) 2025-08-28 12:14:42 -04:00
Christopher Jones
107425c68d docs: fix basic Oracle example issues such as capitalization (#32730)
**Description:** fix capitalization and basic issues in
https://python.langchain.com/docs/integrations/document_loaders/oracleadb_loader/

Signed-off-by: Christopher Jones <christopher.jones@oracle.com>
2025-08-28 10:32:45 -04:00
Tik1993
009cc3bf50 docs(docs): added content= keyword when creating SystemMessage and HumanMessage (#32734)
Description: 
Added the content= keyword when creating SystemMessage and HumanMessage
in the messages list, making it consistent with the API reference.
2025-08-28 10:31:46 -04:00
NOOR UL HUDA
6185558449 docs: replace smart quotes with straight quotes on How-to guides landing page (#32725)
### Summary

This PR updates the sentence on the "How-to guides" landing page to
replace smart (curly) quotes with straight quotes in the phrase:

> "How do I...?"

### Why This Change?

- Ensures formatting consistency across documentation
- Avoids encoding or rendering issues with smart quotes
- Matches standard Markdown and inline code formatting

This is a small change, but improves clarity and polish on a key landing
page.
2025-08-28 10:30:12 -04:00
Kush Goswami
0928ff5b12 docs: fix typo in LangGraph section of Introduction (#32728)
Change "Linkedin" to "LinkedIn" to be consistent with LinkedIn's
spelling.

Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**

- [x] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
2025-08-28 10:29:35 -04:00
Sydney Runkle
7f9b0772fc chore(langchain): also bump text splitters (#32722) 2025-08-27 18:09:57 +00:00
Sydney Runkle
d6e618258f chore(langchain): use latest core (#32720) 2025-08-27 14:06:07 -04:00
Sydney Runkle
806bc593ab chore(langchain): revert back to static versioning for now (#32719) 2025-08-27 13:54:41 -04:00
Sydney Runkle
047bcbaa13 release(langchain): v1.0.0a1 (#32718)
Also removing globals usage + static version
2025-08-27 13:46:20 -04:00
Sydney Runkle
18db07c292 feat(langchain): revamped create_react_agent (#32705)
Adding `create_react_agent` and introducing `langchain.agents`!

## Enhanced Structured Output

`create_react_agent` supports coercion of outputs to structured data
types like `pydantic` models, dataclasses, typed dicts, or JSON schemas
specifications.

### Structural Changes

In langgraph < 1.0, `create_react_agent` implemented support for
structured output via an additional LLM call to the model after the
standard model / tool calling loop finished. This introduced extra
expense and was unnecessary.

This new version implements structured output support in the main loop,
allowing a model to choose between calling tools or generating
structured output (or both).

The same basic pattern for structured output generation works:

```py
from langchain.agents import create_react_agent
from langchain_core.messages import HumanMessage
from pydantic import BaseModel


class Weather(BaseModel):
    temperature: float
    condition: str


def weather_tool(city: str) -> str:
    """Get the weather for a city."""

    return f"it's sunny and 70 degrees in {city}"


agent = create_react_agent("openai:gpt-4o-mini", tools=[weather_tool], response_format=Weather)
print(repr(result["structured_response"]))
#> Weather(temperature=70.0, condition='sunny')
```

### Advanced Configuration

The new API exposes two ways to configure how structured output is
generated. Under the hood, LangChain will attempt to pick the best
approach if not explicitly specified. That is, if provider native
support is available for a given model, that takes priority over
artificial tool calling.

1. Artificial tool calling (the default for most models)

LangChain generates a tool (or tools) under the hood that match the
schema of your response format. When the model calls those tools,
LangChain coerces the args to the desired format. Note, LangChain does
not validate outputs adhering to JSON schema specifications.

<details>
<summary>Extended example</summary>

```py
from langchain.agents import create_react_agent
from langchain_core.messages import HumanMessage
from langchain.agents.structured_output import ToolStrategy
from pydantic import BaseModel


class Weather(BaseModel):
    temperature: float
    condition: str


def weather_tool(city: str) -> str:
    """Get the weather for a city."""

    return f"it's sunny and 70 degrees in {city}"


agent = create_react_agent(
    "openai:gpt-4o-mini",
    tools=[weather_tool],
    response_format=ToolStrategy(
        schema=Weather, tool_message_content="Final Weather result generated"
    ),
)

result = agent.invoke({"messages": [HumanMessage("What's the weather in Tokyo?")]})
for message in result["messages"]:
    message.pretty_print()

"""
================================ Human Message =================================

What's the weather in Tokyo?
================================== Ai Message ==================================
Tool Calls:
  weather_tool (call_Gg933BMHMwck50Q39dtBjXm7)
 Call ID: call_Gg933BMHMwck50Q39dtBjXm7
  Args:
    city: Tokyo
================================= Tool Message =================================
Name: weather_tool

it's sunny and 70 degrees in Tokyo
================================== Ai Message ==================================
Tool Calls:
  Weather (call_9xOkYUM7PuEXl9DQq9sWGv5l)
 Call ID: call_9xOkYUM7PuEXl9DQq9sWGv5l
  Args:
    temperature: 70
    condition: sunny
================================= Tool Message =================================
Name: Weather

Final Weather result generated
"""

print(repr(result["structured_response"]))
#> Weather(temperature=70.0, condition='sunny')
```

</details>

2. Provider implementations (limited to OpenAI, Groq)

Some providers support structured output generating directly. For those
cases, we offer the `ProviderStrategy` hint:

<details>
<summary>Extended example</summary>

```py
from langchain.agents import create_react_agent
from langchain_core.messages import HumanMessage
from langchain.agents.structured_output import ProviderStrategy
from pydantic import BaseModel


class Weather(BaseModel):
    temperature: float
    condition: str


def weather_tool(city: str) -> str:
    """Get the weather for a city."""

    return f"it's sunny and 70 degrees in {city}"


agent = create_react_agent(
    "openai:gpt-4o-mini",
    tools=[weather_tool],
    response_format=ProviderStrategy(Weather),
)

result = agent.invoke({"messages": [HumanMessage("What's the weather in Tokyo?")]})
for message in result["messages"]:
    message.pretty_print()

"""
================================ Human Message =================================

What's the weather in Tokyo?
================================== Ai Message ==================================
Tool Calls:
  weather_tool (call_OFJq1FngIXS6cvjWv5nfSFZp)
 Call ID: call_OFJq1FngIXS6cvjWv5nfSFZp
  Args:
    city: Tokyo
================================= Tool Message =================================
Name: weather_tool

it's sunny and 70 degrees in Tokyo
================================== Ai Message ==================================

{"temperature":70,"condition":"sunny"}
Weather(temperature=70.0, condition='sunny')
"""

print(repr(result["structured_response"]))
#> Weather(temperature=70.0, condition='sunny')
```

Note! The final tool message has the custom content provided by the dev.

</details>

Prompted output was previously supported and is no longer supported via
the `response_format` argument to `create_react_agent`. If there's
significant demand for this, we'd be happy to engineer a solution.

## Error Handling

`create_react_agent` now exposes an API for managing errors associated
with structured output generation. There are two common problems with
structured output generation (w/ artificial tool calling):

1. **Parsing error** -- the model generates data that doesn't match the
desired structure for the output
2. **Multiple tool calls error** -- the model generates 2 or more tool
calls associated with structured output schemas

A developer can control the desired behavior for this via the
`handle_errors` arg to `ToolStrategy`.

<details>
<summary>Extended example</summary>

```py
from langchain_core.messages import HumanMessage
from pydantic import BaseModel

from langchain.agents import create_react_agent
from langchain.agents.structured_output import StructuredOutputValidationError, ToolStrategy


class Weather(BaseModel):
    temperature: float
    condition: str


def weather_tool(city: str) -> str:
    """Get the weather for a city."""
    return f"it's sunny and 70 degrees in {city}"


def handle_validation_error(error: Exception) -> str:
    if isinstance(error, StructuredOutputValidationError):
        return (
            f"Please call the {error.tool_name} call again with the correct arguments. "
            f"Your mistake was: {error.source}"
        )
    raise error


agent = create_react_agent(
    "openai:gpt-5",
    tools=[weather_tool],
    response_format=ToolStrategy(
        schema=Weather,
        handle_errors=handle_validation_error,
    ),
)
```

</details>

## Error Handling for Tool Calling

Tools fail for two main reasons:

1. **Invocation failure** -- the args generated by the model for the
tool are incorrect (missing, incompatible data types, etc)
2. **Execution failure** -- the tool execution itself fails due to a
developer error, network error, or some other exception.

By default, when tool **invocation** fails, the react agent will return
an artificial `ToolMessage` to the model asking it to correct its
mistakes and retry.

Now, when tool **execution** fails, the react agent raises the
`ToolException` by default instead of asking the model to retry. This
helps to avoid looping that should be avoided due to the aforementioned
issues.

Developers can configure their desired behavior for retries / error
handling via the `handle_tool_errors` arg to `ToolNode`.

## Pre-Bound Models

`create_react_agent` no longer supports inputs to `model` that have been
pre-bound w/ tools or other configuration. To properly support
structured output generation, the agent itself needs the power to bind
tools + structured output kwargs.

This also makes the devx cleaner - it's always expected that `model` is
an instance of `BaseChatModel` (or `str` that we coerce into a chat
model instance).

Dynamic model functions can return a pre-bound model **IF** structured
output is not also used. Dynamic model functions can then bind tools /
structured output logic.

## Import Changes

Users should now use `create_react_agent` from `langchain.agents`
instead of `langgraph.prebuilts`.
Other imports have a similar migration path, `ToolNode` and `AgentState`
for example.

* `chat_agent_executor.py` -> `react_agent.py`

Some notes:
1. Disabled blockbuster + some linting in `langchain/agents` -- beyond
ideal, but necessary to get this across the line for the alpha. We
should re-enable before official release.
2025-08-27 17:32:21 +00:00
Sydney Runkle
1fe2c4084b chore(langchain): remove untested chains for first alpha (#32710)
Also removing globals.py file
2025-08-27 08:24:43 -04:00
Sydney Runkle
c6c7fce6c9 chore(langchain): drop Python 3.9 to prep for v1 (#32704)
Python 3.9 EOL is October 2025, so we're going to drop it for the v1
alpha release.
2025-08-26 23:16:42 +00:00
Mason Daugherty
3d08b6bd11 chore: adress pytest-asyncio deprecation warnings + other nits (#32696)
amongst some linting imcompatible rules
2025-08-26 15:51:38 -04:00
Matthew Farrellee
f2dcdae467 fix(standard-tests): update function_args to match my_adder_tool param types (#32689)
**Description:**

https://api.llama.com implements strong type checking, which results in
a false negative.

with type mismatch (expected integer, received string) -

```
$ curl -X POST "https://api.llama.com/compat/v1/chat/completions" \
  -H "Authorization: Bearer API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
 "model": "Llama-3.3-70B-Instruct",
 "messages": [
     {"role": "user", "content": "What is 1 + 2"},
     {"role": "assistant", "content": "", "tool_calls": [{"id": "abc123", "type": "function", "function": {"name": "my_adder_tool", "arguments": "{\"a\": \"1\", \"b\": \"2\"}"}}]},
     {"role": "tool", "tool_call_id": "abc123", "content": "{\"result\": 3}"}
 ],
 "tools": [{"type": "function", "function": {"name": "my_adder_tool", "description": "Sum two integers", "parameters": {"properties": {"a": {"type": "integer"}, "b": {"type": "integer"}}, "required": ["a", "b"], "type": "object"}}}]
}'

{"title":"Bad request","detail":"Unexpected param value `a`: \"1\"","status":400}
```

with correct type -

```
$ curl -X POST "https://api.llama.com/compat/v1/chat/completions" \
  -H "Authorization: Bearer API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
 "model": "Llama-3.3-70B-Instruct",
 "messages": [
     {"role": "user", "content": "What is 1 + 2"},
     {"role": "assistant", "content": "", "tool_calls": [{"id": "abc123", "type": "function", "function": {"name": "my_adder_tool", "arguments": "{\"a\": 1, \"b\": 2}"}}]},
     {"role": "tool", "tool_call_id": "abc123", "content": "{\"result\": 3}"}
 ],
 "tools": [{"type": "function", "function": {"name": "my_adder_tool", "description": "Sum two integers", "parameters": {"properties": {"a": {"type": "integer"}, "b": {"type": "integer"}}, "required": ["a", "b"], "type": "object"}}}]
}'

{"id":"AhMwBbuaa5payFr_xsOHzxX","model":"Llama-3.3-70B-Instruct","choices":[{"finish_reason":"stop","index":0,"message":{"refusal":"","role":"assistant","content":"The result of 1 + 2 is 3.","id":"AhMwBbuaa5payFr_xsOHzxX"},"logprobs":null}],"created":1756167668,"object":"chat.completions","usage":{"prompt_tokens":248,"completion_tokens":17,"total_tokens":265}}
```
2025-08-26 15:50:47 -04:00
ccurme
dbebe2ca97 release(core): 0.3.75 (#32693) 2025-08-26 11:12:03 -04:00
ccurme
008043977d release(openai): 0.3.32 (#32691) 2025-08-26 14:05:40 +00:00
Jacob Lee
1459d4f4ce fix(openai): Always add raw response object to OpenAI client errors for invoke (#32655) 2025-08-26 09:59:25 -04:00
ccurme
f33480c2cf feat(core): trace response body on error (#32653) 2025-08-25 14:28:19 -04:00
Mason Daugherty
1c55536ec1 chore(core): add note about backward compatibility for tool_calls in additional_kwargs in JsonOutputKeyToolsParser 2025-08-25 10:30:41 -04:00
Maitrey Talware
622337a297 docs(docs): fixed typos in documentations (#32661)
Minor typo fixes. (Not linked to current open issues)
2025-08-25 10:02:53 -04:00
Shahroz Ahmad
1819c73d10 docs(docs): update Docker to ClickHouse 25.7 with vector_similarity support (#32659)
- **Description:** Updated Docker command to use ClickHouse 25.7 (has
`vector_similarity` index support). Added `CLICKHOUSE_SKIP_USER_SETUP=1`
env param to [bypass default user
setup](https://clickhouse.com/docs/install/docker#managing-default-user)
and allow external network access. There was also a bug where if you try
to access results using `similarity_search_with_relevance_scores`, they
need to unpacked first.

- **Issue:** Fixes #32094 if someone following tutorial with default
Clickhouse configurations.
2025-08-25 09:59:28 -04:00
Kim
8171403b4a docs(docs): rebranding of Azure AI Studio to Azure AI Foundry (#32658)
# Description
Updated documentation to reflect Microsoft’s rebranding of Azure AI
Studio to Azure AI Foundry. This ensures consistency with current Azure
terminology across the docs.

# Issue
N/A

# Dependencies
None
2025-08-25 09:58:31 -04:00
Mason Daugherty
2d0713c2fc fix(infra): ollama CI 2025-08-22 16:40:03 -04:00
Mason Daugherty
8060b371bb fix(infra): ollama CI 2025-08-22 16:37:05 -04:00
Mason Daugherty
7851f66503 release(ollama): 0.3.7 (#32651) 2025-08-22 15:18:40 -04:00
Mason Daugherty
af3b88f58d feat(ollama): update reasoning type to support string values for custom intensity levels (e.g. gpt-oss) (#32650) 2025-08-22 15:11:32 -04:00
itaismith
1eb45d17fb feat(chroma): Add support for collection forking (#32627) 2025-08-21 17:57:55 -04:00
ccurme
8545d4731e release(openai): 0.3.31 (#32646) 2025-08-21 16:50:27 -04:00
Alex Naidis
21f7a9a9e5 fix(openai): allow temperature parameter for gpt-5-chat models (#32624) 2025-08-21 16:40:10 -04:00
sa411022
61bc1bf9cc fix(openai): construct responses api input (#32557) 2025-08-21 15:56:29 -04:00
Shahrukh Shaik
4ba222148d fix(openai): Chat Message Annotations defaults to [ ] if not list or None (#32614) 2025-08-21 15:30:12 -04:00
Christophe Bornet
b825f85bf2 fix(standard-tests): fix BaseStoreAsyncTests.test_set_values_is_idempotent (#32638)
The async version of the test should use the `ayield_keys` method
instead of `yield_keys`.
Otherwise tools such as `blockbuster` may trigger on a blocking call.
2025-08-21 10:07:46 -04:00
Mohammed Mohtasim .M.S
b5c44406eb docs(docs): fix typos in table in "How to load PDFs" documentation (#32635)
**Description:**
Fixed corrupted text in the code cell output of the documentation
notebook. The code cell itself was correct, but the saved output
contained garbage text.

**Issue:**
The saved output in the documentation notebook contained garbage/typo
text in the table name.

**Dependencies:**
None
2025-08-21 10:06:45 -04:00
Emmanuel Leroy
2ec63ca7da docs: migration to langchain_oci (#32619)
Doc update. I missed a couple mentions of the old package.
2025-08-21 10:03:44 -04:00
Christophe Bornet
f896bcdb1d chore(langchain): add mypy pydantic plugin (#32610) 2025-08-19 16:59:59 -04:00
Christophe Bornet
73a7de63aa chore(text-splitters): add mypy pydantic plugin (#32611) 2025-08-19 16:58:12 -04:00
Emmanuel Leroy
cd5f3ee364 docs: migrate from community package to langchain-oci (#32608)
Migrate package from langchain_community to langchain_oci
2025-08-19 16:57:37 -04:00
Christophe Bornet
02d6b9106b chore(core): add mypy pydantic plugin (#32604)
This helps to remove a bunch of mypy false positives.
2025-08-19 09:39:53 -04:00
William FH
b470c79f1d refactor(core): Use duck typing for _StreamingCallbackHandler (#32535)
It's used in langgraph and maybe elsewhere, so would be preferable if it
could just be duck-typed
2025-08-19 05:41:07 -07:00
Mason Daugherty
d204f0dd55 feat(infra): add skip-preview tag check in Vercel deployment script (#32600)
Having vercel attempt to deploy on each commit (even if unrelated to
docs) was getting annoying. Options:

- `[skip-preview]`
- `[no-preview]`
- `[skip-deploy]`

Full example: `fix(core): resolve memory leak [no-preview]`
2025-08-18 17:33:27 -04:00
Mohammad Mohtashim
00259b0061 fix(deepseek): Deep Seek Model for LS Tracing (#32575)
- **Description:** Fix for LS Tracing for Provider for DeepSeek.
  - **Issue:** #32484

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-18 18:48:30 +00:00
Mohammad Mohtashim
4fb1132e30 docs: Classification Notebook Update (#32357)
- **Description:** Updating the Classification notebook which was raised
[here](https://github.com/langchain-ai/langchain/issues/32354)
- **Issue:** Fixes #32354

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-18 18:45:03 +00:00
Mason Daugherty
a6690eb9fd release(anthropic): 0.3.19 (#32595) 2025-08-18 14:25:03 -04:00
Mason Daugherty
f69f9598f5 chore: update references to use the latest version of Claude-3.5 Sonnet (#32594) 2025-08-18 14:11:15 -04:00
Mason Daugherty
8d0fb2d04b fix(anthropic): correct input_token count for streaming (#32591)
* Create usage metadata on
[`message_delta`](https://docs.anthropic.com/en/docs/build-with-claude/streaming#event-types)
instead of at the beginning. Consequently, token counts are not included
during streaming but instead at the end. This allows for accurate
reporting of server-side tool usage (important for billing)
* Add some clarifying comments
* Fix some outstanding Pylance warnings
* Remove unnecessary `text` popping in thinking blocks
* Also now correctly reports `input_cache_read`/`input_cache_creation`
as a result
2025-08-18 17:51:47 +00:00
Mason Daugherty
8042b04da6 fix(anthropic): clean up null file_id fields in citations during message formatting (#32592)
When citations are returned from streaming, they include a `file_id:
null` field in their `content_block_location` structure.

When these citations are passed back to the API in subsequent messages,
the API rejects them with "Extra inputs are not permitted" for the
`file_id` field.
2025-08-18 13:01:52 -04:00
Daehwi Kim
fb74265175 fix(docs): update LangGraph guides link and add JS how-to link (#32583)
**Description:**  
Corrected LangGraph documentation link (changed to “guides”), and added
a link to LangGraph JS how-to guides for clarity.

**Issue:**  
N/A  

**Dependencies:**  
None

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-18 14:27:37 +00:00
Oresztesz Margaritisz
21b61aaf9a fix(docs): Using appropriate argument name in ToolNode for error handling (#32586)
The appropriate `ToolNode` attribute for error handling is called
`handle_tool_errors` instead of `handle_tool_error`.

For further info see [ToolNode source code in
LangGraph](https://github.com/langchain-ai/langgraph/blob/main/libs/prebuilt/langgraph/prebuilt/tool_node.py#L255)

**Twitter handle:** gitaroktato

- [x] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
2025-08-18 10:12:10 -04:00
Keyu Chen
03138f41a0 feat(text-splitters): add optional custom header pattern support (#31887)
## Description

This PR adds support for custom header patterns in
`MarkdownHeaderTextSplitter`, allowing users to define non-standard
Markdown header formats (like `**Header**`) and specify their hierarchy
levels.

**Issue:** Fixes #22738

**Dependencies:** None - this change has no new dependencies

**Key Changes:**
- Added optional `custom_header_patterns` parameter to support
non-standard header formats
- Enable splitting on patterns like `**Header**` and `***Header***`
- Maintain full backward compatibility with existing usage
- Added comprehensive tests for custom and mixed header scenarios

## Example Usage

```python
from langchain_text_splitters import MarkdownHeaderTextSplitter

headers_to_split_on = [
    ("**", "Chapter"),
    ("***", "Section"),
]

custom_header_patterns = {
    "**": 1,   # Level 1 headers
    "***": 2,  # Level 2 headers
}

splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=headers_to_split_on,
    custom_header_patterns=custom_header_patterns,
)

# Now **Chapter 1** is treated as a level 1 header
# And ***Section 1.1*** is treated as a level 2 header
```

## Testing

-  Added unit tests for custom header patterns
-  Added tests for mixed standard and custom headers
-  All existing tests pass (backward compatibility maintained)
-  Linting and formatting checks pass

---

The implementation provides a flexible solution while maintaining the
simplicity of the existing API. Users can continue using the splitter
exactly as before, with the new functionality being entirely opt-in
through the `custom_header_patterns` parameter.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Claude <noreply@anthropic.com>
2025-08-18 10:10:49 -04:00
Mason Daugherty
fd891ee3d4 revert(anthropic): streaming token counting to defer input tokens until completion (#32587)
Reverts langchain-ai/langchain#32518
2025-08-18 09:48:33 -04:00
ccurme
b8cdbc4eca fix(anthropic): sanitize tool use block when taking directly from content (#32574) 2025-08-18 09:06:57 -04:00
Christophe Bornet
791d309c06 chore(langchain): add mypy warn_unreachable setting (#32529)
See
https://mypy.readthedocs.io/en/stable/config_file.html#confval-warn_unreachable

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-15 23:03:53 +00:00
Mason Daugherty
d3d23e2372 fix(anthropic): streaming token counting to defer input tokens until completion (#32518)
Supersedes #32461

Fixed incorrect input token reporting during streaming when tools are
used. Previously, input tokens were counted at `message_start` before
tool execution, leading to inaccurate counts. Now input tokens are
properly deferred until `message_delta` (completion), aligning with
Anthropic's billing model and SDK expectations.

**Before Fix:**
- Streaming with tools: Input tokens = 0 
- Non-streaming with tools: Input tokens = 472 

**After Fix:**
- Streaming with tools: Input tokens = 472 
- Non-streaming with tools: Input tokens = 472 

Aligns with Anthropic's SDK expectations. The SDK handles input token
updates in `message_delta` events:

```python
# https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/streaming/_messages.py
if event.usage.input_tokens is not None:
      current_snapshot.usage.input_tokens = event.usage.input_tokens
```
2025-08-15 17:49:46 -04:00
Mason Daugherty
2f32c444b8 docs: add details on message IDs and their assignment process (#32534) 2025-08-15 18:22:28 +00:00
Mason Daugherty
fe740a9397 fix(docs): chatbot.ipynb trimming regression (#32561)
Supersedes #32544

Changes to the `trimmer` behavior resulted in the call `"What math
problem was asked?"` to no longer see the relevant query due to the
number of the queries' tokens. Adjusted to not trigger trimming the
relevant part of the message history. Also, add print to the trimmer to
increase observability on what is leaving the context window.

Add note to trimming tut & format links as inline
2025-08-15 14:47:22 +00:00
Rostyslav Borovyk
b2b835cb36 docs(docs): add Oxylabs document loader (#32429)
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**

- [x] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
  - Examples:
    - feat(core): add multi-tenant support
    - fix(cli): resolve flag parsing error
    - docs(openai): update API usage examples
  - Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore,
revert, release
  - Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma,
deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama,
openai, perplexity, prompty, qdrant, xai
  - Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do
not include it in the PR.

- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change. Include a [closing
keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword)
if applicable to a relevant issue.
  - **Issue:** the issue # it fixes, if applicable (e.g. Fixes #123)
  - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!

- [x] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-15 10:46:26 -04:00
Christophe Bornet
4656f727da chore(text-splitters): add mypy warn_unreachable (#32558) 2025-08-15 09:45:20 -04:00
Mason Daugherty
34800332bf chore: update integrations table (#32556)
Enhance the integrations table by adding the `js:
'@langchain/community'` reference for several packages and updating the
titles of specific integrations to avoid improper capitalization
2025-08-14 22:37:36 -04:00
Mason Daugherty
06ba80ff68 docs: formatting Tavily (#32555) 2025-08-14 23:41:37 +00:00
Mason Daugherty
2bd8096faa docs: add pre-commit setup instructions to the dev setup guide (#32553) 2025-08-14 20:35:57 +00:00
Mason Daugherty
a0331285d7 fix(core): Support no-args tools by defaulting args to empty dict (#32530)
Supersedes #32408

Description:  
This PR ensures that tool calls without explicitly provided `args` will
default to an empty dictionary (`{}`), allowing tools with no parameters
(e.g. `def foo() -> str`) to be registered and invoked without
validation errors. This change improves compatibility with agent
frameworks that may omit the `args` field when generating tool calls.

Issue:  
See
[langgraph#5722](https://github.com/langchain-ai/langgraph/issues/5722)
–
LangGraph currently emits tool calls without `args`, which leads to
validation errors
when tools with no parameters are invoked. This PR ensures compatibility
by defaulting
`args` to `{}` when missing.

Dependencies:  
None

---------

Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**

- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
  - Examples:
    - feat(core): add multi-tenant support
    - fix(cli): resolve flag parsing error
    - docs(openai): update API usage examples
  - Allowed `{TYPE}` values:
- feat, fix, docs, style, refactor, perf, test, build, ci, chore,
revert, release
  - Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma,
deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama,
openai, perplexity, prompty, qdrant, xai
  - Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do
not include it in the PR.

- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change. Include a [closing
keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword)
if applicable to a relevant issue.
  - **Issue:** the issue # it fixes, if applicable (e.g. Fixes #123)
  - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!

- [ ] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.

---------

Signed-off-by: jitokim <pigberger70@gmail.com>
Co-authored-by: jito <pigberger70@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-14 20:28:36 +00:00
Mason Daugherty
8f68a08528 chore: add Chat LangChain to README.md (#32545) 2025-08-14 16:15:27 -04:00
Lauren Hirata Singh
71651c4a11 docs: update banner (#32552) 2025-08-14 10:54:29 -07:00
Lauren Hirata Singh
44ec1f32b2 docs: banner for academy course (#32550)
Publish at 10AM PT
2025-08-14 10:05:00 -07:00
Yoon
0c81499243 docs(ollama): update API usage examples (#32547)
**Description**  
Corrected a typo in the Ollama chatbot example output in  
`docs/docs/integrations/chat/ollama.ipynb` where `"got-oss"` was  
mistakenly used instead of `"gpt-oss"`.

No functional changes to code; documentation-only update.  
All notebook outputs were cleared to keep the diff minimal.

**Issue**  
N/A

**Dependencies**  
None

**Twitter handle**  
N/A
2025-08-14 12:57:38 -04:00
Mason Daugherty
397cd89988 docs: update outdated README.md content (#32540) 2025-08-13 22:19:38 +00:00
mishraravibhushan
db438d8dcc docs(docs): fixed additional grammar and style issues in how-to index (#32533)
- Fix 'few shot' → 'few-shot' (add hyphen for consistency)
- Fix 'over the database' → 'over a database' (add missing article)
- Fix 'run time' → 'runtime' (more consistent terminology)
- Fix 'in-sync' → 'in sync' (remove unnecessary hyphen)
2025-08-13 14:10:58 -04:00
RecallIO
4f71c35eb0 docs(docs): Add RecallIO.AI as a memory provider (#32331)
Add requested files to add RecallIO as a memory provider.

---------

Co-authored-by: Frey <gfreyburger@gmail.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-13 15:09:56 +00:00
Mason Daugherty
156ae2e69b fix(docs): resolve langchain-azure-ai conflict with langchain-core (#32528) 2025-08-13 14:47:23 +00:00
Shenghang Tsai
f4f919768e docs(langchain): create SiliconFlow provider entry (#32342)
SiliconFlow's provider integration will be maintained at
https://github.com/siliconflow/langchain-siliconflow
This PR introduce the basic instruction to make use of the pip package
2025-08-13 10:41:23 -04:00
Mason Daugherty
7932e1edd1 feat(docs): clarify structured output with tools ordering (#32527) 2025-08-13 10:40:48 -04:00
Mason Daugherty
024422e9b0 chore: update to use new LGP docs url (#32522) 2025-08-13 03:38:39 +00:00
Mason Daugherty
d52036accc chore: update README.md to use pepy downloads badge (#32521) 2025-08-13 03:23:11 +00:00
Mason Daugherty
5b701b5189 fix(tests): add anthropic_proxy to configurable test parameters (for v1) 2025-08-12 18:33:21 -04:00
Mason Daugherty
8848b3e018 fix(tests): add anthropic_proxy to configurable test parameters 2025-08-12 18:27:35 -04:00
Mason Daugherty
80068432ed chore(core): bump lock 2025-08-12 17:32:24 -04:00
Jack
b9dcce95be fix(anthropic): Add proxy (#32409)
Thank you for contributing to LangChain! Follow these steps to mark your
pull request as ready for review. **If any of these steps are not
completed, your PR will not be considered for review.**

- [x] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
fix #30146
- [x] **Add tests and docs**: If you're adding a new integration, you
must include:
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. **We will not consider
a PR unless these three are passing in CI.** See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-12 21:21:26 +00:00
ccurme
be83ce74a7 feat(anthropic): support cache_control as a kwarg (#31523)
```python
from langchain_anthropic import ChatAnthropic

llm = ChatAnthropic(model="claude-3-5-haiku-latest")
caching_llm = llm.bind(cache_control={"type": "ephemeral"})

caching_llm.invoke(
    [
        HumanMessage("..."),
        AIMessage("..."),
        HumanMessage("..."),  # <-- final message / content block gets cache annotation
    ]
)
```
Potentially useful given's Anthropic's [incremental
caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching#continuing-a-multi-turn-conversation)
capabilities:
> During each turn, we mark the final block of the final message with
cache_control so the conversation can be incrementally cached. The
system will automatically lookup and use the longest previously cached
prefix for follow-up messages.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-12 16:18:24 -04:00
Mason Daugherty
1167e7458e fix(anthropic): update test model names and adjust token count assertions in integration tests (#32422) 2025-08-12 19:39:35 +00:00
Mason Daugherty
d5fd0bca35 docs(anthropic): add documentation for extended context windows in Claude Sonnet 4 (#32517) 2025-08-12 19:16:26 +00:00
Narasimha Badrinath
30d646b576 docs(docs): remove redundant integration details from ChatGradient page. (#32514)
This commit removes redundant integration info from details page,
additionally, changing reference from "DigitalOcean GradientAI" to
"DigitalOcean Gradient™ AI" and updating the setup instructions
accordingly.
2025-08-12 16:14:18 +00:00
Mason Daugherty
262c83763f release(openai): 0.3.30 (#32515) 2025-08-12 16:06:17 +00:00
Mason Daugherty
0024dffa68 feat(openai): officially support verbosity (#32470) 2025-08-12 16:00:30 +00:00
Brody
98797f367a docs: fix broken links (#32513)
**Description:**

Two broken links were reported by another LangChain employee. This PR
fixes those links.

Fixed and tested locally.
  
**Dependencies:**

None
2025-08-12 15:55:37 +00:00
Christophe Bornet
1563099f3f chore(langchain): select ALL rules with exclusions (#31930)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-12 11:51:31 -04:00
rishiraj
7f259863e1 feat(docs): add truefoundry ai gateway (#32362)
This PR adds documentation for integrating [TrueFoundry’s AI
Gateway](https://www.truefoundry.com/ai-gateway) with Langfuse using the
Langraph OpenAI SDK.
The integration sends requests through TrueFoundry’s AI Gateway for
unified governance, observability, and routing, while Langraph runs on
the client side to capture execution traces and telemetry.
- Issue: N/A
- Dependencies: None
- Twitter - https://x.com/truefoundry


tests - Not applicable

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-12 02:26:45 +00:00
Mason Daugherty
c8df6c7ec9 chore: update CONTRIBUTING.md to more clearly mention forum (#32509) 2025-08-11 23:02:21 +00:00
Christophe Bornet
cf2b4bbe09 chore(cli): select ALL rules with exclusions (#31936)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-11 22:43:11 +00:00
Christophe Bornet
09a616fe85 chore(standard-tests): add ruff rules D (#32347)
See https://docs.astral.sh/ruff/rules/#pydocstyle-d

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-11 22:26:11 +00:00
Christophe Bornet
46bbd52e81 chore(cli): add ruff rules D1 (#32350)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-11 22:25:30 +00:00
Christophe Bornet
8b663ed6c6 chore(text-splitters): bump mypy version to 1.17 (#32387)
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-11 22:24:49 +00:00
Anderson
166c027434 docs: add scrapeless integration documentation (#32081)
Thank you for contributing to LangChain! 
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, core, etc. is being
modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI
changes.
  - Example: "core: add foobar LLM"

- **Description:** Integrated the Scrapeless package to enable Langchain
users to seamlessly incorporate Scrapeless into their agents.
- **Dependencies:** None
- **Twitter handle:** [Scrapelessteam](https://x.com/Scrapelessteam)

- [x] **Add tests and docs**: If you're adding a new integration, you
must include:
1. A test for the integration, preferably unit tests that do not rely on
network access,
2. An example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See [contribution
guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even
optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-11 22:16:15 +00:00
GDanksAnchor
4a2a3fcd43 docs: add anchorbrowser (#32494)
# Description

This PR updates the docs for the
[langchain-anchorbrowser](https://pypi.org/project/langchain-anchorbrowser/)
package. It adds a few tools

[Anchor Browser](https://anchorbrowser.io/?utm=langchain) is the
platform for AI Agentic browser automation, which solves the challenge
of automating workflows for web applications that lack APIs or have
limited API coverage. It simplifies the creation, deployment, and
management of browser-based automations, transforming complex web
interactions into simple API endpoints.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-11 21:48:10 +00:00
Anubhav Dhawan
d46dcf4a60 docs: add Google partner guide for MCP Toolbox (#32356)
This PR introduces a new Google partner guide for MCP Toolbox. The
primary goal of this new documentation is to enhance the discoverability
of MCP Toolbox for developers working within the Google ecosystem,
providing them with a clear and direct path to using our tools.

> [!IMPORTANT]
> This PR contains link to a page which is added in #32344. This will
cause deployment failure until that PR is merged.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-11 21:34:12 +00:00
William Espegren
d2ac3b375c fix(docs): add Spider as a webpage loader (#32453)
[Spider](https://spider.cloud/) is a webpage loader and should be listed
under the
["Webpages"](https://python.langchain.com/docs/integrations/document_loaders/#webpages)
table on the Document loaders page.

Twitter: https://x.com/WilliamEspegren

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-11 21:23:03 +00:00
Anubhav Dhawan
1e38fd2ce3 docs: add integration guide for MCP Toolbox (#32344)
This PR introduces a new integration guide for MCP Toolbox. The primary
goal of this new documentation is to enhance the discoverability of MCP
Toolbox for developers working within the LangChain ecosystem, providing
them with a clear and direct path to using our tools.

This approach was chosen to provide users with a practical, hands-on
example that they can easily follow.

> [!NOTE]
> The page added in this PR is linked to from a section in Google
partners page added in #32356.

---------

Co-authored-by: Lauren Hirata Singh <lauren@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-11 21:03:38 +00:00
Yasien Dwieb
155e3740bc fix(docs): handle collection not found error on RAG tutorial when qdrant is selected as vectorStore (#32099)
In [Rag Part 1
Tutorial](https://python.langchain.com/docs/tutorials/rag/), when QDrant
vector store is selected, the sample code does not work
It fails with error  `ValueError: Collection test not found`

So, this fix is creating that collection and ensuring its dimension size
is matching the selection the embedding size of the selected LLM Model

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-11 20:31:24 +00:00
Deepesh Dhakal
f9b4e501a8 fix(docs): update llamacpp.ipynb for installation options on Mac (#32341)
The previous code generated data invalid error.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-11 20:25:35 +00:00
prem-sagar123
5a50802c9a docs: update prompt_templates.mdx (#32405)
```messages_to_pass = [
    HumanMessage(content="What's the capital of France?"),
    AIMessage(content="The capital of France is Paris."),
    HumanMessage(content="And what about Germany?")
]
formatted_prompt = prompt_template.invoke({"msgs": messages_to_pass})
print(formatted_prompt)```

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
2025-08-11 20:16:30 +00:00
Mohammad Mohtashim
9a7e66be60 docs: put standard-tests before other packages (#32424)
- **Description:** Moving `standard-tests` to main ordered section
- **Issue:** #32395

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-11 20:05:24 +00:00
Mason Daugherty
5597b277c5 feat(docs): add subsection on Tool Artifacts vs. Injected State (#32468)
Clarify the differences between tool artifacts and injected state in
LangChain and LangGraph
2025-08-11 19:53:33 +00:00
Soham Sharma
a1da5697c6 docs: clarify how to get LangSmith API key (#32402)
**Description:**
I've added a small clarification to the chatbot tutorial. The tutorial
mentions setting the `LANGSMITH_API_KEY`, but doesn't explain how a new
user can get the key from the website. This change adds a brief note to
guide them to the Settings page.

P.S. This is my first pull request, so I'm excited to learn and
contribute!

**Issue:**
N/A

**Dependencies:**
N/A

**Twitter handle:**
@sohamactive

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-11 19:52:05 +00:00
Divyanshu Gupta
11a54b1f1a docs: clarify SystemMessage usage in LangGraph agent notebook (#32320) (#32346)
Closes #32320

This PR updates the `langgraph_agentic_rag.ipynb` notebook to clarify
that LangGraph does not automatically prepend a `SystemMessage`. A
markdown note and an inline Python comment have been added to guide
users to explicitly include a `SystemMessage` when needed.

This improves documentation for developers working with LangGraph-based
agents and avoids confusion about system-level behavior not being
applied.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-11 19:49:42 +00:00
Mason Daugherty
5ccdcd7b7b feat(ollama): docs updates (#32507) 2025-08-11 15:39:44 -04:00
Mason Daugherty
ee4c2510eb feat: port various nit changes from wip-v0.4 (#32506)
Lots of work that wasn't directly related to core
improvements/messages/testing functionality
2025-08-11 15:09:08 -04:00
mishraravibhushan
7db9e60601 docs(docs): fix grammar, capitalization, and style issues across documentation (#32503)
**Changes made:**
- Fix 'Async programming with langchain' → 'Async programming with
LangChain'
- Fix 'Langchain asynchronous APIs' → 'LangChain asynchronous APIs'
- Fix 'How to: init any model' → 'How to: initialize any model'
- Fix 'async programming with Langchain' → 'async programming with
LangChain'
- Fix 'How to propagate callbacks constructor' → 'How to propagate
callbacks to the constructor'
- Fix 'How to add a semantic layer over graph database' → 'How to add a
semantic layer over a graph database'
- Fix 'Build a Question/Answering system' → 'Build a Question-Answering
system'

**Why is this change needed?**
- Improves documentation clarity and readability
- Maintains consistent LangChain branding throughout the docs
- Fixes grammar issues that could confuse users
- Follows proper documentation standards

**Files changed:**
- `docs/docs/concepts/async.mdx`
- `docs/docs/concepts/tools.mdx`
- `docs/docs/how_to/index.mdx`
- `docs/docs/how_to/callbacks_constructor.ipynb`
- `docs/docs/how_to/graph_semantic.ipynb`
- `docs/docs/tutorials/sql_qa.ipynb`

**Issue:** N/A (documentation improvements)

**Dependencies:** None

**Twitter handle:** https://x.com/mishraravibhush

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-11 13:32:28 -04:00
Mason Daugherty
e5d0a4e4d6 feat(standard-tests): formatting (#32504)
Not touching `pyproject.toml` or chat model related items as to not
interfere with work in wip0.4 branch
2025-08-11 13:30:30 -04:00
Mason Daugherty
457ce9c4b0 feat(text-splitters): ruff fixes and rules (#32502) 2025-08-11 13:28:22 -04:00
Mason Daugherty
27b6b53f20 feat(xai): ruff fixes and rules (#32501) 2025-08-11 13:03:07 -04:00
Christophe Bornet
f55186b38f fix(core): fix beta decorator for properties (#32497) 2025-08-11 12:43:53 -04:00
Mason Daugherty
374f414c91 feat(qdrant): ruff fixes and rules (#32500) 2025-08-11 12:43:41 -04:00
dependabot[bot]
9b3f3dc8d9 chore: bump actions/download-artifact from 4 to 5 (#32495)
Bumps
[actions/download-artifact](https://github.com/actions/download-artifact)
from 4 to 5.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/actions/download-artifact/releases">actions/download-artifact's
releases</a>.</em></p>
<blockquote>
<h2>v5.0.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Update README.md by <a
href="https://github.com/nebuk89"><code>@​nebuk89</code></a> in <a
href="https://redirect.github.com/actions/download-artifact/pull/407">actions/download-artifact#407</a></li>
<li>BREAKING fix: inconsistent path behavior for single artifact
downloads by ID by <a
href="https://github.com/GrantBirki"><code>@​GrantBirki</code></a> in <a
href="https://redirect.github.com/actions/download-artifact/pull/416">actions/download-artifact#416</a></li>
</ul>
<h2>v5.0.0</h2>
<h3>🚨 Breaking Change</h3>
<p>This release fixes an inconsistency in path behavior for single
artifact downloads by ID. <strong>If you're downloading single artifacts
by ID, the output path may change.</strong></p>
<h4>What Changed</h4>
<p>Previously, <strong>single artifact downloads</strong> behaved
differently depending on how you specified the artifact:</p>
<ul>
<li><strong>By name</strong>: <code>name: my-artifact</code> → extracted
to <code>path/</code> (direct)</li>
<li><strong>By ID</strong>: <code>artifact-ids: 12345</code> → extracted
to <code>path/my-artifact/</code> (nested)</li>
</ul>
<p>Now both methods are consistent:</p>
<ul>
<li><strong>By name</strong>: <code>name: my-artifact</code> → extracted
to <code>path/</code> (unchanged)</li>
<li><strong>By ID</strong>: <code>artifact-ids: 12345</code> → extracted
to <code>path/</code> (fixed - now direct)</li>
</ul>
<h4>Migration Guide</h4>
<h5> No Action Needed If:</h5>
<ul>
<li>You download artifacts by <strong>name</strong></li>
<li>You download <strong>multiple</strong> artifacts by ID</li>
<li>You already use <code>merge-multiple: true</code> as a
workaround</li>
</ul>
<h5>⚠️ Action Required If:</h5>
<p>You download <strong>single artifacts by ID</strong> and your
workflows expect the nested directory structure.</p>
<p><strong>Before v5 (nested structure):</strong></p>
<pre lang="yaml"><code>- uses: actions/download-artifact@v4
  with:
    artifact-ids: 12345
    path: dist
# Files were in: dist/my-artifact/
</code></pre>
<blockquote>
<p>Where <code>my-artifact</code> is the name of the artifact you
previously uploaded</p>
</blockquote>
<p><strong>To maintain old behavior (if needed):</strong></p>
<pre lang="yaml"><code>&lt;/tr&gt;&lt;/table&gt; 
</code></pre>
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="634f93cb29"><code>634f93c</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/download-artifact/issues/416">#416</a>
from actions/single-artifact-id-download-path</li>
<li><a
href="b19ff43027"><code>b19ff43</code></a>
refactor: resolve download path correctly in artifact download tests
(mainly ...</li>
<li><a
href="e262cbee4a"><code>e262cbe</code></a>
bundle dist</li>
<li><a
href="bff23f9308"><code>bff23f9</code></a>
update docs</li>
<li><a
href="fff8c148a8"><code>fff8c14</code></a>
fix download path logic when downloading a single artifact by id</li>
<li><a
href="448e3f862a"><code>448e3f8</code></a>
Merge pull request <a
href="https://redirect.github.com/actions/download-artifact/issues/407">#407</a>
from actions/nebuk89-patch-1</li>
<li><a
href="47225c44b3"><code>47225c4</code></a>
Update README.md</li>
<li>See full diff in <a
href="https://github.com/actions/download-artifact/compare/v4...v5">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=actions/download-artifact&package-manager=github_actions&previous-version=4&new-version=5)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-11 12:41:58 -04:00
lineuman
afc3b1824c docs(deepseek): Add DeepSeek model option (#32481) 2025-08-11 09:20:39 -04:00
ran8080
130b7e6170 docs(docs): add missing name to AIMessage in example (#32482)
**Description:**

In the `docs/docs/how_to/structured_output.ipynb` notebook, an
`AIMessage` within the tool-calling few-shot example was missing the
`name="example_assistant"` parameter. This was inconsistent with the
other `AIMessage` instances in the same list.

This change adds the missing `name` parameter to ensure all examples in
the section are consistent, improving the clarity and correctness of the
documentation.

**Issue:** N/A

**Dependencies:** N/A
2025-08-11 09:20:09 -04:00
Navanit Dubey
d40fa534c1 docs(docs): use model_json_schema() (#32485)
While trying the line People.schema got a warning. 
```The `schema` method is deprecated; use `model_json_schema` instead```

So made the changes and now working file.

Thank you for contributing to LangChain! Follow these steps to mark your pull request as ready for review. **If any of these steps are not completed, your PR will not be considered for review.**

- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
  - Examples:
    - feat(core): add multi-tenant support
    - fix(cli): resolve flag parsing error
    - docs(openai): update API usage examples
  - Allowed `{TYPE}` values:
    - feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert, release
  - Allowed `{SCOPE}` values (optional):
    - core, cli, langchain, standard-tests, docs, anthropic, chroma, deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama, openai, perplexity, prompty, qdrant, xai
  - Note: the `{DESCRIPTION}` must not start with an uppercase letter.
  - Once you've written the title, please delete this checklist item; do not include it in the PR.

- [ ] **PR message**: ***Delete this entire checklist*** and replace with
  - **Description:** a description of the change. Include a [closing keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) if applicable to a relevant issue.
  - **Issue:** the issue # it fixes, if applicable (e.g. Fixes #123)
  - **Dependencies:** any dependencies required for this change
  - **Twitter handle:** if your PR gets announced, and you'd like a mention, we'll gladly shout you out!

- [ ] **Add tests and docs**: If you're adding a new integration, you must include:
  1. A test for the integration, preferably unit tests that do not rely on network access,
  2. An example notebook showing its use. It lives in `docs/docs/integrations` directory.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. **We will not consider a PR unless these three are passing in CI.** See [contribution guidelines](https://python.langchain.com/docs/contributing/) for more.

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
2025-08-11 09:19:14 -04:00
mishraravibhushan
20bd296421 docs(docs): fix grammar in "How to deal with high-cardinality categoricals" guide title (#32488)
Description:
Corrected the guide title from "How deal with high cardinality
categoricals" to "How to deal with high-cardinality categoricals".
- Added missing "to" for grammatical correctness.
- Hyphenated "high-cardinality" for standard compound adjective usage.

Issue:
N/A

Dependencies:
None

Twitter handle:
https://x.com/mishraravibhush
2025-08-11 09:17:51 -04:00
ccurme
9259eea846 fix(docs): use pepy for integration package download badges (#32491)
pypi stats has been down for some time.
2025-08-10 18:41:36 -04:00
ccurme
afcb097ef5 fix(docs): DigitalOcean Gradient: link to correct provider page and update page title (#32490) 2025-08-10 17:29:44 -04:00
ccurme
088095b663 release(openai): release 0.3.29 (#32463) 2025-08-08 11:04:33 -04:00
Mason Daugherty
c31236264e chore: formatting across codebase (#32466) 2025-08-08 10:20:10 -04:00
ccurme
02001212b0 fix(openai): revert some changes (#32462)
Keep coverage on `output_version="v0"` (increasing coverage is being
managed in v0.4 branch).
2025-08-08 08:51:18 -04:00
Mason Daugherty
00244122bd feat(openai): minimal and verbosity (#32455) 2025-08-08 02:24:21 +00:00
ccurme
6727d6e8c8 release(core): 0.3.74 (#32454) 2025-08-07 16:39:01 -04:00
Michael Matloka
5036bd7adb fix(openai): don't crash get_num_tokens_from_messages on gpt-5 (#32451) 2025-08-07 16:33:19 -04:00
ccurme
ec2b34a02d feat(openai): custom tools (#32449) 2025-08-07 16:30:01 -04:00
Mason Daugherty
145d38f7dd test(openai): add tests for prompt_cache_key parameter and update docs (#32363)
Introduce tests to validate the behavior and inclusion of the
`prompt_cache_key` parameter in request payloads for the `ChatOpenAI`
model.
2025-08-07 15:29:47 -04:00
ccurme
68c70da33e fix(openai): add in output_text (#32450)
This property was deleted in `openai==1.99.2`.
2025-08-07 15:23:56 -04:00
Eugene Yurtsev
754528d23f feat(langchain): add stuff and map reduce chains (#32333)
* Add stuff and map reduce chains
* We'll need to rename and add unit tests to the chains prior to
official release
2025-08-07 15:20:05 -04:00
CLOVA Studio 개발
ac706c77d4 docs(docs): update v0.1.1 chatModel document on langchain-naver. (#32445)
## **Description:** 
This PR was requested after the `langchain-naver` partner-managed
packages were released
[v0.1.1](https://pypi.org/project/langchain-naver/0.1.1/).
So we've updated some our documents with the additional changed
features.

## **Dependencies:** 
https://github.com/langchain-ai/langchain/pull/30956

---------

Co-authored-by: 김필환[AI Studio Dev1] <pilhwan.kim@navercorp.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-07 15:45:50 +00:00
Tianyu Chen
8493887b6f docs: update Docker image name for jaguardb setup (#32438)
**Description**
Updated the quick setup instructions for JaguarDB in the documentation.
Replaced the outdated Docker image `jaguardb/jaguardb_with_http` with
the current recommended image `jaguardb/jaguardb` for pulling and
running the server.
2025-08-07 11:23:29 -04:00
Christophe Bornet
a647073b26 feat(standard-tests): add a property to set the name of the parameter for the number of results to return (#32443)
Not all retrievers use `k` as param name to set the number of results to
return. Even in LangChain itself. Eg:
bc4251b9e0/libs/core/langchain_core/indexing/in_memory.py (L31)

So it's helpful to be able to change it for a given retriever.
The change also adds hints to disable the tests if the retriever doesn't
support setting the param in the constructor or in the invoke method
(for instance, the `InMemoryDocumentIndex` in the link supports in the
constructor but not in the invoke method).

This change is backward compatible.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-07 11:22:24 -04:00
ccurme
e120604774 fix(infra): exclude pre-releases from previous version testing (#32447) 2025-08-07 10:18:59 -04:00
ccurme
06d8754b0b release(core): 0.3.73 (#32446) 2025-08-07 09:03:53 -04:00
ccurme
6e108c1cb4 feat(core): zero-out token costs for cache hits (#32437) 2025-08-07 08:49:34 -04:00
John Bledsoe
bc4251b9e0 fix(core): fix index checking when merging lists (#32431)
**Description:** fix an issue I discovered when attempting to merge
messages in which one message has an `index` key in its content
dictionary and another does not.
2025-08-06 12:47:33 -04:00
Nelson Sproul
2543007436 docs(langchain): complete PDF embedding example for OpenAI, also some minor doc fixes (#32426)
For OpenAI PDF attaching, note the needed metadata.

Also some minor doc updates.
2025-08-06 12:16:16 -04:00
Mason Daugherty
ba83f58141 release(groq): 0.3.7 (#32417) 2025-08-05 15:13:08 -04:00
Mason Daugherty
fb490b0c39 feat(groq): losen restrictions on reasoning_effort, inject effort in meta, update tests (#32415) 2025-08-05 15:03:38 -04:00
Mason Daugherty
419c173225 feat(groq): openai-oss (#32411)
use new openai-oss for integration tests, set module-level testing model
names and improve robustness of tool tests
2025-08-05 14:18:56 -04:00
Pranav Bhartiya
4011257c25 docs: add Windows-specific setup instructions (#32399)
**Description:** This PR improves the contribution setup guide by adding
comprehensive Windows-specific instructions. The changes address a
common pain point for Windows contributors who don't have `make`
installed by default, making the LangChain contribution process more
accessible across different operating systems.
The main improvements include:

- Added a dedicated "Windows Users" section with multiple installation
options for `make` (Chocolatey, Scoop, WSL)
- Provided direct `uv` commands as alternatives to all `make` commands
throughout the setup guide
- Included Windows-specific instructions for testing, formatting,
linting, and spellchecking
- Enhanced the documentation to be more inclusive for Windows developers

This change makes it easier for Windows users to contribute to LangChain
without requiring additional tool installation, while maintaining the
existing workflow for users who already have `make` available.

**Issue:** This addresses the common barrier Windows users face when
trying to contribute to LangChain due to missing `make` commands.

**Dependencies:** None required - this is purely a documentation
improvement.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-05 15:00:03 +00:00
Kanav Bansal
9de0892a77 fix(docs): update package names across multiple integration docs (#32393)
## **Description:** 
Updated incorrect package names across multiple integration docs by
replacing underscores with hyphens to reflect their actual names on
PyPI. This aligns with the actual PyPI package names and prevents
potential confusion or installation issues.
## **Issue:** N/A
## **Dependencies:** None
## **Twitter handle:** N/A

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-04 17:38:29 +00:00
Narasimha Badrinath
dd9f5d7cde feat(docs): add langchain-gradientai as provider (#32202)
langchain-gradientai is Digitalocean's integration with Langchain. It
will help users to build langchain applications using Digitalocean's
GradientAI platform.

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-04 14:57:59 +00:00
Ammar Younas
d348cfe968 docs: fix minor typos in image generation description (#32375)
Description:
Fixed minor typos in the `google_imagen.ipynb` integration notebook
related to image generation prompt formatting. No functional changes
were made — just a documentation correction to improve clarity.
2025-08-04 10:52:05 -04:00
Kanav Bansal
84c5048cb8 fix(docs): correct package names in FeatureTables.js (#32377)
## **Description:** 
Updated incorrect package names in `FeatureTables.js` by replacing
underscores with hyphens to reflect their actual names on PyPI. This
aligns with the actual PyPI package names and prevents potential
confusion or installation issues.

The following package names were corrected:
- `langchain_aws` ➝ `langchain-aws`
- `langchain_community` ➝ `langchain-community`
- `langchain_elasticsearch` ➝ `langchain-elasticsearch`
- `langchain_google_community` ➝ `langchain-google-community`

 
## **Issue:** N/A
## **Dependencies:** None
## **Twitter handle:** N/A
2025-08-04 10:51:32 -04:00
garciasces
d318c655b6 fix(docs): inconsistent docs for Google Vertex AI (#32381)
Description: Documentation is inconsistent with API docs.

Current documentation implies that to use the integration you must have
credentials configured AND store the path to a service account JSON
file.

API docs explain that you must only complete EITHER of the steps
regarding credentials.

I have updated the docs to make them consistent with the API wording.
2025-08-04 10:50:50 -04:00
Kanav Bansal
df4eed0cea fix(docs): update package names, class links and package links across kv_store_feat_table.py (#32353)
## **Description:** 
Refactored multiple entries in `kv_store_feat_table.py` to ensure that
all vector store metadata is accurate, consistent, and aligned with
LangChain's latest documentation structure and PyPI naming standards.

**Key improvements across all updated entries:**
- Updated `class` links to point to their respective **docs-based
integration pages** (e.g., `/docs/integrations/stores/...`) instead of
raw API reference URLs.
- Corrected `package` display names to use **hyphenated PyPI-compliant
names** (e.g., `langchain-astradb` instead of `langchain_astradb`).
- Updated `package` links to point to the **specific class-level API
references** (e.g., `/api_reference/.../storage/...ClassName.html`) for
precision.

These improvements enhance:
- Navigation experience for users
- Alignment with PyPI and docs naming conventions
- Clarity across LangChain’s integrations documentation


 
## **Issue:** N/A
## **Dependencies:** None
## **Twitter handle:** N/A
2025-08-04 09:46:54 -04:00
Dhanesh Gujrathi
a25e196fe9 docs(docs): add link for ALPHAVANTAGE_API_KEY generation in integration notebook (#32364)
docs(alpha_vantage): add link for ALPHAVANTAGE_API_KEY generation in
integration notebook

**Description:**

This PR updates the `docs/docs/integrations/tools/alpha_vantage.ipynb`
integration notebook to help users locate the API key registration page
for Alpha Vantage. The following markdown line was added:

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-08-03 19:49:44 +00:00
Ethan Knights
3137d49bd9 docs: minor agent tools markdown improvement (#32367)
Minor sharpening of agent tool doc.
2025-08-03 15:42:19 -04:00
Raphaël
9a2f49df1f fix(docs): add missing space (#32349) 2025-07-31 09:28:51 -04:00
Mason Daugherty
32e5040a42 chore: add CLAUDE.md (#32334) 2025-07-30 23:04:45 +00:00
ccurme
a9e52ca605 chore(openai): bump openai sdk (#32322) 2025-07-30 10:58:18 -04:00
Kanav Bansal
e2bc8f19c0 docs(docs): update RAG tutorials link across multiple vector store docs (AstraDB, DatabricksVectorSearch, FAISS, Redis, etc.) (#32301)
## **Description:** 
This PR updates the internal documentation link for the RAG tutorials to
reflect the updated path. Previously, the link pointed to the root
`/docs/tutorials/`, which was generic. It now correctly routes to the
RAG-specific tutorial page for the following vector store docs.

1. AstraDBVectorStore
2. Clickhouse
3. CouchbaseSearchVectorStore
4. DatabricksVectorSearch
5. ElasticsearchStore
6. FAISS
7. Milvus
8. MongoDBAtlasVectorSearch
9. openGauss
10. PGVector
11. PGVectorStore
12. PineconeVectorStore
13. QdrantVectorStore
14. Redis
15. SQLServer

## **Issue:** N/A
## **Dependencies:** None
## **Twitter handle:** N/A
2025-07-30 09:46:01 -04:00
Mason Daugherty
fbd5a238d8 fix(core): revert "fix: tool call streaming bug with inconsistent indices from Qwen3" (#32307)
Reverts langchain-ai/langchain#32160

Original issue stems from using `ChatOpenAI` to interact with a `qwen`
model. Recommended to use
[langchain-qwq](https://python.langchain.com/docs/integrations/chat/qwq/)
which is built for Qwen
2025-07-29 10:26:38 -04:00
HerrDings
fc2f66ca80 docs: fixed link to docs of unstructured (#32306)
In the section [How to load documents from a
directory](https://python.langchain.com/docs/how_to/document_loader_directory/)
there is a link to the docs of *unstructured*. When you click this link,
it tells you that it has moved. Accordingly this PR fixes this link in
LangChain docs directly

from: `https://unstructured-io.github.io/unstructured/#`
to: `https://docs.unstructured.io/`
2025-07-29 10:12:22 -04:00
Mason Daugherty
0e287763cd fix: lint 2025-07-28 18:49:43 -04:00
Copilot
0b56c1bc4b fix: tool call streaming bug with inconsistent indices from Qwen3 (#32160)
Fixes a streaming bug where models like Qwen3 (using OpenAI interface)
send tool call chunks with inconsistent indices, resulting in
duplicate/erroneous tool calls instead of a single merged tool call.

## Problem

When Qwen3 streams tool calls, it sends chunks with inconsistent `index`
values:
- First chunk: `index=1` with tool name and partial arguments  
- Subsequent chunks: `index=0` with `name=None`, `id=None` and argument
continuation

The existing `merge_lists` function only merges chunks when their
`index` values match exactly, causing these logically related chunks to
remain separate, resulting in multiple incomplete tool calls instead of
one complete tool call.

```python
# Before fix: Results in 1 valid + 1 invalid tool call
chunk1 = AIMessageChunk(tool_call_chunks=[
    {"name": "search", "args": '{"query":', "id": "call_123", "index": 1}
])
chunk2 = AIMessageChunk(tool_call_chunks=[
    {"name": None, "args": ' "test"}', "id": None, "index": 0}  
])
merged = chunk1 + chunk2  # Creates 2 separate tool calls

# After fix: Results in 1 complete tool call
merged = chunk1 + chunk2  # Creates 1 merged tool call: search({"query": "test"})
```

## Solution

Enhanced the `merge_lists` function in `langchain_core/utils/_merge.py`
with intelligent tool call chunk merging:

1. **Preserves existing behavior**: Same-index chunks still merge as
before
2. **Adds special handling**: Tool call chunks with
`name=None`/`id=None` that don't match any existing index are now merged
with the most recent complete tool call chunk
3. **Maintains backward compatibility**: All existing functionality
works unchanged
4. **Targeted fix**: Only affects tool call chunks, doesn't change
behavior for other list items

The fix specifically handles the pattern where:
- A continuation chunk has `name=None` and `id=None` (indicating it's
part of an ongoing tool call)
- No matching index is found in existing chunks
- There exists a recent tool call chunk with a valid name or ID to merge
with

## Testing

Added comprehensive test coverage including:
-  Qwen3-style chunks with different indices now merge correctly
-  Existing same-index behavior preserved  
-  Multiple distinct tool calls remain separate
-  Edge cases handled (empty chunks, orphaned continuations)
-  Backward compatibility maintained

Fixes #31511.

<!-- START COPILOT CODING AGENT TIPS -->
---

💬 Share your feedback on Copilot coding agent for the chance to win a
$200 gift card! Click
[here](https://survey.alchemer.com/s3/8343779/Copilot-Coding-agent) to
start the survey.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-28 22:31:41 +00:00
Copilot
ad88e5aaec fix(core): resolve cache validation error by safely converting Generation to ChatGeneration objects (#32156)
## Problem

ChatLiteLLM encounters a `ValidationError` when using cache on
subsequent calls, causing the following error:

```
ValidationError(model='ChatResult', errors=[{'loc': ('generations', 0, 'type'), 'msg': "unexpected value; permitted: 'ChatGeneration'", 'type': 'value_error.const', 'ctx': {'given': 'Generation', 'permitted': ('ChatGeneration',)}}])
```

This occurs because:
1. The cache stores `Generation` objects (with `type="Generation"`)
2. But `ChatResult` expects `ChatGeneration` objects (with
`type="ChatGeneration"` and a required `message` field)
3. When cached values are retrieved, validation fails due to the type
mismatch

## Solution

Added graceful handling in both sync (`_generate_with_cache`) and async
(`_agenerate_with_cache`) cache methods to:

1. **Detect** when cached values contain `Generation` objects instead of
expected `ChatGeneration` objects
2. **Convert** them to `ChatGeneration` objects by wrapping the text
content in an `AIMessage`
3. **Preserve** all original metadata (`generation_info`)
4. **Allow** `ChatResult` creation to succeed without validation errors

## Example

```python
# Before: This would fail with ValidationError
from langchain_community.chat_models import ChatLiteLLM
from langchain_community.cache import SQLiteCache
from langchain.globals import set_llm_cache

set_llm_cache(SQLiteCache(database_path="cache.db"))
llm = ChatLiteLLM(model_name="openai/gpt-4o", cache=True, temperature=0)

print(llm.predict("test"))  # Works fine (cache empty)
print(llm.predict("test"))  # Now works instead of ValidationError

# After: Seamlessly handles both Generation and ChatGeneration objects
```

## Changes

- **`libs/core/langchain_core/language_models/chat_models.py`**: 
  - Added `Generation` import from `langchain_core.outputs`
- Enhanced cache retrieval logic in `_generate_with_cache` and
`_agenerate_with_cache` methods
- Added conversion from `Generation` to `ChatGeneration` objects when
needed

-
**`libs/core/tests/unit_tests/language_models/chat_models/test_cache.py`**:
- Added test case to validate the conversion logic handles mixed object
types

## Impact

- **Backward Compatible**: Existing code continues to work unchanged
- **Minimal Change**: Only affects cache retrieval path, no API changes
- **Robust**: Handles both legacy cached `Generation` objects and new
`ChatGeneration` objects
- **Preserves Data**: All original content and metadata is maintained
during conversion

Fixes #22389.

<!-- START COPILOT CODING AGENT TIPS -->
---

💡 You can make Copilot smarter by setting up custom instructions,
customizing its development environment and configuring Model Context
Protocol (MCP) servers. Learn more [Copilot coding agent
tips](https://gh.io/copilot-coding-agent-tips) in the docs.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-28 22:28:16 +00:00
Mason Daugherty
30e3ed6a19 fix: add space in run-name for better readability 2025-07-28 17:46:27 -04:00
Mason Daugherty
8641a95c43 fix: update run-name in scheduled_test.yml to include dynamic inputs 2025-07-28 17:45:05 -04:00
Mason Daugherty
df70c5c186 chore: update actions run-names and add default inputs (#32293) 2025-07-28 17:33:27 -04:00
Mason Daugherty
d5ca77e065 fix: remove erreneous rocket emoji in run-name 2025-07-28 17:11:14 -04:00
Mason Daugherty
b7e4797e8b release(anthropic): 0.3.18 (#32292) 2025-07-28 17:07:11 -04:00
Mason Daugherty
3a487bf720 refactor(anthropic): AnthropicLLM to use Messages API (#32290)
re: #32189
2025-07-28 16:22:58 -04:00
Mason Daugherty
e5fd67024c fix: update link text for reporting security vulnerabilities in SECURITY.md 2025-07-28 15:05:31 -04:00
Mason Daugherty
b86841ac40 fix: update alt attribute for GitHub Codespace badge in README 2025-07-28 15:04:57 -04:00
Mason Daugherty
8db16b5633 fix: use new Google model names in examples (#32288) 2025-07-28 19:03:42 +00:00
Mason Daugherty
6f10160a45 fix: scripts/ errors 2025-07-28 15:03:25 -04:00
Mason Daugherty
e79e0bd6b4 fix(openai): add max_retries parameter to ChatOpenAI for handling 503 capacity errors (#32286)
Some integration tests were failing
2025-07-28 13:58:23 -04:00
ccurme
c55294ecb0 chore(core): add test for nested pydantic fields in schemas (#32285) 2025-07-28 17:27:24 +00:00
Mason Daugherty
7a26c3d233 fix: update bar_model to use the correct model version claude-3-7-sonnet-20250219 (#32284) 2025-07-28 12:57:40 -04:00
Mason Daugherty
c6ffac3ce0 refactor: mdx lint (#32282) 2025-07-28 12:56:22 -04:00
Mason Daugherty
a07d2c5016 refactor: remove references to unsupported model claude-3-sonnet-20240229 (#32281)
Addresses some (but not all) test issues brought about in #32280
2025-07-28 11:57:43 -04:00
Aleksandr Filippov
f0b6baa0ef fix(core): track within-batch deduplication in indexing num_skipped count (#32273)
**Description:** Fixes incorrect `num_skipped` count in the LangChain
indexing API. The current implementation only counts documents that
already exist in RecordManager (cross-batch duplicates) but fails to
count documents removed during within-batch deduplication via
`_deduplicate_in_order()`.

This PR adds tracking of the original batch size before deduplication
and includes the difference in `num_skipped`, ensuring that `num_added +
num_skipped` equals the total number of input documents.

**Issue:** Fixes incorrect document count reporting in indexing
statistics

**Dependencies:** None

Fixes #32272

---------

Co-authored-by: Alex Feel <afilippov@spotware.com>
2025-07-28 09:58:51 -04:00
Mason Daugherty
12c0e9b7d8 fix(docs): local API reference documentation build (#32271)
ensure all relevant packages are correctly processed - cli wasn't
included, also fix ValueError
2025-07-28 00:50:20 -04:00
Mason Daugherty
ed682ae62d fix: explicitly tell uv to copy when using devcontainer (#32267) 2025-07-28 00:01:06 -04:00
Mason Daugherty
caf1919217 fix: devcontainer to use volume to store the workspace (#32266)
should resolve the file sharing issue for users on macOS.
2025-07-27 23:43:06 -04:00
Mason Daugherty
904066f1ec feat: add VSCode configuration files for Python development (#32263) 2025-07-27 23:37:59 -04:00
Mason Daugherty
96cbd90cba fix: formatting issues in docstrings (#32265)
Ensures proper reStructuredText formatting by adding the required blank
line before closing docstring quotes, which resolves the "Block quote
ends without a blank line; unexpected unindent" warning.
2025-07-27 23:37:47 -04:00
Mason Daugherty
a8a2cff129 Merge branch 'master' of github.com:langchain-ai/langchain 2025-07-27 23:34:59 -04:00
Mason Daugherty
f4ff4514ef fix: update workspace folder path in devcontainer configuration 2025-07-27 23:34:57 -04:00
Mason Daugherty
d1679cec91 chore: add .editorconfig for consistent coding styles across files (#32261)
Following existing codebase conventions
2025-07-27 23:25:30 -04:00
Mason Daugherty
5295f2add0 fix: update dev container name to match service name 2025-07-27 22:30:16 -04:00
Mason Daugherty
5f5b87e9a3 fix: update service name in devcontainer configuration 2025-07-27 22:28:47 -04:00
Mason Daugherty
e0ef98dac0 feat: add markdownlint configuration file (#32264) 2025-07-27 22:24:58 -04:00
Mason Daugherty
62212c7ee2 fix: update links in SECURITY.md to use markdown format 2025-07-27 21:54:25 -04:00
Mason Daugherty
9d38f170ce refactor: enhance workflow names and descriptions for clarity (#32262) 2025-07-27 21:31:59 -04:00
Mason Daugherty
c6cb1fae61 fix: devcontainer (#32260) 2025-07-27 20:24:16 -04:00
Kanav Bansal
e42b1d23dc docs(docs): update RAG tutorials link to point to correct path (#32256)
- **Description:** This PR updates the internal documentation link for
the RAG tutorials to reflect the updated path. Previously, the link
pointed to the root `/docs/tutorials/`, which was generic. It now
correctly routes to the RAG-specific tutorial page.
  - **Issue:** N/A
  - **Dependencies:** None
  - **Twitter handle:** N/A
2025-07-27 20:00:41 -04:00
Mason Daugherty
53d0bfe9cd refactor: markdownlint (#32259) 2025-07-27 20:00:16 -04:00
Mason Daugherty
eafab52483 refactor: markdownlint SECURITY.md (#32258) 2025-07-27 19:55:25 -04:00
Christophe Bornet
efdfa00d10 chore(langchain): add ruff rules ARG (#32110)
See https://docs.astral.sh/ruff/rules/#flake8-unused-arguments-arg

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-26 18:32:34 -04:00
Christophe Bornet
a2ad5aca41 chore(langchain): add ruff rules TC (#31921)
See https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc
2025-07-26 18:27:26 -04:00
Mason Daugherty
5ecbb5f277 fix(docs): temporary workaround until the underlying dependency issues in the AI21 package ecosystem are resolved. (#32248) 2025-07-25 15:12:44 -04:00
Mason Daugherty
c1028171af fix(docs): update protobuf version constraint to <5.0 in vercel_overrides.txt (#32247) 2025-07-25 15:08:44 -04:00
ccurme
f6236d9f12 fix(infra): add pypdf to vercel overrides (#32242)
>   × No solution found when resolving dependencies:
  ╰─▶ Because only langchain-neo4j==0.5.0 is available and
langchain-neo4j==0.5.0 depends on neo4j-graphrag>=1.9.0, we can conclude
that all versions of langchain-neo4j depend on neo4j-graphrag>=1.9.0.
      And because only neo4j-graphrag<=1.9.0 is available and
neo4j-graphrag==1.9.0 depends on pypdf>=5.1.0,<6.0.0, we can conclude
that all versions of langchain-neo4j depend on pypdf>=5.1.0,<6.0.0.
And because langchain-upstage==0.6.0 depends on pypdf>=4.2.0,<5.0.0
and only langchain-upstage==0.6.0 is available, we can conclude that
all versions of langchain-neo4j and all versions of langchain-upstage
      are incompatible.
And because you require langchain-neo4j and langchain-upstage, we can
      conclude that your requirements are unsatisfiable.

---------

Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-25 15:05:21 -04:00
Mason Daugherty
df20f111a8 fix(docs): add validation for repository format and name in API docs build workflow (#32246)
for build
2025-07-25 15:05:06 -04:00
Eugene Yurtsev
db22311094 ci(infra): no need for . in the regexp (#32245)
No need for allowing `.`
2025-07-25 15:02:02 -04:00
Mason Daugherty
f624ad489a feat(docs): improve devx, fix Makefile targets (#32237)
**TL;DR much of the provided `Makefile` targets were broken, and any
time I wanted to preview changes locally I either had to refer to a
command Chester gave me or try waiting on a Vercel preview deployment.
With this PR, everything should behave like normal.**

Significant updates to the `Makefile` and documentation files, focusing
on improving usability, adding clear messaging, and fixing/enhancing
documentation workflows.

### Updates to `Makefile`:

#### Enhanced build and cleaning processes:
- Added informative messages (e.g., "📚 Building LangChain
documentation...") to makefile targets like `docs_build`, `docs_clean`,
and `api_docs_build` for better user feedback during execution.
- Introduced a `clean-cache` target to the `docs` `Makefile` to clear
cached dependencies and ensure clean builds.

#### Improved dependency handling:
- Modified `install-py-deps` to create a `.venv/deps_installed` marker,
preventing redundant/duplicate dependency installations and improving
efficiency.

#### Streamlined file generation and infrastructure setup:
- Added caching for the LangServe README download and parallelized
feature table generation
- Added user-friendly completion messages for targets like `copy-infra`
and `render`.

#### Documentation server updates:
- Enhanced the `start` target with messages indicating server start and
URL for local documentation viewing.

---

### Documentation Improvements:

#### Content clarity and consistency:
- Standardized section titles for consistency across documentation
files.
[[1]](diffhunk://#diff-9b1a85ea8a9dcf79f58246c88692cd7a36316665d7e05a69141cfdc50794c82aL1-R1)
[[2]](diffhunk://#diff-944008ad3a79d8a312183618401fcfa71da0e69c75803eff09b779fc8e03183dL1-R1)
- Refined phrasing and formatting in sections like "Dependency
management" and "Formatting and linting" for better readability.
[[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L6-R6)
[[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L84-R82)

#### Enhanced workflows:
- Updated instructions for building and viewing documentation locally,
including tips for specifying server ports and handling API reference
previews.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L60-R94)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126)
- Expanded guidance on cleaning documentation artifacts and using
linting tools effectively.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142)

#### API reference documentation:
- Improved instructions for generating and formatting in-code
documentation, highlighting best practices for docstring writing.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L144-R186)

---

### Minor Changes:
- Added support for a new package name (`langchain_v1`) in the API
documentation generation script.
- Fixed minor capitalization and formatting issues in documentation
files.
[[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L40-R40)
[[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L166-R160)

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-25 14:49:03 -04:00
Eugene Yurtsev
549ecd3e78 chore(infra): harden api docs build workflow (#32243)
Harden permissions for api docs build workflow
2025-07-25 14:40:20 -04:00
dishaprakash
a0671676ae feat(docs): add PGVectorStore (#30950)
Thank you for contributing to LangChain!

-  **Adding documentation for PGVectorStore**: 
docs: Adding documentation for the new PGVectorStore as a part of
langchain-postgres

- **Add docs**: The notebook for PGVectorStore is now added to the
directory `docs/docs/integrations`.
As a part of this change, we've also updated the VectorStore features
table and VectorStoreTabs

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-07-25 13:22:58 -04:00
Christophe Bornet
12ae42c5e9 chore(langchain): add ruff rules D1 (except D100 and D104) (#32123) 2025-07-25 11:59:48 -04:00
Christophe Bornet
e1238b8085 chore(langchain): add ruff rules SLF (#32112)
See https://docs.astral.sh/ruff/rules/private-member-access/
2025-07-25 11:56:40 -04:00
Chaitanya varma
8f5ec20ccf chore(langchain): strip_ansi fucntion to remove ANSI escape sequences (#32200)
**Description:** 
Fixes a bug in the file callback test where ANSI escape codes were
causing test failures. The improved test now properly handles ANSI
escape sequences by:
- Using exact string comparison instead of substring checking
- Applying the `strip_ansi` function consistently to all file contents
- Adding descriptive assertion messages
- Maintaining test coverage and backward compatibility

The changes ensure tests pass reliably even when terminal control
sequences are present in the output

**Issue:** Fixes #32150

**Dependencies:** None required - uses existing dependencies only.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-07-25 15:53:19 +00:00
niceg
0d6f915442 fix: LLM mimicking Unicode responses due to forced Unicode conversion of non-ASCII characters. (#32222)
fix: Fix LLM mimicking Unicode responses due to forced Unicode
conversion of non-ASCII characters.

- **Description:** This PR fixes an issue where the LLM would mimic
Unicode responses due to forced Unicode conversion of non-ASCII
characters in tool calls. The fix involves disabling the `ensure_ascii`
flag in `json.dumps()` when converting tool calls to OpenAI format.
- **Issue:** Fixes ↓↓↓
input:
```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "你好啊集团"}'}}]}
```
output:
```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "\\u4f60\\u597d\\u554a\\u96c6\\u56e2"}'}}]}
```
then:
llm will mimic outputting unicode. Unicode's vast number of symbols can
lengthen LLM responses, leading to slower performance.
<img width="686" height="277" alt="image"
src="https://github.com/user-attachments/assets/28f3b007-3964-4455-bee2-68f86ac1906d"
/>

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-24 17:01:31 -04:00
Mason Daugherty
d53ebf367e fix(docs): capitalization, codeblock formatting, and hyperlinks, note blocks (#32235)
widespread cleanup attempt
2025-07-24 16:55:04 -04:00
Copilot
54542b9385 docs(openai): add comprehensive documentation and examples for extra_body + others (#32149)
This PR addresses the common issue where users struggle to pass custom
parameters to OpenAI-compatible APIs like LM Studio, vLLM, and others.
The problem occurs when users try to use `model_kwargs` for custom
parameters, which causes API errors.

## Problem

Users attempting to pass custom parameters (like LM Studio's `ttl`
parameter) were getting errors:

```python
#  This approach fails
llm = ChatOpenAI(
    base_url="http://localhost:1234/v1",
    model="mlx-community/QwQ-32B-4bit",
    model_kwargs={"ttl": 5}  # Causes TypeError: unexpected keyword argument 'ttl'
)
```

## Solution

The `extra_body` parameter is the correct way to pass custom parameters
to OpenAI-compatible APIs:

```python
#  This approach works correctly
llm = ChatOpenAI(
    base_url="http://localhost:1234/v1",
    model="mlx-community/QwQ-32B-4bit",
    extra_body={"ttl": 5}  # Custom parameters go in extra_body
)
```

## Changes Made

1. **Enhanced Documentation**: Updated the `extra_body` parameter
docstring with comprehensive examples for LM Studio, vLLM, and other
providers

2. **Added Documentation Section**: Created a new "OpenAI-compatible
APIs" section in the main class docstring with practical examples

3. **Unit Tests**: Added tests to verify `extra_body` functionality
works correctly:
- `test_extra_body_parameter()`: Verifies custom parameters are included
in request payload
- `test_extra_body_with_model_kwargs()`: Ensures `extra_body` and
`model_kwargs` work together

4. **Clear Guidance**: Documented when to use `extra_body` vs
`model_kwargs`

## Examples Added

**LM Studio with TTL (auto-eviction):**
```python
ChatOpenAI(
    base_url="http://localhost:1234/v1",
    api_key="lm-studio",
    model="mlx-community/QwQ-32B-4bit",
    extra_body={"ttl": 300}  # Auto-evict after 5 minutes
)
```

**vLLM with custom sampling:**
```python
ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="EMPTY",
    model="meta-llama/Llama-2-7b-chat-hf",
    extra_body={
        "use_beam_search": True,
        "best_of": 4
    }
)
```

## Why This Works

- `model_kwargs` parameters are passed directly to the OpenAI client's
`create()` method, causing errors for non-standard parameters
- `extra_body` parameters are included in the HTTP request body, which
is exactly what OpenAI-compatible APIs expect for custom parameters

Fixes #32115.

<!-- START COPILOT CODING AGENT TIPS -->
---

💬 Share your feedback on Copilot coding agent for the chance to win a
$200 gift card! Click
[here](https://survey.alchemer.com/s3/8343779/Copilot-Coding-agent) to
start the survey.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-24 16:43:16 -04:00
Mason Daugherty
7d2a13f519 fix: various typos (#32231) 2025-07-24 12:35:08 -04:00
Christophe Bornet
0b34be4ce5 refactor(langchain): refactor unit test stub classes (#32209)
See
https://github.com/langchain-ai/langchain/pull/32098#discussion_r2225961563
2025-07-24 11:05:56 -04:00
Mason Daugherty
6f3169eb49 chore: update copilot development guidelines for clarity and structure (#32230) 2025-07-24 15:05:09 +00:00
Eugene Yurtsev
7995c719c5 chore(langchain_v1): clean anything uncertain (#32228)
Further clean up of namespace:

- Removed prompts (we'll re-add in a separate commit)
- Remove LocalFileStore until we can review whether all the
implementation details are necessary
- Remove message processing logic from memory (we'll figure out where to
expose it)
- Remove `Tool` primitive (should be sufficient to use `BaseTool` for
typing purposes)
- Remove utilities to create kv stores. Unclear if they've had much
usage outside MultiparentRetriever
2025-07-24 14:41:05 +00:00
Mason Daugherty
bdf1cd383c fix(langchain): update deps 2025-07-24 10:37:08 -04:00
Mason Daugherty
77c981999e fix(text-splitters): update langchain-core version to 0.3.72 2025-07-24 10:35:07 -04:00
Mason Daugherty
7f015b6f14 fix(text-splitters): update lock for release 2025-07-24 10:32:04 -04:00
Mason Daugherty
71ad451e1f Merge branch 'master' of github.com:langchain-ai/langchain 2025-07-24 10:24:17 -04:00
Mason Daugherty
2c42893703 fix(langchain): update langchain-core version to 0.3.72 2025-07-24 10:24:04 -04:00
Mason Daugherty
0e139fb9a6 release(langchain): 0.3.27 (#32227) 2025-07-24 10:20:20 -04:00
tanwirahmad
622bb05751 fix(langchain): class HTMLSemanticPreservingSplitter ignores the text inside the div tag (#32213)
**Description:** We collect the text from the "html", "body", "div", and
"main" nodes, if they have any.

**Issue:** Fixes #32206.
2025-07-24 10:09:03 -04:00
Eugene Yurtsev
56dde3ade3 feat(langchain): v1 scaffolding (#32166)
This PR adds scaffolding for langchain 1.0 entry package.

Most contents have been removed. 

Currently remaining entrypoints for:

* chat models
* embedding models
* memory -> trimming messages, filtering messages and counting tokens
[we may remove this]
* prompts -> we may remove some prompts
* storage: primarily to support cache backed embeddings, may remove the
kv store
* tools -> report tool primitives

Things to be added:

* Selected agent implementations
* Selected workflows
* Common primitives: messages, Document
* Primitives for type hinting: BaseChatModel, BaseEmbeddings
* Selected retrievers
* Selected text splitters

Things to be removed:

* Globals needs to be removed (needs an update in langchain core)


Todos: 

* TBD indexing api (requires sqlalchemy which we don't want as a
dependency)
* Be explicit about public/private interfaces (e.g., likely rename
chat_models.base.py to something more internal)
* Remove dockerfiles
* Update module doc-strings and README.md
2025-07-24 09:47:48 -04:00
Mason Daugherty
bd3d6496f3 release(core): 0.3.72 (#32214)
fixes #32170
2025-07-23 20:33:48 -04:00
jmaillefaud
fb5da8384e fix(core): Dereference Refs for pydantic schema fails in tool schema generation (#32203)
The `_dereference_refs_helper` in `langchain_core.utils.json_schema`
incorrectly handled objects with a reference and other fields.

**Issue**: #32170

# Description

We change the check so that it accepts other keys in the object.
2025-07-23 20:28:27 -04:00
Maxime Grenu
a7d0e42f3f docs: fix typos in documentation (#32201)
## Summary
- Fixed redundant word "done" in SECURITY.md line 69  
- Fixed grammar errors in Fireworks README.md line 77: "how it fares
compares" → "how it compares" and "in terms just" → "in terms of"

## Test plan
- [x] Verified changes improve readability and correct grammar
- [x] No functional changes, documentation only

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-07-23 10:43:25 -04:00
1313 changed files with 79282 additions and 39389 deletions

View File

@@ -15,12 +15,12 @@ You may use the button above, or follow these steps to open this repo in a Codes
1. Click **Create codespace on master**.
For more info, check out the [GitHub documentation](https://docs.github.com/en/free-pro-team@latest/github/developing-online-with-codespaces/creating-a-codespace#creating-a-codespace).
## VS Code Dev Containers
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
> [!NOTE]
> [!NOTE]
> If you click the link above you will open the main repo (`langchain-ai/langchain`) and *not* your local cloned repo. This is fine if you only want to run and test the library, but if you want to contribute you can use the link below and replace with your username and cloned repo name:
```txt

View File

@@ -4,7 +4,7 @@ services:
build:
dockerfile: libs/langchain/dev.Dockerfile
context: ..
networks:
- langchain-network

View File

@@ -129,4 +129,4 @@ For answers to common questions about this code of conduct, see the FAQ at
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations
[translations]: https://www.contributor-covenant.org/translations

View File

@@ -3,8 +3,4 @@
Hi there! Thank you for even being interested in contributing to LangChain.
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether they involve new features, improved infrastructure, better documentation, or bug fixes.
To learn how to contribute to LangChain, please follow the [contribution guide here](https://python.langchain.com/docs/contributing/).
## New features
For new features, please start a new [discussion](https://forum.langchain.com/), where the maintainers will help with scoping out the necessary changes.
To learn how to contribute to LangChain, please follow the [contribution guide here](https://docs.langchain.com/oss/python/contributing).

View File

@@ -1,11 +1,12 @@
name: "\U0001F41B Bug Report"
description: Report a bug in LangChain. To report a security issue, please instead use the security option below. For questions, please use the LangChain forum.
labels: ["bug"]
type: bug
body:
- type: markdown
attributes:
value: |
Thank you for taking the time to file a bug report.
Thank you for taking the time to file a bug report.
Use this to report BUGS in LangChain. For usage questions, feature requests and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
@@ -13,9 +14,7 @@ body:
if there's another way to solve your problem:
* [LangChain Forum](https://forum.langchain.com/),
* [LangChain Github Issues](https://github.com/langchain-ai/langchain/issues?q=is%3Aissue),
* [LangChain documentation with the integrated search](https://python.langchain.com/docs/get_started/introduction),
* [LangChain how-to guides](https://python.langchain.com/docs/how_to/),
* [LangChain documentation with the integrated search](https://docs.langchain.com/oss/python/langchain/overview),
* [API Reference](https://python.langchain.com/api_reference/),
* [LangChain ChatBot](https://chat.langchain.com/)
* [GitHub search](https://github.com/langchain-ai/langchain),
@@ -25,7 +24,7 @@ body:
label: Checked other resources
description: Please confirm and check all the following options.
options:
- label: This is a bug, not a usage question. For questions, please use the LangChain Forum (https://forum.langchain.com/).
- label: This is a bug, not a usage question.
required: true
- label: I added a clear and descriptive title that summarizes this issue.
required: true
@@ -35,6 +34,8 @@ body:
required: true
- label: The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
required: true
- label: This is not related to the langchain-community package.
required: true
- label: I read what a minimal reproducible example is (https://stackoverflow.com/help/minimal-reproducible-example).
required: true
- label: I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.
@@ -50,7 +51,7 @@ body:
If a maintainer can copy it, run it, and see it right away, there's a much higher chance that you'll be able to get help.
**Important!**
**Important!**
* Avoid screenshots when possible, as they are hard to read and (more importantly) don't allow others to copy-and-paste your code.
* Reduce your code to the minimum required to reproduce the issue if possible. This makes it much easier for others to help you.
@@ -58,14 +59,14 @@ body:
* INCLUDE the language label (e.g. `python`) after the first three backticks to enable syntax highlighting. (e.g., ```python rather than ```).
placeholder: |
The following code:
The following code:
```python
from langchain_core.runnables import RunnableLambda
def bad_code(inputs) -> int:
raise NotImplementedError('For demo purpose')
chain = RunnableLambda(bad_code)
chain.invoke('Hello!')
```

View File

@@ -1,6 +1,9 @@
blank_issues_enabled: false
version: 2.1
contact_links:
- name: LangChain Forum
url: https://forum.langchain.com/
about: General community discussions, support, and feature requests
- name: 📚 Documentation
url: https://github.com/langchain-ai/docs/issues/new?template=langchain.yml
about: Report an issue related to the LangChain documentation
- name: 💬 LangChain Forum
url: https://forum.langchain.com/
about: General community discussions and support

View File

@@ -1,59 +0,0 @@
name: Documentation
description: Report an issue related to the LangChain documentation.
title: "docs: <Please write a comprehensive title after the 'docs: ' prefix>"
labels: [documentation]
body:
- type: markdown
attributes:
value: |
Thank you for taking the time to report an issue in the documentation.
Only report issues with documentation here, explain if there are
any missing topics or if you found a mistake in the documentation.
Do **NOT** use this to ask usage questions or reporting issues with your code.
If you have usage questions or need help solving some problem,
please use the [LangChain Forum](https://forum.langchain.com/).
If you're in the wrong place, here are some helpful links to find a better
place to ask your question:
* [LangChain Forum](https://forum.langchain.com/),
* [LangChain Github Issues](https://github.com/langchain-ai/langchain/issues?q=is%3Aissue),
* [LangChain documentation with the integrated search](https://python.langchain.com/docs/get_started/introduction),
* [LangChain how-to guides](https://python.langchain.com/docs/how_to/),
* [API Reference](https://python.langchain.com/api_reference/),
* [LangChain ChatBot](https://chat.langchain.com/)
* [GitHub search](https://github.com/langchain-ai/langchain),
- type: input
id: url
attributes:
label: URL
description: URL to documentation
validations:
required: false
- type: checkboxes
id: checks
attributes:
label: Checklist
description: Please confirm and check all the following options.
options:
- label: I added a very descriptive title to this issue.
required: true
- label: I included a link to the documentation page I am referring to (if applicable).
required: true
- type: textarea
attributes:
label: "Issue with current documentation:"
description: >
Please make sure to leave a reference to the document/code you're
referring to. Feel free to include names of classes, functions, methods
or concepts you'd like to see documented more.
- type: textarea
attributes:
label: "Idea or request for content:"
description: >
Please describe as clearly as possible what topics you think are missing
from the current documentation.

View File

@@ -0,0 +1,118 @@
name: "✨ Feature Request"
description: Request a new feature or enhancement for LangChain. For questions, please use the LangChain forum.
labels: ["feature request"]
type: feature
body:
- type: markdown
attributes:
value: |
Thank you for taking the time to request a new feature.
Use this to request NEW FEATURES or ENHANCEMENTS in LangChain. For bug reports, please use the bug report template. For usage questions and general design questions, please use the [LangChain Forum](https://forum.langchain.com/).
Relevant links to check before filing a feature request to see if your request has already been made or
if there's another way to achieve what you want:
* [LangChain Forum](https://forum.langchain.com/),
* [LangChain documentation with the integrated search](https://docs.langchain.com/oss/python/langchain/overview),
* [API Reference](https://python.langchain.com/api_reference/),
* [LangChain ChatBot](https://chat.langchain.com/)
* [GitHub search](https://github.com/langchain-ai/langchain),
- type: checkboxes
id: checks
attributes:
label: Checked other resources
description: Please confirm and check all the following options.
options:
- label: This is a feature request, not a bug report or usage question.
required: true
- label: I added a clear and descriptive title that summarizes the feature request.
required: true
- label: I used the GitHub search to find a similar feature request and didn't find it.
required: true
- label: I checked the LangChain documentation and API reference to see if this feature already exists.
required: true
- label: This is not related to the langchain-community package.
required: true
- type: textarea
id: feature-description
validations:
required: true
attributes:
label: Feature Description
description: |
Please provide a clear and concise description of the feature you would like to see added to LangChain.
What specific functionality are you requesting? Be as detailed as possible.
placeholder: |
I would like LangChain to support...
This feature would allow users to...
- type: textarea
id: use-case
validations:
required: true
attributes:
label: Use Case
description: |
Describe the specific use case or problem this feature would solve.
Why do you need this feature? What problem does it solve for you or other users?
placeholder: |
I'm trying to build an application that...
Currently, I have to work around this by...
This feature would help me/users to...
- type: textarea
id: proposed-solution
validations:
required: false
attributes:
label: Proposed Solution
description: |
If you have ideas about how this feature could be implemented, please describe them here.
This is optional but can be helpful for maintainers to understand your vision.
placeholder: |
I think this could be implemented by...
The API could look like...
```python
# Example of how the feature might work
```
- type: textarea
id: alternatives
validations:
required: false
attributes:
label: Alternatives Considered
description: |
Have you considered any alternative solutions or workarounds?
What other approaches have you tried or considered?
placeholder: |
I've tried using...
Alternative approaches I considered:
1. ...
2. ...
But these don't work because...
- type: textarea
id: additional-context
validations:
required: false
attributes:
label: Additional Context
description: |
Add any other context, screenshots, examples, or references that would help explain your feature request.
placeholder: |
Related issues: #...
Similar features in other libraries:
- ...
Additional context or examples:
- ...

View File

@@ -4,12 +4,7 @@ body:
- type: markdown
attributes:
value: |
Thanks for your interest in LangChain! 🚀
If you are not a LangChain maintainer or were not asked directly by a maintainer to create an issue, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead.
You are a LangChain maintainer if you maintain any of the packages inside of the LangChain repository
or are a regular contributor to LangChain with previous merged pull requests.
If you are not a LangChain maintainer, employee, or were not asked directly by a maintainer to create an issue, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead.
- type: checkboxes
id: privileged
attributes:

91
.github/ISSUE_TEMPLATE/task.yml vendored Normal file
View File

@@ -0,0 +1,91 @@
name: "📋 Task"
description: Create a task for project management and tracking by LangChain maintainers. If you are not a maintainer, please use other templates or the forum.
labels: ["task"]
type: task
body:
- type: markdown
attributes:
value: |
Thanks for creating a task to help organize LangChain development.
This template is for **maintainer tasks** such as project management, development planning, refactoring, documentation updates, and other organizational work.
If you are not a LangChain maintainer or were not asked directly by a maintainer to create a task, then please start the conversation on the [LangChain Forum](https://forum.langchain.com/) instead or use the appropriate bug report or feature request templates on the previous page.
- type: checkboxes
id: maintainer
attributes:
label: Maintainer task
description: Confirm that you are allowed to create a task here.
options:
- label: I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create a task here.
required: true
- type: textarea
id: task-description
attributes:
label: Task Description
description: |
Provide a clear and detailed description of the task.
What needs to be done? Be specific about the scope and requirements.
placeholder: |
This task involves...
The goal is to...
Specific requirements:
- ...
- ...
validations:
required: true
- type: textarea
id: acceptance-criteria
attributes:
label: Acceptance Criteria
description: |
Define the criteria that must be met for this task to be considered complete.
What are the specific deliverables or outcomes expected?
placeholder: |
This task will be complete when:
- [ ] ...
- [ ] ...
- [ ] ...
validations:
required: true
- type: textarea
id: context
attributes:
label: Context and Background
description: |
Provide any relevant context, background information, or links to related issues/PRs.
Why is this task needed? What problem does it solve?
placeholder: |
Background:
- ...
Related issues/PRs:
- #...
Additional context:
- ...
validations:
required: false
- type: textarea
id: dependencies
attributes:
label: Dependencies
description: |
List any dependencies or blockers for this task.
Are there other tasks, issues, or external factors that need to be completed first?
placeholder: |
This task depends on:
- [ ] Issue #...
- [ ] PR #...
- [ ] External dependency: ...
Blocked by:
- ...
validations:
required: false

View File

@@ -1,3 +1,5 @@
(Replace this entire block of text)
Thank you for contributing to LangChain! Follow these steps to mark your pull request as ready for review. **If any of these steps are not completed, your PR will not be considered for review.**
- [ ] **PR title**: Follows the format: {TYPE}({SCOPE}): {DESCRIPTION}
@@ -9,14 +11,13 @@ Thank you for contributing to LangChain! Follow these steps to mark your pull re
- feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert, release
- Allowed `{SCOPE}` values (optional):
- core, cli, langchain, standard-tests, docs, anthropic, chroma, deepseek, exa, fireworks, groq, huggingface, mistralai, nomic, ollama, openai, perplexity, prompty, qdrant, xai
- Note: the `{DESCRIPTION}` must not start with an uppercase letter.
- *Note:* the `{DESCRIPTION}` must not start with an uppercase letter.
- Once you've written the title, please delete this checklist item; do not include it in the PR.
- [ ] **PR message**: ***Delete this entire checklist*** and replace with
- **Description:** a description of the change. Include a [closing keyword](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) if applicable to a relevant issue.
- **Issue:** the issue # it fixes, if applicable (e.g. Fixes #123)
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, you must include:
1. A test for the integration, preferably unit tests that do not rely on network access,
@@ -26,7 +27,7 @@ Thank you for contributing to LangChain! Follow these steps to mark your pull re
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to `pyproject.toml` files (even optional ones) unless they are **required** for unit tests.
- Most PRs should not touch more than one package.
- Please do not add dependencies to `pyproject.toml` files (even optional ones) unless they are **required** for unit tests.
- Changes should be backwards compatible.
- Make sure optional dependencies are imported within a function.

View File

@@ -4,4 +4,4 @@ RUN pip install httpx PyGithub "pydantic==2.0.2" pydantic-settings "pyyaml>=5.3.
COPY ./app /app
CMD ["python", "/app/main.py"]
CMD ["python", "/app/main.py"]

View File

@@ -1,11 +1,13 @@
# Adapted from https://github.com/tiangolo/fastapi/blob/master/.github/actions/people/action.yml
# TODO: fix this, migrate to new docs repo?
name: "Generate LangChain People"
description: "Generate the data for the LangChain People page"
author: "Jacob Lee <jacob@langchain.dev>"
inputs:
token:
description: 'User token, to read the GitHub API. Can be passed in using {{ secrets.LANGCHAIN_PEOPLE_GITHUB_TOKEN }}'
description: "User token, to read the GitHub API. Can be passed in using {{ secrets.LANGCHAIN_PEOPLE_GITHUB_TOKEN }}"
required: true
runs:
using: 'docker'
image: 'Dockerfile'
using: "docker"
image: "Dockerfile"

View File

@@ -1,12 +1,24 @@
# TODO: https://docs.astral.sh/uv/guides/integration/github/#caching
# Helper to set up Python and uv with caching
name: uv-install
description: Set up Python and uv
description: Set up Python and uv with caching
inputs:
python-version:
description: Python version, supporting MAJOR.MINOR only
required: true
enable-cache:
description: Enable caching for uv dependencies
required: false
default: "true"
cache-suffix:
description: Custom cache key suffix for cache invalidation
required: false
default: ""
working-directory:
description: Working directory for cache glob scoping
required: false
default: "**"
env:
UV_VERSION: "0.5.25"
@@ -15,7 +27,13 @@ runs:
using: composite
steps:
- name: Install uv and set the python version
uses: astral-sh/setup-uv@v5
uses: astral-sh/setup-uv@v6
with:
version: ${{ env.UV_VERSION }}
python-version: ${{ inputs.python-version }}
enable-cache: ${{ inputs.enable-cache }}
cache-dependency-glob: |
${{ inputs.working-directory }}/pyproject.toml
${{ inputs.working-directory }}/uv.lock
${{ inputs.working-directory }}/requirements*.txt
cache-suffix: ${{ inputs.cache-suffix }}

80
.github/pr-file-labeler.yml vendored Normal file
View File

@@ -0,0 +1,80 @@
# Label PRs (config)
# Automatically applies labels based on changed files and branch patterns
# Core packages
core:
- changed-files:
- any-glob-to-any-file:
- "libs/core/**/*"
langchain:
- changed-files:
- any-glob-to-any-file:
- "libs/langchain/**/*"
- "libs/langchain_v1/**/*"
v1:
- changed-files:
- any-glob-to-any-file:
- "libs/langchain_v1/**/*"
cli:
- changed-files:
- any-glob-to-any-file:
- "libs/cli/**/*"
standard-tests:
- changed-files:
- any-glob-to-any-file:
- "libs/standard-tests/**/*"
# Partner integrations
integration:
- changed-files:
- any-glob-to-any-file:
- "libs/partners/**/*"
# Infrastructure and DevOps
infra:
- changed-files:
- any-glob-to-any-file:
- ".github/**/*"
- "Makefile"
- ".pre-commit-config.yaml"
- "scripts/**/*"
- "docker/**/*"
- "Dockerfile*"
github_actions:
- changed-files:
- any-glob-to-any-file:
- ".github/workflows/**/*"
- ".github/actions/**/*"
dependencies:
- changed-files:
- any-glob-to-any-file:
- "**/pyproject.toml"
- "uv.lock"
- "**/requirements*.txt"
- "**/poetry.lock"
# Documentation
documentation:
- changed-files:
- any-glob-to-any-file:
- "docs/**/*"
- "**/*.md"
- "**/*.rst"
- "**/README*"
# Security related changes
security:
- changed-files:
- any-glob-to-any-file:
- "**/*security*"
- "**/*auth*"
- "**/*credential*"
- "**/*secret*"
- "**/*token*"
- ".github/workflows/security*"

41
.github/pr-title-labeler.yml vendored Normal file
View File

@@ -0,0 +1,41 @@
# PR title labeler config
#
# Labels PRs based on conventional commit patterns in titles
#
# Format: type(scope): description or type!: description (breaking)
add-missing-labels: true
clear-prexisting: false
include-commits: false
include-title: true
label-for-breaking-changes: breaking
label-mapping:
documentation: ["docs"]
feature: ["feat"]
fix: ["fix"]
infra: ["build", "ci", "chore"]
integration:
[
"anthropic",
"chroma",
"deepseek",
"exa",
"fireworks",
"groq",
"huggingface",
"mistralai",
"nomic",
"ollama",
"openai",
"perplexity",
"prompty",
"qdrant",
"xai",
]
linting: ["style"]
performance: ["perf"]
refactor: ["refactor"]
release: ["release"]
revert: ["revert"]
tests: ["test"]

View File

@@ -1,16 +1,29 @@
"""Analyze git diffs to determine which directories need to be tested.
Intelligently determines which LangChain packages and directories need to be tested,
linted, or built based on the changes. Handles dependency relationships between
packages, maps file changes to appropriate CI job configurations, and outputs JSON
configurations for GitHub Actions.
- Maps changed files to affected package directories (libs/core, libs/partners/*, etc.)
- Builds dependency graph to include dependent packages when core components change
- Generates test matrix configurations with appropriate Python versions
- Handles special cases for Pydantic version testing and performance benchmarks
Used as part of the check_diffs workflow.
"""
import glob
import json
import os
import sys
from collections import defaultdict
from typing import Dict, List, Set
from pathlib import Path
from typing import Dict, List, Set
import tomllib
from packaging.requirements import Requirement
from get_min_versions import get_min_version_from_toml
from packaging.requirements import Requirement
LANGCHAIN_DIRS = [
"libs/core",
@@ -19,7 +32,7 @@ LANGCHAIN_DIRS = [
"libs/langchain_v1",
]
# when set to True, we are ignoring core dependents
# When set to True, we are ignoring core dependents
# in order to be able to get CI to pass for each individual
# package that depends on core
# e.g. if you touch core, we don't then add textsplitters/etc to CI
@@ -38,7 +51,7 @@ IGNORED_PARTNERS = [
]
PY_312_MAX_PACKAGES = [
"libs/partners/chroma", # https://github.com/chroma-core/chroma/issues/4382
"libs/partners/chroma", # https://github.com/chroma-core/chroma/issues/4382
]
@@ -51,9 +64,9 @@ def all_package_dirs() -> Set[str]:
def dependents_graph() -> dict:
"""
Construct a mapping of package -> dependents, such that we can
run tests on all dependents of a package when a change is made.
"""Construct a mapping of package -> dependents
Done such that we can run tests on all dependents of a package when a change is made.
"""
dependents = defaultdict(set)
@@ -85,9 +98,9 @@ def dependents_graph() -> dict:
for depline in extended_deps:
if depline.startswith("-e "):
# editable dependency
assert depline.startswith(
"-e ../partners/"
), "Extended test deps should only editable install partner packages"
assert depline.startswith("-e ../partners/"), (
"Extended test deps should only editable install partner packages"
)
partner = depline.split("partners/")[1]
dep = f"langchain-{partner}"
else:
@@ -125,15 +138,16 @@ def _get_configs_for_single_dir(job: str, dir_: str) -> List[Dict[str, str]]:
elif dir_ == "libs/core":
py_versions = ["3.9", "3.10", "3.11", "3.12", "3.13"]
# custom logic for specific directories
elif dir_ == "libs/partners/milvus":
# milvus doesn't allow 3.12 because they declare deps in funny way
py_versions = ["3.9", "3.11"]
elif dir_ in PY_312_MAX_PACKAGES:
py_versions = ["3.9", "3.12"]
elif dir_ == "libs/langchain" and job == "extended-tests":
py_versions = ["3.9", "3.13"]
elif dir_ == "libs/langchain_v1":
py_versions = ["3.10", "3.13"]
elif dir_ in {"libs/cli"}:
py_versions = ["3.10", "3.13"]
elif dir_ == ".":
# unable to install with 3.13 because tokenizers doesn't support 3.13 yet
@@ -271,7 +285,7 @@ if __name__ == "__main__":
dirs_to_run["extended-test"].add(dir_)
elif file.startswith("libs/standard-tests"):
# TODO: update to include all packages that rely on standard-tests (all partner packages)
# note: won't run on external repo partners
# Note: won't run on external repo partners
dirs_to_run["lint"].add("libs/standard-tests")
dirs_to_run["test"].add("libs/standard-tests")
dirs_to_run["lint"].add("libs/cli")
@@ -285,7 +299,7 @@ if __name__ == "__main__":
elif file.startswith("libs/cli"):
dirs_to_run["lint"].add("libs/cli")
dirs_to_run["test"].add("libs/cli")
elif file.startswith("libs/partners"):
partner_dir = file.split("/")[2]
if os.path.isdir(f"libs/partners/{partner_dir}") and [
@@ -303,7 +317,10 @@ if __name__ == "__main__":
f"Unknown lib: {file}. check_diff.py likely needs "
"an update for this new library!"
)
elif file.startswith("docs/") or file in ["pyproject.toml", "uv.lock"]: # docs or root uv files
elif file.startswith("docs/") or file in [
"pyproject.toml",
"uv.lock",
]: # docs or root uv files
docs_edited = True
dirs_to_run["lint"].add(".")

View File

@@ -1,19 +1,21 @@
"""Check that no dependencies allow prereleases unless we're releasing a prerelease."""
import sys
import tomllib
if __name__ == "__main__":
# Get the TOML file path from the command line argument
toml_file = sys.argv[1]
# read toml file
with open(toml_file, "rb") as file:
toml_data = tomllib.load(file)
# see if we're releasing an rc
# See if we're releasing an rc or dev version
version = toml_data["project"]["version"]
releasing_rc = "rc" in version or "dev" in version
# if not, iterate through dependencies and make sure none allow prereleases
# If not, iterate through dependencies and make sure none allow prereleases
if not releasing_rc:
dependencies = toml_data["project"]["dependencies"]
for dep_version in dependencies:

View File

@@ -1,24 +1,22 @@
from collections import defaultdict
"""Get minimum versions of dependencies from a pyproject.toml file."""
import sys
from collections import defaultdict
from typing import Optional
if sys.version_info >= (3, 11):
import tomllib
else:
# for python 3.10 and below, which doesnt have stdlib tomllib
# For Python 3.10 and below, which doesnt have stdlib tomllib
import tomli as tomllib
from packaging.requirements import Requirement
from packaging.specifiers import SpecifierSet
from packaging.version import Version
import requests
from packaging.version import parse
import re
from typing import List
import re
import requests
from packaging.requirements import Requirement
from packaging.specifiers import SpecifierSet
from packaging.version import Version, parse
MIN_VERSION_LIBS = [
"langchain-core",
@@ -38,14 +36,13 @@ SKIP_IF_PULL_REQUEST = [
def get_pypi_versions(package_name: str) -> List[str]:
"""
Fetch all available versions for a package from PyPI.
"""Fetch all available versions for a package from PyPI.
Args:
package_name (str): Name of the package
package_name: Name of the package
Returns:
List[str]: List of all available versions
List of all available versions
Raises:
requests.exceptions.RequestException: If PyPI API request fails
@@ -58,25 +55,26 @@ def get_pypi_versions(package_name: str) -> List[str]:
def get_minimum_version(package_name: str, spec_string: str) -> Optional[str]:
"""
Find the minimum published version that satisfies the given constraints.
"""Find the minimum published version that satisfies the given constraints.
Args:
package_name (str): Name of the package
spec_string (str): Version specification string (e.g., ">=0.2.43,<0.4.0,!=0.3.0")
package_name: Name of the package
spec_string: Version specification string (e.g., ">=0.2.43,<0.4.0,!=0.3.0")
Returns:
Optional[str]: Minimum compatible version or None if no compatible version found
Minimum compatible version or None if no compatible version found
"""
# rewrite occurrences of ^0.0.z to 0.0.z (can be anywhere in constraint string)
# Rewrite occurrences of ^0.0.z to 0.0.z (can be anywhere in constraint string)
spec_string = re.sub(r"\^0\.0\.(\d+)", r"0.0.\1", spec_string)
# rewrite occurrences of ^0.y.z to >=0.y.z,<0.y+1 (can be anywhere in constraint string)
# Rewrite occurrences of ^0.y.z to >=0.y.z,<0.y+1 (can be anywhere in constraint string)
for y in range(1, 10):
spec_string = re.sub(rf"\^0\.{y}\.(\d+)", rf">=0.{y}.\1,<0.{y+1}", spec_string)
# rewrite occurrences of ^x.y.z to >=x.y.z,<x+1.0.0 (can be anywhere in constraint string)
spec_string = re.sub(
rf"\^0\.{y}\.(\d+)", rf">=0.{y}.\1,<0.{y + 1}", spec_string
)
# Rewrite occurrences of ^x.y.z to >=x.y.z,<x+1.0.0 (can be anywhere in constraint string)
for x in range(1, 10):
spec_string = re.sub(
rf"\^{x}\.(\d+)\.(\d+)", rf">={x}.\1.\2,<{x+1}", spec_string
rf"\^{x}\.(\d+)\.(\d+)", rf">={x}.\1.\2,<{x + 1}", spec_string
)
spec_set = SpecifierSet(spec_string)
@@ -156,25 +154,28 @@ def get_min_version_from_toml(
def check_python_version(version_string, constraint_string):
"""
Check if the given Python version matches the given constraints.
"""Check if the given Python version matches the given constraints.
:param version_string: A string representing the Python version (e.g. "3.8.5").
:param constraint_string: A string representing the package's Python version constraints (e.g. ">=3.6, <4.0").
:return: True if the version matches the constraints, False otherwise.
Args:
version_string: A string representing the Python version (e.g. "3.8.5").
constraint_string: A string representing the package's Python version
constraints (e.g. ">=3.6, <4.0").
Returns:
True if the version matches the constraints
"""
# rewrite occurrences of ^0.0.z to 0.0.z (can be anywhere in constraint string)
# Rewrite occurrences of ^0.0.z to 0.0.z (can be anywhere in constraint string)
constraint_string = re.sub(r"\^0\.0\.(\d+)", r"0.0.\1", constraint_string)
# rewrite occurrences of ^0.y.z to >=0.y.z,<0.y+1.0 (can be anywhere in constraint string)
# Rewrite occurrences of ^0.y.z to >=0.y.z,<0.y+1.0 (can be anywhere in constraint string)
for y in range(1, 10):
constraint_string = re.sub(
rf"\^0\.{y}\.(\d+)", rf">=0.{y}.\1,<0.{y+1}.0", constraint_string
rf"\^0\.{y}\.(\d+)", rf">=0.{y}.\1,<0.{y + 1}.0", constraint_string
)
# rewrite occurrences of ^x.y.z to >=x.y.z,<x+1.0.0 (can be anywhere in constraint string)
# Rewrite occurrences of ^x.y.z to >=x.y.z,<x+1.0.0 (can be anywhere in constraint string)
for x in range(1, 10):
constraint_string = re.sub(
rf"\^{x}\.0\.(\d+)", rf">={x}.0.\1,<{x+1}.0.0", constraint_string
rf"\^{x}\.0\.(\d+)", rf">={x}.0.\1,<{x + 1}.0.0", constraint_string
)
try:

View File

@@ -1,15 +1,19 @@
#!/usr/bin/env python
"""Script to sync libraries from various repositories into the main langchain repository."""
"""Sync libraries from various repositories into this monorepo.
Moves cloned partner packages into libs/partners structure.
"""
import os
import shutil
import yaml
from pathlib import Path
from typing import Dict, Any
from typing import Any, Dict
import yaml
def load_packages_yaml() -> Dict[str, Any]:
"""Load and parse the packages.yml file."""
"""Load and parse packages.yml."""
with open("langchain/libs/packages.yml", "r") as f:
return yaml.safe_load(f)
@@ -28,7 +32,6 @@ def get_target_dir(package_name: str) -> Path:
def clean_target_directories(packages: list) -> None:
"""Remove old directories that will be replaced."""
for package in packages:
target_dir = get_target_dir(package["name"])
if target_dir.exists():
print(f"Removing {target_dir}")
@@ -38,7 +41,6 @@ def clean_target_directories(packages: list) -> None:
def move_libraries(packages: list) -> None:
"""Move libraries from their source locations to the target directories."""
for package in packages:
repo_name = package["repo"].split("/")[1]
source_path = package["path"]
target_dir = get_target_dir(package["name"])
@@ -62,31 +64,46 @@ def move_libraries(packages: list) -> None:
def main():
"""Main function to orchestrate the library sync process."""
"""Orchestrate the library sync process."""
try:
# Load packages configuration
package_yaml = load_packages_yaml()
# Clean target directories
clean_target_directories([
p
for p in package_yaml["packages"]
if (p["repo"].startswith("langchain-ai/") or p.get("include_in_api_ref"))
and p["repo"] != "langchain-ai/langchain"
and p["name"] != "langchain-ai21" # Skip AI21 due to dependency conflicts
])
# Clean/empty target directories in preparation for moving new ones
#
# Only for packages in the langchain-ai org or explicitly included via
# include_in_api_ref, excluding 'langchain' itself and 'langchain-ai21'
clean_target_directories(
[
p
for p in package_yaml["packages"]
if (
p["repo"].startswith("langchain-ai/") or p.get("include_in_api_ref")
)
and p["repo"] != "langchain-ai/langchain"
and p["name"]
!= "langchain-ai21" # Skip AI21 due to dependency conflicts
]
)
# Move libraries to their new locations
move_libraries([
p
for p in package_yaml["packages"]
if not p.get("disabled", False)
and (p["repo"].startswith("langchain-ai/") or p.get("include_in_api_ref"))
and p["repo"] != "langchain-ai/langchain"
and p["name"] != "langchain-ai21" # Skip AI21 due to dependency conflicts
])
# Move cloned libraries to their new locations, only for packages in the
# langchain-ai org or explicitly included via include_in_api_ref,
# excluding 'langchain' itself and 'langchain-ai21'
move_libraries(
[
p
for p in package_yaml["packages"]
if not p.get("disabled", False)
and (
p["repo"].startswith("langchain-ai/") or p.get("include_in_api_ref")
)
and p["repo"] != "langchain-ai/langchain"
and p["name"]
!= "langchain-ai21" # Skip AI21 due to dependency conflicts
]
)
# Delete ones without a pyproject.toml
# Delete partner packages without a pyproject.toml
for partner in Path("langchain/libs/partners").iterdir():
if partner.is_dir() and not (partner / "pyproject.toml").exists():
print(f"Removing {partner} as it does not have a pyproject.toml")

View File

@@ -81,56 +81,93 @@ import time
__version__ = "2022.12+dev"
# Update symlinks only if the platform supports not following them
UPDATE_SYMLINKS = bool(os.utime in getattr(os, 'supports_follow_symlinks', []))
UPDATE_SYMLINKS = bool(os.utime in getattr(os, "supports_follow_symlinks", []))
# Call os.path.normpath() only if not in a POSIX platform (Windows)
NORMALIZE_PATHS = (os.path.sep != '/')
NORMALIZE_PATHS = os.path.sep != "/"
# How many files to process in each batch when re-trying merge commits
STEPMISSING = 100
# (Extra) keywords for the os.utime() call performed by touch()
UTIME_KWS = {} if not UPDATE_SYMLINKS else {'follow_symlinks': False}
UTIME_KWS = {} if not UPDATE_SYMLINKS else {"follow_symlinks": False}
# Command-line interface ######################################################
def parse_args():
parser = argparse.ArgumentParser(
description=__doc__.split('\n---')[0])
parser = argparse.ArgumentParser(description=__doc__.split("\n---")[0])
group = parser.add_mutually_exclusive_group()
group.add_argument('--quiet', '-q', dest='loglevel',
action="store_const", const=logging.WARNING, default=logging.INFO,
help="Suppress informative messages and summary statistics.")
group.add_argument('--verbose', '-v', action="count", help="""
group.add_argument(
"--quiet",
"-q",
dest="loglevel",
action="store_const",
const=logging.WARNING,
default=logging.INFO,
help="Suppress informative messages and summary statistics.",
)
group.add_argument(
"--verbose",
"-v",
action="count",
help="""
Print additional information for each processed file.
Specify twice to further increase verbosity.
""")
""",
)
parser.add_argument('--cwd', '-C', metavar="DIRECTORY", help="""
parser.add_argument(
"--cwd",
"-C",
metavar="DIRECTORY",
help="""
Run as if %(prog)s was started in directory %(metavar)s.
This affects how --work-tree, --git-dir and PATHSPEC arguments are handled.
See 'man 1 git' or 'git --help' for more information.
""")
""",
)
parser.add_argument('--git-dir', dest='gitdir', metavar="GITDIR", help="""
parser.add_argument(
"--git-dir",
dest="gitdir",
metavar="GITDIR",
help="""
Path to the git repository, by default auto-discovered by searching
the current directory and its parents for a .git/ subdirectory.
""")
""",
)
parser.add_argument('--work-tree', dest='workdir', metavar="WORKTREE", help="""
parser.add_argument(
"--work-tree",
dest="workdir",
metavar="WORKTREE",
help="""
Path to the work tree root, by default the parent of GITDIR if it's
automatically discovered, or the current directory if GITDIR is set.
""")
""",
)
parser.add_argument('--force', '-f', default=False, action="store_true", help="""
parser.add_argument(
"--force",
"-f",
default=False,
action="store_true",
help="""
Force updating files with uncommitted modifications.
Untracked files and uncommitted deletions, renames and additions are
always ignored.
""")
""",
)
parser.add_argument('--merge', '-m', default=False, action="store_true", help="""
parser.add_argument(
"--merge",
"-m",
default=False,
action="store_true",
help="""
Include merge commits.
Leads to more recent times and more files per commit, thus with the same
time, which may or may not be what you want.
@@ -138,71 +175,130 @@ def parse_args():
are found sooner, which can improve performance, sometimes substantially.
But as merge commits are usually huge, processing them may also take longer.
By default, merge commits are only used for files missing from regular commits.
""")
""",
)
parser.add_argument('--first-parent', default=False, action="store_true", help="""
parser.add_argument(
"--first-parent",
default=False,
action="store_true",
help="""
Consider only the first parent, the "main branch", when evaluating merge commits.
Only effective when merge commits are processed, either when --merge is
used or when finding missing files after the first regular log search.
See --skip-missing.
""")
""",
)
parser.add_argument('--skip-missing', '-s', dest="missing", default=True,
action="store_false", help="""
parser.add_argument(
"--skip-missing",
"-s",
dest="missing",
default=True,
action="store_false",
help="""
Do not try to find missing files.
If merge commits were not evaluated with --merge and some files were
not found in regular commits, by default %(prog)s searches for these
files again in the merge commits.
This option disables this retry, so files found only in merge commits
will not have their timestamp updated.
""")
""",
)
parser.add_argument('--no-directories', '-D', dest='dirs', default=True,
action="store_false", help="""
parser.add_argument(
"--no-directories",
"-D",
dest="dirs",
default=True,
action="store_false",
help="""
Do not update directory timestamps.
By default, use the time of its most recently created, renamed or deleted file.
Note that just modifying a file will NOT update its directory time.
""")
""",
)
parser.add_argument('--test', '-t', default=False, action="store_true",
help="Test run: do not actually update any file timestamp.")
parser.add_argument(
"--test",
"-t",
default=False,
action="store_true",
help="Test run: do not actually update any file timestamp.",
)
parser.add_argument('--commit-time', '-c', dest='commit_time', default=False,
action='store_true', help="Use commit time instead of author time.")
parser.add_argument(
"--commit-time",
"-c",
dest="commit_time",
default=False,
action="store_true",
help="Use commit time instead of author time.",
)
parser.add_argument('--oldest-time', '-o', dest='reverse_order', default=False,
action='store_true', help="""
parser.add_argument(
"--oldest-time",
"-o",
dest="reverse_order",
default=False,
action="store_true",
help="""
Update times based on the oldest, instead of the most recent commit of a file.
This reverses the order in which the git log is processed to emulate a
file "creation" date. Note this will be inaccurate for files deleted and
re-created at later dates.
""")
""",
)
parser.add_argument('--skip-older-than', metavar='SECONDS', type=int, help="""
parser.add_argument(
"--skip-older-than",
metavar="SECONDS",
type=int,
help="""
Ignore files that are currently older than %(metavar)s.
Useful in workflows that assume such files already have a correct timestamp,
as it may improve performance by processing fewer files.
""")
""",
)
parser.add_argument('--skip-older-than-commit', '-N', default=False,
action='store_true', help="""
parser.add_argument(
"--skip-older-than-commit",
"-N",
default=False,
action="store_true",
help="""
Ignore files older than the timestamp it would be updated to.
Such files may be considered "original", likely in the author's repository.
""")
""",
)
parser.add_argument('--unique-times', default=False, action="store_true", help="""
parser.add_argument(
"--unique-times",
default=False,
action="store_true",
help="""
Set the microseconds to a unique value per commit.
Allows telling apart changes that would otherwise have identical timestamps,
as git's time accuracy is in seconds.
""")
""",
)
parser.add_argument('pathspec', nargs='*', metavar='PATHSPEC', help="""
parser.add_argument(
"pathspec",
nargs="*",
metavar="PATHSPEC",
help="""
Only modify paths matching %(metavar)s, relative to current directory.
By default, update all but untracked files and submodules.
""")
""",
)
parser.add_argument('--version', '-V', action='version',
version='%(prog)s version {version}'.format(version=get_version()))
parser.add_argument(
"--version",
"-V",
action="version",
version="%(prog)s version {version}".format(version=get_version()),
)
args_ = parser.parse_args()
if args_.verbose:
@@ -212,17 +308,18 @@ def parse_args():
def get_version(version=__version__):
if not version.endswith('+dev'):
if not version.endswith("+dev"):
return version
try:
cwd = os.path.dirname(os.path.realpath(__file__))
return Git(cwd=cwd, errors=False).describe().lstrip('v')
return Git(cwd=cwd, errors=False).describe().lstrip("v")
except Git.Error:
return '-'.join((version, "unknown"))
return "-".join((version, "unknown"))
# Helper functions ############################################################
def setup_logging():
"""Add TRACE logging level and corresponding method, return the root logger"""
logging.TRACE = TRACE = logging.DEBUG // 2
@@ -255,11 +352,13 @@ def normalize(path):
if path and path[0] == '"':
# Python 2: path = path[1:-1].decode("string-escape")
# Python 3: https://stackoverflow.com/a/46650050/624066
path = (path[1:-1] # Remove enclosing double quotes
.encode('latin1') # Convert to bytes, required by 'unicode-escape'
.decode('unicode-escape') # Perform the actual octal-escaping decode
.encode('latin1') # 1:1 mapping to bytes, UTF-8 encoded
.decode('utf8', 'surrogateescape')) # Decode from UTF-8
path = (
path[1:-1] # Remove enclosing double quotes
.encode("latin1") # Convert to bytes, required by 'unicode-escape'
.decode("unicode-escape") # Perform the actual octal-escaping decode
.encode("latin1") # 1:1 mapping to bytes, UTF-8 encoded
.decode("utf8", "surrogateescape")
) # Decode from UTF-8
if NORMALIZE_PATHS:
# Make sure the slash matches the OS; for Windows we need a backslash
path = os.path.normpath(path)
@@ -282,12 +381,12 @@ def touch_ns(path, mtime_ns):
def isodate(secs: int):
# time.localtime() accepts floats, but discards fractional part
return time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(secs))
return time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(secs))
def isodate_ns(ns: int):
# for integers fromtimestamp() is equivalent and ~16% slower than isodate()
return datetime.datetime.fromtimestamp(ns / 1000000000).isoformat(sep=' ')
return datetime.datetime.fromtimestamp(ns / 1000000000).isoformat(sep=" ")
def get_mtime_ns(secs: int, idx: int):
@@ -305,35 +404,49 @@ def get_mtime_path(path):
# Git class and parse_log(), the heart of the script ##########################
class Git:
def __init__(self, workdir=None, gitdir=None, cwd=None, errors=True):
self.gitcmd = ['git']
self.gitcmd = ["git"]
self.errors = errors
self._proc = None
if workdir: self.gitcmd.extend(('--work-tree', workdir))
if gitdir: self.gitcmd.extend(('--git-dir', gitdir))
if cwd: self.gitcmd.extend(('-C', cwd))
if workdir:
self.gitcmd.extend(("--work-tree", workdir))
if gitdir:
self.gitcmd.extend(("--git-dir", gitdir))
if cwd:
self.gitcmd.extend(("-C", cwd))
self.workdir, self.gitdir = self._get_repo_dirs()
def ls_files(self, paths: list = None):
return (normalize(_) for _ in self._run('ls-files --full-name', paths))
return (normalize(_) for _ in self._run("ls-files --full-name", paths))
def ls_dirty(self, force=False):
return (normalize(_[3:].split(' -> ', 1)[-1])
for _ in self._run('status --porcelain')
if _[:2] != '??' and (not force or (_[0] in ('R', 'A')
or _[1] == 'D')))
return (
normalize(_[3:].split(" -> ", 1)[-1])
for _ in self._run("status --porcelain")
if _[:2] != "??" and (not force or (_[0] in ("R", "A") or _[1] == "D"))
)
def log(self, merge=False, first_parent=False, commit_time=False,
reverse_order=False, paths: list = None):
cmd = 'whatchanged --pretty={}'.format('%ct' if commit_time else '%at')
if merge: cmd += ' -m'
if first_parent: cmd += ' --first-parent'
if reverse_order: cmd += ' --reverse'
def log(
self,
merge=False,
first_parent=False,
commit_time=False,
reverse_order=False,
paths: list = None,
):
cmd = "whatchanged --pretty={}".format("%ct" if commit_time else "%at")
if merge:
cmd += " -m"
if first_parent:
cmd += " --first-parent"
if reverse_order:
cmd += " --reverse"
return self._run(cmd, paths)
def describe(self):
return self._run('describe --tags', check=True)[0]
return self._run("describe --tags", check=True)[0]
def terminate(self):
if self._proc is None:
@@ -345,18 +458,22 @@ class Git:
pass
def _get_repo_dirs(self):
return (os.path.normpath(_) for _ in
self._run('rev-parse --show-toplevel --absolute-git-dir', check=True))
return (
os.path.normpath(_)
for _ in self._run(
"rev-parse --show-toplevel --absolute-git-dir", check=True
)
)
def _run(self, cmdstr: str, paths: list = None, output=True, check=False):
cmdlist = self.gitcmd + shlex.split(cmdstr)
if paths:
cmdlist.append('--')
cmdlist.append("--")
cmdlist.extend(paths)
popen_args = dict(universal_newlines=True, encoding='utf8')
popen_args = dict(universal_newlines=True, encoding="utf8")
if not self.errors:
popen_args['stderr'] = subprocess.DEVNULL
log.trace("Executing: %s", ' '.join(cmdlist))
popen_args["stderr"] = subprocess.DEVNULL
log.trace("Executing: %s", " ".join(cmdlist))
if not output:
return subprocess.call(cmdlist, **popen_args)
if check:
@@ -379,30 +496,26 @@ def parse_log(filelist, dirlist, stats, git, merge=False, filterlist=None):
mtime = 0
datestr = isodate(0)
for line in git.log(
merge,
args.first_parent,
args.commit_time,
args.reverse_order,
filterlist
merge, args.first_parent, args.commit_time, args.reverse_order, filterlist
):
stats['loglines'] += 1
stats["loglines"] += 1
# Blank line between Date and list of files
if not line:
continue
# Date line
if line[0] != ':': # Faster than `not line.startswith(':')`
stats['commits'] += 1
if line[0] != ":": # Faster than `not line.startswith(':')`
stats["commits"] += 1
mtime = int(line)
if args.unique_times:
mtime = get_mtime_ns(mtime, stats['commits'])
mtime = get_mtime_ns(mtime, stats["commits"])
if args.debug:
datestr = isodate(mtime)
continue
# File line: three tokens if it describes a renaming, otherwise two
tokens = line.split('\t')
tokens = line.split("\t")
# Possible statuses:
# M: Modified (content changed)
@@ -411,7 +524,7 @@ def parse_log(filelist, dirlist, stats, git, merge=False, filterlist=None):
# T: Type changed: to/from regular file, symlinks, submodules
# R099: Renamed (moved), with % of unchanged content. 100 = pure rename
# Not possible in log: C=Copied, U=Unmerged, X=Unknown, B=pairing Broken
status = tokens[0].split(' ')[-1]
status = tokens[0].split(" ")[-1]
file = tokens[-1]
# Handles non-ASCII chars and OS path separator
@@ -419,56 +532,76 @@ def parse_log(filelist, dirlist, stats, git, merge=False, filterlist=None):
def do_file():
if args.skip_older_than_commit and get_mtime_path(file) <= mtime:
stats['skip'] += 1
stats["skip"] += 1
return
if args.debug:
log.debug("%d\t%d\t%d\t%s\t%s",
stats['loglines'], stats['commits'], stats['files'],
datestr, file)
log.debug(
"%d\t%d\t%d\t%s\t%s",
stats["loglines"],
stats["commits"],
stats["files"],
datestr,
file,
)
try:
touch(os.path.join(git.workdir, file), mtime)
stats['touches'] += 1
stats["touches"] += 1
except Exception as e:
log.error("ERROR: %s: %s", e, file)
stats['errors'] += 1
stats["errors"] += 1
def do_dir():
if args.debug:
log.debug("%d\t%d\t-\t%s\t%s",
stats['loglines'], stats['commits'],
datestr, "{}/".format(dirname or '.'))
log.debug(
"%d\t%d\t-\t%s\t%s",
stats["loglines"],
stats["commits"],
datestr,
"{}/".format(dirname or "."),
)
try:
touch(os.path.join(git.workdir, dirname), mtime)
stats['dirtouches'] += 1
stats["dirtouches"] += 1
except Exception as e:
log.error("ERROR: %s: %s", e, dirname)
stats['direrrors'] += 1
stats["direrrors"] += 1
if file in filelist:
stats['files'] -= 1
stats["files"] -= 1
filelist.remove(file)
do_file()
if args.dirs and status in ('A', 'D'):
if args.dirs and status in ("A", "D"):
dirname = os.path.dirname(file)
if dirname in dirlist:
dirlist.remove(dirname)
do_dir()
# All files done?
if not stats['files']:
if not stats["files"]:
git.terminate()
return
# Main Logic ##################################################################
def main():
start = time.time() # yes, Wall time. CPU time is not realistic for users.
stats = {_: 0 for _ in ('loglines', 'commits', 'touches', 'skip', 'errors',
'dirtouches', 'direrrors')}
stats = {
_: 0
for _ in (
"loglines",
"commits",
"touches",
"skip",
"errors",
"dirtouches",
"direrrors",
)
}
logging.basicConfig(level=args.loglevel, format='%(message)s')
logging.basicConfig(level=args.loglevel, format="%(message)s")
log.trace("Arguments: %s", args)
# First things first: Where and Who are we?
@@ -499,13 +632,16 @@ def main():
# Symlink (to file, to dir or broken - git handles the same way)
if not UPDATE_SYMLINKS and os.path.islink(fullpath):
log.warning("WARNING: Skipping symlink, no OS support for updates: %s",
path)
log.warning(
"WARNING: Skipping symlink, no OS support for updates: %s", path
)
continue
# skip files which are older than given threshold
if (args.skip_older_than
and start - get_mtime_path(fullpath) > args.skip_older_than):
if (
args.skip_older_than
and start - get_mtime_path(fullpath) > args.skip_older_than
):
continue
# Always add files relative to worktree root
@@ -519,15 +655,17 @@ def main():
else:
dirty = set(git.ls_dirty())
if dirty:
log.warning("WARNING: Modified files in the working directory were ignored."
"\nTo include such files, commit your changes or use --force.")
log.warning(
"WARNING: Modified files in the working directory were ignored."
"\nTo include such files, commit your changes or use --force."
)
filelist -= dirty
# Build dir list to be processed
dirlist = set(os.path.dirname(_) for _ in filelist) if args.dirs else set()
stats['totalfiles'] = stats['files'] = len(filelist)
log.info("{0:,} files to be processed in work dir".format(stats['totalfiles']))
stats["totalfiles"] = stats["files"] = len(filelist)
log.info("{0:,} files to be processed in work dir".format(stats["totalfiles"]))
if not filelist:
# Nothing to do. Exit silently and without errors, just like git does
@@ -544,10 +682,18 @@ def main():
if args.missing and not args.merge:
filterlist = list(filelist)
missing = len(filterlist)
log.info("{0:,} files not found in log, trying merge commits".format(missing))
log.info(
"{0:,} files not found in log, trying merge commits".format(missing)
)
for i in range(0, missing, STEPMISSING):
parse_log(filelist, dirlist, stats, git,
merge=True, filterlist=filterlist[i:i + STEPMISSING])
parse_log(
filelist,
dirlist,
stats,
git,
merge=True,
filterlist=filterlist[i : i + STEPMISSING],
)
# Still missing some?
for file in filelist:
@@ -556,29 +702,33 @@ def main():
# Final statistics
# Suggestion: use git-log --before=mtime to brag about skipped log entries
def log_info(msg, *a, width=13):
ifmt = '{:%d,}' % (width,) # not using 'n' for consistency with ffmt
ffmt = '{:%d,.2f}' % (width,)
ifmt = "{:%d,}" % (width,) # not using 'n' for consistency with ffmt
ffmt = "{:%d,.2f}" % (width,)
# %-formatting lacks a thousand separator, must pre-render with .format()
log.info(msg.replace('%d', ifmt).replace('%f', ffmt).format(*a))
log.info(msg.replace("%d", ifmt).replace("%f", ffmt).format(*a))
log_info(
"Statistics:\n"
"%f seconds\n"
"%d log lines processed\n"
"%d commits evaluated",
time.time() - start, stats['loglines'], stats['commits'])
"Statistics:\n%f seconds\n%d log lines processed\n%d commits evaluated",
time.time() - start,
stats["loglines"],
stats["commits"],
)
if args.dirs:
if stats['direrrors']: log_info("%d directory update errors", stats['direrrors'])
log_info("%d directories updated", stats['dirtouches'])
if stats["direrrors"]:
log_info("%d directory update errors", stats["direrrors"])
log_info("%d directories updated", stats["dirtouches"])
if stats['touches'] != stats['totalfiles']:
log_info("%d files", stats['totalfiles'])
if stats['skip']: log_info("%d files skipped", stats['skip'])
if stats['files']: log_info("%d files missing", stats['files'])
if stats['errors']: log_info("%d file update errors", stats['errors'])
if stats["touches"] != stats["totalfiles"]:
log_info("%d files", stats["totalfiles"])
if stats["skip"]:
log_info("%d files skipped", stats["skip"])
if stats["files"]:
log_info("%d files missing", stats["files"])
if stats["errors"]:
log_info("%d file update errors", stats["errors"])
log_info("%d files updated", stats['touches'])
log_info("%d files updated", stats["touches"])
if args.test:
log.info("TEST RUN - No files modified!")

View File

@@ -1,6 +0,0 @@
"NotIn": "not in",
- `/checkin`: Check-in
docs/docs/integrations/providers/trulens.mdx
self.assertIn(
from trulens_eval import Tru
tru = Tru()

View File

@@ -1,3 +1,11 @@
# Validates that a package's integration tests compile without syntax or import errors.
#
# (If an integration test fails to compile, it won't run.)
#
# Called as part of check_diffs.yml workflow
#
# Runs pytest with compile marker to check syntax/imports.
name: '🔗 Compile Integration Tests'
on:
@@ -27,12 +35,14 @@ jobs:
timeout-minutes: 20
name: 'Python ${{ inputs.python-version }}'
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
- name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ inputs.python-version }}
cache-suffix: compile-integration-tests-${{ inputs.working-directory }}
working-directory: ${{ inputs.working-directory }}
- name: '📦 Install Integration Dependencies'
shell: bash

View File

@@ -1,4 +1,12 @@
# Runs `make integration_tests` on the specified package.
#
# Manually triggered via workflow_dispatch for testing with real APIs.
#
# Installs integration test dependencies and executes full test suite.
name: '🚀 Integration Tests'
run-name: 'Test ${{ inputs.working-directory }} on Python ${{ inputs.python-version }}'
on:
workflow_dispatch:
@@ -11,6 +19,7 @@ on:
required: true
type: string
description: "Python version to use"
default: "3.11"
permissions:
contents: read
@@ -24,14 +33,16 @@ jobs:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
name: '🚀 Integration Tests (Python ${{ inputs.python-version }})'
name: 'Python ${{ inputs.python-version }}'
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
- name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ inputs.python-version }}
cache-suffix: integration-tests-${{ inputs.working-directory }}
working-directory: ${{ inputs.working-directory }}
- name: '📦 Install Integration Dependencies'
shell: bash
@@ -79,7 +90,7 @@ jobs:
run: |
make integration_tests
- name: Ensure the tests did not create any additional files
- name: 'Ensure testing did not create/modify files'
shell: bash
run: |
set -eu

View File

@@ -1,6 +1,11 @@
name: '🧹 Code Linting'
# Runs code quality checks using ruff, mypy, and other linting tools
# Checks both package code and test code for consistency
# Runs linting.
#
# Uses the package's Makefile to run the checks, specifically the
# `lint_package` and `lint_tests` targets.
#
# Called as part of check_diffs.yml workflow.
name: '🧹 Linting'
on:
workflow_call:
@@ -33,22 +38,16 @@ jobs:
timeout-minutes: 20
steps:
- name: '📋 Checkout Code'
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ inputs.python-version }}
cache-suffix: lint-${{ inputs.working-directory }}
working-directory: ${{ inputs.working-directory }}
- name: '📦 Install Lint & Typing Dependencies'
# Also installs dev/lint/test/typing dependencies, to ensure we have
# type hints for as many of our libraries as possible.
# This helps catch errors that require dependencies to be spotted, for example:
# https://github.com/langchain-ai/langchain/pull/10249/files#diff-935185cd488d015f026dcd9e19616ff62863e8cde8c0bee70318d3ccbca98341
#
# If you change this configuration, make sure to change the `cache-key`
# in the `poetry_setup` action above to stop using the old cache.
# It doesn't matter how you change it, any change will cause a cache-bust.
working-directory: ${{ inputs.working-directory }}
run: |
uv sync --group lint --group typing
@@ -58,20 +57,13 @@ jobs:
run: |
make lint_package
- name: '📦 Install Unit Test Dependencies'
# Also installs dev/lint/test/typing dependencies, to ensure we have
# type hints for as many of our libraries as possible.
# This helps catch errors that require dependencies to be spotted, for example:
# https://github.com/langchain-ai/langchain/pull/10249/files#diff-935185cd488d015f026dcd9e19616ff62863e8cde8c0bee70318d3ccbca98341
#
# If you change this configuration, make sure to change the `cache-key`
# in the `poetry_setup` action above to stop using the old cache.
# It doesn't matter how you change it, any change will cause a cache-bust.
- name: '📦 Install Test Dependencies (non-partners)'
# (For directories NOT starting with libs/partners/)
if: ${{ ! startsWith(inputs.working-directory, 'libs/partners/') }}
working-directory: ${{ inputs.working-directory }}
run: |
uv sync --inexact --group test
- name: '📦 Install Unit + Integration Test Dependencies'
- name: '📦 Install Test Dependencies'
if: ${{ startsWith(inputs.working-directory, 'libs/partners/') }}
working-directory: ${{ inputs.working-directory }}
run: |

View File

@@ -1,5 +1,11 @@
# Builds and publishes LangChain packages to PyPI.
#
# Manually triggered, though can be used as a reusable workflow (workflow_call).
#
# Handles version bumping, building, and publishing to PyPI with authentication.
name: '🚀 Package Release'
run-name: '🚀 Release ${{ inputs.working-directory }} by @${{ github.actor }}'
run-name: 'Release ${{ inputs.working-directory }} ${{ inputs.release-version }}'
on:
workflow_call:
inputs:
@@ -14,6 +20,11 @@ on:
type: string
description: "From which folder this pipeline executes"
default: 'libs/langchain'
release-version:
required: true
type: string
default: '0.1.0'
description: "New version of package being released"
dangerous-nonmaster-release:
required: false
type: boolean
@@ -38,7 +49,7 @@ jobs:
version: ${{ steps.check-version.outputs.version }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
- name: Set up Python + uv
uses: "./.github/actions/uv_setup"
@@ -47,8 +58,8 @@ jobs:
# We want to keep this build stage *separate* from the release stage,
# so that there's no sharing of permissions between them.
# The release stage has trusted publishing and GitHub repo contents write access,
# and we want to keep the scope of that access limited just to the release job.
# (Release stage has trusted publishing and GitHub repo contents write access,
#
# Otherwise, a malicious `build` step (e.g. via a compromised dependency)
# could get access to our GitHub or PyPI credentials.
#
@@ -87,7 +98,7 @@ jobs:
outputs:
release-body: ${{ steps.generate-release-body.outputs.release-body }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain
path: langchain
@@ -111,7 +122,7 @@ jobs:
# Look for the latest release of the same base version
REGEX="^$PKG_NAME==$BASE_VERSION\$"
PREV_TAG=$(git tag --sort=-creatordate | (grep -P "$REGEX" || true) | head -1)
# If no exact base version match, look for the latest release of any kind
if [ -z "$PREV_TAG" ]; then
REGEX="^$PKG_NAME==\\d+\\.\\d+\\.\\d+\$"
@@ -122,7 +133,7 @@ jobs:
PREV_TAG="$PKG_NAME==${VERSION%.*}.$(( ${VERSION##*.} - 1 ))"; [[ "${VERSION##*.}" -eq 0 ]] && PREV_TAG=""
# backup case if releasing e.g. 0.3.0, looks up last release
# note if last release (chronologically) was e.g. 0.1.47 it will get
# note if last release (chronologically) was e.g. 0.1.47 it will get
# that instead of the last 0.2 release
if [ -z "$PREV_TAG" ]; then
REGEX="^$PKG_NAME==\\d+\\.\\d+\\.\\d+\$"
@@ -178,13 +189,36 @@ jobs:
needs:
- build
- release-notes
uses:
./.github/workflows/_test_release.yml
permissions: write-all
with:
working-directory: ${{ inputs.working-directory }}
dangerous-nonmaster-release: ${{ inputs.dangerous-nonmaster-release }}
secrets: inherit
runs-on: ubuntu-latest
permissions:
# This permission is used for trusted publishing:
# https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/
#
# Trusted publishing has to also be configured on PyPI for each package:
# https://docs.pypi.org/trusted-publishers/adding-a-publisher/
id-token: write
steps:
- uses: actions/checkout@v5
- uses: actions/download-artifact@v5
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Publish to test PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: ${{ inputs.working-directory }}/dist/
verbose: true
print-hash: true
repository-url: https://test.pypi.org/legacy/
# We overwrite any existing distributions with the same name and version.
# This is *only for CI use* and is *extremely dangerous* otherwise!
# https://github.com/pypa/gh-action-pypi-publish#tolerating-release-package-file-duplicates
skip-existing: true
# Temp workaround since attestations are on by default as of gh-action-pypi-publish v1.11.0
attestations: false
pre-release-checks:
needs:
@@ -194,7 +228,7 @@ jobs:
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
# We explicitly *don't* set up caching here. This ensures our tests are
# maximally sensitive to catching breakage.
@@ -215,7 +249,7 @@ jobs:
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v4
- uses: actions/download-artifact@v5
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
@@ -260,16 +294,19 @@ jobs:
run: |
VIRTUAL_ENV=.venv uv pip install dist/*.whl
- name: Run unit tests
run: make tests
working-directory: ${{ inputs.working-directory }}
- name: Check for prerelease versions
# Block release if any dependencies allow prerelease versions
# (unless this is itself a prerelease version)
working-directory: ${{ inputs.working-directory }}
run: |
uv run python $GITHUB_WORKSPACE/.github/scripts/check_prerelease_dependencies.py pyproject.toml
- name: Run unit tests
run: make tests
working-directory: ${{ inputs.working-directory }}
- name: Get minimum versions
# Find the minimum published versions that satisfies the given constraints
working-directory: ${{ inputs.working-directory }}
id: min-version
run: |
@@ -284,7 +321,8 @@ jobs:
env:
MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
run: |
VIRTUAL_ENV=.venv uv pip install --force-reinstall $MIN_VERSIONS --editable .
VIRTUAL_ENV=.venv uv pip install --force-reinstall --editable .
VIRTUAL_ENV=.venv uv pip install --force-reinstall $MIN_VERSIONS
make tests
working-directory: ${{ inputs.working-directory }}
@@ -293,6 +331,7 @@ jobs:
working-directory: ${{ inputs.working-directory }}
- name: Run integration tests
# Uses the Makefile's `integration_tests` target for the specified package
if: ${{ startsWith(inputs.working-directory, 'libs/partners/') }}
env:
AI21_API_KEY: ${{ secrets.AI21_API_KEY }}
@@ -333,7 +372,10 @@ jobs:
working-directory: ${{ inputs.working-directory }}
# Test select published packages against new core
# Done when code changes are made to langchain-core
test-prior-published-packages-against-new-core:
# Installs the new core with old partners: Installs the new unreleased core
# alongside the previously published partner packages and runs integration tests
needs:
- build
- release-notes
@@ -357,10 +399,11 @@ jobs:
AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
# We implement this conditional as Github Actions does not have good support
# for conditionally needing steps. https://github.com/actions/runner/issues/491
# TODO: this seems to be resolved upstream, so we can probably remove this workaround
- name: Check if libs/core
run: |
if [ "${{ startsWith(inputs.working-directory, 'libs/core') }}" != "true" ]; then
@@ -374,7 +417,7 @@ jobs:
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v4
- uses: actions/download-artifact@v5
if: startsWith(inputs.working-directory, 'libs/core')
with:
name: dist
@@ -383,11 +426,12 @@ jobs:
- name: Test against ${{ matrix.partner }}
if: startsWith(inputs.working-directory, 'libs/core')
run: |
# Identify latest tag
# Identify latest tag, excluding pre-releases
LATEST_PACKAGE_TAG="$(
git ls-remote --tags origin "langchain-${{ matrix.partner }}*" \
| awk '{print $2}' \
| sed 's|refs/tags/||' \
| grep -E '==0\.3\.[0-9]+$' \
| sort -Vr \
| head -n 1
)"
@@ -414,6 +458,7 @@ jobs:
make integration_tests
publish:
# Publishes the package to PyPI
needs:
- build
- release-notes
@@ -434,14 +479,14 @@ jobs:
working-directory: ${{ inputs.working-directory }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
- name: Set up Python + uv
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v4
- uses: actions/download-artifact@v5
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
@@ -456,6 +501,7 @@ jobs:
attestations: false
mark-release:
# Marks the GitHub release with the new version tag
needs:
- build
- release-notes
@@ -465,7 +511,7 @@ jobs:
runs-on: ubuntu-latest
permissions:
# This permission is needed by `ncipollo/release-action` to
# create the GitHub release.
# create the GitHub release/tag
contents: write
defaults:
@@ -473,18 +519,18 @@ jobs:
working-directory: ${{ inputs.working-directory }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
- name: Set up Python + uv
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
- uses: actions/download-artifact@v4
- uses: actions/download-artifact@v5
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Create Tag
uses: ncipollo/release-action@v1
with:

View File

@@ -1,6 +1,7 @@
name: '🧪 Unit Testing'
# Runs unit tests with both current and minimum supported dependency versions
# to ensure compatibility across the supported range
# to ensure compatibility across the supported range.
name: '🧪 Unit Testing'
on:
workflow_call:
@@ -32,13 +33,16 @@ jobs:
name: 'Python ${{ inputs.python-version }}'
steps:
- name: '📋 Checkout Code'
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
uses: "./.github/actions/uv_setup"
id: setup-python
with:
python-version: ${{ inputs.python-version }}
cache-suffix: test-${{ inputs.working-directory }}
working-directory: ${{ inputs.working-directory }}
- name: '📦 Install Test Dependencies'
shell: bash
run: uv sync --group test --dev
@@ -79,4 +83,4 @@ jobs:
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

View File

@@ -1,3 +1,10 @@
# Validates that all import statements in `.ipynb` notebooks are correct and functional.
#
# Called as part of check_diffs.yml.
#
# Installs test dependencies and LangChain packages in editable mode and
# runs check_imports.py.
name: '📑 Documentation Import Testing'
on:
@@ -21,12 +28,14 @@ jobs:
name: '🔍 Check Doc Imports (Python ${{ inputs.python-version }})'
steps:
- name: '📋 Checkout Code'
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ inputs.python-version }}
cache-suffix: test-doc-imports-${{ inputs.working-directory }}
working-directory: ${{ inputs.working-directory }}
- name: '📦 Install Test Dependencies'
shell: bash

View File

@@ -1,3 +1,5 @@
# Facilitate unit testing against different Pydantic versions for a provided package.
name: '🐍 Pydantic Version Testing'
on:
@@ -34,12 +36,14 @@ jobs:
name: 'Pydantic ~=${{ inputs.pydantic-version }}'
steps:
- name: '📋 Checkout Code'
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: '🐍 Set up Python ${{ inputs.python-version }} + UV'
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ inputs.python-version }}
cache-suffix: test-pydantic-${{ inputs.working-directory }}
working-directory: ${{ inputs.working-directory }}
- name: '📦 Install Test Dependencies'
shell: bash
@@ -64,4 +68,4 @@ jobs:
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'
echo "$STATUS" | grep 'nothing to commit, working tree clean'

View File

@@ -1,106 +0,0 @@
name: '🧪 Test Release Package'
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
dangerous-nonmaster-release:
required: false
type: boolean
default: false
description: "Release from a non-master branch (danger!)"
env:
PYTHON_VERSION: "3.11"
UV_FROZEN: "true"
jobs:
build:
if: github.ref == 'refs/heads/master' || inputs.dangerous-nonmaster-release
runs-on: ubuntu-latest
outputs:
pkg-name: ${{ steps.check-version.outputs.pkg-name }}
version: ${{ steps.check-version.outputs.version }}
steps:
- uses: actions/checkout@v4
- name: '🐍 Set up Python + UV'
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
# We want to keep this build stage *separate* from the release stage,
# so that there's no sharing of permissions between them.
# The release stage has trusted publishing and GitHub repo contents write access,
# and we want to keep the scope of that access limited just to the release job.
# Otherwise, a malicious `build` step (e.g. via a compromised dependency)
# could get access to our GitHub or PyPI credentials.
#
# Per the trusted publishing GitHub Action:
# > It is strongly advised to separate jobs for building [...]
# > from the publish job.
# https://github.com/pypa/gh-action-pypi-publish#non-goals
- name: '📦 Build Project for Distribution'
run: uv build
working-directory: ${{ inputs.working-directory }}
- name: '⬆️ Upload Build Artifacts'
uses: actions/upload-artifact@v4
with:
name: test-dist
path: ${{ inputs.working-directory }}/dist/
- name: '🔍 Extract Version Information'
id: check-version
shell: python
working-directory: ${{ inputs.working-directory }}
run: |
import os
import tomllib
with open("pyproject.toml", "rb") as f:
data = tomllib.load(f)
pkg_name = data["project"]["name"]
version = data["project"]["version"]
with open(os.environ["GITHUB_OUTPUT"], "a") as f:
f.write(f"pkg-name={pkg_name}\n")
f.write(f"version={version}\n")
publish:
needs:
- build
runs-on: ubuntu-latest
permissions:
# This permission is used for trusted publishing:
# https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/
#
# Trusted publishing has to also be configured on PyPI for each package:
# https://docs.pypi.org/trusted-publishers/adding-a-publisher/
id-token: write
steps:
- uses: actions/checkout@v4
- uses: actions/download-artifact@v4
with:
name: test-dist
path: ${{ inputs.working-directory }}/dist/
- name: Publish to test PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: ${{ inputs.working-directory }}/dist/
verbose: true
print-hash: true
repository-url: https://test.pypi.org/legacy/
# We overwrite any existing distributions with the same name and version.
# This is *only for CI use* and is *extremely dangerous* otherwise!
# https://github.com/pypa/gh-action-pypi-publish#tolerating-release-package-file-duplicates
skip-existing: true
# Temp workaround since attestations are on by default as of gh-action-pypi-publish v1.11.0
attestations: false

View File

@@ -1,10 +1,19 @@
name: '📚 API Documentation Build'
# Runs daily or can be triggered manually for immediate updates
# Build the API reference documentation.
#
# Runs daily. Can also be triggered manually for immediate updates.
#
# Built HTML pushed to langchain-ai/langchain-api-docs-html.
#
# Looks for langchain-ai org repos in packages.yml and checks them out.
# Calls prep_api_docs_build.py.
name: '📚 API Docs'
run-name: 'Build & Deploy API Reference'
on:
workflow_dispatch:
schedule:
- cron: '0 13 * * *' # Daily at 1PM UTC
- cron: '0 13 * * *' # Runs daily at 1PM UTC (9AM EDT/6AM PDT)
env:
PYTHON_VERSION: "3.11"
@@ -16,10 +25,10 @@ jobs:
permissions:
contents: read
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
with:
path: langchain
- uses: actions/checkout@v4
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain-api-docs-html
path: langchain-api-docs-html
@@ -30,6 +39,8 @@ jobs:
uses: mikefarah/yq@master
with:
cmd: |
# Extract repos from packages.yml that are in the langchain-ai org
# (excluding 'langchain' itself)
yq '
.packages[]
| select(
@@ -51,7 +62,6 @@ jobs:
run: |
# Get unique repositories
REPOS=$(echo "$REPOS_UNSORTED" | sort -u)
# Checkout each unique repository
for repo in $REPOS; do
# Validate repository format (allow any org with proper format)
@@ -59,43 +69,49 @@ jobs:
echo "Error: Invalid repository format: $repo"
exit 1
fi
REPO_NAME=$(echo $repo | cut -d'/' -f2)
# Additional validation for repo name
if [[ ! "$REPO_NAME" =~ ^[a-zA-Z0-9_.-]+$ ]]; then
echo "Error: Invalid repository name: $REPO_NAME"
exit 1
fi
echo "Checking out $repo to $REPO_NAME"
git clone --depth 1 https://github.com/$repo.git $REPO_NAME
done
- name: '🐍 Setup Python ${{ env.PYTHON_VERSION }}'
uses: actions/setup-python@v5
uses: actions/setup-python@v6
id: setup-python
with:
python-version: ${{ env.PYTHON_VERSION }}
- name: '📦 Install Initial Python Dependencies'
- name: '📦 Install Initial Python Dependencies using uv'
working-directory: langchain
run: |
python -m pip install -U uv
python -m uv pip install --upgrade --no-cache-dir pip setuptools pyyaml
- name: '📦 Organize Library Directories'
# Places cloned partner packages into libs/partners structure
run: python langchain/.github/scripts/prep_api_docs_build.py
- name: '🧹 Remove Old HTML Files'
- name: '🧹 Clear Prior Build'
run:
# Remove artifacts from prior docs build
rm -rf langchain-api-docs-html/api_reference_build/html
- name: '📦 Install Documentation Dependencies'
- name: '📦 Install Documentation Dependencies using uv'
working-directory: langchain
run: |
python -m uv pip install $(ls ./libs/partners | xargs -I {} echo "./libs/partners/{}") --overrides ./docs/vercel_overrides.txt
# Install all partner packages in editable mode with overrides
python -m uv pip install $(ls ./libs/partners | xargs -I {} echo "./libs/partners/{}") --overrides ./docs/vercel_overrides.txt --prerelease=allow
# Install core langchain and other main packages
python -m uv pip install libs/core libs/langchain libs/text-splitters libs/community libs/experimental libs/standard-tests
# Install Sphinx and related packages for building docs
python -m uv pip install -r docs/api_reference/requirements.txt
- name: '🔧 Configure Git Settings'
@@ -107,14 +123,29 @@ jobs:
- name: '📚 Build API Documentation'
working-directory: langchain
run: |
# Generate the API reference RST files
python docs/api_reference/create_api_rst.py
# Build the HTML documentation using Sphinx
# -T: show full traceback on exception
# -E: don't use cached environment (force rebuild, ignore cached doctrees)
# -b html: build HTML docs (vs PDS, etc.)
# -d: path for the cached environment (parsed document trees / doctrees)
# - Separate from output dir for faster incremental builds
# -c: path to conf.py
# -j auto: parallel build using all available CPU cores
python -m sphinx -T -E -b html -d ../langchain-api-docs-html/_build/doctrees -c docs/api_reference docs/api_reference ../langchain-api-docs-html/api_reference_build/html -j auto
# Post-process the generated HTML
python docs/api_reference/scripts/custom_formatter.py ../langchain-api-docs-html/api_reference_build/html
# Default index page is blank so we copy in the actual home page.
cp ../langchain-api-docs-html/api_reference_build/html/{reference,index}.html
# Removes Sphinx's intermediate build artifacts after the build is complete.
rm -rf ../langchain-api-docs-html/_build/
# https://github.com/marketplace/actions/add-commit
# Commit and push changes to langchain-api-docs-html repo
- uses: EndBug/add-and-commit@v9
with:
cwd: langchain-api-docs-html

View File

@@ -1,9 +1,11 @@
# Runs broken link checker in /docs on a daily schedule.
name: '🔗 Check Broken Links'
on:
workflow_dispatch:
schedule:
- cron: '0 13 * * *'
- cron: '0 13 * * *' # Runs daily at 1PM UTC (9AM EDT/6AM PDT)
permissions:
contents: read
@@ -13,9 +15,9 @@ jobs:
if: github.repository_owner == 'langchain-ai' || github.event_name != 'schedule'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
- name: '🟢 Setup Node.js 18.x'
uses: actions/setup-node@v4
uses: actions/setup-node@v5
with:
node-version: 18.x
cache: "yarn"

View File

@@ -1,6 +1,8 @@
name: '🔍 Check `core` Version Equality'
# Ensures version numbers in pyproject.toml and version.py stay in sync
# Prevents releases with mismatched version numbers
# Ensures version numbers in pyproject.toml and version.py stay in sync.
#
# (Prevents releases with mismatched version numbers)
name: '🔍 Check Version Equality'
on:
pull_request:
@@ -16,19 +18,34 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
- name: '✅ Verify pyproject.toml & version.py Match'
run: |
PYPROJECT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' libs/core/pyproject.toml)
VERSION_PY_VERSION=$(grep -Po '(?<=^VERSION = ")[^"]*' libs/core/langchain_core/version.py)
# Check core versions
CORE_PYPROJECT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' libs/core/pyproject.toml)
CORE_VERSION_PY_VERSION=$(grep -Po '(?<=^VERSION = ")[^"]*' libs/core/langchain_core/version.py)
# Compare the two versions
if [ "$PYPROJECT_VERSION" != "$VERSION_PY_VERSION" ]; then
# Compare core versions
if [ "$CORE_PYPROJECT_VERSION" != "$CORE_VERSION_PY_VERSION" ]; then
echo "langchain-core versions in pyproject.toml and version.py do not match!"
echo "pyproject.toml version: $PYPROJECT_VERSION"
echo "version.py version: $VERSION_PY_VERSION"
echo "pyproject.toml version: $CORE_PYPROJECT_VERSION"
echo "version.py version: $CORE_VERSION_PY_VERSION"
exit 1
else
echo "Versions match: $PYPROJECT_VERSION"
echo "Core versions match: $CORE_PYPROJECT_VERSION"
fi
# Check langchain_v1 versions
LANGCHAIN_PYPROJECT_VERSION=$(grep -Po '(?<=^version = ")[^"]*' libs/langchain_v1/pyproject.toml)
LANGCHAIN_INIT_PY_VERSION=$(grep -Po '(?<=^__version__ = ")[^"]*' libs/langchain_v1/langchain/__init__.py)
# Compare langchain_v1 versions
if [ "$LANGCHAIN_PYPROJECT_VERSION" != "$LANGCHAIN_INIT_PY_VERSION" ]; then
echo "langchain_v1 versions in pyproject.toml and __init__.py do not match!"
echo "pyproject.toml version: $LANGCHAIN_PYPROJECT_VERSION"
echo "version.py version: $LANGCHAIN_INIT_PY_VERSION"
exit 1
else
echo "Langchain v1 versions match: $LANGCHAIN_PYPROJECT_VERSION"
fi

View File

@@ -1,3 +1,18 @@
# Primary CI workflow.
#
# Only runs against packages that have changed files.
#
# Runs:
# - Linting (_lint.yml)
# - Unit Tests (_test.yml)
# - Pydantic compatibility tests (_test_pydantic.yml)
# - Documentation import tests (_test_doc_imports.yml)
# - Integration test compilation checks (_compile_integration_test.yml)
# - Extended test suites that require additional dependencies
# - Codspeed benchmarks (if not labeled 'codspeed-ignore')
#
# Reports status to GitHub checks and PR status.
name: '🔧 CI'
on:
@@ -11,8 +26,8 @@ on:
# cancel the earlier run in favor of the next run.
#
# There's no point in testing an outdated version of the code. GitHub only allows
# a limited number of job runners to be active at the same time, so it's better to cancel
# pointless jobs early so that more useful jobs can run sooner.
# a limited number of job runners to be active at the same time, so it's better to
# cancel pointless jobs early so that more useful jobs can run sooner.
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
@@ -30,11 +45,12 @@ jobs:
build:
name: 'Detect Changes & Set Matrix'
runs-on: ubuntu-latest
if: ${{ !contains(github.event.pull_request.labels.*.name, 'ci-ignore') }}
steps:
- name: '📋 Checkout Code'
uses: actions/checkout@v4
uses: actions/checkout@v5
- name: '🐍 Setup Python 3.11'
uses: actions/setup-python@v5
uses: actions/setup-python@v6
with:
python-version: '3.11'
- name: '📂 Get Changed Files'
@@ -53,6 +69,7 @@ jobs:
dependencies: ${{ steps.set-matrix.outputs.dependencies }}
test-doc-imports: ${{ steps.set-matrix.outputs.test-doc-imports }}
test-pydantic: ${{ steps.set-matrix.outputs.test-pydantic }}
codspeed: ${{ steps.set-matrix.outputs.codspeed }}
# Run linting only on packages that have changed files
lint:
needs: [ build ]
@@ -109,6 +126,7 @@ jobs:
# Verify integration tests compile without actually running them (faster feedback)
compile-integration-tests:
name: 'Compile Integration Tests'
needs: [ build ]
if: ${{ needs.build.outputs.compile-integration-tests != '[]' }}
strategy:
@@ -137,12 +155,14 @@ jobs:
run:
working-directory: ${{ matrix.job-configs.working-directory }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
- name: '🐍 Set up Python ${{ matrix.job-configs.python-version }} + UV'
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ matrix.job-configs.python-version }}
cache-suffix: extended-tests-${{ matrix.job-configs.working-directory }}
working-directory: ${{ matrix.job-configs.working-directory }}
- name: '📦 Install Dependencies & Run Extended Tests'
shell: bash
@@ -165,10 +185,72 @@ jobs:
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'
# Run codspeed benchmarks only on packages that have changed files
codspeed:
name: '⚡ CodSpeed Benchmarks'
needs: [ build ]
if: ${{ needs.build.outputs.codspeed != '[]' && !contains(github.event.pull_request.labels.*.name, 'codspeed-ignore') }}
runs-on: ubuntu-latest
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.codspeed) }}
fail-fast: false
steps:
- uses: actions/checkout@v5
# We have to use 3.12 as 3.13 is not yet supported
- name: '📦 Install UV Package Manager'
uses: astral-sh/setup-uv@v6
with:
python-version: "3.12"
- uses: actions/setup-python@v6
with:
python-version: "3.12"
- name: '📦 Install Test Dependencies'
run: uv sync --group test
working-directory: ${{ matrix.job-configs.working-directory }}
- name: '⚡ Run Benchmarks: ${{ matrix.job-configs.working-directory }}'
uses: CodSpeedHQ/action@v4
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
ANTHROPIC_FILES_API_IMAGE_ID: ${{ secrets.ANTHROPIC_FILES_API_IMAGE_ID }}
ANTHROPIC_FILES_API_PDF_ID: ${{ secrets.ANTHROPIC_FILES_API_PDF_ID }}
AZURE_OPENAI_API_VERSION: ${{ secrets.AZURE_OPENAI_API_VERSION }}
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
DEEPSEEK_API_KEY: ${{ secrets.DEEPSEEK_API_KEY }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
HUGGINGFACEHUB_API_TOKEN: ${{ secrets.HUGGINGFACEHUB_API_TOKEN }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
NOMIC_API_KEY: ${{ secrets.NOMIC_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
PPLX_API_KEY: ${{ secrets.PPLX_API_KEY }}
XAI_API_KEY: ${{ secrets.XAI_API_KEY }}
with:
token: ${{ secrets.CODSPEED_TOKEN }}
run: |
cd ${{ matrix.job-configs.working-directory }}
if [ "${{ matrix.job-configs.working-directory }}" = "libs/core" ]; then
uv run --no-sync pytest ./tests/benchmarks --codspeed
else
uv run --no-sync pytest ./tests/ --codspeed
fi
mode: ${{ matrix.job-configs.working-directory == 'libs/core' && 'walltime' || 'instrumentation' }}
# Final status check - ensures all required jobs passed before allowing merge
ci_success:
name: '✅ CI Success'
needs: [build, lint, test, compile-integration-tests, extended-tests, test-doc-imports, test-pydantic]
needs: [build, lint, test, compile-integration-tests, extended-tests, test-doc-imports, test-pydantic, codspeed]
if: |
always()
runs-on: ubuntu-latest

View File

@@ -1,3 +1,6 @@
# For integrations, we run check_templates.py to ensure that new docs use the correct
# templates based on their type. See the script for more details.
name: '📑 Integration Docs Lint'
on:
@@ -22,8 +25,8 @@ jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
- uses: actions/checkout@v5
- uses: actions/setup-python@v6
with:
python-version: '3.10'
- id: files

View File

@@ -1,65 +0,0 @@
name: '⚡ CodSpeed'
on:
push:
branches:
- master
pull_request:
workflow_dispatch:
permissions:
contents: read
env:
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: foo
AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME: foo
DEEPSEEK_API_KEY: foo
FIREWORKS_API_KEY: foo
jobs:
codspeed:
name: 'Benchmark'
runs-on: ubuntu-latest
strategy:
matrix:
include:
- working-directory: libs/core
mode: walltime
- working-directory: libs/partners/openai
- working-directory: libs/partners/anthropic
- working-directory: libs/partners/deepseek
- working-directory: libs/partners/fireworks
- working-directory: libs/partners/xai
- working-directory: libs/partners/mistralai
- working-directory: libs/partners/groq
fail-fast: false
steps:
- uses: actions/checkout@v4
# We have to use 3.12 as 3.13 is not yet supported
- name: '📦 Install UV Package Manager'
uses: astral-sh/setup-uv@v6
with:
python-version: "3.12"
- uses: actions/setup-python@v5
with:
python-version: "3.12"
- name: '📦 Install Test Dependencies'
run: uv sync --group test
working-directory: ${{ matrix.working-directory }}
- name: '⚡ Run Benchmarks: ${{ matrix.working-directory }}'
uses: CodSpeedHQ/action@v3
with:
token: ${{ secrets.CODSPEED_TOKEN }}
run: |
cd ${{ matrix.working-directory }}
if [ "${{ matrix.working-directory }}" = "libs/core" ]; then
uv run --no-sync pytest ./tests/benchmarks --codspeed
else
uv run --no-sync pytest ./tests/ --codspeed
fi
mode: ${{ matrix.mode || 'instrumentation' }}

View File

@@ -1,10 +0,0 @@
import toml
pyproject_toml = toml.load("pyproject.toml")
# Extract the ignore words list (adjust the key as per your TOML structure)
ignore_words_list = (
pyproject_toml.get("tool", {}).get("codespell", {}).get("ignore-words-list")
)
print(f"::set-output name=ignore_words_list::{ignore_words_list}")

View File

@@ -1,8 +1,11 @@
name: '👥 LangChain People'
# Updates the LangChain People data by fetching the latest info from the LangChain Git.
# TODO: broken/not used
name: '👥 LangChain People'
run-name: 'Update People Data'
on:
schedule:
- cron: "0 14 1 * *"
- cron: "0 14 1 * *" # Runs at 14:00 UTC on the 1st of every month (10AM EDT/7AM PDT)
push:
branches: [jacob/people]
workflow_dispatch:
@@ -18,7 +21,7 @@ jobs:
env:
GITHUB_CONTEXT: ${{ toJson(github) }}
run: echo "$GITHUB_CONTEXT"
- uses: actions/checkout@v4
- uses: actions/checkout@v5
# Ref: https://github.com/actions/runner/issues/2033
- name: '🔧 Fix Git Safe Directory in Container'
run: mkdir -p /home/runner/work/_temp/_github_home && printf "[safe]\n\tdirectory = /github/workspace" > /home/runner/work/_temp/_github_home/.gitconfig

28
.github/workflows/pr_labeler_file.yml vendored Normal file
View File

@@ -0,0 +1,28 @@
# Label PRs based on changed files.
#
# See `.github/pr-file-labeler.yml` to see rules for each label/directory.
name: "🏷️ Pull Request Labeler"
on:
# Safe since we're not checking out or running the PR's code
# Never check out the PR's head in a pull_request_target job
pull_request_target:
types: [opened, synchronize, reopened, edited]
jobs:
labeler:
name: 'label'
permissions:
contents: read
pull-requests: write
issues: write
runs-on: ubuntu-latest
steps:
- name: Label Pull Request
uses: actions/labeler@v6
with:
repo-token: "${{ secrets.GITHUB_TOKEN }}"
configuration-path: .github/pr-file-labeler.yml
sync-labels: false

28
.github/workflows/pr_labeler_title.yml vendored Normal file
View File

@@ -0,0 +1,28 @@
# Label PRs based on their titles.
#
# See `.github/pr-title-labeler.yml` to see rules for each label/title pattern.
name: "🏷️ PR Title Labeler"
on:
# Safe since we're not checking out or running the PR's code
# Never check out the PR's head in a pull_request_target job
pull_request_target:
types: [opened, synchronize, reopened, edited]
jobs:
pr-title-labeler:
name: 'label'
permissions:
contents: read
pull-requests: write
issues: write
runs-on: ubuntu-latest
steps:
- name: Label PR based on title
# Archived repo; latest commit (v0.1.0)
uses: grafana/pr-labeler-action@f19222d3ef883d2ca5f04420fdfe8148003763f0
with:
token: ${{ secrets.GITHUB_TOKEN }}
configuration-path: .github/pr-title-labeler.yml

View File

@@ -1,50 +1,43 @@
# -----------------------------------------------------------------------------
# PR Title Lint Workflow
# PR title linting.
#
# Purpose:
# Enforces Conventional Commits format for pull request titles to maintain a
# clear, consistent, and machine-readable change history across our repository.
# This helps with automated changelog generation and semantic versioning.
# FORMAT (Conventional Commits 1.0.0):
#
# Enforced Commit Message Format (Conventional Commits 1.0.0):
# <type>[optional scope]: <description>
# [optional body]
# [optional footer(s)]
#
# Examples:
# feat(core): add multitenant support
# fix(cli): resolve flag parsing error
# docs: update API usage examples
# docs(openai): update API usage examples
#
# Allowed Types:
# feat — a new feature (MINOR bump)
# fix — a bug fix (PATCH bump)
# docs — documentation only changes
# style — formatting, missing semi-colons, etc.; no code change
# refactor — code change that neither fixes a bug nor adds a feature
# perf — code change that improves performance
# test — adding missing tests or correcting existing tests
# build — changes that affect the build system or external dependencies
# ci — continuous integration/configuration changes
# chore — other changes that don't modify src or test files
# revert — reverts a previous commit
# release — prepare a new release
# * feat — a new feature (MINOR)
# * fix — a bug fix (PATCH)
# * docs — documentation only changes (either in /docs or code comments)
# * style — formatting, linting, etc.; no code change or typing refactors
# * refactor — code change that neither fixes a bug nor adds a feature
# * perf — code change that improves performance
# * test — adding tests or correcting existing
# * build — changes that affect the build system/external dependencies
# * ci — continuous integration/configuration changes
# * chore — other changes that don't modify source or test files
# * revert — reverts a previous commit
# * release — prepare a new release
#
# Allowed Scopes (optional):
# core, cli, langchain, standard-tests, docs, anthropic, chroma, deepseek,
# exa, fireworks, groq, huggingface, mistralai, nomic, ollama, openai,
# perplexity, prompty, qdrant, xai
# core, cli, langchain, langchain_v1, langchain_legacy, standard-tests,
# text-splitters, docs, anthropic, chroma, deepseek, exa, fireworks, groq,
# huggingface, mistralai, nomic, ollama, openai, perplexity, prompty, qdrant,
# xai, infra
#
# Rules & Tips for New Committers:
# 1. Subject (type) must start with a lowercase letter and, if possible, be
# followed by a scope wrapped in parenthesis `(scope)`
# 2. Breaking changes:
# Append "!" after type/scope (e.g., feat!: drop Node 12 support)
# Or include a footer "BREAKING CHANGE: <details>"
# 3. Example PR titles:
# feat(core): add multitenant support
# fix(cli): resolve flag parsing error
# docs: update API usage examples
# docs(openai): update API usage examples
# Rules:
# 1. The 'Type' must start with a lowercase letter.
# 2. Breaking changes: append "!" after type/scope (e.g., feat!: drop x support)
#
# Resources:
# • Conventional Commits spec: https://www.conventionalcommits.org/en/v1.0.0/
# -----------------------------------------------------------------------------
# Enforces Conventional Commits format for pull request titles to maintain a clear and
# machine-readable change history.
name: '🏷️ PR Title Lint'
@@ -56,13 +49,13 @@ on:
types: [opened, edited, synchronize]
jobs:
# Validates that PR title follows Conventional Commits specification
# Validates that PR title follows Conventional Commits 1.0.0 specification
lint-pr-title:
name: 'Validate PR Title Format'
name: 'validate format'
runs-on: ubuntu-latest
steps:
- name: '✅ Validate Conventional Commits Format'
uses: amannn/action-semantic-pull-request@v5
uses: amannn/action-semantic-pull-request@v6
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
@@ -84,6 +77,7 @@ jobs:
cli
langchain
langchain_v1
langchain_legacy
standard-tests
text-splitters
docs

View File

@@ -1,5 +1,7 @@
name: '📝 Run Documentation Notebooks'
# Integration tests for documentation notebooks.
name: '📓 Validate Documentation Notebooks'
run-name: 'Test notebooks in ${{ inputs.working-directory }}'
on:
workflow_dispatch:
inputs:
@@ -26,21 +28,23 @@ jobs:
if: github.repository == 'langchain-ai/langchain' || github.event_name != 'schedule'
name: '📑 Test Documentation Notebooks'
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
- name: '🐍 Set up Python + UV'
uses: "./.github/actions/uv_setup"
with:
python-version: ${{ github.event.inputs.python_version || '3.11' }}
cache-suffix: run-notebooks-${{ github.event.inputs.working-directory || 'all' }}
working-directory: ${{ github.event.inputs.working-directory || '**' }}
- name: '🔐 Authenticate to Google Cloud'
id: 'auth'
uses: google-github-actions/auth@v2
uses: google-github-actions/auth@v3
with:
credentials_json: '${{ secrets.GOOGLE_CREDENTIALS }}'
- name: '🔐 Configure AWS Credentials'
uses: aws-actions/configure-aws-credentials@v4
uses: aws-actions/configure-aws-credentials@v5
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

View File

@@ -1,7 +1,14 @@
# Routine integration tests against partner libraries with live API credentials.
#
# Uses `make integration_tests` for each library in the matrix.
#
# Runs daily. Can also be triggered manually for immediate updates.
name: '⏰ Scheduled Integration Tests'
run-name: "Run Integration Tests - ${{ inputs.working-directory-force || 'all libs' }} (Python ${{ inputs.python-version-force || '3.9, 3.11' }})"
on:
workflow_dispatch: # Allows maintainers to trigger the workflow manually in GitHub UI
workflow_dispatch:
inputs:
working-directory-force:
type: string
@@ -19,7 +26,7 @@ env:
POETRY_VERSION: "1.8.4"
UV_FROZEN: "true"
DEFAULT_LIBS: '["libs/partners/openai", "libs/partners/anthropic", "libs/partners/fireworks", "libs/partners/groq", "libs/partners/mistralai", "libs/partners/xai", "libs/partners/google-vertexai", "libs/partners/google-genai", "libs/partners/aws"]'
POETRY_LIBS: ("libs/partners/google-vertexai" "libs/partners/google-genai" "libs/partners/aws")
POETRY_LIBS: ("libs/partners/aws")
jobs:
# Generate dynamic test matrix based on input parameters or defaults
@@ -53,13 +60,13 @@ jobs:
echo $matrix
echo "matrix=$matrix" >> $GITHUB_OUTPUT
# Run integration tests against partner libraries with live API credentials
# Tests are run with both Poetry and UV depending on the library's setup
# Tests are run with Poetry or UV depending on the library's setup
build:
if: github.repository_owner == 'langchain-ai' || github.event_name != 'schedule'
name: '🐍 Python ${{ matrix.python-version }}: ${{ matrix.working-directory }}'
runs-on: ubuntu-latest
needs: [compute-matrix]
timeout-minutes: 20
timeout-minutes: 30
strategy:
fail-fast: false
matrix:
@@ -67,14 +74,14 @@ jobs:
working-directory: ${{ fromJSON(needs.compute-matrix.outputs.matrix).working-directory }}
steps:
- uses: actions/checkout@v4
- uses: actions/checkout@v5
with:
path: langchain
- uses: actions/checkout@v4
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain-google
path: langchain-google
- uses: actions/checkout@v4
- uses: actions/checkout@v5
with:
repository: langchain-ai/langchain-aws
path: langchain-aws
@@ -105,12 +112,12 @@ jobs:
- name: '🔐 Authenticate to Google Cloud'
id: 'auth'
uses: google-github-actions/auth@v2
uses: google-github-actions/auth@v3
with:
credentials_json: '${{ secrets.GOOGLE_CREDENTIALS }}'
- name: '🔐 Configure AWS Credentials'
uses: aws-actions/configure-aws-credentials@v4
uses: aws-actions/configure-aws-credentials@v5
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
@@ -160,8 +167,8 @@ jobs:
make integration_tests
- name: '🧹 Clean up External Libraries'
# Clean up external libraries to avoid affecting git status check
run: |
# Clean up external libraries to avoid affecting the following git status check
run: |
rm -rf \
langchain/libs/partners/google-genai \
langchain/libs/partners/google-vertexai \

9
.github/workflows/v1_changes.md vendored Normal file
View File

@@ -0,0 +1,9 @@
With the deprecation of v0 docs, the following files will need to be migrated/supported
in the new docs repo:
- run_notebooks.yml: New repo should run Integration tests on code snippets?
- people.yml: Need to fix and somehow display on the new docs site
- Subsequently, `.github/actions/people/`
- _test_doc_imports.yml
- check_new_docs.yml
- check-broken-links.yml

View File

@@ -11,4 +11,4 @@
"MD046": {
"style": "fenced"
}
}
}

View File

@@ -2,110 +2,104 @@ repos:
- repo: local
hooks:
- id: core
name: format core
name: format and lint core
language: system
entry: make -C libs/core format
entry: make -C libs/core format lint
files: ^libs/core/
pass_filenames: false
- id: langchain
name: format langchain
name: format and lint langchain
language: system
entry: make -C libs/langchain format
entry: make -C libs/langchain format lint
files: ^libs/langchain/
pass_filenames: false
- id: standard-tests
name: format standard-tests
name: format and lint standard-tests
language: system
entry: make -C libs/standard-tests format
entry: make -C libs/standard-tests format lint
files: ^libs/standard-tests/
pass_filenames: false
- id: text-splitters
name: format text-splitters
name: format and lint text-splitters
language: system
entry: make -C libs/text-splitters format
entry: make -C libs/text-splitters format lint
files: ^libs/text-splitters/
pass_filenames: false
- id: anthropic
name: format partners/anthropic
name: format and lint partners/anthropic
language: system
entry: make -C libs/partners/anthropic format
entry: make -C libs/partners/anthropic format lint
files: ^libs/partners/anthropic/
pass_filenames: false
- id: chroma
name: format partners/chroma
name: format and lint partners/chroma
language: system
entry: make -C libs/partners/chroma format
entry: make -C libs/partners/chroma format lint
files: ^libs/partners/chroma/
pass_filenames: false
- id: couchbase
name: format partners/couchbase
language: system
entry: make -C libs/partners/couchbase format
files: ^libs/partners/couchbase/
pass_filenames: false
- id: exa
name: format partners/exa
name: format and lint partners/exa
language: system
entry: make -C libs/partners/exa format
entry: make -C libs/partners/exa format lint
files: ^libs/partners/exa/
pass_filenames: false
- id: fireworks
name: format partners/fireworks
name: format and lint partners/fireworks
language: system
entry: make -C libs/partners/fireworks format
entry: make -C libs/partners/fireworks format lint
files: ^libs/partners/fireworks/
pass_filenames: false
- id: groq
name: format partners/groq
name: format and lint partners/groq
language: system
entry: make -C libs/partners/groq format
entry: make -C libs/partners/groq format lint
files: ^libs/partners/groq/
pass_filenames: false
- id: huggingface
name: format partners/huggingface
name: format and lint partners/huggingface
language: system
entry: make -C libs/partners/huggingface format
entry: make -C libs/partners/huggingface format lint
files: ^libs/partners/huggingface/
pass_filenames: false
- id: mistralai
name: format partners/mistralai
name: format and lint partners/mistralai
language: system
entry: make -C libs/partners/mistralai format
entry: make -C libs/partners/mistralai format lint
files: ^libs/partners/mistralai/
pass_filenames: false
- id: nomic
name: format partners/nomic
name: format and lint partners/nomic
language: system
entry: make -C libs/partners/nomic format
entry: make -C libs/partners/nomic format lint
files: ^libs/partners/nomic/
pass_filenames: false
- id: ollama
name: format partners/ollama
name: format and lint partners/ollama
language: system
entry: make -C libs/partners/ollama format
entry: make -C libs/partners/ollama format lint
files: ^libs/partners/ollama/
pass_filenames: false
- id: openai
name: format partners/openai
name: format and lint partners/openai
language: system
entry: make -C libs/partners/openai format
entry: make -C libs/partners/openai format lint
files: ^libs/partners/openai/
pass_filenames: false
- id: prompty
name: format partners/prompty
name: format and lint partners/prompty
language: system
entry: make -C libs/partners/prompty format
entry: make -C libs/partners/prompty format lint
files: ^libs/partners/prompty/
pass_filenames: false
- id: qdrant
name: format partners/qdrant
name: format and lint partners/qdrant
language: system
entry: make -C libs/partners/qdrant format
entry: make -C libs/partners/qdrant format lint
files: ^libs/partners/qdrant/
pass_filenames: false
- id: root
name: format docs, cookbook
name: format and lint docs, cookbook
language: system
entry: make format
entry: make format lint
files: ^(docs|cookbook)/
pass_filenames: false

View File

@@ -1,25 +0,0 @@
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
version: 2
# Set the version of Python and other tools you might need
build:
os: ubuntu-22.04
tools:
python: "3.11"
commands:
- mkdir -p $READTHEDOCS_OUTPUT
- cp -r api_reference_build/* $READTHEDOCS_OUTPUT
# Build documentation in the docs/ directory with Sphinx
sphinx:
configuration: docs/api_reference/conf.py
# If using Sphinx, optionally build your docs in additional formats such as PDF
formats:
- pdf
# Optionally declare the Python requirements required to build your docs
python:
install:
- requirements: docs/api_reference/requirements.txt

View File

@@ -21,7 +21,7 @@
"[python]": {
"editor.formatOnSave": true,
"editor.codeActionsOnSave": {
"source.organizeImports": "explicit",
"source.organizeImports.ruff": "explicit",
"source.fixAll": "explicit"
},
"editor.defaultFormatter": "charliermarsh.ruff"
@@ -77,4 +77,11 @@
"editor.tabSize": 2,
"editor.insertSpaces": true
},
"python.terminal.activateEnvironment": false,
"python.defaultInterpreterPath": "./.venv/bin/python",
"github.copilot.chat.commitMessageGeneration.instructions": [
{
"file": ".github/workflows/pr_lint.yml"
}
]
}

325
AGENTS.md Normal file
View File

@@ -0,0 +1,325 @@
# Global Development Guidelines for LangChain Projects
## Core Development Principles
### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL
**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**
**Bad - Breaking Change:**
```python
def get_user(id, verbose=False): # Changed from `user_id`
pass
```
**Good - Stable Interface:**
```python
def get_user(user_id: str, verbose: bool = False) -> User:
"""Retrieve user by ID with optional verbose output."""
pass
```
**Before making ANY changes to public APIs:**
- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
- Mark experimental features clearly with docstring warnings (using reStructuredText, like `.. warning::`)
🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?"
### 2. Code Quality Standards
**All Python code MUST include type hints and return types.**
**Bad:**
```python
def p(u, d):
return [x for x in u if x not in d]
```
**Good:**
```python
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
"""Filter out users that are not in the known users set.
Args:
users: List of user identifiers to filter.
known_users: Set of known/valid user identifiers.
Returns:
List of users that are not in the known_users set.
"""
return [user for user in users if user not in known_users]
```
**Style Requirements:**
- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
- Avoid unnecessary abstraction or premature optimization
- Follow existing patterns in the codebase you're modifying
### 3. Testing Requirements
**Every new feature or bugfix MUST be covered by unit tests.**
**Test Organization:**
- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests: `tests/integration_tests/` (network calls permitted)
- Use `pytest` as the testing framework
**Test Quality Checklist:**
- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)
Checklist questions:
- [ ] Does the test suite fail if your new logic is broken?
- [ ] Are all expected behaviors exercised (happy path, invalid input, etc)?
- [ ] Do tests use fixtures or mocks where needed?
```python
def test_filter_unknown_users():
"""Test filtering unknown users from a list."""
users = ["alice", "bob", "charlie"]
known_users = {"alice", "bob"}
result = filter_unknown_users(users, known_users)
assert result == ["charlie"]
assert len(result) == 1
```
### 4. Security and Risk Assessment
**Security Checklist:**
- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
- Remove unreachable/commented code before committing
- Race conditions or resource leaks (file handles, sockets, threads).
- Ensure proper resource cleanup (file handles, connections)
**Bad:**
```python
def load_config(path):
with open(path) as f:
return eval(f.read()) # ⚠️ Never eval config
```
**Good:**
```python
import json
def load_config(path: str) -> dict:
with open(path) as f:
return json.load(f)
```
### 5. Documentation Standards
**Use Google-style docstrings with Args section for all public functions.**
**Insufficient Documentation:**
```python
def send_email(to, msg):
"""Send an email to a recipient."""
```
**Complete Documentation:**
```python
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
"""
Send an email to a recipient with specified priority.
Args:
to: The email address of the recipient.
msg: The message body to send.
priority: Email priority level (``'low'``, ``'normal'``, ``'high'``).
Returns:
True if email was sent successfully, False otherwise.
Raises:
InvalidEmailError: If the email address format is invalid.
SMTPConnectionError: If unable to connect to email server.
"""
```
**Documentation Guidelines:**
- Types go in function signatures, NOT in docstrings
- Focus on "why" rather than "what" in descriptions
- Document all parameters, return values, and exceptions
- Keep descriptions concise but clear
- Use reStructuredText for docstrings to enable rich formatting
📌 *Tip:* Keep descriptions concise but clear. Only document return values if non-obvious.
### 6. Architectural Improvements
**When you encounter code that could be improved, suggest better designs:**
**Poor Design:**
```python
def process_data(data, db_conn, email_client, logger):
# Function doing too many things
validated = validate_data(data)
result = db_conn.save(validated)
email_client.send_notification(result)
logger.log(f"Processed {len(data)} items")
return result
```
**Better Design:**
```python
@dataclass
class ProcessingResult:
"""Result of data processing operation."""
items_processed: int
success: bool
errors: List[str] = field(default_factory=list)
class DataProcessor:
"""Handles data validation, storage, and notification."""
def __init__(self, db_conn: Database, email_client: EmailClient):
self.db = db_conn
self.email = email_client
def process(self, data: List[dict]) -> ProcessingResult:
"""Process and store data with notifications."""
validated = self._validate_data(data)
result = self.db.save(validated)
self._notify_completion(result)
return result
```
**Design Improvement Areas:**
If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would:
- Reduce code duplication through shared utilities
- Make unit testing easier
- Improve separation of concerns (single responsibility)
- Make unit testing easier through dependency injection
- Add clarity without adding complexity
- Prefer dataclasses for structured data
## Development Tools & Commands
### Package Management
```bash
# Add package
uv add package-name
# Sync project dependencies
uv sync
uv lock
```
### Testing
```bash
# Run unit tests (no network)
make test
# Don't run integration tests, as API keys must be set
# Run specific test file
uv run --group test pytest tests/unit_tests/test_specific.py
```
### Code Quality
```bash
# Lint code
make lint
# Format code
make format
# Type checking
uv run --group lint mypy .
```
### Dependency Management Patterns
**Local Development Dependencies:**
```toml
[tool.uv.sources]
langchain-core = { path = "../core", editable = true }
langchain-tests = { path = "../standard-tests", editable = true }
```
**For tools, use the `@tool` decorator from `langchain_core.tools`:**
```python
from langchain_core.tools import tool
@tool
def search_database(query: str) -> str:
"""Search the database for relevant information.
Args:
query: The search query string.
"""
# Implementation here
return results
```
## Commit Standards
**Use Conventional Commits format for PR titles:**
- `feat(core): add multi-tenant support`
- `fix(cli): resolve flag parsing error`
- `docs: update API usage examples`
- `docs(openai): update API usage examples`
## Framework-Specific Guidelines
- Follow the existing patterns in `langchain-core` for base abstractions
- Use `langchain_core.callbacks` for execution tracking
- Implement proper streaming support where applicable
- Avoid deprecated components like legacy `LLMChain`
### Partner Integrations
- Follow the established patterns in existing partner libraries
- Implement standard interfaces (`BaseChatModel`, `BaseEmbeddings`, etc.)
- Include comprehensive integration tests
- Document API key requirements and authentication
---
## Quick Reference Checklist
Before submitting code changes:
- [ ] **Breaking Changes**: Verified no public API changes
- [ ] **Type Hints**: All functions have complete type annotations
- [ ] **Tests**: New functionality is fully tested
- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.)
- [ ] **Documentation**: Google-style docstrings for public functions
- [ ] **Code Quality**: `make lint` and `make format` pass
- [ ] **Architecture**: Suggested improvements where applicable
- [ ] **Commit Message**: Follows Conventional Commits format

325
CLAUDE.md Normal file
View File

@@ -0,0 +1,325 @@
# Global Development Guidelines for LangChain Projects
## Core Development Principles
### 1. Maintain Stable Public Interfaces ⚠️ CRITICAL
**Always attempt to preserve function signatures, argument positions, and names for exported/public methods.**
**Bad - Breaking Change:**
```python
def get_user(id, verbose=False): # Changed from `user_id`
pass
```
**Good - Stable Interface:**
```python
def get_user(user_id: str, verbose: bool = False) -> User:
"""Retrieve user by ID with optional verbose output."""
pass
```
**Before making ANY changes to public APIs:**
- Check if the function/class is exported in `__init__.py`
- Look for existing usage patterns in tests and examples
- Use keyword-only arguments for new parameters: `*, new_param: str = "default"`
- Mark experimental features clearly with docstring warnings (using reStructuredText, like `.. warning::`)
🧠 *Ask yourself:* "Would this change break someone's code if they used it last week?"
### 2. Code Quality Standards
**All Python code MUST include type hints and return types.**
**Bad:**
```python
def p(u, d):
return [x for x in u if x not in d]
```
**Good:**
```python
def filter_unknown_users(users: list[str], known_users: set[str]) -> list[str]:
"""Filter out users that are not in the known users set.
Args:
users: List of user identifiers to filter.
known_users: Set of known/valid user identifiers.
Returns:
List of users that are not in the known_users set.
"""
return [user for user in users if user not in known_users]
```
**Style Requirements:**
- Use descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
- Attempt to break up complex functions (>20 lines) into smaller, focused functions where it makes sense
- Avoid unnecessary abstraction or premature optimization
- Follow existing patterns in the codebase you're modifying
### 3. Testing Requirements
**Every new feature or bugfix MUST be covered by unit tests.**
**Test Organization:**
- Unit tests: `tests/unit_tests/` (no network calls allowed)
- Integration tests: `tests/integration_tests/` (network calls permitted)
- Use `pytest` as the testing framework
**Test Quality Checklist:**
- [ ] Tests fail when your new logic is broken
- [ ] Happy path is covered
- [ ] Edge cases and error conditions are tested
- [ ] Use fixtures/mocks for external dependencies
- [ ] Tests are deterministic (no flaky tests)
Checklist questions:
- [ ] Does the test suite fail if your new logic is broken?
- [ ] Are all expected behaviors exercised (happy path, invalid input, etc)?
- [ ] Do tests use fixtures or mocks where needed?
```python
def test_filter_unknown_users():
"""Test filtering unknown users from a list."""
users = ["alice", "bob", "charlie"]
known_users = {"alice", "bob"}
result = filter_unknown_users(users, known_users)
assert result == ["charlie"]
assert len(result) == 1
```
### 4. Security and Risk Assessment
**Security Checklist:**
- No `eval()`, `exec()`, or `pickle` on user-controlled input
- Proper exception handling (no bare `except:`) and use a `msg` variable for error messages
- Remove unreachable/commented code before committing
- Race conditions or resource leaks (file handles, sockets, threads).
- Ensure proper resource cleanup (file handles, connections)
**Bad:**
```python
def load_config(path):
with open(path) as f:
return eval(f.read()) # ⚠️ Never eval config
```
**Good:**
```python
import json
def load_config(path: str) -> dict:
with open(path) as f:
return json.load(f)
```
### 5. Documentation Standards
**Use Google-style docstrings with Args section for all public functions.**
**Insufficient Documentation:**
```python
def send_email(to, msg):
"""Send an email to a recipient."""
```
**Complete Documentation:**
```python
def send_email(to: str, msg: str, *, priority: str = "normal") -> bool:
"""
Send an email to a recipient with specified priority.
Args:
to: The email address of the recipient.
msg: The message body to send.
priority: Email priority level (``'low'``, ``'normal'``, ``'high'``).
Returns:
True if email was sent successfully, False otherwise.
Raises:
InvalidEmailError: If the email address format is invalid.
SMTPConnectionError: If unable to connect to email server.
"""
```
**Documentation Guidelines:**
- Types go in function signatures, NOT in docstrings
- Focus on "why" rather than "what" in descriptions
- Document all parameters, return values, and exceptions
- Keep descriptions concise but clear
- Use reStructuredText for docstrings to enable rich formatting
📌 *Tip:* Keep descriptions concise but clear. Only document return values if non-obvious.
### 6. Architectural Improvements
**When you encounter code that could be improved, suggest better designs:**
**Poor Design:**
```python
def process_data(data, db_conn, email_client, logger):
# Function doing too many things
validated = validate_data(data)
result = db_conn.save(validated)
email_client.send_notification(result)
logger.log(f"Processed {len(data)} items")
return result
```
**Better Design:**
```python
@dataclass
class ProcessingResult:
"""Result of data processing operation."""
items_processed: int
success: bool
errors: List[str] = field(default_factory=list)
class DataProcessor:
"""Handles data validation, storage, and notification."""
def __init__(self, db_conn: Database, email_client: EmailClient):
self.db = db_conn
self.email = email_client
def process(self, data: List[dict]) -> ProcessingResult:
"""Process and store data with notifications."""
validated = self._validate_data(data)
result = self.db.save(validated)
self._notify_completion(result)
return result
```
**Design Improvement Areas:**
If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it and suggest improvements that would:
- Reduce code duplication through shared utilities
- Make unit testing easier
- Improve separation of concerns (single responsibility)
- Make unit testing easier through dependency injection
- Add clarity without adding complexity
- Prefer dataclasses for structured data
## Development Tools & Commands
### Package Management
```bash
# Add package
uv add package-name
# Sync project dependencies
uv sync
uv lock
```
### Testing
```bash
# Run unit tests (no network)
make test
# Don't run integration tests, as API keys must be set
# Run specific test file
uv run --group test pytest tests/unit_tests/test_specific.py
```
### Code Quality
```bash
# Lint code
make lint
# Format code
make format
# Type checking
uv run --group lint mypy .
```
### Dependency Management Patterns
**Local Development Dependencies:**
```toml
[tool.uv.sources]
langchain-core = { path = "../core", editable = true }
langchain-tests = { path = "../standard-tests", editable = true }
```
**For tools, use the `@tool` decorator from `langchain_core.tools`:**
```python
from langchain_core.tools import tool
@tool
def search_database(query: str) -> str:
"""Search the database for relevant information.
Args:
query: The search query string.
"""
# Implementation here
return results
```
## Commit Standards
**Use Conventional Commits format for PR titles:**
- `feat(core): add multi-tenant support`
- `fix(cli): resolve flag parsing error`
- `docs: update API usage examples`
- `docs(openai): update API usage examples`
## Framework-Specific Guidelines
- Follow the existing patterns in `langchain-core` for base abstractions
- Use `langchain_core.callbacks` for execution tracking
- Implement proper streaming support where applicable
- Avoid deprecated components like legacy `LLMChain`
### Partner Integrations
- Follow the established patterns in existing partner libraries
- Implement standard interfaces (`BaseChatModel`, `BaseEmbeddings`, etc.)
- Include comprehensive integration tests
- Document API key requirements and authentication
---
## Quick Reference Checklist
Before submitting code changes:
- [ ] **Breaking Changes**: Verified no public API changes
- [ ] **Type Hints**: All functions have complete type annotations
- [ ] **Tests**: New functionality is fully tested
- [ ] **Security**: No dangerous patterns (eval, silent failures, etc.)
- [ ] **Documentation**: Google-style docstrings for public functions
- [ ] **Code Quality**: `make lint` and `make format` pass
- [ ] **Architecture**: Suggested improvements where applicable
- [ ] **Commit Message**: Follows Conventional Commits format

View File

@@ -1,4 +1,4 @@
.PHONY: all clean help docs_build docs_clean docs_linkcheck api_docs_build api_docs_clean api_docs_linkcheck spell_check spell_fix lint lint_package lint_tests format format_diff
.PHONY: all clean help docs_build docs_clean docs_linkcheck api_docs_build api_docs_clean api_docs_linkcheck lint lint_package lint_tests format format_diff
.EXPORT_ALL_VARIABLES:
UV_FROZEN = true
@@ -78,18 +78,6 @@ api_docs_linkcheck:
fi
@echo "✅ API link check complete"
## spell_check: Run codespell on the project.
spell_check:
@echo "✏️ Checking spelling across project..."
uv run --group codespell codespell --toml pyproject.toml
@echo "✅ Spell check complete"
## spell_fix: Run codespell on the project and fix the errors.
spell_fix:
@echo "✏️ Fixing spelling errors across project..."
uv run --group codespell codespell --toml pyproject.toml -w
@echo "✅ Spelling errors fixed"
######################
# LINTING AND FORMATTING
######################
@@ -100,7 +88,7 @@ lint lint_package lint_tests:
uv run --group lint ruff check docs cookbook
uv run --group lint ruff format docs cookbook cookbook --diff
git --no-pager grep 'from langchain import' docs cookbook | grep -vE 'from langchain import (hub)' && echo "Error: no importing langchain from root in docs, except for hub" && exit 1 || exit 0
git --no-pager grep 'api.python.langchain.com' -- docs/docs ':!docs/docs/additional_resources/arxiv_references.mdx' ':!docs/docs/integrations/document_loaders/sitemap.ipynb' || exit 0 && \
echo "Error: you should link python.langchain.com/api_reference, not api.python.langchain.com in the docs" && \
exit 1

114
README.md
View File

@@ -1,87 +1,75 @@
<picture>
<source media="(prefers-color-scheme: light)" srcset="docs/static/img/logo-dark.svg">
<source media="(prefers-color-scheme: dark)" srcset="docs/static/img/logo-light.svg">
<img alt="LangChain Logo" src="docs/static/img/logo-dark.svg" width="80%">
</picture>
<p align="center">
<picture>
<source media="(prefers-color-scheme: light)" srcset="docs/static/img/logo-dark.svg">
<source media="(prefers-color-scheme: dark)" srcset="docs/static/img/logo-light.svg">
<img alt="LangChain Logo" src="docs/static/img/logo-dark.svg" width="80%">
</picture>
</p>
<div>
<br>
</div>
<p align="center">
The platform for reliable agents.
</p>
[![Release Notes](https://img.shields.io/github/release/langchain-ai/langchain?style=flat-square)](https://github.com/langchain-ai/langchain/releases)
[![CI](https://github.com/langchain-ai/langchain/actions/workflows/check_diffs.yml/badge.svg)](https://github.com/langchain-ai/langchain/actions/workflows/check_diffs.yml)
[![PyPI - License](https://img.shields.io/pypi/l/langchain-core?style=flat-square)](https://opensource.org/licenses/MIT)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-core?style=flat-square)](https://pypistats.org/packages/langchain-core)
[![GitHub star chart](https://img.shields.io/github/stars/langchain-ai/langchain?style=flat-square)](https://star-history.com/#langchain-ai/langchain)
[![Open Issues](https://img.shields.io/github/issues-raw/langchain-ai/langchain?style=flat-square)](https://github.com/langchain-ai/langchain/issues)
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode&style=flat-square)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
[<img src="https://github.com/codespaces/badge.svg" title="Open in Github Codespace" width="150" height="20">](https://codespaces.new/langchain-ai/langchain)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)
[![CodSpeed Badge](https://img.shields.io/endpoint?url=https://codspeed.io/badge.json)](https://codspeed.io/langchain-ai/langchain)
<p align="center">
<a href="https://opensource.org/licenses/MIT" target="_blank">
<img src="https://img.shields.io/pypi/l/langchain-core?style=flat-square" alt="PyPI - License">
</a>
<a href="https://pypistats.org/packages/langchain-core" target="_blank">
<img src="https://img.shields.io/pepy/dt/langchain" alt="PyPI - Downloads">
</a>
<a href="https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain" target="_blank">
<img src="https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode&style=flat-square" alt="Open in Dev Containers">
</a>
<a href="https://codespaces.new/langchain-ai/langchain" target="_blank">
<img src="https://github.com/codespaces/badge.svg" alt="Open in Github Codespace" title="Open in Github Codespace" width="150" height="20">
</a>
<a href="https://codspeed.io/langchain-ai/langchain" target="_blank">
<img src="https://img.shields.io/endpoint?url=https://codspeed.io/badge.json" alt="CodSpeed Badge">
</a>
<a href="https://twitter.com/langchainai" target="_blank">
<img src="https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI" alt="Twitter / X">
</a>
</p>
> [!NOTE]
> Looking for the JS/TS library? Check out [LangChain.js](https://github.com/langchain-ai/langchainjs).
LangChain is a framework for building LLM-powered applications. It helps you chain
together interoperable components and third-party integrations to simplify AI
application development — all while future-proofing decisions as the underlying
technology evolves.
LangChain is a framework for building LLM-powered applications. It helps you chain together interoperable components and third-party integrations to simplify AI application development — all while future-proofing decisions as the underlying technology evolves.
```bash
pip install -U langchain
```
To learn more about LangChain, check out
[the docs](https://python.langchain.com/docs/introduction/). If youre looking for more
advanced customization or agent orchestration, check out
[LangGraph](https://langchain-ai.github.io/langgraph/), our framework for building
controllable agent workflows.
---
**Documentation**: To learn more about LangChain, check out [the docs](https://python.langchain.com/docs/introduction/).
If you're looking for more advanced customization or agent orchestration, check out [LangGraph](https://langchain-ai.github.io/langgraph/), our framework for building controllable agent workflows.
> [!NOTE]
> Looking for the JS/TS library? Check out [LangChain.js](https://github.com/langchain-ai/langchainjs).
## Why use LangChain?
LangChain helps developers build applications powered by LLMs through a standard
interface for models, embeddings, vector stores, and more.
LangChain helps developers build applications powered by LLMs through a standard interface for models, embeddings, vector stores, and more.
Use LangChain for:
- **Real-time data augmentation**. Easily connect LLMs to diverse data sources and
external / internal systems, drawing from LangChains vast library of integrations with
model providers, tools, vector stores, retrievers, and more.
- **Model interoperability**. Swap models in and out as your engineering team
experiments to find the best choice for your applications needs. As the industry
frontier evolves, adapt quickly — LangChains abstractions keep you moving without
losing momentum.
- **Real-time data augmentation**. Easily connect LLMs to diverse data sources and external/internal systems, drawing from LangChains vast library of integrations with model providers, tools, vector stores, retrievers, and more.
- **Model interoperability**. Swap models in and out as your engineering team experiments to find the best choice for your applications needs. As the industry frontier evolves, adapt quickly — LangChains abstractions keep you moving without losing momentum.
## LangChains ecosystem
While the LangChain framework can be used standalone, it also integrates seamlessly
with any LangChain product, giving developers a full suite of tools when building LLM
applications.
While the LangChain framework can be used standalone, it also integrates seamlessly with any LangChain product, giving developers a full suite of tools when building LLM applications.
To improve your LLM application development, pair LangChain with:
- [LangSmith](http://www.langchain.com/langsmith) - Helpful for agent evals and
observability. Debug poor-performing LLM app runs, evaluate agent trajectories, gain
visibility in production, and improve performance over time.
- [LangGraph](https://langchain-ai.github.io/langgraph/) - Build agents that can
reliably handle complex tasks with LangGraph, our low-level agent orchestration
framework. LangGraph offers customizable architecture, long-term memory, and
human-in-the-loop workflows — and is trusted in production by companies like LinkedIn,
Uber, Klarna, and GitLab.
- [LangGraph Platform](https://langchain-ai.github.io/langgraph/concepts/langgraph_platform/) - Deploy
and scale agents effortlessly with a purpose-built deployment platform for long
running, stateful workflows. Discover, reuse, configure, and share agents across
teams — and iterate quickly with visual prototyping in
[LangGraph Studio](https://langchain-ai.github.io/langgraph/concepts/langgraph_studio/).
- [LangSmith](https://www.langchain.com/langsmith) - Helpful for agent evals and observability. Debug poor-performing LLM app runs, evaluate agent trajectories, gain visibility in production, and improve performance over time.
- [LangGraph](https://langchain-ai.github.io/langgraph/) - Build agents that can reliably handle complex tasks with LangGraph, our low-level agent orchestration framework. LangGraph offers customizable architecture, long-term memory, and human-in-the-loop workflows — and is trusted in production by companies like LinkedIn, Uber, Klarna, and GitLab.
- [LangGraph Platform](https://docs.langchain.com/langgraph-platform) - Deploy and scale agents effortlessly with a purpose-built deployment platform for long-running, stateful workflows. Discover, reuse, configure, and share agents across teams — and iterate quickly with visual prototyping in [LangGraph Studio](https://langchain-ai.github.io/langgraph/concepts/langgraph_studio/).
## Additional resources
- [Tutorials](https://python.langchain.com/docs/tutorials/): Simple walkthroughs with
guided examples on getting started with LangChain.
- [How-to Guides](https://python.langchain.com/docs/how_to/): Quick, actionable code
snippets for topics such as tool calling, RAG use cases, and more.
- [Conceptual Guides](https://python.langchain.com/docs/concepts/): Explanations of key
concepts behind the LangChain framework.
- [Tutorials](https://python.langchain.com/docs/tutorials/): Simple walkthroughs with guided examples on getting started with LangChain.
- [How-to Guides](https://python.langchain.com/docs/how_to/): Quick, actionable code snippets for topics such as tool calling, RAG use cases, and more.
- [Conceptual Guides](https://python.langchain.com/docs/concepts/): Explanations of key concepts behind the LangChain framework.
- [LangChain Forum](https://forum.langchain.com/): Connect with the community and share all of your technical questions, ideas, and feedback.
- [API Reference](https://python.langchain.com/api_reference/): Detailed reference on
navigating base packages and integrations for LangChain.
- [API Reference](https://python.langchain.com/api_reference/): Detailed reference on navigating base packages and integrations for LangChain.
- [Chat LangChain](https://chat.langchain.com/): Ask questions & chat with our documentation.

View File

@@ -4,9 +4,9 @@ LangChain has a large ecosystem of integrations with various external resources
## Best practices
When building such applications developers should remember to follow good security practices:
When building such applications, developers should remember to follow good security practices:
* [**Limit Permissions**](https://en.wikipedia.org/wiki/Principle_of_least_privilege): Scope permissions specifically to the application's need. Granting broad or excessive permissions can introduce significant security vulnerabilities. To avoid such vulnerabilities, consider using read-only credentials, disallowing access to sensitive resources, using sandboxing techniques (such as running inside a container), specifying proxy configurations to control external requests, etc. as appropriate for your application.
* [**Limit Permissions**](https://en.wikipedia.org/wiki/Principle_of_least_privilege): Scope permissions specifically to the application's need. Granting broad or excessive permissions can introduce significant security vulnerabilities. To avoid such vulnerabilities, consider using read-only credentials, disallowing access to sensitive resources, using sandboxing techniques (such as running inside a container), specifying proxy configurations to control external requests, etc., as appropriate for your application.
* **Anticipate Potential Misuse**: Just as humans can err, so can Large Language Models (LLMs). Always assume that any system access or credentials may be used in any way allowed by the permissions they are assigned. For example, if a pair of database credentials allows deleting data, it's safest to assume that any LLM able to use those credentials may in fact delete data.
* [**Defense in Depth**](https://en.wikipedia.org/wiki/Defense_in_depth_(computing)): No security technique is perfect. Fine-tuning and good chain design can reduce, but not eliminate, the odds that a Large Language Model (LLM) may make a mistake. It's best to combine multiple layered security approaches rather than relying on any single layer of defense to ensure security. For example: use both read-only permissions and sandboxing to ensure that LLMs are only able to access data that is explicitly meant for them to use.
@@ -22,9 +22,7 @@ Example scenarios with mitigation strategies:
* A user may ask an agent with write access to an external API to write malicious data to the API, or delete data from that API. To mitigate, give the agent read-only API keys, or limit it to only use endpoints that are already resistant to such misuse.
* A user may ask an agent with access to a database to drop a table or mutate the schema. To mitigate, scope the credentials to only the tables that the agent needs to access and consider issuing READ-ONLY credentials.
If you're building applications that access external resources like file systems, APIs
or databases, consider speaking with your company's security team to determine how to best
design and secure your applications.
If you're building applications that access external resources like file systems, APIs or databases, consider speaking with your company's security team to determine how to best design and secure your applications.
## Reporting OSS Vulnerabilities
@@ -32,15 +30,13 @@ LangChain is partnered with [huntr by Protect AI](https://huntr.com/) to provide
a bounty program for our open source projects.
Please report security vulnerabilities associated with the LangChain
open source projects [here](https://huntr.com/bounties/disclose/?target=https%3A%2F%2Fgithub.com%2Flangchain-ai%2Flangchain&validSearch=true).
open source projects at [huntr](https://huntr.com/bounties/disclose/?target=https%3A%2F%2Fgithub.com%2Flangchain-ai%2Flangchain&validSearch=true).
Before reporting a vulnerability, please review:
1) In-Scope Targets and Out-of-Scope Targets below.
2) The [langchain-ai/langchain](https://python.langchain.com/docs/contributing/repo_structure) monorepo structure.
3) The [Best Practices](#best-practices) above to
understand what we consider to be a security vulnerability vs. developer
responsibility.
2) The [langchain-ai/langchain](https://docs.langchain.com/oss/python/contributing/code#supporting-packages) monorepo structure.
3) The [Best Practices](#best-practices) above to understand what we consider to be a security vulnerability vs. developer responsibility.
### In-Scope Targets
@@ -67,8 +63,7 @@ All out of scope targets defined by huntr as well as:
for more details, but generally tools interact with the real world. Developers are
expected to understand the security implications of their code and are responsible
for the security of their tools.
* Code documented with security notices. This will be decided on a case by
case basis, but likely will not be eligible for a bounty as the code is already
* Code documented with security notices. This will be decided on a case-by-case basis, but likely will not be eligible for a bounty as the code is already
documented with guidelines for developers that should be followed for making their
application secure.
* Any LangSmith related repositories or APIs (see [Reporting LangSmith Vulnerabilities](#reporting-langsmith-vulnerabilities)).

View File

@@ -144,7 +144,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": null,
"id": "kWDWfSDBMPl8",
"metadata": {},
"outputs": [
@@ -185,7 +185,7 @@
" )\n",
" # Text summary chain\n",
" model = VertexAI(\n",
" temperature=0, model_name=\"gemini-2.0-flash-lite-001\", max_tokens=1024\n",
" temperature=0, model_name=\"gemini-2.5-flash\", max_tokens=1024\n",
" ).with_fallbacks([empty_response])\n",
" summarize_chain = {\"element\": lambda x: x} | prompt | model | StrOutputParser()\n",
"\n",
@@ -235,7 +235,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": null,
"id": "PeK9bzXv3olF",
"metadata": {},
"outputs": [],
@@ -254,7 +254,7 @@
"\n",
"def image_summarize(img_base64, prompt):\n",
" \"\"\"Make image summary\"\"\"\n",
" model = ChatVertexAI(model=\"gemini-2.0-flash\", max_tokens=1024)\n",
" model = ChatVertexAI(model=\"gemini-2.5-flash\", max_tokens=1024)\n",
"\n",
" msg = model.invoke(\n",
" [\n",
@@ -431,7 +431,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": null,
"id": "GlwCErBaCKQW",
"metadata": {},
"outputs": [],
@@ -553,7 +553,7 @@
" \"\"\"\n",
"\n",
" # Multi-modal LLM\n",
" model = ChatVertexAI(temperature=0, model_name=\"gemini-2.0-flash\", max_tokens=1024)\n",
" model = ChatVertexAI(temperature=0, model_name=\"gemini-2.5-flash\", max_tokens=1024)\n",
"\n",
" # RAG pipeline\n",
" chain = (\n",

View File

@@ -63,4 +63,5 @@ Notebook | Description
[rag-locally-on-intel-cpu.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/rag-locally-on-intel-cpu.ipynb) | Perform Retrieval-Augmented-Generation (RAG) on locally downloaded open-source models using langchain and open source tools and execute it on Intel Xeon CPU. We showed an example of how to apply RAG on Llama 2 model and enable it to answer the queries related to Intel Q1 2024 earnings release.
[visual_RAG_vdms.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/visual_RAG_vdms.ipynb) | Performs Visual Retrieval-Augmented-Generation (RAG) using videos and scene descriptions generated by open source models.
[contextual_rag.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/contextual_rag.ipynb) | Performs contextual retrieval-augmented generation (RAG) prepending chunk-specific explanatory context to each chunk before embedding.
[rag-agents-locally-on-intel-cpu.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/local_rag_agents_intel_cpu.ipynb) | Build a RAG agent locally with open source models that routes questions through one of two paths to find answers. The agent generates answers based on documents retrieved from either the vector database or retrieved from web search. If the vector database lacks relevant information, the agent opts for web search. Open-source models for LLM and embeddings are used locally on an Intel Xeon CPU to execute this pipeline.
[rag-agents-locally-on-intel-cpu.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/local_rag_agents_intel_cpu.ipynb) | Build a RAG agent locally with open source models that routes questions through one of two paths to find answers. The agent generates answers based on documents retrieved from either the vector database or retrieved from web search. If the vector database lacks relevant information, the agent opts for web search. Open-source models for LLM and embeddings are used locally on an Intel Xeon CPU to execute this pipeline.
[rag_mlflow_tracking_evaluation.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/rag_mlflow_tracking_evaluation.ipynb) | Guide on how to create a RAG pipeline and track + evaluate it with MLflow.

View File

@@ -79,6 +79,17 @@
"tool_executor = ToolExecutor(tools)"
]
},
{
"cell_type": "markdown",
"id": "168152fc",
"metadata": {},
"source": [
"📘 **Note on `SystemMessage` usage with LangGraph-based agents**\n",
"\n",
"When constructing the `messages` list for an agent, you *must* manually include any `SystemMessage`s.\n",
"Unlike some agent executors in LangChain that set a default, LangGraph requires explicit inclusion."
]
},
{
"cell_type": "markdown",
"id": "fe6e8f78-1ef7-42ad-b2bf-835ed5850553",

View File

@@ -47,10 +47,12 @@
"source": [
"### Prerequisites\n",
"\n",
"Please install the Oracle Database [python-oracledb driver](https://pypi.org/project/oracledb/) to use LangChain with Oracle AI Vector Search:\n",
"You'll need to install `langchain-oracledb` with `python -m pip install -U langchain-oracledb` to use this integration.\n",
"\n",
"The `python-oracledb` driver is installed automatically as a dependency of langchain-oracledb.\n",
"\n",
"```\n",
"$ python -m pip install --upgrade oracledb\n",
"$ python -m pip install -U langchain-oracledb\n",
"```"
]
},
@@ -217,7 +219,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.embeddings.oracleai import OracleEmbeddings\n",
"from langchain_oracledb.embeddings.oracleai import OracleEmbeddings\n",
"\n",
"# please update with your related information\n",
"# make sure that you have onnx file in the system\n",
@@ -296,7 +298,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.document_loaders.oracleai import OracleDocLoader\n",
"from langchain_oracledb.document_loaders.oracleai import OracleDocLoader\n",
"from langchain_core.documents import Document\n",
"\n",
"# loading from Oracle Database table\n",
@@ -354,7 +356,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.utilities.oracleai import OracleSummary\n",
"from langchain_oracledb.utilities.oracleai import OracleSummary\n",
"from langchain_core.documents import Document\n",
"\n",
"# using 'database' provider\n",
@@ -395,7 +397,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.document_loaders.oracleai import OracleTextSplitter\n",
"from langchain_oracledb.document_loaders.oracleai import OracleTextSplitter\n",
"from langchain_core.documents import Document\n",
"\n",
"# split by default parameters\n",
@@ -452,7 +454,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.embeddings.oracleai import OracleEmbeddings\n",
"from langchain_oracledb.embeddings.oracleai import OracleEmbeddings\n",
"from langchain_core.documents import Document\n",
"\n",
"# using ONNX model loaded to Oracle Database\n",
@@ -498,14 +500,14 @@
"import sys\n",
"\n",
"import oracledb\n",
"from langchain_community.document_loaders.oracleai import (\n",
"from langchain_oracledb.document_loaders.oracleai import (\n",
" OracleDocLoader,\n",
" OracleTextSplitter,\n",
")\n",
"from langchain_community.embeddings.oracleai import OracleEmbeddings\n",
"from langchain_community.utilities.oracleai import OracleSummary\n",
"from langchain_community.vectorstores import oraclevs\n",
"from langchain_community.vectorstores.oraclevs import OracleVS\n",
"from langchain_oracledb.embeddings.oracleai import OracleEmbeddings\n",
"from langchain_oracledb.utilities.oracleai import OracleSummary\n",
"from langchain_oracledb.vectorstores import oraclevs\n",
"from langchain_oracledb.vectorstores.oraclevs import OracleVS\n",
"from langchain_community.vectorstores.utils import DistanceStrategy\n",
"from langchain_core.documents import Document"
]
@@ -677,19 +679,19 @@
"outputs": [],
"source": [
"query = \"What is Oracle AI Vector Store?\"\n",
"filter = {\"document_id\": [\"1\"]}\n",
"db_filter = {\"document_id\": \"1\"}\n",
"\n",
"# Similarity search without a filter\n",
"print(vectorstore.similarity_search(query, 1))\n",
"\n",
"# Similarity search with a filter\n",
"print(vectorstore.similarity_search(query, 1, filter=filter))\n",
"print(vectorstore.similarity_search(query, 1, filter=db_filter))\n",
"\n",
"# Similarity search with relevance score\n",
"print(vectorstore.similarity_search_with_score(query, 1))\n",
"\n",
"# Similarity search with relevance score with filter\n",
"print(vectorstore.similarity_search_with_score(query, 1, filter=filter))\n",
"print(vectorstore.similarity_search_with_score(query, 1, filter=db_filter))\n",
"\n",
"# Max marginal relevance search\n",
"print(vectorstore.max_marginal_relevance_search(query, 1, fetch_k=20, lambda_mult=0.5))\n",
@@ -697,7 +699,7 @@
"# Max marginal relevance search with filter\n",
"print(\n",
" vectorstore.max_marginal_relevance_search(\n",
" query, 1, fetch_k=20, lambda_mult=0.5, filter=filter\n",
" query, 1, fetch_k=20, lambda_mult=0.5, filter=db_filter\n",
" )\n",
")"
]

View File

@@ -0,0 +1,455 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "3716230e",
"metadata": {},
"source": [
"# RAG Pipeline with MLflow Tracking, Tracing & Evaluation\n",
"\n",
"This notebook demonstrates how to build a complete Retrieval-Augmented Generation (RAG) pipeline using LangChain and integrate it with MLflow for experiment tracking, tracing, and evaluation.\n",
"\n",
"\n",
"- **RAG Pipeline Construction**: Build a complete RAG system using LangChain components\n",
"- **MLflow Integration**: Track experiments, parameters, and artifacts\n",
"- **Tracing**: Monitor inputs, outputs, retrieved documents, scores, prompts, and timings\n",
"- **Evaluation**: Use MLflow's built-in scorers to assess RAG performance\n",
"- **Best Practices**: Implement proper configuration management and reproducible experiments\n",
"\n",
"We'll build a RAG system that can answer questions about academic papers by:\n",
"1. Loading and chunking documents from ArXiv\n",
"2. Creating embeddings and a vector store\n",
"3. Setting up a retrieval-augmented generation chain\n",
"4. Tracking all experiments with MLflow\n",
"5. Evaluating the system's performance\n",
"\n",
"![System Diagram](https://miro.medium.com/v2/resize:fit:720/format:webp/1*eiw86PP4hrBBxhjTjP0JUQ.png)"
]
},
{
"cell_type": "markdown",
"id": "2f7561c4",
"metadata": {},
"source": [
"#### Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0814ebe9",
"metadata": {},
"outputs": [],
"source": [
"%pip install -U langchain mlflow langchain-community arxiv pymupdf langchain-text-splitters langchain-openai"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "747399b6",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import mlflow\n",
"from mlflow.genai.scorers import RelevanceToQuery, Correctness, ExpectationsGuidelines\n",
"from langchain_community.document_loaders import ArxivLoader\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_core.vectorstores import InMemoryVectorStore\n",
"from langchain_openai import OpenAIEmbeddings, ChatOpenAI\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnableLambda, RunnablePassthrough\n",
"from langchain_core.output_parsers import StrOutputParser"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4141ee05",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"OPENAI_API_KEY\"] = \"<YOUR OPENAI API KEY>\"\n",
"\n",
"mlflow.set_experiment(\"LangChain-RAG-MLflow\")\n",
"mlflow.langchain.autolog()"
]
},
{
"cell_type": "markdown",
"id": "dd5eb41b",
"metadata": {},
"source": [
"Define all hyperparameters and configuration in a centralized dictionary. This makes it easy to:\n",
"- Track different experiment configurations\n",
"- Reproduce results\n",
"- Perform hyperparameter tuning\n",
"\n",
"**Key Parameters**:\n",
"- `chunk_size`: Size of text chunks for document splitting\n",
"- `chunk_overlap`: Overlap between consecutive chunks\n",
"- `retriever_k`: Number of documents to retrieve\n",
"- `embeddings_model`: OpenAI embedding model\n",
"- `llm`: Language model for generation\n",
"- `temperature`: Sampling temperature for the LLM"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "6dcdc5d8",
"metadata": {},
"outputs": [],
"source": [
"CONFIG = {\n",
" \"chunk_size\": 400,\n",
" \"chunk_overlap\": 80,\n",
" \"retriever_k\": 3,\n",
" \"embeddings_model\": \"text-embedding-3-small\",\n",
" \"system_prompt\": \"You are a helpful assistant. Use the following context to answer the question. Use three sentences maximum and keep the answer concise.\",\n",
" \"llm\": \"gpt-5-nano\",\n",
" \"temperature\": 0,\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "8a2985f1",
"metadata": {},
"source": [
"#### ArXiv Dcoument Loading and Processing"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "1f32aa36",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'Published': '2023-08-02', 'Title': 'Attention Is All You Need', 'Authors': 'Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin', 'Summary': 'The dominant sequence transduction models are based on complex recurrent or\\nconvolutional neural networks in an encoder-decoder configuration. The best\\nperforming models also connect the encoder and decoder through an attention\\nmechanism. We propose a new simple network architecture, the Transformer, based\\nsolely on attention mechanisms, dispensing with recurrence and convolutions\\nentirely. Experiments on two machine translation tasks show these models to be\\nsuperior in quality while being more parallelizable and requiring significantly\\nless time to train. Our model achieves 28.4 BLEU on the WMT 2014\\nEnglish-to-German translation task, improving over the existing best results,\\nincluding ensembles by over 2 BLEU. On the WMT 2014 English-to-French\\ntranslation task, our model establishes a new single-model state-of-the-art\\nBLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction\\nof the training costs of the best models from the literature. We show that the\\nTransformer generalizes well to other tasks by applying it successfully to\\nEnglish constituency parsing both with large and limited training data.'}\n"
]
}
],
"source": [
"# Load documents from ArXiv\n",
"loader = ArxivLoader(\n",
" query=\"1706.03762\",\n",
" load_max_docs=1,\n",
")\n",
"docs = loader.load()\n",
"print(docs[0].metadata)\n",
"\n",
"# Split documents into chunks\n",
"splitter = RecursiveCharacterTextSplitter(\n",
" chunk_size=CONFIG[\"chunk_size\"],\n",
" chunk_overlap=CONFIG[\"chunk_overlap\"],\n",
")\n",
"chunks = splitter.split_documents(docs)\n",
"\n",
"\n",
"# Join chunks into a single string\n",
"def join_chunks(chunks):\n",
" return \"\\n\\n\".join([chunk.page_content for chunk in chunks])"
]
},
{
"cell_type": "markdown",
"id": "6e194ab4",
"metadata": {},
"source": [
"#### Vector Store and Retriever Setup"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "26dfbeaa",
"metadata": {},
"outputs": [],
"source": [
"# Create embeddings\n",
"embeddings = OpenAIEmbeddings(model=CONFIG[\"embeddings_model\"])\n",
"\n",
"# Create vector store from documents\n",
"vectorstore = InMemoryVectorStore.from_documents(\n",
" chunks,\n",
" embedding=embeddings,\n",
")\n",
"\n",
"# Create retriever\n",
"retriever = vectorstore.as_retriever(search_kwargs={\"k\": CONFIG[\"retriever_k\"]})"
]
},
{
"cell_type": "markdown",
"id": "bc1f181b",
"metadata": {},
"source": [
"#### RAG Chain Construction using [LCEL](https://python.langchain.com/docs/concepts/lcel/)\n",
"\n",
"Flow:\n",
"1. Query → Retriever (finds relevant chunks)\n",
"2. Chunks → join_chunks (creates context)\n",
"3. Context + Query → Prompt Template\n",
"4. Prompt → Language Model → Response\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "6a810dc3",
"metadata": {},
"outputs": [],
"source": [
"# Initialize the language model\n",
"llm = ChatOpenAI(model=CONFIG[\"llm\"], temperature=CONFIG[\"temperature\"])\n",
"\n",
"# Create the prompt template\n",
"prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", CONFIG[\"system_prompt\"] + \"\\n\\nContext:\\n{context}\\n\\n\"),\n",
" (\"human\", \"\\n{question}\\n\"),\n",
" ]\n",
")\n",
"\n",
"# Construct the RAG chain\n",
"rag_chain = (\n",
" {\n",
" \"context\": retriever | RunnableLambda(join_chunks),\n",
" \"question\": RunnablePassthrough(),\n",
" }\n",
" | prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "c04bd019",
"metadata": {},
"source": [
"#### Prediction Function with MLflow Tracing\n",
"\n",
"Create a prediction function decorated with `@mlflow.trace` to automatically log:\n",
"- Input queries\n",
"- Retrieved documents\n",
"- Generated responses\n",
"- Execution time\n",
"- Chain intermediate steps"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "7b45fc04",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Question: What is the main idea of the paper?\n",
"Response: The main idea is to replace recurrent/convolutional sequence models with a pure attention-based architecture called the Transformer. It uses self-attention to model dependencies between all positions in the input and output, enabling full parallelization and better handling of long-range relations. This approach achieves strong results on translation and can extend to other modalities.\n"
]
}
],
"source": [
"@mlflow.trace\n",
"def predict_fn(question: str) -> str:\n",
" return rag_chain.invoke(question)\n",
"\n",
"\n",
"# Test the prediction function\n",
"sample_question = \"What is the main idea of the paper?\"\n",
"response = predict_fn(sample_question)\n",
"print(f\"Question: {sample_question}\")\n",
"print(f\"Response: {response}\")"
]
},
{
"cell_type": "markdown",
"id": "421469de",
"metadata": {},
"source": [
"#### Evaluation Dataset and Scoring\n",
"\n",
"Define an evaluation dataset and run systematic evaluation using [MLflow's built-in scorers](https://mlflow.org/docs/latest/genai/eval-monitor/scorers/llm-judge/predefined/#available-scorers):\n",
"\n",
"<u>Evaluation Components:</u>\n",
"- **Dataset**: Questions with expected concepts and facts\n",
"- **Scorers**: \n",
" - `RelevanceToQuery`: Measures how relevant the response is to the question\n",
" - `Correctness`: Evaluates factual accuracy of the response\n",
" - `ExpectationsGuidelines`: Checks that output matches expectation guidelines\n",
"\n",
"<u>Best Practices:</u>\n",
"- Create diverse test cases covering different query types\n",
"- Include expected concepts to guide evaluation\n",
"- Use multiple scoring metrics for comprehensive assessment"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "5c1dc4f2",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2025/08/23 20:14:39 INFO mlflow.models.evaluation.utils.trace: Auto tracing is temporarily enabled during the model evaluation for computing some metrics and debugging. To disable tracing, call `mlflow.autolog(disable=True)`.\n",
"2025/08/23 20:14:39 INFO mlflow.genai.utils.data_validation: Testing model prediction with the first sample in the dataset.\n"
]
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "2b6c6687efa24796b39c7951d589d481",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Evaluating: 0%| | 0/3 [Elapsed: 00:00, Remaining: ?] "
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"✨ Evaluation completed.\n",
"\n",
"Metrics and evaluation results are logged to the MLflow run:\n",
" Run name: \u001b[94mbaseline_eval\u001b[0m\n",
" Run ID: \u001b[94ma2218d9f24c9415f8040d3b77af103a9\u001b[0m\n",
"\n",
"To view the detailed evaluation results with sample-wise scores,\n",
"open the \u001b[93m\u001b[1mTraces\u001b[0m tab in the Run page in the MLflow UI.\n",
"\n"
]
}
],
"source": [
"# Define evaluation dataset\n",
"eval_dataset = [\n",
" {\n",
" \"inputs\": {\"question\": \"What is the main idea of the paper?\"},\n",
" \"expectations\": {\n",
" \"key_concepts\": [\"attention mechanism\", \"transformer\", \"neural network\"],\n",
" \"expected_facts\": [\n",
" \"attention mechanism is a key component of the transformer model\"\n",
" ],\n",
" \"guidelines\": [\"The response must be factual and concise\"],\n",
" },\n",
" },\n",
" {\n",
" \"inputs\": {\n",
" \"question\": \"What's the difference between a transformer and a recurrent neural network?\"\n",
" },\n",
" \"expectations\": {\n",
" \"key_concepts\": [\"sequential\", \"attention mechanism\", \"hidden state\"],\n",
" \"expected_facts\": [\n",
" \"transformer processes data in parallel while RNN processes data sequentially\"\n",
" ],\n",
" \"guidelines\": [\n",
" \"The response must be factual and focus on the difference between the two models\"\n",
" ],\n",
" },\n",
" },\n",
" {\n",
" \"inputs\": {\"question\": \"What does the attention mechanism do?\"},\n",
" \"expectations\": {\n",
" \"key_concepts\": [\"query\", \"key\", \"value\", \"relationship\", \"similarity\"],\n",
" \"expected_facts\": [\n",
" \"attention allows the model to weigh the importance of different parts of the input sequence when processing it\"\n",
" ],\n",
" \"guidelines\": [\n",
" \"The response must be factual and explain the concept of attention\"\n",
" ],\n",
" },\n",
" },\n",
"]\n",
"\n",
"# Run evaluation with MLflow\n",
"with mlflow.start_run(run_name=\"baseline_eval\") as run:\n",
" # Log configuration parameters\n",
" mlflow.log_params(CONFIG)\n",
"\n",
" # Run evaluation\n",
" results = mlflow.genai.evaluate(\n",
" data=eval_dataset,\n",
" predict_fn=predict_fn,\n",
" scorers=[RelevanceToQuery(), Correctness(), ExpectationsGuidelines()],\n",
" )"
]
},
{
"cell_type": "markdown",
"id": "52b137c7",
"metadata": {},
"source": [
"#### Launch MLflow UI to check out the results\n",
"\n",
"<u>What you'll see in the UI:</u>\n",
"- **Experiments**: Compare different RAG configurations\n",
"- **Runs**: Individual experiment runs with metrics and parameters\n",
"- **Traces**: Detailed execution traces showing retrieval and generation steps\n",
"- **Evaluation Results**: Scoring metrics and detailed comparisons\n",
"- **Artifacts**: Saved models, datasets, and other files\n",
"\n",
"Navigate to `http://localhost:5000` after running the command below."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "817c3799",
"metadata": {},
"outputs": [],
"source": [
"!mlflow ui"
]
},
{
"cell_type": "markdown",
"id": "c75861e3",
"metadata": {},
"source": [
"You should see something like this\n",
"\n",
"![MLflow UI image](https://miro.medium.com/v2/resize:fit:720/format:webp/1*Cx7MMy53pAP7150x_hvztA.png)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -34,7 +34,7 @@
"tools = [multiply, exponentiate, add]\n",
"\n",
"gpt35 = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0).bind_tools(tools)\n",
"claude3 = ChatAnthropic(model=\"claude-3-sonnet-20240229\").bind_tools(tools)\n",
"claude3 = ChatAnthropic(model=\"claude-3-7-sonnet-20250219\").bind_tools(tools)\n",
"llm_with_tools = gpt35.configurable_alternatives(\n",
" ConfigurableField(id=\"llm\"), default_key=\"gpt35\", claude3=claude3\n",
")"
@@ -113,14 +113,15 @@
{
"data": {
"text/plain": [
"{'messages': [HumanMessage(content=\"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\"),\n",
" AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_6yMU2WsS4Bqgi1WxFHxtfJRc', 'function': {'arguments': '{\"x\": 8, \"y\": 2.743}', 'name': 'exponentiate'}, 'type': 'function'}, {'id': 'call_GAL3dQiKFF9XEV0RrRLPTvVp', 'function': {'arguments': '{\"x\": 17.24, \"y\": -918.1241}', 'name': 'add'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 58, 'prompt_tokens': 168, 'total_tokens': 226}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': 'fp_b28b39ffa8', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-528302fc-7acf-4c11-82c4-119ccf40c573-0', tool_calls=[{'name': 'exponentiate', 'args': {'x': 8, 'y': 2.743}, 'id': 'call_6yMU2WsS4Bqgi1WxFHxtfJRc'}, {'name': 'add', 'args': {'x': 17.24, 'y': -918.1241}, 'id': 'call_GAL3dQiKFF9XEV0RrRLPTvVp'}]),\n",
" ToolMessage(content='300.03770462067547', tool_call_id='call_6yMU2WsS4Bqgi1WxFHxtfJRc'),\n",
" ToolMessage(content='-900.8841', tool_call_id='call_GAL3dQiKFF9XEV0RrRLPTvVp'),\n",
" AIMessage(content='The result of \\\\(3 + 5^{2.743}\\\\) is approximately 300.04, and the result of \\\\(17.24 - 918.1241\\\\) is approximately -900.88.', response_metadata={'token_usage': {'completion_tokens': 44, 'prompt_tokens': 251, 'total_tokens': 295}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': 'fp_b28b39ffa8', 'finish_reason': 'stop', 'logprobs': None}, id='run-d1161669-ed09-4b18-94bd-6d8530df5aa8-0')]}"
"{'messages': [HumanMessage(content=\"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\", additional_kwargs={}, response_metadata={}),\n",
" AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_xuNXwm2P6U2Pp2pAbC1sdIBz', 'function': {'arguments': '{\"x\": 3, \"y\": 5}', 'name': 'add'}, 'type': 'function'}, {'id': 'call_0pImUJUDlYa5zfBcxxuvWyYS', 'function': {'arguments': '{\"x\": 8, \"y\": 2.743}', 'name': 'exponentiate'}, 'type': 'function'}, {'id': 'call_yaownQ9TZK0dkqD1KSFyax4H', 'function': {'arguments': '{\"x\": 17.24, \"y\": -918.1241}', 'name': 'add'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 75, 'prompt_tokens': 131, 'total_tokens': 206, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-ByJm2qxSWU3oTTSZQv64J4XQKZhA6', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run--35fad027-47f7-44d3-aa8b-99f4fc24098c-0', tool_calls=[{'name': 'add', 'args': {'x': 3, 'y': 5}, 'id': 'call_xuNXwm2P6U2Pp2pAbC1sdIBz', 'type': 'tool_call'}, {'name': 'exponentiate', 'args': {'x': 8, 'y': 2.743}, 'id': 'call_0pImUJUDlYa5zfBcxxuvWyYS', 'type': 'tool_call'}, {'name': 'add', 'args': {'x': 17.24, 'y': -918.1241}, 'id': 'call_yaownQ9TZK0dkqD1KSFyax4H', 'type': 'tool_call'}], usage_metadata={'input_tokens': 131, 'output_tokens': 75, 'total_tokens': 206, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}),\n",
" ToolMessage(content='8.0', tool_call_id='call_xuNXwm2P6U2Pp2pAbC1sdIBz'),\n",
" ToolMessage(content='300.03770462067547', tool_call_id='call_0pImUJUDlYa5zfBcxxuvWyYS'),\n",
" ToolMessage(content='-900.8841', tool_call_id='call_yaownQ9TZK0dkqD1KSFyax4H'),\n",
" AIMessage(content='The results are:\\n1. 3 plus 5 is 8.\\n2. 5 raised to the power of 2.743 is approximately 300.04.\\n3. 17.24 minus 918.1241 is approximately -900.88.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 55, 'prompt_tokens': 236, 'total_tokens': 291, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-ByJm345MYnpowGS90iAZAlSs7haed', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='run--5fa66d47-d80e-45d0-9c32-31348c735d72-0', usage_metadata={'input_tokens': 236, 'output_tokens': 55, 'total_tokens': 291, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})]}"
]
},
"execution_count": 4,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
@@ -146,17 +147,17 @@
{
"data": {
"text/plain": [
"{'messages': [HumanMessage(content=\"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\"),\n",
" AIMessage(content=[{'text': \"Okay, let's break this down into two parts:\", 'type': 'text'}, {'id': 'toolu_01DEhqcXkXTtzJAiZ7uMBeDC', 'input': {'x': 3, 'y': 5}, 'name': 'add', 'type': 'tool_use'}], response_metadata={'id': 'msg_01AkLGH8sxMHaH15yewmjwkF', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'tool_use', 'stop_sequence': None, 'usage': {'input_tokens': 450, 'output_tokens': 81}}, id='run-f35bfae8-8ded-4f8a-831b-0940d6ad16b6-0', tool_calls=[{'name': 'add', 'args': {'x': 3, 'y': 5}, 'id': 'toolu_01DEhqcXkXTtzJAiZ7uMBeDC'}]),\n",
" ToolMessage(content='8.0', tool_call_id='toolu_01DEhqcXkXTtzJAiZ7uMBeDC'),\n",
" AIMessage(content=[{'id': 'toolu_013DyMLrvnrto33peAKMGMr1', 'input': {'x': 8.0, 'y': 2.743}, 'name': 'exponentiate', 'type': 'tool_use'}], response_metadata={'id': 'msg_015Fmp8aztwYcce2JDAFfce3', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'tool_use', 'stop_sequence': None, 'usage': {'input_tokens': 545, 'output_tokens': 75}}, id='run-48aaeeeb-a1e5-48fd-a57a-6c3da2907b47-0', tool_calls=[{'name': 'exponentiate', 'args': {'x': 8.0, 'y': 2.743}, 'id': 'toolu_013DyMLrvnrto33peAKMGMr1'}]),\n",
" ToolMessage(content='300.03770462067547', tool_call_id='toolu_013DyMLrvnrto33peAKMGMr1'),\n",
" AIMessage(content=[{'text': 'So 3 plus 5 raised to the 2.743 power is 300.04.\\n\\nFor the second part:', 'type': 'text'}, {'id': 'toolu_01UTmMrGTmLpPrPCF1rShN46', 'input': {'x': 17.24, 'y': -918.1241}, 'name': 'add', 'type': 'tool_use'}], response_metadata={'id': 'msg_015TkhfRBENPib2RWAxkieH6', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'tool_use', 'stop_sequence': None, 'usage': {'input_tokens': 638, 'output_tokens': 105}}, id='run-45fb62e3-d102-4159-881d-241c5dbadeed-0', tool_calls=[{'name': 'add', 'args': {'x': 17.24, 'y': -918.1241}, 'id': 'toolu_01UTmMrGTmLpPrPCF1rShN46'}]),\n",
" ToolMessage(content='-900.8841', tool_call_id='toolu_01UTmMrGTmLpPrPCF1rShN46'),\n",
" AIMessage(content='Therefore, 17.24 - 918.1241 = -900.8841', response_metadata={'id': 'msg_01LgKnRuUcSyADCpxv9tPoYD', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 759, 'output_tokens': 24}}, id='run-1008254e-ccd1-497c-8312-9550dd77bd08-0')]}"
"{'messages': [HumanMessage(content=\"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\", additional_kwargs={}, response_metadata={}),\n",
" AIMessage(content=[{'text': \"I'll solve these calculations for you.\\n\\nFor the first part, I need to calculate 3 plus 5 raised to the power of 2.743.\\n\\nLet me break this down:\\n1) First, I'll calculate 5 raised to the power of 2.743\\n2) Then add 3 to the result\", 'type': 'text'}, {'id': 'toolu_01L1mXysBQtpPLQ2AZTaCGmE', 'input': {'x': 5, 'y': 2.743}, 'name': 'exponentiate', 'type': 'tool_use'}], additional_kwargs={}, response_metadata={'id': 'msg_01HCbDmuzdg9ATMyKbnecbEE', 'model': 'claude-3-7-sonnet-20250219', 'stop_reason': 'tool_use', 'stop_sequence': None, 'usage': {'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 563, 'output_tokens': 146, 'server_tool_use': None, 'service_tier': 'standard'}, 'model_name': 'claude-3-7-sonnet-20250219'}, id='run--9f6469fb-bcbb-4c1c-9eec-79f6979c38e6-0', tool_calls=[{'name': 'exponentiate', 'args': {'x': 5, 'y': 2.743}, 'id': 'toolu_01L1mXysBQtpPLQ2AZTaCGmE', 'type': 'tool_call'}], usage_metadata={'input_tokens': 563, 'output_tokens': 146, 'total_tokens': 709, 'input_token_details': {'cache_read': 0, 'cache_creation': 0}}),\n",
" ToolMessage(content='82.65606421491815', tool_call_id='toolu_01L1mXysBQtpPLQ2AZTaCGmE'),\n",
" AIMessage(content=[{'text': \"Now I'll add 3 to this result:\", 'type': 'text'}, {'id': 'toolu_01NARC83e9obV35mZ6jYzBiN', 'input': {'x': 3, 'y': 82.65606421491815}, 'name': 'add', 'type': 'tool_use'}], additional_kwargs={}, response_metadata={'id': 'msg_01ELwyCtVLeGC685PUFqmdz2', 'model': 'claude-3-7-sonnet-20250219', 'stop_reason': 'tool_use', 'stop_sequence': None, 'usage': {'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 727, 'output_tokens': 87, 'server_tool_use': None, 'service_tier': 'standard'}, 'model_name': 'claude-3-7-sonnet-20250219'}, id='run--d5af3d7c-e8b7-4cc2-997a-ad2dafd08751-0', tool_calls=[{'name': 'add', 'args': {'x': 3, 'y': 82.65606421491815}, 'id': 'toolu_01NARC83e9obV35mZ6jYzBiN', 'type': 'tool_call'}], usage_metadata={'input_tokens': 727, 'output_tokens': 87, 'total_tokens': 814, 'input_token_details': {'cache_read': 0, 'cache_creation': 0}}),\n",
" ToolMessage(content='85.65606421491815', tool_call_id='toolu_01NARC83e9obV35mZ6jYzBiN'),\n",
" AIMessage(content=[{'text': \"For the second part, you asked for 17.24 - 918.1241. I don't have a subtraction function available, but I can rewrite this as adding a negative number: 17.24 + (-918.1241)\", 'type': 'text'}, {'id': 'toolu_01Q6fLcZkBWZpMPCZ55WXR3N', 'input': {'x': 17.24, 'y': -918.1241}, 'name': 'add', 'type': 'tool_use'}], additional_kwargs={}, response_metadata={'id': 'msg_01WkmDwUxWjjaKGnTtdLGJnN', 'model': 'claude-3-7-sonnet-20250219', 'stop_reason': 'tool_use', 'stop_sequence': None, 'usage': {'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 832, 'output_tokens': 130, 'server_tool_use': None, 'service_tier': 'standard'}, 'model_name': 'claude-3-7-sonnet-20250219'}, id='run--39a6fbda-4c81-47a6-b361-524bd4ee5823-0', tool_calls=[{'name': 'add', 'args': {'x': 17.24, 'y': -918.1241}, 'id': 'toolu_01Q6fLcZkBWZpMPCZ55WXR3N', 'type': 'tool_call'}], usage_metadata={'input_tokens': 832, 'output_tokens': 130, 'total_tokens': 962, 'input_token_details': {'cache_read': 0, 'cache_creation': 0}}),\n",
" ToolMessage(content='-900.8841', tool_call_id='toolu_01Q6fLcZkBWZpMPCZ55WXR3N'),\n",
" AIMessage(content='So, the answers are:\\n1) 3 plus 5 raised to the 2.743 = 85.65606421491815\\n2) 17.24 - 918.1241 = -900.8841', additional_kwargs={}, response_metadata={'id': 'msg_015Yoc62CvdJbANGFouiQ6AQ', 'model': 'claude-3-7-sonnet-20250219', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'cache_creation_input_tokens': 0, 'cache_read_input_tokens': 0, 'input_tokens': 978, 'output_tokens': 58, 'server_tool_use': None, 'service_tier': 'standard'}, 'model_name': 'claude-3-7-sonnet-20250219'}, id='run--174c0882-6180-47ea-8f63-d7b747302327-0', usage_metadata={'input_tokens': 978, 'output_tokens': 58, 'total_tokens': 1036, 'input_token_details': {'cache_read': 0, 'cache_creation': 0}})]}"
]
},
"execution_count": 5,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
@@ -177,7 +178,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"display_name": "langchain",
"language": "python",
"name": "python3"
},
@@ -191,7 +192,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.10.16"
}
},
"nbformat": 4,

View File

@@ -1,3 +1,154 @@
# LangChain Documentation
For more information on contributing to our documentation, see the [Documentation Contributing Guide](https://python.langchain.com/docs/contributing/how_to/documentation)
For more information on contributing to our documentation, see the [Documentation Contributing Guide](https://python.langchain.com/docs/contributing/how_to/documentation).
## Structure
The primary documentation is located in the `docs/` directory. This directory contains
both the source files for the main documentation as well as the API reference doc
build process.
### API Reference
API reference documentation is located in `docs/api_reference/` and is generated from
the codebase using Sphinx.
The API reference have additional build steps that differ from the main documentation.
#### Deployment Process
Currently, the build process roughly follows these steps:
1. Using the `api_doc_build.yml` GitHub workflow, the API reference docs are
[built](#build-technical-details) and copied to the `langchain-api-docs-html`
repository. This workflow is triggered either (1) on a cron routine interval or (2)
triggered manually.
In short, the workflow extracts all `langchain-ai`-org-owned repos defined in
`langchain/libs/packages.yml`, clones them locally (in the workflow runner's file
system), and then builds the API reference RST files (using `create_api_rst.py`).
Following post-processing, the HTML files are pushed to the
`langchain-api-docs-html` repository.
2. After the HTML files are in the `langchain-api-docs-html` repository, they are **not**
automatically published to the [live docs site](https://python.langchain.com/api_reference/).
The docs site is served by Vercel. The Vercel deployment process copies the HTML
files from the `langchain-api-docs-html` repository and deploys them to the live
site. Deployments are triggered on each new commit pushed to `v0.3`.
#### Build Technical Details
The build process creates a virtual monorepo by syncing multiple repositories, then generates comprehensive API documentation:
1. **Repository Sync Phase:**
- `.github/scripts/prep_api_docs_build.py` - Clones external partner repos and organizes them into the `libs/partners/` structure to create a virtual monorepo for documentation building
2. **RST Generation Phase:**
- `docs/api_reference/create_api_rst.py` - Main script that **generates RST files** from Python source code
- Scans `libs/` directories and extracts classes/functions from each module (using `inspect`)
- Creates `.rst` files using specialized templates for different object types
- Templates in `docs/api_reference/templates/` (`pydantic.rst`, `runnable_pydantic.rst`, etc.)
3. **HTML Build Phase:**
- Sphinx-based, uses `sphinx.ext.autodoc` (auto-extracts docstrings from the codebase)
- `docs/api_reference/conf.py` (sphinx config) configures `autodoc` and other extensions
- `sphinx-build` processes the generated `.rst` files into HTML using autodoc
- `docs/api_reference/scripts/custom_formatter.py` - Post-processes the generated HTML
- Copies `reference.html` to `index.html` to create the default landing page (artifact? might not need to do this - just put everyhing in index directly?)
4. **Deployment:**
- `.github/workflows/api_doc_build.yml` - Workflow responsible for orchestrating the entire build and deployment process
- Built HTML files are committed and pushed to the `langchain-api-docs-html` repository
#### Local Build
For local development and testing of API documentation, use the Makefile targets in the repository root:
```bash
# Full build
make api_docs_build
```
Like the CI process, this target:
- Installs the CLI package in editable mode
- Generates RST files for all packages using `create_api_rst.py`
- Builds HTML documentation with Sphinx
- Post-processes the HTML with `custom_formatter.py`
- Opens the built documentation (`reference.html`) in your browser
**Quick Preview:**
```bash
make api_docs_quick_preview API_PKG=openai
```
- Generates RST files for only the specified package (default: `text-splitters`)
- Builds and post-processes HTML documentation
- Opens the preview in your browser
Both targets automatically clean previous builds and handle the complete build pipeline locally, mirroring the CI process but for faster iteration during development.
#### Documentation Standards
**Docstring Format:**
The API reference uses **Google-style docstrings** with reStructuredText markup. Sphinx processes these through the `sphinx.ext.napoleon` extension to generate documentation.
**Required format:**
```python
def example_function(param1: str, param2: int = 5) -> bool:
"""Brief description of the function.
Longer description can go here. Use reStructuredText syntax for
rich formatting like **bold** and *italic*.
TODO: code: figure out what works?
Args:
param1: Description of the first parameter.
param2: Description of the second parameter with default value.
Returns:
Description of the return value.
Raises:
ValueError: When param1 is empty.
TypeError: When param2 is not an integer.
.. warning::
This function is experimental and may change.
"""
```
**Special Markers:**
- `:private:` in docstrings excludes members from documentation
- `.. warning::` adds warning admonitions
#### Site Styling and Assets
**Theme and Styling:**
- Uses [**PyData Sphinx Theme**](https://pydata-sphinx-theme.readthedocs.io/en/stable/index.html) (`pydata_sphinx_theme`)
- Custom CSS in `docs/api_reference/_static/css/custom.css` with LangChain-specific:
- Color palette
- Inter font family
- Custom navbar height and sidebar formatting
- Deprecated/beta feature styling
**Static Assets:**
- Logos: `_static/wordmark-api.svg` (light) and `_static/wordmark-api-dark.svg` (dark mode)
- Favicon: `_static/img/brand/favicon.png`
- Custom CSS: `_static/css/custom.css`
**Post-Processing:**
- `scripts/custom_formatter.py` cleans up generated HTML:
- Shortens TOC entries from `ClassName.method()` to `method()`
**Analytics and Integration:**
- GitHub integration (source links, edit buttons)
- Example backlinking through custom `ExampleLinksDirective`

View File

@@ -50,7 +50,7 @@ class GalleryGridDirective(SphinxDirective):
individual cards + ["image", "header", "content", "title"].
Danger:
This directive can only be used in the context of a Myst documentation page as
This directive can only be used in the context of a MyST documentation page as
the templates use Markdown flavored formatting.
"""

View File

@@ -1,7 +1,5 @@
"""Configuration file for the Sphinx documentation builder."""
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
@@ -20,16 +18,18 @@ from docutils.parsers.rst.directives.admonitions import BaseAdmonition
from docutils.statemachine import StringList
from sphinx.util.docutils import SphinxDirective
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
# Add paths to Python import system so Sphinx can import LangChain modules
# This allows autodoc to introspect and document the actual code
_DIR = Path(__file__).parent.absolute()
sys.path.insert(0, os.path.abspath("."))
sys.path.insert(0, os.path.abspath("../../libs/langchain"))
sys.path.insert(0, os.path.abspath(".")) # Current directory
sys.path.insert(0, os.path.abspath("../../libs/langchain")) # LangChain main package
# Load package metadata from pyproject.toml (for version info, etc.)
with (_DIR.parents[1] / "libs" / "langchain" / "pyproject.toml").open("r") as f:
data = toml.load(f)
# Load mapping of classes to example notebooks for backlinking
# This file is generated by scripts that scan our tutorial/example notebooks
with (_DIR / "guide_imports.json").open("r") as f:
imported_classes = json.load(f)
@@ -86,6 +86,7 @@ class Beta(BaseAdmonition):
def setup(app):
"""Register custom directives and hooks with Sphinx."""
app.add_directive("example_links", ExampleLinksDirective)
app.add_directive("beta", Beta)
app.connect("autodoc-skip-member", skip_private_members)
@@ -97,7 +98,7 @@ def skip_private_members(app, what, name, obj, skip, options):
if hasattr(obj, "__doc__") and obj.__doc__ and ":private:" in obj.__doc__:
return True
if name == "__init__" and obj.__objclass__ is object:
# dont document default init
# don't document default init
return True
return None
@@ -125,7 +126,7 @@ extensions = [
"sphinx.ext.viewcode",
"sphinxcontrib.autodoc_pydantic",
"IPython.sphinxext.ipython_console_highlighting",
"myst_parser",
"myst_parser", # For generated index.md and reference.md
"_extensions.gallery_directive",
"sphinx_design",
"sphinx_copybutton",
@@ -258,6 +259,7 @@ html_static_path = ["_static"]
html_css_files = ["css/custom.css"]
html_use_index = False
# Only used on the generated index.md and reference.md files
myst_enable_extensions = ["colon_fence"]
# generate autosummary even if no references
@@ -268,11 +270,11 @@ autosummary_ignore_module_all = False
html_copy_source = False
html_show_sourcelink = False
googleanalytics_id = "G-9B66JQQH2F"
# Set canonical URL from the Read the Docs Domain
html_baseurl = os.environ.get("READTHEDOCS_CANONICAL_URL", "")
googleanalytics_id = "G-9B66JQQH2F"
# Tell Jinja2 templates the build is running on Read the Docs
if os.environ.get("READTHEDOCS", "") == "True":
html_context["READTHEDOCS"] = True

View File

@@ -1,4 +1,41 @@
"""Script for auto-generating api_reference.rst."""
"""Auto-generate API reference documentation (RST files) for LangChain packages.
* Automatically discovers all packages in `libs/` and `libs/partners/`
* For each package, recursively walks the filesystem to:
* Load Python modules using importlib
* Extract classes and functions using Python's inspect module
* Classify objects by type (Pydantic models, Runnables, TypedDicts, etc.)
* Filter out private members (names starting with '_') and deprecated items
* Creates structured RST files with:
* Module-level documentation pages with autosummary tables
* Different Sphinx templates based on object type (see templates/ directory)
* Proper cross-references and navigation structure
* Separation of current vs deprecated APIs
* Generates a directory tree like:
```
docs/api_reference/
├── index.md # Main landing page with package gallery
├── reference.md # Package overview and navigation
├── core/ # langchain-core documentation
│ ├── index.rst
│ ├── callbacks.rst
│ └── ...
├── langchain/ # langchain documentation
│ ├── index.rst
│ └── ...
└── partners/ # Integration packages
├── openai/
├── anthropic/
└── ...
```
## Key Features
* Respects privacy markers:
* Modules with `:private:` in docstring are excluded entirely
* Objects with `:private:` in docstring are filtered out
* Names starting with '_' are treated as private
"""
import importlib
import inspect
@@ -97,7 +134,7 @@ def _load_module_members(module_path: str, namespace: str) -> ModuleMembers:
if type(type_) is typing_extensions._TypedDictMeta: # type: ignore
kind: ClassKind = "TypedDict"
elif type(type_) is typing._TypedDictMeta: # type: ignore
kind: ClassKind = "TypedDict"
kind = "TypedDict"
elif (
issubclass(type_, Runnable)
and issubclass(type_, BaseModel)
@@ -177,19 +214,20 @@ def _load_package_modules(
Traversal based on the file system makes it easy to determine which
of the modules/packages are part of the package vs. 3rd party or built-in.
Parameters:
package_directory (Union[str, Path]): Path to the package directory.
submodule (Optional[str]): Optional name of submodule to load.
Args:
package_directory: Path to the package directory.
submodule: Optional name of submodule to load.
Returns:
Dict[str, ModuleMembers]: A dictionary where keys are module names and values are ModuleMembers objects.
A dictionary where keys are module names and values are `ModuleMembers`
objects.
"""
package_path = (
Path(package_directory)
if isinstance(package_directory, str)
else package_directory
)
modules_by_namespace = {}
modules_by_namespace: Dict[str, ModuleMembers] = {}
# Get the high level package name
package_name = package_path.name
@@ -199,12 +237,13 @@ def _load_package_modules(
package_path = package_path / submodule
for file_path in package_path.rglob("*.py"):
# Skip private modules
if file_path.name.startswith("_"):
continue
# Skip integration_template and project_template directories (for libs/cli)
if "integration_template" in file_path.parts:
continue
if "project_template" in file_path.parts:
continue
@@ -215,8 +254,13 @@ def _load_package_modules(
continue
# Get the full namespace of the module
# Example: langchain_core/schema/output_parsers.py ->
# langchain_core.schema.output_parsers
namespace = str(relative_module_name).replace(".py", "").replace("/", ".")
# Keep only the top level namespace
# Example: langchain_core.schema.output_parsers ->
# langchain_core
top_namespace = namespace.split(".")[0]
try:
@@ -253,16 +297,16 @@ def _construct_doc(
members_by_namespace: Dict[str, ModuleMembers],
package_version: str,
) -> List[typing.Tuple[str, str]]:
"""Construct the contents of the reference.rst file for the given package.
"""Construct the contents of the `reference.rst` for the given package.
Args:
package_namespace: The package top level namespace
members_by_namespace: The members of the package, dict organized by top level
module contains a list of classes and functions
inside of the top level namespace.
members_by_namespace: The members of the package dict organized by top level.
Module contains a list of classes and functions inside of the top level
namespace.
Returns:
The contents of the reference.rst file.
The string contents of the reference.rst file.
"""
docs = []
index_doc = f"""\
@@ -283,7 +327,7 @@ def _construct_doc(
.. toctree::
:hidden:
:maxdepth: 2
"""
index_autosummary = """
"""
@@ -365,9 +409,9 @@ def _construct_doc(
module_doc += f"""\
:template: {template}
{class_["qualified_name"]}
"""
index_autosummary += f"""
{class_["qualified_name"]}
@@ -465,10 +509,13 @@ def _construct_doc(
def _build_rst_file(package_name: str = "langchain") -> None:
"""Create a rst file for building of documentation.
"""Create a rst file for a given package.
Args:
package_name: Can be either "langchain" or "core" or "experimental".
package_name: Name of the package to create the rst file for.
Returns:
The rst file is created in the same directory as this script.
"""
package_dir = _package_dir(package_name)
package_members = _load_package_modules(package_dir)
@@ -487,7 +534,7 @@ def _package_namespace(package_name: str) -> str:
"""Returns the package name used.
Args:
package_name: Can be either "langchain" or "core" or "experimental".
package_name: Can be either "langchain" or "core"
Returns:
modified package_name: Can be either "langchain" or "langchain_{package_name}"
@@ -500,7 +547,10 @@ def _package_namespace(package_name: str) -> str:
def _package_dir(package_name: str = "langchain") -> Path:
"""Return the path to the directory containing the documentation."""
"""Return the path to the directory containing the documentation.
Attempts to find the package in `libs/` first, then `libs/partners/`.
"""
if (ROOT_DIR / "libs" / package_name).exists():
return ROOT_DIR / "libs" / package_name / _package_namespace(package_name)
else:
@@ -514,7 +564,7 @@ def _package_dir(package_name: str = "langchain") -> Path:
def _get_package_version(package_dir: Path) -> str:
"""Return the version of the package."""
"""Return the version of the package by reading the `pyproject.toml`."""
try:
with open(package_dir.parent / "pyproject.toml", "r") as f:
pyproject = toml.load(f)
@@ -540,22 +590,39 @@ def _out_file_path(package_name: str) -> Path:
def _build_index(dirs: List[str]) -> None:
"""Build the index.md file for the API reference.
Args:
dirs: List of package directories to include in the index.
Returns:
The index.md file is created in the same directory as this script.
"""
custom_names = {
"aws": "AWS",
"ai21": "AI21",
"ibm": "IBM",
}
ordered = ["core", "langchain", "text-splitters", "community", "experimental"]
ordered = [
"core",
"langchain",
"text-splitters",
"community",
"standard-tests",
]
main_ = [dir_ for dir_ in ordered if dir_ in dirs]
integrations = sorted(dir_ for dir_ in dirs if dir_ not in main_)
doc = """# LangChain Python API Reference
Welcome to the LangChain Python API reference. This is a reference for all
`langchain-x` packages.
Welcome to the LangChain v0.3 Python API reference. This is a reference for all
`langchain-x` packages.
For user guides see [https://python.langchain.com](https://python.langchain.com).
These pages refer to the the v0.3 versions of LangChain packages and integrations. To
visit the documentation for the latest versions of LangChain, visit [https://docs.langchain.com](https://docs.langchain.com)
and [https://reference.langchain.com/python/](https://reference.langchain.com/python/) (for references.)
For the legacy API reference hosted on ReadTheDocs see [https://api.python.langchain.com/](https://api.python.langchain.com/).
For the legacy API reference (<v0.3) hosted on ReadTheDocs see [https://api.python.langchain.com/](https://api.python.langchain.com/).
"""
if main_:
@@ -641,9 +708,14 @@ See the full list of integrations in the Section Navigation.
{integration_tree}
```
"""
# Write the reference.md file
with open(HERE / "reference.md", "w") as f:
f.write(doc)
# Write a dummy index.md file that points to reference.md
# Sphinx requires an index file to exist in each doc directory
# TODO: investigate why we don't just put everything in index.md directly?
# if it works it works I guess
dummy_index = """\
# API reference
@@ -659,8 +731,11 @@ Reference<reference>
def main(dirs: Optional[list] = None) -> None:
"""Generate the api_reference.rst file for each package."""
print("Starting to build API reference files.")
"""Generate the `api_reference.rst` file for each package.
If dirs is None, generate for all packages in `libs/` and `libs/partners/`.
Otherwise generate only for the specified package(s).
"""
if not dirs:
dirs = [
p.parent.name
@@ -669,18 +744,17 @@ def main(dirs: Optional[list] = None) -> None:
if p.parent.parent.name in ("libs", "partners")
]
for dir_ in sorted(dirs):
# Skip any hidden directories
# Skip any hidden directories prefixed with a dot
# Some of these could be present by mistake in the code base
# e.g., .pytest_cache from running tests from the wrong location.
# (e.g., .pytest_cache from running tests from the wrong location)
if dir_.startswith("."):
print("Skipping dir:", dir_)
continue
else:
print("Building package:", dir_)
print("Building:", dir_)
_build_rst_file(package_name=dir_)
_build_index(sorted(dirs))
print("API reference files built.")
if __name__ == "__main__":

File diff suppressed because one or more lines are too long

View File

@@ -1,12 +1,12 @@
autodoc_pydantic>=2,<3
sphinx>=8,<9
myst-parser>=3
sphinx-autobuild>=2024
pydata-sphinx-theme>=0.15
toml>=0.10.2
myst-nb>=1.1.1
pyyaml
sphinx-design
sphinx-copybutton
beautifulsoup4
sphinxcontrib-googleanalytics
pydata-sphinx-theme>=0.15
myst-parser>=3
myst-nb>=1.1.1
toml>=0.10.2
pyyaml
beautifulsoup4

View File

@@ -1,3 +1,10 @@
"""Post-process generated HTML files to clean up table-of-contents headers.
Runs after Sphinx generates the API reference HTML. It finds TOC entries like
"ClassName.method_name()" and shortens them to just "method_name()" for better
readability in the sidebar navigation.
"""
import sys
from glob import glob
from pathlib import Path

View File

@@ -1,10 +1,10 @@
# arXiv
LangChain implements the latest research in the field of Natural Language Processing.
This page contains `arXiv` papers referenced in the LangChain Documentation, API Reference,
Templates, and Cookbooks.
From the opposite direction, scientists use `LangChain` in research and reference it in the research papers.
From the opposite direction, scientists use `LangChain` in research and reference it in the research papers.
`arXiv` papers with references to:
[LangChain](https://arxiv.org/search/?query=langchain&searchtype=all&source=header) | [LangGraph](https://arxiv.org/search/?query=langgraph&searchtype=all&source=header) | [LangSmith](https://arxiv.org/search/?query=langsmith&searchtype=all&source=header)
@@ -83,7 +83,7 @@ a set of open-domain QA datasets, covering multiple query complexities, and
show that ours enhances the overall efficiency and accuracy of QA systems,
compared to relevant baselines including the adaptive retrieval approaches.
Code is available at: https://github.com/starsuzi/Adaptive-RAG.
## Self-Discover: Large Language Models Self-Compose Reasoning Structures
- **Authors:** Pei Zhou, Jay Pujara, Xiang Ren, et al.
@@ -106,7 +106,7 @@ than 20%, while requiring 10-40x fewer inference compute. Finally, we show that
the self-discovered reasoning structures are universally applicable across
model families: from PaLM 2-L to GPT-4, and from GPT-4 to Llama2, and share
commonalities with human reasoning patterns.
## RAG-Fusion: a New Take on Retrieval-Augmented Generation
- **Authors:** Zackary Rackauckas
@@ -129,7 +129,7 @@ the generated queries' relevance to the original query is insufficient. This
research marks significant progress in artificial intelligence (AI) and natural
language processing (NLP) applications and demonstrates transformations in a
global and multi-industry context.
## RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
- **Authors:** Parth Sarthi, Salman Abdullah, Aditi Tuli, et al.
@@ -152,7 +152,7 @@ tasks. On question-answering tasks that involve complex, multi-step reasoning,
we show state-of-the-art results; for example, by coupling RAPTOR retrieval
with the use of GPT-4, we can improve the best performance on the QuALITY
benchmark by 20% in absolute accuracy.
## Corrective Retrieval Augmented Generation
- **Authors:** Shi-Qi Yan, Jia-Chen Gu, Yun Zhu, et al.
@@ -180,7 +180,7 @@ them. CRAG is plug-and-play and can be seamlessly coupled with various
RAG-based approaches. Experiments on four datasets covering short- and
long-form generation tasks show that CRAG can significantly improve the
performance of RAG-based approaches.
## Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering
- **Authors:** Tal Ridnik, Dedy Kredo, Itamar Friedman
@@ -206,7 +206,7 @@ to 44% with the AlphaCodium flow. Many of the principles and best practices
acquired in this work, we believe, are broadly applicable to general code
generation tasks. Full implementation is available at:
https://github.com/Codium-ai/AlphaCodium
## Mixtral of Experts
- **Authors:** Albert Q. Jiang, Alexandre Sablayrolles, Antoine Roux, et al.
@@ -229,7 +229,7 @@ multilingual benchmarks. We also provide a model fine-tuned to follow
instructions, Mixtral 8x7B - Instruct, that surpasses GPT-3.5 Turbo,
Claude-2.1, Gemini Pro, and Llama 2 70B - chat model on human benchmarks. Both
the base and instruct models are released under the Apache 2.0 license.
## Dense X Retrieval: What Retrieval Granularity Should We Use?
- **Authors:** Tong Chen, Hongwei Wang, Sihao Chen, et al.
@@ -255,7 +255,7 @@ also enhances the performance of downstream QA tasks, since the retrieved texts
are more condensed with question-relevant information, reducing the need for
lengthy input tokens and minimizing the inclusion of extraneous, irrelevant
information.
## Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models
- **Authors:** Wenhao Yu, Hongming Zhang, Xiaoman Pan, et al.
@@ -286,7 +286,7 @@ with CoN significantly outperform standard RALMs. Notably, CoN achieves an
average improvement of +7.9 in EM score given entirely noisy retrieved
documents and +10.5 in rejection rates for real-time questions that fall
outside the pre-training knowledge scope.
## Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
- **Authors:** Akari Asai, Zeqiu Wu, Yizhong Wang, et al.
@@ -317,7 +317,7 @@ outperforms ChatGPT and retrieval-augmented Llama2-chat on Open-domain QA,
reasoning and fact verification tasks, and it shows significant gains in
improving factuality and citation accuracy for long-form generations relative
to these models.
## Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
- **Authors:** Huaixiu Steven Zheng, Swaroop Mishra, Xinyun Chen, et al.
@@ -338,7 +338,7 @@ substantial performance gains on various challenging reasoning-intensive tasks
including STEM, Knowledge QA, and Multi-Hop Reasoning. For instance, Step-Back
Prompting improves PaLM-2L performance on MMLU (Physics and Chemistry) by 7%
and 11% respectively, TimeQA by 27%, and MuSiQue by 7%.
## Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation
- **Authors:** Xuefei Ning, Zinan Lin, Zixuan Zhou, et al.
@@ -359,7 +359,7 @@ potentially improve the answer quality on several question categories. SoT is
an initial attempt at data-centric optimization for inference efficiency, and
showcases the potential of eliciting high-quality answers by explicitly
planning the answer structure in language.
## Llama 2: Open Foundation and Fine-Tuned Chat Models
- **Authors:** Hugo Touvron, Louis Martin, Kevin Stone, et al.
@@ -377,7 +377,7 @@ safety, may be a suitable substitute for closed-source models. We provide a
detailed description of our approach to fine-tuning and safety improvements of
Llama 2-Chat in order to enable the community to build on our work and
contribute to the responsible development of LLMs.
## Lost in the Middle: How Language Models Use Long Contexts
- **Authors:** Nelson F. Liu, Kevin Lin, John Hewitt, et al.
@@ -399,7 +399,7 @@ significantly degrades when models must access relevant information in the
middle of long contexts, even for explicitly long-context models. Our analysis
provides a better understanding of how language models use their input context
and provides new evaluation protocols for future long-context language models.
## Query Rewriting for Retrieval-Augmented Large Language Models
- **Authors:** Xinbei Ma, Yeyun Gong, Pengcheng He, et al.
@@ -426,7 +426,7 @@ Evaluation is conducted on downstream tasks, open-domain QA and multiple-choice
QA. Experiments results show consistent performance improvement, indicating
that our framework is proven effective and scalable, and brings a new framework
for retrieval-augmented LLM.
## Large Language Model Guided Tree-of-Thought
- **Authors:** Jieyi Long
@@ -452,7 +452,7 @@ the effectiveness of the proposed technique, we implemented a ToT-based solver
for the Sudoku Puzzle. Experimental results show that the ToT framework can
significantly increase the success rate of Sudoku puzzle solving. Our
implementation of the ToT-based Sudoku solver is available on [GitHub](https://github.com/jieyilong/tree-of-thought-puzzle-solver).
## Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
- **Authors:** Lei Wang, Wanyu Xu, Yihuai Lan, et al.
@@ -482,7 +482,7 @@ by a large margin, is comparable to or exceeds Zero-shot-Program-of-Thought
Prompting, and has comparable performance with 8-shot CoT prompting on the math
reasoning problem. The code can be found at
https://github.com/AGI-Edgerunners/Plan-and-Solve-Prompting.
## Zero-Shot Listwise Document Reranking with a Large Language Model
- **Authors:** Xueguang Ma, Xinyu Zhang, Ronak Pradeep, et al.
@@ -506,7 +506,7 @@ results, but can also act as a final-stage reranker to improve the top-ranked
results of a pointwise method for improved efficiency. Additionally, we apply
our approach to subsets of MIRACL, a recent multilingual retrieval dataset,
with results showing its potential to generalize across different languages.
## Visual Instruction Tuning
- **Authors:** Haotian Liu, Chunyuan Li, Qingyang Wu, et al.
@@ -530,7 +530,7 @@ instruction-following dataset. When fine-tuned on Science QA, the synergy of
LLaVA and GPT-4 achieves a new state-of-the-art accuracy of 92.53%. We make
GPT-4 generated visual instruction tuning data, our model and code base
publicly available.
## Generative Agents: Interactive Simulacra of Human Behavior
- **Authors:** Joon Sung Park, Joseph C. O'Brien, Carrie J. Cai, et al.
@@ -563,7 +563,7 @@ architecture--observation, planning, and reflection--each contribute critically
to the believability of agent behavior. By fusing large language models with
computational, interactive agents, this work introduces architectural and
interaction patterns for enabling believable simulations of human behavior.
## CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society
- **Authors:** Guohao Li, Hasan Abed Al Kader Hammoud, Hani Itani, et al.
@@ -590,7 +590,7 @@ include introducing a novel communicative agent framework, offering a scalable
approach for studying the cooperative behaviors and capabilities of multi-agent
systems, and open-sourcing our library to support research on communicative
agents and beyond: https://github.com/camel-ai/camel.
## HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face
- **Authors:** Yongliang Shen, Kaitao Song, Xu Tan, et al.
@@ -619,7 +619,7 @@ HuggingGPT can tackle a wide range of sophisticated AI tasks spanning different
modalities and domains and achieve impressive results in language, vision,
speech, and other challenging tasks, which paves a new way towards the
realization of artificial general intelligence.
## A Watermark for Large Language Models
- **Authors:** John Kirchenbauer, Jonas Geiping, Yuxin Wen, et al.
@@ -641,7 +641,7 @@ interpretable p-values, and derive an information-theoretic framework for
analyzing the sensitivity of the watermark. We test the watermark using a
multi-billion parameter model from the Open Pretrained Transformer (OPT)
family, and discuss robustness and security.
## Precise Zero-Shot Dense Retrieval without Relevance Labels
- **Authors:** Luyu Gao, Xueguang Ma, Jimmy Lin, et al.
@@ -670,7 +670,7 @@ details. Our experiments show that HyDE significantly outperforms the
state-of-the-art unsupervised dense retriever Contriever and shows strong
performance comparable to fine-tuned retrievers, across various tasks (e.g. web
search, QA, fact verification) and languages~(e.g. sw, ko, ja).
## Constitutional AI: Harmlessness from AI Feedback
- **Authors:** Yuntao Bai, Saurav Kadavath, Sandipan Kundu, et al.
@@ -697,7 +697,7 @@ and RL methods can leverage chain-of-thought style reasoning to improve the
human-judged performance and transparency of AI decision making. These methods
make it possible to control AI behavior more precisely and with far fewer human
labels.
## Robust and Explainable Identification of Logical Fallacies in Natural Language Arguments
- **Authors:** Zhivar Sourati, Vishnu Priya Prasanna Venkatesh, Darshan Deshpande, et al.
@@ -727,7 +727,7 @@ components and fallacy classes, indicating that fallacy identification is a
challenging task that may require specialized forms of reasoning to capture
various classes. We share our open-source code and data on GitHub to support
further work on logical fallacy identification.
## Complementary Explanations for Effective In-Context Learning
- **Authors:** Xi Ye, Srinivasan Iyer, Asli Celikyilmaz, et al.
@@ -752,7 +752,7 @@ performance. Therefore, we propose a maximal marginal relevance-based exemplar
selection approach for constructing exemplar sets that are both relevant as
well as complementary, which successfully improves the in-context learning
performance across three real-world tasks on multiple LLMs.
## PAL: Program-aided Language Models
- **Authors:** Luyu Gao, Aman Madaan, Shuyan Zhou, et al.
@@ -784,7 +784,7 @@ larger models. For example, PAL using Codex achieves state-of-the-art few-shot
accuracy on the GSM8K benchmark of math word problems, surpassing PaLM-540B
which uses chain-of-thought by absolute 15% top-1. Our code and data are
publicly available at http://reasonwithpal.com/ .
## An Analysis of Fusion Functions for Hybrid Retrieval
- **Authors:** Sebastian Bruch, Siyu Gai, Amir Ingber
@@ -803,7 +803,7 @@ learning of a CC fusion is generally agnostic to the choice of score
normalization; that CC outperforms RRF in in-domain and out-of-domain settings;
and finally, that CC is sample efficient, requiring only a small set of
training examples to tune its only parameter to a target domain.
## ReAct: Synergizing Reasoning and Acting in Language Models
- **Authors:** Shunyu Yao, Jeffrey Zhao, Dian Yu, et al.
@@ -835,7 +835,7 @@ benchmarks (ALFWorld and WebShop), ReAct outperforms imitation and
reinforcement learning methods by an absolute success rate of 34% and 10%
respectively, while being prompted with only one or two in-context examples.
Project site with code: https://react-lm.github.io
## Deep Lake: a Lakehouse for Deep Learning
- **Authors:** Sasun Hambardzumyan, Abhinav Tuli, Levon Ghukasyan, et al.
@@ -860,7 +860,7 @@ streams the data over the network to (a) Tensor Query Language, (b) in-browser
visualization engine, or (c) deep learning frameworks without sacrificing GPU
utilization. Datasets stored in Deep Lake can be accessed from PyTorch,
TensorFlow, JAX, and integrate with numerous MLOps tools.
## Matryoshka Representation Learning
- **Authors:** Aditya Kusupati, Gantavya Bhatt, Aniket Rege, et al.
@@ -891,7 +891,7 @@ representations. Finally, we show that MRL extends seamlessly to web-scale
datasets (ImageNet, JFT) across various modalities -- vision (ViT, ResNet),
vision + language (ALIGN) and language (BERT). MRL code and pretrained models
are open-sourced at https://github.com/RAIVNLab/MRL.
## Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages
- **Authors:** Kevin Heffernan, Onur Çelebi, Holger Schwenk
@@ -917,7 +917,7 @@ which is valuable in the low-resource setting.
very low-resource languages and handle 50 African languages, many of which are
not covered by any other model. For these languages, we train sentence
encoders, mine bitexts, and validate the bitexts by training NMT systems.
## Evaluating the Text-to-SQL Capabilities of Large Language Models
- **Authors:** Nitarshan Rajkumar, Raymond Li, Dzmitry Bahdanau
@@ -934,7 +934,7 @@ this setting. Furthermore, we demonstrate on the GeoQuery and Scholar
benchmarks that a small number of in-domain examples provided in the prompt
enables Codex to perform better than state-of-the-art models finetuned on such
few-shot examples.
## Locally Typical Sampling
- **Authors:** Clara Meister, Tiago Pimentel, Gian Wiher, et al.
@@ -963,7 +963,7 @@ human evaluations show that, in comparison to nucleus and top-k sampling,
locally typical sampling offers competitive performance (in both abstractive
summarization and story generation) in terms of quality while consistently
reducing degenerate repetitions.
## ColBERTv2: Effective and Efficient Retrieval via Lightweight Late Interaction
- **Authors:** Keshav Santhanam, Omar Khattab, Jon Saad-Falcon, et al.
@@ -985,7 +985,7 @@ improve the quality and space footprint of late interaction. We evaluate
ColBERTv2 across a wide range of benchmarks, establishing state-of-the-art
quality within and outside the training domain while reducing the space
footprint of late interaction models by 6--10$\times$.
## Learning Transferable Visual Models From Natural Language Supervision
- **Authors:** Alec Radford, Jong Wook Kim, Chris Hallacy, et al.
@@ -1014,7 +1014,7 @@ For instance, we match the accuracy of the original ResNet-50 on ImageNet
zero-shot without needing to use any of the 1.28 million training examples it
was trained on. We release our code and pre-trained model weights at
https://github.com/OpenAI/CLIP.
## Language Models are Few-Shot Learners
- **Authors:** Tom B. Brown, Benjamin Mann, Nick Ryder, et al.
@@ -1047,7 +1047,7 @@ training on large web corpora. Finally, we find that GPT-3 can generate samples
of news articles which human evaluators have difficulty distinguishing from
articles written by humans. We discuss broader societal impacts of this finding
and of GPT-3 in general.
## Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
- **Authors:** Patrick Lewis, Ethan Perez, Aleksandra Piktus, et al.
@@ -1078,7 +1078,7 @@ parametric seq2seq models and task-specific retrieve-and-extract architectures.
For language generation tasks, we find that RAG models generate more specific,
diverse and factual language than a state-of-the-art parametric-only seq2seq
baseline.
## CTRL: A Conditional Transformer Language Model for Controllable Generation
- **Authors:** Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, et al.
@@ -1098,4 +1098,3 @@ codes also allow CTRL to predict which parts of the training data are most
likely given a sequence. This provides a potential method for analyzing large
amounts of data via model-based source attribution. We have released multiple
full-sized, pretrained versions of CTRL at https://github.com/salesforce/ctrl.

View File

@@ -7,4 +7,4 @@
- `BaseChatModel` methods `__call__`, `call_as_llm`, `predict`, `predict_messages`. Will be removed in 0.2.0. Use `BaseChatModel.invoke` instead.
- `BaseChatModel` methods `apredict`, `apredict_messages`. Will be removed in 0.2.0. Use `BaseChatModel.ainvoke` instead.
- `BaseLLM` methods `__call__`, `predict`, `predict_messages`. Will be removed in 0.2.0. Use `BaseLLM.invoke` instead.
- `BaseLLM` methods `apredict`, `apredict_messages`. Will be removed in 0.2.0. Use `BaseLLM.ainvoke` instead.
- `BaseLLM` methods `apredict`, `apredict_messages`. Will be removed in 0.2.0. Use `BaseLLM.ainvoke` instead.

View File

@@ -90,4 +90,4 @@ Deprecated classes and methods will be removed in 0.2.0
| OpenAIMultiFunctionsAgent | create_openai_tools_agent | Use LCEL builder over a class |
| SelfAskWithSearchAgent | create_self_ask_with_search | Use LCEL builder over a class |
| StructuredChatAgent | create_structured_chat_agent | Use LCEL builder over a class |
| XMLAgent | create_xml_agent | Use LCEL builder over a class |
| XMLAgent | create_xml_agent | Use LCEL builder over a class |

View File

@@ -11,8 +11,8 @@ Please see the following resources for more information:
## Legacy agent concept: AgentExecutor
LangChain previously introduced the `AgentExecutor` as a runtime for agents.
While it served as an excellent starting point, its limitations became apparent when dealing with more sophisticated and customized agents.
LangChain previously introduced the `AgentExecutor` as a runtime for agents.
While it served as an excellent starting point, its limitations became apparent when dealing with more sophisticated and customized agents.
As a result, we're gradually phasing out `AgentExecutor` in favor of more flexible solutions in LangGraph.
### Transitioning from AgentExecutor to LangGraph

View File

@@ -1,4 +1,4 @@
# Async programming with langchain
# Async programming with LangChain
:::info Prerequisites
* [Runnable interface](/docs/concepts/runnables)
@@ -12,7 +12,7 @@ You are expected to be familiar with asynchronous programming in Python before r
This guide specifically focuses on what you need to know to work with LangChain in an asynchronous context, assuming that you are already familiar with asynchronous programming.
:::
## Langchain asynchronous APIs
## LangChain asynchronous APIs
Many LangChain APIs are designed to be asynchronous, allowing you to build efficient and responsive applications.

View File

@@ -70,4 +70,4 @@ This is a common reason why you may fail to see events being emitted from custom
runnables or tools.
:::
For specifics on how to use callbacks, see the [relevant how-to guides here](/docs/how_to/#callbacks).
For specifics on how to use callbacks, see the [relevant how-to guides here](/docs/how_to/#callbacks).

View File

@@ -26,7 +26,7 @@ A full conversation often involves a combination of two patterns of alternating
Since chat models have a maximum limit on input size, it's important to manage chat history and trim it as needed to avoid exceeding the [context window](/docs/concepts/chat_models/#context-window).
While processing chat history, it's essential to preserve a correct conversation structure.
While processing chat history, it's essential to preserve a correct conversation structure.
Key guidelines for managing chat history:

View File

@@ -127,7 +127,7 @@ If the input exceeds the context window, the model may not be able to process th
The size of the input is measured in [tokens](/docs/concepts/tokens) which are the unit of processing that the model uses.
## Advanced topics
### Rate-limiting
Many chat model providers impose a limit on the number of requests that can be made in a given time period.

View File

@@ -15,9 +15,9 @@ Embedding models can also be [multimodal](/docs/concepts/multimodality) though s
Imagine being able to capture the essence of any text - a tweet, document, or book - in a single, compact representation.
This is the power of embedding models, which lie at the heart of many retrieval systems.
Embedding models transform human language into a format that machines can understand and compare with speed and accuracy.
Embedding models transform human language into a format that machines can understand and compare with speed and accuracy.
These models take text as input and produce a fixed-length array of numbers, a numerical fingerprint of the text's semantic meaning.
Embeddings allow search system to find relevant documents not just based on keyword matches, but on semantic understanding.
Embeddings allow search system to find relevant documents not just based on keyword matches, but on semantic understanding.
## Key concepts
@@ -27,16 +27,16 @@ Embeddings allow search system to find relevant documents not just based on keyw
(2) **Measure similarity**: Embedding vectors can be compared using simple mathematical operations.
## Embedding
## Embedding
### Historical context
### Historical context
The landscape of embedding models has evolved significantly over the years.
A pivotal moment came in 2018 when Google introduced [BERT (Bidirectional Encoder Representations from Transformers)](https://www.nvidia.com/en-us/glossary/bert/).
The landscape of embedding models has evolved significantly over the years.
A pivotal moment came in 2018 when Google introduced [BERT (Bidirectional Encoder Representations from Transformers)](https://www.nvidia.com/en-us/glossary/bert/).
BERT applied transformer models to embed text as a simple vector representation, which lead to unprecedented performance across various NLP tasks.
However, BERT wasn't optimized for generating sentence embeddings efficiently.
However, BERT wasn't optimized for generating sentence embeddings efficiently.
This limitation spurred the creation of [SBERT (Sentence-BERT)](https://www.sbert.net/examples/training/sts/README.html), which adapted the BERT architecture to generate semantically rich sentence embeddings, easily comparable via similarity metrics like cosine similarity, dramatically reduced the computational overhead for tasks like finding similar sentences.
Today, the embedding model ecosystem is diverse, with numerous providers offering their own implementations.
Today, the embedding model ecosystem is diverse, with numerous providers offering their own implementations.
To navigate this variety, researchers and practitioners often turn to benchmarks like the Massive Text Embedding Benchmark (MTEB) [here](https://huggingface.co/blog/mteb) for objective comparisons.
:::info[Further reading]
@@ -93,9 +93,9 @@ LangChain offers many embedding model integrations which you can find [on the em
## Measure similarity
Each embedding is essentially a set of coordinates, often in a high-dimensional space.
Each embedding is essentially a set of coordinates, often in a high-dimensional space.
In this space, the position of each point (embedding) reflects the meaning of its corresponding text.
Just as similar words might be close to each other in a thesaurus, similar concepts end up close to each other in this embedding space.
Just as similar words might be close to each other in a thesaurus, similar concepts end up close to each other in this embedding space.
This allows for intuitive comparisons between different pieces of text.
By reducing text to these numerical representations, we can use simple mathematical operations to quickly measure how alike two pieces of text are, regardless of their original length or structure.
Some common similarity metrics include:
@@ -118,7 +118,7 @@ def cosine_similarity(vec1, vec2):
similarity = cosine_similarity(query_result, document_result)
print("Cosine Similarity:", similarity)
```
```
:::info[Further reading]
@@ -127,4 +127,4 @@ print("Cosine Similarity:", similarity)
* See Pinecone's [blog post](https://www.pinecone.io/learn/vector-similarity/) on similarity metrics.
* See OpenAI's [FAQ](https://platform.openai.com/docs/guides/embeddings/faq) on what similarity metric to use with OpenAI embeddings.
:::
:::

View File

@@ -14,4 +14,3 @@ This process is vital for building reliable applications.
- It allows you to track results over time and automatically run your evaluators on a schedule or as part of CI/Code
To learn more, check out [this LangSmith guide](https://docs.smith.langchain.com/concepts/evaluation).

View File

@@ -17,4 +17,4 @@ Sometimes these examples are hardcoded into the prompt, but for more advanced si
## Related resources
* [Example selector how-to guides](/docs/how_to/#example-selectors)
* [Example selector how-to guides](/docs/how_to/#example-selectors)

View File

@@ -31,7 +31,7 @@ The conceptual guide does not cover step-by-step instructions or specific implem
- **[Vector stores](/docs/concepts/vectorstores)**: Storage of and efficient search over vectors and associated metadata.
- **[Retriever](/docs/concepts/retrievers)**: A component that returns relevant documents from a knowledge base in response to a query.
- **[Retrieval Augmented Generation (RAG)](/docs/concepts/rag)**: A technique that enhances language models by combining them with external knowledge bases.
- **[Agents](/docs/concepts/agents)**: Use a [language model](/docs/concepts/chat_models) to choose a sequence of actions to take. Agents can interact with external resources via [tool](/docs/concepts/tools).
- **[Agents](/docs/concepts/agents)**: Use a [language model](/docs/concepts/chat_models) to choose a sequence of actions to take. Agents can interact with external resources via [tools](/docs/concepts/tools).
- **[Prompt templates](/docs/concepts/prompt_templates)**: Component for factoring out the static parts of a model "prompt" (usually a sequence of messages). Useful for serializing, versioning, and reusing these static parts.
- **[Output parsers](/docs/concepts/output_parsers)**: Responsible for taking the output of a model and transforming it into a more suitable format for downstream tasks. Output parsers were primarily useful prior to the general availability of [tool calling](/docs/concepts/tool_calling) and [structured outputs](/docs/concepts/structured_outputs).
- **[Few-shot prompting](/docs/concepts/few_shot_prompting)**: A technique for improving model performance by providing a few examples of the task to perform in the prompt.
@@ -48,7 +48,7 @@ The conceptual guide does not cover step-by-step instructions or specific implem
- **[AIMessage](/docs/concepts/messages#aimessage)**: Represents a complete response from an AI model.
- **[astream_events](/docs/concepts/chat_models#key-methods)**: Stream granular information from [LCEL](/docs/concepts/lcel) chains.
- **[BaseTool](/docs/concepts/tools/#tool-interface)**: The base class for all tools in LangChain.
- **[batch](/docs/concepts/runnables)**: Use to execute a runnable with batch inputs.
- **[batch](/docs/concepts/runnables)**: Used to execute a runnable with batch inputs.
- **[bind_tools](/docs/concepts/tool_calling/#tool-binding)**: Allows models to interact with tools.
- **[Caching](/docs/concepts/chat_models#caching)**: Storing results to avoid redundant calls to a chat model.
- **[Chat models](/docs/concepts/multimodality/#multimodality-in-chat-models)**: Chat models that handle multiple data modalities.

View File

@@ -147,7 +147,7 @@ An `AIMessage` has the following attributes. The attributes which are **standard
| `tool_calls` | Standardized | Tool calls associated with the message. See [tool calling](/docs/concepts/tool_calling) for details. |
| `invalid_tool_calls` | Standardized | Tool calls with parsing errors associated with the message. See [tool calling](/docs/concepts/tool_calling) for details. |
| `usage_metadata` | Standardized | Usage metadata for a message, such as [token counts](/docs/concepts/tokens). See [Usage Metadata API Reference](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.UsageMetadata.html). |
| `id` | Standardized | An optional unique identifier for the message, ideally provided by the provider/model that created the message. |
| `id` | Standardized | An optional unique identifier for the message, ideally provided by the provider/model that created the message. See [Message IDs](#message-ids) for details. |
| `response_metadata` | Raw | Response metadata, e.g., response headers, logprobs, token counts. |
#### content
@@ -243,3 +243,37 @@ At the moment, the output of the model will be in terms of LangChain messages, s
need OpenAI format for the output as well.
The [convert_to_openai_messages](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.utils.convert_to_openai_messages.html) utility function can be used to convert from LangChain messages to OpenAI format.
## Message IDs
LangChain messages include an optional `id` field that serves as a unique identifier. Understanding when and how these IDs are assigned can be helpful for debugging, tracing, and working with message history.
### When Messages Get IDs
Messages receive IDs in the following scenarios:
**Automatically assigned by LangChain:**
- When generated through chat model invocation (`.invoke()`, `.stream()`, `.astream()`) with an active run manager/tracing context
- IDs follow the format:
- `run-$RUN_ID` (e.g., `run-ba48f958-6402-41a5-b461-5e250a4ebd36-0`)
- `run-$RUN_ID-$IDX` (e.g., `run-ba48f958-6402-41a5-b461-5e250a4ebd36-1`) when there are multiple generations from a single chat model invocation.
**Provider-assigned IDs (highest priority):**
- When the model provider assigns its own ID to the message
- These take precedence over LangChain-generated run IDs
- Format varies by provider
### When Messages Don't Get IDs
Messages will **not** receive IDs in these situations:
- **Manual message creation**: Messages created directly (e.g., `AIMessage(content="hello")`) without going through chat models
- **No run manager context**: When there's no active callback/tracing infrastructure
### ID Priority System
LangChain follows a clear precedence system for message IDs:
1. **Provider-assigned IDs** (highest priority): IDs from the model provider
2. **LangChain run IDs** (medium priority): IDs starting with `run-`
3. **Manual IDs** (lowest priority): IDs explicitly set by users

View File

@@ -14,7 +14,7 @@
* [Chat models](/docs/concepts/chat_models)
* [Messages](/docs/concepts/messages)
:::
LangChain supports multimodal data as input to chat models:
1. Following provider-specific formats

View File

@@ -53,17 +53,29 @@ This is how you use MessagesPlaceholder.
```python
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage
from langchain_core.messages import HumanMessage, AIMessage
prompt_template = ChatPromptTemplate([
("system", "You are a helpful assistant"),
MessagesPlaceholder("msgs")
])
# Simple example with one message
prompt_template.invoke({"msgs": [HumanMessage(content="hi!")]})
# More complex example with conversation history
messages_to_pass = [
HumanMessage(content="What's the capital of France?"),
AIMessage(content="The capital of France is Paris."),
HumanMessage(content="And what about Germany?")
]
formatted_prompt = prompt_template.invoke({"msgs": messages_to_pass})
print(formatted_prompt)
```
This will produce a list of two messages, the first one being a system message, and the second one being the HumanMessage we passed in.
This will produce a list of four messages total: the system message plus the three messages we passed in (two HumanMessages and one AIMessage).
If we had passed in 5 messages, then it would have produced 6 messages in total (the system message plus the 5 passed in).
This is useful for letting a list of messages be slotted into a particular spot.

View File

@@ -8,7 +8,7 @@
## Overview
Retrieval Augmented Generation (RAG) is a powerful technique that enhances [language models](/docs/concepts/chat_models/) by combining them with external knowledge bases.
Retrieval Augmented Generation (RAG) is a powerful technique that enhances [language models](/docs/concepts/chat_models/) by combining them with external knowledge bases.
RAG addresses [a key limitation of models](https://www.glean.com/blog/how-to-build-an-ai-assistant-for-the-enterprise): models rely on fixed training datasets, which can lead to outdated or incomplete information.
When given a query, RAG systems first search a knowledge base for relevant information.
The system then incorporates this retrieved information into the model's prompt.
@@ -44,7 +44,7 @@ See our conceptual guide on [retrieval](/docs/concepts/retrieval/).
## Adding external knowledge
With a retrieval system in place, we need to pass knowledge from this system to the model.
With a retrieval system in place, we need to pass knowledge from this system to the model.
A RAG pipeline typically achieves this following these steps:
- Receive an input query.
@@ -59,12 +59,12 @@ from langchain_openai import ChatOpenAI
from langchain_core.messages import SystemMessage, HumanMessage
# Define a system prompt that tells the model how to use the retrieved context
system_prompt = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
system_prompt = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use three sentences maximum and keep the answer concise.
Context: {context}:"""
# Define a question
question = """What are the main components of an LLM-powered autonomous agent system?"""
@@ -78,7 +78,7 @@ docs_text = "".join(d.page_content for d in docs)
system_prompt_fmt = system_prompt.format(context=docs_text)
# Create a model
model = ChatOpenAI(model="gpt-4o", temperature=0)
model = ChatOpenAI(model="gpt-4o", temperature=0)
# Generate a response
questions = model.invoke([SystemMessage(content=system_prompt_fmt),

View File

@@ -10,28 +10,28 @@
:::
:::danger[Security]
Some of the concepts reviewed here utilize models to generate queries (e.g., for SQL or graph databases).
There are inherent risks in doing this.
Make sure that your database connection permissions are scoped as narrowly as possible for your application's needs.
This will mitigate, though not eliminate, the risks of building a model-driven system capable of querying databases.
There are inherent risks in doing this.
Make sure that your database connection permissions are scoped as narrowly as possible for your application's needs.
This will mitigate, though not eliminate, the risks of building a model-driven system capable of querying databases.
For more on general security best practices, see our [security guide](/docs/security/).
:::
## Overview
## Overview
Retrieval systems are fundamental to many AI applications, efficiently identifying relevant information from large datasets.
Retrieval systems are fundamental to many AI applications, efficiently identifying relevant information from large datasets.
These systems accommodate various data formats:
- Unstructured text (e.g., documents) is often stored in vector stores or lexical search indexes.
- Structured data is typically housed in relational or graph databases with defined schemas.
Despite the growing diversity in data formats, modern AI applications increasingly aim to make all types of data accessible through natural language interfaces.
Models play a crucial role in this process by translating natural language queries into formats compatible with the underlying search index or database.
Despite the growing diversity in data formats, modern AI applications increasingly aim to make all types of data accessible through natural language interfaces.
Models play a crucial role in this process by translating natural language queries into formats compatible with the underlying search index or database.
This translation enables more intuitive and flexible interactions with complex data structures.
## Key concepts
## Key concepts
![Retrieval](/img/retrieval_concept.png)
@@ -39,20 +39,20 @@ This translation enables more intuitive and flexible interactions with complex d
(2) **Information retrieval**: Search queries are used to fetch information from various retrieval systems.
## Query analysis
## Query analysis
While users typically prefer to interact with retrieval systems using natural language, these systems may require specific query syntax or benefit from certain keywords.
While users typically prefer to interact with retrieval systems using natural language, these systems may require specific query syntax or benefit from certain keywords.
Query analysis serves as a bridge between raw user input and optimized search queries. Some common applications of query analysis include:
1. **Query Re-writing**: Queries can be re-written or expanded to improve semantic or lexical searches.
2. **Query Construction**: Search indexes may require structured queries (e.g., SQL for databases).
Query analysis employs models to transform or construct optimized search queries from raw user input.
Query analysis employs models to transform or construct optimized search queries from raw user input.
### Query re-writing
Retrieval systems should ideally handle a wide spectrum of user inputs, from simple and poorly worded queries to complex, multi-faceted questions.
To achieve this versatility, a popular approach is to use models to transform raw user queries into more effective search queries.
Retrieval systems should ideally handle a wide spectrum of user inputs, from simple and poorly worded queries to complex, multi-faceted questions.
To achieve this versatility, a popular approach is to use models to transform raw user queries into more effective search queries.
This transformation can range from simple keyword extraction to sophisticated query expansion and reformulation.
Here are some key benefits of using models for query analysis in unstructured data retrieval:
@@ -87,7 +87,7 @@ class Questions(BaseModel):
)
# Create an instance of the model and enforce the output structure
model = ChatOpenAI(model="gpt-4o", temperature=0)
model = ChatOpenAI(model="gpt-4o", temperature=0)
structured_model = model.with_structured_output(Questions)
# Define the system prompt
@@ -111,7 +111,7 @@ See our RAG from Scratch videos for a few different specific approaches:
### Query construction
Query analysis also can focus on translating natural language queries into specialized query languages or filters.
Query analysis also can focus on translating natural language queries into specialized query languages or filters.
This translation is crucial for effectively interacting with various types of databases that house structured or semi-structured data.
1. **Structured Data examples**: For relational and graph databases, Domain-Specific Languages (DSLs) are used to query data.
@@ -129,10 +129,10 @@ These approaches leverage models to bridge the gap between user intent and the s
| [Text to SQL](/docs/tutorials/sql_qa/) | If users are asking questions that require information housed in a relational database, accessible via SQL. | This uses an LLM to transform user input into a SQL query. |
| [Text-to-Cypher](/docs/tutorials/graph/) | If users are asking questions that require information housed in a graph database, accessible via Cypher. | This uses an LLM to transform user input into a Cypher query. |
As an example, here is how to use the `SelfQueryRetriever` to convert natural language queries into metadata filters.
As an example, here is how to use the `SelfQueryRetriever` to convert natural language queries into metadata filters.
```python
metadata_field_info = schema_for_metadata
metadata_field_info = schema_for_metadata
document_content_description = "Brief summary of a movie"
llm = ChatOpenAI(temperature=0)
retriever = SelfQueryRetriever.from_llm(
@@ -149,20 +149,20 @@ retriever = SelfQueryRetriever.from_llm(
* See our [blog post overview](https://blog.langchain.dev/query-construction/).
* See our RAG from Scratch video on [query construction](https://youtu.be/kl6NwWYxvbM?feature=shared).
:::
:::
## Information retrieval
## Information retrieval
### Common retrieval systems
#### Lexical search indexes
Many search engines are based upon matching words in a query to the words in each document.
Many search engines are based upon matching words in a query to the words in each document.
This approach is called lexical retrieval, using search [algorithms that are typically based upon word frequencies](https://cameronrwolfe.substack.com/p/the-basics-of-ai-powered-vector-search?utm_source=profile&utm_medium=reader2).
The intution is simple: a word appears frequently both in the users query and a particular document, then this document might be a good match.
The particular data structure used to implement this is often an [*inverted index*](https://www.geeksforgeeks.org/inverted-index/).
This types of index contains a list of words and a mapping of each word to a list of locations at which it occurs in various documents.
This types of index contains a list of words and a mapping of each word to a list of locations at which it occurs in various documents.
Using this data structure, it is possible to efficiently match the words in search queries to the documents in which they appear.
[BM25](https://en.wikipedia.org/wiki/Okapi_BM25#:~:text=BM25%20is%20a%20bag%2Dof,slightly%20different%20components%20and%20parameters.) and [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) are [two popular lexical search algorithms](https://cameronrwolfe.substack.com/p/the-basics-of-ai-powered-vector-search?utm_source=profile&utm_medium=reader2).
@@ -171,13 +171,13 @@ Using this data structure, it is possible to efficiently match the words in sear
* See the [BM25](/docs/integrations/retrievers/bm25/) retriever integration.
* See the [Elasticsearch](/docs/integrations/retrievers/elasticsearch_retriever/) retriever integration.
:::
:::
#### Vector indexes
Vector indexes are an alternative way to index and store unstructured data.
See our conceptual guide on [vectorstores](/docs/concepts/vectorstores/) for a detailed overview.
In short, rather than using word frequencies, vectorstores use an [embedding model](/docs/concepts/embedding_models/) to compress documents into high-dimensional vector representation.
See our conceptual guide on [vectorstores](/docs/concepts/vectorstores/) for a detailed overview.
In short, rather than using word frequencies, vectorstores use an [embedding model](/docs/concepts/embedding_models/) to compress documents into high-dimensional vector representation.
This allows for efficient similarity search over embedding vectors using simple mathematical operations like cosine similarity.
:::info[Further reading]
@@ -190,9 +190,9 @@ This allows for efficient similarity search over embedding vectors using simple
#### Relational databases
Relational databases are a fundamental type of structured data storage used in many applications.
They organize data into tables with predefined schemas, where each table represents an entity or relationship.
Data is stored in rows (records) and columns (attributes), allowing for efficient querying and manipulation through SQL (Structured Query Language).
Relational databases are a fundamental type of structured data storage used in many applications.
They organize data into tables with predefined schemas, where each table represents an entity or relationship.
Data is stored in rows (records) and columns (attributes), allowing for efficient querying and manipulation through SQL (Structured Query Language).
Relational databases excel at maintaining data integrity, supporting complex queries, and handling relationships between different data entities.
:::info[Further reading]
@@ -204,8 +204,8 @@ Relational databases excel at maintaining data integrity, supporting complex que
#### Graph databases
Graph databases are a specialized type of database designed to store and manage highly interconnected data.
Unlike traditional relational databases, graph databases use a flexible structure consisting of nodes (entities), edges (relationships), and properties.
Graph databases are a specialized type of database designed to store and manage highly interconnected data.
Unlike traditional relational databases, graph databases use a flexible structure consisting of nodes (entities), edges (relationships), and properties.
This structure allows for efficient representation and querying of complex, interconnected data.
Graph databases store data in a graph structure, with nodes, edges, and properties.
They are particularly useful for storing and querying complex relationships between data points, such as social networks, supply-chain management, fraud detection, and recommendation services
@@ -213,12 +213,12 @@ They are particularly useful for storing and querying complex relationships betw
:::info[Further reading]
* See our [tutorial](/docs/tutorials/graph/) for working with graph databases.
* See our [list of graph database integrations](/docs/integrations/graphs/).
* See our [list of graph database integrations](/docs/integrations/graphs/).
* See Neo4j's [starter kit for LangChain](https://neo4j.com/developer-blog/langchain-neo4j-starter-kit/).
:::
### Retriever
### Retriever
LangChain provides a unified interface for interacting with various retrieval systems through the [retriever](/docs/concepts/retrievers/) concept. The interface is straightforward:

View File

@@ -23,16 +23,16 @@ The LangChain [retriever](/docs/concepts/retrievers/) interface is straightforwa
## Key concept
![Retriever](/img/retriever_concept.png)
All retrievers implement a simple interface for retrieving documents using natural language queries.
## Interface
## Interface
The only requirement for a retriever is the ability to accepts a query and return documents.
The only requirement for a retriever is the ability to accepts a query and return documents.
In particular, [LangChain's retriever class](https://python.langchain.com/api_reference/core/retrievers/langchain_core.retrievers.BaseRetriever.html#) only requires that the `_get_relevant_documents` method is implemented, which takes a `query: str` and returns a list of [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) objects that are most relevant to the query.
The underlying logic used to get relevant documents is specified by the retriever and can be whatever is most useful for the application.
A LangChain retriever is a [runnable](/docs/how_to/lcel_cheatsheet/), which is a standard interface for LangChain components.
A LangChain retriever is a [runnable](/docs/how_to/lcel_cheatsheet/), which is a standard interface for LangChain components.
This means that it has a few common methods, including `invoke`, that are used to interact with it. A retriever can be invoked with a query:
```python
@@ -42,23 +42,23 @@ docs = retriever.invoke(query)
Retrievers return a list of [Document](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html) objects, which have two attributes:
* `page_content`: The content of this document. Currently is a string.
* `metadata`: Arbitrary metadata associated with this document (e.g., document id, file name, source, etc).
* `metadata`: Arbitrary metadata associated with this document (e.g., document id, file name, source, etc).
:::info[Further reading]
* See our [how-to guide](/docs/how_to/custom_retriever/) on building your own custom retriever.
:::
## Common types
Despite the flexibility of the retriever interface, a few common types of retrieval systems are frequently used.
### Search apis
It's important to note that retrievers don't need to actually *store* documents.
For example, we can build retrievers on top of search APIs that simply return search results!
See our retriever integrations with [Amazon Kendra](/docs/integrations/retrievers/amazon_kendra_retriever/) or [Wikipedia Search](/docs/integrations/retrievers/wikipedia/).
It's important to note that retrievers don't need to actually *store* documents.
For example, we can build retrievers on top of search APIs that simply return search results!
See our retriever integrations with [Amazon Kendra](/docs/integrations/retrievers/amazon_kendra_retriever/) or [Wikipedia Search](/docs/integrations/retrievers/wikipedia/).
### Relational or graph database
@@ -75,7 +75,7 @@ For example, you can build a retriever for a SQL database using text-to-SQL conv
### Lexical search
As discussed in our conceptual review of [retrieval](/docs/concepts/retrieval/), many search engines are based upon matching words in a query to the words in each document.
As discussed in our conceptual review of [retrieval](/docs/concepts/retrieval/), many search engines are based upon matching words in a query to the words in each document.
[BM25](https://en.wikipedia.org/wiki/Okapi_BM25#:~:text=BM25%20is%20a%20bag%2Dof,slightly%20different%20components%20and%20parameters.) and [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) are [two popular lexical search algorithms](https://cameronrwolfe.substack.com/p/the-basics-of-ai-powered-vector-search?utm_source=profile&utm_medium=reader2).
LangChain has retrievers for many popular lexical search algorithms / engines.
@@ -85,11 +85,11 @@ LangChain has retrievers for many popular lexical search algorithms / engines.
* See the [TF-IDF](/docs/integrations/retrievers/tf_idf/) retriever integration.
* See the [Elasticsearch](/docs/integrations/retrievers/elasticsearch_retriever/) retriever integration.
:::
:::
### Vector store
### Vector store
[Vector stores](/docs/concepts/vectorstores/) are a powerful and efficient way to index and retrieve unstructured data.
[Vector stores](/docs/concepts/vectorstores/) are a powerful and efficient way to index and retrieve unstructured data.
A vectorstore can be used as a retriever by calling the `as_retriever()` method.
```python
@@ -99,7 +99,7 @@ retriever = vectorstore.as_retriever()
## Advanced retrieval patterns
### Ensemble
### Ensemble
Because the retriever interface is so simple, returning a list of `Document` objects given a search query, it is possible to combine multiple retrievers using ensembling.
This is particularly useful when you have multiple retrievers that are good at finding different types of relevant documents.
@@ -112,24 +112,24 @@ ensemble_retriever = EnsembleRetriever(
)
```
When ensembling, how do we combine search results from many retrievers?
When ensembling, how do we combine search results from many retrievers?
This motivates the concept of re-ranking, which takes the output of multiple retrievers and combines them using a more sophisticated algorithm such as [Reciprocal Rank Fusion (RRF)](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf).
### Source document retention
### Source document retention
Many retrievers utilize some kind of index to make documents easily searchable.
The process of indexing can include a transformation step (e.g., vectorstores often use document splitting).
The process of indexing can include a transformation step (e.g., vectorstores often use document splitting).
Whatever transformation is used, can be very useful to retain a link between the *transformed document* and the original, giving the retriever the ability to return the *original* document.
![Retrieval with full docs](/img/retriever_full_docs.png)
This is particularly useful in AI applications, because it ensures no loss in document context for the model.
For example, you may use small chunk size for indexing documents in a vectorstore.
If you return *only* the chunks as the retrieval result, then the model will have lost the original document context for the chunks.
For example, you may use small chunk size for indexing documents in a vectorstore.
If you return *only* the chunks as the retrieval result, then the model will have lost the original document context for the chunks.
LangChain has two different retrievers that can be used to address this challenge.
The [Multi-Vector](/docs/how_to/multi_vector/) retriever allows the user to use any document transformation (e.g., use an LLM to write a summary of the document) for indexing while retaining linkage to the source document.
The [ParentDocument](/docs/how_to/parent_document_retriever/) retriever links document chunks from a text-splitter transformation for indexing while retaining linkage to the source document.
LangChain has two different retrievers that can be used to address this challenge.
The [Multi-Vector](/docs/how_to/multi_vector/) retriever allows the user to use any document transformation (e.g., use an LLM to write a summary of the document) for indexing while retaining linkage to the source document.
The [ParentDocument](/docs/how_to/parent_document_retriever/) retriever links document chunks from a text-splitter transformation for indexing while retaining linkage to the source document.
| Name | Index Type | Uses an LLM | When to Use | Description |
|-----------------------------------------------------------|-------------------------------|---------------------------|-----------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|

View File

@@ -107,7 +107,7 @@ The Runnable interface provides methods to get the [JSON Schema](https://json-sc
These APIs are mostly used internally for unit-testing and by [LangServe](/docs/concepts/architecture#langserve) which uses the APIs for input validation and generation of [OpenAPI documentation](https://www.openapis.org/).
In addition, to the input and output types, some Runnables have been set up with additional run time configuration options.
In addition, to the input and output types, some Runnables have been set up with additional run time configuration options.
There are corresponding APIs to get the Pydantic Schema and JSON Schema of the configuration options for the Runnable.
Please see the [Configurable Runnables](#configurable-runnables) section for more information.
@@ -151,12 +151,12 @@ Passing `config` to the `invoke` method is done like so:
```python
some_runnable.invoke(
some_input,
some_input,
config={
'run_name': 'my_run',
'tags': ['tag1', 'tag2'],
'run_name': 'my_run',
'tags': ['tag1', 'tag2'],
'metadata': {'key': 'value'}
}
)
```
@@ -185,13 +185,13 @@ There are two main patterns by which new `Runnables` are created:
foo_runnable = RunnableLambda(foo)
```
LangChain will try to propagate `RunnableConfig` automatically for both of the patterns.
LangChain will try to propagate `RunnableConfig` automatically for both of the patterns.
For handling the second pattern, LangChain relies on Python's [contextvars](https://docs.python.org/3/library/contextvars.html).
In Python 3.11 and above, this works out of the box, and you do not need to do anything special to propagate the `RunnableConfig` to the sub-calls.
In Python 3.9 and 3.10, if you are using **async code**, you need to manually pass the `RunnableConfig` through to the `Runnable` when invoking it.
In Python 3.9 and 3.10, if you are using **async code**, you need to manually pass the `RunnableConfig` through to the `Runnable` when invoking it.
This is due to a limitation in [asyncio's tasks](https://docs.python.org/3/library/asyncio-task.html#asyncio.create_task) in Python 3.9 and 3.10 which did
not accept a `context` argument.
@@ -201,7 +201,7 @@ Propagating the `RunnableConfig` manually is done like so:
```python
async def foo(input, config): # <-- Note the config argument
return await bar_runnable.ainvoke(input, config=config)
foo_runnable = RunnableLambda(foo)
```
@@ -235,7 +235,7 @@ The attributes will also be propagated to [callbacks](/docs/concepts/callbacks),
This is an advanced feature that is unnecessary for most users.
:::
You may need to set a custom `run_id` for a given run, in case you want
You may need to set a custom `run_id` for a given run, in case you want
to reference it later or correlate it with other systems.
The `run_id` MUST be a valid UUID string and **unique** for each run. It is used to identify
@@ -249,7 +249,7 @@ import uuid
run_id = uuid.uuid4()
some_runnable.invoke(
some_input,
some_input,
config={
'run_id': run_id
}
@@ -292,7 +292,7 @@ In addition, you can use it to specify any custom configuration options to pass
### Setting callbacks
Use this option to configure [callbacks](/docs/concepts/callbacks) for the runnable at
Use this option to configure [callbacks](/docs/concepts/callbacks) for the runnable at
runtime. The callbacks will be passed to all sub-calls made by the runnable.
```python

View File

@@ -52,7 +52,7 @@ In addition, there is a **legacy** async [astream_log](https://python.langchain.
The `stream()` method returns an iterator that yields chunks of output synchronously as they are produced. You can use a `for` loop to process each chunk in real-time. For example, when using an LLM, this allows the output to be streamed incrementally as it is generated, reducing the wait time for users.
The type of chunk yielded by the `stream()` and `astream()` methods depends on the component being streamed. For example, when streaming from an [LLM](/docs/concepts/chat_models) each component will be an [AIMessageChunk](/docs/concepts/messages#aimessagechunk); however, for other components, the chunk may be different.
The type of chunk yielded by the `stream()` and `astream()` methods depends on the component being streamed. For example, when streaming from an [LLM](/docs/concepts/chat_models) each component will be an [AIMessageChunk](/docs/concepts/messages#aimessagechunk); however, for other components, the chunk may be different.
The `stream()` method returns an iterator that yields these chunks as they are produced. For example,
@@ -99,7 +99,7 @@ If you compose multiple Runnables using [LangChains Expression Language (LCEL
<span data-heading-keywords="astream_events,stream_events,stream events"></span>
:::tip
Use the `astream_events` API to access custom data and intermediate outputs from LLM applications built entirely with [LCEL](/docs/concepts/lcel).
Use the `astream_events` API to access custom data and intermediate outputs from LLM applications built entirely with [LCEL](/docs/concepts/lcel).
While this API is available for use with [LangGraph](/docs/concepts/architecture#langgraph) as well, it is usually not necessary when working with LangGraph, as the `stream` and `astream` methods provide comprehensive streaming capabilities for LangGraph graphs.
:::
@@ -119,7 +119,7 @@ from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_anthropic import ChatAnthropic
model = ChatAnthropic(model="claude-3-sonnet-20240229")
model = ChatAnthropic(model="claude-3-7-sonnet-20250219")
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
parser = StrOutputParser()
@@ -148,7 +148,7 @@ LangChain simplifies streaming from [chat models](/docs/concepts/chat_models) by
### How It Works
When you call the `invoke` (or `ainvoke`) method on a chat model, LangChain will automatically switch to streaming mode if it detects that you are trying to stream the overall application.
When you call the `invoke` (or `ainvoke`) method on a chat model, LangChain will automatically switch to streaming mode if it detects that you are trying to stream the overall application.
Under the hood, it'll have `invoke` (or `ainvoke`) use the `stream` (or `astream`) method to generate its output. The result of the invocation will be the same as far as the code that was using `invoke` is concerned; however, while the chat model is being streamed, LangChain will take care of invoking `on_llm_new_token` events in LangChain's [callback system](/docs/concepts/callbacks). These callback events
allow LangGraph `stream`/`astream` and `astream_events` to surface the chat model's output in real-time.
@@ -158,14 +158,14 @@ Example:
```python
def node(state):
...
# The code below uses the invoke method, but LangChain will
# The code below uses the invoke method, but LangChain will
# automatically switch to streaming mode
# when it detects that the overall
# when it detects that the overall
# application is being streamed.
ai_message = model.invoke(state["messages"])
...
for chunk in compiled_graph.stream(..., mode="messages"):
for chunk in compiled_graph.stream(..., mode="messages"):
...
```
## Async Programming

View File

@@ -1,15 +1,15 @@
# Structured outputs
## Overview
## Overview
For many applications, such as chatbots, models need to respond to users directly in natural language.
However, there are scenarios where we need models to output in a *structured format*.
For many applications, such as chatbots, models need to respond to users directly in natural language.
However, there are scenarios where we need models to output in a *structured format*.
For example, we might want to store the model output in a database and ensure that the output conforms to the database schema.
This need motivates the concept of structured output, where models can be instructed to respond with a particular output structure.
![Structured output](/img/structured_output.png)
## Key concepts
## Key concepts
1. **Schema definition:** The output structure is represented as a schema, which can be defined in several ways.<br/>
2. **Returning structured output:** The model is given this schema, and is instructed to return output that conforms to it.
@@ -18,7 +18,7 @@ This need motivates the concept of structured output, where models can be instru
This pseudocode illustrates the recommended workflow when using structured output.
LangChain provides a method, [`with_structured_output()`](/docs/how_to/structured_output/#the-with_structured_output-method), that automates the process of binding the schema to the [model](/docs/concepts/chat_models/) and parsing the output.
This helper function is available for all model providers that support structured output.
This helper function is available for all model providers that support structured output.
```python
# Define schema
@@ -29,9 +29,25 @@ model_with_structure = model.with_structured_output(schema)
structured_output = model_with_structure.invoke(user_input)
```
:::warning[Tool Order Matters]
When combining structured output with additional tools, bind tools **first**, then apply structured output:
```python
# Correct
model_with_tools = model.bind_tools([tool1, tool2])
structured_model = model_with_tools.with_structured_output(schema)
# Incorrect - will cause tool resolution errors
structured_model = model.with_structured_output(schema)
broken_model = structured_model.bind_tools([tool1, tool2])
```
:::
## Schema definition
The central concept is that the output structure of model responses needs to be represented in some way.
The central concept is that the output structure of model responses needs to be represented in some way.
While types of objects you can use depend on the model you're working with, there are common types of objects that are typically allowed or recommended for structured output in Python.
The simplest and most common format for structured output is a JSON-like structure, which in Python can be represented as a dictionary (dict) or list (list).
@@ -45,7 +61,7 @@ JSON objects (or dicts in Python) are often used directly when the tool requires
```
As a second example, [Pydantic](https://docs.pydantic.dev/latest/) is particularly useful for defining structured output schemas because it offers type hints and validation.
Here's an example of a Pydantic schema:
Here's an example of a Pydantic schema:
```python
from pydantic import BaseModel, Field
@@ -59,7 +75,7 @@ class ResponseFormatter(BaseModel):
## Returning structured output
With a schema defined, we need a way to instruct the model to use it.
While one approach is to include this schema in the prompt and *ask nicely* for the model to use it, this is not recommended.
While one approach is to include this schema in the prompt and *ask nicely* for the model to use it, this is not recommended.
Several more powerful methods that utilizes native features in the model provider's API are available.
### Using tool calling
@@ -78,7 +94,7 @@ model_with_tools = model.bind_tools([ResponseFormatter])
ai_msg = model_with_tools.invoke("What is the powerhouse of the cell?")
```
The arguments of the tool call are already extracted as a dictionary.
The arguments of the tool call are already extracted as a dictionary.
This dictionary can be optionally parsed into a Pydantic object, matching our original `ResponseFormatter` schema.
```python
@@ -92,7 +108,7 @@ pydantic_object = ResponseFormatter.model_validate(ai_msg.tool_calls[0]["args"])
### JSON mode
In addition to tool calling, some model providers support a feature called `JSON mode`.
In addition to tool calling, some model providers support a feature called `JSON mode`.
This supports JSON schema definition as input and enforces the model to produce a conforming JSON output.
You can find a table of model providers that support JSON mode [here](/docs/integrations/chat/).
Here is an example of how to use JSON mode with OpenAI:
@@ -105,21 +121,21 @@ ai_msg
{'random_ints': [45, 67, 12, 34, 89, 23, 78, 56, 90, 11]}
```
## Structured output method
## Structured output method
There are a few challenges when producing structured output with the above methods:
There are a few challenges when producing structured output with the above methods:
1. When tool calling is used, tool call arguments needs to be parsed from a dictionary back to the original schema.<br/>
1. When tool calling is used, tool call arguments needs to be parsed from a dictionary back to the original schema.<br/>
2. In addition, the model needs to be instructed to *always* use the tool when we want to enforce structured output, which is a provider specific setting.<br/>
2. In addition, the model needs to be instructed to *always* use the tool when we want to enforce structured output, which is a provider specific setting.<br/>
3. When JSON mode is used, the output needs to be parsed into a JSON object.
3. When JSON mode is used, the output needs to be parsed into a JSON object.
With these challenges in mind, LangChain provides a helper function (`with_structured_output()`) to streamline the process.
![Diagram of with structured output](/img/with_structured_output.png)
This both binds the schema to the model as a tool and parses the output to the specified output schema.
This both binds the schema to the model as a tool and parses the output to the specified output schema.
```python
# Bind the schema to the model

View File

@@ -23,9 +23,9 @@ def test_convert_to_openai_messages():
ToolCall(name='parrot_multiply_tool', id='1', args={'a': 2, 'b': 3}),
]
)
result = convert_to_openai_messages(ai_message)
expected = {
"role": "assistant",
"tool_calls": [

View File

@@ -7,4 +7,4 @@ You are probably looking for the [Chat Model Concept Guide](/docs/concepts/chat_
LangChain has implementations for older language models that take a string as input and return a string as output. These models are typically named without the "Chat" prefix (e.g., `Ollama`, `Anthropic`, `OpenAI`, etc.), and may include the "LLM" suffix (e.g., `OllamaLLM`, `AnthropicLLM`, `OpenAILLM`, etc.). These models implement the [BaseLLM](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.llms.BaseLLM.html#langchain_core.language_models.llms.BaseLLM) interface.
Users should be using almost exclusively the newer [Chat Models](/docs/concepts/chat_models) as most
model providers have adopted a chat like interface for interacting with language models.
model providers have adopted a chat like interface for interacting with language models.

View File

@@ -69,7 +69,7 @@ texts = text_splitter.split_text(document)
### Text-structured based
Text is naturally organized into hierarchical units such as paragraphs, sentences, and words.
Text is naturally organized into hierarchical units such as paragraphs, sentences, and words.
We can leverage this inherent structure to inform our splitting strategy, creating split that maintain natural language flow, maintain semantic coherence within split, and adapts to varying levels of text granularity.
LangChain's [`RecursiveCharacterTextSplitter`](/docs/how_to/recursive_text_splitter/) implements this concept:
- The `RecursiveCharacterTextSplitter` attempts to keep larger units (e.g., paragraphs) intact.
@@ -92,7 +92,7 @@ texts = text_splitter.split_text(document)
### Document-structured based
Some documents have an inherent structure, such as HTML, Markdown, or JSON files.
Some documents have an inherent structure, such as HTML, Markdown, or JSON files.
In these cases, it's beneficial to split the document based on its structure, as it often naturally groups semantically related text.
Key benefits of structure-based splitting:
- Preserves the logical organization of the document
@@ -116,7 +116,7 @@ Examples of structure-based splitting:
### Semantic meaning based
Unlike the previous methods, semantic-based splitting actually considers the *content* of the text.
Unlike the previous methods, semantic-based splitting actually considers the *content* of the text.
While other approaches use document or text structure as proxies for semantic meaning, this method directly analyzes the text's semantics.
There are several ways to implement this, but conceptually the approach is split text when there are significant changes in text *meaning*.
As an example, we can use a sliding window approach to generate embeddings, and compare the embeddings to find significant differences:

View File

@@ -55,4 +55,4 @@ According to the OpenAI post, the approximate token counts for English text are
* 1 token ~= 4 chars in English
* 1 token ~= ¾ words
* 100 tokens ~= 75 words
* 100 tokens ~= 75 words

View File

@@ -6,7 +6,7 @@
:::
## Overview
## Overview
Many AI applications interact directly with humans. In these cases, it is appropriate for models to respond in natural language.
But what about cases where we want a model to also interact *directly* with systems, such as databases or an API?
@@ -14,12 +14,12 @@ These systems often have a particular input schema; for example, APIs frequently
This need motivates the concept of *tool calling*. You can use [tool calling](https://platform.openai.com/docs/guides/function-calling/example-use-cases) to request model responses that match a particular schema.
:::info
You will sometimes hear the term `function calling`. We use this term interchangeably with `tool calling`.
You will sometimes hear the term `function calling`. We use this term interchangeably with `tool calling`.
:::
![Conceptual overview of tool calling](/img/tool_calling_concept.png)
## Key concepts
## Key concepts
1. **Tool Creation:** Use the [@tool](https://python.langchain.com/api_reference/core/tools/langchain_core.tools.convert.tool.html) decorator to create a [tool](/docs/concepts/tools). A tool is an association between a function and its schema.<br/>
2. **Tool Binding:** The tool needs to be connected to a model that supports tool calling. This gives the model awareness of the tool and the associated input schema required by the tool.<br/>
@@ -40,7 +40,7 @@ The tool call arguments can be passed directly to the tool.
tools = [my_tool]
# Tool binding
model_with_tools = model.bind_tools(tools)
# Tool calling
# Tool calling
response = model_with_tools.invoke(user_input)
```
@@ -65,16 +65,16 @@ def multiply(a: int, b: int) -> int:
:::
## Tool binding
## Tool binding
[Many](https://platform.openai.com/docs/guides/function-calling) [model providers](https://platform.openai.com/docs/guides/function-calling) support tool calling.
[Many](https://platform.openai.com/docs/guides/function-calling) [model providers](https://platform.openai.com/docs/guides/function-calling) support tool calling.
:::tip
See our [model integration page](/docs/integrations/chat/) for a list of providers that support tool calling.
:::
The central concept to understand is that LangChain provides a standardized interface for connecting tools to models.
The `.bind_tools()` method can be used to specify which tools are available for a model to call.
The central concept to understand is that LangChain provides a standardized interface for connecting tools to models.
The `.bind_tools()` method can be used to specify which tools are available for a model to call.
```python
model_with_tools = model.bind_tools(tools_list)
@@ -113,7 +113,7 @@ However, if we pass an input *relevant to the tool*, the model should choose to
result = llm_with_tools.invoke("What is 2 multiplied by 3?")
```
As before, the output `result` will be an `AIMessage`.
As before, the output `result` will be an `AIMessage`.
But, if the tool was called, `result` will have a `tool_calls` [attribute](https://python.langchain.com/api_reference/core/messages/langchain_core.messages.ai.AIMessage.html#langchain_core.messages.ai.AIMessage.tool_calls).
This attribute includes everything needed to execute the tool, including the tool name and input arguments:

View File

@@ -6,7 +6,7 @@
## Overview
The **tool** abstraction in LangChain associates a Python **function** with a **schema** that defines the function's **name**, **description** and **expected arguments**.
The **tool** abstraction in LangChain associates a Python **function** with a **schema** that defines the function's **name**, **description** and **expected arguments**.
**Tools** can be passed to [chat models](/docs/concepts/chat_models) that support [tool calling](/docs/concepts/tool_calling) allowing the model to request the execution of a specific function with specific inputs.
@@ -31,7 +31,7 @@ The key attributes that correspond to the tool's **schema**:
The key methods to execute the function associated with the **tool**:
- **invoke**: Invokes the tool with the given arguments.
- **ainvoke**: Invokes the tool with the given arguments, asynchronously. Used for [async programming with Langchain](/docs/concepts/async).
- **ainvoke**: Invokes the tool with the given arguments, asynchronously. Used for [async programming with LangChain](/docs/concepts/async).
## Create tools using the `@tool` decorator
@@ -68,10 +68,10 @@ You can also inspect the tool's schema and other properties:
```python
print(multiply.name) # multiply
print(multiply.description) # Multiply two numbers.
print(multiply.args)
print(multiply.args)
# {
# 'type': 'object',
# 'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}},
# 'type': 'object',
# 'properties': {'a': {'type': 'integer'}, 'b': {'type': 'integer'}},
# 'required': ['a', 'b']
# }
```
@@ -89,14 +89,14 @@ Please see the [API reference for @tool](https://python.langchain.com/api_refere
## Tool artifacts
**Tools** are utilities that can be called by a model, and whose outputs are designed to be fed back to a model. Sometimes, however, there are artifacts of a tool's execution that we want to make accessible to downstream components in our chain or agent, but that we don't want to expose to the model itself. For example if a tool returns a custom object, a dataframe or an image, we may want to pass some metadata about this output to the model without passing the actual output to the model. At the same time, we may want to be able to access this full output elsewhere, for example in downstream tools.
**Tools** are utilities that can be called by a model, and whose outputs are designed to be fed back to a model. Sometimes, however, there are artifacts of a tool's execution that we want to make accessible to downstream components in our chain or agent, but that we don't want to expose to the model itself. For example if a tool returns a custom object, a dataframe or an image, we may want to pass some metadata about this output to the model without passing the actual output. At the same time, we may want to be able to access this full output elsewhere, for example in downstream tools.
```python
@tool(response_format="content_and_artifact")
def some_tool(...) -> Tuple[str, Any]:
"""Tool that does something."""
...
return 'Message for chat model', some_artifact
return 'Message for chat model', some_artifact
```
See [how to return artifacts from tools](/docs/how_to/tool_artifacts/) for more details.
@@ -134,7 +134,7 @@ def user_specific_tool(input_data: str, user_id: InjectedToolArg) -> str:
Annotating the `user_id` argument with `InjectedToolArg` tells LangChain that this argument should not be exposed as part of the
tool's schema.
See [how to pass run time values to tools](/docs/how_to/tool_runtime/) for more details on how to use `InjectedToolArg`.
See [how to pass run time values to tools](/docs/how_to/tool_runtime/) for more details on how to use `InjectedToolArg`.
### RunnableConfig
@@ -171,6 +171,26 @@ Please see the [InjectedState](https://langchain-ai.github.io/langgraph/referenc
Please see the [InjectedStore](https://langchain-ai.github.io/langgraph/reference/prebuilt/#langgraph.prebuilt.tool_node.InjectedStore) documentation for more details.
## Tool Artifacts vs. Injected State
Although similar conceptually, tool artifacts in LangChain and [injected state in LangGraph](https://langchain-ai.github.io/langgraph/reference/agents/#langgraph.prebuilt.tool_node.InjectedState) serve different purposes and operate at different levels of abstraction.
**Tool Artifacts**
- **Purpose:** Store and pass data between tool executions within a single chain/workflow
- **Scope:** Limited to tool-to-tool communication
- **Lifecycle:** Tied to individual tool calls and their immediate context
- **Usage:** Temporary storage for intermediate results that tools need to share
**Injected State (LangGraph)**
- **Purpose:** Maintain persistent state across the entire graph execution
- **Scope:** Global to the entire graph workflow
- **Lifecycle:** Persists throughout the entire graph execution and can be saved/restored
- **Usage:** Long-term state management, conversation memory, user context, workflow checkpointing
Tool artifacts are ephemeral data passed between tools, while injected state is persistent workflow-level state that survives across multiple steps, tool calls, and even execution sessions in LangGraph.
## Best practices
When designing tools to be used by models, keep the following in mind:

View File

@@ -7,4 +7,4 @@ Traces contain individual steps called `runs`. These can be individual calls fro
tool, or sub-chains.
Tracing gives you observability inside your chains and agents, and is vital in diagnosing issues.
For a deeper dive, check out [this LangSmith conceptual guide](https://docs.smith.langchain.com/concepts/tracing).
For a deeper dive, check out [this LangSmith conceptual guide](https://docs.langchain.com/langsmith/observability-quickstart).

View File

@@ -9,7 +9,7 @@
:::
:::info[Note]
This conceptual overview focuses on text-based indexing and retrieval for simplicity.
This conceptual overview focuses on text-based indexing and retrieval for simplicity.
However, embedding models can be [multi-modal](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings)
and vector stores can be used to store and retrieve a variety of data types beyond text.
:::
@@ -125,7 +125,7 @@ to the documentation of the specific vectorstore you are using to see what simil
Given a similarity metric to measure the distance between the embedded query and any embedded document, we need an algorithm to efficiently search over *all* the embedded documents to find the most similar ones.
There are various ways to do this. As an example, many vectorstores implement [HNSW (Hierarchical Navigable Small World)](https://www.pinecone.io/learn/series/faiss/hnsw/), a graph-based index structure that allows for efficient similarity search.
Regardless of the search algorithm used under the hood, the LangChain vectorstore interface has a `similarity_search` method for all integrations.
Regardless of the search algorithm used under the hood, the LangChain vectorstore interface has a `similarity_search` method for all integrations.
This will take the search query, create an embedding, find similar documents, and return them as a list of [Documents](https://python.langchain.com/api_reference/core/documents/langchain_core.documents.base.Document.html).
```python
@@ -166,7 +166,7 @@ vectorstore.similarity_search(
k=2,
filter={"source": "tweet"},
)
```
```
:::info[Further reading]
@@ -179,7 +179,7 @@ vectorstore.similarity_search(
While algorithms like HNSW provide the foundation for efficient similarity search in many cases, additional techniques can be employed to improve search quality and diversity.
For example, [maximal marginal relevance](https://python.langchain.com/v0.1/docs/modules/model_io/prompts/example_selectors/mmr/) is a re-ranking algorithm used to diversify search results, which is applied after the initial similarity search to ensure a more diverse set of results.
As a second example, some [vector stores](/docs/integrations/retrievers/pinecone_hybrid_search/) offer built-in [hybrid-search](https://docs.pinecone.io/guides/data/understanding-hybrid-search) to combine keyword and semantic similarity search, which marries the benefits of both approaches.
As a second example, some [vector stores](/docs/integrations/retrievers/pinecone_hybrid_search/) offer built-in [hybrid-search](https://docs.pinecone.io/guides/data/understanding-hybrid-search) to combine keyword and semantic similarity search, which marries the benefits of both approaches.
At the moment, there is no unified way to perform hybrid search using LangChain vectorstores, but it is generally exposed as a keyword argument that is passed in with `similarity_search`.
See this [how-to guide on hybrid search](/docs/how_to/hybrid/) for more details.
@@ -188,4 +188,4 @@ See this [how-to guide on hybrid search](/docs/how_to/hybrid/) for more details.
| [Hybrid search](/docs/integrations/retrievers/pinecone_hybrid_search/) | When combining keyword-based and semantic similarity. | Hybrid search combines keyword and semantic similarity, marrying the benefits of both approaches. [Paper](https://arxiv.org/abs/2210.11934). |
| [Maximal Marginal Relevance (MMR)](https://python.langchain.com/api_reference/pinecone/vectorstores/langchain_pinecone.vectorstores.PineconeVectorStore.html#langchain_pinecone.vectorstores.PineconeVectorStore.max_marginal_relevance_search) | When needing to diversify search results. | MMR attempts to diversify the results of a search to avoid returning similar and redundant documents. |

View File

@@ -18,7 +18,7 @@ LangChain exposes a standard interface for key components, making it easy to swi
3. **Observability and evaluation:** As applications become more complex, it becomes increasingly difficult to understand what is happening within them.
Furthermore, the pace of development can become rate-limited by the [paradox of choice](https://en.wikipedia.org/wiki/Paradox_of_choice).
For example, developers often wonder how to engineer their prompt or which LLM best balances accuracy, latency, and cost.
For example, developers often wonder how to engineer their prompt or which LLM best balances accuracy, latency, and cost.
[Observability](https://en.wikipedia.org/wiki/Observability) and evaluations can help developers monitor their applications and rapidly answer these types of questions with confidence.
@@ -29,10 +29,10 @@ As an example, all [chat models](/docs/concepts/chat_models/) implement the [Bas
This provides a standard way to interact with chat models, supporting important but often provider-specific features like [tool calling](/docs/concepts/tool_calling/) and [structured outputs](/docs/concepts/structured_outputs/).
### Example: chat models
### Example: chat models
Many [model providers](/docs/concepts/chat_models/) support [tool calling](/docs/concepts/tool_calling/), a critical feature for many applications (e.g., [agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/)), that allows a developer to request model responses that match a particular schema.
The APIs for each provider differ.
The APIs for each provider differ.
LangChain's [chat model](/docs/concepts/chat_models/) interface provides a common way to bind [tools](/docs/concepts/tools) to a model in order to support [tool calling](/docs/concepts/tool_calling/):
```python
@@ -42,7 +42,7 @@ tools = [my_tool]
model_with_tools = model.bind_tools(tools)
```
Similarly, getting models to produce [structured outputs](/docs/concepts/structured_outputs/) is an extremely common use case.
Similarly, getting models to produce [structured outputs](/docs/concepts/structured_outputs/) is an extremely common use case.
Providers support different approaches for this, including [JSON mode or tool calling](https://platform.openai.com/docs/guides/structured-outputs), with different APIs.
LangChain's [chat model](/docs/concepts/chat_models/) interface provides a common way to produce structured outputs using the `with_structured_output()` method:
@@ -62,9 +62,9 @@ The underlying implementation of the retriever depends on the type of data store
documents = my_retriever.invoke("What is the meaning of life?")
```
## Orchestration
## Orchestration
While standardization for individual components is useful, we've increasingly seen that developers want to *combine* components into more complex applications.
While standardization for individual components is useful, we've increasingly seen that developers want to *combine* components into more complex applications.
This motivates the need for [orchestration](https://en.wikipedia.org/wiki/Orchestration_(computing)).
There are several common characteristics of LLM applications that this orchestration layer should support:
@@ -75,7 +75,7 @@ There are several common characteristics of LLM applications that this orchestra
The recommended way to orchestrate components for complex applications is [LangGraph](https://langchain-ai.github.io/langgraph/concepts/high_level/).
LangGraph is a library that gives developers a high degree of control by expressing the flow of the application as a set of nodes and edges.
LangGraph comes with built-in support for [persistence](https://langchain-ai.github.io/langgraph/concepts/persistence/), [human-in-the-loop](https://langchain-ai.github.io/langgraph/concepts/human_in_the_loop/), [memory](https://langchain-ai.github.io/langgraph/concepts/memory/), and other features.
It's particularly well suited for building [agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/) or [multi-agent](https://langchain-ai.github.io/langgraph/concepts/multi_agent/) applications.
It's particularly well suited for building [agents](https://langchain-ai.github.io/langgraph/concepts/agentic_concepts/) or [multi-agent](https://langchain-ai.github.io/langgraph/concepts/multi_agent/) applications.
Importantly, individual LangChain components can be used as LangGraph nodes, but you can also use LangGraph **without** using LangChain components.
:::info[Further reading]
@@ -86,8 +86,8 @@ Have a look at our free course, [Introduction to LangGraph](https://academy.lang
## Observability and evaluation
The pace of AI application development is often rate-limited by high-quality evaluations because there is a paradox of choice.
Developers often wonder how to engineer their prompt or which LLM best balances accuracy, latency, and cost.
The pace of AI application development is often rate-limited by high-quality evaluations because there is a paradox of choice.
Developers often wonder how to engineer their prompt or which LLM best balances accuracy, latency, and cost.
High quality tracing and evaluations can help you rapidly answer these types of questions with confidence.
[LangSmith](https://docs.smith.langchain.com/) is our platform that supports observability and evaluation for AI applications.
See our conceptual guides on [evaluations](https://docs.smith.langchain.com/concepts/evaluation) and [tracing](https://docs.smith.langchain.com/concepts/tracing) for more details.

View File

@@ -3,9 +3,9 @@
Here are some things to keep in mind for all types of contributions:
- Follow the ["fork and pull request"](https://docs.github.com/en/get-started/exploring-projects-on-github/contributing-to-a-project) workflow.
- Fill out the checked-in pull request template when opening pull requests. Note related issues and tag relevant maintainers.
- Fill out the checked-in pull request template when opening pull requests. Note related issues.
- Ensure your PR passes formatting, linting, and testing checks before requesting a review.
- If you would like comments or feedback on your current progress, please open an issue or discussion and tag a maintainer.
- If you would like comments or feedback on your current progress, please open an issue or discussion.
- See the sections on [Testing](setup.mdx#testing) and [Formatting and Linting](setup.mdx#formatting-and-linting) for how to run these checks locally.
- Backwards compatibility is key. Your changes must not be breaking, except in case of critical bug and security fixes.
- Look for duplicate PRs or issues that have already been opened before opening a new one.

View File

@@ -9,6 +9,14 @@ This project utilizes [uv](https://docs.astral.sh/uv/) v0.5+ as a dependency man
Install `uv`: **[documentation on how to install it](https://docs.astral.sh/uv/getting-started/installation/)**.
### Windows Users
If you're on Windows and don't have `make` installed, you can install it via:
- **Option 1**: Install via [Chocolatey](https://chocolatey.org/): `choco install make`
- **Option 2**: Install via [Scoop](https://scoop.sh/): `scoop install make`
- **Option 3**: Use [Windows Subsystem for Linux (WSL)](https://docs.microsoft.com/en-us/windows/wsl/)
- **Option 4**: Use the direct `uv` commands shown in the sections below
## Different packages
This repository contains multiple packages:
@@ -48,7 +56,11 @@ uv sync
Then verify dependency installation:
```bash
# If you have `make` installed:
make test
# If you don't have `make` (Windows alternative):
uv run --group test pytest -n auto --disable-socket --allow-unix-socket tests/unit_tests
```
## Testing
@@ -61,7 +73,11 @@ If you add new logic, please add a unit test.
To run unit tests:
```bash
# If you have `make` installed:
make test
# If you don't have make (Windows alternative):
uv run --group test pytest -n auto --disable-socket --allow-unix-socket tests/unit_tests
```
There are also [integration tests and code-coverage](../testing.mdx) available.
@@ -72,7 +88,12 @@ If you are only developing `langchain_core`, you can simply install the dependen
```bash
cd libs/core
# If you have `make` installed:
make test
# If you don't have `make` (Windows alternative):
uv run --group test pytest -n auto --disable-socket --allow-unix-socket tests/unit_tests
```
## Formatting and linting
@@ -86,20 +107,37 @@ Formatting for this project is done via [ruff](https://docs.astral.sh/ruff/rules
To run formatting for docs, cookbook and templates:
```bash
# If you have `make` installed:
make format
# If you don't have make (Windows alternative):
uv run --all-groups ruff format .
uv run --all-groups ruff check --fix .
```
To run formatting for a library, run the same command from the relevant library directory:
```bash
cd libs/{LIBRARY}
# If you have `make` installed:
make format
# If you don't have make (Windows alternative):
uv run --all-groups ruff format .
uv run --all-groups ruff check --fix .
```
Additionally, you can run the formatter only on the files that have been modified in your current branch as compared to the master branch using the format_diff command:
```bash
# If you have `make` installed:
make format_diff
# If you don't have `make` (Windows alternative):
# First, get the list of modified files:
git diff --relative=libs/langchain --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$' | xargs uv run --all-groups ruff format
git diff --relative=libs/langchain --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$' | xargs uv run --all-groups ruff check --fix
```
This is especially useful when you have made changes to a subset of the project and want to ensure your changes are properly formatted without affecting the rest of the codebase.
@@ -111,52 +149,89 @@ Linting for this project is done via a combination of [ruff](https://docs.astral
To run linting for docs, cookbook and templates:
```bash
# If you have `make` installed:
make lint
# If you don't have `make` (Windows alternative):
uv run --all-groups ruff check .
uv run --all-groups ruff format . --diff
uv run --all-groups mypy . --cache-dir .mypy_cache
```
To run linting for a library, run the same command from the relevant library directory:
```bash
cd libs/{LIBRARY}
# If you have `make` installed:
make lint
# If you don't have `make` (Windows alternative):
uv run --all-groups ruff check .
uv run --all-groups ruff format . --diff
uv run --all-groups mypy . --cache-dir .mypy_cache
```
In addition, you can run the linter only on the files that have been modified in your current branch as compared to the master branch using the lint_diff command:
```bash
# If you have `make` installed:
make lint_diff
# If you don't have `make` (Windows alternative):
# First, get the list of modified files:
git diff --relative=libs/langchain --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$' | xargs uv run --all-groups ruff check
git diff --relative=libs/langchain --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$' | xargs uv run --all-groups ruff format --diff
git diff --relative=libs/langchain --name-only --diff-filter=d master | grep -E '\.py$$|\.ipynb$$' | xargs uv run --all-groups mypy --cache-dir .mypy_cache
```
This can be very helpful when you've made changes to only certain parts of the project and want to ensure your changes meet the linting standards without having to check the entire codebase.
We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.
### Spellcheck
### Pre-commit
Spellchecking for this project is done via [codespell](https://github.com/codespell-project/codespell).
Note that `codespell` finds common typos, so it could have false-positive (correctly spelled but rarely used) and false-negatives (not finding misspelled) words.
We use [pre-commit](https://pre-commit.com/) to ensure commits are formatted/linted.
To check spelling for this project:
#### Installing Pre-commit
First, install pre-commit:
```bash
make spell_check
# Option 1: Using uv (recommended)
uv tool install pre-commit
# Option 2: Using Homebrew (globally for macOS/Linux)
brew install pre-commit
# Option 3: Using pip
pip install pre-commit
```
To fix spelling in place:
Then install the git hook scripts:
```bash
make spell_fix
pre-commit install
```
If codespell is incorrectly flagging a word, you can skip spellcheck for that word by adding it to the codespell config in the `pyproject.toml` file.
#### How Pre-commit Works
```python
[tool.codespell]
...
# Add here:
ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogyny,unsecure'
Once installed, pre-commit will automatically run on every `git commit`. Hooks are specified in `.pre-commit-config.yaml` and will:
- Format code using `ruff` for the specific library/package you're modifying
- Only run on files that have changed
- Prevent commits if formatting fails
#### Skipping Pre-commit
In exceptional cases, you can skip pre-commit hooks with:
```bash
git commit --no-verify
```
However, this is discouraged as the CI system will still enforce the same formatting rules.
## Working with optional dependencies
`langchain`, `langchain-community`, and `langchain-experimental` rely on optional dependencies to keep these packages lightweight.

View File

@@ -1,6 +1,6 @@
# Contribute documentation
Documentation is a vital part of LangChain. We welcome both new documentation for new features and
Documentation is a vital part of LangChain. We welcome both new documentation for new features and
community improvements to our current documentation. Please read the resources below before getting started:
- [Documentation style guide](style_guide.mdx)

View File

@@ -35,7 +35,7 @@ Some examples include:
- [Build a Retrieval Augmented Generation (RAG) App](/docs/tutorials/rag/)
A good structural rule of thumb is to follow the structure of this [example from Numpy](https://numpy.org/numpy-tutorials/content/tutorial-svd.html).
Here are some high-level tips on writing a good tutorial:
- Focus on guiding the user to get something done, but keep in mind the end-goal is more to impart principles than to create a perfect production system.
@@ -49,7 +49,7 @@ Here are some high-level tips on writing a good tutorial:
- The first time you mention a LangChain concept, use its full name (e.g. "LangChain Expression Language (LCEL)"), and link to its conceptual/other documentation page.
- It's also helpful to add a prerequisite callout that links to any pages with necessary background information.
- End with a recap/next steps section summarizing what the tutorial covered and future reading, such as related how-to guides.
### How-to guides
A how-to guide, as the name implies, demonstrates how to do something discrete and specific.
@@ -79,7 +79,7 @@ Here are some high-level tips on writing a good how-to guide:
### Conceptual guide
LangChain's conceptual guide falls under the **Explanation** quadrant of Diataxis. These guides should cover LangChain terms and concepts
LangChain's conceptual guides fall under the **Explanation** quadrant of Diataxis. These guides should cover LangChain terms and concepts
in a more abstract way than how-to guides or tutorials, targeting curious users interested in
gaining a deeper understanding and insights of the framework. Try to avoid excessively large code examples as the primary goal is to
provide perspective to the user rather than to finish a practical project. These guides should cover **why** things work the way they do.
@@ -105,7 +105,7 @@ Here are some high-level tips on writing a good conceptual guide:
### References
References contain detailed, low-level information that describes exactly what functionality exists and how to use it.
In LangChain, this is mainly our API reference pages, which are populated from docstrings within code.
In LangChain, these are mainly our API reference pages, which are populated from docstrings within code.
References pages are generally not read end-to-end, but are consulted as necessary when a user needs to know
how to use something specific.
@@ -119,7 +119,7 @@ but here are some high-level tips on writing a good docstring:
- Be concise
- Discuss special cases and deviations from a user's expectations
- Go into detail on required inputs and outputs
- Light details on when one might use the feature are fine, but in-depth details belong in other sections.
- Light details on when one might use the feature are fine, but in-depth details belong in other sections
Each category serves a distinct purpose and requires a specific approach to writing and structuring the content.
@@ -127,17 +127,17 @@ Each category serves a distinct purpose and requires a specific approach to writ
Here are some other guidelines you should think about when writing and organizing documentation.
We generally do not merge new tutorials from outside contributors without an actue need.
We generally do not merge new tutorials from outside contributors without an acute need.
We welcome updates as well as new integration docs, how-tos, and references.
### Avoid duplication
Multiple pages that cover the same material in depth are difficult to maintain and cause confusion. There should
be only one (very rarely two), canonical pages for a given concept or feature. Instead, you should link to other guides.
be only one (very rarely two) canonical pages for a given concept or feature. Instead, you should link to other guides.
### Link to other sections
Because sections of the docs do not exist in a vacuum, it is important to link to other sections frequently,
Because sections of the docs do not exist in a vacuum, it is important to link to other sections frequently
to allow a developer to learn more about an unfamiliar topic within the flow of reading.
This includes linking to the API references and conceptual sections!

Some files were not shown because too many files have changed in this diff Show More