langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-05 19:15:44 +00:00

Author	SHA1	Message	Date
Sydney Runkle	8bd7ea5a5e	linting	2025-05-14 08:16:07 -07:00
Sydney Runkle	b9c2031795	Merge branch 'sr/remove-unused-validators' of https://github.com/langchain-ai/langchain into sr/remove-unused-validators	2025-05-14 08:14:39 -07:00
Sydney Runkle	94a637f4a8	Merge branch 'master' into sr/remove-unused-validators	2025-05-14 08:14:35 -07:00
Sydney Runkle	263c215112	perf[core]: remove generations summation from hot loop (#31231) 1. Removes summation of `ChatGenerationChunk` from hot loops in `stream` and `astream`. 2. Removes run id generation from the loop as well (minor impact). Again, benchmarking on processing ~200k chunks (a poem about broccoli). Before: ~4.2s. Blue circle is all the time spent adding up gen chunks. <img width="1345" alt="Screenshot 2025-05-14 at 7 48 33 AM" src="https://github.com/user-attachments/assets/08a59d78-134d-4cd3-9d54-214de689df51" /> After: ~2.3s. Blue circle is the remaining time spent on adding chunks, which can be minimized in a future PR by optimizing the `merge_content`, `merge_dicts`, and `merge_lists` utilities. <img width="1353" alt="Screenshot 2025-05-14 at 7 50 08 AM" src="https://github.com/user-attachments/assets/df6b3506-929e-4b6d-b198-7c4e992c6d34" />	2025-05-14 08:13:05 -07:00
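As a rough illustration of the idea in 263c215112 (not the actual langchain-core code), the sketch below contrasts summing chunks inside the streaming loop with collecting them and merging once afterwards. `Chunk`, `aggregate_in_loop`, and `aggregate_after_loop` are hypothetical names; merging a real `ChatGenerationChunk` involves more than concatenating text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in for a streamed generation chunk."""

    text: str

    def __add__(self, other: "Chunk") -> "Chunk":
        # Every addition allocates a new object and copies the accumulated text.
        return Chunk(self.text + other.text)


def aggregate_in_loop(chunks: Iterable[Chunk]) -> Optional[Chunk]:
    """Sum on every iteration: the merge cost is paid inside the hot loop."""
    total: Optional[Chunk] = None
    for chunk in chunks:
        total = chunk if total is None else total + chunk
    return total


def aggregate_after_loop(chunks: Iterable[Chunk]) -> Optional[Chunk]:
    """Only append inside the loop, then merge once at the end."""
    collected = []
    for chunk in chunks:
        collected.append(chunk)  # cheap per-iteration work
    if not collected:
        return None
    return Chunk("".join(c.text for c in collected))
```

Both functions return the same aggregate; the second simply moves the merge out of the per-chunk path.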
Sydney Runkle	d8bb6b24c4	remove another id -> str custom validator	2025-05-14 08:08:37 -07:00
Sydney Runkle	f1e9bf9d85	removing id -> str costly field validator	2025-05-14 07:55:43 -07:00
Sydney Runkle	17b799860f	perf[core]: remove costly async helpers for non-end event handlers (#31230) 1. Remove `shielded` decorator from non-end event handlers. 2. Exit early with a `self.handlers` check instead of doing unnecessary asyncio work. Using a benchmark that processes ~200k chunks (a poem about broccoli). Before: ~15s. Circled in blue is unnecessary event handling time. This is addressed by point 2 above. <img width="1347" alt="Screenshot 2025-05-14 at 7 37 53 AM" src="https://github.com/user-attachments/assets/675e0fed-8f37-46c0-90b3-bef3cb9a1e86" /> After: ~4.2s. The total time is largely reduced by the removal of the `shielded` decorator, which holds little significance for non-end handlers. <img width="1348" alt="Screenshot 2025-05-14 at 7 37 22 AM" src="https://github.com/user-attachments/assets/54be8a3e-5827-4136-a87b-54b0d40fe331" />	2025-05-14 07:42:56 -07:00
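Point 2 of 17b799860f (exit early when there are no handlers) can be sketched as follows. This is a simplified, hypothetical callback manager, not the langchain-core class; `TinyCallbackManager`, `on_new_token`, and the `gather_calls` counter exist only for illustration.

```python
import asyncio
from typing import Awaitable, Callable, List

Handler = Callable[[str], Awaitable[None]]


class TinyCallbackManager:
    """Hypothetical, simplified callback manager."""

    def __init__(self, handlers: List[Handler]) -> None:
        self.handlers = handlers
        self.gather_calls = 0  # counts how often asyncio work was scheduled

    async def on_new_token(self, token: str) -> None:
        # Early exit: with no handlers there is nothing to dispatch,
        # so skip building coroutines and calling asyncio.gather.
        if not self.handlers:
            return
        self.gather_calls += 1
        await asyncio.gather(*(handler(token) for handler in self.handlers))
```

When this method runs once per streamed chunk, the guard avoids scheduling any asyncio work in the common case where no handlers are registered.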
Christophe Bornet	83d006190d	core: Fix some private member accesses (#30912) See https://github.com/langchain-ai/langchain/pull/30666 --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2025-05-12 17:42:26 +00:00
CtrlMj	1e56c66f86	core: Fix issue 31035 alias fields in base tool langchain core (#31112) Description: The `inspect` package in Python skips over the aliases set in the schema of a pydantic model. This is a workaround to include the aliases from the original input. issue: #31035 Cc: @ccurme @eyurtsev --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-05-12 11:04:13 -04:00
ccurme	f70b263ff3	core: release 0.3.59 (#31150)	2025-05-07 17:36:59 +00:00
Jacob Lee	66d1ed6099	fix(core): Permit OpenAI style blocks to be passed into convert_to_openai_messages (#31140) Should effectively be a noop, just shouldn't throw CC @madams0013 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2025-05-07 10:57:37 -04:00
ccurme	ff41f47e91	core: release 0.3.58 (#31099)	2025-05-02 12:46:32 -04:00
ccurme	26ad239669	core, openai[patch]: prefer provider-assigned IDs when aggregating message chunks (#31080) When aggregating AIMessageChunks in a stream, core prefers the leftmost non-null ID. This is problematic because: - Core assigns IDs when they are null to `f"run-{run_manager.run_id}"` - The desired meaningful ID might not be available until midway through the stream, as is the case for the OpenAI Responses API. For the OpenAI Responses API, we assign message IDs to the top-level `AIMessage.id`. This works in `.(a)invoke`, but during `.(a)stream` the IDs get overwritten by the defaults assigned in langchain-core. These IDs [must](https://community.openai.com/t/how-to-solve-badrequesterror-400-item-rs-of-type-reasoning-was-provided-without-its-required-following-item-error-in-responses-api/1151686/9) be available on the AIMessage object to support passing reasoning items back to the API (e.g., if not using OpenAI's `previous_response_id` feature). We could add them elsewhere, but seeing as we've already made the decision to store them in `.id` during `.(a)invoke`, addressing the issue in core lets us fix the problem with no interface changes.	2025-05-02 11:18:18 -04:00
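The selection rule described in 26ad239669 might look roughly like the sketch below. It assumes that core-assigned defaults can be recognized by the `run-` prefix mentioned in the commit message; `pick_message_id` is a hypothetical helper, and the actual langchain-core implementation may differ.

```python
from typing import Iterable, Optional


def pick_message_id(ids: Iterable[Optional[str]]) -> Optional[str]:
    """Prefer the first provider-assigned ID over auto-generated ones.

    Assumption for this sketch: auto-generated IDs start with "run-".
    """
    fallback: Optional[str] = None
    for candidate in ids:
        if not candidate:
            continue  # skip null or empty IDs
        if not candidate.startswith("run-"):
            return candidate  # provider-assigned ID wins, even if it arrives late
        if fallback is None:
            fallback = candidate  # remember the leftmost auto-generated ID
    return fallback
```

With this rule, a provider ID that only appears midway through the stream still ends up on the aggregated message, while streams that never carry one keep the default.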
William FH	b5bf2d6218	0.3.57 (#31095)	2025-05-01 23:42:26 -07:00
William FH	167afa5102	Enable run mutation (#31090) This lets you more easily modify a run in-flight	2025-05-01 17:00:51 -07:00
Sydney Runkle	7e926520d5	packaging: remove Python upper bound for langchain and co libs (#31025) Follow up to https://github.com/langchain-ai/langsmith-sdk/pull/1696, I've bumped the `langsmith` version where applicable in `uv.lock`. Type checking problems here because deps have been updated in `pyproject.toml` and `uv lock` hasn't been run - we should enforce that in the future - goes with the other dependabot todos :).	2025-04-28 14:44:28 -04:00
ccurme	403fae8eec	core: release 0.3.56 (#31000)	2025-04-24 13:22:31 -04:00
ccurme	8fc7a723b9	core: release 0.3.56rc1 (#30998)	2025-04-24 15:09:44 +00:00
ccurme	f4863f82e2	core[patch]: fix edge cases for _is_openai_data_block (#30997)	2025-04-24 10:48:52 -04:00
Jacob Lee	6b0b317cb5	feat(core): Autogenerate filenames when converting file content blocks to OpenAI format (#30984) CC @ccurme --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-04-24 13:36:31 +00:00
ccurme	faef3e5d50	core, standard-tests: support PDF and audio input in Chat Completions format (#30979) Chat models currently implement support for: - images in OpenAI Chat Completions format - other multimodal types (e.g., PDF and audio) in a cross-provider [standard format](https://python.langchain.com/docs/how_to/multimodal_inputs/) Here we update core to extend support to PDF and audio input in Chat Completions format. If an OAI-format PDF or audio content block is passed into any chat model, it will be transformed to the LangChain standard format. We assume that any chat model supporting OAI-format PDF or audio has implemented support for the standard format.	2025-04-23 18:32:51 +00:00
Bagatur	d4fc734250	core[patch]: update dict prompt template (#30967) Align with JS changes made in https://github.com/langchain-ai/langchainjs/pull/8043	2025-04-23 10:04:50 -07:00
ccurme	4bc70766b5	core, openai: support standard multi-modal blocks in convert_to_openai_messages (#30968)	2025-04-23 11:20:44 -04:00
ccurme	8574442c57	core[patch]: release 0.3.55 (#30952)	2025-04-21 17:56:24 +00:00
Nuno Campos	27296bdb0c	core: Make Graph.Node.data optional (#30943)	2025-04-21 07:18:36 -07:00
Ahmed Tammaa	de56c31672	core: Improve OutputParser error messaging when model output is truncated (max_tokens) (#30936) Addresses #30158 When using the output parser, either in a chain or standalone, hitting max_tokens triggers a misleading “missing variable” error instead of indicating the output was truncated. This subtle bug often surfaces with Anthropic models. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-04-21 10:06:18 -04:00
ccurme	096f0e5966	core[patch]: de-beta usage callback (#30928)	2025-04-18 15:45:09 +00:00
Sydney Runkle	98c357b3d7	core: release 0.3.54 (#30911)	2025-04-17 14:27:06 -04:00
Vadym Barda	d2cbfa379f	core[patch]: add retries and better messages to draw_mermaid_png (#30881)	2025-04-17 18:25:37 +00:00
Sydney Runkle	75e50a3efd	core[patch]: Raise `AttributeError` (instead of `ModuleNotFoundError`) in custom `__getattr__` (#30905) Follow up to https://github.com/langchain-ai/langchain/pull/30769, fixing the regression reported [here](https://github.com/langchain-ai/langchain/pull/30769#issuecomment-2807483610), thanks @krassowski for the report! Fix inspired by https://github.com/PrefectHQ/prefect/pull/16172/files Other changes: * Using tuples for `__all__`, except in `output_parsers` bc of a list namespace conflict * Using a helper function for imports due to repeated logic across `__init__.py` files becoming hard to maintain. Co-authored-by: Michał Krassowski <5832902+krassowski@users.noreply.github.com>	2025-04-17 14:15:28 -04:00
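Why the exception type matters in 75e50a3efd can be shown with a small, self-contained sketch. `hasattr` and three-argument `getattr` only swallow `AttributeError`, so a module-level `__getattr__` that raises `ModuleNotFoundError` breaks ordinary attribute probing. The module name `demo_pkg` and the helper `build_module` are made up for this example.

```python
import types


def build_module(error_type: type) -> types.ModuleType:
    """Create an in-memory module whose __getattr__ raises `error_type`."""
    module = types.ModuleType("demo_pkg")

    def __getattr__(name: str):
        raise error_type(f"module 'demo_pkg' has no attribute {name!r}")

    # Module-level __getattr__ (PEP 562) is looked up in the module's namespace.
    module.__getattr__ = __getattr__
    return module


good = build_module(AttributeError)
bad = build_module(ModuleNotFoundError)

print(hasattr(good, "missing"))  # False: AttributeError is handled by hasattr

try:
    hasattr(bad, "missing")
except ModuleNotFoundError as exc:
    # The wrong exception type escapes hasattr and reaches the caller.
    print(f"hasattr raised: {exc}")
```

Tools that introspect modules with `hasattr` or `getattr(module, name, default)` rely on this contract, which is presumably why the regression surfaced in downstream code.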
ccurme	2c2db1ab69	core: release 0.3.53 (#30901)	2025-04-17 13:10:32 +00:00
ccurme	86d51f6be6	multiple: permit optional fields on multimodal content blocks (#30887) Instead of stuffing provider-specific fields in `metadata`, they can go directly on the content block.	2025-04-17 12:48:46 +00:00
Sydney Runkle	88fce67724	core: Removing unnecessary `pydantic` core schema rebuilds (#30848) We only need to rebuild model schemas if type annotation information isn't available during declaration - that shouldn't be the case for these types corrected here. Need to do more thorough testing to make sure these structures have complete schemas, but hopefully this boosts startup / import time.	2025-04-16 12:00:08 -04:00
Sydney Runkle	ef5aff3b6c	core[fix]: Fix `__dir__` in `__init__.py` for `output_parsers` module (#30856) We have a `list.py` file which causes a namespace conflict with `list` from stdlib, unfortunately. `__all__` is already a list, so no need to coerce.	2025-04-15 13:09:13 -04:00
Christophe Bornet	a4ca1fe0ed	core: Remove some noqa (#30855)	2025-04-15 13:08:40 -04:00
Sydney Runkle	1f5e207379	core[fix]: remove `load` from dynamic imports dict (#30849)	2025-04-15 12:02:46 -04:00
ccurme	7240458619	core: release 0.3.52 (#30850)	2025-04-15 15:28:31 +00:00
Sydney Runkle	6aa5494a75	Fix `from langchain_core.load.load import load` import (#30843)

TL;DR: you can't optimize imports with a lazy `__getattr__` if there is a namespace conflict with a module name and an attribute name. We should avoid introducing conflicts like this in the future.

This PR fixes a bug introduced by my lazy imports PR: https://github.com/langchain-ai/langchain/pull/30769.

In `langchain_core`, we have utilities for loading and dumping data. Unfortunately, one of those utilities is a `load` function, located in `langchain_core/load/load.py`. To make this function more visible, we make it accessible at the top level `langchain_core.load` module via importing the function in `langchain_core/load/__init__.py`.

So, either of these imports should work:

```py
from langchain_core.load import load
from langchain_core.load.load import load
```

As you can tell, this is already a bit confusing. You'd think that the first import would produce the module `load`, but because of the `__init__.py` shortcut, both produce the function `load`.

<details>

More on why the lazy imports PR broke this support...

All was well, except when the absolute import was run first, see the last snippet:

```
>>> from langchain_core.load import load
>>> load
<function load at 0x101c320c0>
```

```
>>> from langchain_core.load.load import load
>>> load
<function load at 0x1069360c0>
```

```
>>> from langchain_core.load import load
>>> load
<function load at 0x10692e0c0>
>>> from langchain_core.load.load import load
>>> load
<function load at 0x10692e0c0>
```

```
>>> from langchain_core.load.load import load
>>> load
<function load at 0x101e2e0c0>
>>> from langchain_core.load import load
>>> load
<module 'langchain_core.load.load' from '/Users/sydney_runkle/oss/langchain/libs/core/langchain_core/load/load.py'>
```

In this case, the function `load` wasn't stored in the globals cache for the `langchain_core.load` module (by the lazy import logic), so Python defers to a module import.

</details>

New `langchain` tongue twister 😜: we've created a problem for ourselves because you have to load the load function from the load file in the load module 😨.	2025-04-15 11:06:13 -04:00
Bagatur	7262de4217	core[patch]: dict chat prompt template support (#25674)

- Support passing dicts as templates to chat prompt template
- Support making any attribute on a message a runtime variable
- Significantly simpler than trying to update our existing prompt template classes

```python
template = ChatPromptTemplate(
    [
        {
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": "{text1}",
                    "cache_control": {"type": "ephemeral"},
                },
                {"type": "image_url", "image_url": {"path": "{local_image_path}"}},
            ],
            "name": "{name1}",
            "tool_calls": [
                {
                    "name": "{tool_name1}",
                    "args": {"arg1": "{tool_arg1}"},
                    "id": "1",
                    "type": "tool_call",
                }
            ],
        },
        {
            "role": "tool",
            "content": "{tool_content2}",
            "tool_call_id": "1",
            "name": "{tool_name1}",
        },
    ]
)
```

will likely close #25514 if we like this idea and update to use this logic --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-04-15 11:00:49 -04:00
ccurme	9cfe6bcacd	multiple: multi-modal content blocks (#30746)

Introduces standard content block format for images, audio, and files.

## Examples

Image from url:
```
{
    "type": "image",
    "source_type": "url",
    "url": "https://path.to.image.png",
}
```

Image, in-line data:
```
{
    "type": "image",
    "source_type": "base64",
    "data": "<base64 string>",
    "mime_type": "image/png",
}
```

PDF, in-line data:
```
{
    "type": "file",
    "source_type": "base64",
    "data": "<base64 string>",
    "mime_type": "application/pdf",
}
```

File from ID:
```
{
    "type": "file",
    "source_type": "id",
    "id": "file-abc123",
}
```

Plain-text file:
```
{
    "type": "file",
    "source_type": "text",
    "text": "foo bar",
}
```
	2025-04-15 09:48:06 -04:00
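A base64 block in the shape shown in 9cfe6bcacd could be assembled from raw bytes along these lines. `make_base64_block` is a hypothetical helper written for this example, not a langchain-core function; only the dictionary keys come from the commit message.

```python
import base64
from typing import Dict


def make_base64_block(block_type: str, data: bytes, mime_type: str) -> Dict[str, str]:
    """Build an in-line content block in the format listed in the commit message.

    `block_type` is passed through unchanged (e.g. "image" or "file").
    """
    return {
        "type": block_type,
        "source_type": "base64",
        # The raw bytes are base64-encoded and stored as an ASCII string.
        "data": base64.b64encode(data).decode("ascii"),
        "mime_type": mime_type,
    }


# Usage: wrap some placeholder PDF bytes as a file block.
pdf_block = make_base64_block("file", b"%PDF-1.4 example", "application/pdf")
print(pdf_block["type"], pdf_block["mime_type"])
```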
Sydney Runkle	59f2c9e737	Tinkering with CodSpeed (#30824) Fix CI to trigger benchmarks on `run-codspeed-benchmarks` label addition. Reduce scope of async benchmark to save time on CI. Waiting to merge this PR until we figure out how to use walltime on local runners.	2025-04-15 08:49:09 -04:00
William FH	ed5c4805f6	Consistent docstring indentation (#30834) Should be 4 spaces instead of 3.	2025-04-14 19:04:35 -07:00
Sydney Runkle	edb6a23aea	core[lint]: fix issue with unused ignore in `__init__.py` files (#30825) Fixing a race condition between https://github.com/langchain-ai/langchain/pull/30769 and https://github.com/langchain-ai/langchain/pull/30737	2025-04-14 17:57:00 +00:00
Sydney Runkle	4f69094b51	core[performance]: use custom `__getattr__` in `__init__.py` files for lazy imports (#30769)

Most easily reviewed with the "hide whitespace" option toggled.

Seeing 10-50% speed ups in import time for common structures 🚀

The general purpose of this PR is to lazily import structures within `langchain_core.XXX_module.__init__.py` so that we're not eagerly importing expensive dependencies (`pydantic`, `requests`, etc).

Analysis of flamegraphs generated with `importtime` motivated these changes. For example, the one below demonstrates that importing `HumanMessage` accidentally triggered imports for `importlib.metadata`, `requests`, etc. There's still much more to do on this front, and we can start digging into our own internal code for optimizations now that we're less concerned about external imports.

<img width="1210" alt="Screenshot 2025-04-11 at 1 10 54 PM" src="https://github.com/user-attachments/assets/112a3fe7-24a9-4294-92c1-d5ae64df839e" />

I've tracked the improvements with some local benchmarks:

## `pytest-benchmark` results

| Name                 | Before (s) | After (s) | Delta (s) | % Change |
|----------------------|------------|-----------|-----------|----------|
| Document             | 2.8683     | 1.2775    | -1.5908   | -55.46%  |
| HumanMessage         | 2.2358     | 1.1673    | -1.0685   | -47.79%  |
| ChatPromptTemplate   | 5.5235     | 2.9709    | -2.5526   | -46.22%  |
| Runnable             | 2.9423     | 1.7793    | -1.163    | -39.53%  |
| InMemoryVectorStore  | 3.1180     | 1.8417    | -1.2763   | -40.93%  |
| RunnableLambda       | 2.7385     | 1.8745    | -0.864    | -31.55%  |
| tool                 | 5.1231     | 4.0771    | -1.046    | -20.42%  |
| CallbackManager      | 4.2263     | 3.4099    | -0.8164   | -19.32%  |
| LangChainTracer      | 3.8394     | 3.3101    | -0.5293   | -13.79%  |
| BaseChatModel        | 4.3317     | 3.8806    | -0.4511   | -10.41%  |
| PydanticOutputParser | 3.2036     | 3.2995    | 0.0959    | 2.99%    |
| InMemoryRateLimiter  | 0.5311     | 0.5995    | 0.0684    | 12.88%   |

Note the lack of change for `InMemoryRateLimiter` and `PydanticOutputParser` is just random noise, I'm getting comparable numbers locally.

## Local CodSpeed results

We're still working on configuring CodSpeed on CI. The local usage produced similar results.	2025-04-14 08:57:54 -04:00
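The lazy-import pattern described in 4f69094b51 can be sketched with an in-memory module so the example stays self-contained. In a real package the `__getattr__` function would live in `__init__.py`; here `make_lazy_module` and the mapping of attribute names to stdlib modules are purely illustrative assumptions, not the langchain-core helper.

```python
import importlib
import types
from typing import Dict


def make_lazy_module(name: str, dynamic_imports: Dict[str, str]) -> types.ModuleType:
    """Build a module that imports its public attributes on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr_name: str):
        source = dynamic_imports.get(attr_name)
        if source is None:
            # AttributeError keeps hasattr() and getattr(..., default) working.
            raise AttributeError(f"module {name!r} has no attribute {attr_name!r}")
        value = getattr(importlib.import_module(source), attr_name)
        # Cache in the module namespace so __getattr__ is not hit again.
        setattr(module, attr_name, value)
        return value

    module.__getattr__ = __getattr__
    module.__all__ = tuple(dynamic_imports)
    return module


# Attribute name -> module that defines it (stdlib modules as stand-ins).
lazy = make_lazy_module("lazy_demo", {"dumps": "json", "sqrt": "math"})
print("sqrt" in vars(lazy))  # False: nothing has been imported yet
print(lazy.sqrt(9))          # 3.0: resolved on first access, then cached
```

The cost of importing each dependency is deferred until an attribute that needs it is actually used, which is where the import-time savings come from.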
Christophe Bornet	42944f3499	core: Improve mypy config (#30737) * Clean up mypy config * Add mypy `strict` rules except `disallow_any_generics`, `warn_return_any` and `strict_equality` (TODO) * Add mypy `strict_bytes` rule * Add mypy support for PEP702 `@deprecated` decorator * Bump mypy version to 1.15 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-04-11 16:35:13 -04:00
Christophe Bornet	913c896598	core: Add ruff rules FBT001 and FBT002 (#30695) Add ruff rules [FBT001](https://docs.astral.sh/ruff/rules/boolean-type-hint-positional-argument/) and [FBT002](https://docs.astral.sh/ruff/rules/boolean-default-value-positional-argument/). Mostly `noqa`s, to avoid introducing breaking changes; the possible non-breaking fixes have already been done in a [previous PR](https://github.com/langchain-ai/langchain/pull/29424). These rules will prevent new violations.	2025-04-11 16:26:33 -04:00
William FH	2803a48661	core[patch]: Share executor for async callbacks run in sync context (#30779) To avoid having to create ephemeral threads, grab the thread lock, etc.	2025-04-11 10:34:43 -07:00
Sydney Runkle	fdc2b4bcac	core[lint]: Use 3.9 formatting for docs and tests (#30780) Looks like `pyupgrade` was already used here but missed some docs and tests. This helps to keep our docs looking professional and up to date. Eventually, we should lint / format our inline docs.	2025-04-11 10:39:25 -04:00
Christophe Bornet	89f28a24d3	core[lint]: Fix typing in `test_async_callbacks` (#30788)	2025-04-11 07:26:38 -04:00
Christophe Bornet	dc19d42d37	core: Specify code when ignoring type issue (ruff PGH003) (#30675) See https://docs.astral.sh/ruff/rules/blanket-type-ignore/	2025-04-10 22:23:52 -04:00

1152 Commits