# feat(core): add more file extensions to ignore in HTML link extraction
## Description
This PR enhances the HTML link extraction utility in
`libs/core/langchain_core/utils/html.py` by expanding the
`SUFFIXES_TO_IGNORE` list to include additional common binary file
extensions:
- `.webp`
- `.pdf`
- `.docx`
- `.xlsx`
- `.pptx`
- `.pptm`
These file types are non-HTML, non-crawlable resources. Ignoring them
prevents `find_all_links` and `extract_sub_links` from mistakenly
treating such binary assets as navigable links. This improves link
filtering, reduces unnecessary crawling, and aligns behavior with
typical web scraping expectations.
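For illustration, a minimal sketch of the effect (the HTML snippet and URL below are made up for this example):
```python
from langchain_core.utils.html import extract_sub_links

# Hypothetical page mixing a crawlable HTML link with a binary asset.
raw_html = (
    '<a href="/docs/intro.html">Intro</a> '
    '<a href="/files/report.pdf">Report</a>'
)

links = extract_sub_links(raw_html, "https://example.com/")
# With .pdf in SUFFIXES_TO_IGNORE, only the HTML page should remain.
print(links)  # expected: ['https://example.com/docs/intro.html']
```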
## Summary of Changes
- **Updated** `libs/core/langchain_core/utils/html.py`: Added `.webp`,
`.pdf`, `.docx`, `.xlsx`, `.pptx`, `.pptm` to `SUFFIXES_TO_IGNORE`.
## Related Issues
N/A
## Verification
- `ruff check libs/core/langchain_core/utils/html.py`: **Passed**
- `mypy libs/core/langchain_core/utils/html.py`: **Passed**
- `pytest libs/core/tests/unit_tests/utils/test_html.py`: **Passed** (11
tests)
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
# refactor(core): improve docstrings for HTML link extraction utilities
## Description
This PR updates and clarifies the docstrings for `find_all_links` and
`extract_sub_links` in
`libs/core/langchain_core/utils/html.py`.
The previous return-value descriptions were vague (e.g., "all links",
"sub links"). They have now been revised to clearly describe the
behavior and output of each function:
- **find_all_links** → “A list of all links found in the HTML.”
- **extract_sub_links** → “A list of absolute paths to sub links.”
These improvements make the utilities more understandable and
developer-friendly without altering functionality.
## Verification
- `ruff check libs/core/langchain_core/utils/html.py`: **Passed**
- `pytest libs/core/tests/unit_tests/utils/test_html.py`: **Passed**
## Checklists
- PR title follows the required format: `TYPE(SCOPE): DESCRIPTION`
- Changes are limited to the `langchain-core` package
- `make format`, `make lint`, and `make test` pass
Fixes #34282
**Before:** When using agents with tools (like file reading, web search,
etc.), the conversation looks like this:
```
[User] "Read these 10 files and summarize them"
[AI] "I'll read all 10 files" + [tool_call: read_file x 10]
[Tool] "Contents of file1.txt..."
[Tool] "Contents of file2.txt..."
[Tool] "Contents of file3.txt..."
... (7 more tool responses)
```
When the conversation gets too long, `SummarizationMiddleware` kicks in
to compress older messages. The problem was:
If you asked to keep the last 6 messages, you'd get:
```
[Summary] "Here's what happened before..."
[Tool] "Contents of file5.txt..."
[Tool] "Contents of file6.txt..."
[Tool] "Contents of file7.txt..."
[Tool] "Contents of file8.txt..."
[Tool] "Contents of file9.txt..."
[Tool] "Contents of file10.txt..."
```
The AI's original request to read the files (`[AI]` message with
`tool_calls`) was summarized away, but the tool responses remained. This
caused the error:
```
Error code: 400 - "No tool call found for function call output with call_id..."
```
Many APIs require that every tool response has a matching tool request.
Without the AI message, the tool responses are "orphaned."
## The fix
Now when the cutoff lands on tool messages, we **move backward** to
include the AI message that requested those tools:
Same scenario, keeping last 6 messages:
```
[Summary] "Here's what happened before..."
[AI] "I'll read all 10 files" + [tool_call: read_file x 10]
[Tool] "Contents of file1.txt..."
[Tool] "Contents of file2.txt..."
... (all 10 tool responses)
```
The AI message is preserved along with its tool responses, keeping them
paired together.
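A minimal sketch of the idea, assuming a standalone helper (the function name is hypothetical and this is not the actual middleware code):
```python
from langchain_core.messages import ToolMessage


def safe_cutoff(messages: list, cutoff: int) -> int:
    """Hypothetical helper: if the cutoff would strand tool responses,
    move it backward to the AI message that issued their tool calls."""
    # Walk backward while the first retained message is a tool response.
    while 0 < cutoff < len(messages) and isinstance(messages[cutoff], ToolMessage):
        cutoff -= 1
    # `cutoff` now points at the AIMessage carrying the matching tool_calls
    # (or an earlier non-tool message), so the pair stays together.
    return cutoff
```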
## Practical examples
### Example 1: Parallel tool calls
**Scenario:** Agent reads 10 files in parallel, summarization triggers
(see above)
### Example 2: Mixed conversation
**Scenario:** User asks question, AI uses tools, user says thanks
```
[User] "What's the weather?"
[AI] "Let me check" + [tool_call: get_weather]
[Tool] "72F and sunny"
[AI] "It's 72F and sunny!"
[User] "Thanks!"
```
Keeping last 2 messages:
| Before (Bug) | After (Fix) |
|--------------|-------------|
| Only `[User] "Thanks!"` kept | `[AI] + [Tool] + [AI] + [User]` all kept |
| Lost the weather info | Tool pair preserved with response |
### Example 3: Multiple tool sequences
```
[User] "Search for X"
[AI] [tool_call: search]
[Tool] "Results for X"
[User] "Now search for Y"
[AI] [tool_call: search]
[Tool] "Results for Y"
[User] "Great!"
```
**Keeping last 3 messages:** If cutoff lands on `[Tool] "Results for
Y"`, we now include `[AI] [tool_call: search]` to keep the pair
together.
Adds unit coverage for chat model provider inference across common model
name prefixes. This improves regression protection without touching
runtime behavior.
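A rough sketch of the kind of test added (the import path, the helper name `_attempt_infer_model_provider`, and the exact prefixes are assumptions, not quotes of the new tests):
```python
import pytest

# Assumed location and name of the provider-inference helper; adjust if the
# actual helper lives elsewhere or is named differently.
from langchain.chat_models.base import _attempt_infer_model_provider


@pytest.mark.parametrize(
    ("model_name", "expected_provider"),
    [
        ("gpt-4o-mini", "openai"),
        ("claude-3-5-sonnet-latest", "anthropic"),
        ("command-r-plus", "cohere"),
    ],
)
def test_provider_inferred_from_model_prefix(model_name, expected_provider):
    assert _attempt_infer_model_provider(model_name) == expected_provider
```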
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Fixes a bug introduced with commit 85f1ba2 (released in `langchain ==
1.2.1`).
Whenever the index embedding of the langgraph-server is configured with
the `azure_openai` provider, the wrong class is initialized (and
initialization fails if the now-unexpected `OPENAI_API_KEY` environment
variable is not provided).
Example configuration file `langgraph.json` that will reproduce the
issue:
(see
https://docs.langchain.com/langsmith/cli#adding-semantic-search-to-the-store)
```json
{
  "dependencies": ["."],
  "graphs": {
    "chat": "src/agents/chat/graph.py:graph"
  },
  "store": {
    "index": {
      "embed": "azure_openai:text-embedding-3-small",
      "dims": 1536
    }
  },
  "python_version": "3.13",
  "image_distro": "wolfi"
}
```
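For illustration, assuming the embed spec is resolved through `init_embeddings` (a sketch of the expected behavior after the fix, not the server code):
```python
from langchain.embeddings import init_embeddings

# With the fix, an "azure_openai:<model>" spec should resolve to the Azure
# embeddings class (using AZURE_OPENAI_* credentials) instead of falling
# back to the plain OpenAI class and demanding OPENAI_API_KEY.
embeddings = init_embeddings("azure_openai:text-embedding-3-small")
print(type(embeddings).__name__)  # expected: AzureOpenAIEmbeddings
```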
The agent should only make a single call to update the todo list at a
time. Parallel calls don't make sense, and they also cannot work because
there's no obvious reducer to use.
On parallel calls of the todo tool, we now return a `ToolMessage`
containing guidance that tells the LLM not to call the tool in parallel.
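A minimal sketch of the guard (illustrative only; the tool name `write_todos` and the helper are assumptions, not the actual middleware code):
```python
from langchain_core.messages import AIMessage, ToolMessage


def reject_parallel_todo_calls(ai_message: AIMessage) -> list[ToolMessage]:
    """If the model issued more than one todo call this turn, answer each
    of them with guidance instead of applying the updates."""
    todo_calls = [tc for tc in ai_message.tool_calls if tc["name"] == "write_todos"]
    if len(todo_calls) <= 1:
        return []
    return [
        ToolMessage(
            content=(
                "Error: the todo tool must be called once per turn; "
                "combine all updates into a single call and try again."
            ),
            tool_call_id=tc["id"],
        )
        for tc in todo_calls
    ]
```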
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Fixes #34517
Supersedes #34557, #34570
Fixes token inflation in `SummarizationMiddleware` that caused context
window overflow during summarization.
**Root cause:** When formatting messages for the summary prompt,
`str(messages)` was implicitly called, which includes all Pydantic
metadata fields (`usage_metadata`, `response_metadata`,
`additional_kwargs`, etc.). This caused the stringified representation
to use ~2.5x more tokens than `count_tokens_approximately` estimates.
**Problem:**
- Summarization triggers at 85% of context window based on
`count_tokens_approximately`
- But `str(messages)` in the prompt uses 2.5x more tokens
- Results in `ContextLengthExceeded`
**Fix:** Use `get_buffer_string()` to format messages, which produces
compact output:
```
Human: What's the weather?
AI: Let me check...[tool_calls]
Tool: 72°F and sunny
```
Instead of verbose Pydantic repr:
```python
[HumanMessage(content="What's the weather?", additional_kwargs={}, response_metadata={}), ...]
```
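A small sketch of the difference (the messages below are invented for illustration):
```python
from langchain_core.messages import (
    AIMessage,
    HumanMessage,
    ToolMessage,
    get_buffer_string,
)

messages = [
    HumanMessage("What's the weather?"),
    AIMessage(
        "Let me check",
        tool_calls=[{"name": "get_weather", "args": {}, "id": "call_1"}],
    ),
    ToolMessage("72°F and sunny", tool_call_id="call_1"),
]

# Compact formatting now used when building the summary prompt.
compact = get_buffer_string(messages)

# Verbose Pydantic repr that previously inflated the token count.
verbose = str(messages)

assert len(compact) < len(verbose)
```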
Fixed typo in ImportError messages where "langchain experiment" should
be "langchain_experimental" for consistency with the actual package
name.
This helps improve clarity for users who encounter these error messages
when trying to use deprecated tools that have moved to the
langchain_experimental package.
Related issues: #13858, #13859
Co-authored-by: Ademola <ademicho@gmail>
**Description:**
*Closes [#33883](https://github.com/langchain-ai/langchain/issues/33883)*
Chat model cache keys are generated by serializing messages via
`dumps(messages)`. The optional `BaseMessage.id` field (a UUID used
solely for tracing/threading) is included in this serialization, causing
functionally identical messages to produce different cache keys. This
results in repeated API calls, cache bloat, and degraded performance in
production workloads (e.g., agents, RAG chains, long conversations).
This change normalizes messages **only for cache key generation** by
stripping the nonsemantic `id` field using Pydantic V2’s
`model_copy(update={"id": None})`. The normalization is applied in both
synchronous and asynchronous cache paths (`_generate_with_cache` /
`_agenerate_with_cache`) immediately before `dumps()`.
```python
normalized_messages = [
    msg.model_copy(update={"id": None})
    if getattr(msg, "id", None) is not None
    else msg
    for msg in messages
]
prompt = dumps(normalized_messages)
```
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
## Summary
Fixes an issue where `HTMLSemanticPreservingSplitter` failed to preserve
elements nested inside non-container tags. With these changes, preserved
elements are now correctly detected and handled at any nesting depth.
## Root Cause
`_process_element()` only recursed into a small set of hard-coded
container tags (`html`, `body`, `div`, `main`). For other tags, the
subtree was flattened into text, preventing nested preserved elements
(inside `<p>`, `<section>`, `<article>`, etc.) from being detected.
## Fix
- Updated traversal logic in `_process_element()` (`html.py`) to recursively
process child elements for any tag that contains nested elements (illustrated
by the sketch below)
- Avoided duplicate text extraction
- Preserved correct placeholder ordering
- Treated leaf nodes as text only
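A rough usage sketch of the fixed behavior (illustrative, not copied from the new tests):
```python
from langchain_text_splitters import HTMLSemanticPreservingSplitter

html = """
<section>
  <p>Quarterly results are in the table below.</p>
  <table><tr><td>Q1</td><td>100</td></tr></table>
</section>
"""

splitter = HTMLSemanticPreservingSplitter(
    headers_to_split_on=[("h1", "Header 1")],
    elements_to_preserve=["table"],
)

docs = splitter.split_text(html)
# With the recursive traversal, the <table> nested inside <section> is
# detected and preserved instead of being flattened into plain text.
```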
## Tests
Adds regression tests covering preserved elements nested inside
non-container tags, including:
- `<table>` inside `<section>`
- nested `<div>`s
- `<code>` inside a paragraph

All existing tests pass (`make lint`, `make format`, `make test`, etc.).
## Breaking Changes
None.
## Fixes
Fixes #31569
## Disclaimer
GitHub Copilot was used to assist with test case design in
`test_text_splitters.py` and documentation comments; all code logic was
manually implemented and reviewed.
---------
Co-authored-by: julih <julih@julihs-MacBook-Pro.local>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>