langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-07-15 07:00:38 +00:00

Author	SHA1	Message	Date
Mason Daugherty	e98fc34203	Merge branch 'cc/1.0/standard_content' into mdrxy/invocation-version	2025-08-19 10:11:37 -04:00
Mason Daugherty	43b9d3d904	feat(core): implement dynamic translator registration for model providers (#32602 ) Extensible registry system for translating AI message content blocks from various model providers. Refactors the way provider-specific content is handled, moving from hardcoded logic to a plugin-like architecture.	2025-08-19 10:08:56 -04:00
Mason Daugherty	27d81cf3d9	test(openai): address some type issues in tests (#32601 ) nits	2025-08-19 00:28:35 -04:00
Mason Daugherty	313ed7b401	Merge branch 'wip-v1.0' into cc/1.0/standard_content	2025-08-19 00:11:18 -04:00
Mason Daugherty	f0f1e28473	Merge branch 'master' of github.com:langchain-ai/langchain into wip-v1.0	2025-08-18 23:30:10 -04:00
Mason Daugherty	d204f0dd55	feat(infra): add skip-preview tag check in Vercel deployment script (#32600 ) Having vercel attempt to deploy on each commit (even if unrelated to docs) was getting annoying. Options: - `[skip-preview]` - `[no-preview]` - `[skip-deploy]` Full example: `fix(core): resolve memory leak [no-preview]`	2025-08-18 17:33:27 -04:00
Mason Daugherty	0e6c172893	refactor(core): prefixes, again (#32599 ) Put in `core.utils` this time to prevent other circular import issues present in the `normalize()` rfc: `base` imports `content` `content` imports `ensure_id()` from `base`	2025-08-18 17:24:57 -04:00
Mason Daugherty	8ee0cbba3c	refactor(core): prefixes (#32597 ) re: #32589 cc: @ccurme - Rename namespace: `messages.content_blocks` -> `messages.content` - Prefixes and ID logic are now in `messages.common` instead of `AIMessage` since the logic is shared between messages and message content. Did this instead of `utils` due to circular import problems that were hairy	2025-08-18 16:33:12 -04:00
Mohammad Mohtashim	00259b0061	fix(deepseek): Deep Seek Model for LS Tracing (#32575 ) - Description: Fix for LS Tracing for Provider for DeepSeek. - Issue: #32484 --------- Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-08-18 18:48:30 +00:00
Mohammad Mohtashim	4fb1132e30	docs: Classification Notebook Update (#32357 ) - Description: Updating the Classification notebook which was raised [here](https://github.com/langchain-ai/langchain/issues/32354) - Issue: Fixes #32354 --------- Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-08-18 18:45:03 +00:00
Mason Daugherty	a6690eb9fd	release(anthropic): 0.3.19 (#32595 ) langchain-anthropic==0.3.19	2025-08-18 14:25:03 -04:00
Mason Daugherty	f69f9598f5	chore: update references to use the latest version of Claude-3.5 Sonnet (#32594 )	2025-08-18 14:11:15 -04:00
Mason Daugherty	8d0fb2d04b	fix(anthropic): correct `input_token` count for streaming (#32591 ) * Create usage metadata on [`message_delta`](https://docs.anthropic.com/en/docs/build-with-claude/streaming#event-types) instead of at the beginning. Consequently, token counts are not included during streaming but instead at the end. This allows for accurate reporting of server-side tool usage (important for billing) * Add some clarifying comments * Fix some outstanding Pylance warnings * Remove unnecessary `text` popping in thinking blocks * Also now correctly reports `input_cache_read`/`input_cache_creation` as a result	2025-08-18 17:51:47 +00:00
Mason Daugherty	8042b04da6	fix(anthropic): clean up null `file_id` fields in citations during message formatting (#32592 ) When citations are returned from streaming, they include a `file_id: null` field in their `content_block_location` structure. When these citations are passed back to the API in subsequent messages, the API rejects them with "Extra inputs are not permitted" for the `file_id` field.	2025-08-18 13:01:52 -04:00
ccurme	4790c7265a	feat(core): lazy-load standard content (#32570 )	2025-08-18 10:30:49 -04:00
Daehwi Kim	fb74265175	fix(docs): update LangGraph guides link and add JS how-to link (#32583 ) Description: Corrected LangGraph documentation link (changed to “guides”), and added a link to LangGraph JS how-to guides for clarity. Issue: N/A Dependencies: None --------- Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-08-18 14:27:37 +00:00
Oresztesz Margaritisz	21b61aaf9a	fix(docs): Using appropriate argument name in ToolNode for error handling (#32586 ) The appropriate `ToolNode` attribute for error handling is called `handle_tool_errors` instead of `handle_tool_error`. For further info see [ToolNode source code in LangGraph](https://github.com/langchain-ai/langgraph/blob/main/libs/prebuilt/langgraph/prebuilt/tool_node.py#L255) Twitter handle: gitaroktato	2025-08-18 10:12:10 -04:00
Keyu Chen	03138f41a0	feat(text-splitters): add optional custom header pattern support (#31887 ) ## Description This PR adds support for custom header patterns in `MarkdownHeaderTextSplitter`, allowing users to define non-standard Markdown header formats (like `**Header**`) and specify their hierarchy levels. Issue: Fixes #22738 Dependencies: None - this change has no new dependencies Key Changes: - Added optional `custom_header_patterns` parameter to support non-standard header formats - Enable splitting on patterns like `**Header**` and `***Header***` - Maintain full backward compatibility with existing usage - Added comprehensive tests for custom and mixed header scenarios ## Example Usage ```python from langchain_text_splitters import MarkdownHeaderTextSplitter headers_to_split_on = [ ("**", "Chapter"), ("***", "Section"), ] custom_header_patterns = { "**": 1, # Level 1 headers "***": 2, # Level 2 headers } splitter = MarkdownHeaderTextSplitter( headers_to_split_on=headers_to_split_on, custom_header_patterns=custom_header_patterns, ) # Now **Chapter 1** is treated as a level 1 header # And ***Section 1.1*** is treated as a level 2 header ``` ## Testing - ✅ Added unit tests for custom header patterns - ✅ Added tests for mixed standard and custom headers - ✅ All existing tests pass (backward compatibility maintained) - ✅ Linting and formatting checks pass --- The implementation provides a flexible solution while maintaining the simplicity of the existing API. Users can continue using the splitter exactly as before, with the new functionality being entirely opt-in through the `custom_header_patterns` parameter. --------- Co-authored-by: Mason Daugherty <mason@langchain.dev> Co-authored-by: Claude <noreply@anthropic.com>	2025-08-18 10:10:49 -04:00
ccurme	aeea0e3ff8	fix(langchain): fix tests on standard content branch (#32590 )	2025-08-18 09:49:01 -04:00
Mason Daugherty	fd891ee3d4	revert(anthropic): streaming token counting to defer input tokens until completion (#32587 ) Reverts langchain-ai/langchain#32518	2025-08-18 09:48:33 -04:00
ccurme	aca7c1fe6a	fix(core): temporarily fix tests (#32589 )	2025-08-18 09:45:06 -04:00
ccurme	b8cdbc4eca	fix(anthropic): sanitize tool use block when taking directly from content (#32574 )	2025-08-18 09:06:57 -04:00
Christophe Bornet	791d309c06	chore(langchain): add mypy `warn_unreachable` setting (#32529 ) See https://mypy.readthedocs.io/en/stable/config_file.html#confval-warn_unreachable --------- Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-08-15 23:03:53 +00:00
Mason Daugherty	d3d23e2372	fix(anthropic): streaming token counting to defer input tokens until completion (#32518 ) Supersedes #32461 Fixed incorrect input token reporting during streaming when tools are used. Previously, input tokens were counted at `message_start` before tool execution, leading to inaccurate counts. Now input tokens are properly deferred until `message_delta` (completion), aligning with Anthropic's billing model and SDK expectations. Before Fix: - Streaming with tools: Input tokens = 0 ❌ - Non-streaming with tools: Input tokens = 472 ✅ After Fix: - Streaming with tools: Input tokens = 472 ✅ - Non-streaming with tools: Input tokens = 472 ✅ Aligns with Anthropic's SDK expectations. The SDK handles input token updates in `message_delta` events: ```python # https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/streaming/_messages.py if event.usage.input_tokens is not None: current_snapshot.usage.input_tokens = event.usage.input_tokens ```	2025-08-15 17:49:46 -04:00
Mason Daugherty	2375c3a4d0	add note	2025-08-15 16:39:36 -04:00
Mason Daugherty	0199b56bda	rfc `test_utils` to make clearer what was existing before and after, and add comments	2025-08-15 16:37:39 -04:00
Mason Daugherty	00345c4de9	tests: add more data content block tests	2025-08-15 16:28:46 -04:00
Mason Daugherty	7f9727ee08	refactor: `is_data_content_block`	2025-08-15 16:28:33 -04:00
Mason Daugherty	08cd5bb9b4	clarify intent of `extras` under data blocks	2025-08-15 16:27:47 -04:00
Mason Daugherty	987031f86c	fix: `_LC_ID_PREFIX` back	2025-08-15 16:27:08 -04:00
Mason Daugherty	7a8c6398a4	clarify: meaning of provider	2025-08-15 16:01:29 -04:00
Mason Daugherty	f691dc348f	refactor: make `ensure_id` public	2025-08-15 15:42:17 -04:00
Mason Daugherty	86252d2ae6	refactor: move ID prefixes	2025-08-15 15:39:36 -04:00
Mason Daugherty	8bd2403518	fix: increase `max_tokens` limit to 64000 re: Anthropic dynamic tokens	2025-08-15 15:34:54 -04:00
Mason Daugherty	4dd9110424	Merge branch 'master' into wip-v1.0	2025-08-15 15:32:21 -04:00
Mason Daugherty	8fc1973bbf	test: add note about tuple conversion in ToolMessage	2025-08-15 15:30:51 -04:00
Mason Daugherty	a3b20b0ef5	clean up id test	2025-08-15 15:28:11 -04:00
Mason Daugherty	301a425151	snapshot	2025-08-15 15:16:07 -04:00
Mason Daugherty	3db8c60112	chore: more content block formatting	2025-08-15 15:01:07 -04:00
Mason Daugherty	8d110599cb	chore: more content block docstring formatting	2025-08-15 14:39:13 -04:00
Mason Daugherty	c9e847fcb8	chore: format `output_version` docstring	2025-08-15 14:33:59 -04:00
Mohammad Mohtashim	174e685139	feat(anthropic): dynamic mapping of Max Tokens for Anthropic (#31946 ) - Description: Dynamic mapping of `max_tokens` as per the chosen Anthropic model. - Issue: Fixes #31605 @ccurme --------- Co-authored-by: Caspar Broekhuizen <caspar@langchain.dev> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-08-15 11:33:51 -07:00
Mason Daugherty	601fa7d672	Merge branch 'wip-v1.0' into cc/1.0/standard_content	2025-08-15 14:31:50 -04:00
Mason Daugherty	7e39cd18c5	feat: allow kwargs on content block factories (#32568 )	2025-08-15 14:30:32 -04:00
Mason Daugherty	2f32c444b8	docs: add details on message IDs and their assignment process (#32534 )	2025-08-15 18:22:28 +00:00
Mason Daugherty	9721684501	Merge branch 'master' into wip-v1.0	2025-08-15 14:06:34 -04:00
Mason Daugherty	a4e135b508	fix: use `.get()` on image URL in `ImagePromptValue.to_string()`	2025-08-15 13:57:50 -04:00
Mason Daugherty	d111965448	Merge branch 'wip-v1.0' into cc/1.0/standard_content	2025-08-15 13:35:57 -04:00
Mason Daugherty	fe740a9397	fix(docs): `chatbot.ipynb` trimming regression (#32561 ) Supersedes #32544 Changes to the `trimmer` behavior caused the call `"What math problem was asked?"` to no longer see the relevant query because of the queries' token count. Adjusted so that trimming no longer removes the relevant part of the message history. Also added a print to the trimmer to increase observability into what leaves the context window. Added a note to the trimming tutorial and formatted links as inline.	2025-08-15 14:47:22 +00:00
Rostyslav Borovyk	b2b835cb36	docs(docs): add Oxylabs document loader (#32429 ) --------- Co-authored-by: Mason Daugherty <github@mdrxy.com>	2025-08-15 10:46:26 -04:00
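
Commit d204f0dd55 in the log above describes skipping Vercel preview deployments when a commit message carries one of three tags. A minimal Python sketch of that kind of check follows. It assumes a plain, case-sensitive substring match; the name `should_skip_preview` is hypothetical, and the actual deployment script may be written quite differently (for example, as a shell script).

```python
# Tags listed in the commit message for #32600.
SKIP_TAGS = ("[skip-preview]", "[no-preview]", "[skip-deploy]")


def should_skip_preview(commit_message: str) -> bool:
    """Return True if the commit message contains any skip tag.

    Hypothetical helper: assumes a case-sensitive substring match,
    which may not be how the real script decides.
    """
    return any(tag in commit_message for tag in SKIP_TAGS)


if __name__ == "__main__":
    # The "full example" from the commit message carries a skip tag.
    print(should_skip_preview("fix(core): resolve memory leak [no-preview]"))
    # A message without any tag would still trigger a preview deploy.
    print(should_skip_preview("docs: update quickstart"))
```

Keeping the tags in a single tuple means a new tag can be added without touching the matching logic.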

14087 Commits
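
Commit 8042b04da6 in the log above notes that citations returned from streaming carry a `file_id: null` field that the API rejects when the citation is sent back. The sketch below shows one way such a field could be dropped before resending. It assumes citations are plain dictionaries; the helper name `drop_null_file_id` and the sample fields other than `file_id` and the `content_block_location` type are illustrative, not taken from the langchain-anthropic source.

```python
from typing import Any


def drop_null_file_id(citation: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the citation without a null `file_id` key.

    Hypothetical helper: a non-null `file_id` is kept unchanged, and
    the input dictionary is not mutated.
    """
    return {
        key: value
        for key, value in citation.items()
        if not (key == "file_id" and value is None)
    }


if __name__ == "__main__":
    # Illustrative citation shape; only `file_id` matters here.
    streamed = {
        "type": "content_block_location",
        "cited_text": "example text",
        "file_id": None,
    }
    print(drop_null_file_id(streamed))
```

Returning a copy rather than deleting the key in place keeps the original streamed message intact for any other consumer.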