Commit Graph

1289 Commits

Author SHA1 Message Date
Mason Daugherty
c7c47f81c4 add type hints to messages 2025-08-05 00:04:30 -04:00
Mason Daugherty
5c9ce7fd2b remove outdated test 2025-08-04 23:47:17 -04:00
Mason Daugherty
f3c863447f fix: core imports tests 2025-08-04 23:24:47 -04:00
Mason Daugherty
5b7df921df add reasoning guard; make mime type not required alongside base64 2025-08-04 15:52:06 -04:00
Mason Daugherty
7bc4512ab7 add type guards to export 2025-08-04 15:51:42 -04:00
Mason Daugherty
ade2155aee fixes 2025-08-04 14:33:02 -04:00
Mason Daugherty
332215ed41 nit 2025-08-04 13:44:31 -04:00
Mason Daugherty
d4ac8ff5f7 lint 2025-08-04 13:14:04 -04:00
Mason Daugherty
719c9dfaaa fix: handle TypeGuard import for compatibility with older Python versions 2025-08-04 13:06:42 -04:00
Mason Daugherty
cc56b8dbd3 Merge branch 'standard_outputs_copy' into mdrxy/ollama_v1 + updates 2025-08-04 12:57:38 -04:00
ccurme
ff3153c04d feat(core): move tool call chunks to content (v1) (#32358) 2025-08-04 11:32:11 -04:00
Mason Daugherty
8d8a61ab3b fix: create_plaintext_block 2025-08-04 11:31:29 -04:00
Mason Daugherty
a6686d7c4f type guards to remove casting 2025-08-04 11:29:36 -04:00
Mason Daugherty
eae4d1db43 integration tests 2025-08-01 12:47:08 -04:00
Mason Daugherty
bc5c6751dc fix test 2025-07-31 17:37:42 -04:00
Mason Daugherty
05d075a37f messages.v1 mypy fixes 2025-07-31 17:32:59 -04:00
Mason Daugherty
7a0c3e0482 fix: update snapshots 2025-07-31 17:27:37 -04:00
Mason Daugherty
02d02ccf1b feat: enhance serialization for v1 message classes in dump and load modules 2025-07-31 17:08:39 -04:00
Mason Daugherty
e1b8ae2e6c content_blocks: add missing blocks to aliases and formatting 2025-07-31 17:08:10 -04:00
Mason Daugherty
2cb48b685f messages v1: some nits, and use create_text_block() 2025-07-31 17:07:42 -04:00
Mason Daugherty
525fa453be fix: revert pydantic bump (#32355) 2025-07-31 12:22:23 -04:00
Mason Daugherty
c88adfad70 fix: updatd snapshots 2025-07-31 11:21:40 -04:00
Mason Daugherty
7f1e98c7fb fix: add index to InvalidToolCall 2025-07-31 11:19:01 -04:00
Mason Daugherty
44bd6fe837 feat(core): content block factories + ids + docs + tests (#32316)
## Benefits

1. **Type Safety**: Compile-time validation of required fields and
proper type setting
2. **Less Boilerplate**: No need to manually set the `type` field or
generate IDs
3. **Input Validation**: Runtime validation prevents common errors
(e.g., base64 without MIME type)
4. **Consistent Patterns**: Standardized creation patterns across all
block types
5. **Better Developer Experience**: Cleaner, more intuitive API than
manual TypedDict construction. Also follows similar other patterns (e.g.
`create_react_agent`, `init_chat_model`
2025-07-31 11:12:00 -04:00
ccurme
740d9d3e7e fix(core): fix tracing for new message types in case of multiple messages (#32352) 2025-07-31 10:47:23 -04:00
ccurme
642262f6fe feat(core): widen input type for output parsers (#32332) 2025-07-30 16:52:34 -04:00
Chester Curme
a0abb79f6d Merge branch 'wip-v0.4' into standard_outputs_copy 2025-07-30 13:17:08 -04:00
ccurme
309d1a232a fix(openai): fix tracing and typing on standard outputs branch (#32326) 2025-07-30 13:02:15 -04:00
ccurme
8cf97e838c fix(core): lint standard outputs branch (#32311) 2025-07-29 15:38:45 -04:00
Mason Daugherty
fbd5a238d8 fix(core): revert "fix: tool call streaming bug with inconsistent indices from Qwen3" (#32307)
Reverts langchain-ai/langchain#32160

Original issue stems from using `ChatOpenAI` to interact with a `qwen`
model. Recommended to use
[langchain-qwq](https://python.langchain.com/docs/integrations/chat/qwq/)
which is built for Qwen
2025-07-29 10:26:38 -04:00
Mason Daugherty
0e287763cd fix: lint 2025-07-28 18:49:43 -04:00
ccurme
c15e55b33c feat(openai): v1 message format support (#32296) 2025-07-28 18:42:26 -04:00
Copilot
0b56c1bc4b fix: tool call streaming bug with inconsistent indices from Qwen3 (#32160)
Fixes a streaming bug where models like Qwen3 (using OpenAI interface)
send tool call chunks with inconsistent indices, resulting in
duplicate/erroneous tool calls instead of a single merged tool call.

## Problem

When Qwen3 streams tool calls, it sends chunks with inconsistent `index`
values:
- First chunk: `index=1` with tool name and partial arguments  
- Subsequent chunks: `index=0` with `name=None`, `id=None` and argument
continuation

The existing `merge_lists` function only merges chunks when their
`index` values match exactly, causing these logically related chunks to
remain separate, resulting in multiple incomplete tool calls instead of
one complete tool call.

```python
# Before fix: Results in 1 valid + 1 invalid tool call
chunk1 = AIMessageChunk(tool_call_chunks=[
    {"name": "search", "args": '{"query":', "id": "call_123", "index": 1}
])
chunk2 = AIMessageChunk(tool_call_chunks=[
    {"name": None, "args": ' "test"}', "id": None, "index": 0}  
])
merged = chunk1 + chunk2  # Creates 2 separate tool calls

# After fix: Results in 1 complete tool call
merged = chunk1 + chunk2  # Creates 1 merged tool call: search({"query": "test"})
```

## Solution

Enhanced the `merge_lists` function in `langchain_core/utils/_merge.py`
with intelligent tool call chunk merging:

1. **Preserves existing behavior**: Same-index chunks still merge as
before
2. **Adds special handling**: Tool call chunks with
`name=None`/`id=None` that don't match any existing index are now merged
with the most recent complete tool call chunk
3. **Maintains backward compatibility**: All existing functionality
works unchanged
4. **Targeted fix**: Only affects tool call chunks, doesn't change
behavior for other list items

The fix specifically handles the pattern where:
- A continuation chunk has `name=None` and `id=None` (indicating it's
part of an ongoing tool call)
- No matching index is found in existing chunks
- There exists a recent tool call chunk with a valid name or ID to merge
with

## Testing

Added comprehensive test coverage including:
-  Qwen3-style chunks with different indices now merge correctly
-  Existing same-index behavior preserved  
-  Multiple distinct tool calls remain separate
-  Edge cases handled (empty chunks, orphaned continuations)
-  Backward compatibility maintained

Fixes #31511.

<!-- START COPILOT CODING AGENT TIPS -->
---

💬 Share your feedback on Copilot coding agent for the chance to win a
$200 gift card! Click
[here](https://survey.alchemer.com/s3/8343779/Copilot-Coding-agent) to
start the survey.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-28 22:31:41 +00:00
Copilot
ad88e5aaec fix(core): resolve cache validation error by safely converting Generation to ChatGeneration objects (#32156)
## Problem

ChatLiteLLM encounters a `ValidationError` when using cache on
subsequent calls, causing the following error:

```
ValidationError(model='ChatResult', errors=[{'loc': ('generations', 0, 'type'), 'msg': "unexpected value; permitted: 'ChatGeneration'", 'type': 'value_error.const', 'ctx': {'given': 'Generation', 'permitted': ('ChatGeneration',)}}])
```

This occurs because:
1. The cache stores `Generation` objects (with `type="Generation"`)
2. But `ChatResult` expects `ChatGeneration` objects (with
`type="ChatGeneration"` and a required `message` field)
3. When cached values are retrieved, validation fails due to the type
mismatch

## Solution

Added graceful handling in both sync (`_generate_with_cache`) and async
(`_agenerate_with_cache`) cache methods to:

1. **Detect** when cached values contain `Generation` objects instead of
expected `ChatGeneration` objects
2. **Convert** them to `ChatGeneration` objects by wrapping the text
content in an `AIMessage`
3. **Preserve** all original metadata (`generation_info`)
4. **Allow** `ChatResult` creation to succeed without validation errors

## Example

```python
# Before: This would fail with ValidationError
from langchain_community.chat_models import ChatLiteLLM
from langchain_community.cache import SQLiteCache
from langchain.globals import set_llm_cache

set_llm_cache(SQLiteCache(database_path="cache.db"))
llm = ChatLiteLLM(model_name="openai/gpt-4o", cache=True, temperature=0)

print(llm.predict("test"))  # Works fine (cache empty)
print(llm.predict("test"))  # Now works instead of ValidationError

# After: Seamlessly handles both Generation and ChatGeneration objects
```

## Changes

- **`libs/core/langchain_core/language_models/chat_models.py`**: 
  - Added `Generation` import from `langchain_core.outputs`
- Enhanced cache retrieval logic in `_generate_with_cache` and
`_agenerate_with_cache` methods
- Added conversion from `Generation` to `ChatGeneration` objects when
needed

-
**`libs/core/tests/unit_tests/language_models/chat_models/test_cache.py`**:
- Added test case to validate the conversion logic handles mixed object
types

## Impact

- **Backward Compatible**: Existing code continues to work unchanged
- **Minimal Change**: Only affects cache retrieval path, no API changes
- **Robust**: Handles both legacy cached `Generation` objects and new
`ChatGeneration` objects
- **Preserves Data**: All original content and metadata is maintained
during conversion

Fixes #22389.

<!-- START COPILOT CODING AGENT TIPS -->
---

💡 You can make Copilot smarter by setting up custom instructions,
customizing its development environment and configuring Model Context
Protocol (MCP) servers. Learn more [Copilot coding agent
tips](https://gh.io/copilot-coding-agent-tips) in the docs.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mdrxy <61371264+mdrxy@users.noreply.github.com>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-28 22:28:16 +00:00
Chester Curme
7166adce1f Merge branch 'wip-v0.4' into standard_outputs_copy
# Conflicts:
#	libs/core/langchain_core/messages/tool.py
#	libs/partners/openai/langchain_openai/chat_models/_compat.py
#	libs/partners/openai/langchain_openai/chat_models/base.py
2025-07-28 13:41:50 -04:00
ccurme
c55294ecb0 chore(core): add test for nested pydantic fields in schemas (#32285) 2025-07-28 17:27:24 +00:00
Mason Daugherty
a07d2c5016 refactor: remove references to unsupported model claude-3-sonnet-20240229 (#32281)
Addresses some (but not all) test issues brought about in #32280
2025-07-28 11:57:43 -04:00
Mason Daugherty
5e9eb19a83 chore: update branch with changes from master (#32277)
Co-authored-by: Maxime Grenu <69890511+cluster2600@users.noreply.github.com>
Co-authored-by: Claude <claude@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: jmaillefaud <jonathan.maillefaud@evooq.ch>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: tanwirahmad <tanwirahmad@users.noreply.github.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: niceg <79145285+growmuye@users.noreply.github.com>
Co-authored-by: Chaitanya varma <varmac301@gmail.com>
Co-authored-by: dishaprakash <57954147+dishaprakash@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Kanav Bansal <13186335+bansalkanav@users.noreply.github.com>
Co-authored-by: Aleksandr Filippov <71711753+alex-feel@users.noreply.github.com>
Co-authored-by: Alex Feel <afilippov@spotware.com>
2025-07-28 10:39:41 -04:00
ccurme
3d9e694f73 feat(core): start on v1 chat model (#32276)
Co-authored-by: Nuno Campos <nuno@langchain.dev>
2025-07-28 10:17:06 -04:00
Aleksandr Filippov
f0b6baa0ef fix(core): track within-batch deduplication in indexing num_skipped count (#32273)
**Description:** Fixes incorrect `num_skipped` count in the LangChain
indexing API. The current implementation only counts documents that
already exist in RecordManager (cross-batch duplicates) but fails to
count documents removed during within-batch deduplication via
`_deduplicate_in_order()`.

This PR adds tracking of the original batch size before deduplication
and includes the difference in `num_skipped`, ensuring that `num_added +
num_skipped` equals the total number of input documents.

**Issue:** Fixes incorrect document count reporting in indexing
statistics

**Dependencies:** None

Fixes #32272

---------

Co-authored-by: Alex Feel <afilippov@spotware.com>
2025-07-28 09:58:51 -04:00
Mason Daugherty
96cbd90cba fix: formatting issues in docstrings (#32265)
Ensures proper reStructuredText formatting by adding the required blank
line before closing docstring quotes, which resolves the "Block quote
ends without a blank line; unexpected unindent" warning.
2025-07-27 23:37:47 -04:00
Mason Daugherty
f624ad489a feat(docs): improve devx, fix Makefile targets (#32237)
**TL;DR much of the provided `Makefile` targets were broken, and any
time I wanted to preview changes locally I either had to refer to a
command Chester gave me or try waiting on a Vercel preview deployment.
With this PR, everything should behave like normal.**

Significant updates to the `Makefile` and documentation files, focusing
on improving usability, adding clear messaging, and fixing/enhancing
documentation workflows.

### Updates to `Makefile`:

#### Enhanced build and cleaning processes:
- Added informative messages (e.g., "📚 Building LangChain
documentation...") to makefile targets like `docs_build`, `docs_clean`,
and `api_docs_build` for better user feedback during execution.
- Introduced a `clean-cache` target to the `docs` `Makefile` to clear
cached dependencies and ensure clean builds.

#### Improved dependency handling:
- Modified `install-py-deps` to create a `.venv/deps_installed` marker,
preventing redundant/duplicate dependency installations and improving
efficiency.

#### Streamlined file generation and infrastructure setup:
- Added caching for the LangServe README download and parallelized
feature table generation
- Added user-friendly completion messages for targets like `copy-infra`
and `render`.

#### Documentation server updates:
- Enhanced the `start` target with messages indicating server start and
URL for local documentation viewing.

---

### Documentation Improvements:

#### Content clarity and consistency:
- Standardized section titles for consistency across documentation
files.
[[1]](diffhunk://#diff-9b1a85ea8a9dcf79f58246c88692cd7a36316665d7e05a69141cfdc50794c82aL1-R1)
[[2]](diffhunk://#diff-944008ad3a79d8a312183618401fcfa71da0e69c75803eff09b779fc8e03183dL1-R1)
- Refined phrasing and formatting in sections like "Dependency
management" and "Formatting and linting" for better readability.
[[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L6-R6)
[[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L84-R82)

#### Enhanced workflows:
- Updated instructions for building and viewing documentation locally,
including tips for specifying server ports and handling API reference
previews.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L60-R94)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126)
- Expanded guidance on cleaning documentation artifacts and using
linting tools effectively.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142)

#### API reference documentation:
- Improved instructions for generating and formatting in-code
documentation, highlighting best practices for docstring writing.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L144-R186)

---

### Minor Changes:
- Added support for a new package name (`langchain_v1`) in the API
documentation generation script.
- Fixed minor capitalization and formatting issues in documentation
files.
[[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L40-R40)
[[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L166-R160)

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-07-25 14:49:03 -04:00
Mason Daugherty
c921d08b18 feat(docs): add docstring to _convert_from_v1_message() 2025-07-25 11:01:48 -04:00
Mason Daugherty
3f653011e6 nit: use block instead of content_block for consistency in convert_to_openai_image_block() 2025-07-25 10:57:22 -04:00
Mason Daugherty
ee13a3b6fa nit: rearrange index to be grouped with other always-present fields 2025-07-25 10:16:35 -04:00
niceg
0d6f915442 fix: LLM mimicking Unicode responses due to forced Unicode conversion of non-ASCII characters. (#32222)
fix: Fix LLM mimicking Unicode responses due to forced Unicode
conversion of non-ASCII characters.

- **Description:** This PR fixes an issue where the LLM would mimic
Unicode responses due to forced Unicode conversion of non-ASCII
characters in tool calls. The fix involves disabling the `ensure_ascii`
flag in `json.dumps()` when converting tool calls to OpenAI format.
- **Issue:** Fixes ↓↓↓
input:
```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "你好啊集团"}'}}]}
```
output:
```json
{'role': 'assistant', 'tool_calls': [{'type': 'function', 'id': 'call_nv9trcehdpihr21zj9po19vq', 'function': {'name': 'create_customer', 'arguments': '{"customer_name": "\\u4f60\\u597d\\u554a\\u96c6\\u56e2"}'}}]}
```
then:
llm will mimic outputting unicode. Unicode's vast number of symbols can
lengthen LLM responses, leading to slower performance.
<img width="686" height="277" alt="image"
src="https://github.com/user-attachments/assets/28f3b007-3964-4455-bee2-68f86ac1906d"
/>

---------

Co-authored-by: Mason Daugherty <github@mdrxy.com>
Co-authored-by: Mason Daugherty <mason@langchain.dev>
2025-07-24 17:01:31 -04:00
Mason Daugherty
d53ebf367e fix(docs): capitalization, codeblock formatting, and hyperlinks, note blocks (#32235)
widespread cleanup attempt
2025-07-24 16:55:04 -04:00
Mason Daugherty
bd3d6496f3 release(core): 0.3.72 (#32214)
fixes #32170
2025-07-23 20:33:48 -04:00
jmaillefaud
fb5da8384e fix(core): Dereference Refs for pydantic schema fails in tool schema generation (#32203)
The `_dereference_refs_helper` in `langchain_core.utils.json_schema`
incorrectly handled objects with a reference and other fields.

**Issue**: #32170

# Description

We change the check so that it accepts other keys in the object.
2025-07-23 20:28:27 -04:00
ccurme
e9b0b84675 feat: new message formats (v0.4) (#32208)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-07-23 13:30:21 -04:00