Commit Graph

7112 Commits

Author SHA1 Message Date
Chester Curme
d7133d760b lock 2025-06-16 18:00:17 -04:00
ccurme
b9357d456e
openai[patch]: refactor handling of Responses API (#31587) 2025-06-16 14:01:39 -04:00
Tom-Trumper
532e6455e9
text-splitters: Add keep_separator arg to HTMLSemanticPreservingSplitter (#31588)
### Description
Add a `keep_separator` argument to `HTMLSemanticPreservingSplitter` and pass its
value to the `RecursiveCharacterTextSplitter` instance used under the hood.
### Issue
Documents returned by `HTMLSemanticPreservingSplitter.split_text(text)`
place separators at the beginning of `page_content` by default. [See the third
and fourth documents in the example output from the how-to
guide](https://python.langchain.com/docs/how_to/split_html/#using-htmlsemanticpreservingsplitter):
```
[Document(metadata={'Header 1': 'Main Title'}, page_content='This is an introductory paragraph with some basic content.'),
 Document(metadata={'Header 2': 'Section 1: Introduction'}, page_content='This section introduces the topic'),
 Document(metadata={'Header 2': 'Section 1: Introduction'}, page_content='. Below is a list: First item Second item Third item with bold text and a link Subsection 1.1: Details This subsection provides additional details'),
 Document(metadata={'Header 2': 'Section 1: Introduction'}, page_content=". Here's a table: Header 1 Header 2 Header 3 Row 1, Cell 1 Row 1, Cell 2 Row 1, Cell 3 Row 2, Cell 1 Row 2, Cell 2 Row 2, Cell 3"),
 Document(metadata={'Header 2': 'Section 2: Media Content'}, page_content='This section contains an image and a video: ![image:example_image_link.mp4](example_image_link.mp4) ![video:example_video_link.mp4](example_video_link.mp4)'),
 Document(metadata={'Header 2': 'Section 3: Code Example'}, page_content='This section contains a code block: <code:html> <div> <p>This is a paragraph inside a div.</p> </div> </code>'),
 Document(metadata={'Header 2': 'Conclusion'}, page_content='This is the conclusion of the document.')]
```
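To make the effect of the option concrete, here is a simplified sketch of separator placement. It is not the library's implementation: the helper name `split_keep_separator`, its `keep` values, and the sample text are assumptions made up for illustration.

```python
import re


def split_keep_separator(text: str, separator: str, keep="start") -> list[str]:
    """Split `text` on `separator`, optionally keeping the separator.

    keep="start": the separator is attached to the start of the following chunk.
    keep="end":   the separator is attached to the end of the preceding chunk.
    keep=False:   the separator is dropped.
    """
    # The capturing group makes re.split return the separators as well:
    # [text, sep, text, sep, text]
    parts = re.split(f"({re.escape(separator)})", text)
    if keep == "end":
        chunks = [parts[i] + parts[i + 1] for i in range(0, len(parts) - 1, 2)]
        chunks.append(parts[-1])
    elif keep == "start":
        chunks = [parts[0]]
        chunks += [parts[i] + parts[i + 1] for i in range(1, len(parts) - 1, 2)]
    else:
        chunks = parts[0::2]
    # Drop empty strings produced by leading or trailing separators.
    return [chunk for chunk in chunks if chunk]


sample = "This section introduces the topic. Below is a list. Here's a table"
print(split_keep_separator(sample, ". ", keep="start"))
print(split_keep_separator(sample, ". ", keep="end"))
```

With `keep="start"` the later chunks begin with `. `, which matches the leading separators visible in the third and fourth documents above; `keep="end"` moves them to the end of the preceding chunk instead.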
### Dependencies
None

@ttrumper3
2025-06-14 17:56:14 -04:00
Peter Schneider
cecfec5efa
huggingface: handle image-text-to-text pipeline task (#31611)
**Description:** Allows `HuggingFacePipeline` to handle the
`image-text-to-text` pipeline task.
2025-06-14 16:41:11 -04:00
Akim Tsvigun
f345ae5a1d
docs: Integration with Nebius AI Studio (#31293)
This MR is only for the docs. Added integration with Nebius AI Studio to
docs. The integration package is available at
[https://github.com/nebius/langchain-nebius](https://github.com/nebius/langchain-nebius).

---------

Co-authored-by: Akim Tsvigun <aktsvigun@nebius.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-06-14 16:15:27 -04:00
Xin Jin
01fcdff118
bump langsmith to allow 0.4 (#31594)
LangSmith 0.4 has launched, so bump the allowed version across OSS packages:
langchain and langchain-core. A separate langsmith-doc announcement will follow.
2025-06-13 07:59:42 -07:00
ccurme
5839801897
openai: release 0.3.23 (#31604) 2025-06-13 14:02:38 +00:00
ccurme
0c10ff6418
openai[patch]: handle annotation change in openai==1.82.0 (#31597)
https://github.com/openai/openai-python/pull/2372/files#diff-91cfd5576e71b4b72da91e04c3a029bab50a72b5f7a2ac8393fca0a06e865fb3
2025-06-12 23:38:41 -04:00
Nuno Campos
ddc850ca72
core: In LangChainTracer, send only the first token event (#31591)
- only the first one is used for analytics
2025-06-12 14:04:23 -07:00
Eugene Yurtsev
d10fd02bb3
langchain[patch]: Allow specifying other hashing functions in embeddings (#31561)
Allow specifying other hashing functions in embeddings
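
A minimal sketch of the idea, assuming cache keys for embedded texts are derived by hashing the text. The helper `make_key_encoder` and its parameters are hypothetical and are not the API added by this commit:

```python
import hashlib
from typing import Callable


def make_key_encoder(namespace: str, algorithm: str = "sha1") -> Callable[[str], str]:
    """Return a function mapping a text to a namespaced cache key.

    `algorithm` may be any name understood by hashlib (e.g. "sha1",
    "sha256", "blake2b"), so callers can opt out of the default hash.
    """
    if algorithm not in hashlib.algorithms_available:
        raise ValueError(f"Unsupported hashing algorithm: {algorithm!r}")

    def encode(text: str) -> str:
        digest = hashlib.new(algorithm, text.encode("utf-8")).hexdigest()
        return f"{namespace}{digest}"

    return encode


encoder = make_key_encoder("embeddings:", algorithm="sha256")
print(encoder("hello world"))
```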
2025-06-11 10:18:07 -04:00
ccurme
4071670f56
huggingface[patch]: bump transformers (#31559) 2025-06-10 20:43:33 +00:00
ccurme
40d6d4c738
huggingface[patch]: bump core dep (#31558) 2025-06-10 20:26:13 +00:00
Mohammad Mohtashim
42eb356a44
[OpenAI]: Encoding Model (#31402)
- **Description:** Small fix for handling a `KeyError` when getting the
encoder, and for using the correct encoder for newer models
- **Issue:** #31390
2025-06-10 16:00:00 -04:00
ccurme
b0f100af7e
core: release 0.3.65 (#31557) 2025-06-10 19:39:50 +00:00
Sydney Runkle
5b165effcd
core(fix): revert set_text optimization (#31555)
Revert serialization regression introduced in
https://github.com/langchain-ai/langchain/pull/31238

Fixes https://github.com/langchain-ai/langchain/issues/31486
2025-06-10 13:36:55 -04:00
Eugene Yurtsev
9ce974247c
langchain[patch]: Remove proxy imports to langchain_experimental (#31541)
Remove proxy imports to langchain_experimental.

Previously, these imports would work if a user manually installed
langchain_experimental. However, we want to drop support even for that
as langchain_experimental is generally not recommended to be run in
production.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2025-06-09 17:09:09 -04:00
ccurme
71b0f78952
openai: release 0.3.22 (#31542) 2025-06-09 15:29:15 -04:00
ccurme
575662d5f1
openai[patch]: accommodate change in image generation API (#31522)
OpenAI changed their API to require the `partial_images` parameter when
using image generation + streaming.

As described in https://github.com/langchain-ai/langchain/pull/31424, we
are ignoring partial images. Here, we accept the `partial_images`
parameter (as required by OpenAI), but emit a warning and continue to
ignore partial images.
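
The behavior described above could be sketched roughly as follows. This is an illustration, not the code in this commit: the helper names and the `"partial_image"` event label are assumptions.

```python
import warnings
from typing import Any, Iterable, Iterator

# Hypothetical label for stream events that carry a partial image.
PARTIAL_IMAGE_EVENT = "partial_image"


def check_image_tool(tool: dict[str, Any]) -> dict[str, Any]:
    """Accept `partial_images` (the API requires it) but warn that it is ignored."""
    if tool.get("type") == "image_generation" and "partial_images" in tool:
        warnings.warn(
            "partial_images was provided, but partial images are ignored "
            "when streaming.",
            stacklevel=2,
        )
    return tool  # the parameter is still passed through unchanged


def drop_partial_images(events: Iterable[dict[str, Any]]) -> Iterator[dict[str, Any]]:
    """Yield every stream event except partial-image events."""
    for event in events:
        if event.get("type") != PARTIAL_IMAGE_EVENT:
            yield event
```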
2025-06-09 14:57:46 -04:00
ccurme
ece9e31a7a
openai[patch]: VCR some tests (#31524) 2025-06-06 23:00:57 +00:00
Bagatur
5187817006
openai[release]: 0.3.21 (#31519) 2025-06-06 11:40:09 -04:00
Bagatur
761f8c3231
openai[patch]: pass through with_structured_output kwargs (#31518)
Support passing additional keyword arguments through `with_structured_output`, for example:
```python
from langchain.chat_models import init_chat_model
from pydantic import BaseModel


class ResponseSchema(BaseModel):
    response: str


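def get_weather(location: str) -> str:
    """Get weather"""
    # Placeholder result so the function matches its `str` return annotation.
    return f"It is sunny in {location}."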

llm = init_chat_model("openai:gpt-4o-mini")

structured_llm = llm.with_structured_output(
    ResponseSchema,
    tools=[get_weather],
    strict=True,
    include_raw=True,
    tool_choice="required",
    parallel_tool_calls=False,
)

structured_llm.invoke("whats up?")
```
2025-06-06 11:17:34 -04:00
Bagatur
0375848f6c
openai[patch]: update with_structured_outputs docstring (#31517)
Update docstrings
2025-06-06 10:03:47 -04:00
ccurme
9c639035c0
standard-tests: add cache_control to Anthropic inputs test (#31516) 2025-06-06 10:00:43 -04:00
ccurme
a1f068eb85
openai: release 0.3.20 (#31515) 2025-06-06 13:29:12 +00:00
ccurme
4cc2f6b807
openai[patch]: guard against None text completions in BaseOpenAI (#31514)
Some chat completions APIs will return null `text` output (even though
this is typed as string).
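
A minimal sketch of such a guard, assuming a completions-style response dict with a `choices` list (the helper name `completion_texts` is hypothetical):

```python
from typing import Any


def completion_texts(response: dict[str, Any]) -> list[str]:
    """Extract completion texts, treating a null `text` as an empty string."""
    return [choice.get("text") or "" for choice in response.get("choices", [])]


print(completion_texts({"choices": [{"text": None}, {"text": "hi"}]}))
```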
2025-06-06 09:14:37 -04:00
lc-arjun
35ae5eab4f
core: use run tree post/patch (#31500)
Use run post/patch
2025-06-05 14:05:57 -07:00
Eugene Yurtsev
73655b0ca8
huggingface: 0.3.0 release (#31503)
Breaking change to make some dependencies optional:
https://github.com/langchain-ai/langchain/pull/31268

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-06-05 20:20:15 +00:00
Bagatur
f7f52cab12
anthropic[patch]: cache tokens nit (#31484)
If you pass in beta headers directly, `cache_creation` is a dict.
2025-06-05 16:15:03 -04:00
ccurme
14c561e15d
infra: relax types-requests version range (#31504) 2025-06-05 18:57:08 +00:00
ccurme
6d6f305748
openai[patch]: clarify docs on api_version in docstring for AzureChatOpenAI (#31502) 2025-06-05 16:06:22 +00:00
Simon Stone
815bfa5408
huggingface[major]: Reduce disk footprint by 95% by making large dependencies optional (#31268)
**Description:** 
`langchain_huggingface` has a very large installation size of around 600
MB (on a Mac with Python 3.11). This is due to its dependency on
`sentence-transformers`, which in turn depends on `torch`, which is 320
MB all by itself. Similarly, the dependency on `transformers` adds
another set of heavy dependencies. With those dependencies removed, the
installation of `langchain_huggingface` only takes up ~26 MB. This is
only 5% of the full installation!

These libraries are not necessary to use `langchain_huggingface`'s API
wrapper classes, only for local inference/embeddings. All import
statements for those two libraries already have import guards in place
(try/except with a helpful "please install x" message).
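
An import guard of this kind might look like the sketch below. The helper name `import_optional` and the exact wording of the message are assumptions for illustration, not the package's actual code:

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, extra: str = "full") -> ModuleType:
    """Import a heavy optional dependency, or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Could not import `{module_name}`. Install it with "
            f'`pip install "langchain_huggingface[{extra}]"`.'
        ) from exc
```

Calling the helper inside the function that needs the dependency, rather than at module import time, keeps the lightweight installation importable.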

This PR therefore moves those two libraries to an optional dependency
group `full`. So a `pip install langchain_huggingface` will only install
the lightweight version, and a `pip install
"langchain_huggingface[full]"` will install all dependencies.

I know this may break existing code, because `sentence-transformers` and
`transformers` are now no longer installed by default. Given that users
will see helpful error messages when that happens, and the major impact
of this small change, I hope that you will still consider this PR.

**Dependencies:** No new dependencies, but new optional grouping.
2025-06-05 12:04:19 -04:00
Mohammad Mohtashim
ae3551c96b
core[patch]: Correct type casting of annotations in _infer_arg_descriptions (#31181)
- **Description:** In `_infer_arg_descriptions`, the annotations dictionary
  contains string representations of types instead of actual typing objects.
  This causes `_is_annotated_type` to fail, preventing the correct description
  from being generated. This is a simple fix using the `get_type_hints` method,
  which resolves the annotations properly and is supported across all Python
  versions.
- **Issue:** #31051
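
The difference between raw string annotations and resolved type hints can be seen in a small standalone sketch. The functions `search` and `arg_descriptions` are hypothetical examples, not LangChain code:

```python
from typing import Annotated, get_args, get_origin, get_type_hints


def search(query: "Annotated[str, 'The search query']", limit: "int" = 5) -> "list[str]":
    """Hypothetical tool function whose annotations are stored as strings."""
    return [query] * limit


def arg_descriptions(func) -> dict[str, str]:
    """Collect the first string found in each argument's Annotated metadata."""
    # include_extras=True keeps the Annotated wrapper instead of stripping it.
    hints = get_type_hints(func, include_extras=True)
    descriptions: dict[str, str] = {}
    for name, hint in hints.items():
        if name == "return" or get_origin(hint) is not Annotated:
            continue
        for item in get_args(hint)[1:]:  # metadata follows the base type
            if isinstance(item, str):
                descriptions[name] = item
                break
    return descriptions


# The raw annotation is a plain string, so Annotated checks on it fail.
print(type(search.__annotations__["query"]))
# After resolution the Annotated metadata is accessible.
print(arg_descriptions(search))
```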

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-06-05 11:58:36 -04:00
ccurme
43bee469ce
standard-tests: release 0.3.20 (#31499) 2025-06-05 11:28:18 -04:00
ccurme
741bb1ffa1
core[patch]: revert change to stream type hint (#31501)
https://github.com/langchain-ai/langchain/pull/31286 included an update
to the return type for `BaseChatModel.(a)stream`, from
`Iterator[BaseMessageChunk]` to `Iterator[BaseMessage]`.

This change is correct, because when streaming is disabled, the stream
methods return an iterator of `BaseMessage`, and the inheritance is such
that a `BaseMessage` is not a `BaseMessageChunk` (but the reverse is
true).

However, LangChain includes a pattern throughout its docs of [summing
BaseMessageChunks](https://python.langchain.com/docs/how_to/streaming/#llms-and-chat-models)
to accumulate a chat model stream. This pattern is implemented in tests
for most integration packages and appears in application code. So
https://github.com/langchain-ai/langchain/pull/31286 introduces mypy
errors throughout the ecosystem (or maybe more accurately, it reveals
that this pattern does not account for use of the `.stream` method when
streaming is disabled).
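
The typing problem can be reproduced with stand-in classes. The sketch below uses simplified, hypothetical `Message` and `MessageChunk` types rather than the real LangChain classes:

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Message:
    content: str


@dataclass
class MessageChunk(Message):
    # Only chunks support "+", which is what the accumulation pattern relies on.
    def __add__(self, other: "MessageChunk") -> "MessageChunk":
        return MessageChunk(self.content + other.content)


def stream(streaming: bool) -> Iterator[Message]:
    """Yield chunks when streaming, or one plain message when it is disabled."""
    if streaming:
        yield MessageChunk("Hel")
        yield MessageChunk("lo")
    else:
        yield Message("Hello")


def accumulate(chunks: Iterable[MessageChunk]) -> Optional[MessageChunk]:
    """The documented pattern: sum the chunks of a stream into one message."""
    full: Optional[MessageChunk] = None
    for chunk in chunks:
        full = chunk if full is None else full + chunk
    return full


# Runs fine, but a type checker rejects this call: stream() is declared to
# yield Message, and a Message is not necessarily a MessageChunk.
print(accumulate(stream(streaming=True)))
```

Declaring the return type as an iterator of the base class is the accurate signature, yet it is exactly what makes the accumulation call fail type checking.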

Here we revert just the change to the stream return type to unblock
things. A fix for this should address docs + integration packages (or if
we elect to just force people to update code, be explicit about that).
2025-06-05 11:20:06 -04:00
Bagatur
ec8bab83f8
anthropic[fix]: bump langchain-core dep (#31483) 2025-06-03 10:56:48 -04:00
Bagatur
310e643842
release[anthropic]: 0.3.15 (#31479)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-06-03 10:38:11 -04:00
Christophe Bornet
539e5b6936
core: Add mypy strict-equality rule (#31286) 2025-06-02 18:24:35 +00:00
Sam Zhang
2c4e0ab3bc
fix: module 'defusedxml' has no attribute 'ElementTree' (#31429) (#31431)
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-06-02 18:09:22 +00:00
Eugene Yurtsev
6cb3ea514a
openai: release 0.3.19 (#31466)
Release 0.3.19
2025-06-02 12:44:49 -04:00
Eugene Yurtsev
17f34baa88
openai[minor]: add image generation to responses api (#31424)
Does not support partial images during generation at the moment. Before
doing that I'd like to figure out how to specify the aggregation logic
without requiring changes in core.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-06-02 10:03:54 -04:00
ccurme
d3be4a0c56
infra: remove use of --vcr-record=none (#31452)
This option is specific to `pytest-vcr`. `pytest-recording` runs in this
mode by default.
2025-06-01 10:49:59 -04:00
ccurme
3db1aa0ba6
standard-tests: migrate to pytest-recording (#31425)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2025-05-31 15:21:15 -04:00
ccurme
5bf89628bf
groq[patch]: update model for integration tests (#31440)
Llama-3.1 started failing consistently with
> groq.BadRequestError: Error code: 400 - ***'error': ***'message':
"Failed to call a function. Please adjust your prompt. See
'failed_generation' for more details.", 'type': 'invalid_request_error',
'code': 'tool_use_failed', 'failed_generation':
'<function=brave_search>***"query": "Hello!"***</function>'***
2025-05-30 17:27:12 +00:00
अंkur गोswami
729526ff7c
huggingface: Undefined model_id fix (#31358)
**Description:** This change fixes the undefined model_id issue when
instantiating
[ChatHuggingFace](https://github.com/langchain-ai/langchain/blob/master/libs/partners/huggingface/langchain_huggingface/chat_models/huggingface.py#L306)
**Issue:** Fixes https://github.com/langchain-ai/langchain/issues/31357


@baskaryan @hwchase17
2025-05-29 15:59:35 -04:00
ccurme
c8951ca124
infra: drop azure from streaming benchmarks (#31421)
Covered by BaseChatOpenAI
2025-05-29 15:06:12 -04:00
Eugene Yurtsev
19f2a92609
core: release 0.3.63 (#31419)
Release core 0.3.63

Small update just to expand the list of well known tools. This is
necessary while the logic lives in langchain-core.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-05-29 14:48:18 -04:00
ccurme
afd349cc95
openai: cache httpx client (#31260)
![Screenshot 2025-05-16 at 3 49
54 PM](https://github.com/user-attachments/assets/4b377384-a769-4487-b801-bd1aa0ed66c1)

Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
2025-05-29 14:03:06 -04:00
Eugene Yurtsev
e6633a7efb
langchain-core: Add image_generation tool to list of known openai tools (#31396)
Add the image generation tool to the list of well-known tools. This is needed for changes in the ChatOpenAI client.

TODO: Some of this logic needs to be moved from core directly into the client as changes in core should not be required to add a new tool to the openai chat client.
2025-05-29 13:13:21 -04:00
Sydney Runkle
1917dd1ccd
benchmarks: always run (not conditional on changes) (#31409) 2025-05-29 11:45:57 -04:00
ccurme
49eeb0f3c3
standard-tests: add benchmarks (#31302)
Co-authored-by: Sydney Runkle <sydneymarierunkle@gmail.com>
2025-05-29 15:21:37 +00:00