`Runnable`'s `Input` is contravariant so we need to enumerate all
possible inputs and it's not possible to put them in a `Union`.
Also, it's better to only require a runnable that
accepts`list[BaseMessage]` instead of a broader `Sequence[BaseMessage]`
as internally the runnable is only called with a list.
**Description:**
Previously, when transitioning from a deeper Markdown header (e.g., ###)
to a shallower one (e.g., ##), the
ExperimentalMarkdownSyntaxTextSplitter retained the deeper header in the
metadata.
This commit updates the `_resolve_header_stack` method to remove headers
at the same or deeper levels before appending the current header. As a
result, each chunk now reflects only the active header context.
Fixes unexpected metadata leakage across sections in nested Markdown
documents.
Additionally, test cases have been updated to:
- Validate correct header resolution and metadata assignment.
- Cover edge cases with nested headers and horizontal rules.
**Issue:**
Fixes [#31596](https://github.com/langchain-ai/langchain/issues/31596)
**Dependencies:**
None
**Twitter handle:** -> [_RaghuKapur](https://twitter.com/_RaghuKapur)
**LinkedIn:** ->
[https://www.linkedin.com/in/raghukapur/](https://www.linkedin.com/in/raghukapur/)
## Description
<!-- What does this pull request accomplish? -->
- When parsing MistralAI chunk dicts to Langchain to `AIMessageChunk`
schemas via the `_convert_chunk_to_message_chunk` utility function, the
`finish_reason` was not being included in `response_metadata` as it is
for other providers.
- This PR adds a one-liner fix to include the finish reason.
- fixes: https://github.com/langchain-ai/langchain/issues/31666
* Simplified Pydantic handling since Pydantic v1 is not supported
anymore.
* Replace use of deprecated v1 methods by corresponding v2 methods.
* Remove use of other deprecated methods.
* Activate mypy errors on deprecated methods use.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
**Description:**
This PR fixes an `IndexError` that occurs when `LLMListwiseRerank` is
called with an empty list of documents.
Earlier, the code assumed the presence of at least one document and
attempted to construct the context string based on `len(documents) - 1`,
which raises an error when documents is an empty list.
The fix works with gpt-4o-mini if I make the list empty, but fails
occasionally with gpt-3.5-turbo. In case of empty list, setting the
string to "empty list" seems to have the expected response.
**Issue:** #31192
Description:
This pull request corrects minor spelling mistakes in the comments
within the `chat_models.py` file of the MistralAI partner integration.
Specifically, it fixes the spelling of "equivalent" and "compatibility"
in two separate comments. These changes improve code readability and
maintain professional documentation standards. No functional code
changes are included.
`uv lock --upgrade-package langsmith
`
Original issue: The lock file (uv.lock) was constraining
langsmith>=0.1.125,<0.4, preventing LangSmith 0.4.1 installation. Even
though the pyproject.toml wasn't restricting langchain core.
Issue:
https://langchain.slack.com/archives/C050X0VTN56/p1750107176007629
### Description
Add keep_separator arg to HTMLSemanticPreservingSplitter and pass value
to instance of RecursiveCharacterTextSplitter used under the hood.
### Issue
Documents returned by `HTMLSemanticPreservingSplitter.split_text(text)`
are defaulted to use separators at beginning of page_content. [See third
and fourth document in example output from how-to
guide](https://python.langchain.com/docs/how_to/split_html/#using-htmlsemanticpreservingsplitter):
```
[Document(metadata={'Header 1': 'Main Title'}, page_content='This is an introductory paragraph with some basic content.'),
Document(metadata={'Header 2': 'Section 1: Introduction'}, page_content='This section introduces the topic'),
Document(metadata={'Header 2': 'Section 1: Introduction'}, page_content='. Below is a list: First item Second item Third item with bold text and a link Subsection 1.1: Details This subsection provides additional details'),
Document(metadata={'Header 2': 'Section 1: Introduction'}, page_content=". Here's a table: Header 1 Header 2 Header 3 Row 1, Cell 1 Row 1, Cell 2 Row 1, Cell 3 Row 2, Cell 1 Row 2, Cell 2 Row 2, Cell 3"),
Document(metadata={'Header 2': 'Section 2: Media Content'}, page_content='This section contains an image and a video:  '),
Document(metadata={'Header 2': 'Section 3: Code Example'}, page_content='This section contains a code block: <code:html> <div> <p>This is a paragraph inside a div.</p> </div> </code>'),
Document(metadata={'Header 2': 'Conclusion'}, page_content='This is the conclusion of the document.')]
```
### Dependencies
None
@ttrumper3
Thank you for contributing to LangChain!
[x] PR title: langchain_ollama: support custom headers for Ollama
partner APIs
Where "package" is whichever of langchain, core, etc. is being modified.
Use "docs: ..." for purely docs changes, "infra: ..." for CI changes.
Example: "core: add foobar LLM"
[x] PR message:
**Description: This PR adds support for passing custom HTTP headers to
Ollama models when used as a LangChain integration. This is especially
useful for enterprise users or partners who need to send authentication
tokens, API keys, or custom tracking headers when querying secured
Ollama servers.
Issue: N/A (new enhancement)
**Dependencies: No external dependencies introduced.
Twitter handle: @arunkumar_offl
[x] Add tests and docs: If you're adding a new integration, please
include
1.Added a unit test in test_chat_models.py to validate headers are
passed correctly.
2. Added an example notebook:
docs/docs/integrations/llms/ollama_custom_headers.ipynb showing how to
use custom headers.
[x] Lint and test: Ran make format, make lint, and make test to ensure
the code is clean and passing all checks.
Additional guidelines:
Make sure optional dependencies are imported within a function.
Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
Most PRs should not touch more than one package.
Changes should be backwards compatible.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
This MR is only for the docs. Added integration with Nebius AI Studio to
docs. The integration package is available at
[https://github.com/nebius/langchain-nebius](https://github.com/nebius/langchain-nebius).
---------
Co-authored-by: Akim Tsvigun <aktsvigun@nebius.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Remove proxy imports to langchain_experimental.
Previously, these imports would work if a user manually installed
langchain_experimental. However, we want to drop support even for that
as langchain_experimental is generally not recommended to be run in
production.
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
OpenAI changed their API to require the `partial_images` parameter when
using image generation + streaming.
As described in https://github.com/langchain-ai/langchain/pull/31424, we
are ignoring partial images. Here, we accept the `partial_images`
parameter (as required by OpenAI), but emit a warning and continue to
ignore partial images.
**Description:**
`langchain_huggingface` has a very large installation size of around 600
MB (on a Mac with Python 3.11). This is due to its dependency on
`sentence-transformers`, which in turn depends on `torch`, which is 320
MB all by itself. Similarly, the depedency on `transformers` adds
another set of heavy dependencies. With those dependencies removed, the
installation of `langchain_huggingface` only takes up ~26 MB. This is
only 5 % of the full installation!
These libraries are not necessary to use `langchain_huggingface`'s API
wrapper classes, only for local inferences/embeddings. All import
statements for those two libraries already have import guards in place
(try/catch with a helpful "please install x" message).
This PR therefore moves those two libraries to an optional dependency
group `full`. So a `pip install langchain_huggingface` will only install
the lightweight version, and a `pip install
"langchain_huggingface[full]"` will install all dependencies.
I know this may break existing code, because `sentence-transformers` and
`transformers` are now no longer installed by default. Given that users
will see helpful error messages when that happens, and the major impact
of this small change, I hope that you will still consider this PR.
**Dependencies:** No new dependencies, but new optional grouping.
- **Description:**
- In _infer_arg_descriptions, the annotations dictionary contains string
representations of types instead of actual typing objects. This causes
_is_annotated_type to fail, preventing the correct description from
being generated.
- This is a simple fix using the get_type_hints method, which resolves
the annotations properly and is supported across all Python versions.
- **Issue:** #31051
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
https://github.com/langchain-ai/langchain/pull/31286 included an update
to the return type for `BaseChatModel.(a)stream`, from
`Iterator[BaseMessageChunk]` to `Iterator[BaseMessage]`.
This change is correct, because when streaming is disabled, the stream
methods return an iterator of `BaseMessage`, and the inheritance is such
that an `BaseMessage` is not a `BaseMessageChunk` (but the reverse is
true).
However, LangChain includes a pattern throughout its docs of [summing
BaseMessageChunks](https://python.langchain.com/docs/how_to/streaming/#llms-and-chat-models)
to accumulate a chat model stream. This pattern is implemented in tests
for most integration packages and appears in application code. So
https://github.com/langchain-ai/langchain/pull/31286 introduces mypy
errors throughout the ecosystem (or maybe more accurately, it reveals
that this pattern does not account for use of the `.stream` method when
streaming is disabled).
Here we revert just the change to the stream return type to unblock
things. A fix for this should address docs + integration packages (or if
we elect to just force people to update code, be explicit about that).
Does not support partial images during generation at the moment. Before
doing that I'd like to figure out how to specify the aggregation logic
without requiring changes in core.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Llama-3.1 started failing consistently with
> groq.BadRequestError: Error code: 400 - ***'error': ***'message':
"Failed to call a function. Please adjust your prompt. See
'failed_generation' for more details.", 'type': 'invalid_request_error',
'code': 'tool_use_failed', 'failed_generation':
'<function=brave_search>***"query": "Hello!"***</function>'***
Release core 0.3.63
Small update just to expand the list of well known tools. This is
necessary while the logic lives in langchain-core.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Add image generation tool to the list of well known tools. This is needed for changes in the ChatOpenAI client.
TODO: Some of this logic needs to be moved from core directly into the client as changes in core should not be required to add a new tool to the openai chat client.
Added support for new Exa API features. Updated Exa docs and python
package (langchain-exa).
Description
Added support for new Exa API features in the langchain-exa package:
- Added max_characters option for text content
- Added support for summary and custom summary prompts
- Added livecrawl option with "always", "fallback", "never" settings
- Added "auto" option for search type
- Updated documentation and tests
Dependencies
- No new dependencies required. Using existing features from exa-py.
twitter: @theishangoswami
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
**Description:** ConversationChain has been deprecated, and the
documentation says to use RunnableWithMessageHistory in its place, but
the link at the top of the page to RunnableWithMessageHistory is broken
(it's rendering as "html()"). See here at the top of the page:
https://python.langchain.com/api_reference/langchain/chains/langchain.chains.conversation.base.ConversationChain.html.
This PR fixes the link.
**Issue**: N/A
**Dependencies**: N/A
**Twitter handle:**: If you're on Bluesky, I'm @vikramsaraph.com
Scheduled testing started failing today because the Responses API
stopped raising `BadRequestError` for a schema that was previously
invalid when `strict=True`.
Although docs still say that [some type-specific keywords are not yet
supported](https://platform.openai.com/docs/guides/structured-outputs#some-type-specific-keywords-are-not-yet-supported)
(including `minimum` and `maximum` for numbers), the below appears to
run and correctly respect the constraints:
```python
import json
import openai
maximums = list(range(1, 11))
arg_values = []
for maximum in maximums:
tool = {
"type": "function",
"name": "magic_function",
"description": "Applies a magic function to an input.",
"parameters": {
"properties": {
"input": {"maximum": maximum, "minimum": 0, "type": "integer"}
},
"required": ["input"],
"type": "object",
"additionalProperties": False
},
"strict": True
}
client = openai.OpenAI()
response = client.responses.create(
model="gpt-4.1",
input=[{"role": "user", "content": "What is the value of magic_function(3)? Use the tool."}],
tools=[tool],
)
function_call = next(item for item in response.output if item.type == "function_call")
args = json.loads(function_call.arguments)
arg_values.append(args["input"])
print(maximums)
print(arg_values)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# [1, 2, 3, 3, 3, 3, 3, 3, 3, 3]
```
Until yesterday this raised BadRequestError.
The same is not true of Chat Completions, which appears to still raise
BadRequestError
```python
tool = {
"type": "function",
"function": {
"name": "magic_function",
"description": "Applies a magic function to an input.",
"parameters": {
"properties": {
"input": {"maximum": 5, "minimum": 0, "type": "integer"}
},
"required": ["input"],
"type": "object",
"additionalProperties": False
},
"strict": True
}
}
response = client.chat.completions.create(
model="gpt-4.1",
messages=[{"role": "user", "content": "What is the value of magic_function(3)? Use the tool."}],
tools=[tool],
)
response # raises BadRequestError
```
Here we update tests accordingly.