## Description
Currently when deserializing objects that contain non-deserializable
values, we throw an error. However, there are cases (e.g. proxies that
return response fields containing extra fields like Python datetimes),
where these values are not important and we just want to drop them.
Twitter handle: @hacubu
---------
Co-authored-by: Mason Daugherty <github@mdrxy.com>
TL;DR: you can't optimize imports with a lazy `__getattr__` if there is
a namespace conflict with a module name and an attribute name. We should
avoid introducing conflicts like this in the future.
This PR fixes a bug introduced by my lazy imports PR:
https://github.com/langchain-ai/langchain/pull/30769.
In `langchain_core`, we have utilities for loading and dumping data.
Unfortunately, one of those utilities is a `load` function, located in
`langchain_core/load/load.py`. To make this function more visible, we
make it accessible at the top level `langchain_core.load` module via
importing the function in `langchain_core/load/__init__.py`.
So, either of these imports should work:
```py
from langchain_core.load import load
from langchain_core.load.load import load
```
As you can tell, this is already a bit confusing. You'd think that the
first import would produce the module `load`, but because of the
`__init__.py` shortcut, both produce the function `load`.
<details> More on why the lazy imports PR broke this support...
All was well, except when the absolute import was run first, see the
last snippet:
```
>>> from langchain_core.load import load
>>> load
<function load at 0x101c320c0>
```
```
>>> from langchain_core.load.load import load
>>> load
<function load at 0x1069360c0>
```
```
>>> from langchain_core.load import load
>>> load
<function load at 0x10692e0c0>
>>> from langchain_core.load.load import load
>>> load
<function load at 0x10692e0c0>
```
```
>>> from langchain_core.load.load import load
>>> load
<function load at 0x101e2e0c0>
>>> from langchain_core.load import load
>>> load
<module 'langchain_core.load.load' from '/Users/sydney_runkle/oss/langchain/libs/core/langchain_core/load/load.py'>
```
In this case, the function `load` wasn't stored in the globals cache for
the `langchain_core.load` module (by the lazy import logic), so Python
defers to a module import.
</details>
New `langchain` tongue twister 😜: we've created a problem for ourselves
because you have to load the load function from the load file in the
load module 😨.
Most easily reviewed with the "hide whitespace" option toggled.
Seeing 10-50% speed ups in import time for common structures 🚀
The general purpose of this PR is to lazily import structures within
`langchain_core.XXX_module.__init__.py` so that we're not eagerly
importing expensive dependencies (`pydantic`, `requests`, etc).
Analysis of flamegraphs generated with `importtime` motivated these
changes. For example, the one below demonstrates that importing
`HumanMessage` accidentally triggered imports for `importlib.metadata`,
`requests`, etc.
There's still much more to do on this front, and we can start digging
into our own internal code for optimizations now that we're less
concerned about external imports.
<img width="1210" alt="Screenshot 2025-04-11 at 1 10 54 PM"
src="https://github.com/user-attachments/assets/112a3fe7-24a9-4294-92c1-d5ae64df839e"
/>
I've tracked the improvements with some local benchmarks:
## `pytest-benchmark` results
| Name | Before (s) | After (s) | Delta (s) | % Change |
|-----------------------------|------------|-----------|-----------|----------|
| Document | 2.8683 | 1.2775 | -1.5908 | -55.46% |
| HumanMessage | 2.2358 | 1.1673 | -1.0685 | -47.79% |
| ChatPromptTemplate | 5.5235 | 2.9709 | -2.5526 | -46.22% |
| Runnable | 2.9423 | 1.7793 | -1.163 | -39.53% |
| InMemoryVectorStore | 3.1180 | 1.8417 | -1.2763 | -40.93% |
| RunnableLambda | 2.7385 | 1.8745 | -0.864 | -31.55% |
| tool | 5.1231 | 4.0771 | -1.046 | -20.42% |
| CallbackManager | 4.2263 | 3.4099 | -0.8164 | -19.32% |
| LangChainTracer | 3.8394 | 3.3101 | -0.5293 | -13.79% |
| BaseChatModel | 4.3317 | 3.8806 | -0.4511 | -10.41% |
| PydanticOutputParser | 3.2036 | 3.2995 | 0.0959 | 2.99% |
| InMemoryRateLimiter | 0.5311 | 0.5995 | 0.0684 | 12.88% |
Note the lack of change for `InMemoryRateLimiter` and
`PydanticOutputParser` is just random noise, I'm getting comparable
numbers locally.
## Local CodSpeed results
We're still working on configuring CodSpeed on CI. The local usage
produced similar results.
Perplexity's importance in the space has been growing, so we think it's
time to add an official integration!
Note: following the release of `langchain-perplexity` to `pypi`, we
should be able to add `perplexity` as an extra in
`libs/langchain/pyproject.toml`, but we're blocked by a circular import
for now.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Release notes: https://pydantic.dev/articles/pydantic-v2-11-release
Covered here:
- We no longer access `model_fields` on class instances (that is now
deprecated);
- Update schema normalization for Pydantic version testing to reflect
changes to generated JSON schema (addition of `"additionalProperties":
True` for dict types with value Any or object).
## Considerations:
### Changes to JSON schema generation
#### Tool-calling / structured outputs
This may impact tool-calling + structured outputs for some providers,
but schema generation only changes if you have parameters of the form
`dict`, `dict[str, Any]`, `dict[str, object]`, etc. If dict parameters
are typed my understanding is there are no changes.
For OpenAI for example, untyped dicts work for structured outputs with
default settings before and after updating Pydantic, and error both
before/after if `strict=True`.
### Use of `model_fields`
There is one spot where we previously accessed `super(cls,
self).model_fields`, where `cls` is an object in the MRO. This was done
for the purpose of tracking aliases in secrets. I've updated this to
always be `type(self).model_fields`-- see comment in-line for detail.
---------
Co-authored-by: Sydney Runkle <54324534+sydney-runkle@users.noreply.github.com>
Resolves https://github.com/langchain-ai/langchain/issues/29003,
https://github.com/langchain-ai/langchain/issues/27264
Related: https://github.com/langchain-ai/langchain-redis/issues/52
```python
from langchain.chat_models import init_chat_model
from langchain.globals import set_llm_cache
from langchain_community.cache import SQLiteCache
from pydantic import BaseModel
cache = SQLiteCache()
set_llm_cache(cache)
class Temperature(BaseModel):
value: int
city: str
llm = init_chat_model("openai:gpt-4o-mini")
structured_llm = llm.with_structured_output(Temperature)
```
```python
# 681 ms
response = structured_llm.invoke("What is the average temperature of Rome in May?")
```
```python
# 6.98 ms
response = structured_llm.invoke("What is the average temperature of Rome in May?")
```
Ruff doesn't know about the python version in
`[tool.poetry.dependencies]`. It can get it from
`project.requires-python`.
Notes:
* poetry seems to have issues getting the python constraints from
`requires-python` and using `python` in per dependency constraints. So I
had to duplicate the info. I will open an issue on poetry.
* `inspect.isclass()` doesn't work correctly with `GenericAlias`
(`list[...]`, `dict[..., ...]`) on Python <3.11 so I added some `not
isinstance(type, GenericAlias)` checks:
Python 3.11
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
False
```
Python 3.9
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
True
```
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Support using additional import mapping. This allows users to override
old mappings/add new imports to the loads function.
- [x ] **Add tests and docs**: If you're adding a new integration,
please include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
This change adds a new message type `RemoveMessage`. This will enable
`langgraph` users to manually modify graph state (or have the graph
nodes modify the state) to remove messages by `id`
Examples:
* allow users to delete messages from state by calling
```python
graph.update_state(config, values=[RemoveMessage(id=state.values[-1].id)])
```
* allow nodes to delete messages
```python
graph.add_node("delete_messages", lambda state: [RemoveMessage(id=state[-1].id)])
```
The return type of `json.loads` is `Any`.
In fact, the return type of `dumpd` must be based on `json.loads`, so
the correction here is understandable.
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
```python
from langchain.agents import AgentExecutor, create_tool_calling_agent, tool
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages(
[
("system", "You are a helpful assistant"),
MessagesPlaceholder("chat_history", optional=True),
("human", "{input}"),
MessagesPlaceholder("agent_scratchpad"),
]
)
model = ChatAnthropic(model="claude-3-opus-20240229")
@tool
def magic_function(input: int) -> int:
"""Applies a magic function to an input."""
return input + 2
tools = [magic_function]
agent = create_tool_calling_agent(model, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "what is the value of magic_function(3)?"})
```
```
> Entering new AgentExecutor chain...
Invoking: `magic_function` with `{'input': 3}`
responded: [{'text': '<thinking>\nThe user has asked for the value of magic_function applied to the input 3. Looking at the available tools, magic_function is the relevant one to use here, as it takes an integer input and returns an integer output.\n\nThe magic_function has one required parameter:\n- input (integer)\n\nThe user has directly provided the value 3 for the input parameter. Since the required parameter is present, we can proceed with calling the function.\n</thinking>', 'type': 'text'}, {'id': 'toolu_01HsTheJPA5mcipuFDBbJ1CW', 'input': {'input': 3}, 'name': 'magic_function', 'type': 'tool_use'}]
5
Therefore, the value of magic_function(3) is 5.
> Finished chain.
{'input': 'what is the value of magic_function(3)?',
'output': 'Therefore, the value of magic_function(3) is 5.'}
```
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- This ensures ids are stable across streamed chunks
- Multiple messages in batch call get separate ids
- Also fix ids being dropped when combining message chunks
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.