Anthropic models (including via Bedrock and other cloud platforms)
accept a status/is_error attribute on tool messages/results
(specifically in `tool_result` content blocks for Anthropic API). Adding
a ToolMessage.status attribute so that users can set this attribute when
using those models
This PR proposes to create a rate limiter in the chat model directly,
and would replace: https://github.com/langchain-ai/langchain/pull/21992
It resolves most of the constraints that the Runnable rate limiter
introduced:
1. It's not annoying to apply the rate limiter to existing code; i.e.,
possible to roll out the change at the location where the model is
instantiated,
rather than at every location where the model is used! (Which is
necessary
if the model is used in different ways in a given application.)
2. batch rate limiting is enforced properly
3. the rate limiter works correctly with streaming
4. the rate limiter is aware of the cache
5. The rate limiter can take into account information about the inputs
into the
model (we can add optional inputs to it down-the road together with
outputs!)
The only downside is that information will not be properly reflected in
tracing
as we don't have any metadata evens about a rate limiter. So the total
time
spent on a model invocation will be:
* time spent waiting for the rate limiter
* time spend on the actual model request
## Example
```python
from langchain_core.rate_limiters import InMemoryRateLimiter
from langchain_groq import ChatGroq
groq = ChatGroq(rate_limiter=InMemoryRateLimiter(check_every_n_seconds=1))
groq.invoke('hello')
```
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
### Description
* support asynchronous in InMemoryVectorStore
* since embeddings might be possible to call asynchronously, ensure that
both asynchronous and synchronous functions operate correctly.
This PR introduces the following Runnables:
1. BaseRateLimiter: an abstraction for specifying a time based rate
limiter as a Runnable
2. InMemoryRateLimiter: Provides an in-memory implementation of a rate
limiter
## Example
```python
from langchain_core.runnables import InMemoryRateLimiter, RunnableLambda
from datetime import datetime
foo = InMemoryRateLimiter(requests_per_second=0.5)
def meow(x):
print(datetime.now().strftime("%H:%M:%S.%f"))
return x
chain = foo | meow
for _ in range(10):
print(chain.invoke('hello'))
```
Produces:
```
17:12:07.530151
hello
17:12:09.537932
hello
17:12:11.548375
hello
17:12:13.558383
hello
17:12:15.568348
hello
17:12:17.578171
hello
17:12:19.587508
hello
17:12:21.597877
hello
17:12:23.607707
hello
17:12:25.617978
hello
```

## Interface
The rate limiter uses the following interface for acquiring a token:
```python
class BaseRateLimiter(Runnable[Input, Output], abc.ABC):
@abc.abstractmethod
def acquire(self, *, blocking: bool = True) -> bool:
"""Attempt to acquire the necessary tokens for the rate limiter.```
```
The flag `blocking` has been added to the abstraction to allow
supporting streaming (which is easier if blocking=False).
## Limitations
- The rate limiter is not designed to work across different processes.
It is an in-memory rate limiter, but it is thread safe.
- The rate limiter only supports time-based rate limiting. It does not
take into account the size of the request or any other factors.
- The current implementation does not handle streaming inputs well and
will consume all inputs even if the rate limit has been reached. Better
support for streaming inputs will be added in the future.
- When the rate limiter is combined with another runnable via a
RunnableSequence, usage of .batch() or .abatch() will only respect the
average rate limit. There will be bursty behavior as .batch() and
.abatch() wait for each step to complete before starting the next step.
One way to mitigate this is to use batch_as_completed() or
abatch_as_completed().
## Bursty behavior in `batch` and `abatch`
When the rate limiter is combined with another runnable via a
RunnableSequence, usage of .batch() or .abatch() will only respect the
average rate limit. There will be bursty behavior as .batch() and
.abatch() wait for each step to complete before starting the next step.
This becomes a problem if users are using `batch` and `abatch` with many
inputs (e.g., 100). In this case, there will be a burst of 100 inputs
into the batch of the rate limited runnable.
1. Using a RunnableBinding
The API would look like:
```python
from langchain_core.runnables import InMemoryRateLimiter, RunnableLambda
rate_limiter = InMemoryRateLimiter(requests_per_second=0.5)
def meow(x):
return x
rate_limited_meow = RunnableLambda(meow).with_rate_limiter(rate_limiter)
```
2. Another option is to add some init option to RunnableSequence that
changes `.batch()` to be depth first (e.g., by delegating to
`batch_as_completed`)
```python
RunnableSequence(first=rate_limiter, last=model, how='batch-depth-first')
```
Pros: Does not require Runnable Binding
Cons: Feels over-complicated
- **Description:** Add a DocumentTransformer for executing one or more
`LinkExtractor`s and adding the extracted links to each document.
- **Issue:** n/a
- **Depedencies:** none
---------
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Feedback that `RunnableWithMessageHistory` is unwieldy compared to
ConversationChain and similar legacy abstractions is common.
Legacy chains using memory typically had no explicit notion of threads
or separate sessions. To use `RunnableWithMessageHistory`, users are
forced to introduce this concept into their code. This possibly felt
like unnecessary boilerplate.
Here we enable `RunnableWithMessageHistory` to run without a config if
the `get_session_history` callable has no arguments. This enables
minimal implementations like the following:
```python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
memory = InMemoryChatMessageHistory()
chain = RunnableWithMessageHistory(llm, lambda: memory)
chain.invoke("Hi I'm Bob") # Hello Bob!
chain.invoke("What is my name?") # Your name is Bob.
```
Before, if an exception was raised in the outer `try` block in
`Runnable._atransform_stream_with_config` before `iterator_` is
assigned, the corresponding `finally` block would blow up with an
`UnboundLocalError`:
```txt
UnboundLocalError: cannot access local variable 'iterator_' where it is not associated with a value
```
By assigning an initial value to `iterator_` before entering the `try`
block, this commit ensures that the `finally` can run, and not bury the
"true" exception under a "During handling of the above exception [...]"
traceback.
Thanks for your consideration!
This will allow tools and parsers to accept pydantic models from any of
the
following namespaces:
* pydantic.BaseModel with pydantic 1
* pydantic.BaseModel with pydantic 2
* pydantic.v1.BaseModel with pydantic 2
Description:
This PR fixes a KeyError: 400 that occurs in the JSON schema processing
within the reduce_openapi_spec function. The _retrieve_ref function in
json_schema.py was modified to handle missing components gracefully by
continuing to the next component if the current one is not found. This
ensures that the OpenAPI specification is fully interpreted and the
agent executes without errors.
Issue:
Fixes issue #24335
Dependencies:
No additional dependencies are required for this change.
Twitter handle:
@lunara_x
The functions `convert_to_messages` has had an expansion of the
arguments it can take:
1. Previously, it only could take a `Sequence` in order to iterate over
it. This has been broadened slightly to an `Iterable` (which should have
no other impact).
2. Support for `PromptValue` and `BaseChatPromptTemplate` has been
added. These are generated when combining messages using the overloaded
`+` operator.
Functions which rely on `convert_to_messages` (namely `filter_messages`,
`merge_message_runs` and `trim_messages`) have had the type of their
arguments similarly expanded.
Resolves#23706.
<!--
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
-->
---------
Signed-off-by: JP-Ellis <josh@jpellis.me>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** Spell check fixes for docs, comments, and a couple of
strings. No code change e.g. variable names.
**Issue:** none
**Dependencies:** none
**Twitter handle:** hmartin
Disabled by default.
```python
from langchain_core.tools import tool
@tool(parse_docstring=True)
def foo(bar: str, baz: int) -> str:
"""The foo.
Args:
bar: this is the bar
baz: this is the baz
"""
return bar
foo.args_schema.schema()
```
```json
{
"title": "fooSchema",
"description": "The foo.",
"type": "object",
"properties": {
"bar": {
"title": "Bar",
"description": "this is the bar",
"type": "string"
},
"baz": {
"title": "Baz",
"description": "this is the baz",
"type": "integer"
}
},
"required": [
"bar",
"baz"
]
}
```
Refactor the code to use the existing InMemroyVectorStore.
This change is needed for another PR that moves some of the imports
around (and messes up the mock.patch in this file)
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Decisions to discuss:
1. is a new attr needed or could additional_kwargs be used for this
2. is raw_output a good name for this attr
3. should raw_output default to {} or None
4. should raw_output be included in serialization
5. do we need to update repr/str to exclude raw_output
- add version of AIMessageChunk.__add__ that can add many chunks,
instead of only 2
- In agenerate_from_stream merge and parse chunks in bg thread
- In output parse base classes do more work in bg threads where
appropriate
---------
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
This PR moves the in memory implementation to langchain-core.
* The implementation remains importable from langchain-community.
* Supporting utilities are marked as private for now.
resolves https://github.com/langchain-ai/langchain/issues/23911
When an AIMessageChunk is instantiated, we attempt to parse tool calls
off of the tool_call_chunks.
Here we add a special-case to this parsing, where `""` will be parsed as
`{}`.
This is a reaction to how Anthropic streams tool calls in the case where
a function has no arguments:
```
{'id': 'toolu_01J8CgKcuUVrMqfTQWPYh64r', 'input': {}, 'name': 'magic_function', 'type': 'tool_use', 'index': 1}
{'partial_json': '', 'type': 'tool_use', 'index': 1}
```
The `partial_json` does not accumulate to a valid json string-- most
other providers tend to emit `"{}"` in this case.
This PR introduces a GraphStore component. GraphStore extends
VectorStore with the concept of links between documents based on
document metadata. This allows linking documents based on a variety of
techniques, including common keywords, explicit links in the content,
and other patterns.
This works with existing Documents, so it’s easy to extend existing
VectorStores to be used as GraphStores. The interface can be implemented
for any Vector Store technology that supports metadata, not only graph
DBs.
When retrieving documents for a given query, the first level of search
is done using classical similarity search. Next, links may be followed
using various traversal strategies to get additional documents. This
allows documents to be retrieved that aren’t directly similar to the
query but contain relevant information.
2 retrieving methods are added to the VectorStore ones :
* traversal_search which gets all linked documents up to a certain depth
* mmr_traversal_search which selects linked documents using an MMR
algorithm to have more diverse results.
If a depth of retrieval of 0 is used, GraphStore is effectively a
VectorStore. It enables an easy transition from a simple VectorStore to
GraphStore by adding links between documents as a second step.
An implementation for Apache Cassandra is also proposed.
See
https://github.com/datastax/ragstack-ai/blob/main/libs/knowledge-store/notebooks/astra_support.ipynb
for a notebook explaining how to use GraphStore and that shows that it
can answer correctly to questions that a simple VectorStore cannot.
**Twitter handle:** _cbornet
This PR rolls out part of the new proposed interface for vectorstores
(https://github.com/langchain-ai/langchain/pull/23544) to existing store
implementations.
The PR makes the following changes:
1. Adds standard upsert, streaming_upsert, aupsert, astreaming_upsert
methods to the vectorstore.
2. Updates `add_texts` and `aadd_texts` to be non required with a
default implementation that delegates to `upsert` and `aupsert` if those
have been implemented. The original `add_texts` and `aadd_texts` methods
are problematic as they spread object specific information across
document and **kwargs. (e.g., ids are not a part of the document)
3. Adds a default implementation to `add_documents` and `aadd_documents`
that delegates to `upsert` and `aupsert` respectively.
4. Adds standard unit tests to verify that a given vectorstore
implements a correct read/write API.
A downside of this implementation is that it creates `upsert` with a
very similar signature to `add_documents`.
The reason for introducing `upsert` is to:
* Remove any ambiguities about what information is allowed in `kwargs`.
Specifically kwargs should only be used for information common to all
indexed data. (e.g., indexing timeout).
*Allow inheriting from an anticipated generalized interface for indexing
that will allow indexing `BaseMedia` (i.e., allow making a vectorstore
for images/audio etc.)
`add_documents` can be deprecated in the future in favor of `upsert` to
make sure that users have a single correct way of indexing content.
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
**Description**: After reviewing the prompts API, it is clear that the
only way a user can explicitly mark an input variable as optional is
through the `MessagePlaceholder.optional` attribute. Otherwise, the user
must explicitly pass in the `input_variables` expected to be used in the
`BasePromptTemplate`, which will be validated upon execution. Therefore,
to semantically handle a `MessagePlaceholder` `variable_name` as
optional, we will treat the `variable_name` of `MessagePlaceholder` as a
`partial_variable` if it has been marked as optional. This approach
aligns with how the `variable_name` of `MessagePlaceholder` is already
handled
[here](https://github.com/keenborder786/langchain/blob/optional_input_variables/libs/core/langchain_core/prompts/chat.py#L991).
Additionally, an attribute `optional_variable` has been added to
`BasePromptTemplate`, and the `variable_name` of `MessagePlaceholder` is
also made part of `optional_variable` when marked as optional.
Moreover, the `get_input_schema` method has been updated for
`BasePromptTemplate` to differentiate between optional and non-optional
variables.
**Issue**: #22832, #21425
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
This PR should fix the following issue:
https://github.com/langchain-ai/langchain/issues/23824
Introduced as part of this PR:
https://github.com/langchain-ai/langchain/pull/23416
I am unable to reproduce the issue locally though it's clear that we're
getting a `serialized` object which is not a dictionary somehow.
The test below passes for me prior to the PR as well
```python
def test_cache_with_sqllite() -> None:
from langchain_community.cache import SQLiteCache
from langchain_core.globals import set_llm_cache
cache = SQLiteCache(database_path=".langchain.db")
set_llm_cache(cache)
chat_model = FakeListChatModel(responses=["hello", "goodbye"], cache=True)
assert chat_model.invoke("How are you?").content == "hello"
assert chat_model.invoke("How are you?").content == "hello"
```
- Description: Add support for `path` and `detail` keys in
`ImagePromptTemplate`. Previously, only variables associated with the
`url` key were considered. This PR allows for the inclusion of a local
image path and a detail parameter as input to the format method.
- Issues:
- fixes#20820
- related to #22024
- Dependencies: None
- Twitter handle: @DeschampsTho5
---------
Co-authored-by: tdeschamps <tdeschamps@kameleoon.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>