**Description:** Spell check fixes for docs, comments, and a couple of
strings. No code change e.g. variable names.
**Issue:** none
**Dependencies:** none
**Twitter handle:** hmartin
Disabled by default.
```python
from langchain_core.tools import tool
@tool(parse_docstring=True)
def foo(bar: str, baz: int) -> str:
"""The foo.
Args:
bar: this is the bar
baz: this is the baz
"""
return bar
foo.args_schema.schema()
```
```json
{
"title": "fooSchema",
"description": "The foo.",
"type": "object",
"properties": {
"bar": {
"title": "Bar",
"description": "this is the bar",
"type": "string"
},
"baz": {
"title": "Baz",
"description": "this is the baz",
"type": "integer"
}
},
"required": [
"bar",
"baz"
]
}
```
Refactor the code to use the existing InMemroyVectorStore.
This change is needed for another PR that moves some of the imports
around (and messes up the mock.patch in this file)
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Decisions to discuss:
1. is a new attr needed or could additional_kwargs be used for this
2. is raw_output a good name for this attr
3. should raw_output default to {} or None
4. should raw_output be included in serialization
5. do we need to update repr/str to exclude raw_output
This PR moves the in memory implementation to langchain-core.
* The implementation remains importable from langchain-community.
* Supporting utilities are marked as private for now.
resolves https://github.com/langchain-ai/langchain/issues/23911
When an AIMessageChunk is instantiated, we attempt to parse tool calls
off of the tool_call_chunks.
Here we add a special-case to this parsing, where `""` will be parsed as
`{}`.
This is a reaction to how Anthropic streams tool calls in the case where
a function has no arguments:
```
{'id': 'toolu_01J8CgKcuUVrMqfTQWPYh64r', 'input': {}, 'name': 'magic_function', 'type': 'tool_use', 'index': 1}
{'partial_json': '', 'type': 'tool_use', 'index': 1}
```
The `partial_json` does not accumulate to a valid json string-- most
other providers tend to emit `"{}"` in this case.
This PR introduces a GraphStore component. GraphStore extends
VectorStore with the concept of links between documents based on
document metadata. This allows linking documents based on a variety of
techniques, including common keywords, explicit links in the content,
and other patterns.
This works with existing Documents, so it’s easy to extend existing
VectorStores to be used as GraphStores. The interface can be implemented
for any Vector Store technology that supports metadata, not only graph
DBs.
When retrieving documents for a given query, the first level of search
is done using classical similarity search. Next, links may be followed
using various traversal strategies to get additional documents. This
allows documents to be retrieved that aren’t directly similar to the
query but contain relevant information.
2 retrieving methods are added to the VectorStore ones :
* traversal_search which gets all linked documents up to a certain depth
* mmr_traversal_search which selects linked documents using an MMR
algorithm to have more diverse results.
If a depth of retrieval of 0 is used, GraphStore is effectively a
VectorStore. It enables an easy transition from a simple VectorStore to
GraphStore by adding links between documents as a second step.
An implementation for Apache Cassandra is also proposed.
See
https://github.com/datastax/ragstack-ai/blob/main/libs/knowledge-store/notebooks/astra_support.ipynb
for a notebook explaining how to use GraphStore and that shows that it
can answer correctly to questions that a simple VectorStore cannot.
**Twitter handle:** _cbornet
This PR rolls out part of the new proposed interface for vectorstores
(https://github.com/langchain-ai/langchain/pull/23544) to existing store
implementations.
The PR makes the following changes:
1. Adds standard upsert, streaming_upsert, aupsert, astreaming_upsert
methods to the vectorstore.
2. Updates `add_texts` and `aadd_texts` to be non required with a
default implementation that delegates to `upsert` and `aupsert` if those
have been implemented. The original `add_texts` and `aadd_texts` methods
are problematic as they spread object specific information across
document and **kwargs. (e.g., ids are not a part of the document)
3. Adds a default implementation to `add_documents` and `aadd_documents`
that delegates to `upsert` and `aupsert` respectively.
4. Adds standard unit tests to verify that a given vectorstore
implements a correct read/write API.
A downside of this implementation is that it creates `upsert` with a
very similar signature to `add_documents`.
The reason for introducing `upsert` is to:
* Remove any ambiguities about what information is allowed in `kwargs`.
Specifically kwargs should only be used for information common to all
indexed data. (e.g., indexing timeout).
*Allow inheriting from an anticipated generalized interface for indexing
that will allow indexing `BaseMedia` (i.e., allow making a vectorstore
for images/audio etc.)
`add_documents` can be deprecated in the future in favor of `upsert` to
make sure that users have a single correct way of indexing content.
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
**Description**: After reviewing the prompts API, it is clear that the
only way a user can explicitly mark an input variable as optional is
through the `MessagePlaceholder.optional` attribute. Otherwise, the user
must explicitly pass in the `input_variables` expected to be used in the
`BasePromptTemplate`, which will be validated upon execution. Therefore,
to semantically handle a `MessagePlaceholder` `variable_name` as
optional, we will treat the `variable_name` of `MessagePlaceholder` as a
`partial_variable` if it has been marked as optional. This approach
aligns with how the `variable_name` of `MessagePlaceholder` is already
handled
[here](https://github.com/keenborder786/langchain/blob/optional_input_variables/libs/core/langchain_core/prompts/chat.py#L991).
Additionally, an attribute `optional_variable` has been added to
`BasePromptTemplate`, and the `variable_name` of `MessagePlaceholder` is
also made part of `optional_variable` when marked as optional.
Moreover, the `get_input_schema` method has been updated for
`BasePromptTemplate` to differentiate between optional and non-optional
variables.
**Issue**: #22832, #21425
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
- Description: Add support for `path` and `detail` keys in
`ImagePromptTemplate`. Previously, only variables associated with the
`url` key were considered. This PR allows for the inclusion of a local
image path and a detail parameter as input to the format method.
- Issues:
- fixes#20820
- related to #22024
- Dependencies: None
- Twitter handle: @DeschampsTho5
---------
Co-authored-by: tdeschamps <tdeschamps@kameleoon.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Use pydantic to infer nested schemas and all that fun.
Include bagatur's convenient docstring parser
Include annotation support
Previously we didn't adequately support many typehints in the
bind_tools() method on raw functions (like optionals/unions, nested
types, etc.)
The prompt template variable detection only worked for singly-nested
sections because we just kept track of whether we were in a section and
then set that to false as soon as we encountered an end block. i.e. the
following:
```
{{#outerSection}}
{{variableThatShouldntShowUp}}
{{#nestedSection}}
{{nestedVal}}
{{/nestedSection}}
{{anotherVariableThatShouldntShowUp}}
{{/outerSection}}
```
Would yield `['outerSection', 'anotherVariableThatShouldntShowUp']` as
input_variables (whereas it should just yield `['outerSection']`). This
fixes that by keeping track of the current depth and using a stack.
This PR implements a BaseContent object from which Document and Blob
objects will inherit proposed here:
https://github.com/langchain-ai/langchain/pull/23544
Alternative: Create a base object that only has an identifier and no
metadata.
For now decided against it, since that refactor can be done at a later
time. It also feels a bit odd since our IDs are optional at the moment.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
This PR introduces a maxsize parameter for the InMemoryCache class,
allowing users to specify the maximum number of items to store in the
cache. If the cache exceeds the specified maximum size, the oldest items
are removed. Additionally, comprehensive unit tests have been added to
ensure all functionalities are thoroughly tested. The tests are written
using pytest and cover both synchronous and asynchronous methods.
Twitter: @spyrosavl
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Fix LLM string representation for serializable objects.
Fix for issue: https://github.com/langchain-ai/langchain/issues/23257
The llm string of serializable chat models is the serialized
representation of the object. LangChain serialization dumps some basic
information about non serializable objects including their repr() which
includes an object id.
This means that if a chat model has any non serializable fields (e.g., a
cache), then any new instantiation of the those fields will change the
llm representation of the chat model and cause chat misses.
i.e., re-instantiating a postgres cache would result in cache misses!
This change adds a new message type `RemoveMessage`. This will enable
`langgraph` users to manually modify graph state (or have the graph
nodes modify the state) to remove messages by `id`
Examples:
* allow users to delete messages from state by calling
```python
graph.update_state(config, values=[RemoveMessage(id=state.values[-1].id)])
```
* allow nodes to delete messages
```python
graph.add_node("delete_messages", lambda state: [RemoveMessage(id=state[-1].id)])
```
This PR adds an optional ID field to the document schema.
# 1. Optional or Required
- An optional field will will requrie additional checking for the type
in user code (annoying).
- However, vectorstores currently don't respect this field. So if we
make it
required and start returning random UUIDs that might be even more
confusing
to users.
**Proposal**: Start with Optional and convert to Required (with default
set to uuid4()) in 1-2 major releases.
# 2. Override __str__ or generic solution in prompts
Overriding __str__ as a simple way to avoid changing user code that
relies on
default str(document) in prompts.
I considered rolling out a more general solution in prompts
(https://github.com/langchain-ai/langchain/pull/8685),
but to do that we need to:
1. Make things serializable
2. The more general solution would likely need to be backwards
compatible as well
3. It's unclear that one wants to format a List[int] in the same way as
List[Document]. The former should be `,` seperated (likely), the latter
should be `---` separated (likely).
**Proposal** Start with __str__ override and focus on the vectorstore
APIs, we generalize prompts later
These currently read off AIMessage.tool_calls, and only fall back to
OpenAI parsing if tool calls aren't populated.
Importing these from `openai_tools` (e.g., in our [tool calling
docs](https://python.langchain.com/v0.2/docs/how_to/tool_calling/#tool-calls))
can lead to confusion.
After landing, would need to release core and update docs.
This fixes processing issue for nodes with numbers in their labels (e.g.
`"node_1"`, which would previously be relabeled as `"node__"`, and now
are correctly processed as `"node_1"`)
- **Description:** When use
RunnableWithMessageHistory/SQLChatMessageHistory in async mode, we'll
get the following error:
```
Error in RootListenersTracer.on_chain_end callback: RuntimeError("There is no current event loop in thread 'asyncio_3'.")
```
which throwed by
ddfbca38df/libs/community/langchain_community/chat_message_histories/sql.py (L259).
and no message history will be add to database.
In this patch, a new _aexit_history function which will'be called in
async mode is added, and in turn aadd_messages will be called.
In this patch, we use `afunc` attribute of a Runnable to check if the
end listener should be run in async mode or not.
- **Issue:** #22021, #22022
- **Dependencies:** N/A
- **Description:** Add optional max_messages to MessagePlaceholder
- **Issue:**
[16096](https://github.com/langchain-ai/langchain/issues/16096)
- **Dependencies:** None
- **Twitter handle:** @davedecaprio
Sometimes it's better to limit the history in the prompt itself rather
than the memory. This is needed if you want different prompts in the
chain to have different history lengths.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- Moved doc-strings below attribtues in TypedDicts -- seems to render
better on APIReference pages.
* Provided more description and some simple code examples
**Description:** `astream_events(version="v2")` didn't propagate
exceptions in `langchain-core<=0.2.6`, fixed in the #22916. This PR adds
a unit test to check that exceptions are propagated upwards.
Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
This raises ImportError due to a circular import:
```python
from langchain_core import chat_history
```
This does not:
```python
from langchain_core import runnables
from langchain_core import chat_history
```
Here we update `test_imports` to run each import in a separate
subprocess. Open to other ways of doing this!
- StopIteration can't be set on an asyncio.Future it raises a TypeError
and leaves the Future pending forever so we need to convert it to a
RuntimeError
This PR moves the validation of the decorator to a better place to avoid
creating bugs while deprecating code.
Prevent issues like this from arising:
https://github.com/langchain-ai/langchain/issues/22510
we should replace with a linter at some point that just does static
analysis
- [x] **Adding AsyncRootListener**: "langchain_core: Adding
AsyncRootListener"
- **Description:** Adding an AsyncBaseTracer, AsyncRootListener and
`with_alistener` function. This is to enable binding async root listener
to runnables. This currently only supported for sync listeners.
- **Issue:** None
- **Dependencies:** None
- [x] **Add tests and docs**: Added units tests and example snippet code
within the function description of `with_alistener`
- [x] **Lint and test**: Run make format_diff, make lint_diff and make
test
This PR adds deduplication of callback handlers in merge_configs.
Fix for this issue:
https://github.com/langchain-ai/langchain/issues/22227
The issue appears when the code is:
1) running python >=3.11
2) invokes a runnable from within a runnable
3) binds the callbacks to the child runnable from the parent runnable
using with_config
In this case, the same callbacks end up appearing twice: (1) the first
time from with_config, (2) the second time with langchain automatically
propagating them on behalf of the user.
Prior to this PR this will emit duplicate events:
```python
@tool
async def get_items(question: str, callbacks: Callbacks): # <--- Accept callbacks
"""Ask question"""
template = ChatPromptTemplate.from_messages(
[
(
"human",
"'{question}"
)
]
)
chain = template | chat_model.with_config(
{
"callbacks": callbacks, # <-- Propagate callbacks
}
)
return await chain.ainvoke({"question": question})
```
Prior to this PR this will work work correctly (no duplicate events):
```python
@tool
async def get_items(question: str, callbacks: Callbacks): # <--- Accept callbacks
"""Ask question"""
template = ChatPromptTemplate.from_messages(
[
(
"human",
"'{question}"
)
]
)
chain = template | chat_model
return await chain.ainvoke({"question": question}, {"callbacks": callbacks})
```
This will also work (as long as the user is using python >= 3.11) -- as
langchain will automatically propagate callbacks
```python
@tool
async def get_items(question: str,):
"""Ask question"""
template = ChatPromptTemplate.from_messages(
[
(
"human",
"'{question}"
)
]
)
chain = template | chat_model
return await chain.ainvoke({"question": question})
```
Anthropic's streaming treats tool calls as different content parts
(streamed back with a different index) from normal content in the
`content`.
This means that we need to update our chunk-merging logic to handle
chunks with multi-part content. The alternative is coerceing Anthropic's
responses into a string, but we generally like to preserve model
provider responses faithfully when we can. This will also likely be
useful for multimodal outputs in the future.
This current PR does unfortunately make `index` a magic field within
content parts, but Anthropic and OpenAI both use it at the moment to
determine order anyway. To avoid cases where we have content arrays with
holes and to simplify the logic, I've also restricted merging to chunks
in order.
TODO: tests
CC @baskaryan @ccurme @efriis
- This is a pattern that shows up occasionally in langgraph questions,
people chain a graph to something else after, and want to pass the graph
some kwargs (eg. stream_mode)
LangSmith and LangChain context var handling evolved in parallel since
originally we didn't expect people to want to interweave the decorator
and langchain code.
Once we get a new langsmith release, this PR will let you seemlessly
hand off between @traceable context and runnable config context so you
can arbitrarily nest code.
It's expected that this fails right now until we get another release of
the SDK
# Description
## Problem
`Runnable.get_graph` fails when `InputType` or `OutputType` property
raises `TypeError`.
-
003c98e5b4/libs/core/langchain_core/runnables/base.py (L250-L274)
-
003c98e5b4/libs/core/langchain_core/runnables/base.py (L394-L396)
This problem prevents getting a graph of `Runnable` objects whose
`InputType` or `OutputType` property raises `TypeError` but whose
`invoke` works well, such as `langchain.output_parsers.RegexParser`,
which I have already pointed out in #19792 that a `TypeError` would
occur.
## Solution
- Add `try-except` syntax to handle `TypeError` to the codes which get
`input_node` and `output_node`.
# Issue
- #19801
# Twitter Handle
- [hmdev3](https://twitter.com/hmdev3)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
```python
class UsageMetadata(TypedDict):
"""Usage metadata for a message, such as token counts.
Attributes:
input_tokens: (int) count of input (or prompt) tokens
output_tokens: (int) count of output (or completion) tokens
total_tokens: (int) total token count
"""
input_tokens: int
output_tokens: int
total_tokens: int
```
```python
class AIMessage(BaseMessage):
...
usage_metadata: Optional[UsageMetadata] = None
"""If provided, token usage information associated with the message."""
...
```
- if tap_output_iter/aiter is called multiple times for the same run
issue events only once
- if chat model run is tapped don't issue duplicate on_llm_new_token
events
- if first chunk arrives after run has ended do not emit it as a stream
event
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Example error message:
line 206, in _get_python_function_required_args
if is_function_type and required[0] == "self":
~~~~~~~~^^^
IndexError: list index out of range
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
Check if event stream is closed in memory loop.
Using try/except here to avoid race condition, but this may incur a
small overhead in versions prios to 3.11
Do not prefix function signature
---
* Reason for this is that information is already present with tool
calling models.
* This will save on tokens for those models, and makes it more obvious
what the description is!
* The @tool can get more parameters to allow a user to re-introduce the
the signature if we want
To permit proper coercion of objects like the following:
```python
class MyAsyncCallable:
async def __call__(self, foo):
return await ...
class MyAsyncGenerator:
async def __call__(self, foo):
await ...
yield
```
This PR introduces a v2 implementation of astream events that removes
intermediate abstractions and fixes some issues with v1 implementation.
The v2 implementation significantly reduces relevant code that's
associated with the astream events implementation together with
overhead.
After this PR, the astream events implementation:
- Uses an async callback handler
- No longer relies on BaseTracer
- No longer relies on json patch
As a result of this re-write, a number of issues were discovered with
the existing implementation.
## Changes in V2 vs. V1
### on_chat_model_end `output`
The outputs associated with `on_chat_model_end` changed depending on
whether it was within a chain or not.
As a root level runnable the output was:
```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```
As part of a chain the output was:
```
"data": {
"output": {
"generations": [
[
{
"generation_info": None,
"message": AIMessageChunk(
content="hello world!", id=AnyStr()
),
"text": "hello world!",
"type": "ChatGenerationChunk",
}
]
],
"llm_output": None,
}
},
```
After this PR, we will always use the simpler representation:
```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```
**NOTE** Non chat models (i.e., regular LLMs) are still associated with
the more verbose format.
### Remove some `_stream` events
`on_retriever_stream` and `on_tool_stream` events were removed -- these
were not real events, but created as an artifact of implementing on top
of astream_log.
The same information is already available in the `x_on_end` events.
### Propagating Names
Names of runnables have been updated to be more consistent
```python
model = GenericFakeChatModel(messages=infinite_cycle).configurable_fields(
messages=ConfigurableField(
id="messages",
name="Messages",
description="Messages return by the LLM",
)
)
```
Before:
```python
"name": "RunnableConfigurableFields",
```
After:
```python
"name": "GenericFakeChatModel",
```
### on_retriever_end
on_retriever_end will always return `output` which is a list of
documents (rather than a dict containing a key called "documents")
### Retry events
Removed the `on_retry` callback handler. It was incorrectly showing that
the failed function being retried has invoked `on_chain_end`
https://github.com/langchain-ai/langchain/pull/21638/files#diff-e512e3f84daf23029ebcceb11460f1c82056314653673e450a5831147d8cb84dL1394
Add unit tests that show differences between sync / async versions when
streaming.
The inner on_chain_chunk event is missing if mixing sync and async
functionality. Likely due to missing tap_output_iter implementation on
the sync variant of `_transform_stream_with_config`
- it's only node ids that are limited
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
Issues (nit):
1. `utils.guard_import` prints wrong error message when there is an
import `error.` It prints the whole `module_name` but should be only the
first part as the pip package name. E.i. `langchain_core.utils` -> print
not `langchain-core` but `langchain_core.utils`. Also replace '_' with
'-' in the pip package name.
2. it does not handle the `ModuleNotFoundError` which raised if
`guard_import("wrong_module")`
Fixed issues; added ut-s. Controversial: I've reraised
`ModuleNotFoundError` as `ImportError`, since in case of the error, the
proposed action is the same - we need to install a missed package.
- support two-tuples of any sequence type (eg. json.loads never produces
tuples)
- support type alias for role key
- if id is passed in in dict form use it
- if tool_calls passed in in dict form use them
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Removed redundant self/cls from required args of class functions in
_get_python_function_required_args:
```python
class MemberTool:
def search_member(
self,
keyword: str,
*args,
**kwargs,
):
"""Search on members with any keyword like first_name, last_name, email
Args:
keyword: Any keyword of member
"""
headers = dict(authorization=kwargs['token'])
members = []
try:
members = request_(
method='SEARCH',
url=f'{service_url}/apiv1/members',
headers=headers,
json=dict(query=keyword),
)
except Exception as e:
logger.info(e.__doc__)
return members
convert_to_openai_tool(MemberTool.search_member)
```
expected result:
```
{'type': 'function', 'function': {'name': 'search_member', 'description': 'Search on members with any keyword like first_name, last_name, username, email', 'parameters': {'type': 'object', 'properties': {'keyword': {'type': 'string', 'description': 'Any keyword of member'}}, 'required': ['keyword']}}}
```
#20685
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description**: This update enhances the `extract_sub_links` function
within the `langchain_core/utils/html.py` module to include query
parameters in the extracted URLs.
**Issue**: N/A
**Dependencies**: No additional dependencies required for this change.
**Twitter handle**: N/A
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** Changes
`lanchain_core.output_parsers.CommaSeparatedListOutputParser` to handle
`,` as a delimiter alongside the previous implementation which used `, `
as delimiter.
- **Issue:** Started noticing that some results returned by LLMs were
not getting parsed correctly when the output contained `,` instead of `,
`.
- **Dependencies:** No
- **Twitter handle:** not active on twitter.
<!---
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
-->