Upgrade to using a literal for specifying the extra which is the
recommended approach in pydantic 2.
This works correctly also in pydantic v1.
```python
from pydantic.v1 import BaseModel
class Foo(BaseModel, extra="forbid"):
x: int
Foo(x=5, y=1)
```
And
```python
from pydantic.v1 import BaseModel
class Foo(BaseModel):
x: int
class Config:
extra = "forbid"
Foo(x=5, y=1)
```
## Enum -> literal using grit pattern:
```
engine marzano(0.1)
language python
or {
`extra=Extra.allow` => `extra="allow"`,
`extra=Extra.forbid` => `extra="forbid"`,
`extra=Extra.ignore` => `extra="ignore"`
}
```
Resorted attributes in config and removed doc-string in case we will
need to deal with going back and forth between pydantic v1 and v2 during
the 0.3 release. (This will reduce merge conflicts.)
## Sort attributes in Config:
```
engine marzano(0.1)
language python
function sort($values) js {
return $values.text.split(',').sort().join("\n");
}
class_definition($name, $body) as $C where {
$name <: `Config`,
$body <: block($statements),
$values = [],
$statements <: some bubble($values) assignment() as $A where {
$values += $A
},
$body => sort($values),
}
```
Description: RetryWithErrorOutputParser.from_llm() creates a retry chain
that returns a Generation instance, when it should actually just return
a string.
This class was forgotten when fixing the issue in PR #24687
The @pre_init validator is a temporary solution for base models. It has
similar (but not identical) semantics to @root_validator(), but it works
strictly as a pre-init validator.
It'll work as expected as long as the pydantic model type hints were
correct.
Description: Since moving away from `langchain-community` is
recommended, `init_chat_models()` should import ChatOllama from
`langchain-ollama` instead.
Description: OutputFixingParser.from_llm() creates a retry chain that
returns a Generation instance, when it should actually just return a
string.
Issue: https://github.com/langchain-ai/langchain/issues/24600
Twitter handle: scribu
---------
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Description:
- This PR adds a self query retriever implementation for SAP HANA Cloud
Vector Engine. The retriever supports all operators except for contains.
- Issue: N/A
- Dependencies: no new dependencies added
**Add tests and docs:**
Added integration tests to:
libs/community/tests/unit_tests/query_constructors/test_hanavector.py
**Documentation for self query retriever:**
/docs/integrations/retrievers/self_query/hanavector_self_query.ipynb
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
The previous implementation would never be called.
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
---------
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
This will generate a meaningless string "system: " for generating
condense question; this increases the probability to make an improper
condense question and misunderstand user's question. Below is a case
- Original Question: Can you explain the arguments of Meilisearch?
- Condense Question
- What are the benefits of using Meilisearch? (by CodeLlama)
- What are the reasons for using Meilisearch? (by GPT-4)
The condense questions (not matter from CodeLlam or GPT-4) are different
from the original one.
By checking the content of each dialogue turn, generating history string
only when the dialog content is not empty.
Since there is nothing before first turn, the "history" mechanism will
be ignored at the very first turn.
Doing so, the condense question will be "What are the arguments for
using Meilisearch?".
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
### Description
* Fix `libs/langchain/dev.Dockerfile` file. copy the
`libs/standard-tests` folder when building the devcontainer.
* `poetry install --no-interaction --no-ansi --with dev,test,docs`
command requires this folder, but it was not copied.
### Reference
#### Error message when building the devcontainer from the master branch
```
...
[2024-07-20T14:27:34.779Z] ------
> [langchain langchain-dev-dependencies 7/7] RUN poetry install --no-interaction --no-ansi --with dev,test,docs:
0.409
0.409 Directory ../standard-tests does not exist
------
...
```
#### After the fix
Build success at vscode:
<img width="866" alt="image"
src="https://github.com/user-attachments/assets/10db1b50-6fcf-4dfe-83e1-d93c96aa2317">
Added asynchronously callable methods according to the
ConversationSummaryBufferMemory API documentation.
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
**Description:**
When initializing retrievers with `configurable_fields` as base
retriever, `ContextualCompressionRetriever` validation fails with the
following error:
```
ValidationError: 1 validation error for ContextualCompressionRetriever
base_retriever
Can't instantiate abstract class BaseRetriever with abstract method _get_relevant_documents (type=type_error)
```
Example code:
```python
esearch_retriever = VertexAISearchRetriever(
project_id=GCP_PROJECT_ID,
location_id="global",
data_store_id=SEARCH_ENGINE_ID,
).configurable_fields(
filter=ConfigurableField(id="vertex_search_filter", name="Vertex Search Filter")
)
# rerank documents with Vertex AI Rank API
reranker = VertexAIRank(
project_id=GCP_PROJECT_ID,
location_id=GCP_REGION,
ranking_config="default_ranking_config",
)
retriever_with_reranker = ContextualCompressionRetriever(
base_compressor=reranker, base_retriever=esearch_retriever
)
```
It seems like the issue stems from ContextualCompressionRetriever
insisting that base retrievers must be strictly `BaseRetriever`
inherited, and doesn't take into account cases where retrievers need to
be chained and can have configurable fields defined.
0a1e475a30/libs/langchain/langchain/retrievers/contextual_compression.py (L15-L22)
This PR proposes that the base_retriever type be set to `RetrieverLike`,
similar to how `EnsembleRetriever` validates its list of retrievers:
0a1e475a30/libs/langchain/langchain/retrievers/ensemble.py (L58-L75)
**Description:**
When you use Agents with multi-input tool and some of these tools have
`return_direct=True`, langchain thrown an error related to one
validator.
This change is implemented on [JS
community](https://github.com/langchain-ai/langchainjs/pull/4643) as
well
**Issue**:
This MR resolves#19843
**Dependencies:**
None
Co-authored-by: Jesus Martinez <jesusabraham.martinez@tyson.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Fix MultiQueryRetriever breaking Embeddings with empty lines
```
[chain/end] [1:chain:ConversationalRetrievalChain > 2:retriever:Retriever > 3:retriever:Retriever > 4:chain:LLMChain] [2.03s] Exiting Chain run with output:
[outputs]
> /workspaces/Sfeir/sncf/metabot-backend/.venv/lib/python3.11/site-packages/langchain/retrievers/multi_query.py(116)_aget_relevant_documents()
-> if self.include_original:
(Pdb) queries
['## Alternative questions for "Hello, tell me about phones?":', '', '1. **What are the latest trends in smartphone technology?** (Focuses on recent advancements)', '2. **How has the mobile phone industry evolved over the years?** (Historical perspective)', '3. **What are the different types of phones available in the market, and which one is best for me?** (Categorization and recommendation)']
```
Example of failure on VertexAIEmbeddings
```
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "The text content is empty."
debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.184.234:443 {created_time:"2024-04-30T09:57:45.625698408+00:00", grpc_status:3, grpc_message:"The text content is empty."}"
```
Fixes: #15959
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- **Description:** Add an async version of `add_documents` to
`ParentDocumentRetriever`
- **Twitter handle:** @johnkdev
---------
Co-authored-by: John Kelly <j.kelly@mwam.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
# Description
This PR aims to solve a bug in `OutputFixingParser`, `RetryOutputParser`
and `RetryWithErrorOutputParser`
The bug is that the wrong keyword argument was given to `retry_chain`.
The correct keyword argument is 'completion', but 'input' is used.
This pull request makes the following changes:
1. correct a `dict` key given to `retry_chain`;
2. add a test when using the default prompt.
- `NAIVE_FIX_PROMPT` for `OutputFixingParser`;
- `NAIVE_RETRY_PROMPT` for `RetryOutputParser`;
- `NAIVE_RETRY_WITH_ERROR_PROMPT` for `RetryWithErrorOutputParser`;
3. ~~add comments on `retry_chain` input and output types~~ clarify
`InputType` and `OutputType` of `retry_chain`
# Issue
The bug is pointed out in
https://github.com/langchain-ai/langchain/pull/19792#issuecomment-2196512928
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** `StuffDocumentsChain` uses `LLMChain` which is
deprecated by langchain runnables. `create_stuff_documents_chain` is the
replacement, but needs support for `document_variable_name` to allow
multiple uses of the chain within a longer chain.
- **Issue:** none
- **Dependencies:** none
**Description**: The ``declarative_base()`` function is now available as
sqlalchemy.orm.declarative_base(). (depreca ted since: 2.0) (Background
on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
---------
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
**langchain: ConversationVectorStoreTokenBufferMemory**
-**Description:** This PR adds ConversationVectorStoreTokenBufferMemory.
It is similar in concept to ConversationSummaryBufferMemory. It
maintains an in-memory buffer of messages up to a preset token limit.
After the limit is hit timestamped messages are written into a
vectorstore retriever rather than into a summary. The user's prompt is
then used to retrieve relevant fragments of the previous conversation.
By persisting the vectorstore, one can maintain memory from session to
session.
-**Issue:** n/a
-**Dependencies:** none
-**Twitter handle:** Please no!!!
- [X] **Add tests and docs**: I looked to see how the unit tests were
written for the other ConversationMemory modules, but couldn't find
anything other than a test for successful import. I need to know whether
you are using pytest.mock or another fixture to simulate the LLM and
vectorstore. In addition, I would like guidance on where to place the
documentation. Should it be a notebook file in docs/docs?
- [X] **Lint and test**: I am seeing some linting errors from a couple
of modules unrelated to this PR.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
The SelfQuery PGVectorTranslator is not correct. The operator is "eq"
and not "$eq".
This patch use a new version of PGVectorTranslator from
langchain_postgres.
It's necessary to release a new version of langchain_postgres (see
[here](https://github.com/langchain-ai/langchain-postgres/pull/75)
before accepting this PR in langchain.
fix systax warning in `create_json_chat_agent`
```
.../langchain/agents/json_chat/base.py:22: SyntaxWarning: invalid escape sequence '\ '
"""Create an agent that uses JSON to format its logic, build for Chat Models.
```
- **Description:** Restores compatibility with SQLAlchemy 1.4.x that was
broken since #18992 and adds a test run for this version on CI (only for
Python 3.11)
- **Issue:** fixes#19681
- **Dependencies:** None
- **Twitter handle:** `@krassowski_m`
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
# Description
This pull request aims to address specific issues related to the
ambiguity and error-proneness of the output types of certain output
parsers, as well as the absence of unit tests for some parsers. These
issues could potentially lead to runtime errors or unexpected behaviors
due to type mismatches when used, causing confusion for developers and
users. Through clarifying output types, this PR seeks to improve the
stability and reliability.
Therefore, this pull request
- fixes the `OutputType` of OutputParsers to be the expected type;
- e.g. `OutputType` property of `EnumOutputParser` raises `TypeError`.
This PR introduce a logic to extract `OutputType` from its attribute.
- and fixes the legacy API in OutputParsers like `LLMChain.run` to the
modern API like `LLMChain.invoke`;
- Note: For `OutputFixingParser`, `RetryOutputParser` and
`RetryWithErrorOutputParser`, this PR introduces `legacy` attribute with
False as default value in order to keep the backward compatibility
- and adds the tests for the `OutputFixingParser` and
`RetryOutputParser`.
The following table shows my expected output and the actual output of
the `OutputType` of OutputParsers.
I have used this table to fix `OutputType` of OutputParsers.
| Class Name of OutputParser | My Expected `OutputType` (after this PR)|
Actual `OutputType` [evidence](#evidence) (before this PR)| Fix Required
|
|---------|--------------|---------|--------|
| BooleanOutputParser | `<class 'bool'>` | `<class 'bool'>` | NO |
| CombiningOutputParser | `typing.Dict[str, Any]` | `TypeError` is
raised | YES |
| DatetimeOutputParser | `<class 'datetime.datetime'>` | `<class
'datetime.datetime'>` | NO |
| EnumOutputParser(enum=MyEnum) | `MyEnum` | `TypeError` is raised | YES
|
| OutputFixingParser | The same type as `self.parser.OutputType` | `~T`
| YES |
| CommaSeparatedListOutputParser | `typing.List[str]` |
`typing.List[str]` | NO |
| MarkdownListOutputParser | `typing.List[str]` | `typing.List[str]` |
NO |
| NumberedListOutputParser | `typing.List[str]` | `typing.List[str]` |
NO |
| JsonOutputKeyToolsParser | `typing.Any` | `typing.Any` | NO |
| JsonOutputToolsParser | `typing.Any` | `typing.Any` | NO |
| PydanticToolsParser | `typing.Any` | `typing.Any` | NO |
| PandasDataFrameOutputParser | `typing.Dict[str, Any]` | `TypeError` is
raised | YES |
| PydanticOutputParser(pydantic_object=MyModel) | `<class
'__main__.MyModel'>` | `<class '__main__.MyModel'>` | NO |
| RegexParser | `typing.Dict[str, str]` | `TypeError` is raised | YES |
| RegexDictParser | `typing.Dict[str, str]` | `TypeError` is raised |
YES |
| RetryOutputParser | The same type as `self.parser.OutputType` | `~T` |
YES |
| RetryWithErrorOutputParser | The same type as `self.parser.OutputType`
| `~T` | YES |
| StructuredOutputParser | `typing.Dict[str, Any]` | `TypeError` is
raised | YES |
| YamlOutputParser(pydantic_object=MyModel) | `MyModel` | `~T` | YES |
NOTE: In "Fix Required", "YES" means that it is required to fix in this
PR while "NO" means that it is not required.
# Issue
No issues for this PR.
# Twitter handle
- [hmdev3](https://twitter.com/hmdev3)
# Questions:
1. Is it required to create tests for legacy APIs `LLMChain.run` in the
following scripts?
- libs/langchain/tests/unit_tests/output_parsers/test_fix.py;
- libs/langchain/tests/unit_tests/output_parsers/test_retry.py.
2. Is there a more appropriate expected output type than I expect in the
above table?
- e.g. the `OutputType` of `CombiningOutputParser` should be
SOMETHING...
# Actual outputs (before this PR)
<div id='evidence'></div>
<details><summary>Actual outputs</summary>
## Requirements
- Python==3.9.13
- langchain==0.1.13
```python
Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import langchain
>>> langchain.__version__
'0.1.13'
>>> from langchain import output_parsers
```
### `BooleanOutputParser`
```python
>>> output_parsers.BooleanOutputParser().OutputType
<class 'bool'>
```
### `CombiningOutputParser`
```python
>>> output_parsers.CombiningOutputParser(parsers=[output_parsers.DatetimeOutputParser(), output_parsers.CommaSeparatedListOutputParser()]).OutputType
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
raise TypeError(
TypeError: Runnable CombiningOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```
### `DatetimeOutputParser`
```python
>>> output_parsers.DatetimeOutputParser().OutputType
<class 'datetime.datetime'>
```
### `EnumOutputParser`
```python
>>> from enum import Enum
>>> class MyEnum(Enum):
... a = 'a'
... b = 'b'
...
>>> output_parsers.EnumOutputParser(enum=MyEnum).OutputType
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
raise TypeError(
TypeError: Runnable EnumOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```
### `OutputFixingParser`
```python
>>> output_parsers.OutputFixingParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```
### `CommaSeparatedListOutputParser`
```python
>>> output_parsers.CommaSeparatedListOutputParser().OutputType
typing.List[str]
```
### `MarkdownListOutputParser`
```python
>>> output_parsers.MarkdownListOutputParser().OutputType
typing.List[str]
```
### `NumberedListOutputParser`
```python
>>> output_parsers.NumberedListOutputParser().OutputType
typing.List[str]
```
### `JsonOutputKeyToolsParser`
```python
>>> output_parsers.JsonOutputKeyToolsParser(key_name='tool').OutputType
typing.Any
```
### `JsonOutputToolsParser`
```python
>>> output_parsers.JsonOutputToolsParser().OutputType
typing.Any
```
### `PydanticToolsParser`
```python
>>> from langchain.pydantic_v1 import BaseModel
>>> class MyModel(BaseModel):
... a: int
...
>>> output_parsers.PydanticToolsParser(tools=[MyModel, MyModel]).OutputType
typing.Any
```
### `PandasDataFrameOutputParser`
```python
>>> output_parsers.PandasDataFrameOutputParser().OutputType
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
raise TypeError(
TypeError: Runnable PandasDataFrameOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```
### `PydanticOutputParser`
```python
>>> output_parsers.PydanticOutputParser(pydantic_object=MyModel).OutputType
<class '__main__.MyModel'>
```
### `RegexParser`
```python
>>> output_parsers.RegexParser(regex='$', output_keys=['a']).OutputType
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
raise TypeError(
TypeError: Runnable RegexParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```
### `RegexDictParser`
```python
>>> output_parsers.RegexDictParser(output_key_to_format={'a':'a'}).OutputType
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
raise TypeError(
TypeError: Runnable RegexDictParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```
### `RetryOutputParser`
```python
>>> output_parsers.RetryOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```
### `RetryWithErrorOutputParser`
```python
>>> output_parsers.RetryWithErrorOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```
### `StructuredOutputParser`
```python
>>> from langchain.output_parsers.structured import ResponseSchema
>>> response_schemas = [ResponseSchema(name="foo",description="a list of strings",type="List[string]"),ResponseSchema(name="bar",description="a string",type="string"), ]
>>> output_parsers.StructuredOutputParser.from_response_schemas(response_schemas).OutputType
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
raise TypeError(
TypeError: Runnable StructuredOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```
### `YamlOutputParser`
```python
>>> output_parsers.YamlOutputParser(pydantic_object=MyModel).OutputType
~T
```
<div>
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Fix https://github.com/langchain-ai/langchain/issues/22972.
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
**Description:**
- What I changed
- By specifying the `id_key` during the initialization of
`EnsembleRetriever`, it is now possible to determine which documents to
merge scores for based on the value corresponding to the `id_key`
element in the metadata, instead of `page_content`. Below is an example
of how to use the modified `EnsembleRetriever`:
```python
retriever = EnsembleRetriever(retrievers=[ret1, ret2], id_key="id") #
The Document returned by each retriever must keep the "id" key in its
metadata.
```
- Additionally, I added a script to easily test the behavior of the
`invoke` method of the modified `EnsembleRetriever`.
- Why I changed
- There are cases where you may want to calculate scores by treating
Documents with different `page_content` as the same when using
`EnsembleRetriever`. For example, when you want to ensemble the search
results of the same document described in two different languages.
- The previous `EnsembleRetriever` used `page_content` as the basis for
score aggregation, making the above usage difficult. Therefore, the
score is now calculated based on the specified key value in the
Document's metadata.
**Twitter handle:** @shimajiroxyz
- **Description:** add tool_messages_formatter for tool calling agent,
make tool messages can be formatted in different ways for your LLM.
- **Issue:** N/A
- **Dependencies:** N/A
Remove the REPL from community, and suggest an alternative import from
langchain_experimental.
Fix for this issue:
https://github.com/langchain-ai/langchain/issues/14345
This is not a bug in the code or an actual security risk. The python
REPL itself is behaving as expected.
The PR is done to appease blanket security policies that are just
looking for the presence of exec in the code.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
We need to use a different version of numpy for py3.8 and py3.12 in
pyproject.
And so do projects that use that Python version range and import
langchain.
- **Twitter handle:** _cbornet
Thank you for contributing to LangChain!
- [ ] **PR title**: "langchain: Fix chain_filter.py to be compatible
with async"
- [ ] **PR message**:
- **Description:** chain_filter is not compatible with async.
- **Twitter handle:** pprados
- [X ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
---------
Signed-off-by: zhangwangda <zhangwangda94@163.com>
Co-authored-by: Prakul <discover.prakul@gmail.com>
Co-authored-by: Lei Zhang <zhanglei@apache.org>
Co-authored-by: Gin <ictgtvt@gmail.com>
Co-authored-by: wangda <38549158+daziz@users.noreply.github.com>
Co-authored-by: Max Mulatz <klappradla@posteo.net>
- **Description:** allow to use partial variables to pass `top_k` and
`table_info`
- **Issue:** no
- **Dependencies:** no
- **Twitter handle:** @gymnstcs
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** This PR updates the `WandbTracer` to work with the
new RunV2 API so that wandb Traces logging works correctly for new
LangChain versions. Here's an example
[run](https://wandb.ai/parambharat/langchain-tracing/runs/wpm99ftq) from
the existing tests
- **Issue:** https://github.com/wandb/wandb/issues/7762
- **Twitter handle:** @ParamBharat
_If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17._
The fact that we outsourced pgvector to another project has an
unintended effect. The mapping dictionary found by
`_get_builtin_translator()` cannot recognize the new version of pgvector
because it comes from another package.
`SelfQueryRetriever` no longer knows `PGVector`.
I propose to fix this by creating a global dictionary that can be
populated by various database implementations. Thus, importing
`langchain_postgres` will allow the registration of the `PGvector`
mapping.
But for the moment I'm just adding a lazy import
Furthermore, the implementation of _get_builtin_translator()
reconstructs the BUILTIN_TRANSLATORS variable with each invocation,
which is not very efficient. A global map would be an optimization.
- **Twitter handle:** pprados
@eyurtsev, can you review this PR? And unlock the PR [Add async mode for
pgvector](https://github.com/langchain-ai/langchain-postgres/pull/32)
and PR [community[minor]: Add SQL storage
implementation](https://github.com/langchain-ai/langchain/pull/22207)?
Are you in favour of a global dictionary-based implementation of
Translator?
They cause `poetry lock` to take a ton of time, and `uv pip install` can
resolve the constraints from these toml files in trivial time
(addressing problem with #19153)
This allows us to properly upgrade lockfile dependencies moving forward,
which revealed some issues that were either fixed or type-ignored (see
file comments)
decisions to discuss
- only chat models
- model_provider isn't based on any existing values like llm-type,
package names, class names
- implemented as function not as a wrapper ChatModel
- function name (init_model)
- in langchain as opposed to community or core
- marked beta
Add kwargs in add_documents function
**langchain**: Add **kwargs in parent_document_retriever"
- **Add kwargs for `add_document` in `parent_document_retriever.py`**
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
**Description:** Update langchainhub integration test dependency and add
an integration test for pulling private prompt
**Dependencies:** langchainhub 0.1.16