- Mixtral with Groq has started consistently failing tool calling tests.
Here we restrict testing to llama 3.1.
- `.schema` is deprecated in pydantic proper in favor of
`.model_json_schema`.
- [ ] **PR title**: "langchain-openai: openai proxy added to base
embeddings"
- [ ] **PR message**:
- **Description:**
Dear langchain developers,
You've already supported proxy for ChatOpenAI implementation in your
package. At the same time, if somebody needed to use proxy for chat, it
also could be necessary to be able to use it for OpenAIEmbeddings.
That's why I think it's important to add proxy support for OpenAI
embeddings. That's what I've done in this PR.
@baskaryan
---------
Co-authored-by: karpov <karpov@dohod.ru>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
In the `ChatFireworks` class definition, the Field() call for the "stop"
("stop_sequences") parameter is missing the "default" keyword.
**Issue:**
Type checker reports "stop_sequences" as a missing arg (not recognizing
the default value is None)
**Dependencies:**
None
**Twitter handle:**
None
Mistral appears to have added validation for the format of its tool call
IDs:
`{"object":"error","message":"Tool call id was abc123 but must be a-z,
A-Z, 0-9, with a length of
9.","type":"invalid_request_error","param":null,"code":null}`
This breaks compatibility of messages from other providers. Here we add
a function that converts any string to a Mistral-valid tool call ID, and
apply it to incoming messages.
#### Update (2):
A single `UnstructuredLoader` is added to handle both local and api
partitioning. This loader also handles single or multiple documents.
#### Changes in `community`:
Changes here do not affect users. In the initial process of using the
SDK for the API Loaders, the Loaders in community were refactored.
Other changes include:
The `UnstructuredBaseLoader` has a new check to see if both
`mode="paged"` and `chunking_strategy="by_page"`. It also now has
`Element.element_id` added to the `Document.metadata`.
`UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. As such,
now both directly inherit from `UnstructuredBaseLoader` and initialize
their `file_path`/`file` attributes respectively and implement their own
`_post_process_elements` methods.
--------
#### Update:
New SDK Loaders in a [partner
package](https://python.langchain.com/v0.1/docs/contributing/integrations/#partner-package-in-langchain-repo)
are introduced to prevent breaking changes for users (see discussion
below).
##### TODO:
- [x] Test docstring examples
--------
- **Description:** UnstructuredAPIFileIOLoader and
UnstructuredAPIFileLoader calls to the unstructured api are now made
using the unstructured-client sdk.
- **New Dependencies:** unstructured-client
- [x] **Add tests and docs**: If you're adding a new integration, please
include
- [x] a test for the integration, preferably unit tests that do not rely
on network access,
- [x] update the description in
`docs/docs/integrations/providers/unstructured.mdx`
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
TODO:
- [x] Update
https://python.langchain.com/v0.1/docs/integrations/document_loaders/unstructured_file/#unstructured-api
-
`langchain/docs/docs/integrations/document_loaders/unstructured_file.ipynb`
- The description here needs to indicate that users should install
`unstructured-client` instead of `unstructured`. Read over closely to
look for any other changes that need to be made.
- [x] Update the `lazy_load` method in `UnstructuredBaseLoader` to
handle json responses from the API instead of just lists of elements.
- This method may need to be overwritten by the API loaders instead of
changing it in the `UnstructuredBaseLoader`.
- [x] Update the documentation links in the class docstrings (the
Unstructured documents have moved)
- [x] Update Document.metadata to include `element_id` (see thread
[here](https://unstructuredw-kbe4326.slack.com/archives/C044N0YV08G/p1718187499818419))
---------
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
add dynamic field feature to langchain_milvus
more unittest, more robustic
plan to deprecate the `metadata_field` in the future, because it's
function is the same as `enable_dynamic_field`, but the latter one is a
more advanced concept in milvus
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description**
Add support for Pinecone hosted embedding models as
`PineconeEmbeddings`. Replacement for #22890
**Dependencies**
Add `aiohttp` to support async embeddings call against REST directly
- [x] **Add tests and docs**: If you're adding a new integration, please
include
Added `docs/docs/integrations/text_embedding/pinecone.ipynb`
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Twitter: `gdjdg17`
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:** Fixes an issue where the chat message history was not
returned in order. Fixed it now by returning based on timestamps.
- [x] **Add tests and docs**: Updated the tests to check the order
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
---------
Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
**Description:** : Add support for chat message history using Couchbase
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
---------
Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
## Description
This pull-request improves the treatment of document IDs in
`MongoDBAtlasVectorSearch`.
Class method signatures of add_documents, add_texts, delete, and
from_texts
now include an `ids:Optional[List[str]]` keyword argument permitting the
user
greater control.
Note that, as before, IDs may also be inferred from
`Document.metadata['_id']`
if present, but this is no longer required,
IDs can also optionally be returned from searches.
This PR closes the following JIRA issues.
* [PYTHON-4446](https://jira.mongodb.org/browse/PYTHON-4446)
MongoDBVectorSearch delete / add_texts function rework
* [PYTHON-4435](https://jira.mongodb.org/browse/PYTHON-4435) Add support
for "Indexing"
* [PYTHON-4534](https://jira.mongodb.org/browse/PYTHON-4534) Ensure
datetimes are json-serializable
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description:** This pull request introduces two new methods to the
Langchain Chroma partner package that enable similarity search based on
image embeddings. These methods enhance the package's functionality by
allowing users to search for images similar to a given image URI. Also
introduces a notebook to demonstrate it's use.
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** @mrugank9009
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
## Description
This PR adds integration tests to follow up on #24164.
By default, the tests use an in-memory instance.
To run the full suite of tests, with both in-memory and Qdrant server:
```
$ docker run -p 6333:6333 qdrant/qdrant
$ make test
$ make integration_test
```
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:** Explicitly add parameters from openai API
- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
I stumbled upon a bug that led to different similarity scores between
the async and sync similarity searches with relevance scores in Qdrant.
The reason being is that _asimilarity_search_with_relevance_scores is
missing, this makes langchain_qdrant use the method of the vectorstore
baseclass leading to drastically different results.
To illustrate the magnitude here are the results running an identical
search in a test vectorstore.
Output of asimilarity_search_with_relevance_scores:
[0.9902903374601824, 0.9472135924938804, 0.8535534011299859]
Output of similarity_search_with_relevance_scores:
[0.9805806749203648, 0.8944271849877607, 0.7071068022599718]
Co-authored-by: Erick Friis <erick@langchain.dev>