This PR updates the update_documents docstring to accurately reflect its
batch behavior and clarify the positional mapping between ids and
documents.
No breaking changes. Documentation-only update.
**Description:**
Fixes the OpenCLIP × Chroma regression that caused nested embedding
errors when adding or searching image data.
The test case `test_openclip_chroma_embed_no_nesting_error` has been
restored and verified to work correctly with the current LangChain core
dependencies.
Functional validation confirms that `similarity_search_by_image` now
returns correct, metadata‑preserving results.
**Issue:**
Fixes #33851
**Dependencies:**
No new dependencies introduced.
**Testing:**
All tests under
```bash
uv run --group test pytest tests/unit_tests
```
result:
```
30 passed in 91.26s (0:01:31)
```
have passed successfully using Python 3.13.9 and uv‑managed environment.
This confirms that the regression has been fixed.
Running
```bash
make test
```
still produces cleanup‑time `AttributeError: 'ProactorEventLoop' object
has no attribute '_ssock'` on Windows (Python 3.13+).
This is a benign asyncio teardown message rather than a functional
failure.
`uv run pytest` closes event loops immediately after tests, while `make
test` invokes pytest through a secondary process layer that leaves a
background loop alive at interpreter shutdown.
This difference in teardown behavior explains the extra messages seen
only when using `make test`.
**Summary:**
- Verified the OpenCLIP + Chroma image pipeline works correctly.
- `uv run --group test pytest` fully passes; the fix is complete.
- The residual `_ssock` warnings occur only during
Windows asyncio cleanup and are not related to this code change.
This is my first time contributing code, please contact me with any
questions
---
---------
Co-authored-by: Mason Daugherty <mason@langchain.dev>
Co-authored-by: Mason Daugherty <github@mdrxy.com>
Largely:
- Remove explicit `"Default is x"` since new refs show default inferred
from sig
- Inline code (useful for eventual parsing)
- Fix code block rendering (indentations)
* Adding support for more Chroma client options (`HttpClient` and
`CloundClient`). This includes adding arguments necessary for
instantiating these clients.
* Adding support for Chroma's new persisted collection configuration (we
moved index configuration into this new construct).
* Delegate `Settings` configuration to Chroma's client constructors.
Thank you for contributing to LangChain!
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
- Example: "community: add foobar LLM"
**Description**:
* Starting to put together some PR's to fix the typing around
`langchain-chroma` `filter` and `where_document` query filtering, as
mentioned:
https://github.com/langchain-ai/langchain/issues/30879https://github.com/langchain-ai/langchain/issues/30507
The typing of `dict[str, str]` is on the one hand too restrictive (marks
valid filter expressions as ill-typed) and also too permissive (allows
illegal filter expressions). That's not what this PR addresses though.
This PR just removes from the documentation some examples of filters
that are illegal, and also syntactically incorrect: (a) dictionaries
with keys like `$contains` but the key is missing quotation marks; (b)
dictionaries with multiple entries - this is illegal in Chroma filter
syntax and will raise an exception. (`{"foo": "bar", "qux": "baz"}`).
Filter dictionaries in Chroma must have one and one key only. Again this
is just the documentation issue, which is the lowest hanging fruit. I
also think we need to update the types for `filter` and `where_document`
to be (at the very least `dict[str, Any]`), or, since we have access to
Chroma's types, they should be `Where` and `WhereDocument` types. This
has a wider blast radius though, so I'm starting small.
This PR does not fix the issues mentioned above, it's just starting to
get the ball rolling, and cleaning up the documentation.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Really Him <hesereallyhim@proton.me>
- **Description:** Small fix in `add_texts` to make embedding
nullability is checked properly.
- **Issue:** #29765
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
…ent path given.
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- Run standard integration tests in Chroma
- Add `get_by_ids` method
- Fix bug in `add_texts`: if a list of `ids` is passed but any of them
are None, Chroma will raise an exception. Here we assign a uuid.
Description:
* Added internal `Document.id` support to Chroma VectorStore
Dependencies:
* https://github.com/langchain-ai/langchain/pull/27968 should be merged
first and this PR should be re-based on top of those changes.
Tests:
* Modified/Added tests for `Document.id` support. All tests are passing.
Note: I am not a member of the Chroma team.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
This PR adds an additional method to `Chroma` to retrieve the embedding
vectors, besides the most relevant Documents. This is sometimes of use
when you need to run a postprocessing algorithm on the retrieved results
based on the vectors, which has been the case for me lately.
Example issue (discussion) requesting this change:
https://github.com/langchain-ai/langchain/discussions/20383
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
Example updated for vectorstore ChromaDB.
If we want to apply multiple filters then ChromaDB supports filters like
this:
Reference: [ChromaDB
filters](https://cookbook.chromadb.dev/core/filters/)
Thank you.
- **Description:** This pull request introduces two new methods to the
Langchain Chroma partner package that enable similarity search based on
image embeddings. These methods enhance the package's functionality by
allowing users to search for images similar to a given image URI. Also
introduces a notebook to demonstrate it's use.
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** @mrugank9009
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
Thank you for contributing to LangChain!
- [X] **PR title**: "docs: Chroma docstrings update"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [X] **PR message**:
- **Description:** Added and updated Chroma docstrings
- **Issue:** https://github.com/langchain-ai/langchain/issues/21983
- [X] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- only docs
- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
**Description**:
- Reference to `Collection` object is set to `None` when deleting a
collection `delete_collection()`
- Added utility method `reset_collection()` to allow recreating the
collection
- Moved collection creation out of `__init__` into
`__ensure_collection()` to be reused by object init and
`reset_collection()`
- `_collection` is now a property to avoid breaking changes
**Issues**:
- chroma-core/chroma#2213
**Twitter**: @t_azarov
**Description:** Adds chroma to the partners package. Tests & code
mirror those in the community package.
**Dependencies:** None
**Twitter handle:** @akiradev0x
---------
Co-authored-by: Erick Friis <erick@langchain.dev>