Commit Graph

26 Commits

Author SHA1 Message Date
Really Him
918c950737 DOCS: partners/chroma: Fix documentation around chroma query filter syntax (#31058)
**Description**:
* Starting to put together some PRs to fix the typing around
`langchain-chroma` `filter` and `where_document` query filtering, as
mentioned in:

https://github.com/langchain-ai/langchain/issues/30879
https://github.com/langchain-ai/langchain/issues/30507

The typing of `dict[str, str]` is both too restrictive (it marks valid
filter expressions as ill-typed) and too permissive (it allows illegal
filter expressions). That's not what this PR addresses, though. This PR
just removes from the documentation some examples of filters that are
illegal or syntactically incorrect: (a) dictionaries with keys like
`$contains` where the key is missing quotation marks; (b) dictionaries
with multiple entries, e.g. `{"foo": "bar", "qux": "baz"}`, which is
illegal in Chroma filter syntax and will raise an exception. Filter
dictionaries in Chroma must have one and only one key. Again, this is
just the documentation issue, which is the lowest-hanging fruit. I also
think we need to update the types for `filter` and `where_document` to
be, at the very least, `dict[str, Any]`, or, since we have access to
Chroma's types, `Where` and `WhereDocument`. This has a wider blast
radius, though, so I'm starting small.
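
To make the typing point concrete, here is a minimal sketch of the three filter shapes discussed above. The helper `has_single_top_level_key` and the field names are hypothetical and are not part of `langchain-chroma` or Chroma; the helper only encodes the one-key rule as this PR states it.

```python
from typing import Any


def has_single_top_level_key(where: dict[str, Any]) -> bool:
    """Hypothetical check: True if the filter has exactly one top-level key."""
    return len(where) == 1


# Field names below are made up for illustration.
simple = {"foo": "bar"}                 # fits dict[str, str]
operator = {"page": {"$gt": 3}}         # operator form; does not fit dict[str, str]
illegal = {"foo": "bar", "qux": "baz"}  # fits dict[str, str], but has two top-level keys

for candidate in (simple, operator, illegal):
    print(candidate, has_single_top_level_key(candidate))
```

The second shape is why `dict[str, str]` is too restrictive, and the third is why it is too permissive.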

This PR does not fix the issues mentioned above; it just gets the ball
rolling and cleans up the documentation.



---------

Co-authored-by: Really Him <hesereallyhim@proton.me>
2025-04-30 17:51:07 -04:00
Sydney Runkle
8c6734325b partners[lint]: run pyupgrade to get code in line with 3.9 standards (#30781)
Using `pyupgrade` to get all `partners` code up to 3.9 standards
(mostly, fixing old `typing` imports).
2025-04-11 07:18:44 -04:00
Iris Liu
0d9f0b4215 docs: updates Chroma integration API ref docs (#29826)
- Description: updates Chroma integration API ref docs
- Issue: #29817
- Dependencies: N/A
- Twitter handle: @irieliu

Co-authored-by: Iris <liuirisny@gmail.com>
2025-02-15 21:05:21 -05:00
ccurme
16fb1f5371 chroma[patch]: release 0.2.2 (#29769)
Resolves https://github.com/langchain-ai/langchain/issues/29765
2025-02-13 02:39:16 +00:00
Mohammad Mohtashim
2310847c0f (Chroma): Small Fix in add_texts when checking for embeddings (#29766)
- **Description:** Small fix in `add_texts` to make sure embedding
nullability is checked properly.
- **Issue:** #29765

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2025-02-13 02:26:13 +00:00
ccurme
d172984c91 infra: migrate to uv (#29566) 2025-02-06 13:36:26 -05:00
Vinit Kudva
a00258ec12 chroma: fix persistence if client_settings is passed in (#25199)
…ent path given.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-12-17 10:03:02 -05:00
Kaiwei Zhang
b909d54e70 chroma[patch]: Update logic for assigning ids 2024-12-13 21:58:34 +00:00
ZhangShenao
d26555c682 [VectorStore] Improvement: Improve chroma vector store (#28524)
- Complete unit test
- Fix spelling error
2024-12-05 11:58:32 -05:00
ccurme
8f9b3b7498 chroma[patch]: fix bug (#28538)
Fix bug introduced in
https://github.com/langchain-ai/langchain/pull/27995

If all document IDs are `""`, the chroma SDK will raise
```
DuplicateIDError: Expected IDs to be unique
```

Caught by [docs
tests](https://github.com/langchain-ai/langchain/actions/runs/12180395579/job/33974633950),
but added a test to langchain-chroma as well.
2024-12-05 15:37:19 +00:00
ccurme
eec55c2550 chroma[patch]: add get_by_ids and fix bug (#28516)
- Run standard integration tests in Chroma
- Add `get_by_ids` method
- Fix bug in `add_texts`: if a list of `ids` is passed but any of them
are None, Chroma will raise an exception. Here we assign a uuid.
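
The ID handling described in the last bullet can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: `fill_missing_ids` is a hypothetical name, and it only covers `None` entries.

```python
import uuid
from typing import Optional


def fill_missing_ids(ids: list[Optional[str]]) -> list[str]:
    """Replace each None with a fresh UUID string, keeping provided IDs as-is."""
    return [id_ if id_ is not None else str(uuid.uuid4()) for id_ in ids]


# Only the None entries are replaced; "a" is passed through unchanged.
filled = fill_missing_ids(["a", None, None])
print(filled)
```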
2024-12-04 14:00:36 -05:00
Eric Pinzur
eff8a54756 langchain_chroma: added document.id support (#27995)
Description:
* Added internal `Document.id` support to Chroma VectorStore

Dependencies:
* https://github.com/langchain-ai/langchain/pull/27968 should be merged
first and this PR should be re-based on top of those changes.

Tests:
* Modified/Added tests for `Document.id` support. All tests are passing.


Note: I am not a member of the Chroma team.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-04 00:04:27 +00:00
Massimiliano Pronesti
83586661d6 partners[chroma]: add retrieval of embedding vectors (#28290)
This PR adds an additional method to `Chroma` to retrieve the embedding
vectors, besides the most relevant Documents. This is sometimes of use
when you need to run a postprocessing algorithm on the retrieved results
based on the vectors, which has been the case for me lately.
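
As one possible example of such postprocessing, the sketch below drops near-duplicate results by comparing their embedding vectors. It assumes the results are already available as `(text, vector)` pairs with non-zero vectors; the function names, the threshold, and the sample data are hypothetical and do not come from this PR.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def drop_near_duplicates(results, threshold=0.95):
    """Keep a result only if it is not too similar to one already kept."""
    kept = []
    for text, vector in results:
        if all(cosine_similarity(vector, kept_vec) < threshold for _, kept_vec in kept):
            kept.append((text, vector))
    return kept


# Made-up retrieved results: the second vector nearly matches the first.
retrieved = [("a", [1.0, 0.0]), ("a2", [0.99, 0.01]), ("b", [0.0, 1.0])]
print([text for text, _ in drop_near_duplicates(retrieved)])
```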

Example issue (discussion) requesting this change:
https://github.com/langchain-ai/langchain/discussions/20383

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-11-27 16:34:02 +00:00
SHJUN
f6b2f82099 community: chroma error patch(attribute changed on chroma) (#27827)
The `max_batch_size` attribute was renamed; it is now the
`get_max_batch_size` method.
I want to use `create_batches`, which sits right below it.
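
A defensive way to cope with this kind of rename is sketched below. It is not the change made in this commit: `resolve_max_batch_size` and `split_into_batches` are hypothetical helpers, and the splitting function is a plain stand-in rather than Chroma's `create_batches`.

```python
def resolve_max_batch_size(client) -> int:
    """Prefer the newer method; fall back to the older attribute if it is absent."""
    getter = getattr(client, "get_max_batch_size", None)
    if callable(getter):
        return getter()
    return client.max_batch_size


def split_into_batches(items: list, batch_size: int) -> list[list]:
    """Split items into consecutive chunks of at most batch_size elements."""
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]


class OldStyleClient:
    """Stand-in exposing the old attribute."""

    max_batch_size = 2


class NewStyleClient:
    """Stand-in exposing the new method."""

    def get_max_batch_size(self) -> int:
        return 3


print(split_into_batches([1, 2, 3, 4, 5], resolve_max_batch_size(OldStyleClient())))
print(split_into_batches([1, 2, 3, 4, 5], resolve_max_batch_size(NewStyleClient())))
```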

Please check this PR link.
reference: https://github.com/chroma-core/chroma/pull/2305

---------

Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com>
Co-authored-by: Prithvi Kannan <46332835+prithvikannan@users.noreply.github.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Jun Yamog <jkyamog@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ono-hiroki <86904208+ono-hiroki@users.noreply.github.com>
Co-authored-by: Dobiichi-Origami <56953648+Dobiichi-Origami@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Duy Huynh <vndee.huynh@gmail.com>
Co-authored-by: Rashmi Pawar <168514198+raspawar@users.noreply.github.com>
Co-authored-by: sifatj <26035630+sifatj@users.noreply.github.com>
Co-authored-by: Eric Pinzur <2641606+epinzur@users.noreply.github.com>
Co-authored-by: Daniel Vu Dao <danielvdao@users.noreply.github.com>
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: Stéphane Philippart <wildagsx@gmail.com>
2024-11-05 19:43:11 +00:00
RIdham Golakiya
73ad7f2e7a langchain_chroma[patch]: updated example for get documents with where clause (#26767)
Example updated for vectorstore ChromaDB.

If we want to apply multiple filters then ChromaDB supports filters like
this:
Reference: [ChromaDB
filters](https://cookbook.chromadb.dev/core/filters/)
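
The commit message does not include the example itself. A combined filter of the kind it refers to can be sketched with Chroma's `$and` operator; the `combine_filters` helper and the field names are hypothetical.

```python
from typing import Any


def combine_filters(*conditions: dict[str, Any]) -> dict[str, Any]:
    """Wrap several single-key conditions in one `$and` filter."""
    if not conditions:
        raise ValueError("at least one condition is required")
    if len(conditions) == 1:
        # A single condition is already a one-key filter; return it unchanged.
        return conditions[0]
    return {"$and": list(conditions)}


where = combine_filters({"foo": "bar"}, {"qux": "baz"})
print(where)
```

The result has a single top-level key, `$and`, with each condition nested under it.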

Thank you.
2024-10-08 20:21:58 +00:00
Daniel Cooke
7835c0651f langchain_chroma: Pass through kwargs to Chroma collection.delete (#25970)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 04:21:24 +00:00
Isaac Francisco
a72fddbf8d [docs]: vector store integration pages (#24858)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-06 17:20:27 +00:00
Bagatur
e81ddb32a6 docs: fix kwargs docstring (#25010)
Fix:
![Screenshot 2024-08-02 at 5 33 37
PM](https://github.com/user-attachments/assets/7c56cdeb-ee81-454c-b3eb-86aa8a9bdc8d)
2024-08-02 19:54:54 -07:00
mrugank-wadekar
66bebeb76a partners: add similarity search by image functionality to langchain_chroma partner package (#22982)
- **Description:** This pull request introduces two new methods to the
LangChain Chroma partner package that enable similarity search based on
image embeddings. These methods enhance the package's functionality by
allowing users to search for images similar to a given image URI. It also
introduces a notebook to demonstrate its use.
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** @mrugank9009

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-15 18:48:22 +00:00
Bagatur
a0c2281540 infra: update mypy 1.10, ruff 0.5 (#23721)
```python
"""python scripts/update_mypy_ruff.py"""
import glob
import tomllib
from pathlib import Path

import toml
import subprocess
import re

ROOT_DIR = Path(__file__).parents[1]


def main():
    for path in glob.glob(str(ROOT_DIR / "libs/**/pyproject.toml"), recursive=True):
        print(path)
        with open(path, "rb") as f:
            pyproject = tomllib.load(f)
        try:
            pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = (
                "^1.10"
            )
            pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = (
                "^0.5"
            )
        except KeyError:
            continue
        with open(path, "w") as f:
            toml.dump(pyproject, f)
        cwd = "/".join(path.split("/")[:-1])
        completed = subprocess.run(
            "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )
        logs = completed.stdout.split("\n")

        # Collect mypy errors keyed by (file, line) so each line gets one ignore comment.
        to_ignore = {}
        for log_line in logs:
            # Raw string avoids invalid escape sequences such as "\d" in a normal literal.
            match = re.match(r"^(.*):(\d+): error:.*\[(.*)\]", log_line)
            if match:
                error_path, line_no, error_type = match.groups()
                to_ignore.setdefault((error_path, line_no), []).append(error_type)
        print(len(to_ignore))
        for (error_path, line_no), error_types in to_ignore.items():
            all_errors = ", ".join(error_types)
            full_path = f"{cwd}/{error_path}"
            try:
                with open(full_path, "r") as f:
                    file_lines = f.readlines()
            except FileNotFoundError:
                continue
            file_lines[int(line_no) - 1] = (
                file_lines[int(line_no) - 1][:-1] + f"  # type: ignore[{all_errors}]\n"
            )
            with open(full_path, "w") as f:
                f.write("".join(file_lines))

        subprocess.run(
            "poetry run ruff format .; poetry run ruff --select I --fix .",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )


if __name__ == "__main__":
    main()

```
2024-07-03 10:33:27 -07:00
wenngong
f9aea3db07 partners: add lint docstrings for chroma module (#23249)
Description: add lint docstrings for chroma module
Issue: #23188 @baskaryan

Test: `ruff check` passed.


![image](https://github.com/langchain-ai/langchain/assets/76683249/5e168a0c-32d0-464f-8ddb-110233918019)

---------

Co-authored-by: gongwn1 <gongwn1@lenovo.com>
2024-06-21 19:49:24 +00:00
Klaudia Lemiec
45351d1bc6 docs: Chroma docstrings update (#22001)
- [X] **PR message**: 
    - **Description:** Added and updated Chroma docstrings
    - **Issue:** https://github.com/langchain-ai/langchain/issues/21983


- [X] **Add tests and docs**: only docs


2024-05-22 21:45:30 +00:00
Trayan Azarov
f54cbf8ff5 chroma[patch]: Chroma - remove reference to collection upon delete_collection (#21817)
**Description**:

- The reference to the `Collection` object is set to `None` when a
collection is deleted with `delete_collection()`
- Added utility method `reset_collection()` to allow recreating the
collection
- Moved collection creation out of `__init__` into
`__ensure_collection()` to be reused by object init and
`reset_collection()`
- `_collection` is now a property to avoid breaking changes
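
The pattern in the bullets above can be sketched roughly as follows. This is a simplified illustration, not the code from the PR: `FakeClient` and `Store` are hypothetical stand-ins, the helper uses a single underscore, and raising `ValueError` on access after deletion is an assumption.

```python
class FakeClient:
    """Stand-in for a Chroma client; only the calls used below are modeled."""

    def __init__(self):
        self.collections = {}

    def get_or_create_collection(self, name):
        return self.collections.setdefault(name, {"name": name})

    def delete_collection(self, name):
        self.collections.pop(name, None)


class Store:
    def __init__(self, client, collection_name):
        self._client = client
        self._collection_name = collection_name
        self._chroma_collection = None
        self._ensure_collection()

    def _ensure_collection(self):
        # Shared by __init__ and reset_collection().
        self._chroma_collection = self._client.get_or_create_collection(
            self._collection_name
        )

    @property
    def _collection(self):
        # Property keeps the old attribute name while guarding against stale use.
        if self._chroma_collection is None:
            raise ValueError("Collection was deleted; call reset_collection() first.")
        return self._chroma_collection

    def delete_collection(self):
        self._client.delete_collection(self._collection_name)
        self._chroma_collection = None

    def reset_collection(self):
        self.delete_collection()
        self._ensure_collection()


store = Store(FakeClient(), "docs")
store.delete_collection()
store.reset_collection()
print(store._collection)
```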

**Issues**: 

- chroma-core/chroma#2213

**Twitter**: @t_azarov
2024-05-20 15:42:36 -07:00
Erick Friis
91a2ea5cd6 chroma, mongodb: fix docstrings (#21629) 2024-05-13 21:27:43 +00:00
Trayan Azarov
ba7d53689c community: Chroma Adding create_collection_if_not_exists flag to Chroma constructor (#21420)
- **Description:** Adds the ability to either `get_or_create` or simply
`get_collection`. This is useful when dealing with read-only Chroma
instances where users are constrained to using `get_collection`. Targeted
mostly at Http/CloudClients.
- **Issue:** chroma-core/chroma#2163
- **Dependencies:** N/A
- **Twitter handle:** `@t_azarov`




| Collection Exists | create_collection_if_not_exists | Outcome | Test |
|-------------------|---------------------------------|---------|------|
| True | False | No errors, collection state unchanged | `test_create_collection_if_not_exist_false_existing` |
| True | True | No errors, collection state unchanged | `test_create_collection_if_not_exist_true_existing` |
| False | False | Error, `get_collection()` fails | `test_create_collection_if_not_exist_false_non_existing` |
| False | True | No errors, `get_or_create_collection()` creates the collection | `test_create_collection_if_not_exist_true_non_existing` |
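
The four outcomes in the table follow from a single branch on the flag. The sketch below illustrates that branch with a hypothetical `open_collection` helper and an in-memory `FakeClient`; the `ValueError` raised by the fake for a missing collection is an assumption, not necessarily the error a real client raises.

```python
class FakeClient:
    """In-memory stand-in modeling only the two client calls used below."""

    def __init__(self, existing=()):
        self.collections = {name: {"name": name} for name in existing}

    def get_collection(self, name):
        if name not in self.collections:
            raise ValueError(f"Collection {name} does not exist")
        return self.collections[name]

    def get_or_create_collection(self, name):
        return self.collections.setdefault(name, {"name": name})


def open_collection(client, name, create_collection_if_not_exists=True):
    """Create the collection on demand only when the flag allows it."""
    if create_collection_if_not_exists:
        return client.get_or_create_collection(name)
    return client.get_collection(name)


# Read-only style usage: the collection must already exist.
print(open_collection(FakeClient(existing=["docs"]), "docs", False))
```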
2024-05-09 11:45:10 -04:00
killind-dev
f8a54d1d73 chroma: Add chroma partner package (#19292)
**Description:** Adds chroma to the partners package. Tests & code
mirror those in the community package.
**Dependencies:** None
**Twitter handle:** @akiradev0x

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-10 19:33:45 +00:00