Commit Graph

8 Commits

Author SHA1 Message Date
André Quintino
a26c786bc5
community: refactor opensearch query constructor to use wildcard instead of match in the contain comparator (#26653)
- **Description:** Changed the comparator to use a wildcard query
instead of match. This modification allows for partial text matching on
analyzed fields, which improves the flexibility of the search by
performing full-text searches that aren't limited to exact matches.
- **Issue:** The previous implementation used a match query, which
performs exact matches on analyzed fields. This approach limited the
search capabilities by requiring the query terms to align with the
indexed text. The modification to use a wildcard query instead addresses
this limitation. The wildcard query allows for partial text matching,
which means the search can return results even if only a portion of the
term matches the text. This makes the search more flexible and suitable
for use cases where exact matches aren't necessary or expected, enabling
broader full-text searches across analyzed fields.
In short, the problem was that match queries were too restrictive, and
the change to wildcard queries enhances the ability to perform partial
matches.
- **Dependencies:** none
- **Twitter handle:** @Andre_Q_Pereira

---------

Co-authored-by: André Quintino <andre.quintino@tui.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-12-16 11:16:34 -05:00
William Smith
15e7353168
langchain_community: updated query constructor for Databricks Vector Search due to LangChainDeprecationWarning: filters was deprecated since langchain-community 0.2.11 and will be removed in 0.3. Please use filter instead. (#27974)
- **Description:** Updated the kwargs for the structured query from
filters to filter due to deprecation of 'filters' for Databricks Vector
Search. Also changed the error messages as the allowed operators and
comparators are different which can cause issues with functions such as
get_query_constructor_prompt()

- **Issue:** Fixes the Key Error for filters due to deprecation in favor
for 'filter':

LangChainDeprecationWarning: DatabricksVectorSearch received a key
`filters` in search_kwargs. `filters` was deprecated since
langchain-community 0.2.11 and will be removed in 0.3. Please use
`filter` instead.

- **Dependencies:** N/A
- **Twitter handle:** N/A

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-03 16:03:53 -08:00
Alex Thomas
5867f25ff3
community[patch]: Neo4j community deprecation (#28130)
Adds deprecation notices for Neo4j components moving to the
`langchain_neo4j` partner package.

- Adds deprecation warnings to all Neo4j-related classes and functions
that have been migrated to the new `langchain_neo4j` partner package
- Updates documentation to reference the new `langchain_neo4j` package
instead of `langchain_community`
2024-11-25 10:34:22 -08:00
默奕
6377185291
add neo4j query constructor for self query (#25288)
- [x] **PR title - community: add neo4j query constructor for self
query**

- [x] **PR message**
- **Description:** adding a Neo4jTranslator so that the Neo4j vector
database can use SelfQueryRetriever
    - **Issue:** this issue had been raised before in #19748
    - **Dependencies:** none. 
    - **Twitter handle:** @moyi_dang
- p.s. I have not added the query constructor in BUILTIN_TRANSLATORS in
this PR, I want to make changes to only one package at a time.

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-30 14:54:33 +00:00
Eugene Yurtsev
d24b82357f
community[patch]: Add missing annotations (#24890)
This PR adds annotations in comunity package.

Annotations are only strictly needed in subclasses of BaseModel for
pydantic 2 compatibility.

This PR adds some unnecessary annotations, but they're not bad to have
regardless for documentation pages.
2024-07-31 18:13:44 +00:00
yonarw
b65ac8d39c
community[minor]: Self query retriever for HANA Cloud Vector Engine (#24494)
Description:

- This PR adds a self query retriever implementation for SAP HANA Cloud
Vector Engine. The retriever supports all operators except for contains.
- Issue: N/A
- Dependencies: no new dependencies added

**Add tests and docs:**
Added integration tests to:
libs/community/tests/unit_tests/query_constructors/test_hanavector.py

**Documentation for self query retriever:**
/docs/integrations/retrievers/self_query/hanavector_self_query.ipynb

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-07-26 06:56:51 +00:00
Bagatur
a0c2281540
infra: update mypy 1.10, ruff 0.5 (#23721)
```python
"""python scripts/update_mypy_ruff.py"""
import glob
import tomllib
from pathlib import Path

import toml
import subprocess
import re

ROOT_DIR = Path(__file__).parents[1]


def main():
    for path in glob.glob(str(ROOT_DIR / "libs/**/pyproject.toml"), recursive=True):
        print(path)
        with open(path, "rb") as f:
            pyproject = tomllib.load(f)
        try:
            pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = (
                "^1.10"
            )
            pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = (
                "^0.5"
            )
        except KeyError:
            continue
        with open(path, "w") as f:
            toml.dump(pyproject, f)
        cwd = "/".join(path.split("/")[:-1])
        completed = subprocess.run(
            "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )
        logs = completed.stdout.split("\n")

        to_ignore = {}
        for l in logs:
            if re.match("^(.*)\:(\d+)\: error:.*\[(.*)\]", l):
                path, line_no, error_type = re.match(
                    "^(.*)\:(\d+)\: error:.*\[(.*)\]", l
                ).groups()
                if (path, line_no) in to_ignore:
                    to_ignore[(path, line_no)].append(error_type)
                else:
                    to_ignore[(path, line_no)] = [error_type]
        print(len(to_ignore))
        for (error_path, line_no), error_types in to_ignore.items():
            all_errors = ", ".join(error_types)
            full_path = f"{cwd}/{error_path}"
            try:
                with open(full_path, "r") as f:
                    file_lines = f.readlines()
            except FileNotFoundError:
                continue
            file_lines[int(line_no) - 1] = (
                file_lines[int(line_no) - 1][:-1] + f"  # type: ignore[{all_errors}]\n"
            )
            with open(full_path, "w") as f:
                f.write("".join(file_lines))

        subprocess.run(
            "poetry run ruff format .; poetry run ruff --select I --fix .",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )


if __name__ == "__main__":
    main()

```
2024-07-03 10:33:27 -07:00
Eugene Yurtsev
f92006de3c
multiple: langchain 0.2 in master (#21191)
0.2rc 

migrations

- [x] Move memory
- [x] Move remaining retrievers
- [x] graph_qa chains
- [x] some dependency from evaluation code potentially on math utils
- [x] Move openapi chain from `langchain.chains.api.openapi` to
`langchain_community.chains.openapi`
- [x] Migrate `langchain.chains.ernie_functions` to
`langchain_community.chains.ernie_functions`
- [x] migrate `langchain/chains/llm_requests.py` to
`langchain_community.chains.llm_requests`
- [x] Moving `langchain_community.cross_enoders.base:BaseCrossEncoder`
->
`langchain_community.retrievers.document_compressors.cross_encoder:BaseCrossEncoder`
(namespace not ideal, but it needs to be moved to `langchain` to avoid
circular deps)
- [x] unit tests langchain -- add pytest.mark.community to some unit
tests that will stay in langchain
- [x] unit tests community -- move unit tests that depend on community
to community
- [x] mv integration tests that depend on community to community
- [x] mypy checks

Other todo

- [x] Make deprecation warnings not noisy (need to use warn deprecated
and check that things are implemented properly)
- [x] Update deprecation messages with timeline for code removal (likely
we actually won't be removing things until 0.4 release) -- will give
people more time to transition their code.
- [ ] Add information to deprecation warning to show users how to
migrate their code base using langchain-cli
- [ ] Remove any unnecessary requirements in langchain (e.g., is
SQLALchemy required?)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-08 16:46:52 -04:00