langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-31 02:11:09 +00:00

Author	SHA1	Message	Date
Eric Pinzur	ea0ad917b0	community: added Document.id support to opensearch vectorstore (#27945 ) Description: * Added support of Document.id on OpenSearch vector store * Added tests cases to match	2024-11-06 15:04:09 -05:00
SHJUN	f6b2f82099	community: chroma error patch(attribute changed on chroma) (#27827 ) There was a change of attribute name which was "max_batch_size". It's now "get_max_batch_size" method. I want to use "create_batches" which is right down below. Please check this PR link. reference: https://github.com/chroma-core/chroma/pull/2305 --------- Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com> Co-authored-by: Prithvi Kannan <46332835+prithvikannan@users.noreply.github.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Jun Yamog <jkyamog@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: ono-hiroki <86904208+ono-hiroki@users.noreply.github.com> Co-authored-by: Dobiichi-Origami <56953648+Dobiichi-Origami@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Duy Huynh <vndee.huynh@gmail.com> Co-authored-by: Rashmi Pawar <168514198+raspawar@users.noreply.github.com> Co-authored-by: sifatj <26035630+sifatj@users.noreply.github.com> Co-authored-by: Eric Pinzur <2641606+epinzur@users.noreply.github.com> Co-authored-by: Daniel Vu Dao <danielvdao@users.noreply.github.com> Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com> Co-authored-by: Stéphane Philippart <wildagsx@gmail.com>	2024-11-05 19:43:11 +00:00
Ofer Mendelevitch	d7c39e6dbb	community: update Vectara integration (#27869 ) Thank you for contributing to LangChain! - Description: Updated Vectara integration - Issue: refresh on descriptions across all demos and added UDF reranker - Dependencies: None - Twitter handle: @ofermend --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-04 20:40:39 +00:00
putao520	2545fbe709	fix "WARNING: Received notification from DBMS server: {severity: WARN… (#27112 ) …ING} {code: Neo.ClientNotification.Statement.FeatureDeprecationWarning} {category: DEPRECATION} {title: This feature is deprecated and will be removed in future versions.} {description: CALL subquery without a variable scope clause is now deprecated." this warning Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. Co-authored-by: putao520 <putao520@putao282.com>	2024-10-31 18:47:25 +00:00
Daniel Birn	389771ccc0	community: fix @embeddingKey in azure cosmos db no sql (#27377 ) I will keep this PR as small as the changes made. Description: fixes a fatal bug syntax error in AzureCosmosDBNoSqlVectorSearch Issue: #27269 #25468	2024-10-31 18:36:02 +00:00
Aayush Kataria	a8a33b2dc6	LangChain-Community - AzureCosmos Mongo vCore: Bug Fix when the data doesn't contain metadata field (#27772 ) Thank you for contributing to LangChain! - Description: Adding an empty metadata field when metadata is not present in the data - Issue: This PR fixes the issue when the data items doesn't contain the metadata field. This happens when there is already data in the container, or cx uses CosmosDB Python SDK to insert data. - Dependencies: No dependencies required Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-10-30 20:05:25 -07:00
Qier LU	8d8e38b090	community[pathch]: Add missing custom content_key handling in Redis vector store (#27736 ) This fix an error caused by missing custom content_key handling in Redis vector store in function similarity_search_with_score.	2024-10-30 13:57:20 +00:00
Erick Friis	600b7bdd61	all: test 3.13 ci (#27197 ) Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-10-25 12:56:58 -07:00
Eric Pinzur	f636c83321	community: Cassandra Vector Store: modernize implementation (#27253 ) Description: This PR updates `CassandraGraphVectorStore` to be based off `CassandraVectorStore`, instead of using a custom CQL implementation. This allows users using a `CassandraVectorStore` to upgrade to a `GraphVectorStore` without having to change their database schema or re-embed documents. This PR also updates the documentation of the `GraphVectorStore` base class and contains native async implementations for the standard graph methods: `traversal_search` and `mmr_traversal_search` in `CassandraVectorStore`. Issue: No issue number. Dependencies: https://github.com/langchain-ai/langchain/pull/27078 (already-merged) Lint and test: - Lint and tests all pass, including existing `CassandraGraphVectorStore` tests. - Also added numerous additional tests based of the tests in `langchain-astradb` which cover many more scenarios than the existing tests for `Cassandra` and `CassandraGraphVectorStore` BREAKING CHANGE Note that this is a breaking change for existing users of `CassandraGraphVectorStore`. They will need to wipe their database table and restart. However: - The interfaces have not changed. Just the underlying storage mechanism. - Any one using `langchain_community.vectorstores.Cassandra` can instead use `langchain_community.graph_vectorstores.CassandraGraphVectorStore` and they will gain Graph capabilities without having to re-embed their existing documents. This is the primary goal of this PR. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-22 18:11:11 +00:00
Yuki Watanabe	b8bfebd382	community: Add deprecation notice for Databricks integration in langchain-community (#27355 ) We have released the [langchain-databricks](https://github.com/langchain-ai/langchain-databricks) package for Databricks integration. This PR deprecates the legacy classes within `langchain-community`. --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-16 02:20:40 +00:00
Matthew Peveler	c6533616b6	docs: fix community pgvector deprecation warning formatting (#27094 ) Description: PR fixes some formatting errors in deprecation message in the `langchain_community.vectorstores.pgvector` module, where it was missing spaces between a few words, and one word was misspelled. Issue: n/a Dependencies: n/a Signed-off-by: mpeveler@timescale.com Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-15 15:45:53 +00:00
Marcelo Nunes Alves	5647276998	community: Problem with embeddings in new versions of clickhouse. (#26041 ) Starting with Clickhouse version 24.8, a different type of configuration has been introduced in the vectorized data ingestion, and if this configuration occurs, an error occurs when generating the table. As can be seen below: ![Screenshot from 2024-09-04 11-48-00](https://github.com/user-attachments/assets/70840a93-1001-490c-921a-26924c51d9eb) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-11 18:54:50 +00:00
Vittorio Rigamonti	7da2efd9d3	community[minor]: VectorStore Infinispan. Adding TLS and authentication (#23522 ) Description: this PR enable VectorStore TLS and authentication (digest, basic) with HTTP/2 for Infinispan server. Based on httpx. Added docker-compose facilities for testing Added documentation Dependencies: requires `pip install httpx[http2]` if HTTP2 is needed Twitter handle: https://twitter.com/infinispan	2024-10-09 10:51:39 -04:00
Stefano Lottini	d05fdd97dd	community: Cassandra Vector Store: extend metadata-related methods (#27078 ) Description: this PR adds a set of methods to deal with metadata associated to the vector store entries. These, while essential to the Graph-related extension of the `Cassandra` vector store, are also useful in themselves. These are (all come in their sync+async versions): - `[a]delete_by_metadata_filter` - `[a]replace_metadata` - `[a]get_by_document_id` - `[a]metadata_search` Additionally, a `[a]similarity_search_with_embedding_id_by_vector` method is introduced to better serve the store's internal working (esp. related to reranking logic). Issue: no issue number, but now all Document's returned bear their `.id` consistently (as a consequence of a slight refactoring in how the raw entries read from DB are made back into `Document` instances). Dependencies: (no new deps: packaging comes through langchain-core already; `cassio` is now required to be version 0.1.10+) Add tests and docs Added integration tests for the relevant newly-introduced methods. (Docs will be updated in a separate PR). Lint and test Lint and (updated) test all pass. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-09 06:41:34 +00:00
Tomaz Bratanic	481bd25d29	community: Fix database connections for neo4j (#27190 ) Fixes https://github.com/langchain-ai/langchain/issues/27185 Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-08 23:47:55 +00:00
Oleksii Pokotylo	37ca468d03	community: AzureSearch: fix reranking for empty lists (#27104 ) Description: Fix reranking for empty lists Issue: ``` ValueError: not enough values to unpack (expected 3, got 0) documents, scores, vectors = map(list, zip(*docs)) File langchain_community/vectorstores/azuresearch.py", line 1680, in _reorder_results_with_maximal_marginal_relevance ``` Co-authored-by: Oleksii Pokotylo <oleksii.pokotylo@pwc.com>	2024-10-07 15:27:09 -04:00
Christophe Bornet	c4ebccfec2	core[minor]: Improve support for id in VectorStore (#26660 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-10-07 15:01:08 -04:00
Tomaz Bratanic	446144e7c6	Update neo4j vector procedures (#26775 )	2024-09-30 14:45:09 -07:00
Eugene Yurtsev	7fde2791dc	core[patch]: Add kwargs to Runnable (#27008 ) Fixes #26685 --------- Co-authored-by: Tibor Reiss <tibor.reiss@gmail.com>	2024-09-30 16:45:29 -04:00
Abhi Agarwal	696114e145	community: add sqlite-vec vectorstore (#25003 ) Description: Adds a vector store integration with [sqlite-vec](https://alexgarcia.xyz/sqlite-vec/), the successor to sqlite-vss that is a single C file with no external dependencies. Pretty straightforward, just copy-pasted the sqlite-vss integration and made a few tweaks and added integration tests. Only question is whether all documentation should be directed away from sqlite-vss if it is defacto deprecated (cc @asg017). --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: philippe-oger <philippe.oger@adevinta.com>	2024-09-26 17:37:10 +00:00
Rajendra Kadam	51c4393298	community[patch]: Fix validation error in SettingsConfigDict across multiple Langchain modules (#26852 ) - Description: This pull request addresses the validation error in `SettingsConfigDict` due to extra fields in the `.env` file. The issue is prevalent across multiple Langchain modules. This fix ensures that extra fields in the `.env` file are ignored, preventing validation errors. Changes include: - Applied fixes to modules using `SettingsConfigDict`. - Issue: NA, similar https://github.com/langchain-ai/langchain/issues/26850 - Dependencies: NA	2024-09-25 10:02:14 -04:00
Eric	90031b1b3e	support epsilla cloud vector database in langchain (#26065 ) Description - support epsilla cloud in langchain --------- Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-09-20 17:14:23 +00:00
Martin Triska	3fc0ea510e	community : [bugfix] Use document ids as keys in AzureSearch vectorstore (#25486 ) # Description [Vector store base class](`4cdaca67dc/libs/core/langchain_core/vectorstores/base.py (L65)`) currently expects `ids` to be passed in and that is what it passes along to the AzureSearch vector store when attempting to `add_texts()`. However AzureSearch expects `keys` to be passed in. When they are not present, AzureSearch `add_embeddings()` makes up new uuids. This is a problem when trying to run indexing. [Indexing code expects](`b297af5482/libs/core/langchain_core/indexing/api.py (L371)`) the documents to be uploaded using provided ids. Currently AzureSearch ignores `ids` passed from `indexing` and makes up new ones. Later when `indexer` attempts to delete removed file, it uses the `id` it had stored when uploading the document, however it was uploaded under different `id`. Twitter handle: @martintriska1	2024-09-19 09:37:18 -04:00
Tomaz Bratanic	03b9aca55d	community: Retry retriable errors in Neo4j (#26211 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-19 04:01:07 +00:00
wlleiiwang	2ef4c9466f	community: modify document links for tencent vectordb (#26316 ) - modify document links for create a tencent vectordb database instance. Co-authored-by: wlleiiwang <wlleiiwang@tencent.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-17 21:11:10 +00:00
Erick Friis	c2a3021bb0	multiple: pydantic 2 compatibility, v0.3 (#26443 ) Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com> Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com> Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com> Co-authored-by: ZhangShenao <15201440436@163.com> Co-authored-by: Friso H. Kingma <fhkingma@gmail.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Nuno Campos <nuno@langchain.dev> Co-authored-by: Morgante Pell <morgantep@google.com>	2024-09-13 14:38:45 -07:00
ZhangShenao	c812237217	Improvement[Community] Improve args description in api doc of `DocArrayInMemorySearch` (#26024 ) - Add missing arg - Remove redundant arg	2024-09-04 09:26:26 -04:00
JonZeolla	78ff51ce83	community[patch]: update the default hf bge embeddings (#22627 ) Description: This updates the langchain_community > huggingface > default bge embeddings ([the current default recommends this change](https://huggingface.co/BAAI/bge-large-en)) Issue: None Dependencies: None Twitter handle: @jonzeolla --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-09-02 22:10:21 +00:00
Leonid Ganeline	150251fd49	docs: `integrations` reference updates 13 (#25711 ) Added missed provider pages and links. Fixed inconsistent formatting. Added arxiv references to docstirngs. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-09-02 22:08:50 +00:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟	64b62f6ae4	community[neo4j_vector]: make embedding dimension check optional (#25737 ) Description: Starting from Neo4j 5.23 (22 August 2024), with vector-2.0 indexes, `vector.dimensions` is not required to be set, which will cause it the key not exist error in index config if it's not set. Since the existence of vector.dimensions will only ensure additional checks, this commit turns embedding dimension check optional, and only do checks when it exists (not None). https://neo4j.com/release-notes/database/neo4j-5/ Twitter handle: @HollowM186 Signed-off-by: Hollow Man <hollowman@opensuse.org> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-31 12:36:20 +00:00
Erick Friis	f7e62754a1	community: undo azure_ad_access_token breaking change (#25818 )	2024-08-30 02:06:14 +00:00
ccurme	afe8ccaaa6	community[patch]: Add ID field back to Azure AI Search results (#25828 ) Commandeering https://github.com/langchain-ai/langchain/pull/23243 as maintainers don't have ability to modify that PR. Fixes https://github.com/langchain-ai/langchain/issues/22827 --------- Co-authored-by: Ming Quah <fleetadmiralbutter@icloud.com>	2024-08-28 17:56:50 -04:00
Tomaz Bratanic	f359e6b0a5	Add mmr to neo4j vector (#25765 )	2024-08-27 08:55:19 -04:00
Luis Valencia	99f9a664a5	community: Azure Search Vector Store is missing Access Token Authentication (#24330 ) Added Azure Search Access Token Authentication instead of API KEY auth. Fixes Issue: https://github.com/langchain-ai/langchain/issues/24263 Dependencies: None Twitter: @levalencia @baskaryan Could you please review? First time creating a PR that fixes some code. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-26 15:41:50 -07:00
Ian	64ace25eb8	<Community>: tidb vector support vector index (#19984 ) This PR introduces adjustments to ensure compatibility with the recently released preview version of [TiDB Serverless Vector Search](https://tidb.cloud/ai), aiming to prevent user confusion. - TiDB Vector now supports vector indexing with cosine and l2 distance strategies, although inner_product remains unsupported. - Changing the distance strategy is currently not supported, so the test cased should be adjusted.	2024-08-23 13:59:23 -04:00
Djordje	22f9ae489f	community: Opensearch - added score function for similarity_score_threshold (#23928 ) This PR resolves the NotImplemented error for the similarity_score_threshold search type for OpenSearch.	2024-08-23 11:30:04 -04:00
conjuncts	818267bbc3	community: allow chroma DB delete() to use "where" argument (#19826 ) Thank you for contributing to LangChain! - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" Description: Simply pass kwargs to allow arguments like "where" to be propagated Issue: Previously, db.delete(where={}) wouldn't work for chroma vectorstores Dependencies: N/A Twitter handle: N/A - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.	2024-08-23 10:10:57 -04:00
Erik Lindgren	583b0449eb	community[patch]: Fix Hybrid Search for non-Databricks managed embeddings (#25590 ) Description: Send both the query and query_embedding to the Databricks index for hybrid search. Issue: When using hybrid search with non-Databricks managed embedding we currently don't pass both the embedding and query_text to the index. Hybrid search requires both of these. This change fixes this issue for both `similarity_search` and `similarity_search_by_vector`. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-23 08:57:13 +00:00
Noah Mayerhofer	0091947efd	community: add retry for session expired exception in neo4j (#25660 ) Description: The neo4j driver can raise a SessionExpired error, which is considered a retriable error. If a query fails with a SessionExpired error, this change retries every query once. This change will make the neo4j integration less flaky. Twitter handle: noahmay_	2024-08-22 13:07:36 +00:00
Jabir	12e490ea56	Update azuresearch.py (#25577 ) This will allow complextype metadata to be returned. the current implementation throws error when dealing with nested metadata Thank you for contributing to LangChain! - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-20 12:53:30 +00:00
Bagatur	493e474063	docs: udpated api reference (#25172 ) - Move the API reference into the vercel build - Update api reference organization and styling	2024-08-14 07:00:17 -07:00
ccurme	27690506d0	multiple: update removal targets (#25361 )	2024-08-14 09:50:39 -04:00
thedavgar	9d08369442	community: fix AzureSearch vectorstore asyncronous methods (#24921 ) Description Fix the asyncronous methods to retrieve documents from AzureSearch VectorStore. The previous changes from [this commit](`ffe6ca986e`) create a similar code for the syncronous methods and the asyncronous ones but the asyncronous client return an asyncronous iterator "AsyncSearchItemPaged" as said in the issue #24740. To solve this issue, the syncronous iterators in asyncronous methods where changed to asyncronous iterators. @chrislrobert said in [this comment](https://github.com/langchain-ai/langchain/issues/24740#issuecomment-2254168302) that there was a still a flaw due to `with` blocks that close the client after each call. I removed this `with` blocks in the `async_client` following the same pattern as the sync `client`. In order to close up the connections, a __del__ method is included to gently close up clients once the vectorstore object is destroyed. Issue: #24740 and #24064 Dependencies: No new dependencies for this change Example notebook: I created a notebook just to test the changes work and gives the same results as the syncronous methods for vector and hybrid search. With these changes, the asyncronous methods in the retriever work as well. ![image](https://github.com/user-attachments/assets/697e431b-9d7f-4d0d-b205-59d051ac2b67) Lint and test: Passes the tests and the linter	2024-08-13 14:20:51 -07:00
Eugene Yurtsev	6dd9f053e3	core[patch]: Deprecating beta upsert APIs in vectorstore (#25069 ) This PR deprecates the beta upsert APIs in vectorstore. We'll introduce them in a V2 abstraction instead to keep the existing vectorstore implementations lighter weight. The main problem with the existing APIs is that it's a bit more challenging to implement the correct behavior w/ respect to IDs since ID can be present in both the function signature and as an optional attribute on the document object. But VectorStores that pass the standard tests should have implemented the semantics properly! --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-09 17:17:36 -04:00
Eugene Yurtsev	6e57aa7c36	community[patch]: Remove usage of @root_validator(allow_reuse=True) (#25235 ) Remove usage of @root_validator(allow_reuse=True)	2024-08-09 10:57:42 -04:00
thiswillbeyourgithub	a2b4c33bd6	community[patch]: FAISS: ValueError mentions normalize_score_fn isntead of relevance_score_fn (#25225 ) Thank you for contributing to LangChain! - [X] PR title: "community: fix valueerror mentions wrong argument missing" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [X] PR message: *Delete this entire checklist* and replace with - Description: when faiss.py has a None relevance_score_fn it raises a ValueError that says a normalize_fn_score argument is needed. Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-09 14:40:29 +00:00
Eugene Yurtsev	98779797fe	community[patch]: Use get_fields adapter for pydantic (#25191 ) Change all usages of __fields__ with get_fields adapter merged into langchain_core. Code mod generated using the following grit pattern: ``` engine marzano(0.1) language python `$X.__fields__` => `get_fields($X)` where { add_import(source="langchain_core.utils.pydantic", name="get_fields") } ```	2024-08-08 14:43:09 -04:00
Eugene Yurtsev	bf5193bb99	community[patch]: Upgrade pydantic extra (#25185 ) Upgrade to using a literal for specifying the extra which is the recommended approach in pydantic 2. This works correctly also in pydantic v1. ```python from pydantic.v1 import BaseModel class Foo(BaseModel, extra="forbid"): x: int Foo(x=5, y=1) ``` And ```python from pydantic.v1 import BaseModel class Foo(BaseModel): x: int class Config: extra = "forbid" Foo(x=5, y=1) ``` ## Enum -> literal using grit pattern: ``` engine marzano(0.1) language python or { `extra=Extra.allow` => `extra="allow"`, `extra=Extra.forbid` => `extra="forbid"`, `extra=Extra.ignore` => `extra="ignore"` } ``` Resorted attributes in config and removed doc-string in case we will need to deal with going back and forth between pydantic v1 and v2 during the 0.3 release. (This will reduce merge conflicts.) ## Sort attributes in Config: ``` engine marzano(0.1) language python function sort($values) js { return $values.text.split(',').sort().join("\n"); } class_definition($name, $body) as $C where { $name <: `Config`, $body <: block($statements), $values = [], $statements <: some bubble($values) assignment() as $A where { $values += $A }, $body => sort($values), } ```	2024-08-08 17:20:39 +00:00
Isaac Francisco	a72fddbf8d	[docs]: vector store integration pages (#24858 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-06 17:20:27 +00:00
jigsawlabs-student	427a04151c	community: fix neo4j from_existing_graph (#24912 ) Fixes Neo4JVector.from_existing_graph integration with huggingface Previously threw an error with existing databases, because from_existing_graph query returns empty list of new nodes, which are then passed to embedding function, and huggingface errors with empty list. Fixes [24401](https://github.com/langchain-ai/langchain/issues/24401) --------- Co-authored-by: Jeff Katzy <jeffreyerickatz@gmail.com>	2024-08-05 21:01:46 +00:00
Bagatur	e81ddb32a6	docs: fix kwargs docstring (#25010 ) Fix: ![Screenshot 2024-08-02 at 5 33 37 PM](https://github.com/user-attachments/assets/7c56cdeb-ee81-454c-b3eb-86aa8a9bdc8d)	2024-08-02 19:54:54 -07:00
Eugene Yurtsev	d24b82357f	community[patch]: Add missing annotations (#24890 ) This PR adds annotations in comunity package. Annotations are only strictly needed in subclasses of BaseModel for pydantic 2 compatibility. This PR adds some unnecessary annotations, but they're not bad to have regardless for documentation pages.	2024-07-31 18:13:44 +00:00
Shailendra Mishra	f2d810b3c0	clob_bugfix... (#24813 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-07-30 12:44:04 -04:00
Pere Pasamonte	98175860ad	community: Fix AWS DocumentDB similarity_search when filter is None (#24777 ) Description Fixes DocumentDBVectorSearch similarity_search when no filter is used; it defaults to None but $match does not accept None, so changed default to empty {} before pipeline is created. Issue AWS DocumentDB similarity search does not work when no filter is used. Error msg: "the match filter must be an expression in an object" #24775 Dependencies No dependencies Twitter handle https://x.com/perepasamonte --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-29 15:32:05 +00:00
Marc Gibbons	cc451effd1	community[patch]: langchain_community.vectorstores.azuresearch Raise LangChainException instead of bare Exception (#23935 ) Raise `LangChainException` instead of `Exception`. This alleviates the need for library users to use bare try/except to handle exceptions raised by `AzureSearch`. Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-26 15:59:06 +00:00
yonarw	b65ac8d39c	community[minor]: Self query retriever for HANA Cloud Vector Engine (#24494 ) Description: - This PR adds a self query retriever implementation for SAP HANA Cloud Vector Engine. The retriever supports all operators except for contains. - Issue: N/A - Dependencies: no new dependencies added Add tests and docs: Added integration tests to: libs/community/tests/unit_tests/query_constructors/test_hanavector.py Documentation for self query retriever: /docs/integrations/retrievers/self_query/hanavector_self_query.ipynb --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-07-26 06:56:51 +00:00
Yuki Watanabe	2b6a262f84	community[patch]: Replace `filters` argument to `filter` in DatabricksVectorSearch (#24530 ) The [DatabricksVectorSearch](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/databricks_vector_search.py#L21) class exposes similarity search APIs with argument `filters`, which is inconsistent with other VS classes who uses `filter` (singular). This PR updates the argument and add alias for backward compatibility. --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>	2024-07-25 21:20:18 -07:00
Chaunte W. Lacewell	69eacaa887	Community[minor]: Update VDMS vectorstore (#23729 ) Description: - This PR exposes some functions in VDMS vectorstore, updates VDMS related notebooks, updates tests, and upgrade version of VDMS (>=0.0.20) Issue: N/A Dependencies: - Update vdms>=0.0.20	2024-07-25 22:13:04 -04:00
Aayush Kataria	0f45ac4088	LangChain Community: VectorStores: Azure Cosmos DB Filtered Vector Search (#24087 ) Thank you for contributing to LangChain! - This PR adds vector search filtering for Azure Cosmos DB Mongo vCore and NoSQL. - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-07-23 16:59:23 -07:00
ccurme	dcba7df2fe	community[patch]: deprecate langchain_community Chroma in favor of langchain_chroma (#24474 )	2024-07-22 11:00:13 -04:00
ZhangShenao	0f6737cbfe	[Vector Store] Fix function `add_texts` in `TencentVectorDB` (#24469 ) Regardless of whether `embedding_func` is set or not, the 'text' attribute of document should be assigned, otherwise the `page_content` in the document of the final search result will be lost	2024-07-22 09:50:22 -04:00
Rafael Pereira	cf28708e7b	Neo4j: Update with non-deprecated cypher methods, and new method to associate relationship embeddings (#23725 ) Description: At the moment neo4j wrapper is using setVectorProperty, which is deprecated ([link](https://neo4j.com/docs/operations-manual/5/reference/procedures/#procedure_db_create_setVectorProperty)). I replaced with the non-deprecated version. Neo4j recently introduced a new cypher method to associate embeddings into relations using "setRelationshipVectorProperty" method. In this PR I also implemented a new method to perform this association maintaining the same format used in the "add_embeddings" method which is used to associate embeddings into Nodes. I also included a test case for this new method.	2024-07-17 12:37:47 -04:00
bovlb	5caa381177	community[minor]: Add ApertureDB as a vectorstore (#24088 ) Thank you for contributing to LangChain! - [X] ApertureDB as vectorstore: "community: Add ApertureDB as a vectorestore" - Description:* this change provides a new community integration that uses ApertureData's ApertureDB as a vector store. - Issue: none - Dependencies: depends on ApertureDB Python SDK - Twitter handle: ApertureData - [X] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. Integration tests rely on a local run of a public docker image. Example notebook additionally relies on a local Ollama server. - [X] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ All lint tests pass. Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Gautam <gautam@aperturedata.io>	2024-07-16 09:32:59 -07:00
Lage Ragnarsson	a3c10fc6ce	community: Add support for specifying hybrid search for Databricks vector search (#23528 ) Description: Databricks Vector Search recently added support for hybrid keyword-similarity search. See [usage examples](https://docs.databricks.com/en/generative-ai/create-query-vector-search.html#query-a-vector-search-endpoint) from their documentation. This PR updates the Langchain vectorstore interface for Databricks to enable the user to pass the query_type parameter to similarity_search to make use of this functionality. By default, there will not be any changes for existing users of this interface. To use the new hybrid search feature, it is now possible to do ```python # ... dvs = DatabricksVectorSearch(index) dvs.similarity_search("my search query", query_type="HYBRID") ``` Or using the retriever: ```python retriever = dvs.as_retriever( search_kwargs={ "query_type": "HYBRID", } ) retriever.invoke("my search query") ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-15 22:14:08 +00:00
mrugank-wadekar	66bebeb76a	partners: add similarity search by image functionality to langchain_chroma partner package (#22982 ) - Description: This pull request introduces two new methods to the Langchain Chroma partner package that enable similarity search based on image embeddings. These methods enhance the package's functionality by allowing users to search for images similar to a given image URI. Also introduces a notebook to demonstrate it's use. - Issue: N/A - Dependencies: None - Twitter handle: @mrugank9009 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-15 18:48:22 +00:00
thedavgar	ffe6ca986e	community: Fix Bug in Azure Search Vectorstore search asyncronously (#24081 ) Thank you for contributing to LangChain! Description: This PR fixes a bug described in the issue in #24064, when using the AzureSearch Vectorstore with the asyncronous methods to do search which is also the method used for the retriever. The proposed change includes just change the access of the embedding as optional because is it not used anywhere to retrieve documents. Actually, the syncronous methods of retrieval do not use the embedding neither. With this PR the code given by the user in the issue works. ```python vectorstore = AzureSearch( azure_search_endpoint=os.getenv("AI_SEARCH_ENDPOINT_SECRET"), azure_search_key=os.getenv("AI_SEARCH_API_KEY"), index_name=os.getenv("AI_SEARCH_INDEX_NAME_SECRET"), fields=fields, embedding_function=encoder, ) retriever = vectorstore.as_retriever(search_type="hybrid", k=2) await vectorstore.avector_search("what is the capital of France") await retriever.ainvoke("what is the capital of France") ``` Issue: The Azure Search Vectorstore is not working when searching for documents with asyncronous methods, as described in issue #24064 Dependencies: There are no extra dependencies required for this change. --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-07-11 18:32:19 -07:00
Matt	8327925ab7	community:support additional Azure Search Options (#24134 ) - Description: Support additional kwargs options for the Azure Search client (Described here https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/README.md#configurations) - Issue: N/A - Dependencies: No additional Dependencies ---------	2024-07-11 18:22:36 +00:00
Rafael Pereira	092e9ee0e6	community[minor]: Neo4j Fixed similarity docs (#23913 ) Description: There was missing some documentation regarding the `filter` and `params` attributes in similarity search methods. --------- Co-authored-by: rpereira <rafael.pereira@criticalsoftware.com>	2024-07-11 10:16:48 -04:00
Eugene Yurtsev	f765e8fa9d	core[minor],community[patch],standard-tests[patch]: Move InMemoryImplementation to langchain-core (#23986 ) This PR moves the in memory implementation to langchain-core. * The implementation remains importable from langchain-community. * Supporting utilities are marked as private for now.	2024-07-08 14:11:51 -07:00
Eugene Yurtsev	6f08e11d7c	core[minor]: add upsert, streaming_upsert, aupsert, astreaming_upsert methods to the VectorStore abstraction (#23774 ) This PR rolls out part of the new proposed interface for vectorstores (https://github.com/langchain-ai/langchain/pull/23544) to existing store implementations. The PR makes the following changes: 1. Adds standard upsert, streaming_upsert, aupsert, astreaming_upsert methods to the vectorstore. 2. Updates `add_texts` and `aadd_texts` to be non required with a default implementation that delegates to `upsert` and `aupsert` if those have been implemented. The original `add_texts` and `aadd_texts` methods are problematic as they spread object specific information across document and *kwargs. (e.g., ids are not a part of the document) 3. Adds a default implementation to `add_documents` and `aadd_documents` that delegates to `upsert` and `aupsert` respectively. 4. Adds standard unit tests to verify that a given vectorstore implements a correct read/write API. A downside of this implementation is that it creates `upsert` with a very similar signature to `add_documents`. The reason for introducing `upsert` is to: Remove any ambiguities about what information is allowed in `kwargs`. Specifically kwargs should only be used for information common to all indexed data. (e.g., indexing timeout). *Allow inheriting from an anticipated generalized interface for indexing that will allow indexing `BaseMedia` (i.e., allow making a vectorstore for images/audio etc.) `add_documents` can be deprecated in the future in favor of `upsert` to make sure that users have a single correct way of indexing content. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-05 12:21:40 -04:00
Philippe PRADOS	289960bc60	community[patch]: Redis.delete should be a regular method not a static method (#23873 ) The `langchain_common.vectostore.Redis.delete()` must not be a `@staticmethod`. With the current implementation, it's not possible to have multiple instances of Redis vectorstore because all versions must share the `REDIS_URL`. It's not conform with the base class.	2024-07-05 12:04:58 -04:00
volodymyr-memsql	a4eb6d0fb1	community: add SingleStoreDB semantic cache (#23218 ) This PR adds a `SingleStoreDBSemanticCache` class that implements a cache based on SingleStoreDB vector store, integration tests, and a notebook example. Additionally, this PR contains minor changes to SingleStoreDB vector store: - change add texts/documents methods to return a list of inserted ids - implement delete(ids) method to delete documents by list of ids - added drop() method to drop a correspondent database table - updated integration tests to use and check functionality implemented above CC: @baskaryan, @hwchase17 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	2024-07-05 09:26:06 -04:00
Bagatur	a0c2281540	infra: update mypy 1.10, ruff 0.5 (#23721 ) ```python """python scripts/update_mypy_ruff.py""" import glob import tomllib from pathlib import Path import toml import subprocess import re ROOT_DIR = Path(__file__).parents[1] def main(): for path in glob.glob(str(ROOT_DIR / "libs/*/pyproject.toml"), recursive=True): print(path) with open(path, "rb") as f: pyproject = tomllib.load(f) try: pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = ( "^1.10" ) pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = ( "^0.5" ) except KeyError: continue with open(path, "w") as f: toml.dump(pyproject, f) cwd = "/".join(path.split("/")[:-1]) completed = subprocess.run( "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color", cwd=cwd, shell=True, capture_output=True, text=True, ) logs = completed.stdout.split("\n") to_ignore = {} for l in logs: if re.match("^(.)\:(\d+)\: error:.\[(.)\]", l): path, line_no, error_type = re.match( "^(.)\:(\d+)\: error:.\[(.*)\]", l ).groups() if (path, line_no) in to_ignore: to_ignore[(path, line_no)].append(error_type) else: to_ignore[(path, line_no)] = [error_type] print(len(to_ignore)) for (error_path, line_no), error_types in to_ignore.items(): all_errors = ", ".join(error_types) full_path = f"{cwd}/{error_path}" try: with open(full_path, "r") as f: file_lines = f.readlines() except FileNotFoundError: continue file_lines[int(line_no) - 1] = ( file_lines[int(line_no) - 1][:-1] + f" # type: ignore[{all_errors}]\n" ) with open(full_path, "w") as f: f.write("".join(file_lines)) subprocess.run( "poetry run ruff format .; poetry run ruff --select I --fix .", cwd=cwd, shell=True, capture_output=True, text=True, ) if __name__ == "__main__": main() ```	2024-07-03 10:33:27 -07:00
antonpibm	ffde8a6a09	Milvus vectorstore: fix pass ids as argument after upsert (#23761 ) Description: Milvus vectorstore supports both `add_documents` via the base class and `upsert` method which deletes and re-adds documents based on their ids Issue: Due to mismatch in the interfaces the ids used by `upsert` are neglected in `add_documents`, as `ids` are passed as argument in `upsert` but via `kwargs` is `add_documents` This caused exceptions and inconsistency in the DB, tested with `auto_id=False` Fix: pass `ids` via `kwargs` to `add_documents`	2024-07-02 13:45:30 +00:00
Eugene Yurtsev	f24e38876a	community[patch]: Update root_validators to use explicit pre=True or pre=False (#23736 )	2024-07-01 17:13:23 -04:00
Valentin	bf402f902e	community: Fix LanceDB similarity search bug (#23591 ) Description: LanceDB didn't allow querying the database using similarity score thresholds because the metrics value was missing. This PR simply fixes that bug. Issue: not applicable Dependencies: none Twitter handle: not available --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-01 16:33:45 +00:00
Tim Van Wassenhove	24916c6703	community: Register pandas df in duckdb when creating vector_store (#23690 ) - Description: Register pandas df in duckdb when creating vector_store - Issue: Resolves #23308 - Dependencies: None - Twitter handle: @timvw Co-authored-by: Tim Van Wassenhove <tim.van.wassenhove@telenetgroup.be>	2024-07-01 09:12:06 -04:00
yonarw	6d0ebbca1e	community: SAP HANA Vector Engine fix for latest HANA release (#23516 ) - Description: This PR fixes an issue with SAP HANA Cloud QRC03 version. In that version the number to indicate no length being set for a vector column changed from -1 to 0. The change in this PR support both behaviours (old/new). - Dependencies: No dependencies have been introduced. - Tests: The change is covered by previous unit tests.	2024-06-26 13:15:51 +00:00
Leonid Ganeline	51e75cf59d	community: docstrings (#23202 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	2024-06-20 11:08:13 -04:00
Michał Krassowski	710197e18c	community[patch]: restore compatibility with SQLAlchemy 1.x (#22546 ) - Description: Restores compatibility with SQLAlchemy 1.4.x that was broken since #18992 and adds a test run for this version on CI (only for Python 3.11) - Issue: fixes #19681 - Dependencies: None - Twitter handle: `@krassowski_m` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-19 17:58:57 +00:00
Artem Mukhin	e271f75bee	docs: Fix URL formatting in deprecation warnings (#23075 ) Description Updated the URLs in deprecation warning messages. The URLs were previously written as raw strings and are now formatted to be clickable HTML links. Example of a broken link in the current API Reference: https://api.python.langchain.com/en/latest/chains/langchain.chains.openai_functions.extraction.create_extraction_chain_pydantic.html <img width="942" alt="Screenshot 2024-06-18 at 13 21 07" src="https://github.com/langchain-ai/langchain/assets/4854600/a1b1863c-cd03-4af2-a9bc-70375407fb00">	2024-06-18 14:49:58 -04:00
Raghav Dixit	55705c0f5e	LanceDB integration update (#22869 ) Added : - [x] relevance search (w/wo scores) - [x] maximal marginal search - [x] image ingestion - [x] filtering support - [x] hybrid search w reranking make test, lint_diff and format checked.	2024-06-17 20:54:26 -07:00
Mohammad Mohtashim	bf839676c7	[Community]: FIxed the DocumentDBVectorSearch `_similarity_search_without_score` (#22970 ) - Description: The PR #22777 introduced a bug in `_similarity_search_without_score` which was raising the `OperationFailure` error. The mistake was syntax error for MongoDB pipeline which has been corrected now. - Issue: #22770	2024-06-17 20:08:42 -07:00
caiyueliang	9944ad7f5f	community: 'Solve the issue where the _search function in ElasticsearchStore supports passing a query_vector parameter, but the parameter does not take effect. (#21532 ) Issue: When using the similarity_search_with_score function in ElasticsearchStore, I expected to pass in the query_vector that I have already obtained. I noticed that the _search function does support the query_vector parameter, but it seems to be ineffective. I am attempting to resolve this issue. Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-06-14 17:13:11 -07:00
Eugene Yurtsev	8f7cc73817	ci: Add script to check for pickle usage in community (#22863 ) Add script to check for pickle usage in community.	2024-06-13 16:13:15 -04:00
Eugene Yurtsev	77209f315e	community[patch]: FAISS VectorStore deserializer should be opt-in (#22861 ) FAISS deserializer uses pickle module. Users have to opt-in to de-serialize.	2024-06-13 15:48:13 -04:00
Rohan Aggarwal	86e8224cf1	community[patch]: Support for old clients (Thin and Thick) Oracle Vector Store (#22766 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" Support for old clients (Thin and Thick) Oracle Vector Store - [ ] PR message: *Delete this entire checklist* and replace with Support for old clients (Thin and Thick) Oracle Vector Store - [ ] Add tests and docs: If you're adding a new integration, please include Have our own local tests --------- Co-authored-by: rohan.aggarwal@oracle.com <rohaagga@phoenix95642.dev3sub2phx.databasede3phx.oraclevcn.com>	2024-06-11 11:36:06 -07:00
Aayush Kataria	71811e0547	community[minor]: Adds a vector store for Azure Cosmos DB for NoSQL (#21676 ) This PR add supports for Azure Cosmos DB for NoSQL vector store. Summary: Description: added vector store integration for Azure Cosmos DB for NoSQL Vector Store, Dependencies: azure-cosmos dependency, Tag maintainer: @hwchase17, @baskaryan @efriis @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-11 10:34:01 -07:00
Mohammad Mohtashim	36cad5d25c	[Community]: Added Metadata filter support for DocumentDB Vector Store (#22777 ) - Description: As pointed out in this issue #22770, DocumentDB `similarity_search` does not support filtering through metadata which this PR adds by passing in the parameter `filter`. Also this PR fixes a minor Documentation error. - Issue: #22770 --------- Co-authored-by: Erick Friis <erickfriis@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-11 16:37:53 +00:00
am-kinetica	ad101adec8	community[patch]: Kinetica Integrations handled error in querying; quotes in table names; updated gpudb API (#22724 ) - [ ] Miscellaneous updates and fixes: - Description: Handled error in querying; quotes in table names; updated gpudb API - Issue: Threw an error with an error message difficult to understand if a query failed or returned no records - Dependencies: Updated GPUDB API version to `7.2.0.9` @baskaryan @hwchase17	2024-06-11 10:01:26 -04:00
Nithish Raghunandanan	f2f0e0e13d	couchbase: Add the initial version of Couchbase partner package (#22087 ) Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-07 14:04:08 -07:00
Erick Friis	a24a9c6427	multiple: get rid of pyproject extras (#22581 ) They cause `poetry lock` to take a ton of time, and `uv pip install` can resolve the constraints from these toml files in trivial time (addressing problem with #19153) This allows us to properly upgrade lockfile dependencies moving forward, which revealed some issues that were either fixed or type-ignored (see file comments)	2024-06-06 15:45:22 -07:00
lucasiscovici	05bf98b2f9	community[patch]: pgvector replace nin_ by not_in (#22619 ) - [ ] community: "pgvector: replace nin_ by not_in" - [ ] PR message: nin_ do not exist in sqlalchemy orm, it's not_in	2024-06-06 12:17:22 -04:00
Bagatur	584a1e30ac	community[patch]: AzureSearch async functions (#22075 )	2024-06-05 14:39:54 -07:00
Stefano Lottini	328d0c99f2	community[minor]: Add support for metadata indexing policy in Cassandra vector store (#22548 ) This PR adds a constructor `metadata_indexing` parameter to the Cassandra vector store to allow optional fine-tuning of which fields of the metadata are to be indexed. This is a feature supported by the underlying CassIO library. Indexing mode of "all", "none" or deny- and allow-list based choices are available. The rationale is, in some cases it's advisable to programmatically exclude some portions of the metadata from the index if one knows in advance they won't ever be used at search-time. this keeps the index more lightweight and performant and avoids limitations on the length of _indexed_ strings. I added a integration test of the feature. I also added the possibility of running the integration test with Cassandra on an arbitrary IP address (e.g. Dockerized), via `CASSANDRA_CONTACT_POINTS=10.1.1.5,10.1.1.6 poetry run pytest [...]` or similar. While I was at it, I added a line to the `.gitignore` since the mypy _test_ cache was not ignored yet. My X (Twitter) handle: @rsprrs.	2024-06-05 11:23:26 -04:00
Vincent Min	59bef31997	community[minor]: Improve InMemoryVectorStore with ability to persist to disk and filter on metadata. (#22186 ) - Description: The InMemoryVectorStore is a nice and simple vector store implementation for quick development and debugging. The current implementation is quite limited in its functionalities. This PR extends the functionalities by adding utility function to persist the vector store to a json file and to load it from a json file. We choose the json file format because it allows inspection of the database contents in a text editor, which is great for debugging. Furthermore, it adds a `filter` keyword that can be used to filter out documents on their `page_content` or `metadata`. - Issue: - - Dependencies: - - Twitter handle: @Vincent_Min	2024-06-05 10:40:34 -04:00
Anthony Bernabeu	4e676a63b8	community[minor]: Added filter search for LanceDB (#22461 ) - [ ] community: "vectorstore: added filtering support for LanceDB vector store" - [ ] This PR adds filtering capabilities to LanceDB: - Description: In LanceDB filtering can be applied when searching for data into the vectorstore. It is using the SQL language as mentioned in the LanceDB documentation. - Issue: #18235 - Dependencies: No - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-06-05 09:33:54 -04:00
Ofer Mendelevitch	ad502e8d50	community[minor]: Vectara Integration Update - Streaming, FCS, Chat, updates to documentation and example notebooks (#21334 ) Thank you for contributing to LangChain! Description: update to the Vectara / Langchain integration to integrate new Vectara capabilities: - Full RAG implemented as a Runnable with as_rag() - Vectara chat supported with as_chat() - Both support streaming response - Updated documentation and example notebook to reflect all the changes - Updated Vectara templates Twitter handle: ofermend Add tests and docs: no new tests or docs, but updated both existing tests and existing docs	2024-06-04 12:57:28 -07:00
Christophe Bornet	9a8fe58ebe	community[minor]: Improve Cassandra VectorStore as_retriever (#22465 ) The Vectorstore's API `as_retriever` doesn't expose explicitly the parameters `search_type` and `search_kwargs` and so these are not well documented. This PR improves `as_retriever` for the Cassandra VectorStore by making these parameters explicit. NB: An alternative would have been to modify `as_retriever` in `Vectorstore`. But there's probably a good reason these were not exposed in the first place ? Is it because implementations may decide to not support them and have fixed values when creating the VectorStoreRetriever ?	2024-06-04 09:51:17 -04:00
Fahreddin Özcan	0061ded002	community[patch]: Upstash Vector Store Namespace Support (#22251 ) This PR introduces namespace support for Upstash Vector Store, which would allow users to partition their data in the vector index. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-03 17:30:56 -07:00

1 2 3 4 5 ...

408 Commits