langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-09 21:08:59 +00:00

Author	SHA1	Message	Date
ZhangShenao	3f1d652f15	Improvement[Community] Improve api doc for `PineconeHybridSearchRetriever` (#25803 ) - Complete missing args in api doc	2024-08-28 08:38:56 -04:00
Moritz Schlager	555f97becb	community[patch]: fix model initialization bug for deepinfra (#25727 ) ### Description adds an init method to ChatDeepInfra to set the model_name attribute according to the argument ### Issue currently, the model_name specified by the user during initialization of the ChatDeepInfra class is never set. Therefore, it always chooses the default model (meta-llama/Llama-2-70b-chat-hf, however probably since this is deprecated it always uses meta-llama/Llama-3-70b-Instruct). We stumbled across this issue and fixed it as proposed in this pull request. Feel free to change the fix according to your coding guidelines and style, this is just a proposal and we want to draw attention to this problem. ### Dependencies no additional dependencies required Feel free to contact me or @timo282 and @finitearth if you have any questions. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-08-28 02:02:35 -07:00
Bagatur	b0ac6fe8d3	community[patch]: Release 0.2.13 (#25806 )	2024-08-28 08:57:49 +00:00
zysoong	25a6790e1a	community[patch]: Minor Improvement of extract hyperlinks tool output (#25728 ) Description: Make the hyperlink only appear once in the extract_hyperlinks tool output. (for some websites output contains meaningless '#' hyperlinks multiple times which will extend the tokens of context window without any advantage) Issue: None Dependencies: None	2024-08-28 08:02:40 +00:00
Isaac Francisco	d5ddaac1fc	docs minor fix (#25794 )	2024-08-28 04:14:36 +00:00
Tomaz Bratanic	f359e6b0a5	Add mmr to neo4j vector (#25765 )	2024-08-27 08:55:19 -04:00
Luis Valencia	99f9a664a5	community: Azure Search Vector Store is missing Access Token Authentication (#24330 ) Added Azure Search Access Token Authentication instead of API KEY auth. Fixes Issue: https://github.com/langchain-ai/langchain/issues/24263 Dependencies: None Twitter: @levalencia @baskaryan Could you please review? First time creating a PR that fixes some code. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-26 15:41:50 -07:00
ZhangShenao	44e3e2391c	Improvement[Community] Improve methods in `IMessageChatLoader` (#25746 ) - Add @staticmethod to static methods in `IMessageChatLoader`. - Format args name.	2024-08-26 14:20:22 -04:00
maang-h	a566a15930	Fix MoonshotChat instantiate with alias (#25755 ) - Description: - Fix `MoonshotChat` instantiate with alias - Add `MoonshotChat` to `__init__.py` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-26 17:33:22 +00:00
Ashvin	af3b3a4474	Update endpoint for AzureMLEndpointApiType class. (#25725 ) This addresses the issue mentioned in #25702 I have updated the endpoint used in validating the endpoint API type in the AzureMLBaseEndpoint class from `/v1/completions` to `/completions` and `/v1/chat/completions` to `/chat/completions`. Co-authored-by: = <=>	2024-08-26 08:50:02 -04:00
Dristy Srivastava	7205057c3e	[Community][minor]: Added langchain_version while calling discover API (#24428 ) - Description: Added langchain version while calling discover API during both ingestion and retrieval - Issue: NA - Dependencies: NA - Tests: NA - Docs NA --------- Co-authored-by: dristy.cd <dristy@clouddefense.io>	2024-08-26 08:47:48 -04:00
Dristy Srivastava	fbb4761199	[Community][minor]: Updating source path, and file path for SharePoint loader in PebbloSafeLoader (#25592 ) - Description: Updating source path and file path in Pebblo safe loader for SharePoint apps during loading - Issue: NA - Dependencies: NA - Tests: NA - Docs NA --------- Co-authored-by: dristy.cd <dristy@clouddefense.io>	2024-08-26 08:38:40 -04:00
Rajendra Kadam	745d1c2b8d	community[minor]: [Pebblo] Fix URL construction in newer Python versions (#25747 ) - PR message: Fix URL construction in newer Python versions - Description: - Update the URL construction logic to use the .value attribute for Routes enum members. - This adjustment resolves an issue where the code worked correctly in Python 3.9 but failed in Python 3.11. - Clean up unused routes. - Issue: NA - Dependencies: NA	2024-08-26 07:27:30 -04:00
Rajendra Kadam	58a98c7d8a	community: [PebbloRetrievalQA] Implemented Async support for prompt APIs (#25748 ) - Description: PebbloRetrievalQA: Implemented Async support for prompt APIs (classification and governance) - Issue: NA - Dependencies: NA	2024-08-26 07:27:05 -04:00
Christophe Bornet	038c287b3a	all: Improve make lint command (#25344 ) * Removed `ruff check --select I` as `I` is already selected and checked in the main `ruff check` command * Added checks for non-empty `PYTHON_FILES` * Run `ruff check` only on `PYTHON_FILES` Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-23 18:23:52 -07:00
Erick Friis	f6491ceb7d	community: remove integration test deps (#24460 ) they aren't used	2024-08-23 23:25:17 +00:00
Sharmistha S. Gupta	90439b12f6	Added support for Nebula Chat model (#21925 ) Description: Added support for Nebula Chat model in addition to Nebula Instruct Dependencies: N/A Twitter handle: @Symbldotai --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-23 22:34:32 +00:00
Ian	64ace25eb8	<Community>: tidb vector support vector index (#19984 ) This PR introduces adjustments to ensure compatibility with the recently released preview version of [TiDB Serverless Vector Search](https://tidb.cloud/ai), aiming to prevent user confusion. - TiDB Vector now supports vector indexing with cosine and l2 distance strategies, although inner_product remains unsupported. - Changing the distance strategy is currently not supported, so the test cases should be adjusted.	2024-08-23 13:59:23 -04:00
Austin Burdette	f355a98bb6	community:yuan2[patch]: standardize init args (#21462 ) updated stop and request_timeout so they aliased to stop_sequences, and timeout respectively. Added test that both continue to set the same underlying attributes. Related to [20085](https://github.com/langchain-ai/langchain/issues/20085) Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-23 17:56:19 +00:00
Erick Friis	b365ee996b	community: remove unused verify_ssl kwarg from aiohttp request (#25707 ) it's not a valid kwarg in aiohttp request	2024-08-23 17:14:04 +00:00
Ashvin	2cd77a53a3	docs: Add docstrings for CassandraChatMessageHistory class and package namespace function. (#24222 ) - Modified docstring for CassandraChatMessageHistory in libs/community/langchain_community/chat_message_history/cassandra.py. - Added docstring for _package_namespace function in docs/api_reference/create_api_rst.py --------- Co-authored-by: ashvin <ashvin.anilkumar@qburst.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-23 15:49:41 +00:00
Leonid Ganeline	8788a34bfa	community: `NeptuneGraph` fix (#23281 ) Issue: the `service` optional parameter was mentioned but not used. Fix: added this parameter. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-23 15:34:26 +00:00
Djordje	22f9ae489f	community: Opensearch - added score function for similarity_score_threshold (#23928 ) This PR resolves the NotImplemented error for the similarity_score_threshold search type for OpenSearch.	2024-08-23 11:30:04 -04:00
ZhangShenao	b38c83ff93	patch[Community] Optimize methods in several ChatLoaders (#24806 ) There are some static methods in ChatLoaders, try to add @staticmethod decorator for them.	2024-08-23 11:00:41 -04:00
James Espichan Vilca	644e0d3463	Use extend method for embeddings concatenation in mlflow_gateway (#14358 ) ## Description There is a bug in the concatenation of embeddings obtained from MLflow that does not conform to the type hint requested by the function. ``` python def _query(self, texts: List[str]) -> List[List[float]]: ``` It is logical to expect a List[List[float]] for a List[str]. However, the append method encapsulates the response in a global List. To avoid this, the extend method should be used, which will add the embeddings of all strings at the same list level. ## Testing I have tried using OpenAI-ADA to obtain the embeddings, and the result of executing this snippet is as follows: ``` python embeds = await MlflowAIGatewayEmbeddings().aembed_documents(texts=["hi", "how are you?"]) print(embeds) ``` ``` python [[[-0.03512698, -0.020624293, -0.015343423, ...], [-0.021260535, -0.011461929, -0.00033121882, ...]]] ``` When in reality, the expected result should be: ``` python [[-0.03512698, -0.020624293, -0.015343423, ...], [-0.021260535, -0.011461929, -0.00033121882, ...]] ``` The above result complies with the expected type hint: List[List[float]] . As I mentioned, we can achieve that by using the extend method instead of the append method. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-23 14:43:43 +00:00
Christophe Bornet	7f1e444efa	partners: Use simsimd types (#25299 ) The simsimd package [now has types](https://github.com/ashvardanian/SimSIMD/releases/tag/v5.0.0)	2024-08-23 10:41:39 -04:00
clement.l	642f9530cd	community: add supported blockchains to Blockchain Document Loader (#25428 ) - Remove deprecated chains. - Add more supported chains. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-23 14:39:42 +00:00
conjuncts	818267bbc3	community: allow chroma DB delete() to use "where" argument (#19826 ) Description: Simply pass kwargs to allow arguments like "where" to be propagated Issue: Previously, db.delete(where={}) wouldn't work for chroma vectorstores Dependencies: N/A Twitter handle: N/A	2024-08-23 10:10:57 -04:00
Kevin Engelke	3c7f12cbf5	community[minor]: Fix missing 'keep_newlines' parameter forward-pass to 'process_pages' function in confluence loader (#20086 ) (#20087 ) - Description: Fixed missing `keep_newlines` parameter forward-pass in confluence-loader - Issue: #20086 - Dependencies: None --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-23 12:59:38 +00:00
Erik Lindgren	583b0449eb	community[patch]: Fix Hybrid Search for non-Databricks managed embeddings (#25590 ) Description: Send both the query and query_embedding to the Databricks index for hybrid search. Issue: When using hybrid search with non-Databricks managed embedding we currently don't pass both the embedding and query_text to the index. Hybrid search requires both of these. This change fixes this issue for both `similarity_search` and `similarity_search_by_vector`. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-23 08:57:13 +00:00
Alejandro Companioni	bcd5842b5d	community[patch]: Updating default PPLX model to supported llama-3.1 model. (#25643 ) # Issue As of late July, Perplexity [no longer supports Llama 3 models](https://docs.perplexity.ai/changelog/introducing-new-and-improved-sonar-models). # Description This PR updates the default model and doc examples to reflect their latest supported model. (Mostly updating the same places changed by #23723.) # Twitter handle `@acompa_` on behalf of the team at Not Diamond. Check us out [here](https://notdiamond.ai). --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-08-23 08:33:30 +00:00
Jakub W.	b865ee49a0	community[patch]: Dynamodb history messages key (#25658 ) - Description: adding the history_messages_key so you don't have to use "History" as a key in langchain	2024-08-23 08:05:28 +00:00
Manuel Jaiczay	1c31234eed	community: fix HuggingFacePipeline pipeline_kwargs (#19920 ) Fix handling of pipeline_kwargs to prioritize class attribute defaults. #19770 Co-authored-by: jaizo <manuel.jaiczay@polygons.at> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-08-22 18:29:46 -04:00
Nobuhiko Otoba	4b63a217c2	"community: Fix GithubFileLoader source code", "docs: Fix GithubFileLoader code sample" (#19943 ) This PR adds tiny improvements to the `GithubFileLoader` document loader and its code sample, addressing the following issues: 1. Currently, the `file_extension` argument of `GithubFileLoader` does not change its behavior at all. 1. The `GithubFileLoader` sample code in `docs/docs/integrations/document_loaders/github.ipynb` does not work as it stands. The respective solutions I propose are the following: 1. Remove `file_extension` argument from `GithubFileLoader`. 1. Specify the branch as `master` (not the default `main`) and rename `documents` as `document`. --------- Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-08-22 18:24:57 -04:00
Nada Amin	ac7b71e0d7	langchain_community.graphs: Neo4JGraph: prop min_size might be None (#23944 ) When I used the Neo4JGraph enhanced_schema=True option, I ran into an error because a prop min_size of None was compared numerically with an int. The fix I applied is similar to the pattern of skipping embeddings elsewhere in the file. Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-22 20:29:52 +00:00
William FH	fad6fc866a	Rm DeepInfra Breakpoint Comment (#25206 ) tbh should rm the print statement too	2024-08-22 14:43:44 -04:00
Eric Pinzur	01ded5e2f9	community: add metadata filter to CassandraGraphVectorStore (#25663 ) - Description: - Added metadata filtering support to `langchain_community.graph_vectorstores.cassandra.CassandraGraphVectorStore` - Also fixed type conversion issues highlighted by mypy. - Dependencies: - `ragstack-ai-knowledge-store 0.2.0` (released July 23, 2024) --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-22 14:27:16 -04:00
mschoenb97IL	e499caa9cd	community: Give more context on DeepInfra 500 errors (#25671 ) Description: DeepInfra 500 errors have useful information in the text field that isn't being exposed to the user. I updated the error message to fix this. As an example, this code ``` from langchain_community.chat_models import ChatDeepInfra from langchain_core.messages import HumanMessage model = "meta-llama/Meta-Llama-3-70B-Instruct" deepinfra_api_token = "..." model = ChatDeepInfra(model=model, deepinfra_api_token=deepinfra_api_token) messages = [HumanMessage("All work and no play makes Jack a dull boy\n" * 9000)] response = model.invoke(messages) ``` Currently gives this error: ``` langchain_community.chat_models.deepinfra.ChatDeepInfraException: DeepInfra Server: Error 500 ``` This change would give the following error: ``` langchain_community.chat_models.deepinfra.ChatDeepInfraException: DeepInfra Server error status 500: {"error":{"message":"Requested input length 99009 exceeds maximum input length 8192"}} ```	2024-08-22 10:10:51 -07:00
Rajendra Kadam	4ff2f4499e	community: Refactor PebbloRetrievalQA (#25583 ) Refactor PebbloRetrievalQA - Created `APIWrapper` and moved API logic into it. - Created smaller functions/methods for better readability. - Properly read environment variables. - Removed unused code. - Updated models Issue: NA Dependencies: NA tests: NA	2024-08-22 11:51:21 -04:00
Rajendra Kadam	1f1679e960	community: Refactor PebbloSafeLoader (#25582 ) Refactor PebbloSafeLoader - Created `APIWrapper` and moved API logic into it. - Moved helper functions to the utility file. - Created smaller functions and methods for better readability. - Properly read environment variables. - Removed unused code. Issue: NA Dependencies: NA tests: Updated	2024-08-22 11:46:52 -04:00
maang-h	5e3a321f71	docs: Add ChatZhipuAI tool calling and structured output docstring (#25669 ) - Description: Add `ChatZhipuAI` tool calling and structured output docstring.	2024-08-22 10:34:41 -04:00
Noah Mayerhofer	0091947efd	community: add retry for session expired exception in neo4j (#25660 ) Description: The neo4j driver can raise a SessionExpired error, which is considered a retriable error. If a query fails with a SessionExpired error, this change retries every query once. This change will make the neo4j integration less flaky. Twitter handle: noahmay_	2024-08-22 13:07:36 +00:00
Dristy Srivastava	b002702af6	[Community][minor]: Updating metadata with full_path in SharePoint loader (#25593 ) - Description: Updating metadata for sharepoint loader with full path i.e., webUrl - Issue: NA - Dependencies: NA - Tests: NA - Docs NA Co-authored-by: dristy.cd <dristy@clouddefense.io> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-21 13:10:14 +00:00
Jabir	12e490ea56	Update azuresearch.py (#25577 ) This will allow complextype metadata to be returned. the current implementation throws error when dealing with nested metadata --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-20 12:53:30 +00:00
Erick Friis	e01c6789c4	core,community: add beta decorator to missed GraphVectorStore extensions (#25562 )	2024-08-19 17:29:09 -07:00
maang-h	015ab91b83	community[patch]: Add ToolMessage for ChatZhipuAI (#25547 ) - Description: Add ToolMessage for `ChatZhipuAI` to solve the issue #25490	2024-08-19 11:26:38 -04:00
Mohammad Mohtashim	75c3c81b8c	[Community]: Fix - Open AI Whisper `client.audio.transcriptions` returning Text Object which raises error (#25271 ) - Description: The following [line](`fd546196ef/libs/community/langchain_community/document_loaders/parsers/audio.py (L117)`) in `OpenAIWhisperParser` returns a text object for some odd reason despite the official documentation saying it should return a `Transcript` instance, which should have the text attribute. But for the example given in the issue, and even when I tried running it on my own, I was directly getting the text. This small PR accounts for that. - Issue: #25218 I was able to replicate the error even without the GenericLoader as shown below and the issue was with `OpenAIWhisperParser` ```python parser = OpenAIWhisperParser(api_key="sk-fxxxxxxxxx", response_format="srt", temperature=0) list(parser.lazy_parse(Blob.from_path('path_to_file.m4a'))) ```	2024-08-19 09:36:42 -04:00
maang-h	32f5147523	docs: Fix QianfanLLMEndpoint and Tongyi input text (#25529 ) - Description: Fix `QianfanLLMEndpoint` and `Tongyi` input text.	2024-08-19 09:23:09 -04:00
ZhangShenao	4255a30f20	Improvement[Community] Improve api doc for `SingleFileFacebookMessengerChatLoader` (#25536 ) Delete redundant args in api doc	2024-08-19 09:00:21 -04:00
ccurme	b83f1eb0d5	core, partners: implement standard tracing params for LLMs (#25410 )	2024-08-16 13:18:09 -04:00
Bagatur	253ceca76a	docs: fix mimetype parser docstring (#25463 )	2024-08-15 16:16:52 -07:00
ccurme	8afbab4cf6	langchain[patch]: deprecate various chains (#25310 ) - [x] NatbotChain: move to community, deprecate langchain version. Update to use `prompt | llm | output_parser` instead of LLMChain. - [x] LLMMathChain: deprecate + add langgraph replacement example to API ref - [x] HypotheticalDocumentEmbedder (retriever): update to use `prompt | llm | output_parser` instead of LLMChain - [x] FlareChain: update to use `prompt | llm | output_parser` instead of LLMChain - [x] ConstitutionalChain: deprecate + add langgraph replacement example to API ref - [x] LLMChainExtractor (document compressor): update to use `prompt | llm | output_parser` instead of LLMChain - [x] LLMChainFilter (document compressor): update to use `prompt | llm | output_parser` instead of LLMChain - [x] RePhraseQueryRetriever (retriever): update to use `prompt | llm | output_parser` instead of LLMChain	2024-08-15 10:49:26 -04:00
ccurme	ba167dc158	community[patch]: update connection string in azure cosmos integration test (#25438 )	2024-08-15 14:07:54 +00:00
Isaac Francisco	966b408634	[docs]: doc loader changes (#25417 )	2024-08-14 19:46:33 -07:00
Werner van der Merwe	1d3f7231b8	fix: typo where github should be gitlab (#25397 ) PR title: "GitLabToolkit: fix typo" - Description: fix typo where GitHub should have been GitLab - Dependencies: None	2024-08-14 18:36:25 +00:00
Bagatur	493e474063	docs: updated api reference (#25172 ) - Move the API reference into the vercel build - Update api reference organization and styling	2024-08-14 07:00:17 -07:00
ccurme	27690506d0	multiple: update removal targets (#25361 )	2024-08-14 09:50:39 -04:00
Harrison Chase	967b6f21f6	docs: improve document loaders index (#25365 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-14 01:48:48 +00:00
Isaac Francisco	f4ffd692a3	[docs]: standardize doc loader doc strings (#25325 )	2024-08-13 23:18:56 +00:00
Isaac Francisco	e0bbb81d04	[docs]: standardize tool docstrings (#25351 )	2024-08-13 16:10:00 -07:00
thedavgar	9d08369442	community: fix AzureSearch vectorstore asynchronous methods (#24921 ) Description Fix the asynchronous methods to retrieve documents from AzureSearch VectorStore. The previous changes from [this commit](`ffe6ca986e`) created similar code for the synchronous methods and the asynchronous ones, but the asynchronous client returns an asynchronous iterator "AsyncSearchItemPaged" as said in the issue #24740. To solve this issue, the synchronous iterators in asynchronous methods were changed to asynchronous iterators. @chrislrobert said in [this comment](https://github.com/langchain-ai/langchain/issues/24740#issuecomment-2254168302) that there was still a flaw due to `with` blocks that close the client after each call. I removed these `with` blocks in the `async_client` following the same pattern as the sync `client`. In order to close up the connections, a __del__ method is included to gently close up clients once the vectorstore object is destroyed. Issue: #24740 and #24064 Dependencies: No new dependencies for this change Example notebook: I created a notebook just to test that the changes work and give the same results as the synchronous methods for vector and hybrid search. With these changes, the asynchronous methods in the retriever work as well. ![image](https://github.com/user-attachments/assets/697e431b-9d7f-4d0d-b205-59d051ac2b67) Lint and test: Passes the tests and the linter	2024-08-13 14:20:51 -07:00
Fedor Nikolaev	2b15518c5f	community: add args_schema to SearxSearchResults tool (#25350 ) This adds `args_schema` member to `SearxSearchResults` tool. This member is already present in the `SearxSearchRun` tool in the same file. I was having `TypeError: Type is not JSON serializable: AsyncCallbackManagerForToolRun` being thrown in langserve playground when I was using `SearxSearchResults` tool as a part of chain there. This fixes the issue, so the error is not raised anymore. This is an example langserve app that was giving me the error, but it works properly after the proposed fix: ```python #!/usr/bin/env python from fastapi import FastAPI from langchain_core.prompts import ChatPromptTemplate from langchain_core.output_parsers import StrOutputParser from langchain_core.runnables import RunnablePassthrough from langchain_openai import ChatOpenAI from langchain_community.utilities import SearxSearchWrapper from langchain_community.tools.searx_search.tool import SearxSearchResults from langserve import add_routes template = """Answer the question based only on the following context: {context} Question: {question} """ prompt = ChatPromptTemplate.from_template(template) model = ChatOpenAI() s = SearxSearchWrapper(searx_host="http://localhost:8080") search = SearxSearchResults(wrapper=s) search_chain = ( {"context": search, "question": RunnablePassthrough()} | prompt | model | StrOutputParser() ) app = FastAPI() add_routes( app, search_chain, path="/chain", ) if __name__ == "__main__": import uvicorn uvicorn.run(app, host="localhost", port=8000) ```	2024-08-13 18:26:09 +00:00
maang-h	089f5e6cad	Standardize SparkLLM (#25239 ) - Description: Standardize SparkLLM, include: - docs, the issue #24803 - to support stream - update api url - model init arg names, the issue #20085	2024-08-13 09:50:12 -04:00
Chen Xiabin	24155aa1ac	qianfan generate/agenerate with usage_metadata (#25332 )	2024-08-13 09:24:41 -04:00
Erick Friis	2907ab2297	community: release 0.2.12 (#25324 )	2024-08-12 23:30:27 +00:00
Ben Chambers	1adc161642	community: kwargs for CassandraGraphVectorStore (#25300 ) - Description: pass kwargs from CassandraGraphVectorStore to underlying store Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-12 18:01:29 +00:00
ccurme	e77eeee6ee	core[patch]: add standard tracing params for retrievers (#25240 )	2024-08-12 14:51:59 +00:00
Mohammad Mohtashim	9927a4866d	[Community] - Added bind_tools and with_structured_output for ChatZhipuAI (#23887 ) - Description: This PR implements the `bind_tools` functionality for ChatZhipuAI as requested by the user. ChatZhipuAI models support tool calling according to the `OpenAI` tool format, as outlined in their official documentation [here](https://open.bigmodel.cn/dev/api#glm-4). - Issue: #23868 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-12 14:11:43 +00:00
maang-h	bc60cddc1b	docs: Fix ChatBaichuan, QianfanChatEndpoint, ChatSparkLLM, ChatZhipuAI docs (#25265 ) - Description: Fix some chat models docs, include: - ChatBaichuan - QianfanChatEndpoint - ChatSparkLLM - ChatZhipuAI	2024-08-11 16:23:55 -04:00
ZhangShenao	43deed2a95	Improvement[Embeddings] Add dimension support to `ZhipuAIEmbeddings` (#25274 ) - In the `embedding-3` and later models of Zhipu AI, the `dimensions` parameter of the embedding can be specified. Ref: https://bigmodel.cn/dev/api#text_embedding-3 . - Add test case for `embedding-3` model by assigning dimensions.	2024-08-11 16:20:37 -04:00
Eugene Yurtsev	6dd9f053e3	core[patch]: Deprecating beta upsert APIs in vectorstore (#25069 ) This PR deprecates the beta upsert APIs in vectorstore. We'll introduce them in a V2 abstraction instead to keep the existing vectorstore implementations lighter weight. The main problem with the existing APIs is that it's a bit more challenging to implement the correct behavior w/ respect to IDs since ID can be present in both the function signature and as an optional attribute on the document object. But VectorStores that pass the standard tests should have implemented the semantics properly! --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-09 17:17:36 -04:00
Eugene Yurtsev	b6f0174bb9	community[patch],core[patch]: Update EdenaiTool root_validator and add unit test in core (#25233 ) This PR gets rid of the `root_validators(allow_reuse=True)` logic used in EdenAI Tool in preparation for pydantic 2 upgrade. - add another test to secret_from_env_factory	2024-08-09 15:59:27 +00:00
Eugene Yurtsev	bd6c31617e	community[patch]: Remove more @allow_reuse=True validators (#25236 ) Remove some additional allow_reuse=True usage in @root_validators.	2024-08-09 11:10:27 -04:00
Eugene Yurtsev	6e57aa7c36	community[patch]: Remove usage of @root_validator(allow_reuse=True) (#25235 ) Remove usage of @root_validator(allow_reuse=True)	2024-08-09 10:57:42 -04:00
thiswillbeyourgithub	a2b4c33bd6	community[patch]: FAISS: ValueError mentions normalize_score_fn instead of relevance_score_fn (#25225 ) Description: when faiss.py has a None relevance_score_fn it raises a ValueError that says a normalize_fn_score argument is needed. Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-09 14:40:29 +00:00
Shivendra Soni	66b7206ab6	community: Add llm-extraction option to FireCrawl Document Loader (#25231 ) Description: This minor PR aims to add `llm_extraction` to Firecrawl loader. This feature is supported on API and PythonSDK, but the langchain loader omits adding this to the response. Twitter handle: [scalable_pizza](https://x.com/scalablepizza) --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-09 13:59:10 +00:00
ccurme	3b7437d184	docs: update integration api refs (#25195 ) - [x] toolkits - [x] retrievers (in this repo)	2024-08-09 12:27:32 +00:00
Eugene Yurtsev	98779797fe	community[patch]: Use get_fields adapter for pydantic (#25191 ) Change all usages of __fields__ with get_fields adapter merged into langchain_core. Code mod generated using the following grit pattern: ``` engine marzano(0.1) language python `$X.__fields__` => `get_fields($X)` where { add_import(source="langchain_core.utils.pydantic", name="get_fields") } ```	2024-08-08 14:43:09 -04:00
Rajendra Kadam	663638d6a8	community[minor]: [SharePointLoader] Load extended metadata for the root folder (#24872 ) - Title: [SharePointLoader] Load extended metadata for the root folder - Description: - Ensure extended metadata loads correctly for the root folder. - Cleanup: Refactor SharePointLoader to remove unused fields(`file_id` & `site_id`). - Dependencies: NA - Add tests and docs: NA	2024-08-08 14:39:16 -04:00
Eugene Yurtsev	bf5193bb99	community[patch]: Upgrade pydantic extra (#25185 ) Upgrade to using a literal for specifying the extra which is the recommended approach in pydantic 2. This works correctly also in pydantic v1. ```python from pydantic.v1 import BaseModel class Foo(BaseModel, extra="forbid"): x: int Foo(x=5, y=1) ``` And ```python from pydantic.v1 import BaseModel class Foo(BaseModel): x: int class Config: extra = "forbid" Foo(x=5, y=1) ``` ## Enum -> literal using grit pattern: ``` engine marzano(0.1) language python or { `extra=Extra.allow` => `extra="allow"`, `extra=Extra.forbid` => `extra="forbid"`, `extra=Extra.ignore` => `extra="ignore"` } ``` Resorted attributes in config and removed doc-string in case we will need to deal with going back and forth between pydantic v1 and v2 during the 0.3 release. (This will reduce merge conflicts.) ## Sort attributes in Config: ``` engine marzano(0.1) language python function sort($values) js { return $values.text.split(',').sort().join("\n"); } class_definition($name, $body) as $C where { $name <: `Config`, $body <: block($statements), $values = [], $statements <: some bubble($values) assignment() as $A where { $values += $A }, $body => sort($values), } ```	2024-08-08 17:20:39 +00:00
ololand	249945a572	Update polygon.py for business subscription (#25085 ) For business subscription the status is STOCKSBUSINESS not OK	2024-08-08 15:28:41 +00:00
ogawa	d895db11d6	community[patch]: gpt-4o-2024-08-06 costs (#25164 ) - Description: updated OpenAI cost definitions according to the following: - https://openai.com/api/pricing/ - Twitter handle: `@ogawa65a` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-08 13:22:11 +00:00
maang-h	0ba125c3cd	docs: Standardize QianfanLLMEndpoint LLM (#25139 ) - Description: Standardize QianfanLLMEndpoint LLM, including: - docs (issue #24803) - model init arg names (issue #20085)	2024-08-07 10:57:27 -04:00
Pat Patterson	7e7fcf5b1f	community: Fix ValidationError on creating GPT4AllEmbeddings with no gpt4all_kwargs (#25124 ) - Description: Instantiating `GPT4AllEmbeddings` with no `gpt4all_kwargs` argument raised a `ValidationError`. Root cause: #21238 added the capability to pass `gpt4all_kwargs` through to the `GPT4All` instance via `Embed4All`, but broke code that did not specify a `gpt4all_kwargs` argument. - Issue: #25119 - Dependencies: None - Twitter handle: [`@metadaddy`](https://twitter.com/metadaddy)	2024-08-07 13:34:01 +00:00
Virat Singh	264ab96980	community: Add stock market tools from financialdatasets.ai (#25025 ) Description: In this PR, I am adding three stock market tools from financialdatasets.ai (my API!): - get balance sheets - get cash flow statements - get income statements Twitter handle: [@virattt](https://twitter.com/virattt) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-06 18:28:12 +00:00
Naval Chand	71c0698ee4	Added bedrock 3-5 sonnet cost details for BedrockAnthropicTokenUsageCallbackHandler (#25104 ) Co-authored-by: Naval Chand <navalchand@192.168.1.36>	2024-08-06 17:28:47 +00:00
Isaac Francisco	a72fddbf8d	[docs]: vector store integration pages (#24858 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-06 17:20:27 +00:00
maang-h	1028af17e7	docs: Standardize Tongyi (#25103 ) - Description: Standardize Tongyi LLM, including: - docs (issue #24803) - model init arg names (issue #20085)	2024-08-06 11:44:12 -04:00
Dobiichi-Origami	061ed250f6	delete the default model value from langchain and discard the need fo… (#24915 ) - Description: Remove the requirement that `QIANFAN_AK` be set, and drop the default model name that langchain supplied, because the underlying `qianfan` SDK that powers the langchain component already has a default model name. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-06 14:11:05 +00:00
Dominik Fladung	ffa0c838d8	Allow ConfluenceLoader authorization via Personal Access Tokens (#25096 ) - community: Allow authorization to Confluence with bearer token - Description: Allow authorization to Confluence with [Personal Access Token](https://confluence.atlassian.com/enterprise/using-personal-access-tokens-1026032365.html) by checking for the keys `['client_id', token: ['access_token', 'token_type']]` - Issue: Currently the following error occurs when using a personal access token for authorization. ```python loader = ConfluenceLoader( url=os.getenv('CONFLUENCE_URL'), oauth2={ 'token': {"access_token": os.getenv("CONFLUENCE_ACCESS_TOKEN"), "token_type": "bearer"}, 'client_id': 'client_id', }, page_ids=['12345678'], ) ``` ``` ValueError: Error(s) while validating input: ["You have either omitted require keys or added extra keys to the oauth2 dictionary. key values should be `['access_token', 'access_token_secret', 'consumer_key', 'key_cert']`"] ``` With this PR the loader runs as expected. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-06 13:42:47 +00:00
jigsawlabs-student	427a04151c	community: fix neo4j from_existing_graph (#24912 ) Fixes Neo4JVector.from_existing_graph integration with huggingface Previously threw an error with existing databases, because from_existing_graph query returns empty list of new nodes, which are then passed to embedding function, and huggingface errors with empty list. Fixes [24401](https://github.com/langchain-ai/langchain/issues/24401) --------- Co-authored-by: Jeff Katzy <jeffreyerickatz@gmail.com>	2024-08-05 21:01:46 +00:00
Jim Baldwin	6890daa90c	community: make AthenaLoader profile_name optional and fix type hint (#24958 ) - Description: This PR makes the AthenaLoader profile_name optional and fixes the type hint, which says the type is `str` but should be `str` or `None`, since None is handled in the loader init. This is a minor problem, but it confused me when using the Athena loader: it was unclear why a profile was required, since I want one locally but not in production. - Issue: #24957 - Dependencies: None.	2024-08-05 14:28:58 +00:00
Dobiichi-Origami	c5cb52a3c6	community: fix issue of the existence of numeric object in `additional_kwargs` a… (#24863 ) - Description: A previous PR breaks the code from `baidu_qianfan_endpoint.py` which causes the malfunction of streaming	2024-08-05 10:15:55 -04:00
ZhangShenao	cda79dbb6c	community[patch]: Optimize test case for `MoonshotChat` (#25050 ) Optimize test case for `MoonshotChat`. Use standard ChatModelIntegrationTests.	2024-08-05 10:11:25 -04:00
Alex Sherstinsky	208042e0f2	community: Fix Predibase Integration for HuggingFace-hosted fine-tuned adapters (#25015 )	2024-08-03 14:05:43 -07:00
maang-h	f5da0d6d87	docs: Standardize MiniMaxEmbeddings (#24983 ) - Description: Standardize MiniMaxEmbeddings: - docs (issue #24856) - model init arg names (issue #20085)	2024-08-03 14:01:23 -04:00
maang-h	7de62abc91	docs: Standardize SparkLLMTextEmbeddings docstrings (#25021 ) - Description: Standardize SparkLLMTextEmbeddings docstrings - Issue: #24856	2024-08-03 13:44:09 -04:00
Bagatur	e81ddb32a6	docs: fix kwargs docstring (#25010 )	2024-08-02 19:54:54 -07:00
Isaac Francisco	73570873ab	docs: standardizing tavily tool docs (#24736 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-08-02 22:25:27 +00:00
Bagatur	8e2316b8c2	community[patch]: Release 0.2.11 (#24989 )	2024-08-02 20:08:44 +00:00
ccurme	22c1a4041b	community[patch]: support named arguments in github toolkit (#24986 ) Parameters may be passed in by name if generated from tool calls.	2024-08-02 18:27:32 +00:00
ZhangShenao	71c0564c9f	community[patch]: Add test case for MoonshotChat (#24960 ) Add test case for `MoonshotChat`.	2024-08-02 09:37:31 -04:00
Isaac Francisco	d7688a4328	community[patch]: adding artifact to Tavily search (#24376 ) This allows you to get raw content as well as the answer, instead of just getting the results. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-08-01 21:12:11 -07:00
maang-h	ea505985c4	docs: Standardize ZhipuAIEmbeddings docstrings (#24933 ) - Description: Standardize ZhipuAIEmbeddings rich docstrings. - Issue: #24856	2024-08-01 14:06:53 -04:00
Anneli Samuel	2204d8cb7d	community[patch]: Invoke on_llm_new_token callback before yielding chunk (#24938 ) Description: Invoke on_llm_new_token callback before yielding chunk in streaming mode Issue: [#16913](https://github.com/langchain-ai/langchain/issues/16913)	2024-08-01 16:39:04 +00:00
Serena Ruan	1827bb4042	community[patch]: support bind_tools for ChatMlflow (#24547 ) - Description: Support ChatMlflow.bind_tools method. Tested in Databricks. --------- Signed-off-by: Serena Ruan <serena.rxy@gmail.com>	2024-08-01 08:43:07 -07:00
BottlePumpkin	bfc59c1d26	community: Fix KeyError in NotionDB loader when 'name' is missing (#24224 ) Description: This PR fixes a KeyError in NotionDBLoader when the "name" key is missing in the "people" property. Issue: Fixes #24223 Dependencies: None --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-01 13:55:40 +00:00
alexqiao	8eb0bdead3	community[patch]: Invoke callback prior to yielding token (#24917 ) Description: Invoke callback prior to yielding token in stream method for chat_models. Issue: https://github.com/langchain-ai/langchain/issues/16913	2024-08-01 13:19:55 +00:00
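The ordering these two streaming fixes describe (callback first, then yield) can be sketched in plain Python. This is a minimal illustration, not the code from either PR: `stream_tokens` and `on_new_token` are hypothetical stand-ins for a model's stream method and its callback handler.

```python
from typing import Callable, Iterable, Iterator, List


def stream_tokens(
    raw_tokens: Iterable[str],
    on_new_token: Callable[[str], None],
) -> Iterator[str]:
    """Yield tokens, notifying the callback before each one is yielded."""
    for token in raw_tokens:
        # Callback first: even if the consumer stops iterating right after
        # receiving this token, the callback has already observed it.
        on_new_token(token)
        yield token


seen: List[str] = []
stream = stream_tokens(["Hello", " ", "world"], seen.append)
first = next(stream)
# The generator is paused at its first yield, and the callback has
# already recorded the token the consumer just received.
print(first, seen)
```

With the opposite order (yield, then callback), a consumer that breaks out of the loop early would leave the last token unreported to the callback.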
Nikita Pakunov	c776471ac6	community: fix AttributeError: 'YandexGPT' object has no attribute '_grpc_metadata' (#24432 ) Fixes #24049 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-31 21:18:33 +00:00
Eugene Yurtsev	add16111b9	community[patch]: Make the pydantic linter stricter (#24897 ) Stricter linting of deprecated pydantic features.	2024-07-31 18:57:37 +00:00
Eugene Yurtsev	a4a444f73d	community[patch]: Fix arcee llm usage of root_validator(pre=False) (#24896 ) Should be pre=True	2024-07-31 18:49:20 +00:00
Eugene Yurtsev	d24b82357f	community[patch]: Add missing annotations (#24890 ) This PR adds annotations in the community package. Annotations are only strictly needed in subclasses of BaseModel for pydantic 2 compatibility. This PR adds some unnecessary annotations, but they're not bad to have regardless for documentation pages.	2024-07-31 18:13:44 +00:00
ccurme	30f18c7b02	docs: add retriever integrations template (#24836 )	2024-07-31 13:50:44 -04:00
Anirudh31415926535	4da3d4b18e	docs: Minor corrections and updates to Cohere docs (#22726 ) - Description: Update Cohere's provider and RagRetriever documentation with the latest changes. - Twitter handle: Anirudh1810	2024-07-31 10:16:26 -07:00
Nishan Jain	b00c0fc558	[Community][minor]: Added prompt governance in pebblo_retrieval (#24874 ) Title: [pebblo_retrieval] Identifying entities in prompts given in PebbloRetrievalQA leading to prompt governance Description: Implemented identification of entities in the prompt using Pebblo prompt governance API. Issue: NA Dependencies: NA Add tests and docs: NA	2024-07-31 13:14:51 +00:00
Rajendra Kadam	a6add89bd4	community[minor]: [PebbloSafeLoader] Implement content-size-based batching (#24871 ) - Title: [PebbloSafeLoader] Implement content-size-based batching in the classification flow(loader/doc API) - Description: - Implemented content-size-based batching in the loader/doc API, set to 100KB with no external configuration option, intentionally hard-coded to prevent timeouts. - Remove unused field(pb_id) from doc_metadata - Issue: NA - Dependencies: NA - Add tests and docs: Updated	2024-07-31 09:10:28 -04:00
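Content-size-based batching, as described in the PebbloSafeLoader entry above, can be sketched as follows. This is an illustrative sketch under assumptions: `batch_by_size` and `MAX_BATCH_BYTES` are hypothetical names, and measuring size as UTF-8 byte length is an assumption; only the 100KB limit comes from the commit message.

```python
from typing import Iterable, Iterator, List

MAX_BATCH_BYTES = 100 * 1024  # 100KB, the hard-coded limit named in the commit


def batch_by_size(
    texts: Iterable[str], max_bytes: int = MAX_BATCH_BYTES
) -> Iterator[List[str]]:
    """Group texts into batches whose total UTF-8 size stays within max_bytes."""
    batch: List[str] = []
    batch_bytes = 0
    for text in texts:
        size = len(text.encode("utf-8"))
        # Close the current batch if adding this text would exceed the limit.
        if batch and batch_bytes + size > max_bytes:
            yield batch
            batch, batch_bytes = [], 0
        # A single text larger than max_bytes still gets a batch of its own.
        batch.append(text)
        batch_bytes += size
    if batch:
        yield batch


docs = ["a" * 60_000, "b" * 50_000, "c" * 10_000]
batches = list(batch_by_size(docs))
print([len(b) for b in batches])
```

Batching by payload size rather than by document count keeps each request bounded even when document lengths vary widely, which is the timeout concern the commit mentions.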
TrumanYan	096b66db4a	community: replace it with Tencent Cloud SDK (#24172 ) Description: The old method will be discontinued; use the official SDK for more model options. Issue: None Dependencies: None Twitter handle: None Co-authored-by: trumanyan <trumanyan@tencent.com>	2024-07-31 09:05:38 -04:00
Erick Friis	1f5444817a	community: deprecate BedrockEmbeddings in favor of langchain-aws (#24846 )	2024-07-30 23:13:17 +00:00
Shailendra Mishra	f2d810b3c0	clob_bugfix... (#24813 )	2024-07-30 12:44:04 -04:00
Anush	51b15448cc	community: Fix FastEmbedEmbeddings (#24462 ) ## Description This PR: - Fixes the validation error in `FastEmbedEmbeddings`. - Adds support for `batch_size`, `parallel` params. - Removes support for very old FastEmbed versions. - Updates the FastEmbed doc with the new params. Associated Issues: - Resolves #24039 - Resolves https://github.com/qdrant/fastembed/issues/296	2024-07-30 12:42:46 -04:00
ccurme	73ec24fc56	docs[patch]: add toolkit template (#24791 )	2024-07-30 12:36:09 -04:00
Igor Drozdov	c2706cfb9e	feat(community): add tools support for litellm (#23906 ) I used the following example to validate the behavior ```python from langchain_core.prompts import ChatPromptTemplate from langchain_core.runnables import ConfigurableField from langchain_anthropic import ChatAnthropic from langchain_community.chat_models import ChatLiteLLM from langchain_core.tools import tool from langchain.agents import create_tool_calling_agent, AgentExecutor @tool def multiply(x: float, y: float) -> float: """Multiply 'x' times 'y'.""" return x * y @tool def exponentiate(x: float, y: float) -> float: """Raise 'x' to the 'y'.""" return x**y @tool def add(x: float, y: float) -> float: """Add 'x' and 'y'.""" return x + y prompt = ChatPromptTemplate.from_messages([ ("system", "you're a helpful assistant"), ("human", "{input}"), ("placeholder", "{agent_scratchpad}"), ]) tools = [multiply, exponentiate, add] llm = ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0) # llm = ChatLiteLLM(model="claude-3-sonnet-20240229", temperature=0) agent = create_tool_calling_agent(llm, tools, prompt) agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True) agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", }) ``` `ChatAnthropic` version works: ``` > Entering new AgentExecutor chain... 
Invoking: `exponentiate` with `{'x': 5, 'y': 2.743}` responded: [{'text': 'To calculate 3 + 5^2.743, we can use the "exponentiate" and "add" tools:', 'type': 'text', 'index': 0}, {'id': 'toolu_01Gf54DFTkfLMJQX3TXffmxe', 'input': {}, 'name': 'exponentiate', 'type': 'tool_use', 'index': 1, 'partial_json': '{"x": 5, "y": 2.743}'}] 82.65606421491815 Invoking: `add` with `{'x': 3, 'y': 82.65606421491815}` responded: [{'id': 'toolu_01XUq9S56GT3Yv2N1KmNmmWp', 'input': {}, 'name': 'add', 'type': 'tool_use', 'index': 0, 'partial_json': '{"x": 3, "y": 82.65606421491815}'}] 85.65606421491815 Invoking: `add` with `{'x': 17.24, 'y': -918.1241}` responded: [{'text': '\n\nSo 3 + 5^2.743 = 85.66\n\nTo calculate 17.24 - 918.1241, we can use:', 'type': 'text', 'index': 0}, {'id': 'toolu_01BkXTwP7ec9JKYtZPy5JKjm', 'input': {}, 'name': 'add', 'type': 'tool_use', 'index': 1, 'partial_json': '{"x": 17.24, "y": -918.1241}'}] -900.8841[{'text': '\n\nTherefore, 17.24 - 918.1241 = -900.88', 'type': 'text', 'index': 0}] > Finished chain. ``` While `ChatLiteLLM` version doesn't. But with the changes in this PR, along with: - https://github.com/langchain-ai/langchain/pull/23823 - https://github.com/BerriAI/litellm/pull/4554 The result is _almost_ the same: ``` > Entering new AgentExecutor chain... Invoking: `exponentiate` with `{'x': 5, 'y': 2.743}` responded: To calculate 3 + 5^2.743, we can use the "exponentiate" and "add" tools: 82.65606421491815 Invoking: `add` with `{'x': 3, 'y': 82.65606421491815}` 85.65606421491815 Invoking: `add` with `{'x': 17.24, 'y': -918.1241}` responded: So 3 + 5^2.743 = 85.66 To calculate 17.24 - 918.1241, we can use: -900.8841 Therefore, 17.24 - 918.1241 = -900.88 > Finished chain. ``` Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-30 15:39:34 +00:00
David Robertson	bfb7f8d40a	Brave Search: Enhance search result details with extra snippets (#19209 ) Description: This update significantly improves the Brave Search Tool's utility within the LangChain library by enriching the search results it returns. The tool previously returned title, link, and snippet, with the snippet being a truncated 140-character description from the search engine. To make the search results more informative, this update enables extra_snippets by default and introduces additional result fields: title, link, description (enhancing and renaming the former snippet field), age, and snippets. The snippets field provides a list of strings summarizing the webpage, utilizing Brave's capability for more detailed search insights. This enhancement aims to make the search tool far more informative and beneficial for users. Issue: N/A Dependencies: No additional dependencies introduced. Twitter handle: @davidalexr987 Code Changes Summary: - Changed the default setting to include extra_snippets in search results. - Renamed the snippet field to description to accurately reflect its content and included an age field for search results. - Introduced a snippets field that lists webpage summaries, providing users with comprehensive search result insights. Backward Compatibility Note: The renaming of snippet to description improves the accuracy of the returned data field but may impact existing users who have developed integrations or analyses based on the snippet field. I believe this change is essential for clarity and utility, and it aligns better with the data provided by Brave Search. Additional Notes: This proposal focuses exclusively on the Brave Search package, without affecting other LangChain packages or introducing new dependencies.	2024-07-30 15:29:38 +00:00
Ben Chambers	435771fe74	[community]: Fix package name mismatch (#24824 ) - Description: fix a mismatch in pypi package names	2024-07-30 11:21:39 -04:00
maang-h	4bb1a11e02	community: Add MiniMaxChat bind_tools and structured output (#24310 ) - Description: - Add `bind_tools` method to support tool calling - Add `with_structured_output` method to support structured output	2024-07-29 15:51:52 -04:00
maang-h	bf685c242f	docs: Standardize QianfanEmbeddingsEndpoint (#24786 ) - Description: Standardize QianfanEmbeddingsEndpoint, including: - docstrings (issue #21983) - model init arg names (issue #20085)	2024-07-29 13:19:24 -04:00
M. Ali	c086410677	fix docs typos (#23668 ) Co-authored-by: mohblnk <mohamed.ali@blnk.ai> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-29 16:10:55 +00:00
Pere Pasamonte	98175860ad	community: Fix AWS DocumentDB similarity_search when filter is None (#24777 ) Description: Fixes DocumentDBVectorSearch similarity_search when no filter is used; it defaults to None but $match does not accept None, so changed default to empty {} before pipeline is created. Issue: AWS DocumentDB similarity search does not work when no filter is used. Error msg: "the match filter must be an expression in an object" #24775 Dependencies: No dependencies Twitter handle: https://x.com/perepasamonte --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-29 15:32:05 +00:00
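The DocumentDB fix above amounts to substituting an empty document for a missing filter before the aggregation pipeline is built. A minimal sketch, assuming a hypothetical `build_pipeline` helper; the stage order and the contents of the `$search` stage are illustrative placeholders, not the actual code in `DocumentDBVectorSearch`.

```python
from typing import Any, Dict, List, Optional


def build_pipeline(
    embedding: List[float],
    k: int = 4,
    filter: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline whose $match stage is always a dict."""
    # $match rejects None ("the match filter must be an expression in an
    # object"), so fall back to an empty document, which matches everything.
    match_stage: Dict[str, Any] = filter if filter is not None else {}
    return [
        {"$match": match_stage},
        # Placeholder vector-search stage; field names here are assumptions.
        {"$search": {"vectorSearch": {"vector": embedding, "k": k}}},
    ]


print(build_pipeline([0.1, 0.2])[0])
print(build_pipeline([0.1, 0.2], filter={"source": "faq"})[0])
```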
AmosDinh	c113682328	community:Add support for specifying document_loaders.firecrawl api url. (#24747 ) Add support for specifying document_loaders.firecrawl api url. This is mainly to support the [self-hosting](https://github.com/mendableai/firecrawl/blob/main/SELF_HOST.md) option firecrawl provides. Eg. now I can specify localhost:.... The corresponding firecrawl class already provides functionality to pass the argument. See here: `4c9d62f6d3/apps/python-sdk/firecrawl/firecrawl.py (L29)` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-28 14:30:36 -04:00
Haijian Wang	cda3025ee1	Integrating the Yi family of models. (#24491 ) - Description: This PR adds support for the Yi model to LangChain. - Dependencies: [langchain_core,requests,contextlib,typing,logging,json,langchain_community] - Twitter handle: 01.AI - Add tests and docs: I've added the corresponding documentation to the relevant paths --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-07-26 10:57:33 -07:00
Marc Gibbons	cc451effd1	community[patch]: langchain_community.vectorstores.azuresearch Raise LangChainException instead of bare Exception (#23935 ) Raise `LangChainException` instead of `Exception`. This alleviates the need for library users to use bare try/except to handle exceptions raised by `AzureSearch`. Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-26 15:59:06 +00:00
Diverrez morgan	c4d2a53f18	community: creation score_threshold in flashrank_rerank.py (#24016 ) Description: add an optional relevance score threshold to select only coherent documents, complementing top_n. Discussion: add relevance score threshold in flashrank_rerank document compressors #24013 Dependencies: no dependencies --------- Co-authored-by: Benjamin BERNARD <benjamin.bernard@openpathview.fr> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-26 13:34:39 +00:00
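How a relevance threshold can complement `top_n` is sketched below. `select_results` is a hypothetical helper operating on already-scored `(document, score)` pairs, and the inclusive `>=` comparison is an assumption; the actual comparison in `flashrank_rerank.py` may differ.

```python
from typing import List, Optional, Tuple


def select_results(
    scored: List[Tuple[str, float]],
    top_n: int = 3,
    score_threshold: Optional[float] = None,
) -> List[Tuple[str, float]]:
    """Keep the top_n highest-scoring items, then drop any below the threshold."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
    if score_threshold is None:
        return ranked
    # top_n caps how many results come back; the threshold removes weak
    # matches that would otherwise be returned just to fill the quota.
    return [pair for pair in ranked if pair[1] >= score_threshold]


scored = [("a", 0.9), ("b", 0.2), ("c", 0.6), ("d", 0.5)]
kept = select_results(scored, top_n=3, score_threshold=0.55)
print([name for name, _ in kept])
```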
Cong Peng	190988d93e	community: Add parameter `allow_dangerous_requests` to `WebResearchRetriever.from_llm` construct (#24712 ) Description: Avoids a ValueError when constructing the retriever via the `from_llm()` method.	2024-07-26 06:24:58 -07:00
monysun	5f593c172a	community: fix dashcope embeddings embed_query func post too much req to api (#24707 ) The embed_query function of the dashcope embeddings passes a str param, which causes the embed_with_retry function to send incorrect content to the API.	2024-07-26 12:44:07 +00:00
yonarw	b65ac8d39c	community[minor]: Self query retriever for HANA Cloud Vector Engine (#24494 ) Description: - This PR adds a self query retriever implementation for SAP HANA Cloud Vector Engine. The retriever supports all operators except for contains. - Issue: N/A - Dependencies: no new dependencies added Add tests and docs: Added integration tests to: libs/community/tests/unit_tests/query_constructors/test_hanavector.py Documentation for self query retriever: /docs/integrations/retrievers/self_query/hanavector_self_query.ipynb --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-07-26 06:56:51 +00:00
nobbbbby	4f3b4fc7fe	community[patch]: Extend Baichuan model with tool support (#24529 ) Description: Expanded the chat model functionality to support tools in the 'baichuan.py' file. Updated module imports and added tool object handling in message conversions. Additional changes include the implementation of tool binding and related unit tests. The alterations offer enhanced model capabilities by enabling interaction with tool-like objects. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-25 23:20:44 -07:00
Rave Harpaz	ee399e3ec5	community[patch]: Add OCI Generative AI tool and structured output support (#24693 ) - [x] PR title: community: Add OCI Generative AI tool and structured output support - [x] PR message: - Description: adding tool calling and structured output support for chat models offered by OCI Generative AI services. This is an update to our last PR 22880 with changes in /langchain_community/chat_models/oci_generative_ai.py - Issue: NA - Dependencies: NA - Twitter handle: NA - [x] Add tests and docs: 1. we have updated our unit tests 2. we have updated our documentation under /docs/docs/integrations/chat/oci_generative_ai.ipynb - [x] Lint and test: `make format`, `make lint` and `make test` we run successfully --------- Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com> Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>	2024-07-25 23:19:00 -07:00
Yuki Watanabe	2b6a262f84	community[patch]: Replace `filters` argument to `filter` in DatabricksVectorSearch (#24530 ) The [DatabricksVectorSearch](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/databricks_vector_search.py#L21) class exposes similarity search APIs with argument `filters`, which is inconsistent with other VS classes who uses `filter` (singular). This PR updates the argument and add alias for backward compatibility. --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>	2024-07-25 21:20:18 -07:00
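The rename with a backward-compatible alias described in the entry above can be sketched as follows. This is a minimal illustration, not the code from the PR: `similarity_search` here is a hypothetical standalone function, and emitting a `DeprecationWarning` for the old name is an assumption about how such an alias might behave.

```python
import warnings
from typing import Any, Dict, Optional


def similarity_search(
    query: str,
    filter: Optional[Dict[str, Any]] = None,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Hypothetical search entry point that prefers `filter` but still accepts `filters`."""
    if "filters" in kwargs:
        legacy = kwargs.pop("filters")
        # Assumption: the old spelling keeps working but warns the caller.
        warnings.warn(
            "`filters` is deprecated; use `filter` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # Only fall back to the legacy value when the new argument was not supplied.
        if filter is None:
            filter = legacy
    # Stand-in for the real search call: echo what would be sent to the index.
    return {"query": query, "filter": filter}
```

Callers using either spelling get the same result, so existing code does not break while new code converges on the singular `filter`.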
Sunish Sheth	59880a9147	community[patch]: mlflow handle empty chunk(#24689 )	2024-07-25 20:36:29 -07:00
Chaunte W. Lacewell	69eacaa887	Community[minor]: Update VDMS vectorstore (#23729 ) Description: - This PR exposes some functions in VDMS vectorstore, updates VDMS related notebooks, updates tests, and upgrade version of VDMS (>=0.0.20) Issue: N/A Dependencies: - Update vdms>=0.0.20	2024-07-25 22:13:04 -04:00
KyrianC	0fdbaf4a8d	community: fix ChatEdenAI + EdenAI Tools (#23715 ) Fixes for Eden AI Custom tools and ChatEdenAI: - add missing import in __init__ of chat_models - add `args_schema` to custom tools. otherwise '__arg1' would sometimes be passed to the `run` method - fix IndexError when no human msg is added in ChatEdenAI	2024-07-25 15:19:14 -04:00
maang-h	38d30e285a	docs: Standardize BaichuanTextEmbeddings docstrings (#24674 ) - Description: Standardize BaichuanTextEmbeddings docstrings. - Issue: the issue #21983	2024-07-25 12:12:00 -04:00
rick-SOPTIM	cd563fb628	community[minor]: passthrough auth parameter on requests to Ollama-LLMs (#24068 ) Thank you for contributing to LangChain! Description: This PR allows users of `langchain_community.llms.ollama.Ollama` to specify the `auth` parameter, which is then forwarded to all internal calls of `requests.request`. This works in the same way as the existing `headers` parameters. The auth parameter enables the usage of the given class with Ollama instances, which are secured by more complex authentication mechanisms, that do not only rely on static headers. An example are AWS API Gateways secured by the IAM authorizer, which expects signatures dynamically calculated on the specific HTTP request. Issue: Integrating a remote LLM running through Ollama using `langchain_community.llms.ollama.Ollama` only allows setting static HTTP headers with the parameter `headers`. This does not work, if the given instance of Ollama is secured with an authentication mechanism that makes use of dynamically created HTTP headers which for example may depend on the content of a given request. Dependencies: None Twitter handle: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-25 15:48:35 +00:00
Luca Dorigo	5fdbdd6bec	community[patch]: Fix invalid iohttp verify parameter (#24655 ) Should fix https://github.com/langchain-ai/langchain/issues/24654	2024-07-25 11:09:21 -04:00
Oleg Kulyk	4b1b7959a2	community[minor]: Add ScrapingAnt Loader Community Integration (#24514 ) Added [ScrapingAnt](https://scrapingant.com/) Web Loader integration. ScrapingAnt is a web scraping API that allows extracting web page data into accessible and well-formatted markdown. Description: Added ScrapingAnt web loader for retrieving web page data as markdown Dependencies: scrapingant-client Twitter: @WeRunTheWorld3 --------- Co-authored-by: Oleg Kulyk <oleg@scrapingant.com>	2024-07-24 21:11:43 -04:00
John	d59c656ea5	unstructured, community, initialize langchain-unstructured package (#22779 ) #### Update (2): A single `UnstructuredLoader` is added to handle both local and api partitioning. This loader also handles single or multiple documents. #### Changes in `community`: Changes here do not affect users. In the initial process of using the SDK for the API Loaders, the Loaders in community were refactored. Other changes include: The `UnstructuredBaseLoader` has a new check to see if both `mode="paged"` and `chunking_strategy="by_page"`. It also now has `Element.element_id` added to the `Document.metadata`. `UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. As such, now both directly inherit from `UnstructuredBaseLoader` and initialize their `file_path`/`file` attributes respectively and implement their own `_post_process_elements` methods. -------- #### Update: New SDK Loaders in a [partner package](https://python.langchain.com/v0.1/docs/contributing/integrations/#partner-package-in-langchain-repo) are introduced to prevent breaking changes for users (see discussion below). ##### TODO: - [x] Test docstring examples -------- - Description: UnstructuredAPIFileIOLoader and UnstructuredAPIFileLoader calls to the unstructured api are now made using the unstructured-client sdk. - New Dependencies: unstructured-client - [x] Add tests and docs: If you're adding a new integration, please include - [x] a test for the integration, preferably unit tests that do not rely on network access, - [x] update the description in `docs/docs/integrations/providers/unstructured.mdx` - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ TODO: - [x] Update https://python.langchain.com/v0.1/docs/integrations/document_loaders/unstructured_file/#unstructured-api - `langchain/docs/docs/integrations/document_loaders/unstructured_file.ipynb` - The description here needs to indicate that users should install `unstructured-client` instead of `unstructured`. Read over closely to look for any other changes that need to be made. - [x] Update the `lazy_load` method in `UnstructuredBaseLoader` to handle json responses from the API instead of just lists of elements. - This method may need to be overwritten by the API loaders instead of changing it in the `UnstructuredBaseLoader`. - [x] Update the documentation links in the class docstrings (the Unstructured documents have moved) - [x] Update Document.metadata to include `element_id` (see thread [here](https://unstructuredw-kbe4326.slack.com/archives/C044N0YV08G/p1718187499818419)) --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com>	2024-07-24 23:21:20 +00:00
Eugene Yurtsev	b55f6105c6	community[patch]: Add linter to prevent further usage of root_validator and validator (#24613 ) This linter is meant to move development to use __init__ instead of root_validator and validator. We need to investigate whether we need to lint some of the functionality of Field (e.g., `lt` and `gt`, `alias`) `alias` is the one that's most popular: (community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep " Field(" \| grep "alias=" \| wc -l 144 (community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep " Field(" \| grep "ge=" \| wc -l 10 (community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep " Field(" \| grep "gt=" \| wc -l 4	2024-07-24 12:35:21 -04:00
Anindyadeep	12c3454fd9	[Community] PremAI Tool Calling Functionality (#23931 ) This PR is under WIP and adds the following functionalities: - [X] Supports tool calling across the langchain ecosystem. (However streaming is not supported) - [X] Update documentation	2024-07-24 09:53:58 -04:00
Vishnu Nandakumar	e271965d1e	community: retrievers: added capability for using Product Quantization as one of the retriever. (#22424 ) - [ ] Community: "Retrievers: Product Quantization" - [X] This PR adds Product Quantization feature to the retrievers to the Langchain Community. PQ is one of the fastest retrieval methods if the embeddings are rich enough in context due to the concepts of quantization and representation through centroids - Description: Adding PQ as one of the retrievers - Dependencies: using the package nanopq for this PR - Twitter handle: vishnunkumar_ - [X] Add tests and docs: If you're adding a new integration, please include - [X] Added unit tests for the same in the retrievers. - [] Will add an example notebook subsequently - [X] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ - done the same --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-24 13:52:15 +00:00
stydxm	b9bea36dd4	community: fix typo in warning message (#24597 ) - Description: This PR fixes a small typo in a warning message - Issue: ![](https://github.com/user-attachments/assets/5aa57724-26c5-49f6-8bc1-5a54bb67ed49) There were double `Use` and double `instead`	2024-07-24 13:19:07 +00:00
cüre	da06d4d7af	community: update finetuned model cost for 4o-mini (#24605 ) - Description: adds model price for. reference: https://openai.com/api/pricing/ - Issue: - - Dependencies: - - Twitter handle: cureef	2024-07-24 13:17:26 +00:00
ZhangShenao	ad18afc3ec	community[patch]: Fix param spelling error in `ElasticsearchChatMessageHistory` (#24589 ) Fix param spelling error in `ElasticsearchChatMessageHistory`	2024-07-23 19:29:42 -07:00
Aayush Kataria	0f45ac4088	LangChain Community: VectorStores: Azure Cosmos DB Filtered Vector Search (#24087 ) This PR adds vector search filtering for Azure Cosmos DB Mongo vCore and NoSQL.	2024-07-23 16:59:23 -07:00
Carlos André Antunes	325068bb53	community: Fix azure_openai.py (#24572 ) Some lines tried to read a key that does not exist yet. In these cases I changed the direct access to the dict.get() method - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-07-23 16:22:21 -04:00
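The difference between direct key access and `dict.get()` that this fix relies on can be shown with a small sketch. The key name `deployment_name` and the helper functions are hypothetical and chosen only for illustration; they are not taken from `azure_openai.py`.

```python
from typing import Any, Dict, Optional


def read_deployment_strict(values: Dict[str, Any]) -> str:
    # Direct access raises KeyError when the key has not been set yet.
    return values["deployment_name"]


def read_deployment_safe(values: Dict[str, Any]) -> Optional[str]:
    # dict.get() returns None (or a chosen default) instead of raising.
    return values.get("deployment_name")
```

With an empty dictionary, the strict version raises `KeyError`, while the safe version returns `None` and lets later validation decide what to do.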
Bagatur	8691a5a37f	community[patch]: Release 0.2.10 (#24560 )	2024-07-23 09:24:57 -07:00
Ben Chambers	e80b0932ee	community[patch]: small fixes to link extractors (#24528 ) - Description: small fixes to imports / types in the link extraction work --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-23 14:28:06 +00:00
Morteza Hosseini	9e06991aae	community[patch]: Update URL to the 2markdown API (#24546 ) Update the URL to Markdown endpoint. API information is available here: https://2markdown.com/docs#url2md	2024-07-23 14:27:55 +00:00
maang-h	378db2e1a5	docs: Add RedisChatMessageHistory docstrings (#24548 ) - Description: Add `RedisChatMessageHistory ` rich docstrings. - Issue: the issue #21983 Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-23 14:23:46 +00:00
Ben Chambers	a5a3d28776	community[patch]: Remove targets_table from C* GraphVectorStore (#24502 ) - Description: Remove the unnecessary `targets_table` parameter	2024-07-22 22:09:36 -04:00
Alexander Golodkov	2a70a07aad	community[minor]: added new document loaders based on dedoc library (#24303 ) ### Description This pull request added new document loaders to load documents of various formats using [Dedoc](https://github.com/ispras/dedoc): - `DedocFileLoader` (determine file types automatically and parse) - `DedocPDFLoader` (for `PDF` and images parsing) - `DedocAPIFileLoader` (determine file types automatically and parse using Dedoc API without library installation) [Dedoc](https://dedoc.readthedocs.io) is an open-source library/service that extracts texts, tables, attached files and document structure (e.g., titles, list items, etc.) from files of various formats. The library is actively developed and maintained by a group of developers. `Dedoc` supports `DOCX`, `XLSX`, `PPTX`, `EML`, `HTML`, `PDF`, images and more. Full list of supported formats can be found [here](https://dedoc.readthedocs.io/en/latest/#id1). For `PDF` documents, `Dedoc` allows to determine textual layer correctness and split the document into paragraphs. ### Issue This pull request extends variety of document loaders supported by `langchain_community` allowing users to choose the most suitable option for raw documents parsing. ### Dependencies The PR added a new (optional) dependency `dedoc>=2.2.5` ([library documentation](https://dedoc.readthedocs.io)) to the `extended_testing_deps.txt` ### Twitter handle None ### Add tests and docs 1. Test for the integration: `libs/community/tests/integration_tests/document_loaders/test_dedoc.py` 2. Example notebook: `docs/docs/integrations/document_loaders/dedoc.ipynb` 3. Information about the library: `docs/docs/integrations/providers/dedoc.mdx` ### Lint and test Done locally: - `make format` - `make lint` - `make integration_tests` - `make docs_build` (from the project root) --------- Co-authored-by: Nasty <bogatenkova.anastasiya@mail.ru>	2024-07-23 02:04:53 +00:00
Ben Chambers	5ac936a284	community[minor]: add document transformer for extracting links (#24186 ) - Description: Add a DocumentTransformer for executing one or more `LinkExtractor`s and adding the extracted links to each document. - Issue: n/a - Dependencies: none --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-22 22:01:21 -04:00
Erick Friis	3dce2e1d35	all: add release notes to pypi (#24519 )	2024-07-22 13:59:13 -07:00
Bagatur	236e957abb	core,groq,openai,mistralai,robocorp,fireworks,anthropic[patch]: Update BaseModel subclass and instance checks to handle both v1 and proper namespaces (#24417 ) After this PR chat models will correctly handle pydantic 2 with bind_tools and with_structured_output. ```python import pydantic print(pydantic.__version__) ``` 2.8.2 ```python from langchain_openai import ChatOpenAI from pydantic import BaseModel, Field class Add(BaseModel): x: int y: int model = ChatOpenAI().bind_tools([Add]) print(model.invoke('2 + 5').tool_calls) model = ChatOpenAI().with_structured_output(Add) print(type(model.invoke('2 + 5'))) ``` ``` [{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_PNUFa4pdfNOYXxIMHc6ps2Do', 'type': 'tool_call'}] <class '__main__.Add'> ``` ```python from langchain_openai import ChatOpenAI from pydantic.v1 import BaseModel, Field class Add(BaseModel): x: int y: int model = ChatOpenAI().bind_tools([Add]) print(model.invoke('2 + 5').tool_calls) model = ChatOpenAI().with_structured_output(Add) print(type(model.invoke('2 + 5'))) ``` ```python [{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_hhiHYP441cp14TtrHKx3Upg0', 'type': 'tool_call'}] <class '__main__.Add'> ``` Addresses issues: https://github.com/langchain-ai/langchain/issues/22782 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-22 20:07:39 +00:00
Naka Masato	884f76e05a	fix: load google credentials properly in GoogleDriveLoader (#12871 ) - Description: - Fix #12870: set scope in `default` func (ref: https://google-auth.readthedocs.io/en/master/reference/google.auth.html) - Moved the code to load default credentials to the bottom for clarity of the logic - Add docstring and comment for each credential loading logic - Issue: https://github.com/langchain-ai/langchain/issues/12870 - Dependencies: no dependencies change - Twitter handle: @gymnstcs --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-22 17:43:33 +00:00
Jorge Piedrahita Ortiz	10e3982b59	community: sambanova integration minor changes (#24503 ) - Minor changes in sambanova llm integration - default api - docstrings - minor changes in docs	2024-07-22 17:06:35 +00:00
maang-h	721f709dec	community: Improve QianfanChatEndpoint tool result to model (#24466 ) - Description: `QianfanChatEndpoint` When using tool result to answer questions, the content of the tool is required to be in Dict format. Of course, this can require users to return Dict format when calling the tool, but in order to be consistent with other Chat Models, I think such modifications are necessary.	2024-07-22 11:29:00 -04:00
ccurme	dcba7df2fe	community[patch]: deprecate langchain_community Chroma in favor of langchain_chroma (#24474 )	2024-07-22 11:00:13 -04:00
Mohammad Mohtashim	5ade0187d0	[Community]: Prompts Fixed for ZERO_SHOT_REACT React Agent Type in `create_sql_agent` function (#23693 ) - Description: The correct prompts for ZERO_SHOT_REACT were not being used in the `create_sql_agent` function. The specific `SQL_PREFIX` and `SQL_SUFFIX` prompts were not used when the client did not provide any prompts. This is fixed. - Issue: #23585 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-22 14:04:20 +00:00
ZhangShenao	0f6737cbfe	[Vector Store] Fix function `add_texts` in `TencentVectorDB` (#24469 ) Regardless of whether `embedding_func` is set or not, the 'text' attribute of document should be assigned, otherwise the `page_content` in the document of the final search result will be lost	2024-07-22 09:50:22 -04:00
clement.l	d98b830e4b	community: add flag to toggle progress bar (#24463 ) - Description: Add a flag to determine whether to show progress bar - Issue: n/a - Dependencies: n/a - Twitter handle: n/a --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-20 13:18:02 +00:00
chuanbei888	6b08a33fa4	community: fix QianfanChatEndpoint default model (#24464 ) the baidu_qianfan_endpoint has been changed from ERNIE-Bot-turbo to ERNIE-Lite-8K	2024-07-20 13:00:29 +00:00
maang-h	7b28359719	docs: Add ChatSparkLLM docstrings (#24449 ) - Description: - Add `ChatSparkLLM` docstrings, the issue #22296 - To support `stream` method	2024-07-19 20:19:14 -07:00
Erick Friis	f4ee3c8a22	infra: add min version testing to pr test flow (#24358 ) xfailing some sql tests that do not currently work on sqlalchemy v1 #22207 was very much not sqlalchemy v1 compatible. Moving forward, implementations should be compatible with both to pass CI	2024-07-19 22:03:19 +00:00
Bagatur	842065a9cc	community[patch]: Release 0.2.9 (#24453 )	2024-07-19 12:50:22 -07:00
Bagatur	dda9438e87	community[patch]: gpt-4o-mini costs (#24421 )	2024-07-19 19:02:44 +00:00
Eugene Yurtsev	604dfe2d99	community[patch]: Force opt-in for WebResearchRetriever (CVE-2024-3095) (#24451 ) This PR addresses the issue raised by (CVE-2024-3095) https://huntr.com/bounties/e62d4895-2901-405b-9559-38276b6a5273 Unfortunately, we didn't do a good job writing the initial report. It's pointing at both the wrong package and the wrong code. The affected code is the Web Retriever not the AsyncHTMLLoader, and the WebRetriever lives in langchain-community The vulnerable code lives here: `0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L233-L233)` This PR adds a forced opt-in for users to make sure they are aware of the risk and can mitigate by configuring a proxy: `0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L84-L84)`	2024-07-19 18:51:35 +00:00
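The forced opt-in pattern described in the entry above can be sketched roughly as below. `WebFetcher` is a hypothetical stand-in rather than the actual `WebResearchRetriever`, and the error message is invented for illustration; only the `allow_dangerous_requests` flag name and the `ValueError` behaviour come from the commit messages in this log.

```python
class WebFetcher:
    """Hypothetical class that refuses to be constructed without an explicit opt-in."""

    def __init__(self, allow_dangerous_requests: bool = False) -> None:
        if not allow_dangerous_requests:
            # Fail early so the caller has to acknowledge the risk explicitly.
            raise ValueError(
                "This component can issue requests to arbitrary URLs. "
                "Pass allow_dangerous_requests=True after mitigating the risk, "
                "for example by routing traffic through a proxy."
            )
        self.allow_dangerous_requests = allow_dangerous_requests
```

Defaulting the flag to `False` means existing callers see an immediate, descriptive error instead of silently keeping the risky behaviour.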
Asi Greenholts	372c27f2e5	community[minor]: [GoogleApiYoutubeLoader] Replace API used in _get_document_for_channel from search to playlistItem (#24034 ) - Description: Search has a limit of 500 results, playlistItems doesn't. Added a class in except clause to catch another common error. - Issue: None - Dependencies: None - Twitter handle: @TupleType --------- Co-authored-by: asi-cider <88270351+asi-cider@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 14:04:34 -04:00
Rafael Pereira	6a45bf9554	community[minor]: GraphCypherQAChain to accept additional inputs as provided by the user for cypher generation (#24300 ) Description: This PR introduces a change to the `cypher_generation_chain` to dynamically concatenate inputs. This improvement aims to streamline the input handling process and make the method more flexible. The change involves updating the arguments dictionary with all elements from the `inputs` dictionary, ensuring that all necessary inputs are dynamically appended. This will ensure that any cypher generation template will not require a new `_call` method patch. Issue: This PR fixes issue #24260.	2024-07-19 14:03:14 -04:00
Philippe PRADOS	f5856680fe	community[minor]: add mongodb byte store (#23876 ) The `MongoDBStore` can manage only documents. It's not possible to use MongoDB for an `CacheBackedEmbeddings`. With this new implementation, it's possible to use: ```python CacheBackedEmbeddings.from_bytes_store( underlying_embeddings=embeddings, document_embedding_cache=MongoDBByteStore( connection_string=db_uri, db_name=db_name, collection_name=collection_name, ), ) ``` and use MongoDB to cache the embeddings !	2024-07-19 13:54:12 -04:00
yabooung	07715f815b	community[minor]: Add ability to specify file encoding and json encoding for FileChatMessageHistory (#24258 ) Description: Add UTF-8 encoding support Issue: Inability to properly handle characters from certain languages (e.g., Korean) Fix: Implement UTF-8 encoding in FileChatMessageHistory --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 13:53:21 -04:00
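A rough sketch of why explicit file and JSON encoding settings matter for non-ASCII chat history. The helper functions and their parameters are assumptions made for this example and do not reproduce the `FileChatMessageHistory` implementation.

```python
import json
from pathlib import Path
from typing import Dict, List


def save_messages(
    path: Path,
    messages: List[Dict[str, str]],
    encoding: str = "utf-8",
    ensure_ascii: bool = False,
) -> None:
    # ensure_ascii=False keeps characters such as Korean text readable in the file
    # instead of writing them as \uXXXX escape sequences.
    path.write_text(json.dumps(messages, ensure_ascii=ensure_ascii), encoding=encoding)


def load_messages(path: Path, encoding: str = "utf-8") -> List[Dict[str, str]]:
    # Reading with the same explicit encoding avoids platform-dependent defaults.
    return json.loads(path.read_text(encoding=encoding))
```

Passing the encoding explicitly on both the write and the read side keeps the round trip stable on systems whose default encoding is not UTF-8.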
Dristy Srivastava	020cc1cf3e	Community[minor]: Added checksum while sending data to pebblo-cloud (#23968 ) - Description: - Updated checksum in doc metadata - Sending checksum and removing actual content, while sending data to `pebblo-cloud` if `classifier-location` is `pebblo-cloud` in `/loader/doc` API - Adding `pb_id` i.e. pebblo id to doc metadata - Refactoring as needed. - Sending `content-checksum` and removing actual content, while sending data to `pebblo-cloud` if `classifier-location` is `pebblo-cloud` in `prompt` API - Issue: NA - Dependencies: NA - Tests: Updated - Docs NA --------- Co-authored-by: dristy.cd <dristy@clouddefense.io>	2024-07-19 13:52:54 -04:00
keval dekivadiya	06f47678ae	community[minor]: Add TextEmbed Embedding Integration (#22946 ) Description: TextEmbed is a high-performance embedding inference server designed to provide a high-throughput, low-latency solution for serving embeddings. It supports various sentence-transformer models and includes the ability to deploy image and text embedding models. TextEmbed offers flexibility and scalability for diverse applications. - PyPI Package: [TextEmbed on PyPI](https://pypi.org/project/textembed/) - Docker Image: [TextEmbed on Docker Hub](https://hub.docker.com/r/kevaldekivadiya/textembed) - GitHub Repository: [TextEmbed on GitHub](https://github.com/kevaldekivadiya2415/textembed) PR Description This PR adds functionality for embedding documents and queries using the `TextEmbedEmbeddings` class. The implementation allows for both synchronous and asynchronous embedding requests to a TextEmbed API endpoint. The class handles batching and permuting of input texts to optimize the embedding process. Example Usage: ```python from langchain_community.embeddings import TextEmbedEmbeddings # Initialise the embeddings class embeddings = TextEmbedEmbeddings(model="your-model-id", api_key="your-api-key", api_url="your_api_url") # Define a list of documents documents = [ "Data science involves extracting insights from data.", "Artificial intelligence is transforming various industries.", "Cloud computing provides scalable computing resources over the internet.", "Big data analytics helps in understanding large datasets.", "India has a diverse cultural heritage." ] # Define a query query = "What is the cultural heritage of India?" 
# Embed all documents document_embeddings = embeddings.embed_documents(documents) # Embed the query query_embedding = embeddings.embed_query(query) # Print embeddings for each document for i, embedding in enumerate(document_embeddings): print(f"Document {i+1} Embedding:", embedding) # Print the query embedding print("Query Embedding:", query_embedding) ``` --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-19 17:30:25 +00:00
Andrew Benton	f9d64d22e5	community[minor]: Add Riza Python/JS code execution tool (#23995 ) - Description: Add Riza Python/JS code execution tool - Issue: N/A - Dependencies: an optional dependency on the `rizaio` pypi package - Twitter handle: [@rizaio](https://x.com/rizaio) [Riza](https://riza.io) is a safe code execution environment for agent-generated Python and JavaScript that's easy to integrate into langchain apps. This PR adds two new tool classes to the community package.	2024-07-19 17:03:22 +00:00
Ben Chambers	3691701d58	community[minor]: Add keybert-based link extractor (#24311 ) - Description: Add a `KeybertLinkExtractor` for graph vectorstores. This allows extracting links from keywords in a Document and linking nodes that have common keywords. - Issue: None - Dependencies: None. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-19 12:25:07 -04:00
Ben Chambers	83f3d95ffa	community[minor]: GLiNER link extraction (#24314 ) - Description: This allows extracting links between documents with common named entities using [GLiNER](https://github.com/urchade/GLiNER). - Issue: None - Dependencies: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 15:34:54 +00:00
Anas Khan	b5acb91080	Mask API keys for various LLM/ChatModel Modules (#13885 ) Description: - Added masking of the API Keys for the modules: - `langchain/chat_models/openai.py` - `langchain/llms/openai.py` - `langchain/llms/google_palm.py` - `langchain/chat_models/google_palm.py` - `langchain/llms/edenai.py` - Updated the modules to utilize `SecretStr` from pydantic to securely manage API key. - Added unit/integration tests - `langchain/chat_models/azure_openai.py` used the `open_api_key` that is derived from the `ChatOpenAI` Class and it was assuming `openai_api_key` is a str so we changed it to expect `SecretStr` instead. Issue: https://github.com/langchain-ai/langchain/issues/12165 , Dependencies: none, Tag maintainer: @eyurtsev --------- Co-authored-by: HassanA01 <anikeboss@gmail.com> Co-authored-by: Aneeq Hassan <aneeq.hassan@utoronto.ca> Co-authored-by: kristinspenc <kristinspenc2003@gmail.com> Co-authored-by: faisalt14 <faisalt14@gmail.com> Co-authored-by: Harshil-Patel28 <76663814+Harshil-Patel28@users.noreply.github.com> Co-authored-by: kristinspenc <146893228+kristinspenc@users.noreply.github.com> Co-authored-by: faisalt14 <90787271+faisalt14@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-19 15:23:34 +00:00
ccurme	f99369a54c	community[patch]: fix formatting (#24443 ) Somehow this got through CI: https://github.com/langchain-ai/langchain/pull/24363	2024-07-19 14:38:53 +00:00
Ben Chambers	242b085be7	Merge pull request #24315 * community: Add Hierarchy link extractor * add example * lint	2024-07-19 09:42:26 -04:00
Rhuan Barros	c3308f31bc	Merge pull request #24363 * important email fields	2024-07-19 09:41:20 -04:00
Han Sol Park	aade9bfde5	Mask API key for ChatOpenAI based chat_models (#14293 ) - Description: Mask API key for ChatOpenAi based chat_models (openai, azureopenai, anyscale, everlyai). Made changes to all chat_models that are based on ChatOpenAI since all of them assumes that openai_api_key is str rather than SecretStr. - Issue:: #12165 - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: N/A --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-19 02:25:38 +00:00
Eun Hye Kim	07c5c60f63	community: fix tool appending logic and update planner prompt in OpenAPI agent toolkit (#24384 ) Description: - Updated the format for the 'Action' section in the planner prompt to ensure it must be one of the tools without additional words. Adjusted the phrasing from "should be" to "must be" for clarity and enforceability. - Corrected the tool appending logic in the `_create_api_controller_agent` function to ensure that `RequestsDeleteToolWithParsing` and `RequestsPatchToolWithParsing` are properly added to the tools list for "DELETE" and "PATCH" operations. Issue: #24382 Dependencies: None Twitter handle: @lunara_x --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-18 13:37:46 +00:00
Chen Xiabin	63c60a31f0	[fix] baidu qianfan AIMessage with usage_metadata (#24389 ) Fixes an error when building the AIMessage usage_metadata	2024-07-18 09:28:16 -04:00
Rajendra Kadam	1c65529fd7	community[minor]: [PebbloSafeLoader] Rename loader type and add SharePointLoader to supported loaders (#24393 ) Thank you for contributing to LangChain! - [x] PR title: [PebbloSafeLoader] Rename loader type and add SharePointLoader to supported loaders - Description: Minor fixes in the PebbloSafeLoader: - Renamed the loader type from `remote_db` to `cloud_folder`. - Added `SharePointLoader` to the list of loaders supported by PebbloSafeLoader. - Issue: NA - Dependencies: NA - [x] Add tests and docs: NA	2024-07-18 08:23:12 -04:00
Paolo Ráez	0dec72cab0	Community[patch]: Missing "stream" parameter in cloudflare_workersai (#23987 ) ### Description Missing "stream" parameter. Without it, you'd never receive a stream of tokens when using stream() or astream() ### Issue No existing issue available	2024-07-18 02:09:39 +00:00
Brice Fotzo	034a8c7c1b	community: support advanced text extraction options for pdf documents (#20265 ) Description: - Updated constructors in PyPDFParser and PyPDFLoader to handle `extraction_mode` and additional kwargs, aligning with the capabilities of `PageObject.extract_text()` from pypdf. - Added `test_pypdf_loader_with_layout` along with a corresponding example text file to validate layout extraction from PDFs. Issue: fixes #19735 Dependencies: This change requires updating the pypdf dependency from version 3.4.0 to at least 4.0.0. Additional changes include the addition of a new test test_pypdf_loader_with_layout and an example text file to ensure the functionality of layout extraction from PDFs aligns with the new capabilities. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-17 20:47:09 +00:00
Bagatur	b5360e2e5f	community[patch]: Release 0.2.8 (#24354 )	2024-07-17 17:07:27 +00:00
Luis Moros	bcb5f354ad	community: Fix SQLDatabase.from_databricks issue when run from Job (#24346 ) - Description: When SQLDatabase.from_databricks is run from a Databricks Workflow job, line 205 (default_host = context.browserHostName) throws an ``AttributeError`` because the ``context`` object has no ``browserHostName`` attribute. The fix handles the exception and sets the ``default_host`` variable to null --------- Co-authored-by: lmorosdb <lmorosdb> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-17 12:40:12 -04:00
Rafael Pereira	cf28708e7b	Neo4j: Update with non-deprecated cypher methods, and new method to associate relationship embeddings (#23725 ) Description: At the moment the neo4j wrapper uses setVectorProperty, which is deprecated ([link](https://neo4j.com/docs/operations-manual/5/reference/procedures/#procedure_db_create_setVectorProperty)). I replaced it with the non-deprecated version. Neo4j recently introduced a new cypher method, "setRelationshipVectorProperty", to associate embeddings with relationships. In this PR I also implemented a new method to perform this association, keeping the same format as the "add_embeddings" method, which associates embeddings with Nodes. I also included a test case for this new method.	2024-07-17 12:37:47 -04:00
maang-h	2a3288b15d	docs: Add ChatBaichuan docstrings (#24348 ) - Description: Add ChatBaichuan rich docstrings. - Issue: #22296	2024-07-17 12:00:16 -04:00
Rafael Pereira	fc41730e28	neo4j: Fix test for order-insensitive comparison and floating-point precision issues (#24338 ) Description: This PR addresses two main issues in the `test_neo4jvector.py`: 1. Order-insensitive Comparison: Modified the `test_retrieval_dictionary` to ensure that it passes regardless of the order of returned values by parsing `page_content` into a structured format (dictionary) before comparison. 2. Floating-point Precision: Updated `test_neo4jvector_relevance_score` to handle minor floating-point precision differences by using the `isclose` function for comparing relevance scores with a relative tolerance. Errors addressed: - test_neo4jvector_relevance_score: ``` AssertionError: assert [(Document(page_content='foo', metadata={'page': '0'}), 1.0000014305114746), (Document(page_content='bar', metadata={'page': '1'}), 0.9998371005058289), (Document(page_content='baz', metadata={'page': '2'}), 0.9993508458137512)] == [(Document(page_content='foo', metadata={'page': '0'}), 1.0), (Document(page_content='bar', metadata={'page': '1'}), 0.9998376369476318), (Document(page_content='baz', metadata={'page': '2'}), 0.9993523359298706)] At index 0 diff: (Document(page_content='foo', metadata={'page': '0'}), 1.0000014305114746) != (Document(page_content='foo', metadata={'page': '0'}), 1.0) Full diff: - [(Document(page_content='foo', metadata={'page': '0'}), 1.0), + [(Document(page_content='foo', metadata={'page': '0'}), 1.0000014305114746), ? +++++++++++++++ - (Document(page_content='bar', metadata={'page': '1'}), 0.9998376369476318), ? ^^^ ------ + (Document(page_content='bar', metadata={'page': '1'}), 0.9998371005058289), ? ^^^^^^^^^ - (Document(page_content='baz', metadata={'page': '2'}), 0.9993523359298706), ? ---------- + (Document(page_content='baz', metadata={'page': '2'}), 0.9993508458137512), ? ++++++++++ ] ``` - test_retrieval_dictionary: ``` AssertionError: assert [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nname: John\nage: 30\n')] == [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n')] At index 0 diff: Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nname: John\nage: 30\n') != Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n') Full diff: - [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n')] ? --------- + [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: John\nage: 30\n')] ? +++++++++ ```	2024-07-17 09:28:25 -04:00
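The two test fixes described in the last entry (#24338) can be illustrated with a minimal sketch. This is not the code from that PR: `parse_page_content` is a hypothetical helper written for this example, and the relative tolerance of `1e-5` is an assumed value. The strings and scores are taken from the assertion errors quoted in the commit message.

```python
import math


def parse_page_content(text):
    """Hypothetical helper: parse 'key: value' and '- item' lines into a dict."""
    result = {}
    current_key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("- "):
            # List item belonging to the most recent key with an empty value.
            result[current_key].append(line[2:])
        else:
            key, _, value = line.partition(":")
            value = value.strip()
            current_key = key
            # A key with no inline value starts a list (e.g. "skills:").
            result[key] = value if value else []
    return result


# Order-insensitive comparison: the raw strings differ, the parsed dicts do not.
actual = "skills:\n- Python\n- Data Analysis\n- Machine Learning\nname: John\nage: 30\n"
expected = "skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n"
contents_match = parse_page_content(actual) == parse_page_content(expected)

# Floating-point comparison with a relative tolerance (1e-5 is an assumption).
actual_scores = [1.0000014305114746, 0.9998371005058289, 0.9993508458137512]
expected_scores = [1.0, 0.9998376369476318, 0.9993523359298706]
scores_match = all(
    math.isclose(a, e, rel_tol=1e-5) for a, e in zip(actual_scores, expected_scores)
)
```

Comparing parsed dictionaries removes the dependence on key order, and `math.isclose` accepts score differences on the order of 1e-6 that an exact `==` rejects.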


1645 Commits