langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-06-22 06:39:52 +00:00

Author	SHA1	Message	Date
Noah Mayerhofer	0091947efd	community: add retry for session expired exception in neo4j (#25660 ) Description: The neo4j driver can raise a SessionExpired error, which is considered a retriable error. If a query fails with a SessionExpired error, this change retries every query once. This change will make the neo4j integration less flaky. Twitter handle: noahmay_	2024-08-22 13:07:36 +00:00
Dristy Srivastava	b002702af6	[Community][minor]: Updating metadata with full_path in SharePoint loader (#25593 ) - Description: Updating metadata for sharepoint loader with full path i.e., webUrl - Issue: NA - Dependencies: NA - Tests: NA - Docs NA Co-authored-by: dristy.cd <dristy@clouddefense.io> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-21 13:10:14 +00:00
Jabir	12e490ea56	Update azuresearch.py (#25577 ) This will allow complextype metadata to be returned. the current implementation throws error when dealing with nested metadata Thank you for contributing to LangChain! - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-20 12:53:30 +00:00
Erick Friis	e01c6789c4	core,community: add beta decorator to missed GraphVectorStore extensions (#25562 )	2024-08-19 17:29:09 -07:00
maang-h	015ab91b83	community[patch]: Add ToolMessage for ChatZhipuAI (#25547 ) - Description: Add ToolMessage for `ChatZhipuAI` to solve the issue #25490	2024-08-19 11:26:38 -04:00
Mohammad Mohtashim	75c3c81b8c	[Community]: Fix - Open AI Whisper `client.audio.transcriptions` returning Text Object which raises error (#25271 ) - Description: The following [line](`fd546196ef/libs/community/langchain_community/document_loaders/parsers/audio.py (L117)`) in `OpenAIWhisperParser` returns a text object for some odd reason despite the official documentation saying it should return `Transcript` Instance which should have the text attribute. But for the example given in the issue and even when I tried running on my own, I was directly getting the text. The small PR accounts for that. - Issue: : #25218 I was able to replicate the error even without the GenericLoader as shown below and the issue was with `OpenAIWhisperParser` ```python parser = OpenAIWhisperParser(api_key="sk-fxxxxxxxxx", response_format="srt", temperature=0) list(parser.lazy_parse(Blob.from_path('path_to_file.m4a'))) ```	2024-08-19 09:36:42 -04:00
maang-h	32f5147523	docs: Fix QianfanLLMEndpoint and Tongyi input text (#25529 ) - Description: Fix `QianfanLLMEndpoint` and `Tongyi` input text.	2024-08-19 09:23:09 -04:00
ZhangShenao	4255a30f20	Improvement[Community] Improve api doc for `SingleFileFacebookMessengerChatLoader` (#25536 ) Delete redundant args in api doc	2024-08-19 09:00:21 -04:00
ccurme	b83f1eb0d5	core, partners: implement standard tracing params for LLMs (#25410 )	2024-08-16 13:18:09 -04:00
Bagatur	253ceca76a	docs: fix mimetype parser docstring (#25463 )	2024-08-15 16:16:52 -07:00
ccurme	8afbab4cf6	langchain[patch]: deprecate various chains (#25310 ) - [x] NatbotChain: move to community, deprecate langchain version. Update to use `prompt \| llm \| output_parser` instead of LLMChain. - [x] LLMMathChain: deprecate + add langgraph replacement example to API ref - [x] HypotheticalDocumentEmbedder (retriever): update to use `prompt \| llm \| output_parser` instead of LLMChain - [x] FlareChain: update to use `prompt \| llm \| output_parser` instead of LLMChain - [x] ConstitutionalChain: deprecate + add langgraph replacement example to API ref - [x] LLMChainExtractor (document compressor): update to use `prompt \| llm \| output_parser` instead of LLMChain - [x] LLMChainFilter (document compressor): update to use `prompt \| llm \| output_parser` instead of LLMChain - [x] RePhraseQueryRetriever (retriever): update to use `prompt \| llm \| output_parser` instead of LLMChain	2024-08-15 10:49:26 -04:00
ccurme	ba167dc158	community[patch]: update connection string in azure cosmos integration test (#25438 )	2024-08-15 14:07:54 +00:00
Isaac Francisco	966b408634	[docs]: doc loader changes (#25417 )	2024-08-14 19:46:33 -07:00
Werner van der Merwe	1d3f7231b8	fix: typo where github should be gitlab (#25397 ) PR title: "GitLabToolkit: fix typo" - Description: fix typo where GitHub should have been GitLab - Dependencies: None	2024-08-14 18:36:25 +00:00
Bagatur	493e474063	docs: udpated api reference (#25172 ) - Move the API reference into the vercel build - Update api reference organization and styling	2024-08-14 07:00:17 -07:00
ccurme	27690506d0	multiple: update removal targets (#25361 )	2024-08-14 09:50:39 -04:00
Harrison Chase	967b6f21f6	docs: improve document loaders index (#25365 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-14 01:48:48 +00:00
Isaac Francisco	f4ffd692a3	[docs]: standardize doc loader doc strings (#25325 )	2024-08-13 23:18:56 +00:00
Isaac Francisco	e0bbb81d04	[docs]: standardize tool docstrings (#25351 )	2024-08-13 16:10:00 -07:00
thedavgar	9d08369442	community: fix AzureSearch vectorstore asyncronous methods (#24921 ) Description Fix the asyncronous methods to retrieve documents from AzureSearch VectorStore. The previous changes from [this commit](`ffe6ca986e`) create a similar code for the syncronous methods and the asyncronous ones but the asyncronous client return an asyncronous iterator "AsyncSearchItemPaged" as said in the issue #24740. To solve this issue, the syncronous iterators in asyncronous methods where changed to asyncronous iterators. @chrislrobert said in [this comment](https://github.com/langchain-ai/langchain/issues/24740#issuecomment-2254168302) that there was a still a flaw due to `with` blocks that close the client after each call. I removed this `with` blocks in the `async_client` following the same pattern as the sync `client`. In order to close up the connections, a __del__ method is included to gently close up clients once the vectorstore object is destroyed. Issue: #24740 and #24064 Dependencies: No new dependencies for this change Example notebook: I created a notebook just to test the changes work and gives the same results as the syncronous methods for vector and hybrid search. With these changes, the asyncronous methods in the retriever work as well. ![image](https://github.com/user-attachments/assets/697e431b-9d7f-4d0d-b205-59d051ac2b67) Lint and test: Passes the tests and the linter	2024-08-13 14:20:51 -07:00
Fedor Nikolaev	2b15518c5f	community: add args_schema to SearxSearchResults tool (#25350 ) This adds `args_schema` member to `SearxSearchResults` tool. This member is already present in the `SearxSearchRun` tool in the same file. I was having `TypeError: Type is not JSON serializable: AsyncCallbackManagerForToolRun` being thrown in langserve playground when I was using `SearxSearchResults` tool as a part of chain there. This fixes the issue, so the error is not raised anymore. This is a example langserve app that was giving me the error, but it works properly after the proposed fix: ```python #!/usr/bin/env python from fastapi import FastAPI from langchain_core.prompts import ChatPromptTemplate from langchain_core.output_parsers import StrOutputParser from langchain_core.runnables import RunnablePassthrough from langchain_openai import ChatOpenAI from langchain_community.utilities import SearxSearchWrapper from langchain_community.tools.searx_search.tool import SearxSearchResults from langserve import add_routes template = """Answer the question based only on the following context: {context} Question: {question} """ prompt = ChatPromptTemplate.from_template(template) model = ChatOpenAI() s = SearxSearchWrapper(searx_host="http://localhost:8080") search = SearxSearchResults(wrapper=s) search_chain = ( {"context": search, "question": RunnablePassthrough()} \| prompt \| model \| StrOutputParser() ) app = FastAPI() add_routes( app, search_chain, path="/chain", ) if __name__ == "__main__": import uvicorn uvicorn.run(app, host="localhost", port=8000) ```	2024-08-13 18:26:09 +00:00
maang-h	089f5e6cad	Standardize SparkLLM (#25239 ) - Description: Standardize SparkLLM, include: - docs, the issue #24803 - to support stream - update api url - model init arg names, the issue #20085	2024-08-13 09:50:12 -04:00
Chen Xiabin	24155aa1ac	qianfan generate/agenerate with usage_metadata (#25332 )	2024-08-13 09:24:41 -04:00
Erick Friis	2907ab2297	community: release 0.2.12 (#25324 )	2024-08-12 23:30:27 +00:00
Ben Chambers	1adc161642	community: kwargs for CassandraGraphVectorStore (#25300 ) - Description: pass kwargs from CassandraGraphVectorStore to underlying store Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-12 18:01:29 +00:00
ccurme	e77eeee6ee	core[patch]: add standard tracing params for retrievers (#25240 )	2024-08-12 14:51:59 +00:00
Mohammad Mohtashim	9927a4866d	[Community] - Added bind_tools and with_structured_output for ChatZhipuAI (#23887 ) - Description: This PR implements the `bind_tool` functionality for ChatZhipuAI as requested by the user. ChatZhipuAI models support tool calling according to the `OpenAI` tool format, as outlined in their official documentation [here](https://open.bigmodel.cn/dev/api#glm-4). - Issue: ##23868 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-12 14:11:43 +00:00
maang-h	bc60cddc1b	docs: Fix ChatBaichuan, QianfanChatEndpoint, ChatSparkLLM, ChatZhipuAI docs (#25265 ) - Description: Fix some chat models docs, include: - ChatBaichuan - QianfanChatEndpoint - ChatSparkLLM - ChatZhipuAI	2024-08-11 16:23:55 -04:00
ZhangShenao	43deed2a95	Improvement[Embeddings] Add dimension support to `ZhipuAIEmbeddings` (#25274 ) - In the in ` embedding-3 ` and later models of Zhipu AI, it is supported to specify the dimensions parameter of Embedding. Ref: https://bigmodel.cn/dev/api#text_embedding-3 . - Add test case for `embedding-3` model by assigning dimensions.	2024-08-11 16:20:37 -04:00
Eugene Yurtsev	6dd9f053e3	core[patch]: Deprecating beta upsert APIs in vectorstore (#25069 ) This PR deprecates the beta upsert APIs in vectorstore. We'll introduce them in a V2 abstraction instead to keep the existing vectorstore implementations lighter weight. The main problem with the existing APIs is that it's a bit more challenging to implement the correct behavior w/ respect to IDs since ID can be present in both the function signature and as an optional attribute on the document object. But VectorStores that pass the standard tests should have implemented the semantics properly! --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-09 17:17:36 -04:00
Eugene Yurtsev	b6f0174bb9	community[patch],core[patch]: Update EdenaiTool root_validator and add unit test in core (#25233 ) This PR gets rid `root_validators(allow_reuse=True)` logic used in EdenAI Tool in preparation for pydantic 2 upgrade. - add another test to secret_from_env_factory	2024-08-09 15:59:27 +00:00
Eugene Yurtsev	bd6c31617e	community[patch]: Remove more @allow_reuse=True validators (#25236 ) Remove some additional allow_reuse=True usage in @root_validators.	2024-08-09 11:10:27 -04:00
Eugene Yurtsev	6e57aa7c36	community[patch]: Remove usage of @root_validator(allow_reuse=True) (#25235 ) Remove usage of @root_validator(allow_reuse=True)	2024-08-09 10:57:42 -04:00
thiswillbeyourgithub	a2b4c33bd6	community[patch]: FAISS: ValueError mentions normalize_score_fn isntead of relevance_score_fn (#25225 ) Thank you for contributing to LangChain! - [X] PR title: "community: fix valueerror mentions wrong argument missing" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [X] PR message: *Delete this entire checklist* and replace with - Description: when faiss.py has a None relevance_score_fn it raises a ValueError that says a normalize_fn_score argument is needed. Co-authored-by: ccurme <chester.curme@gmail.com>	2024-08-09 14:40:29 +00:00
Shivendra Soni	66b7206ab6	community: Add llm-extraction option to FireCrawl Document Loader (#25231 ) Description: This minor PR aims to add `llm_extraction` to Firecrawl loader. This feature is supported on API and PythonSDK, but the langchain loader omits adding this to the response. Twitter handle: [scalable_pizza](https://x.com/scalablepizza) --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-09 13:59:10 +00:00
ccurme	3b7437d184	docs: update integration api refs (#25195 ) - [x] toolkits - [x] retrievers (in this repo)	2024-08-09 12:27:32 +00:00
Eugene Yurtsev	98779797fe	community[patch]: Use get_fields adapter for pydantic (#25191 ) Change all usages of __fields__ with get_fields adapter merged into langchain_core. Code mod generated using the following grit pattern: ``` engine marzano(0.1) language python `$X.__fields__` => `get_fields($X)` where { add_import(source="langchain_core.utils.pydantic", name="get_fields") } ```	2024-08-08 14:43:09 -04:00
Rajendra Kadam	663638d6a8	community[minor]: [SharePointLoader] Load extended metadata for the root folder (#24872 ) - Title: [SharePointLoader] Load extended metadata for the root folder - Description: - Ensure extended metadata loads correctly for the root folder. - Cleanup: Refactor SharePointLoader to remove unused fields(`file_id` & `site_id`). - Dependencies: NA - Add tests and docs: NA	2024-08-08 14:39:16 -04:00
Eugene Yurtsev	bf5193bb99	community[patch]: Upgrade pydantic extra (#25185 ) Upgrade to using a literal for specifying the extra which is the recommended approach in pydantic 2. This works correctly also in pydantic v1. ```python from pydantic.v1 import BaseModel class Foo(BaseModel, extra="forbid"): x: int Foo(x=5, y=1) ``` And ```python from pydantic.v1 import BaseModel class Foo(BaseModel): x: int class Config: extra = "forbid" Foo(x=5, y=1) ``` ## Enum -> literal using grit pattern: ``` engine marzano(0.1) language python or { `extra=Extra.allow` => `extra="allow"`, `extra=Extra.forbid` => `extra="forbid"`, `extra=Extra.ignore` => `extra="ignore"` } ``` Resorted attributes in config and removed doc-string in case we will need to deal with going back and forth between pydantic v1 and v2 during the 0.3 release. (This will reduce merge conflicts.) ## Sort attributes in Config: ``` engine marzano(0.1) language python function sort($values) js { return $values.text.split(',').sort().join("\n"); } class_definition($name, $body) as $C where { $name <: `Config`, $body <: block($statements), $values = [], $statements <: some bubble($values) assignment() as $A where { $values += $A }, $body => sort($values), } ```	2024-08-08 17:20:39 +00:00
ololand	249945a572	Update polygon.py for business subscription (#25085 ) For business subscription the status is STOCKSBUSINESS not OK Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-08-08 15:28:41 +00:00
ogawa	d895db11d6	community[patch]: gpt-4o-2024-08-06 costs (#25164 ) - Description: updated OpenAI cost definitions according to the following: - https://openai.com/api/pricing/ - Twitter handle: `@ogawa65a` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-08 13:22:11 +00:00
maang-h	0ba125c3cd	docs: Standardize QianfanLLMEndpoint LLM (#25139 ) - Description: Standardize QianfanLLMEndpoint LLM，include: - docs, the issue #24803 - model init arg names, the issue #20085	2024-08-07 10:57:27 -04:00
Pat Patterson	7e7fcf5b1f	community: Fix ValidationError on creating GPT4AllEmbeddings with no gpt4all_kwargs (#25124 ) - Description: Instantiating `GPT4AllEmbeddings` with no `gpt4all_kwargs` argument raised a `ValidationError`. Root cause: #21238 added the capability to pass `gpt4all_kwargs` through to the `GPT4All` instance via `Embed4All`, but broke code that did not specify a `gpt4all_kwargs` argument. - Issue: #25119 - Dependencies: None - Twitter handle: [`@metadaddy`](https://twitter.com/metadaddy)	2024-08-07 13:34:01 +00:00
Virat Singh	264ab96980	community: Add stock market tools from financialdatasets.ai (#25025 ) Description: In this PR, I am adding three stock market tools from financialdatasets.ai (my API!): - get balance sheets - get cash flow statements - get income statements Twitter handle: [@virattt](https://twitter.com/virattt) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-06 18:28:12 +00:00
Naval Chand	71c0698ee4	Added bedrock 3-5 sonnet cost detials for BedrockAnthropicTokenUsageCallbackHandler (#25104 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Example: "community: Added bedrock 3-5 sonnet cost detials for BedrockAnthropicTokenUsageCallbackHandler" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. Co-authored-by: Naval Chand <navalchand@192.168.1.36>	2024-08-06 17:28:47 +00:00
Isaac Francisco	a72fddbf8d	[docs]: vector store integration pages (#24858 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-06 17:20:27 +00:00
maang-h	1028af17e7	docs: Standardize Tongyi (#25103 ) - Description: Standardize Tongyi LLM，include: - docs, the issue #24803 - model init arg names, the issue #20085	2024-08-06 11:44:12 -04:00
Dobiichi-Origami	061ed250f6	delete the default model value from langchain and discard the need fo… (#24915 ) - description: I remove the limitation of mandatory existence of `QIANFAN_AK` and default model name which langchain uses cause there is already a default model nama underlying `qianfan` SDK powering langchain component. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-06 14:11:05 +00:00
Dominik Fladung	ffa0c838d8	Allow ConfluenceLoader authorization via Personal Access Tokens (#25096 ) - community: Allow authorization to Confluence with bearer token - Description: Allow authorization to Confluence with [Personal Access Token](https://confluence.atlassian.com/enterprise/using-personal-access-tokens-1026032365.html) by checking for the keys `['client_id', token: ['access_token', 'token_type']]` - Issue: Currently the following error occurs when using an personal access token for authorization. ```python loader = ConfluenceLoader( url=os.getenv('CONFLUENCE_URL'), oauth2={ 'token': {"access_token": os.getenv("CONFLUENCE_ACCESS_TOKEN"), "token_type": "bearer"}, 'client_id': 'client_id', }, page_ids=['12345678'], ) ``` ``` ValueError: Error(s) while validating input: ["You have either omitted require keys or added extra keys to the oauth2 dictionary. key values should be `['access_token', 'access_token_secret', 'consumer_key', 'key_cert']`"] ``` With this PR the loader runs as expected. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-06 13:42:47 +00:00
jigsawlabs-student	427a04151c	community: fix neo4j from_existing_graph (#24912 ) Fixes Neo4JVector.from_existing_graph integration with huggingface Previously threw an error with existing databases, because from_existing_graph query returns empty list of new nodes, which are then passed to embedding function, and huggingface errors with empty list. Fixes [24401](https://github.com/langchain-ai/langchain/issues/24401) --------- Co-authored-by: Jeff Katzy <jeffreyerickatz@gmail.com>	2024-08-05 21:01:46 +00:00
Jim Baldwin	6890daa90c	community: make AthenaLoader profile_name optional and fix type hint (#24958 ) - Description: This PR makes the AthenaLoader profile_name optional and fixes the type hint which says the type is `str` but it should be `str` or `None` as None is handled in the loader init. This is a minor problem but it just confused me when I was using the Athena Loader to why we had to use a Profile, as I want that for local but not production. - Issue: #24957 - Dependencies: None.	2024-08-05 14:28:58 +00:00
Dobiichi-Origami	c5cb52a3c6	community: fix issue of the existence of numeric object in `additional_kwargs` a… (#24863 ) - Description: A previous PR breaks the code from `baidu_qianfan_endpoint.py` which causes the malfunction of streaming	2024-08-05 10:15:55 -04:00
ZhangShenao	cda79dbb6c	community[patch]: Optimize test case for `MoonshotChat` (#25050 ) Optimize test case for `MoonshotChat`. Use standard ChatModelIntegrationTests.	2024-08-05 10:11:25 -04:00
Alex Sherstinsky	208042e0f2	community: Fix Predibase Integration for HuggingFace-hosted fine-tuned adapters (#25015 ) Thank you for contributing to LangChain! - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-08-03 14:05:43 -07:00
maang-h	f5da0d6d87	docs: Standardize MiniMaxEmbeddings (#24983 ) - Description: Standardize MiniMaxEmbeddings - docs, the issue #24856 - model init arg names, the issue #20085	2024-08-03 14:01:23 -04:00
maang-h	7de62abc91	docs: Standardize SparkLLMTextEmbeddings docstrings (#25021 ) - Description: Standardize SparkLLMTextEmbeddings docstrings - Issue: the issue #24856	2024-08-03 13:44:09 -04:00
Bagatur	e81ddb32a6	docs: fix kwargs docstring (#25010 ) Fix: ![Screenshot 2024-08-02 at 5 33 37 PM](https://github.com/user-attachments/assets/7c56cdeb-ee81-454c-b3eb-86aa8a9bdc8d)	2024-08-02 19:54:54 -07:00
Isaac Francisco	73570873ab	docs: standardizing tavily tool docs (#24736 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-08-02 22:25:27 +00:00
Bagatur	8e2316b8c2	community[patch]: Release 0.2.11 (#24989 )	2024-08-02 20:08:44 +00:00
ccurme	22c1a4041b	community[patch]: support named arguments in github toolkit (#24986 ) Parameters may be passed in by name if generated from tool calls.	2024-08-02 18:27:32 +00:00
ZhangShenao	71c0564c9f	community[patch]: Add test case for MoonshotChat (#24960 ) Add test case for `MoonshotChat`.	2024-08-02 09:37:31 -04:00
Isaac Francisco	d7688a4328	community[patch]: adding artifact to Tavily search (#24376 ) This allows you to get raw content as well as the answer, instead of just getting the results. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-08-01 21:12:11 -07:00
maang-h	ea505985c4	docs: Standardize ZhipuAIEmbeddings docstrings (#24933 ) - Description: Standardize ZhipuAIEmbeddings rich docstrings. - Issue: the issue #24856	2024-08-01 14:06:53 -04:00
Anneli Samuel	2204d8cb7d	community[patch]: Invoke on_llm_new_token callback before yielding chunk (#24938 ) Description: Invoke on_llm_new_token callback before yielding chunk in streaming mode Issue: [#16913](https://github.com/langchain-ai/langchain/issues/16913)	2024-08-01 16:39:04 +00:00
Serena Ruan	1827bb4042	community[patch]: support bind_tools for ChatMlflow (#24547 ) Thank you for contributing to LangChain! - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - Description: Support ChatMlflow.bind_tools method Tested in Databricks: <img width="836" alt="image" src="https://github.com/user-attachments/assets/fa28ef50-0110-4698-8eda-4faf6f0b9ef8"> - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Signed-off-by: Serena Ruan <serena.rxy@gmail.com>	2024-08-01 08:43:07 -07:00
BottlePumpkin	bfc59c1d26	community: Fix KeyError in NotionDB loader when 'name' is missing (#24224 ) Thank you for contributing to LangChain! - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. Description: This PR fixes a KeyError in NotionDBLoader when the "name" key is missing in the "people" property. Issue: Fixes #24223 Dependencies: None --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-08-01 13:55:40 +00:00
alexqiao	8eb0bdead3	community[patch]: Invoke callback prior to yielding token (#24917 ) Description: Invoke callback prior to yielding token in stream method for chat_models . Issue: https://github.com/langchain-ai/langchain/issues/16913 #16913	2024-08-01 13:19:55 +00:00
Nikita Pakunov	c776471ac6	community: fix AttributeError: 'YandexGPT' object has no attribute '_grpc_metadata' (#24432 ) Fixes #24049 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-31 21:18:33 +00:00
Eugene Yurtsev	add16111b9	community[patch]: Make the pydantic linter stricter (#24897 ) Stricter linting of deprecated pydantic features.	2024-07-31 18:57:37 +00:00
Eugene Yurtsev	a4a444f73d	community[patch]: Fix arcee llm usage of root_validator(pre=False) (#24896 ) Should be pre=True	2024-07-31 18:49:20 +00:00
Eugene Yurtsev	d24b82357f	community[patch]: Add missing annotations (#24890 ) This PR adds annotations in comunity package. Annotations are only strictly needed in subclasses of BaseModel for pydantic 2 compatibility. This PR adds some unnecessary annotations, but they're not bad to have regardless for documentation pages.	2024-07-31 18:13:44 +00:00
ccurme	30f18c7b02	docs: add retriever integrations template (#24836 )	2024-07-31 13:50:44 -04:00
Anirudh31415926535	4da3d4b18e	docs: Minor corrections and updates to Cohere docs (#22726 ) - Description: Update the Cohere's provider and RagRetriever documentations with latest updates. - Twitter handle: Anirudh1810	2024-07-31 10:16:26 -07:00
Nishan Jain	b00c0fc558	[Community][minor]: Added prompt governance in pebblo_retrieval (#24874 ) Title: [pebblo_retrieval] Identifying entities in prompts given in PebbloRetrievalQA leading to prompt governance Description: Implemented identification of entities in the prompt using Pebblo prompt governance API. Issue: NA Dependencies: NA Add tests and docs: NA	2024-07-31 13:14:51 +00:00
Rajendra Kadam	a6add89bd4	community[minor]: [PebbloSafeLoader] Implement content-size-based batching (#24871 ) - Title: [PebbloSafeLoader] Implement content-size-based batching in the classification flow(loader/doc API) - Description: - Implemented content-size-based batching in the loader/doc API, set to 100KB with no external configuration option, intentionally hard-coded to prevent timeouts. - Remove unused field(pb_id) from doc_metadata - Issue: NA - Dependencies: NA - Add tests and docs: Updated	2024-07-31 09:10:28 -04:00
TrumanYan	096b66db4a	community: replace it with Tencent Cloud SDK (#24172 ) Description: The old method will be discontinued; use the official SDK for more model options. Issue: None Dependencies: None Twitter handle: None Co-authored-by: trumanyan <trumanyan@tencent.com>	2024-07-31 09:05:38 -04:00
Erick Friis	1f5444817a	community: deprecate BedrockEmbeddings in favor of langchain-aws (#24846 )	2024-07-30 23:13:17 +00:00
Shailendra Mishra	f2d810b3c0	clob_bugfix... (#24813 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-07-30 12:44:04 -04:00
Anush	51b15448cc	community: Fix FastEmbedEmbeddings (#24462 ) ## Description This PR: - Fixes the validation error in `FastEmbedEmbeddings`. - Adds support for `batch_size`, `parallel` params. - Removes support for very old FastEmbed versions. - Updates the FastEmbed doc with the new params. Associated Issues: - Resolves #24039 - Resolves #https://github.com/qdrant/fastembed/issues/296	2024-07-30 12:42:46 -04:00
ccurme	73ec24fc56	docs[patch]: add toolkit template (#24791 )	2024-07-30 12:36:09 -04:00
Igor Drozdov	c2706cfb9e	feat(community): add tools support for litellm (#23906 ) I used the following example to validate the behavior ```python from langchain_core.prompts import ChatPromptTemplate from langchain_core.runnables import ConfigurableField from langchain_anthropic import ChatAnthropic from langchain_community.chat_models import ChatLiteLLM from langchain_core.tools import tool from langchain.agents import create_tool_calling_agent, AgentExecutor @tool def multiply(x: float, y: float) -> float: """Multiply 'x' times 'y'.""" return x * y @tool def exponentiate(x: float, y: float) -> float: """Raise 'x' to the 'y'.""" return x**y @tool def add(x: float, y: float) -> float: """Add 'x' and 'y'.""" return x + y prompt = ChatPromptTemplate.from_messages([ ("system", "you're a helpful assistant"), ("human", "{input}"), ("placeholder", "{agent_scratchpad}"), ]) tools = [multiply, exponentiate, add] llm = ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0) # llm = ChatLiteLLM(model="claude-3-sonnet-20240229", temperature=0) agent = create_tool_calling_agent(llm, tools, prompt) agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True) agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", }) ``` `ChatAnthropic` version works: ``` > Entering new AgentExecutor chain... Invoking: `exponentiate` with `{'x': 5, 'y': 2.743}` responded: [{'text': 'To calculate 3 + 5^2.743, we can use the "exponentiate" and "add" tools:', 'type': 'text', 'index': 0}, {'id': 'toolu_01Gf54DFTkfLMJQX3TXffmxe', 'input': {}, 'name': 'exponentiate', 'type': 'tool_use', 'index': 1, 'partial_json': '{"x": 5, "y": 2.743}'}] 82.65606421491815 Invoking: `add` with `{'x': 3, 'y': 82.65606421491815}` responded: [{'id': 'toolu_01XUq9S56GT3Yv2N1KmNmmWp', 'input': {}, 'name': 'add', 'type': 'tool_use', 'index': 0, 'partial_json': '{"x": 3, "y": 82.65606421491815}'}] 85.65606421491815 Invoking: `add` with `{'x': 17.24, 'y': -918.1241}` responded: [{'text': '\n\nSo 3 + 5^2.743 = 85.66\n\nTo calculate 17.24 - 918.1241, we can use:', 'type': 'text', 'index': 0}, {'id': 'toolu_01BkXTwP7ec9JKYtZPy5JKjm', 'input': {}, 'name': 'add', 'type': 'tool_use', 'index': 1, 'partial_json': '{"x": 17.24, "y": -918.1241}'}] -900.8841[{'text': '\n\nTherefore, 17.24 - 918.1241 = -900.88', 'type': 'text', 'index': 0}] > Finished chain. ``` While `ChatLiteLLM` version doesn't. But with the changes in this PR, along with: - https://github.com/langchain-ai/langchain/pull/23823 - https://github.com/BerriAI/litellm/pull/4554 The result is _almost_ the same: ``` > Entering new AgentExecutor chain... Invoking: `exponentiate` with `{'x': 5, 'y': 2.743}` responded: To calculate 3 + 5^2.743, we can use the "exponentiate" and "add" tools: 82.65606421491815 Invoking: `add` with `{'x': 3, 'y': 82.65606421491815}` 85.65606421491815 Invoking: `add` with `{'x': 17.24, 'y': -918.1241}` responded: So 3 + 5^2.743 = 85.66 To calculate 17.24 - 918.1241, we can use: -900.8841 Therefore, 17.24 - 918.1241 = -900.88 > Finished chain. ``` If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-30 15:39:34 +00:00
David Robertson	bfb7f8d40a	Brave Search: Enhance search result details with extra snippets (#19209 ) Description: This update significantly improves the Brave Search Tool's utility within the LangChain library by enriching the search results it returns. The tool previously returned title, link, and snippet, with the snippet being a truncated 140-character description from the search engine. To make the search results more informative, this update enables extra_snippets by default and introduces additional result fields: title, link, description (enhancing and renaming the former snippet field), age, and snippets. The snippets field provides a list of strings summarizing the webpage, utilizing Brave's capability for more detailed search insights. This enhancement aims to make the search tool far more informative and beneficial for users. Issue: N/A Dependencies: No additional dependencies introduced. Twitter handle: @davidalexr987 Code Changes Summary: - Changed the default setting to include extra_snippets in search results. - Renamed the snippet field to description to accurately reflect its content and included an age field for search results. - Introduced a snippets field that lists webpage summaries, providing users with comprehensive search result insights. Backward Compatibility Note: The renaming of snippet to description improves the accuracy of the returned data field but may impact existing users who have developed integration's or analyses based on the snippet field. I believe this change is essential for clarity and utility, and it aligns better with the data provided by Brave Search. Additional Notes: This proposal focuses exclusively on the Brave Search package, without affecting other LangChain packages or introducing new dependencies.	2024-07-30 15:29:38 +00:00
Ben Chambers	435771fe74	[community]: Fix package name mismatch (#24824 ) - Description: fix a mismatch in pypi package names	2024-07-30 11:21:39 -04:00
maang-h	4bb1a11e02	community: Add MiniMaxChat bind_tools and structured output (#24310 ) - Description: - Add `bind_tools` method to support tool calling - Add `with_structured_output` method to support structured output	2024-07-29 15:51:52 -04:00
maang-h	bf685c242f	docs: Standardize QianfanEmbeddingsEndpoint (#24786 ) - Description: Standardize QianfanEmbeddingsEndpoint, include: - docstrings, the issue #21983 - model init arg names, the issue #20085	2024-07-29 13:19:24 -04:00
M. Ali	c086410677	fix docs typos (#23668 ) Thank you for contributing to LangChain! - [x] PR title: "docs: fix multiple typos" Co-authored-by: mohblnk <mohamed.ali@blnk.ai> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-29 16:10:55 +00:00
Pere Pasamonte	98175860ad	community: Fix AWS DocumentDB similarity_search when filter is None (#24777 ) Description Fixes DocumentDBVectorSearch similarity_search when no filter is used; it defaults to None but $match does not accept None, so changed default to empty {} before pipeline is created. Issue AWS DocumentDB similarity search does not work when no filter is used. Error msg: "the match filter must be an expression in an object" #24775 Dependencies No dependencies Twitter handle https://x.com/perepasamonte --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-29 15:32:05 +00:00
AmosDinh	c113682328	community:Add support for specifying document_loaders.firecrawl api url. (#24747 ) community:Add support for specifying document_loaders.firecrawl api url. Add support for specifying document_loaders.firecrawl api url. This is mainly to support the [self-hosting](https://github.com/mendableai/firecrawl/blob/main/SELF_HOST.md) option firecrawl provides. Eg. now I can specify localhost:.... The corresponding firecrawl class already provides functionality to pass the argument. See here: `4c9d62f6d3/apps/python-sdk/firecrawl/firecrawl.py (L29)` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-28 14:30:36 -04:00
Haijian Wang	cda3025ee1	Integrating the Yi family of models. (#24491 ) Thank you for contributing to LangChain! - [x] PR title: "community:add Yi LLM", "docs:add Yi Documentation" - [x] PR message: *Delete this entire checklist* and replace with - Description: This PR adds support for the Yi model to LangChain. - Dependencies: [langchain_core,requests,contextlib,typing,logging,json,langchain_community] - Twitter handle: 01.AI - [x] Add tests and docs: I've added the corresponding documentation to the relevant paths --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-07-26 10:57:33 -07:00
Marc Gibbons	cc451effd1	community[patch]: langchain_community.vectorstores.azuresearch Raise LangChainException instead of bare Exception (#23935 ) Raise `LangChainException` instead of `Exception`. This alleviates the need for library users to use bare try/except to handle exceptions raised by `AzureSearch`. Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-26 15:59:06 +00:00
Diverrez morgan	c4d2a53f18	community: creation score_threshold in flashrank_rerank.py (#24016 ) Description: add a optional score relevance threshold for select only coherent document, it's in complement of top_n Discussion: add relevance score threshold in flashrank_rerank document compressors #24013 Dependencies: no dependencies --------- Co-authored-by: Benjamin BERNARD <benjamin.bernard@openpathview.fr> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-26 13:34:39 +00:00
Cong Peng	190988d93e	community: Add parameter `allow_dangerous_requests` to `WebResearchRetriever.from_llm` construct (#24712 ) Description: To avoid ValueError when construct the retriever from method `from_llm()`.	2024-07-26 06:24:58 -07:00
monysun	5f593c172a	community: fix dashcope embeddings embed_query func post too much req to api (#24707 ) the fuc of embed_query of dashcope embeddings send a str param, and in the embed_with_retry func will send error content to api	2024-07-26 12:44:07 +00:00
yonarw	b65ac8d39c	community[minor]: Self query retriever for HANA Cloud Vector Engine (#24494 ) Description: - This PR adds a self query retriever implementation for SAP HANA Cloud Vector Engine. The retriever supports all operators except for contains. - Issue: N/A - Dependencies: no new dependencies added Add tests and docs: Added integration tests to: libs/community/tests/unit_tests/query_constructors/test_hanavector.py Documentation for self query retriever: /docs/integrations/retrievers/self_query/hanavector_self_query.ipynb --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-07-26 06:56:51 +00:00
nobbbbby	4f3b4fc7fe	community[patch]: Extend Baichuan model with tool support (#24529 ) Description: Expanded the chat model functionality to support tools in the 'baichuan.py' file. Updated module imports and added tool object handling in message conversions. Additional changes include the implementation of tool binding and related unit tests. The alterations offer enhanced model capabilities by enabling interaction with tool-like objects. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-25 23:20:44 -07:00
Rave Harpaz	ee399e3ec5	community[patch]: Add OCI Generative AI tool and structured output support (#24693 ) - [x] PR title: community: Add OCI Generative AI tool and structured output support - [x] PR message: - Description: adding tool calling and structured output support for chat models offered by OCI Generative AI services. This is an update to our last PR 22880 with changes in /langchain_community/chat_models/oci_generative_ai.py - Issue: NA - Dependencies: NA - Twitter handle: NA - [x] Add tests and docs: 1. we have updated our unit tests 2. we have updated our documentation under /docs/docs/integrations/chat/oci_generative_ai.ipynb - [x] Lint and test: `make format`, `make lint` and `make test` we run successfully --------- Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com> Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>	2024-07-25 23:19:00 -07:00
Yuki Watanabe	2b6a262f84	community[patch]: Replace `filters` argument to `filter` in DatabricksVectorSearch (#24530 ) The [DatabricksVectorSearch](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/databricks_vector_search.py#L21) class exposes similarity search APIs with argument `filters`, which is inconsistent with other VS classes who uses `filter` (singular). This PR updates the argument and add alias for backward compatibility. --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>	2024-07-25 21:20:18 -07:00
Sunish Sheth	59880a9147	community[patch]: mlflow handle empty chunk(#24689 )	2024-07-25 20:36:29 -07:00
Chaunte W. Lacewell	69eacaa887	Community[minor]: Update VDMS vectorstore (#23729 ) Description: - This PR exposes some functions in VDMS vectorstore, updates VDMS related notebooks, updates tests, and upgrade version of VDMS (>=0.0.20) Issue: N/A Dependencies: - Update vdms>=0.0.20	2024-07-25 22:13:04 -04:00
KyrianC	0fdbaf4a8d	community: fix ChatEdenAI + EdenAI Tools (#23715 ) Fixes for Eden AI Custom tools and ChatEdenAI: - add missing import in __init__ of chat_models - add `args_schema` to custom tools. otherwise '__arg1' would sometimes be passed to the `run` method - fix IndexError when no human msg is added in ChatEdenAI	2024-07-25 15:19:14 -04:00
maang-h	38d30e285a	docs: Standardize BaichuanTextEmbeddings docstrings (#24674 ) - Description: Standardize BaichuanTextEmbeddings docstrings. - Issue: the issue #21983	2024-07-25 12:12:00 -04:00
rick-SOPTIM	cd563fb628	community[minor]: passthrough auth parameter on requests to Ollama-LLMs (#24068 ) Thank you for contributing to LangChain! Description: This PR allows users of `langchain_community.llms.ollama.Ollama` to specify the `auth` parameter, which is then forwarded to all internal calls of `requests.request`. This works in the same way as the existing `headers` parameters. The auth parameter enables the usage of the given class with Ollama instances, which are secured by more complex authentication mechanisms, that do not only rely on static headers. An example are AWS API Gateways secured by the IAM authorizer, which expects signatures dynamically calculated on the specific HTTP request. Issue: Integrating a remote LLM running through Ollama using `langchain_community.llms.ollama.Ollama` only allows setting static HTTP headers with the parameter `headers`. This does not work, if the given instance of Ollama is secured with an authentication mechanism that makes use of dynamically created HTTP headers which for example may depend on the content of a given request. Dependencies: None Twitter handle: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-25 15:48:35 +00:00
Luca Dorigo	5fdbdd6bec	community[patch]: Fix invalid iohttp verify parameter (#24655 ) Should fix https://github.com/langchain-ai/langchain/issues/24654	2024-07-25 11:09:21 -04:00
Oleg Kulyk	4b1b7959a2	community[minor]: Add ScrapingAnt Loader Community Integration (#24514 ) Added [ScrapingAnt](https://scrapingant.com/) Web Loader integration. ScrapingAnt is a web scraping API that allows extracting web page data into accessible and well-formatted markdown. Description: Added ScrapingAnt web loader for retrieving web page data as markdown Dependencies: scrapingant-client Twitter: @WeRunTheWorld3 --------- Co-authored-by: Oleg Kulyk <oleg@scrapingant.com>	2024-07-24 21:11:43 -04:00
John	d59c656ea5	unstructured, community, initialize langchain-unstructured package (#22779 ) #### Update (2): A single `UnstructuredLoader` is added to handle both local and api partitioning. This loader also handles single or multiple documents. #### Changes in `community`: Changes here do not affect users. In the initial process of using the SDK for the API Loaders, the Loaders in community were refactored. Other changes include: The `UnstructuredBaseLoader` has a new check to see if both `mode="paged"` and `chunking_strategy="by_page"`. It also now has `Element.element_id` added to the `Document.metadata`. `UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. As such, now both directly inherit from `UnstructuredBaseLoader` and initialize their `file_path`/`file` attributes respectively and implement their own `_post_process_elements` methods. -------- #### Update: New SDK Loaders in a [partner package](https://python.langchain.com/v0.1/docs/contributing/integrations/#partner-package-in-langchain-repo) are introduced to prevent breaking changes for users (see discussion below). ##### TODO: - [x] Test docstring examples -------- - Description: UnstructuredAPIFileIOLoader and UnstructuredAPIFileLoader calls to the unstructured api are now made using the unstructured-client sdk. - New Dependencies: unstructured-client - [x] Add tests and docs: If you're adding a new integration, please include - [x] a test for the integration, preferably unit tests that do not rely on network access, - [x] update the description in `docs/docs/integrations/providers/unstructured.mdx` - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. TODO: - [x] Update https://python.langchain.com/v0.1/docs/integrations/document_loaders/unstructured_file/#unstructured-api - `langchain/docs/docs/integrations/document_loaders/unstructured_file.ipynb` - The description here needs to indicate that users should install `unstructured-client` instead of `unstructured`. Read over closely to look for any other changes that need to be made. - [x] Update the `lazy_load` method in `UnstructuredBaseLoader` to handle json responses from the API instead of just lists of elements. - This method may need to be overwritten by the API loaders instead of changing it in the `UnstructuredBaseLoader`. - [x] Update the documentation links in the class docstrings (the Unstructured documents have moved) - [x] Update Document.metadata to include `element_id` (see thread [here](https://unstructuredw-kbe4326.slack.com/archives/C044N0YV08G/p1718187499818419)) --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com>	2024-07-24 23:21:20 +00:00
Eugene Yurtsev	b55f6105c6	community[patch]: Add linter to prevent further usage of root_validator and validator (#24613 ) This linter is meant to move development to use __init__ instead of root_validator and validator. We need to investigate whether we need to lint some of the functionality of Field (e.g., `lt` and `gt`, `alias`) `alias` is the one that's most popular: (community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep " Field(" \| grep "alias=" \| wc -l 144 (community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep " Field(" \| grep "ge=" \| wc -l 10 (community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep " Field(" \| grep "gt=" \| wc -l 4	2024-07-24 12:35:21 -04:00
Anindyadeep	12c3454fd9	[Community] PremAI Tool Calling Functionality (#23931 ) This PR is under WIP and adds the following functionalities: - [X] Supports tool calling across the langchain ecosystem. (However streaming is not supported) - [X] Update documentation	2024-07-24 09:53:58 -04:00
Vishnu Nandakumar	e271965d1e	community: retrievers: added capability for using Product Quantization as one of the retriever. (#22424 ) - [ ] Community: "Retrievers: Product Quantization" - [X] This PR adds Product Quantization feature to the retrievers to the Langchain Community. PQ is one of the fastest retrieval methods if the embeddings are rich enough in context due to the concepts of quantization and representation through centroids - Description: Adding PQ as one of the retrievers - Dependencies: using the package nanopq for this PR - Twitter handle: vishnunkumar_ - [X] Add tests and docs: If you're adding a new integration, please include - [X] Added unit tests for the same in the retrievers. - [] Will add an example notebook subsequently - [X] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ - done the same --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-24 13:52:15 +00:00
stydxm	b9bea36dd4	community: fix typo in warning message (#24597 ) - Description: This PR fixes a small typo in a warning message - Issue: ![](https://github.com/user-attachments/assets/5aa57724-26c5-49f6-8bc1-5a54bb67ed49) There were double `Use` and double `instead`	2024-07-24 13:19:07 +00:00
cüre	da06d4d7af	community: update finetuned model cost for 4o-mini (#24605 ) - Description: adds model price for. reference: https://openai.com/api/pricing/ - Issue: - - Dependencies: - - Twitter handle: cureef	2024-07-24 13:17:26 +00:00
ZhangShenao	ad18afc3ec	community[patch]: Fix param spelling error in `ElasticsearchChatMessageHistory` (#24589 ) Fix param spelling error in `ElasticsearchChatMessageHistory`	2024-07-23 19:29:42 -07:00
Aayush Kataria	0f45ac4088	LangChain Community: VectorStores: Azure Cosmos DB Filtered Vector Search (#24087 ) Thank you for contributing to LangChain! - This PR adds vector search filtering for Azure Cosmos DB Mongo vCore and NoSQL. - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-07-23 16:59:23 -07:00
Carlos André Antunes	325068bb53	community: Fix azure_openai.py (#24572 ) In some lines its trying to read a key that do not exists yet. In this cases I changed the direct access to dict.get() method - [ x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-07-23 16:22:21 -04:00
Bagatur	8691a5a37f	community[patch]: Release 0.2.10 (#24560 )	2024-07-23 09:24:57 -07:00
Ben Chambers	e80b0932ee	community[patch]: small fixes to link extractors (#24528 ) - Description: small fixes to imports / types in the link extraction work --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-23 14:28:06 +00:00
Morteza Hosseini	9e06991aae	community[patch]: Update URL to the 2markdown API (#24546 ) Update the URL to Markdown endpoint. API information is available here: https://2markdown.com/docs#url2md	2024-07-23 14:27:55 +00:00
maang-h	378db2e1a5	docs: Add RedisChatMessageHistory docstrings (#24548 ) - Description: Add `RedisChatMessageHistory ` rich docstrings. - Issue: the issue #21983 Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-23 14:23:46 +00:00
Ben Chambers	a5a3d28776	community[patch]: Remove targets_table from C* GraphVectorStore (#24502 ) - Description: Remove the unnecessary `targets_table` parameter	2024-07-22 22:09:36 -04:00
Alexander Golodkov	2a70a07aad	community[minor]: added new document loaders based on dedoc library (#24303 ) ### Description This pull request added new document loaders to load documents of various formats using [Dedoc](https://github.com/ispras/dedoc): - `DedocFileLoader` (determine file types automatically and parse) - `DedocPDFLoader` (for `PDF` and images parsing) - `DedocAPIFileLoader` (determine file types automatically and parse using Dedoc API without library installation) [Dedoc](https://dedoc.readthedocs.io) is an open-source library/service that extracts texts, tables, attached files and document structure (e.g., titles, list items, etc.) from files of various formats. The library is actively developed and maintained by a group of developers. `Dedoc` supports `DOCX`, `XLSX`, `PPTX`, `EML`, `HTML`, `PDF`, images and more. Full list of supported formats can be found [here](https://dedoc.readthedocs.io/en/latest/#id1). For `PDF` documents, `Dedoc` allows to determine textual layer correctness and split the document into paragraphs. ### Issue This pull request extends variety of document loaders supported by `langchain_community` allowing users to choose the most suitable option for raw documents parsing. ### Dependencies The PR added a new (optional) dependency `dedoc>=2.2.5` ([library documentation](https://dedoc.readthedocs.io)) to the `extended_testing_deps.txt` ### Twitter handle None ### Add tests and docs 1. Test for the integration: `libs/community/tests/integration_tests/document_loaders/test_dedoc.py` 2. Example notebook: `docs/docs/integrations/document_loaders/dedoc.ipynb` 3. Information about the library: `docs/docs/integrations/providers/dedoc.mdx` ### Lint and test Done locally: - `make format` - `make lint` - `make integration_tests` - `make docs_build` (from the project root) --------- Co-authored-by: Nasty <bogatenkova.anastasiya@mail.ru>	2024-07-23 02:04:53 +00:00
Ben Chambers	5ac936a284	community[minor]: add document transformer for extracting links (#24186 ) - Description: Add a DocumentTransformer for executing one or more `LinkExtractor`s and adding the extracted links to each document. - Issue: n/a - Depedencies: none --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-22 22:01:21 -04:00
Erick Friis	3dce2e1d35	all: add release notes to pypi (#24519 )	2024-07-22 13:59:13 -07:00
Bagatur	236e957abb	core,groq,openai,mistralai,robocorp,fireworks,anthropic[patch]: Update BaseModel subclass and instance checks to handle both v1 and proper namespaces (#24417 ) After this PR chat models will correctly handle pydantic 2 with bind_tools and with_structured_output. ```python import pydantic print(pydantic.__version__) ``` 2.8.2 ```python from langchain_openai import ChatOpenAI from pydantic import BaseModel, Field class Add(BaseModel): x: int y: int model = ChatOpenAI().bind_tools([Add]) print(model.invoke('2 + 5').tool_calls) model = ChatOpenAI().with_structured_output(Add) print(type(model.invoke('2 + 5'))) ``` ``` [{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_PNUFa4pdfNOYXxIMHc6ps2Do', 'type': 'tool_call'}] <class '__main__.Add'> ``` ```python from langchain_openai import ChatOpenAI from pydantic.v1 import BaseModel, Field class Add(BaseModel): x: int y: int model = ChatOpenAI().bind_tools([Add]) print(model.invoke('2 + 5').tool_calls) model = ChatOpenAI().with_structured_output(Add) print(type(model.invoke('2 + 5'))) ``` ```python [{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_hhiHYP441cp14TtrHKx3Upg0', 'type': 'tool_call'}] <class '__main__.Add'> ``` Addresses issues: https://github.com/langchain-ai/langchain/issues/22782 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-22 20:07:39 +00:00
Naka Masato	884f76e05a	fix: load google credentials properly in GoogleDriveLoader (#12871 ) - Description: - Fix #12870: set scope in `default` func (ref: https://google-auth.readthedocs.io/en/master/reference/google.auth.html) - Moved the code to load default credentials to the bottom for clarity of the logic - Add docstring and comment for each credential loading logic - Issue: https://github.com/langchain-ai/langchain/issues/12870 - Dependencies: no dependencies change - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: @gymnstcs <!-- If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-22 17:43:33 +00:00
Jorge Piedrahita Ortiz	10e3982b59	community: sambanova integration minor changes (#24503 ) - Minor changes in samabanova llm integration - default api - docstrings - minor changes in docs	2024-07-22 17:06:35 +00:00
maang-h	721f709dec	community: Improve QianfanChatEndpoint tool result to model (#24466 ) - Description: `QianfanChatEndpoint` When using tool result to answer questions, the content of the tool is required to be in Dict format. Of course, this can require users to return Dict format when calling the tool, but in order to be consistent with other Chat Models, I think such modifications are necessary.	2024-07-22 11:29:00 -04:00
ccurme	dcba7df2fe	community[patch]: deprecate langchain_community Chroma in favor of langchain_chroma (#24474 )	2024-07-22 11:00:13 -04:00
Mohammad Mohtashim	5ade0187d0	[Commutiy]: Prompts Fixed for ZERO_SHOT_REACT React Agent Type in `create_sql_agent` function (#23693 ) - Description: The correct Prompts for ZERO_SHOT_REACT were not being used in the `create_sql_agent` function. They were not using the specific `SQL_PREFIX` and `SQL_SUFFIX` prompts if client does not provide any prompts. This is fixed. - Issue: #23585 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-22 14:04:20 +00:00
ZhangShenao	0f6737cbfe	[Vector Store] Fix function `add_texts` in `TencentVectorDB` (#24469 ) Regardless of whether `embedding_func` is set or not, the 'text' attribute of document should be assigned, otherwise the `page_content` in the document of the final search result will be lost	2024-07-22 09:50:22 -04:00
clement.l	d98b830e4b	community: add flag to toggle progress bar (#24463 ) - Description: Add a flag to determine whether to show progress bar - Issue: n/a - Dependencies: n/a - Twitter handle: n/a --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-20 13:18:02 +00:00
chuanbei888	6b08a33fa4	community: fix QianfanChatEndpoint default model (#24464 ) the baidu_qianfan_endpoint has been changed from ERNIE-Bot-turbo to ERNIE-Lite-8K	2024-07-20 13:00:29 +00:00
maang-h	7b28359719	docs: Add ChatSparkLLM docstrings (#24449 ) - Description: - Add `ChatSparkLLM` docstrings, the issue #22296 - To support `stream` method	2024-07-19 20:19:14 -07:00
Erick Friis	f4ee3c8a22	infra: add min version testing to pr test flow (#24358 ) xfailing some sql tests that do not currently work on sqlalchemy v1 #22207 was very much not sqlalchemy v1 compatible. Moving forward, implementations should be compatible with both to pass CI	2024-07-19 22:03:19 +00:00
Bagatur	842065a9cc	community[patch]: Release 0.2.9 (#24453 )	2024-07-19 12:50:22 -07:00
Bagatur	dda9438e87	community[patch]: gpt-4o-mini costs (#24421 )	2024-07-19 19:02:44 +00:00
Eugene Yurtsev	604dfe2d99	community[patch]: Force opt-in for WebResearchRetriever (CVE-2024-3095) (#24451 ) This PR addresses the issue raised by (CVE-2024-3095) https://huntr.com/bounties/e62d4895-2901-405b-9559-38276b6a5273 Unfortunately, we didn't do a good job writing the initial report. It's pointing at both the wrong package and the wrong code. The affected code is the Web Retriever not the AsyncHTMLLoader, and the WebRetriever lives in langchain-community The vulnerable code lives here: `0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L233-L233)` This PR adds a forced opt-in for users to make sure they are aware of the risk and can mitigate by configuring a proxy: `0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L84-L84)`	2024-07-19 18:51:35 +00:00
Asi Greenholts	372c27f2e5	community[minor]: [GoogleApiYoutubeLoader] Replace API used in _get_document_for_channel from search to playlistItem (#24034 ) - Description: Search has a limit of 500 results, playlistItems doesn't. Added a class in except clause to catch another common error. - Issue: None - Dependencies: None - Twitter handle: @TupleType --------- Co-authored-by: asi-cider <88270351+asi-cider@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 14:04:34 -04:00
Rafael Pereira	6a45bf9554	community[minor]: GraphCypherQAChain to accept additional inputs as provided by the user for cypher generation (#24300 ) Description: This PR introduces a change to the `cypher_generation_chain` to dynamically concatenate inputs. This improvement aims to streamline the input handling process and make the method more flexible. The change involves updating the arguments dictionary with all elements from the `inputs` dictionary, ensuring that all necessary inputs are dynamically appended. This will ensure that any cypher generation template will not require a new `_call` method patch. Issue: This PR fixes issue #24260.	2024-07-19 14:03:14 -04:00
Philippe PRADOS	f5856680fe	community[minor]: add mongodb byte store (#23876 ) The `MongoDBStore` can manage only documents. It's not possible to use MongoDB for an `CacheBackedEmbeddings`. With this new implementation, it's possible to use: ```python CacheBackedEmbeddings.from_bytes_store( underlying_embeddings=embeddings, document_embedding_cache=MongoDBByteStore( connection_string=db_uri, db_name=db_name, collection_name=collection_name, ), ) ``` and use MongoDB to cache the embeddings !	2024-07-19 13:54:12 -04:00
yabooung	07715f815b	community[minor]: Add ability to specify file encoding and json encoding for FileChatMessageHistory (#24258 ) Description: Add UTF-8 encoding support Issue: Inability to properly handle characters from certain languages (e.g., Korean) Fix: Implement UTF-8 encoding in FileChatMessageHistory --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 13:53:21 -04:00
Dristy Srivastava	020cc1cf3e	Community[minor]: Added checksum in while send data to pebblo-cloud (#23968 ) - Description: - Updated checksum in doc metadata - Sending checksum and removing actual content, while sending data to `pebblo-cloud` if `classifier-location `is `pebblo-cloud` in `/loader/doc` API - Adding `pb_id` i.e. pebblo id to doc metadata - Refactoring as needed. - Sending `content-checksum` and removing actual content, while sending data to `pebblo-cloud` if `classifier-location `is `pebblo-cloud` in `prmopt` API - Issue: NA - Dependencies: NA - Tests: Updated - Docs NA --------- Co-authored-by: dristy.cd <dristy@clouddefense.io>	2024-07-19 13:52:54 -04:00
keval dekivadiya	06f47678ae	community[minor]: Add TextEmbed Embedding Integration (#22946 ) Description: TextEmbed is a high-performance embedding inference server designed to provide a high-throughput, low-latency solution for serving embeddings. It supports various sentence-transformer models and includes the ability to deploy image and text embedding models. TextEmbed offers flexibility and scalability for diverse applications. - PyPI Package: [TextEmbed on PyPI](https://pypi.org/project/textembed/) - Docker Image: [TextEmbed on Docker Hub](https://hub.docker.com/r/kevaldekivadiya/textembed) - GitHub Repository: [TextEmbed on GitHub](https://github.com/kevaldekivadiya2415/textembed) PR Description This PR adds functionality for embedding documents and queries using the `TextEmbedEmbeddings` class. The implementation allows for both synchronous and asynchronous embedding requests to a TextEmbed API endpoint. The class handles batching and permuting of input texts to optimize the embedding process. Example Usage: ```python from langchain_community.embeddings import TextEmbedEmbeddings # Initialise the embeddings class embeddings = TextEmbedEmbeddings(model="your-model-id", api_key="your-api-key", api_url="your_api_url") # Define a list of documents documents = [ "Data science involves extracting insights from data.", "Artificial intelligence is transforming various industries.", "Cloud computing provides scalable computing resources over the internet.", "Big data analytics helps in understanding large datasets.", "India has a diverse cultural heritage." ] # Define a query query = "What is the cultural heritage of India?" # Embed all documents document_embeddings = embeddings.embed_documents(documents) # Embed the query query_embedding = embeddings.embed_query(query) # Print embeddings for each document for i, embedding in enumerate(document_embeddings): print(f"Document {i+1} Embedding:", embedding) # Print the query embedding print("Query Embedding:", query_embedding) --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-19 17:30:25 +00:00
Andrew Benton	f9d64d22e5	community[minor]: Add Riza Python/JS code execution tool (#23995 ) - Description: Add Riza Python/JS code execution tool - Issue: N/A - Dependencies: an optional dependency on the `rizaio` pypi package - Twitter handle: [@rizaio](https://x.com/rizaio) [Riza](https://riza.io) is a safe code execution environment for agent-generated Python and JavaScript that's easy to integrate into langchain apps. This PR adds two new tool classes to the community package.	2024-07-19 17:03:22 +00:00
Ben Chambers	3691701d58	community[minor]: Add keybert-based link extractor (#24311 ) - Description: Add a `KeybertLinkExtractor` for graph vectorstores. This allows extracting links from keywords in a Document and linking nodes that have common keywords. - Issue: None - Dependencies: None. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-19 12:25:07 -04:00
Ben Chambers	83f3d95ffa	community[minor]: GLiNER link extraction (#24314 ) - Description: This allows extracting links between documents with common named entities using [GLiNER](https://github.com/urchade/GLiNER). - Issue: None - Dependencies: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 15:34:54 +00:00
Anas Khan	b5acb91080	Mask API keys for various LLM/ChatModel Modules (#13885 ) Description: - Added masking of the API Keys for the modules: - `langchain/chat_models/openai.py` - `langchain/llms/openai.py` - `langchain/llms/google_palm.py` - `langchain/chat_models/google_palm.py` - `langchain/llms/edenai.py` - Updated the modules to utilize `SecretStr` from pydantic to securely manage API key. - Added unit/integration tests - `langchain/chat_models/asure_openai.py` used the `open_api_key` that is derived from the `ChatOpenAI` Class and it was assuming `openai_api_key` is a str so we changed it to expect `SecretStr` instead. Issue: https://github.com/langchain-ai/langchain/issues/12165 , Dependencies: none, Tag maintainer: @eyurtsev --------- Co-authored-by: HassanA01 <anikeboss@gmail.com> Co-authored-by: Aneeq Hassan <aneeq.hassan@utoronto.ca> Co-authored-by: kristinspenc <kristinspenc2003@gmail.com> Co-authored-by: faisalt14 <faisalt14@gmail.com> Co-authored-by: Harshil-Patel28 <76663814+Harshil-Patel28@users.noreply.github.com> Co-authored-by: kristinspenc <146893228+kristinspenc@users.noreply.github.com> Co-authored-by: faisalt14 <90787271+faisalt14@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-19 15:23:34 +00:00
ccurme	f99369a54c	community[patch]: fix formatting (#24443 ) Somehow this got through CI: https://github.com/langchain-ai/langchain/pull/24363	2024-07-19 14:38:53 +00:00
Ben Chambers	242b085be7	Merge pull request #24315 * community: Add Hierarchy link extractor * add example * lint	2024-07-19 09:42:26 -04:00
Rhuan Barros	c3308f31bc	Merge pull request #24363 * important email fields	2024-07-19 09:41:20 -04:00
Han Sol Park	aade9bfde5	Mask API key for ChatOpenAI based chat_models (#14293 ) - Description: Mask API key for ChatOpenAi based chat_models (openai, azureopenai, anyscale, everlyai). Made changes to all chat_models that are based on ChatOpenAI since all of them assumes that openai_api_key is str rather than SecretStr. - Issue:: #12165 - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: N/A --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-19 02:25:38 +00:00
Eun Hye Kim	07c5c60f63	community: fix tool appending logic and update planner prompt in OpenAPI agent toolkit (#24384 ) Description: - Updated the format for the 'Action' section in the planner prompt to ensure it must be one of the tools without additional words. Adjusted the phrasing from "should be" to "must be" for clarity and enforceability. - Corrected the tool appending logic in the `_create_api_controller_agent` function to ensure that `RequestsDeleteToolWithParsing` and `RequestsPatchToolWithParsing` are properly added to the tools list for "DELETE" and "PATCH" operations. Issue: #24382 Dependencies: None Twitter handle: @lunara_x --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-18 13:37:46 +00:00
Chen Xiabin	63c60a31f0	[fix] baidu qianfan AiMessage with usage_metadata (#24389 ) make AIMessage usage_metadata has error	2024-07-18 09:28:16 -04:00
Rajendra Kadam	1c65529fd7	community[minor]: [PebbloSafeLoader] Rename loader type and add SharePointLoader to supported loaders (#24393 ) Thank you for contributing to LangChain! - [x] PR title: [PebbloSafeLoader] Rename loader type and add SharePointLoader to supported loaders - Description: Minor fixes in the PebbloSafeLoader: - Renamed the loader type from `remote_db` to `cloud_folder`. - Added `SharePointLoader` to the list of loaders supported by PebbloSafeLoader. - Issue: NA - Dependencies: NA - [x] Add tests and docs: NA	2024-07-18 08:23:12 -04:00
Paolo Ráez	0dec72cab0	Community[patch]: Missing "stream" parameter in cloudflare_workersai (#23987 ) ### Description Missing "stream" parameter. Without it, you'd never receive a stream of tokens when using stream() or astream() ### Issue No existing issue available	2024-07-18 02:09:39 +00:00
Brice Fotzo	034a8c7c1b	community: support advanced text extraction options for pdf documents (#20265 ) Description: - Updated constructors in PyPDFParser and PyPDFLoader to handle `extraction_mode` and additional kwargs, aligning with the capabilities of `PageObject.extract_text()` from pypdf. - Added `test_pypdf_loader_with_layout` along with a corresponding example text file to validate layout extraction from PDFs. Issue: fixes #19735 Dependencies: This change requires updating the pypdf dependency from version 3.4.0 to at least 4.0.0. Additional changes include the addition of a new test test_pypdf_loader_with_layout and an example text file to ensure the functionality of layout extraction from PDFs aligns with the new capabilities. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-17 20:47:09 +00:00
Bagatur	b5360e2e5f	community[patch]: Release 0.2.8 (#24354 )	2024-07-17 17:07:27 +00:00
Luis Moros	bcb5f354ad	community: Fix SQLDatabse.from_databricks issue when ran from Job (#24346 ) - Description: When SQLDatabase.from_databricks is ran from a Databricks Workflow job, line 205 (default_host = context.browserHostName) throws an ``AttributeError`` as the ``context`` object has no ``browserHostName`` attribute. The fix handles the exception and sets the ``default_host`` variable to null --------- Co-authored-by: lmorosdb <lmorosdb> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-17 12:40:12 -04:00
Rafael Pereira	cf28708e7b	Neo4j: Update with non-deprecated cypher methods, and new method to associate relationship embeddings (#23725 ) Description: At the moment neo4j wrapper is using setVectorProperty, which is deprecated ([link](https://neo4j.com/docs/operations-manual/5/reference/procedures/#procedure_db_create_setVectorProperty)). I replaced with the non-deprecated version. Neo4j recently introduced a new cypher method to associate embeddings into relations using "setRelationshipVectorProperty" method. In this PR I also implemented a new method to perform this association maintaining the same format used in the "add_embeddings" method which is used to associate embeddings into Nodes. I also included a test case for this new method.	2024-07-17 12:37:47 -04:00
maang-h	2a3288b15d	docs: Add ChatBaichuan docstrings (#24348 ) - Description: Add ChatBaichuan rich docstrings. - Issue: the issue #22296	2024-07-17 12:00:16 -04:00
Rafael Pereira	fc41730e28	neo4j: Fix test for order-insensitive comparison and floating-point precision issues (#24338 ) Description: This PR addresses two main issues in the `test_neo4jvector.py`: 1. Order-insensitive Comparison: Modified the `test_retrieval_dictionary` to ensure that it passes regardless of the order of returned values by parsing `page_content` into a structured format (dictionary) before comparison. 2. Floating-point Precision: Updated `test_neo4jvector_relevance_score` to handle minor floating-point precision differences by using the `isclose` function for comparing relevance scores with a relative tolerance. Errors addressed: - test_neo4jvector_relevance_score: ``` AssertionError: assert [(Document(page_content='foo', metadata={'page': '0'}), 1.0000014305114746), (Document(page_content='bar', metadata={'page': '1'}), 0.9998371005058289), (Document(page_content='baz', metadata={'page': '2'}), 0.9993508458137512)] == [(Document(page_content='foo', metadata={'page': '0'}), 1.0), (Document(page_content='bar', metadata={'page': '1'}), 0.9998376369476318), (Document(page_content='baz', metadata={'page': '2'}), 0.9993523359298706)] At index 0 diff: (Document(page_content='foo', metadata={'page': '0'}), 1.0000014305114746) != (Document(page_content='foo', metadata={'page': '0'}), 1.0) Full diff: - [(Document(page_content='foo', metadata={'page': '0'}), 1.0), + [(Document(page_content='foo', metadata={'page': '0'}), 1.0000014305114746), ? +++++++++++++++ - (Document(page_content='bar', metadata={'page': '1'}), 0.9998376369476318), ? ^^^ ------ + (Document(page_content='bar', metadata={'page': '1'}), 0.9998371005058289), ? ^^^^^^^^^ - (Document(page_content='baz', metadata={'page': '2'}), 0.9993523359298706), ? ---------- + (Document(page_content='baz', metadata={'page': '2'}), 0.9993508458137512), ? ++++++++++ ] ``` - test_retrieval_dictionary: ``` AssertionError: assert [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nname: John\nage: 30\n')] == [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n')] At index 0 diff: Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nname: John\nage: 30\n') != Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n') Full diff: - [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: 30\nname: John\n')] ? --------- + [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine Learning\nage: John\nage: 30\n')] ? +++++++++ ```	2024-07-17 09:28:25 -04:00
bovlb	5caa381177	community[minor]: Add ApertureDB as a vectorstore (#24088 ) Thank you for contributing to LangChain! - [X] ApertureDB as vectorstore: "community: Add ApertureDB as a vectorestore" - Description:* this change provides a new community integration that uses ApertureData's ApertureDB as a vector store. - Issue: none - Dependencies: depends on ApertureDB Python SDK - Twitter handle: ApertureData - [X] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. Integration tests rely on a local run of a public docker image. Example notebook additionally relies on a local Ollama server. - [X] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ All lint tests pass. Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Gautam <gautam@aperturedata.io>	2024-07-16 09:32:59 -07:00
frob	c59e663365	community[patch]: Fix docstring for ollama parameter "keep_alive" (#23973 ) Fix doc-string for ollama integration	2024-07-16 14:48:38 +00:00
Rahul Raghavendra Choudhury	f5a38772a8	community[patch]: Update TavilySearch to use TavilyClient instead of the deprecated Client (#24270 ) On using TavilySearchAPIRetriever with any conversation chain getting error : `TypeError: Client.__init__() got an unexpected keyword argument 'api_key'` It is because the retreiver class is using the depreciated `Client` class, `TavilyClient` need to be used instead. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-16 13:35:28 +00:00
Chen Xiabin	69b1603173	baidu qianfan AiMessage with usage_metadata (#24288 ) add usage_metadata to qianfan AIMessage. Thanks	2024-07-16 09:30:50 -04:00
Dobiichi-Origami	7aeaa1974d	community[patch]: change the class of `qianfan_ak` and `qianfan_sk` parameters (#24293 ) - Description: we changed the class of two parameters to fix a bug, which causes validation failure when using QianfanEmbeddingEndpoint	2024-07-16 09:17:48 -04:00
Lage Ragnarsson	a3c10fc6ce	community: Add support for specifying hybrid search for Databricks vector search (#23528 ) Description: Databricks Vector Search recently added support for hybrid keyword-similarity search. See [usage examples](https://docs.databricks.com/en/generative-ai/create-query-vector-search.html#query-a-vector-search-endpoint) from their documentation. This PR updates the Langchain vectorstore interface for Databricks to enable the user to pass the query_type parameter to similarity_search to make use of this functionality. By default, there will not be any changes for existing users of this interface. To use the new hybrid search feature, it is now possible to do ```python # ... dvs = DatabricksVectorSearch(index) dvs.similarity_search("my search query", query_type="HYBRID") ``` Or using the retriever: ```python retriever = dvs.as_retriever( search_kwargs={ "query_type": "HYBRID", } ) retriever.invoke("my search query") ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-15 22:14:08 +00:00
Christopher Tee	5171ffc026	community(you): Integrate You.com conversational APIs (#23046 ) You.com is releasing two new conversational APIs — Smart and Research. This PR: - integrates those APIs with Langchain, as an LLM - streaming is supported If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-07-15 17:46:58 -04:00
maang-h	6c7d9f93b9	feat: Add ChatTongyi structured output (#24187 ) - Description: Add `with_structured_output` method to ChatTongyi to support structured output.	2024-07-15 15:57:21 -04:00
Chen Xiabin	8f4620f4b8	baidu qianfan streaming token_usage (#24117 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-15 19:52:31 +00:00
maang-h	9d97de34ae	community[patch]: Improve ChatBaichuan init args and role (#23878 ) - Description: Improve ChatBaichuan init args and role - ChatBaichuan adds `system` role - alias: `baichuan_api_base` -> `base_url` - `with_search_enhance` is deprecated - Add `max_tokens` argument	2024-07-15 15:17:00 -04:00
mrugank-wadekar	66bebeb76a	partners: add similarity search by image functionality to langchain_chroma partner package (#22982 ) - Description: This pull request introduces two new methods to the Langchain Chroma partner package that enable similarity search based on image embeddings. These methods enhance the package's functionality by allowing users to search for images similar to a given image URI. Also introduces a notebook to demonstrate it's use. - Issue: N/A - Dependencies: None - Twitter handle: @mrugank9009 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-15 18:48:22 +00:00
Carlos André Antunes	20151384d7	fix azure_openai.py: some keys do not exists (#24158 ) In some lines its trying to read a key that do not exists yet. In this cases I changed the direct access to dict.get() method Thank you for contributing to LangChain! - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-07-15 17:17:05 +00:00
Harold Martin	ccdaf14eff	docs: Spell check fixes (#24217 ) Description: Spell check fixes for docs, comments, and a couple of strings. No code change e.g. variable names. Issue: none Dependencies: none Twitter handle: hmartin	2024-07-15 15:51:43 +00:00
thehunmonkgroup	e8a21146d3	community[patch]: upgrade default model for ChatAnyscale (#24232 ) Old default `meta-llama/Llama-2-7b-chat-hf` no longer supported.	2024-07-15 11:34:59 -04:00
Bagatur	65321bf975	core[patch]: fix ToolCall "type" when streaming (#24218 )	2024-07-13 08:59:03 -07:00
Miroslav	aee55eda39	community: Skip Login to HuggubgFaceHub when token is not set (#21561 ) Thank you for contributing to LangChain! - [ ] HuggingFaceEndpoint: "Skip Login to HuggingFaceHub" - Where: langchain, community, llm, huggingface_endpoint - [ ] PR message: *Delete this entire checklist* and replace with - Description: Skip login to huggingface hub when when `huggingfacehub_api_token` is not set. This is needed when using custom `endpoint_url` outside of HuggingFaceHub. - Issue: the issue # it fixes https://github.com/langchain-ai/langchain/issues/20342 and https://github.com/langchain-ai/langchain/issues/19685 - Dependencies: None - [ ] Add tests and docs: 1. Tested with locally available TGI endpoint 2. Example Usage ```python from langchain_community.llms import HuggingFaceEndpoint llm = HuggingFaceEndpoint( endpoint_url='http://localhost:8080', server_kwargs={ "headers": {"Content-Type": "application/json"} } ) resp = llm.invoke("Tell me a joke") print(resp) ``` Also tested against HF Endpoints ```python from langchain_community.llms import HuggingFaceEndpoint huggingfacehub_api_token = "hf_xyz" repo_id = "mistralai/Mistral-7B-Instruct-v0.2" llm = HuggingFaceEndpoint( huggingfacehub_api_token=huggingfacehub_api_token, repo_id=repo_id, ) resp = llm.invoke("Tell me a joke") print(resp) ``` Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 22:10:32 +00:00
Jean Nshuti	d77d9bfc00	community[patch]: update typo document content returned from semanticscholar (#24175 ) Update "astract" -> abstract	2024-07-12 15:40:47 +00:00
Tomaz Bratanic	d3a2b9fae0	Fix neo4j type error on missing constraint information (#24177 ) If you use `refresh_schema=False`, then the metadata constraint doesn't exist. ATM, we used default `None` in the constraint check, but then `any` fails because it can't iterate over None value	2024-07-12 06:39:29 -04:00
thedavgar	ffe6ca986e	community: Fix Bug in Azure Search Vectorstore search asyncronously (#24081 ) Thank you for contributing to LangChain! Description: This PR fixes a bug described in the issue in #24064, when using the AzureSearch Vectorstore with the asyncronous methods to do search which is also the method used for the retriever. The proposed change includes just change the access of the embedding as optional because is it not used anywhere to retrieve documents. Actually, the syncronous methods of retrieval do not use the embedding neither. With this PR the code given by the user in the issue works. ```python vectorstore = AzureSearch( azure_search_endpoint=os.getenv("AI_SEARCH_ENDPOINT_SECRET"), azure_search_key=os.getenv("AI_SEARCH_API_KEY"), index_name=os.getenv("AI_SEARCH_INDEX_NAME_SECRET"), fields=fields, embedding_function=encoder, ) retriever = vectorstore.as_retriever(search_type="hybrid", k=2) await vectorstore.avector_search("what is the capital of France") await retriever.ainvoke("what is the capital of France") ``` Issue: The Azure Search Vectorstore is not working when searching for documents with asyncronous methods, as described in issue #24064 Dependencies: There are no extra dependencies required for this change. --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-07-11 18:32:19 -07:00
Bagatur	5fd1e67808	core[minor], integrations...[patch]: Support ToolCall as Tool input and ToolMessage as Tool output (#24038 ) Changes: - ToolCall, InvalidToolCall and ToolCallChunk can all accept a "type" parameter now - LLM integration packages add "type" to all the above - Tool supports ToolCall inputs that have "type" specified - Tool outputs ToolMessage when a ToolCall is passed as input - Tools can separately specify ToolMessage.content and ToolMessage.raw_output - Tools emit events for validation errors (using on_tool_error and on_tool_end) Example: ```python @tool("structured_api", response_format="content_and_raw_output") def _mock_structured_tool_with_raw_output( arg1: int, arg2: bool, arg3: Optional[dict] = None ) -> Tuple[str, dict]: """A Structured Tool""" return f"{arg1} {arg2}", {"arg1": arg1, "arg2": arg2, "arg3": arg3} def test_tool_call_input_tool_message_with_raw_output() -> None: tool_call: Dict = { "name": "structured_api", "args": {"arg1": 1, "arg2": True, "arg3": {"img": "base64string..."}}, "id": "123", "type": "tool_call", } expected = ToolMessage("1 True", raw_output=tool_call["args"], tool_call_id="123") tool = _mock_structured_tool_with_raw_output actual = tool.invoke(tool_call) assert actual == expected tool_call.pop("type") with pytest.raises(ValidationError): tool.invoke(tool_call) actual_content = tool.invoke(tool_call["args"]) assert actual_content == expected.content ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-11 14:54:02 -07:00
Jacob Lee	f1f1f75782	community[patch]: Make AzureML endpoint return AI messages for type assistant (#24085 )	2024-07-11 21:45:30 +02:00
Atul R	457677c1b7	community: Fixes use of ImagePromptTemplate with Ollama (#24140 ) Description: ImagePromptTemplate for Multimodal llms like llava when using Ollama Twitter handle: https://x.com/a7ulr Details: When using llava models / any ollama multimodal llms and passing images in the prompt as urls, langchain breaks with this error. ```python image_url_components = image_url.split(",") ^^^^^^^^^^^^^^^^^^^^ AttributeError: 'dict' object has no attribute 'split' ``` From the looks of it, there was bug where the condition did check for a `url` field in the variable but missed to actually assign it. This PR fixes ImagePromptTemplate for Multimodal llms like llava when using Ollama specifically. @hwchase17	2024-07-11 11:31:48 -07:00
Matt	8327925ab7	community:support additional Azure Search Options (#24134 ) - Description: Support additional kwargs options for the Azure Search client (Described here https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/README.md#configurations) - Issue: N/A - Dependencies: No additional Dependencies ---------	2024-07-11 18:22:36 +00:00
Eugene Yurtsev	08638ccc88	community[patch]: QianfanLLMEndpoint fix type information for the keys (#24128 ) Fix for issue: https://github.com/langchain-ai/langchain/issues/24126	2024-07-11 16:24:26 +00:00
Eugene Yurtsev	1e7d8ba9a6	ci[patch]: Update community linter to provide a helpful error message (#24127 ) Update community import linter to explain what's wrong	2024-07-11 16:22:08 +00:00
maang-h	16e178a8c2	docs: Add MiniMaxChat docstrings (#24026 ) - Description: Add MiniMaxChat rich docstrings. - Issue: the issue #22296	2024-07-11 10:55:02 -04:00
Christophe Bornet	5fc5ef2b52	community[minor]: Add graph store extractors (#24065 ) This adds an extractor interface and an implementation for HTML pages. Extractors are used to create GraphVectorStore Links on loaded content. Twitter handle: cbornet_	2024-07-11 10:35:31 -04:00
maang-h	9bcf8f867d	docs: Add SQLChatMessageHistory docstring (#23978 ) - Description: Add SQLChatMessageHistory docstring. - Issue: the issue #21983 Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-11 14:24:28 +00:00
Rafael Pereira	092e9ee0e6	community[minor]: Neo4j Fixed similarity docs (#23913 ) Description: There was missing some documentation regarding the `filter` and `params` attributes in similarity search methods. --------- Co-authored-by: rpereira <rafael.pereira@criticalsoftware.com>	2024-07-11 10:16:48 -04:00
ccurme	975b6129f6	core[patch]: support conversion of runnables to tools (#23992 ) Open to other thoughts on UX. string input: ```python as_tool = retriever.as_tool() as_tool.invoke("cat") # [Document(...), ...] ``` typed dict input: ```python class Args(TypedDict): key: int def f(x: Args) -> str: return str(x["key"] * 2) as_tool = RunnableLambda(f).as_tool( name="my tool", description="description", # name, description are inferred if not supplied ) as_tool.invoke({"key": 3}) # "6" ``` for untyped dict input, allow specification of parameters + types ```python def g(x: Dict[str, Any]) -> str: return str(x["key"] * 2) as_tool = RunnableLambda(g).as_tool(arg_types={"key": int}) result = as_tool.invoke({"key": 3}) # "6" ``` Passing the `arg_types` is slightly awkward but necessary to ensure tool calls populate parameters correctly: ```python from typing import Any, Dict from langchain_core.runnables import RunnableLambda from langchain_openai import ChatOpenAI def f(x: Dict[str, Any]) -> str: return str(x["key"] * 2) runnable = RunnableLambda(f) as_tool = runnable.as_tool(arg_types={"key": int}) llm = ChatOpenAI().bind_tools([as_tool]) result = llm.invoke("Use the tool on 3.") tool_call = result.tool_calls[0] args = tool_call["args"] assert args == {"key": 3} as_tool.run(args) ``` Contrived (?) example with langgraph agent as a tool: ```python from typing import List, Literal from typing_extensions import TypedDict from langchain_openai import ChatOpenAI from langgraph.prebuilt import create_react_agent llm = ChatOpenAI(temperature=0) def magic_function(input: int) -> int: """Applies a magic function to an input.""" return input + 2 agent_1 = create_react_agent(llm, [magic_function]) class Message(TypedDict): role: Literal["human"] content: str agent_tool = agent_1.as_tool( arg_types={"messages": List[Message]}, name="Jeeves", description="Ask Jeeves.", ) agent_2 = create_react_agent(llm, [agent_tool]) ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-10 19:29:59 -04:00
Eugene Yurtsev	c4e149d4f1	community[patch]: Add linter to catch @root_validator (#24070 ) - Add linter to prevent further usage of vanilla root validator - Udpate remaining root validators	2024-07-10 14:51:03 +00:00
ccurme	9c6efadec3	community[patch]: propagate cost information to OpenAI callback (#23996 ) This is enabled following https://github.com/langchain-ai/langchain/pull/22716.	2024-07-10 14:50:35 +00:00
Ethan Yang	13855ef0c3	[HuggingFace Pipeline] add streaming support (#23852 )	2024-07-09 17:02:00 -04:00
Erick Friis	e80c150c44	community: release 0.2.7 (prev was langchain) (#23997 )	2024-07-08 23:43:32 +00:00
Eugene Yurtsev	f765e8fa9d	core[minor],community[patch],standard-tests[patch]: Move InMemoryImplementation to langchain-core (#23986 ) This PR moves the in memory implementation to langchain-core. * The implementation remains importable from langchain-community. * Supporting utilities are marked as private for now.	2024-07-08 14:11:51 -07:00
Eugene Yurtsev	aa8c9bb4a9	community[patch]: Add constraint for pdfminer.six to unbreak CI (#23988 ) Something changed in pdfminer six. This PR unreaks CI without fixing the underlying PDF parser.	2024-07-08 20:55:19 +00:00
Eugene Yurtsev	2c180d645e	core[minor],community[minor]: Upgrade all @root_validator() to @pre_init (#23841 ) This PR introduces a @pre_init decorator that's a @root_validator(pre=True) but with all the defaults populated!	2024-07-08 16:09:29 -04:00
Rajendra Kadam	8b84457b17	community[minor]: Support PGVector in PebbloRetrievalQA (#23874 ) - Description: Support PGVector in PebbloRetrievalQA - Identity and Semantic Enforcement support for PGVector - Refactor Vectorstore validation and name check - Clear the overridden identity and semantic enforcement filters - Issue: NA - Dependencies: NA - Tests: NA(already added) - Docs: Updated - Twitter handle: [@Raj__725](https://twitter.com/Raj__725)	2024-07-05 16:02:25 -04:00
Rajendra Kadam	ee8aa54f53	community[patch]: Fix source path mismatch in PebbloSafeLoader (#23857 ) Description: Fix for source path mismatch in PebbloSafeLoader. The fix involves storing the full path in the doc metadata in VectorDB Issue: NA, caught in internal testing Dependencies: NA Add tests: Updated tests	2024-07-05 15:24:17 -04:00
Christophe Bornet	42d049f618	core[minor]: Add Graph Store component (#23092 ) This PR introduces a GraphStore component. GraphStore extends VectorStore with the concept of links between documents based on document metadata. This allows linking documents based on a variety of techniques, including common keywords, explicit links in the content, and other patterns. This works with existing Documents, so it’s easy to extend existing VectorStores to be used as GraphStores. The interface can be implemented for any Vector Store technology that supports metadata, not only graph DBs. When retrieving documents for a given query, the first level of search is done using classical similarity search. Next, links may be followed using various traversal strategies to get additional documents. This allows documents to be retrieved that aren’t directly similar to the query but contain relevant information. 2 retrieving methods are added to the VectorStore ones : * traversal_search which gets all linked documents up to a certain depth * mmr_traversal_search which selects linked documents using an MMR algorithm to have more diverse results. If a depth of retrieval of 0 is used, GraphStore is effectively a VectorStore. It enables an easy transition from a simple VectorStore to GraphStore by adding links between documents as a second step. An implementation for Apache Cassandra is also proposed. See https://github.com/datastax/ragstack-ai/blob/main/libs/knowledge-store/notebooks/astra_support.ipynb for a notebook explaining how to use GraphStore and that shows that it can answer correctly to questions that a simple VectorStore cannot. Twitter handle: _cbornet	2024-07-05 12:24:10 -04:00
Eugene Yurtsev	6f08e11d7c	core[minor]: add upsert, streaming_upsert, aupsert, astreaming_upsert methods to the VectorStore abstraction (#23774 ) This PR rolls out part of the new proposed interface for vectorstores (https://github.com/langchain-ai/langchain/pull/23544) to existing store implementations. The PR makes the following changes: 1. Adds standard upsert, streaming_upsert, aupsert, astreaming_upsert methods to the vectorstore. 2. Updates `add_texts` and `aadd_texts` to be non required with a default implementation that delegates to `upsert` and `aupsert` if those have been implemented. The original `add_texts` and `aadd_texts` methods are problematic as they spread object specific information across document and *kwargs. (e.g., ids are not a part of the document) 3. Adds a default implementation to `add_documents` and `aadd_documents` that delegates to `upsert` and `aupsert` respectively. 4. Adds standard unit tests to verify that a given vectorstore implements a correct read/write API. A downside of this implementation is that it creates `upsert` with a very similar signature to `add_documents`. The reason for introducing `upsert` is to: Remove any ambiguities about what information is allowed in `kwargs`. Specifically kwargs should only be used for information common to all indexed data. (e.g., indexing timeout). *Allow inheriting from an anticipated generalized interface for indexing that will allow indexing `BaseMedia` (i.e., allow making a vectorstore for images/audio etc.) `add_documents` can be deprecated in the future in favor of `upsert` to make sure that users have a single correct way of indexing content. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-05 12:21:40 -04:00
Philippe PRADOS	289960bc60	community[patch]: Redis.delete should be a regular method not a static method (#23873 ) The `langchain_common.vectostore.Redis.delete()` must not be a `@staticmethod`. With the current implementation, it's not possible to have multiple instances of Redis vectorstore because all versions must share the `REDIS_URL`. It's not conform with the base class.	2024-07-05 12:04:58 -04:00
Klaudia Lemiec	a2082bc1f8	docs: Arxiv docs update (#23871 ) - [X] PR title - [X] PR message: *Delete this entire checklist* and replace with - Description: Update of docstrings and docpages - Issue: [22866](https://github.com/langchain-ai/langchain/issues/22866) - [X] Add tests and docs - [X] Lint and test	2024-07-05 11:43:51 -04:00
André Quintino	99b1467b63	community: add support for 'cloud' parameter in JiraAPIWrapper (#23057 ) - Description: Enhance JiraAPIWrapper to accept the 'cloud' parameter through an environment variable. This update allows more flexibility in configuring the environment for the Jira API. - Twitter handle: Andre_Q_Pereira --------- Co-authored-by: André Quintino <andre.quintino@tui.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-05 15:11:10 +00:00
wenngong	b1e90b3075	community: add model_name param valid for GPT4AllEmbeddings (#23867 ) Description: add model_name param valid for GPT4AllEmbeddings Issue: #23863 #22819 --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-07-05 10:46:34 -04:00
volodymyr-memsql	a4eb6d0fb1	community: add SingleStoreDB semantic cache (#23218 ) This PR adds a `SingleStoreDBSemanticCache` class that implements a cache based on SingleStoreDB vector store, integration tests, and a notebook example. Additionally, this PR contains minor changes to SingleStoreDB vector store: - change add texts/documents methods to return a list of inserted ids - implement delete(ids) method to delete documents by list of ids - added drop() method to drop a correspondent database table - updated integration tests to use and check functionality implemented above CC: @baskaryan, @hwchase17 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	2024-07-05 09:26:06 -04:00
Igor Drozdov	bb597b1286	feat(community): add bind_tools function for ChatLiteLLM (#23823 ) It's a follow-up to https://github.com/langchain-ai/langchain/pull/23765 Now the tools can be bound by calling `bind_tools` ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.utils.function_calling import convert_to_openai_tool from langchain_community.chat_models import ChatLiteLLM class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") prompt = "Which city is hotter today and which is bigger: LA or NY?" # tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)] tools = [GetWeather, GetPopulation] llm = ChatLiteLLM(model="claude-3-sonnet-20240229").bind_tools(tools) ai_msg = llm.invoke(prompt) print(ai_msg.tool_calls) ``` If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. Co-authored-by: Igor Drozdov <idrozdov@gitlab.com>	2024-07-05 09:19:41 -04:00
Jiejun Tan	2be66a38d8	huggingface: Fix huggingface tei support (#22653 ) Update former pull request: https://github.com/langchain-ai/langchain/pull/22595. Modified `libs/partners/huggingface/langchain_huggingface/embeddings/huggingface_endpoint.py`, where the API call function does not match current [Text Embeddings Inference API](https://huggingface.github.io/text-embeddings-inference/#/Text%20Embeddings%20Inference/embed). One example is: ```json { "inputs": "string", "normalize": true, "truncate": false } ``` Parameters in `_model_kwargs` are not passed properly in the latest version. By the way, the issue [why cause 413? #50](https://github.com/huggingface/text-embeddings-inference/issues/50) might be solved.	2024-07-03 13:30:29 -07:00
Ikko Eltociear Ashimine	75734fbcf1	community: fix typo in unit tests for test_zenguard.py (#23819 ) enviroment -> environment - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM"	2024-07-03 14:05:42 -04:00
Bagatur	a0c2281540	infra: update mypy 1.10, ruff 0.5 (#23721 ) ```python """python scripts/update_mypy_ruff.py""" import glob import tomllib from pathlib import Path import toml import subprocess import re ROOT_DIR = Path(__file__).parents[1] def main(): for path in glob.glob(str(ROOT_DIR / "libs/*/pyproject.toml"), recursive=True): print(path) with open(path, "rb") as f: pyproject = tomllib.load(f) try: pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = ( "^1.10" ) pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = ( "^0.5" ) except KeyError: continue with open(path, "w") as f: toml.dump(pyproject, f) cwd = "/".join(path.split("/")[:-1]) completed = subprocess.run( "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color", cwd=cwd, shell=True, capture_output=True, text=True, ) logs = completed.stdout.split("\n") to_ignore = {} for l in logs: if re.match("^(.)\:(\d+)\: error:.\[(.)\]", l): path, line_no, error_type = re.match( "^(.)\:(\d+)\: error:.\[(.*)\]", l ).groups() if (path, line_no) in to_ignore: to_ignore[(path, line_no)].append(error_type) else: to_ignore[(path, line_no)] = [error_type] print(len(to_ignore)) for (error_path, line_no), error_types in to_ignore.items(): all_errors = ", ".join(error_types) full_path = f"{cwd}/{error_path}" try: with open(full_path, "r") as f: file_lines = f.readlines() except FileNotFoundError: continue file_lines[int(line_no) - 1] = ( file_lines[int(line_no) - 1][:-1] + f" # type: ignore[{all_errors}]\n" ) with open(full_path, "w") as f: f.write("".join(file_lines)) subprocess.run( "poetry run ruff format .; poetry run ruff --select I --fix .", cwd=cwd, shell=True, capture_output=True, text=True, ) if __name__ == "__main__": main() ```	2024-07-03 10:33:27 -07:00
Oguz Vuruskaner	2a2c0d1a94	community[deepinfra]: fix tool call parsing. (#23162 ) This PR includes fix for DeepInfra tool call parsing.	2024-07-03 12:11:37 -04:00
maang-h	525109e506	feat: Implement ChatBaichuan asynchronous interface (#23589 ) - Description: Add interface to `ChatBaichuan` to support asynchronous requests - `_agenerate` method - `_astream` method --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-03 12:10:04 -04:00
Qingchuan Hao	5cd4083457	community: make bing web search as the only option (#23523 ) This PR make bing web search as the option for BingSearchAPIWrapper to facilitate and simply the user interface on Langchain. This is a follow-up work of https://github.com/langchain-ai/langchain/pull/23306.	2024-07-02 17:13:54 -04:00
maang-h	e4e28a6ff5	community[patch]: Fix MiniMaxChat validate_environment error (#23770 ) - Description: Fix some issues in MiniMaxChat - Fix `minimax_api_host` not in `values` error - Remove `minimax_group_id` from reading environment variables, the `minimax_group_id` no longer use in MiniMaxChat - Invoke callback prior to yielding token, the issus #16913	2024-07-02 13:23:32 -04:00
Eugene Yurtsev	46ff0f7a3c	community[patch]: Update @root_validators to use explicit pre=True or pre=False (#23737 )	2024-07-02 10:47:21 -04:00
Igor Drozdov	b664dbcc36	feat(community): add support for tool_calls response (#23765 ) When `model_kwargs={"tools": tools}` are passed to `ChatLiteLLM`, they are executed, but the response is not recognized correctly Let's add `tool_calls` to the `additional_kwargs` Thank you for contributing to LangChain! ## ChatAnthropic I used the following example to verify the output of llm with tools: ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_anthropic import ChatAnthropic class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") llm = ChatAnthropic(model="claude-3-sonnet-20240229") llm_with_tools = llm.bind_tools([GetWeather, GetPopulation]) ai_msg = llm_with_tools.invoke("Which city is hotter today and which is bigger: LA or NY?") print(ai_msg.tool_calls) ``` I get the following response: ```json [{'name': 'GetWeather', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_01UfDA89knrhw3vFV9X47neT'}, {'name': 'GetWeather', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01NrYVRYae7m7z7tBgyPb3Gd'}, {'name': 'GetPopulation', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_01EPFEpDgzL6vV2dTpD9SVP5'}, {'name': 'GetPopulation', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01B5J6tPJXgwwfhQX9BHP2dt'}] ``` ## LiteLLM Based on https://litellm.vercel.app/docs/completion/function_call ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.utils.function_calling import convert_to_openai_tool import litellm class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") prompt = "Which city is hotter today and which is bigger: LA or NY?" tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)] response = litellm.completion(model="claude-3-sonnet-20240229", messages=[{'role': 'user', 'content': prompt}], tools=tools) print(response.choices[0].message.tool_calls) ``` ```python [ChatCompletionMessageToolCall(function=Function(arguments='{"location": "Los Angeles, CA"}', name='GetWeather'), id='toolu_01HeDWV5vP7BDFfytH5FJsja', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "New York, NY"}', name='GetWeather'), id='toolu_01EiLesUSEr3YK1DaE2jxsQv', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "Los Angeles, CA"}', name='GetPopulation'), id='toolu_01Xz26zvkBDRxEUEWm9pX6xa', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "New York, NY"}', name='GetPopulation'), id='toolu_01SDqKnsLjvUXuBsgAZdEEpp', type='function')] ``` ## ChatLiteLLM When I try the following ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.utils.function_calling import convert_to_openai_tool from langchain_community.chat_models import ChatLiteLLM class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") prompt = "Which city is hotter today and which is bigger: LA or NY?" tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)] llm = ChatLiteLLM(model="claude-3-sonnet-20240229", model_kwargs={"tools": tools}) ai_msg = llm.invoke(prompt) print(ai_msg) print(ai_msg.tool_calls) ``` ```python content="Okay, let's find out the current weather and populations for Los Angeles and New York City:" response_metadata={'token_usage': Usage(prompt_tokens=329, completion_tokens=193, total_tokens=522), 'model': 'claude-3-sonnet-20240229', 'finish_reason': 'tool_calls'} id='run-748b7a84-84f4-497e-bba1-320bd4823937-0' [] ``` --- When I apply the changes of this PR, the output is ```json [{'name': 'GetWeather', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_017D2tGjiaiakB1HadsEFZ4e'}, {'name': 'GetWeather', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01WrDpJfVqLkPejWzonPCbLW'}, {'name': 'GetPopulation', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_016UKyYrVAV9Pz99iZGgGU7V'}, {'name': 'GetPopulation', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01Sgv1imExFX1oiR1Cw88zKy'}] ``` If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. Co-authored-by: Igor Drozdov <idrozdov@gitlab.com>	2024-07-02 10:42:08 -04:00
Eugene Yurtsev	338cef35b4	community[patch]: update @root_validator in utilities namespace (#23768 ) Update all utilities to use `pre=True` or `pre=False` https://github.com/langchain-ai/langchain/issues/22819	2024-07-02 14:33:01 +00:00
antonpibm	ffde8a6a09	Milvus vectorstore: fix pass ids as argument after upsert (#23761 ) Description: Milvus vectorstore supports both `add_documents` via the base class and `upsert` method which deletes and re-adds documents based on their ids Issue: Due to mismatch in the interfaces the ids used by `upsert` are neglected in `add_documents`, as `ids` are passed as argument in `upsert` but via `kwargs` is `add_documents` This caused exceptions and inconsistency in the DB, tested with `auto_id=False` Fix: pass `ids` via `kwargs` to `add_documents`	2024-07-02 13:45:30 +00:00
Eugene Yurtsev	d084172b63	community[patch]: root validator set explicit pre=False or pre=True (#23764 ) See issue: https://github.com/langchain-ai/langchain/issues/22819	2024-07-02 09:42:05 -04:00
mattthomps1	cc55823486	docs: updated PPLX model (#23723 ) Description: updated pplx docs to reference a currently [supported model](https://docs.perplexity.ai/docs/model-cards). pplx-70b-online ->llama-3-sonar-small-32k-online --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-02 08:48:49 -04:00
Jacob Lee	7791d92711	community[patch]: Fix requests alias for load_tools (#23734 ) CC @baskaryan	2024-07-01 15:02:14 -07:00
Eugene Yurtsev	f24e38876a	community[patch]: Update root_validators to use explicit pre=True or pre=False (#23736 )	2024-07-01 17:13:23 -04:00
Eugene Yurtsev	5d2262af34	community[patch]: Update root_validators to use pre=True or pre=False (#23731 ) Update root_validators in preparation for pydantic 2 migration.	2024-07-01 20:10:15 +00:00
maang-h	96af8f31ae	community[patch]: Invoke callback prior to yielding token (#23638 ) - Description: Invoke callback prior to yielding token in stream and astream methods for ChatZhipuAI. - Issue: the issue #16913	2024-07-01 18:12:24 +00:00
Valentin	bf402f902e	community: Fix LanceDB similarity search bug (#23591 ) Description: LanceDB didn't allow querying the database using similarity score thresholds because the metrics value was missing. This PR simply fixes that bug. Issue: not applicable Dependencies: none Twitter handle: not available --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-01 16:33:45 +00:00
Rafael Pereira	4b9517db85	Jira: Allow Jira access using only the token (#23708 ) - Description: At the moment the Jira wrapper only accepts the the usage of the Username and Password/Token at the same time. However Jira allows the connection using only is useful for enterprise context. Co-authored-by: rpereira <rafael.pereira@criticalsoftware.com>	2024-07-01 13:13:51 +00:00
Tim Van Wassenhove	24916c6703	community: Register pandas df in duckdb when creating vector_store (#23690 ) - Description: Register pandas df in duckdb when creating vector_store - Issue: Resolves #23308 - Dependencies: None - Twitter handle: @timvw Co-authored-by: Tim Van Wassenhove <tim.van.wassenhove@telenetgroup.be>	2024-07-01 09:12:06 -04:00
Bagatur	fc8fd49328	openai, anthropic, ...: with_structured_output to pass in explicit tool choice (#23645 ) ...community, mistralai, groq, fireworks part of #23644	2024-06-28 16:39:53 -07:00
Bagatur	381aedcc61	docs: standardize azure openai page (#23642 ) part of #22296	2024-06-28 15:15:41 -07:00
Vadym Barda	e8d77002ea	core: add RemoveMessage (#23636 ) This change adds a new message type `RemoveMessage`. This will enable `langgraph` users to manually modify graph state (or have the graph nodes modify the state) to remove messages by `id` Examples: * allow users to delete messages from state by calling ```python graph.update_state(config, values=[RemoveMessage(id=state.values[-1].id)]) ``` * allow nodes to delete messages ```python graph.add_node("delete_messages", lambda state: [RemoveMessage(id=state[-1].id)]) ```	2024-06-28 14:40:02 -07:00
ccurme	8fce8c6771	community: fix extended tests (#23640 )	2024-06-28 16:35:38 -04:00
Jacob Lee	a032583b17	docs[patch]: Update diagrams (#23613 )	2024-06-28 12:36:00 -07:00
j pradhan	5f21eab491	community:perplexity[patch]: standardize init args (#21794 ) updated request_timeout default alias value per related docstring. Related to [20085](https://github.com/langchain-ai/langchain/issues/20085) Thank you for contributing to LangChain! --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-28 13:26:12 +00:00
mackong	11483b0fb8	community[patch]: set tool name for tongyi&qianfan llm (#22889 ) - Description: The name of ToolMessage is default to None, which makes tool message send to LLM likes ```json {"role": "tool", "tool_call_id": "", "content": "{\"time\": \"12:12\"}", "name": null} ``` But the name seems essential for some LLMs like TongYi Qwen. so we need to set the name use agent_action's tool value. - Issue: N/A - Dependencies: N/A	2024-06-28 09:17:05 -04:00
Leonid Ganeline	e4caa41aa9	community: docstrings `toolkits` (#23616 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-28 08:40:52 -04:00
ccurme	adf2dc13de	community: fix lint (#23611 )	2024-06-27 22:12:16 +00:00
Eugene Yurtsev	68f348357e	community[patch]: Test InMemoryVectorStore with RWAPI test suite (#23603 ) Add standard test suite to InMemoryVectorStore implementation.	2024-06-27 16:43:43 -04:00
NG Sai Prasanth	5e6d23f27d	community: Standardise tool import for arxiv & semantic scholar (#23578 ) - Description: Fixing the way users have to import Arxiv and Semantic Scholar - Issue: Changed to use `from langchain_community.tools.arxiv import ArxivQueryRun` instead of `from langchain_community.tools.arxiv.tool import ArxivQueryRun` - Dependencies: None - Twitter handle: Nope	2024-06-27 16:35:50 -04:00
Ayo Ayibiowu	c6f700b7cb	fix(community): allow support for disabling max_tokens args (#21534 ) This PR fixes an issue with not able to use unlimited/infinity tokens from the respective provider for the LiteLLM provider. This is an issue when working in an agent environment that the token usage can drastically increase beyond the initial value set causing unexpected behavior.	2024-06-27 16:28:59 -04:00
mackong	70834cd741	community[patch]: support convert FunctionMessage for Tongyi (#23569 ) Description: For function call agent with Tongyi, cause the AgentAction will be converted to FunctionMessage by `47f69fe0d8/libs/core/langchain_core/agents.py (L188)` But now Tongyi's convert_message_to_dict doesn't support FunctionMessage `47f69fe0d8/libs/community/langchain_community/chat_models/tongyi.py (L184-L207)` Then next round conversation will be failed by the TypeError exception. This patch adds the support to convert FunctionMessage for Tongyi. Issue: N/A Dependencies: N/A	2024-06-27 15:49:26 -04:00
Jacob Lee	60fc15a56b	docs[patch]: Update docs introduction and README (#23558 ) CC @hwchase17 @baskaryan	2024-06-27 08:51:43 -07:00
maang-h	5070004e8a	docs: Update Tongyi ChatModel docstring (#23540 ) - Description: Update Tongyi ChatModel rich docstring - Issue: the issue #22296	2024-06-26 13:07:13 -04:00
yonarw	6d0ebbca1e	community: SAP HANA Vector Engine fix for latest HANA release (#23516 ) - Description: This PR fixes an issue with SAP HANA Cloud QRC03 version. In that version the number to indicate no length being set for a vector column changed from -1 to 0. The change in this PR support both behaviours (old/new). - Dependencies: No dependencies have been introduced. - Tests: The change is covered by previous unit tests.	2024-06-26 13:15:51 +00:00
Alireza Kashani	c39521b70d	Update grobid.py (#23399 ) fixed potential `IndexError: list index out of range` in case there is no title Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-06-26 09:11:02 -04:00
Qingchuan Hao	ee282a1d2e	community: add missing link (#23526 )	2024-06-26 09:06:28 -04:00
ccurme	99ce84ef23	community: release 0.2.6 (#23501 )	2024-06-25 21:29:52 +00:00
Nuradil	c93d9e66e4	Community: Update and fix ZenGuardTool docs and add ZenguardTool to init files (#23415 ) Thank you for contributing to LangChain! - [x] PR title: "community: update docs and add tool to init.py" - [x] PR message: - Description: Fixed some errors and comments in the docs and added our ZenGuardTool and additional classes to init.py for easy access when importing - Question: when will you update the langchain-community package in pypi to make our tool available? - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Thank you for review! --------- Co-authored-by: Baur <baur.krykpayev@gmail.com>	2024-06-25 19:26:32 +00:00
Qingchuan Hao	ad50702934	community: add default value to bing_search_url (#23306 ) bing_search_url is an endpoint to requests bing search resource and is normally invariant to users, we can give it the default value to simply the uesages of this utility/tool	2024-06-25 08:08:41 -04:00
wenngong	b33d2346db	community: FlashrankRerank support loading customer client (#23350 ) Description: FlashrankRerank Document compressor support loading customer client Issue: #23338 Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-06-24 17:50:08 -04:00
maang-h	f58c40b4e3	docs: Update QianfanChatEndpoint ChatModel docstring (#23337 ) - Description: Update QianfanChatEndpoint ChatModel rich docstring - Issue: the issue #22296	2024-06-24 17:42:46 -04:00
Rahul Triptahi	9ef93ecd7c	community[minor]: Added classification_location parameter in PebbloSafeLoader. (#22565 ) Description: Add classifier_location feature flag. This flag enables Pebblo to decide the classifier location, local or pebblo-cloud. Unit Tests: N/A Documentation: N/A --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2024-06-24 17:30:38 -04:00
yuncliu	398b2b9c51	community[minor]: Add Ascend NPU optimized Embeddings (#20260 ) - Description: Add NPU support for embeddings --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-24 20:15:11 +00:00
RUO	2b87e330b0	community: fix issue with nested field extraction in MongodbLoader (#22801 ) Description: This PR addresses an issue in the `MongodbLoader` where nested fields were not being correctly extracted. The loader now correctly handles nested fields specified in the `field_names` parameter. Issue: Fixes an issue where attempting to extract nested fields from MongoDB documents resulted in `KeyError`. Dependencies: No new dependencies are required for this change. Twitter handle: (Optional, your Twitter handle if you'd like a mention when the PR is announced) ### Changes 1. Field Name Parsing: - Added logic to parse nested field names and safely extract their values from the MongoDB documents. 2. Projection Construction: - Updated the projection dictionary to include nested fields correctly. 3. Field Extraction: - Updated the `aload` method to handle nested field extraction using a recursive approach to traverse the nested dictionaries. ### Example Usage Updated usage example to demonstrate how to specify nested fields in the `field_names` parameter: ```python loader = MongodbLoader( connection_string=MONGO_URI, db_name=MONGO_DB, collection_name=MONGO_COLLECTION, filter_criteria={"data.job.company.industry_name": "IT", "data.job.detail": { "$exists": True }}, field_names=[ "data.job.detail.id", "data.job.detail.position", "data.job.detail.intro", "data.job.detail.main_tasks", "data.job.detail.requirements", "data.job.detail.preferred_points", "data.job.detail.benefits", ], ) docs = loader.load() print(len(docs)) for doc in docs: print(doc.page_content) ``` ### Testing Tested with a MongoDB collection containing nested documents to ensure that the nested fields are correctly extracted and concatenated into a single page_content string. ### Note This change ensures backward compatibility for non-nested fields and improves functionality for nested field extraction. ### Output Sample ```python print(docs[:3]) ``` ```shell # output sample: [ Document( # Here in this example, page_content is the combined text from the fields below # "position", "intro", "main_tasks", "requirements", "preferred_points", "benefits" page_content='all combined contents from the requested fields in the document', metadata={'database': 'Your Database name', 'collection': 'Your Collection name'} ), ... ] ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-24 19:29:11 +00:00
Tomaz Bratanic	aeeda370aa	Sanitize backticks from neo4j labels and types for import (#23367 )	2024-06-24 19:05:31 +00:00
Rave Harpaz	f5ff7f178b	Add OCI Generative AI new model support (#22880 ) - [x] PR title: community: Add OCI Generative AI new model support - [x] PR message: - Description: adding support for new models offered by OCI Generative AI services. This is a moderate update of our initial integration PR 16548 and includes a new integration for our chat models under /langchain_community/chat_models/oci_generative_ai.py - Issue: NA - Dependencies: No new Dependencies, just latest version of our OCI sdk - Twitter handle: NA - [x] Add tests and docs: 1. we have updated our unit tests 2. we have updated our documentation including a new ipynb for our new chat integration - [x] Lint and test: `make format`, `make lint`, and `make test` run successfully --------- Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com> Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>	2024-06-24 14:48:23 -04:00
Baur	aa358f2be4	community: Add ZenGuard tool (#22959 ) Description This is the community integration of ZenGuard AI - the fastest guardrails for GenAI applications. ZenGuard AI protects against: - Prompts Attacks - Veering of the pre-defined topics - PII, sensitive info, and keywords leakage. - Toxicity - Etc. Twitter Handle : @zenguardai - [x] Add tests and docs: If you're adding a new integration, please include 1. Added an integration test 2. Added colab - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. --------- Co-authored-by: Nuradil <nuradil.maksut@icloud.com> Co-authored-by: Nuradil <133880216+yaksh0nti@users.noreply.github.com>	2024-06-24 17:40:56 +00:00
Mathis Joffre	60103fc4a5	community: Fix OVHcloud 401 Unauthorized on embedding. (#23260 ) They are now rejecting with code 401 calls from users with expired or invalid tokens (while before they were being considered anonymous). Thus, the authorization header has to be removed when there is no token. Related to: #23178 --------- Signed-off-by: Joffref <mariusjoffre@gmail.com>	2024-06-24 12:58:32 -04:00
Leonid Ganeline	987099cfcd	community: `toolkits` docstrings (#23286 ) Added missed docstrings. Formatted docstrings to the consistent form. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-22 14:37:52 +00:00
Rahul Triptahi	0cd3f93361	Enhance metadata of sharepointLoader. (#22248 ) Description: 2 feature flags added to SharePointLoader in this PR: 1. load_auth: if set to True, adds authorised identities to metadata 2. load_extended_metadata, adds source, owner and full_path to metadata Unit tests:N/A Documentation: To be done. --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2024-06-21 17:03:38 -07:00
Rajendra Kadam	7ee2822ec2	community: Fix TypeError in PebbloRetrievalQA (#23170 ) Description: Fix "`TypeError: 'NoneType' object is not iterable`" when the auth_context is absent in PebbloRetrievalQA. The auth_context is optional; hence, PebbloRetrievalQA should work without it, but it throws an error at the moment. This PR fixes that issue. Issue: NA Dependencies: None Unit tests: NA --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-21 17:04:00 -04:00
Iurii Umnov	3b7b933aa2	community[minor]: OpenAPI agent. Add support for PUT, DELETE and PATCH (#22962 ) Description: Add PUT, DELETE and PATCH tools to tool list for OpenAPI agent if dangerous requests are allowed. Issue: https://github.com/langchain-ai/langchain/issues/20469	2024-06-21 20:44:23 +00:00
Guangdong Liu	3c42bf8d97	community(patch):Fix PineconeHynridSearchRetriever not having search_kwargs (#21577 ) - close #21521	2024-06-21 16:27:52 -04:00
Rahul Triptahi	4bb3d5c488	[community][quick-fix]: changed from blob.path to blob.path.name in 0365BaseLoader. (#22287 ) Description: file_metadata_ was not getting propagated to returned documents. Changed the lookup key to the name of the blob's path. Changed blob.path key to blob.path.name for metadata_dict key lookup. Documentation: N/A Unit tests: N/A Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-21 15:51:03 -04:00
mackong	360a70c8a8	core[patch]: fix no current event loop for sql history in async mode (#22933 ) - Description: When use RunnableWithMessageHistory/SQLChatMessageHistory in async mode, we'll get the following error: ``` Error in RootListenersTracer.on_chain_end callback: RuntimeError("There is no current event loop in thread 'asyncio_3'.") ``` which throwed by `ddfbca38df/libs/community/langchain_community/chat_message_histories/sql.py (L259)`. and no message history will be add to database. In this patch, a new _aexit_history function which will'be called in async mode is added, and in turn aadd_messages will be called. In this patch, we use `afunc` attribute of a Runnable to check if the end listener should be run in async mode or not. - Issue: #22021, #22022 - Dependencies: N/A	2024-06-21 10:39:47 -04:00
Zheng Robert Jia	a349fce880	docs[minor],community[patch]: Minor tutorial docs improvement, minor import error quick fix. (#22725 ) minor changes to module import error handling and minor issues in tutorial documents. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-06-20 15:36:49 -04:00
Luis Moros	d5be160af0	community[patch]: Fix sql_databse.from_databricks issue when ran from Job (#23224 ) Desscription: When the ``sql_database.from_databricks`` is executed from a Workflow Job, the ``context`` object does not have a "browserHostName" property, resulting in an error. This change manages the error so the "DATABRICKS_HOST" env variable value is used instead of stoping the flow Co-authored-by: lmorosdb <lmorosdb>	2024-06-20 19:34:15 +00:00
maang-h	bc4cd9c5cc	community[patch]: Update root_validators ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat, ChatSparkLLM, ChatZhipuAI (#22853 ) This PR updates root validators for: - ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat, ChatSparkLLM, ChatZhipuAI Issues #22819 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-20 16:36:41 +00:00
Leonid Ganeline	51e75cf59d	community: docstrings (#23202 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	2024-06-20 11:08:13 -04:00
xyd	9b3a025f9c	fix https://github.com/langchain-ai/langchain/issues/23215 (#23216 ) fix bug The ZhipuAIEmbeddings class is not working. Co-authored-by: xu yandong <shaonian@acsx1.onexmail.com>	2024-06-20 13:04:50 +00:00
Michał Krassowski	710197e18c	community[patch]: restore compatibility with SQLAlchemy 1.x (#22546 ) - Description: Restores compatibility with SQLAlchemy 1.4.x that was broken since #18992 and adds a test run for this version on CI (only for Python 3.11) - Issue: fixes #19681 - Dependencies: None - Twitter handle: `@krassowski_m` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-19 17:58:57 +00:00
Jorge Piedrahita Ortiz	b3e53ffca0	community[patch]: sambanova llm integration improvement (#23137 ) - Description: sambanova sambaverse integration improvement: removed input parsing that was changing raw user input, and was making to use process prompt parameter as true mandatory	2024-06-19 10:30:14 -07:00
Jorge Piedrahita Ortiz	e162893d7f	community[patch]: update sambastudio embeddings (#23133 ) Description: update sambastudio embeddings integration, now compatible with generic endpoints and CoE endpoints	2024-06-19 10:26:56 -07:00
chenxi	505a2e8743	fix: MoonshotChat fails when setting the moonshot_api_key through the OS environment. (#23176 ) Close #23174 Co-authored-by: tianming <tianming@bytenew.com>	2024-06-19 16:28:24 +00:00
Eugene Yurtsev	1007a715a5	community[patch]: Prevent unit tests from making network requests (#23180 ) * Prevent unit tests from making network requests	2024-06-19 14:56:30 +00:00
ccurme	ca798bc6ea	community: move test to integration tests (#23178 ) Tests failing on master with > FAILED tests/unit_tests/embeddings/test_ovhcloud.py::test_ovhcloud_embed_documents - ValueError: Request failed with status code: 401, {"message":"Bad token; invalid JSON"}	2024-06-19 14:39:48 +00:00
鹿鹿鹿鲨	6b46b5e9ce	community: add request_kwargs and expect TimeError AsyncHtmlLoader (#23068 ) - Description: add `request_kwargs` and expect `TimeError` in `_fetch` function for AsyncHtmlLoader. This allows you to fill in the kwargs parameter when using the `load()` method of the `AsyncHtmlLoader` class. Co-authored-by: Yucolu <yucolu@tencent.com>	2024-06-18 20:02:46 -07:00
Artem Mukhin	e271f75bee	docs: Fix URL formatting in deprecation warnings (#23075 ) Description Updated the URLs in deprecation warning messages. The URLs were previously written as raw strings and are now formatted to be clickable HTML links. Example of a broken link in the current API Reference: https://api.python.langchain.com/en/latest/chains/langchain.chains.openai_functions.extraction.create_extraction_chain_pydantic.html <img width="942" alt="Screenshot 2024-06-18 at 13 21 07" src="https://github.com/langchain-ai/langchain/assets/4854600/a1b1863c-cd03-4af2-a9bc-70375407fb00">	2024-06-18 14:49:58 -04:00
Gabriel Petracca	c6660df58e	community[minor]: Implement Doctran async execution (#22372 ) Description The DoctranTextTranslator has an async transform function that was not implemented because [the doctran library](https://github.com/psychic-api/doctran) uses a sync version of the `execute` method. - I implemented the `DoctranTextTranslator.atransform_documents()` method using `asyncio.to_thread` to run the function in a separate thread. - I updated the example in the Notebook with the new async version. - The performance improvements can be appreciated when a big document is divided into multiple chunks. Relates to: - Issue #14645: https://github.com/langchain-ai/langchain/issues/14645 - Issue #14437: https://github.com/langchain-ai/langchain/issues/14437 - https://github.com/langchain-ai/langchain/pull/15264 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-18 18:17:37 +00:00
nold	226802f0c4	community: add args_schema to SearxSearch (#22954 ) This change adds args_schema (pydantic BaseModel) to SearxSearchRun for correct schema formatting on LLM function calls Issue: currently using SearxSearchRun with OpenAI function calling returns the following error "TypeError: SearxSearchRun._run() got an unexpected keyword argument '__arg1' ". This happens because the schema sent to the LLM is "input: '{"__arg1":"foobar"}'" while the method should be called with the "query" parameter. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 17:27:39 +00:00
Finlay Macklon	616d06d7fe	community: glob multiple patterns when using DirectoryLoader (#22852 ) - Description: Updated community.langchain_community.document_loaders.directory.py to enable the use of multiple glob patterns in the `DirectoryLoader` class. Now, the glob parameter is of type `list[str] \| str` and still defaults to the same value as before. I updated the docstring of the class to reflect this, and added a unit test to community.tests.unit_tests.document_loaders.test_directory.py named `test_directory_loader_glob_multiple`. This test also shows an example of how to use the new functionality. - ~~Issue:~~Discussion Thread: https://github.com/langchain-ai/langchain/discussions/18559 - Dependencies: None - Twitter handle: N/a - [x] Add tests and docs - Added test (described above) - Updated class docstring - [x] Lint and test --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-06-18 09:24:50 -07:00
Takuya Igei	9f791b6ad5	core[patch],community[patch],langchain[patch]: `tenacity` dependency to version `>=8.1.0,<8.4.0` (#22973 ) Fix https://github.com/langchain-ai/langchain/issues/22972. - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-06-18 10:34:28 -04:00
Raghav Dixit	55705c0f5e	LanceDB integration update (#22869 ) Added : - [x] relevance search (w/wo scores) - [x] maximal marginal search - [x] image ingestion - [x] filtering support - [x] hybrid search w reranking make test, lint_diff and format checked.	2024-06-17 20:54:26 -07:00
Chang Liu	62c8a67f56	community: add KafkaChatMessageHistory (#22216 ) Add chat history store based on Kafka. Files added: `libs/community/langchain_community/chat_message_histories/kafka.py` `docs/docs/integrations/memory/kafka_chat_message_history.ipynb` New issue to be created for future improvement: 1. Async method implementation. 2. Message retrieval based on timestamp. 3. Support for other configs when connecting to cloud hosted Kafka (e.g. add `api_key` field) 4. Improve unit testing & integration testing.	2024-06-17 20:34:01 -07:00
Lucas Tucker	e25a5966b5	docs: Standardize DocumentLoader docstrings (#22932 ) Standardizing DocumentLoader docstrings (of which there are many) This PR addresses issue #22866 and adds docstrings according to the issue's specified format (in the appendix) for files csv_loader.py and json_loader.py in langchain_community.document_loaders. In particular, the following sections have been added to both CSVLoader and JSONLoader: Setup, Instantiate, Load, Async load, and Lazy load. It may be worth adding a 'Metadata' section to the JSONLoader docstring to clarify how we want to extract the JSON metadata (using the `metadata_func` argument). The files I used to walkthrough the various sections were `example_2.json` from [HERE](https://support.oneskyapp.com/hc/en-us/articles/208047697-JSON-sample-files) and `hw_200.csv` from [HERE](https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html). --------- Co-authored-by: lucast2021 <lucast2021@headroyce.org> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-18 03:26:36 +00:00
Mohammad Mohtashim	60ba02f5db	[Community]: Fixed DDG DuckDuckGoSearchResults Docstring (#22968 ) - Description: A very small fix in the Docstring of `DuckDuckGoSearchResults` identified in the following issue. - Issue: #22961 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 03:16:24 +00:00
Eun Hye Kim	70761af8cf	community: Fix #22975 (Add SSL Verification Option to Requests Class in langchain_community) (#22977 ) - PR title: "community: Fix #22975 (Add SSL Verification Option to Requests Class in langchain_community)" - PR message: - Description: - Added an optional verify parameter to the Requests class with a default value of True. - Modified the get, post, patch, put, and delete methods to include the verify parameter. - Updated the _arequest async context manager to include the verify parameter. - Added the verify parameter to the GenericRequestsWrapper class and passed it to the Requests class. - Issue: This PR fixes issue #22975. - Dependencies: No additional dependencies are required for this change. - Twitter handle: @lunara_x You can check this change with below code. ```python from langchain_openai.chat_models import ChatOpenAI from langchain.requests import RequestsWrapper from langchain_community.agent_toolkits.openapi import planner from langchain_community.agent_toolkits.openapi.spec import reduce_openapi_spec with open("swagger.yaml") as f: data = yaml.load(f, Loader=yaml.FullLoader) swagger_api_spec = reduce_openapi_spec(data) llm = ChatOpenAI(model='gpt-4o') swagger_requests_wrapper = RequestsWrapper(verify=False) # modified point superset_agent = planner.create_openapi_agent(swagger_api_spec, swagger_requests_wrapper, llm, allow_dangerous_requests=True, handle_parsing_errors=True) superset_agent.run( "Tell me the number and types of charts and dashboards available." ) ``` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 03:12:40 +00:00
Mohammad Mohtashim	bf839676c7	[Community]: FIxed the DocumentDBVectorSearch `_similarity_search_without_score` (#22970 ) - Description: The PR #22777 introduced a bug in `_similarity_search_without_score` which was raising the `OperationFailure` error. The mistake was syntax error for MongoDB pipeline which has been corrected now. - Issue: #22770	2024-06-17 20:08:42 -07:00
Anders Swanson	aacc6198b9	community: OCI GenAI embedding batch size (#22986 ) Thank you for contributing to LangChain! - [x] PR title: "community: OCI GenAI embedding batch size" - [x] PR message: - Issue: #22985 - [ ] Add tests and docs: N/A - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Signed-off-by: Anders Swanson <anders.swanson@oracle.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-17 22:06:45 +00:00
Oguz Vuruskaner	dd25d08c06	community[minor]: add tool calling for DeepInfraChat (#22745 ) DeepInfra now supports tool calling for supported models. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-17 15:21:49 -04:00
maang-h	c6b7db6587	community: Add Baichuan Embeddings batch size (#22942 ) - Support batch size Baichuan updates the document, indicating that up to 16 documents can be imported at a time - Standardized model init arg names - baichuan_api_key -> api_key - model_name -> model	2024-06-17 14:11:04 -04:00
Shubham Pandey	56ac94e014	community[minor]: add `ChatSnowflakeCortex` chat model (#21490 ) Description: This PR adds a chat model integration for [Snowflake Cortex](https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions), which gives an instant access to industry-leading large language models (LLMs) trained by researchers at companies like Mistral, Reka, Meta, and Google, including [Snowflake Arctic](https://www.snowflake.com/en/data-cloud/arctic/), an open enterprise-grade model developed by Snowflake. Dependencies: Snowflake's [snowpark](https://pypi.org/project/snowflake-snowpark-python/) library is required for using this integration. Twitter handle: [@gethouseware](https://twitter.com/gethouseware) - [x] Add tests and docs: 1. integration tests: `libs/community/tests/integration_tests/chat_models/test_snowflake.py` 2. unit tests: `libs/community/tests/unit_tests/chat_models/test_snowflake.py` 3. example notebook: `docs/docs/integrations/chat/snowflake.ipynb` - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-06-17 09:47:05 -07:00
Christopher Tee	ada03dd273	community(you): Better support for You.com News API (#22622 ) ## Description While `YouRetriever` supports both You.com's Search and News APIs, news is supported as an afterthought. More specifically, not all of the News API parameters are exposed for the user, only those that happen to overlap with the Search API. This PR: - improves support for both APIs, exposing the remaining News API parameters while retaining backward compatibility - refactor some REST parameter generation logic - updates the docstring of `YouSearchAPIWrapper` - add input validation and warnings to ensure parameters are properly set by user - 🚨 Breaking: Limit the news results to `k` items If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-06-15 20:05:19 +00:00
maang-h	7a0af56177	docs: update ZhipuAI ChatModel docstring (#22934 ) - Description: Update ZhipuAI ChatModel rich docstring - Issue: the issue #22296	2024-06-15 09:12:21 -04:00
Bitmonkey	570d45b2a1	Update ollama.py with optional raw setting. (#21486 ) Ollama has a raw option now. https://github.com/ollama/ollama/blob/main/docs/api.md Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-14 17:19:26 -07:00
caiyueliang	9944ad7f5f	community: 'Solve the issue where the _search function in ElasticsearchStore supports passing a query_vector parameter, but the parameter does not take effect. (#21532 ) Issue: When using the similarity_search_with_score function in ElasticsearchStore, I expected to pass in the query_vector that I have already obtained. I noticed that the _search function does support the query_vector parameter, but it seems to be ineffective. I am attempting to resolve this issue. Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-06-14 17:13:11 -07:00
Erick Friis	79a64207f5	community: release 0.2.5 (#22923 )	2024-06-14 15:45:07 -07:00
Baskar Gopinath	c4f2bc9540	docs: Fix wrongly referenced class name in confluence.py (#22879 ) Fixes #22542 Changed ConfluenceReader to ConfluenceLoader	2024-06-14 14:00:48 -07:00
Philippe PRADOS	b61de9728e	community[minor]: Fix long_context_reorder.py async (#22839 ) Implement `async def atransform_documents( self, documents: Sequence[Document], **kwargs: Any ) -> Sequence[Document]` for `LongContextReorder`	2024-06-14 13:55:18 -04:00
Eugene Yurtsev	c72bcda4f2	community[major], experimental[patch]: Remove Python REPL from community (#22904 ) Remove the REPL from community, and suggest an alternative import from langchain_experimental. Fix for this issue: https://github.com/langchain-ai/langchain/issues/14345 This is not a bug in the code or an actual security risk. The python REPL itself is behaving as expected. The PR is done to appease blanket security policies that are just looking for the presence of exec in the code. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-14 17:53:29 +00:00
Eugene Yurtsev	9a877c7adb	community[patch]: SitemapLoader restrict depth of parsing sitemap (CVE-2024-2965) (#22903 ) This PR restricts the depth to which the sitemap can be parsed. Fix for: CVE-2024-2965	2024-06-14 13:04:40 -04:00
Eugene Yurtsev	4a77a3ab19	core[patch]: fix validation of @deprecated decorator (#22513 ) This PR moves the validation of the decorator to a better place to avoid creating bugs while deprecating code. Prevent issues like this from arising: https://github.com/langchain-ai/langchain/issues/22510 we should replace with a linter at some point that just does static analysis	2024-06-14 16:52:30 +00:00
Mohammad Mohtashim	d1b7a934aa	[Community]: HuggingFaceCrossEncoder `score` accounting for <not-relevant score,relevant score> pairs. (#22578 ) - Description: Some of the Cross-Encoder models provide scores in pairs, i.e., <not-relevant score (higher means the document is less relevant to the query), relevant score (higher means the document is more relevant to the query)>. However, the `HuggingFaceCrossEncoder` `score` method does not currently take into account the pair situation. This PR addresses this issue by modifying the method to consider only the relevant score if score is being provided in pair. The reason for focusing on the relevant score is that the compressors select the top-n documents based on relevance. - Issue: #22556 - Please also refer to this [comment](https://github.com/UKPLab/sentence-transformers/issues/568#issuecomment-729153075)	2024-06-14 08:28:24 -07:00
Thanh Nguyen	b5e2ba3a47	community[minor]: add chat model llamacpp (#22589 ) - PR title: [community] add chat model llamacpp - PR message: - Description: This PR introduces a new chat model integration with llamacpp_python, designed to work similarly to the existing ChatOpenAI model. + Work well with instructed chat, chain and function/tool calling. + Work with LangGraph (persistent memory, tool calling), will update soon - Dependencies: This change requires the llamacpp_python library to be installed. @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-14 14:51:43 +00:00
Anindyadeep	c417803908	community[minor]: Prem Templates (#22783 ) This PR adds the feature add Prem Template feature in ChatPremAI. Additionally it fixes a minor bug for API auth error when API passed through arguments.	2024-06-13 19:59:28 -07:00
maang-h	1055b9a309	community[minor]: Implement ZhipuAIEmbeddings interface (#22821 ) - Description: Implement ZhipuAIEmbeddings interface, include: - The `embed_query` method - The `embed_documents` method refer to [ZhipuAI Embedding-2](https://open.bigmodel.cn/dev/api#text_embedding) --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-06-13 19:45:11 -07:00
Isaac Francisco	75e966a2fa	docs, cli[patch]: document loaders doc template (#22862 ) From: https://github.com/langchain-ai/langchain/pull/22290 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-13 19:28:57 -07:00
Eugene Yurtsev	8f7cc73817	ci: Add script to check for pickle usage in community (#22863 ) Add script to check for pickle usage in community.	2024-06-13 16:13:15 -04:00
Eugene Yurtsev	77209f315e	community[patch]: FAISS VectorStore deserializer should be opt-in (#22861 ) FAISS deserializer uses pickle module. Users have to opt-in to de-serialize.	2024-06-13 15:48:13 -04:00
Isaac Francisco	869523ad72	[docs]: added info for TavilySearchResults (#22765 )	2024-06-13 12:14:11 -07:00
Isaac Francisco	345fd3a556	minor functionality change: adding API functionality to tavilysearch (#22761 )	2024-06-13 11:10:28 -07:00
Isaac Francisco	034257e9bf	docs: improved recursive url loader docs (#22648 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-13 11:09:35 -07:00
Christophe Bornet	d04e899b56	ci: add testing with Python 3.12 (#22813 ) We need to use a different version of numpy for py3.8 and py3.12 in pyproject. And so do projects that use that Python version range and import langchain. - Twitter handle: _cbornet	2024-06-12 16:31:36 -04:00
HyoJin Kang	b6bf2bb234	community[patch]: fix database uri type in SQLDatabase (#22661 ) Description sqlalchemy uses "sqlalchemy.engine.URL" type for db uri argument. Added 'URL' type for compatibility. Issue: None Dependencies: None --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-12 15:11:00 -04:00
Eugene Yurtsev	265e650e64	community[patch]: Update root_validators embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub, Toolkits: Connery, ChatModels: PAI_EAS, (#22828 ) This PR updates root validators for: * Embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub * Toolkits: Connery * ChatModels: PAI_EAS Following this issue: https://github.com/langchain-ai/langchain/issues/22819	2024-06-12 13:59:05 -04:00
JonZeolla	32ba8cfab0	community[minor]: implement huggingface show_progress consistently (#22682 ) - Description: This implements `show_progress` more consistently (i.e. it is also added to the `HuggingFaceBgeEmbeddings` object). - Issue: This implements `show_progress` more consistently in the embeddings huggingface classes. Previously this could have been set via `encode_kwargs`. - Dependencies: None - Twitter handle: @jonzeolla	2024-06-12 17:30:56 +00:00
mrhbj	a1268d9e9a	community[patch]: fix hunyuan message include chinese signature error (#22795 ) (#22796 ) … (#22795) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-06-12 12:30:22 -04:00
Mr. Lance E Sloan «UMich»	08c466c603	community[patch]: bugfix for `YoutubeLoader`'s `LINES` format (#22815 ) - Description: A change I submitted recently introduced a bug in `YoutubeLoader`'s `LINES` output format. In those conditions, curly braces ("`{}`") creates a set, not a dictionary. This bugfix explicitly specifies that a dictionary is created. - Issue: N/A - Dependencies: N/A - Twitter: lsloan_umich - Mastodon: [lsloan@mastodon.social](https://mastodon.social/@lsloan)	2024-06-12 12:29:34 -04:00
Philippe PRADOS	23c22fcbc9	langchain[minor]: Make EmbeddingsFilters async (#22737 ) Add native async implementation for EmbeddingsFilter	2024-06-12 12:27:26 -04:00
mrhbj	9212c9fcb8	community[patch]: fix hunyuan client json analysis (#22452 ) (#22767 ) Thank you for contributing to LangChain! - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-11 19:05:18 +00:00
Rohan Aggarwal	86e8224cf1	community[patch]: Support for old clients (Thin and Thick) Oracle Vector Store (#22766 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" Support for old clients (Thin and Thick) Oracle Vector Store - [ ] PR message: *Delete this entire checklist* and replace with Support for old clients (Thin and Thick) Oracle Vector Store - [ ] Add tests and docs: If you're adding a new integration, please include Have our own local tests --------- Co-authored-by: rohan.aggarwal@oracle.com <rohaagga@phoenix95642.dev3sub2phx.databasede3phx.oraclevcn.com>	2024-06-11 11:36:06 -07:00
Mr. Lance E Sloan «UMich»	84dc2dd059	community[patch]: Load YouTube transcripts (captions) as fixed-duration chunks with start times (#21710 ) - Description: Add a new format, `CHUNKS`, to `langchain_community.document_loaders.youtube.YoutubeLoader` which creates multiple `Document` objects from YouTube video transcripts (captions), each of a fixed duration. The metadata of each chunk `Document` includes the start time of each one and a URL to that time in the video on the YouTube website. I had implemented this for UMich (@umich-its-ai) in a local module, but it makes sense to contribute this to LangChain community for all to benefit and to simplify maintenance. - Issue: N/A - Dependencies: N/A - Twitter: lsloan_umich - Mastodon: [lsloan@mastodon.social](https://mastodon.social/@lsloan) With regards to tests and documentation, most existing features of the `YoutubeLoader` class are not tested. Only the `YoutubeLoader.extract_video_id()` static method had a test. However, while I was waiting for this PR to be reviewed and merged, I had time to add a test for the chunking feature I've proposed in this PR. I have added an example of using chunking to the `docs/docs/integrations/document_loaders/youtube_transcript.ipynb` notebook. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-11 17:44:36 +00:00
Aayush Kataria	71811e0547	community[minor]: Adds a vector store for Azure Cosmos DB for NoSQL (#21676 ) This PR add supports for Azure Cosmos DB for NoSQL vector store. Summary: Description: added vector store integration for Azure Cosmos DB for NoSQL Vector Store, Dependencies: azure-cosmos dependency, Tag maintainer: @hwchase17, @baskaryan @efriis @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-11 10:34:01 -07:00
Mohammad Mohtashim	36cad5d25c	[Community]: Added Metadata filter support for DocumentDB Vector Store (#22777 ) - Description: As pointed out in this issue #22770, DocumentDB `similarity_search` does not support filtering through metadata which this PR adds by passing in the parameter `filter`. Also this PR fixes a minor Documentation error. - Issue: #22770 --------- Co-authored-by: Erick Friis <erickfriis@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-11 16:37:53 +00:00
Dmitry Stepanov	912751e268	Ollama vision support (#22734 ) Description: Ollama vision with messages in OpenAI-style support `{ "image_url": { "url": ... } }` Issue: #22460 Added flexible solution for ChatOllama to support chat messages with images. Works when you provide either `image_url` as a string or as a dict with "url" inside (like OpenAI does). So it makes available to use tuples with `ChatPromptTemplate.from_messages()` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-11 16:10:19 +00:00
Jaeyeon Kim(김재연)	ce4e29ae42	community[minor]: fix redis store docstring and streamline initialization code (#22730 ) Thank you for contributing to LangChain! ### Description Fix the example in the docstring of redis store. Change the initilization logic and remove redundant check, enhance error message. ### Issue The example in docstring of how to use redis store was wrong. ![image](https://github.com/langchain-ai/langchain/assets/37469330/78c5d9ce-ee66-45b3-8dfe-ea29f125e6e9) ### Dependencies Nothing - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-11 14:08:05 +00:00
am-kinetica	ad101adec8	community[patch]: Kinetica Integrations handled error in querying; quotes in table names; updated gpudb API (#22724 ) - [ ] Miscellaneous updates and fixes: - Description: Handled error in querying; quotes in table names; updated gpudb API - Issue: Threw an error with an error message difficult to understand if a query failed or returned no records - Dependencies: Updated GPUDB API version to `7.2.0.9` @baskaryan @hwchase17	2024-06-11 10:01:26 -04:00
Mathis Joffre	ea43f40daf	community[minor]: Add support for OVHcloud AI Endpoints Embedding (#22667 ) Description: Add support for [OVHcloud AI Endpoints](https://endpoints.ai.cloud.ovh.net/) Embedding models. Inspired by: https://gist.github.com/gmasse/e1f99339e161f4830df6be5d0095349a Signed-off-by: Joffref <mariusjoffre@gmail.com>	2024-06-10 21:07:25 +00:00
Eugene Yurtsev	05d31a2f00	community[patch]: Add missing type annotations (#22758 ) Add missing type annotations to objects in community. These missing type annotations will raise type errors in pydantic 2.	2024-06-10 16:59:28 -04:00
Bharat Ramanathan	2b5631a6be	community[patch]: fix `WandbTracer` to work with new "RunV2" API (#22673 ) - Description: This PR updates the `WandbTracer` to work with the new RunV2 API so that wandb Traces logging works correctly for new LangChain versions. Here's an example [run](https://wandb.ai/parambharat/langchain-tracing/runs/wpm99ftq) from the existing tests - Issue: https://github.com/wandb/wandb/issues/7762 - Twitter handle: @ParamBharat _If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17._	2024-06-10 13:56:35 -07:00
Oguz Vuruskaner	f0f4532579	community[patch]: fix deepinfra inference (#22680 ) This PR includes: 1. Update of default model to LLama3. 2. Handle some 400x errors with more user friendly error messages. 3. Handle user errors.	2024-06-10 13:55:55 -07:00
Tomaz Bratanic	76a193decc	community[patch]: Add function response to graph cypher qa chain (#22690 ) LLMs struggle with Graph RAG, because it's different from vector RAG in a way that you don't provide the whole context, only the answer and the LLM has to believe. However, that doesn't really work a lot of the time. However, if you wrap the context as function response the accuracy is much better. btw... `union[LLMChain, Runnable]` is linting fun, that's why so many ignores	2024-06-10 13:52:17 -07:00
X-HAN	34edfe4a16	community[minor]: add Volcengine Rerank (#22700 ) Description: this PR adds Volcengine Rerank capability to Langchain, you can find Volcengine Rerank API from [here](https://www.volcengine.com/docs/84313/1254474) & [here](https://www.volcengine.com/docs/84313/1254605). [Volcengine](https://www.volcengine.com/) is a cloud service platform developed by ByteDance, the parent company of TikTok. You can obtain Volcengine API AK/SK from [here](https://www.volcengine.com/docs/84313/1254553). Dependencies: VolcengineRerank depends on `volcengine` python package. Twitter handle: my twitter/x account is https://x.com/LastMonopoly and I'd like a mention, thank you! Tests and docs 1. integration test: `test_volcengine_rerank.py` 2. example notebook: `volcengine_rerank.ipynb` Lint and test: I have run `make format`, `make lint` and `make test` from the root of the package I've modified.	2024-06-10 13:41:05 -07:00
Mohammad Mohtashim	c3cce98d86	community[patch]: Small Fix in OutlookMessageLoader (Close the Message once Open) (#22744 ) - Description: A very small fix where we close the message when it opened - Issue: #22729	2024-06-10 13:08:39 -07:00
Max Mulatz	058a64c563	Community[minor]: Add language parser for Elixir (#22742 ) Hi 👋 First off, thanks a ton for your work on this 💚 Really appreciate what you're providing here for the community. ## Description This PR adds a basic language parser for the [Elixir](https://elixir-lang.org/) programming language. The parser code is based upon the approach outlined in https://github.com/langchain-ai/langchain/pull/13318: it's using `tree-sitter` under the hood and aligns with all the other `tree-sitter` based parses added that PR. The `CHUNK_QUERY` I'm using here is probably not the most sophisticated one, but it worked for my application. It's a starting point to provide "core" parsing support for Elixir in LangChain. It enables people to use the language parser out in real world applications which may then lead to further tweaking of the queries. I consider this PR just the ground work. - Dependencies: requires `tree-sitter` and `tree-sitter-languages` from the extended dependencies - Twitter handle:`@bitcrowd` ## Checklist - [x] PR title: "package: description" - [x] Add tests and docs - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. <!-- If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. -->	2024-06-10 15:56:57 +00:00
Enzo Poggio	8f019e91d7	community[patch]: Use Custom Logger Instead of Root Logger in get_user_agent Function (#22691 ) ## Description This PR addresses a logging inconsistency in the `get_user_agent` function. Previously, the function was using the root logger to log a warning message when the "USER_AGENT" environment variable was not set. This bypassed the custom logger `log` that was created at the start of the module, leading to potential inconsistencies in logging behavior. Changes: - Replaced `logging.warning` with `log.warning` in the `get_user_agent` function to ensure that the custom logger is used. This change ensures that all logging in the `get_user_agent` function respects the configurations of the custom logger, leading to more consistent and predictable logging behavior. ## Dependencies None ## Issue None ## Tests and docs ☝🏻 see description ## `make format`, `make lint` & `cd libs/community; make test` ```shell > make format poetry run ruff format docs templates cookbook 1417 files left unchanged poetry run ruff check --select I --fix docs templates cookbook All checks passed! ``` ```shell > make lint poetry run ruff check docs templates cookbook All checks passed! poetry run ruff format docs templates cookbook --diff 1417 files already formatted poetry run ruff check --select I docs templates cookbook All checks passed! git grep 'from langchain import' docs/docs templates cookbook \| grep -vE 'from langchain import (hub)' && exit 1 \|\| exit 0 ``` ~cd libs/community; make test~ too much dependencies for integration ... ```shell > poetry run pytest tests/unit_tests .... ==== 884 passed, 466 skipped, 4447 warnings in 15.93s ==== ``` I choose you randomly : @ccurme	2024-06-08 02:33:07 +00:00
Philippe PRADOS	9aabb446c5	community[minor]: Add SQL storage implementation (#22207 ) Hello @eyurtsev - package: langchain-comminity - Description: Add SQL implementation for docstore. A new implementation, in line with my other PR ([async PGVector](https://github.com/langchain-ai/langchain-postgres/pull/32), [SQLChatMessageMemory](https://github.com/langchain-ai/langchain/pull/22065)) - Twitter handler: pprados --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Piotr Mardziel <piotrm@gmail.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-07 21:17:02 +00:00
Nithish Raghunandanan	f2f0e0e13d	couchbase: Add the initial version of Couchbase partner package (#22087 ) Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-07 14:04:08 -07:00
Cahid Arda Öz	6c07eb0c12	community[minor]: Add UpstashRatelimitHandler (#21885 ) Adding `UpstashRatelimitHandler` callback for rate limiting based on number of chain invocations or LLM token usage. For more details, see [upstash/ratelimit-py repository](https://github.com/upstash/ratelimit-py) or the notebook guide included in this PR. Twitter handle: @cahidarda --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-07 21:02:06 +00:00
Bagatur	235d91940d	community[patch]: Release 0.2.4 (#22643 )	2024-06-06 17:47:44 -07:00
Erick Friis	a24a9c6427	multiple: get rid of pyproject extras (#22581 ) They cause `poetry lock` to take a ton of time, and `uv pip install` can resolve the constraints from these toml files in trivial time (addressing problem with #19153) This allows us to properly upgrade lockfile dependencies moving forward, which revealed some issues that were either fixed or type-ignored (see file comments)	2024-06-06 15:45:22 -07:00
Xiangrui Meng	f26ab93df8	community: support Databricks Unity Catalog functions as LangChain tools (#22555 ) This PR adds support for using Databricks Unity Catalog functions as LangChain tools, which runs inside a Databricks SQL warehouse. * An example notebook is provided.	2024-06-06 09:38:50 -07:00
lucasiscovici	05bf98b2f9	community[patch]: pgvector replace nin_ by not_in (#22619 ) - [ ] community: "pgvector: replace nin_ by not_in" - [ ] PR message: nin_ do not exist in sqlalchemy orm, it's not_in	2024-06-06 12:17:22 -04:00
Isaac Francisco	ba3e219d83	community[patch]: recursive url loader fix and unit tests (#22521 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-05 17:56:20 -07:00
Jeffrey Mak	5fc5ed463c	community[patch]:Support filter for AzureAISearchRetriever (#22303 ) Description: The AzureAISearchRetriever does not support the "$filter" argument offered in the AISearch API: https://learn.microsoft.com/en-us/rest/api/searchservice/documents/search-get?view=rest-searchservice-2023-11-01&tabs=HTTP The $filter allows filtering of indexes based on values in metadata. Issue: https://github.com/langchain-ai/langchain/issues/19885 Dependencies: No Twitter handle: @Jeffreym9M - [ ] Add tests and docs: Not relevant - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-06-05 16:53:19 -07:00
Isaac Francisco	148088a588	docs: duckduckgosearch options listed (#22568 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-05 23:29:47 +00:00
X-HAN	62f13f95e4	community[minor]: add DashScope Rerank (#22403 ) Description: this PR adds DashScope Rerank capability to Langchain, you can find DashScope Rerank API from [here](https://help.aliyun.com/document_detail/2780058.html?spm=a2c4g.2780059.0.0.6d995024FlrJ12) & [here](https://help.aliyun.com/document_detail/2780059.html?spm=a2c4g.2780058.0.0.63f75024cr11N9). [DashScope](https://dashscope.aliyun.com/) is the generative AI service from Alibaba Cloud (Aliyun). You can create DashScope API key from [here](https://bailian.console.aliyun.com/?apiKey=1#/api-key). Dependencies: DashScopeRerank depends on `dashscope` python package. Twitter handle: my twitter/x account is https://x.com/LastMonopoly and I'd like a mention, thanks you! Tests and docs 1. integration test: `test_dashscope_rerank.py` 2. example notebook: `dashscope_rerank.ipynb` Lint and test: I have run `make format`, `make lint` and `make test` from the root of the package I've modified. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-05 15:40:21 -07:00
Ethan Yang	29064848f9	[Community]add option to delete the prompt from HF output (#22225 ) This will help to solve pattern mismatching issue when parsing the output in Agent. https://github.com/langchain-ai/langchain/issues/21912	2024-06-05 18:38:54 -04:00
Bagatur	584a1e30ac	community[patch]: AzureSearch async functions (#22075 )	2024-06-05 14:39:54 -07:00
ccurme	af129974a3	community: update how OpenAIAssistantV2Runnable creates threads with tool_resources (#22549 ) https://github.com/langchain-ai/langchain/issues/22503	2024-06-05 14:19:41 -04:00
Bagatur	51a0d4574e	community[patch]: Release 0.2.3 (#22562 )	2024-06-05 17:27:24 +00:00
leila-messallem	3280a5b49b	community[patch]: improve test setup to accurately test filtering of labels in neo4j (#22531 ) Description: This PR addresses an issue with an existing test that was not effectively testing the intended functionality. The previous test setup did not adequately validate the filtering of the labels in neo4j, because the nodes and relationship in the test data did not have any properties set. Without properties these labels would not have been returned, regardless of the filtering. --------- Co-authored-by: Oskar Hane <oh@oskarhane.com>	2024-06-05 15:56:53 +00:00
Stefano Lottini	328d0c99f2	community[minor]: Add support for metadata indexing policy in Cassandra vector store (#22548 ) This PR adds a constructor `metadata_indexing` parameter to the Cassandra vector store to allow optional fine-tuning of which fields of the metadata are to be indexed. This is a feature supported by the underlying CassIO library. Indexing mode of "all", "none" or deny- and allow-list based choices are available. The rationale is, in some cases it's advisable to programmatically exclude some portions of the metadata from the index if one knows in advance they won't ever be used at search-time. this keeps the index more lightweight and performant and avoids limitations on the length of _indexed_ strings. I added a integration test of the feature. I also added the possibility of running the integration test with Cassandra on an arbitrary IP address (e.g. Dockerized), via `CASSANDRA_CONTACT_POINTS=10.1.1.5,10.1.1.6 poetry run pytest [...]` or similar. While I was at it, I added a line to the `.gitignore` since the mypy _test_ cache was not ignored yet. My X (Twitter) handle: @rsprrs.	2024-06-05 11:23:26 -04:00
Emilien Chauvet	c3d4126eb1	community[minor]: add user agent for web scraping loaders (#22480 ) Description: This PR adds a `USER_AGENT` env variable that is to be used for web scraping. It creates a util to get that user agent and uses it in the classes used for scraping in [this piece of doc](https://python.langchain.com/v0.1/docs/use_cases/web_scraping/). Identifying your scraper is considered a good politeness practice, this PR aims at easing it. Issue: `None` Dependencies: `None` Twitter handle: `None`	2024-06-05 15:20:34 +00:00
Philippe PRADOS	8250c177de	community[minor]: Add native async support to SQLChatMessageHistory (#22065 ) # package community: Fix SQLChatMessageHistory ## Description Here is a rewrite of `SQLChatMessageHistory` to properly implement the asynchronous approach. The code circumvents [issue 22021](https://github.com/langchain-ai/langchain/issues/22021) by accepting a synchronous call to `def add_messages()` in an asynchronous scenario. This bypasses the bug. For the same reasons as in [PR 22](https://github.com/langchain-ai/langchain-postgres/pull/32) of `langchain-postgres`, we use a lazy strategy for table creation. Indeed, the promise of the constructor cannot be fulfilled without this. It is not possible to invoke a synchronous call in a constructor. We compensate for this by waiting for the next asynchronous method call to create the table. The goal of the `PostgresChatMessageHistory` class (in `langchain-postgres`) is, among other things, to be able to recycle database connections. The implementation of the class is problematic, as we have demonstrated in [issue 22021](https://github.com/langchain-ai/langchain/issues/22021). Our new implementation of `SQLChatMessageHistory` achieves this by using a singleton of type (`Async`)`Engine` for the database connection. The connection pool is managed by this singleton, and the code is then reentrant. We also accept the type `str` (optionally complemented by `async_mode`. I know you don't like this much, but it's the only way to allow an asynchronous connection string). In order to unify the different classes handling database connections, we have renamed `connection_string` to `connection`, and `Session` to `session_maker`. Now, a single transaction is used to add a list of messages. Thus, a crash during this write operation will not leave the database in an unstable state with a partially added message list. This makes the code resilient. We believe that the `PostgresChatMessageHistory` class is no longer necessary and can be replaced by: ``` PostgresChatMessageHistory = SQLChatMessageHistory ``` This also fixes the bug. ## Issue - [issue 22021](https://github.com/langchain-ai/langchain/issues/22021) - Bug in _exit_history() - Bugs in PostgresChatMessageHistory and sync usage - Bugs in PostgresChatMessageHistory and async usage - [issue 36](https://github.com/langchain-ai/langchain-postgres/issues/36) ## Twitter handle: pprados ## Tests - libs/community/tests/unit_tests/chat_message_histories/test_sql.py (add async test) @baskaryan, @eyurtsev or @hwchase17 can you check this PR ? And, I've been waiting a long time for validation from other PRs. Can you take a look? - [PR 32](https://github.com/langchain-ai/langchain-postgres/pull/32) - [PR 15575](https://github.com/langchain-ai/langchain/pull/15575) - [PR 13200](https://github.com/langchain-ai/langchain/pull/13200) --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-05 15:10:38 +00:00
Vincent Min	59bef31997	community[minor]: Improve InMemoryVectorStore with ability to persist to disk and filter on metadata. (#22186 ) - Description: The InMemoryVectorStore is a nice and simple vector store implementation for quick development and debugging. The current implementation is quite limited in its functionalities. This PR extends the functionalities by adding utility function to persist the vector store to a json file and to load it from a json file. We choose the json file format because it allows inspection of the database contents in a text editor, which is great for debugging. Furthermore, it adds a `filter` keyword that can be used to filter out documents on their `page_content` or `metadata`. - Issue: - - Dependencies: - - Twitter handle: @Vincent_Min	2024-06-05 10:40:34 -04:00
maang-h	89128b7a49	community[patch]: add detailed paragraph and example for BaichuanTextEmbeddings (#22031 ) - Description: add detailed paragraph and example for BaichuanTextEmbeddings - Issue: the issue #21983	2024-06-05 10:18:11 -04:00
Anthony Bernabeu	4e676a63b8	community[minor]: Added filter search for LanceDB (#22461 ) - [ ] community: "vectorstore: added filtering support for LanceDB vector store" - [ ] This PR adds filtering capabilities to LanceDB: - Description: In LanceDB filtering can be applied when searching for data into the vectorstore. It is using the SQL language as mentioned in the LanceDB documentation. - Issue: #18235 - Dependencies: No - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-06-05 09:33:54 -04:00
Erick Friis	58192d617f	community: fix huggingface deprecations (#22522 )	2024-06-05 04:13:13 +00:00
Ofer Mendelevitch	ad502e8d50	community[minor]: Vectara Integration Update - Streaming, FCS, Chat, updates to documentation and example notebooks (#21334 ) Thank you for contributing to LangChain! Description: update to the Vectara / Langchain integration to integrate new Vectara capabilities: - Full RAG implemented as a Runnable with as_rag() - Vectara chat supported with as_chat() - Both support streaming response - Updated documentation and example notebook to reflect all the changes - Updated Vectara templates Twitter handle: ofermend Add tests and docs: no new tests or docs, but updated both existing tests and existing docs	2024-06-04 12:57:28 -07:00
Bagatur	e60f88ccdd	community[patch]: Release 0.2.2 (#22496 )	2024-06-04 11:42:11 -07:00
Joydeep Banik Roy	3796672c67	community, milvus, pinecone, qdrant, mongo: Broadcast operation failure while using simsimd beyond v3.7.7 (#22271 ) - [ ] Packages affected: - community: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/milvus: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/mongodb: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/pinecone: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/qdrant: fix `cosine_similarity` to support simsimd beyond 3.7.7 - [ ] Broadcast operation failure while using simsimd beyond v3.7.7: - Description: I was using simsimd 4.3.1 and the unsupported operand type issue popped up. When I checked out the repo and ran the tests, they failed as well (have attached a screenshot for that). Looks like it is a variant of https://github.com/langchain-ai/langchain/issues/18022 . Prior to 3.7.7, simd.cdist returned an ndarray but now it returns simsimd.DistancesTensor which is ineligible for a broadcast operation with numpy. With this change, it also remove the need to explicitly cast `Z` to numpy array - Issue: #19905 - Dependencies: No - Twitter handle: https://x.com/GetzJoydeep <img width="1622" alt="Screenshot 2024-05-29 at 2 50 00 PM" src="https://github.com/langchain-ai/langchain/assets/31132555/fb27b383-a9ae-4a6f-b355-6d503b72db56"> - [ ] Considerations: 1. I started with community but since similar changes were there in Milvus, MongoDB, Pinecone, and QDrant so I modified their files as well. If touching multiple packages in one PR is not the norm, then I can remove them from this PR and raise separate ones 2. I have run and verified that the tests work. Since, only MongoDB had tests, I ran theirs and verified it works as well. Screenshots attached : <img width="1573" alt="Screenshot 2024-05-29 at 2 52 13 PM" src="https://github.com/langchain-ai/langchain/assets/31132555/ce87d1ea-19b6-4900-9384-61fbc1a30de9"> <img width="1614" alt="Screenshot 2024-05-29 at 3 33 51 PM" src="https://github.com/langchain-ai/langchain/assets/31132555/6ce1d679-db4c-4291-8453-01028ab2dca5"> I have added a test for simsimd. I feel it may not go well with the CI/CD setup as installing simsimd is not a dependency requirement. I have just imported simsimd to ensure simsimd cosine similarity is invoked. However, its not a good approach. Suggestions are welcome and I can make the required changes on the PR. Please provide guidance on the same as I am new to the community. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-04 17:36:31 +00:00
KyrianC	03178ee74f	community[minor]: Add tools calls to `ChatEdenAI` (#22320 ) ### Description Add tools implementation to `ChatEdenAI`: - `bind_tools()` - `with_structured_output()` ### Documentation Updated `docs/docs/integrations/chat/edenai.ipynb` ### Notes We don´t support stream with tools as of yet. If stream is called with tools we directly yield the whole message from `generate` (implemented the same way as Anthropic did).	2024-06-04 10:29:28 -07:00
pranavvuppala	9d4350e69a	docs : Update docstrings for OpenAI base.py (#22221 ) - [x] PR title: Update docstrings for OpenAI base.py -Description: Updated the docstring of few OpenAI functions for a better understanding of the function. - Issue: #21983 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-04 17:24:17 +00:00
Anindyadeep	7a197539aa	communty[patch]: Native RAG Support in Prem AI langchain (#22238 ) This PR adds native RAG support in langchain premai package. The same has been added in the docs too.	2024-06-04 10:19:54 -07:00
Rahul Triptahi	77ad857934	community[minor]: Enable retrieval api calls in PebbloRetrievalQA (#21958 ) Description: Enable app discovery and Prompt/Response apis in PebbloSafeRetrieval Documentation: NA Unit test: N/A --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2024-06-04 10:18:50 -07:00
Bagatur	17c127531a	community[patch]: deprecate all HF classes (#22444 )	2024-06-04 09:48:25 -07:00
Christophe Bornet	9a8fe58ebe	community[minor]: Improve Cassandra VectorStore as_retriever (#22465 ) The Vectorstore's API `as_retriever` doesn't expose explicitly the parameters `search_type` and `search_kwargs` and so these are not well documented. This PR improves `as_retriever` for the Cassandra VectorStore by making these parameters explicit. NB: An alternative would have been to modify `as_retriever` in `Vectorstore`. But there's probably a good reason these were not exposed in the first place ? Is it because implementations may decide to not support them and have fixed values when creating the VectorStoreRetriever ?	2024-06-04 09:51:17 -04:00
Fahreddin Özcan	0061ded002	community[patch]: Upstash Vector Store Namespace Support (#22251 ) This PR introduces namespace support for Upstash Vector Store, which would allow users to partition their data in the vector index. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-03 17:30:56 -07:00
Dristy Srivastava	ef3df45d9d	community[minor]: Updating payload for pebblo discover API (#22309 ) Description: Updating response for pebblo discover API. Also updating filed name case type Documentation: N/A Unit tests: N/A	2024-06-03 15:36:17 -07:00
ccurme	afe89a1411	community: add standard chat model params to Ollama (#22446 )	2024-06-03 17:45:03 -04:00
Ethan Yang	52da6a160d	community[patch]: Update OpenVINO embedding and reranker to support static input shape (#22171 ) It can help to deploy embedding models on NPU device	2024-06-03 13:27:17 -07:00
maang-h	01352bb55f	community[minor]: Implement MiniMaxChat interface (#22391 ) - Description: Implement MiniMaxChat interface, include: - No longer inherits the LLM class (like other chat model) - Update request parameters (v1 -> v2) - update `base url` - update message role (system, user, assistant) - add `stream` function - no longer use `group id` - Implement the `_stream`, `_agenerate`, and `_astream` interfaces [minimax v2 api document](https://platform.minimaxi.com/document/guides/chat-model/V2?id=65e0736ab2845de20908e2dd)	2024-06-03 13:22:38 -07:00
Brandon Sharp	56e5aa4dd9	community[patch]: Airtable to allow for addtl params (#22092 ) - [X] PR title: "community: added optional params to Airtable table.all()" - [X] PR message: - Description: Add's kwargs to AirtableLoader to allow for kwargs: https://pyairtable.readthedocs.io/en/latest/api.html#pyairtable.Table.all - Issue: N/A - Dependencies: N/A - Twitter handle: parakoopa88 - [X] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [X] Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-03 13:05:56 -07:00
Harichandan Roy	1f751343e2	community[patch]: update embeddings/oracleai.py (#22240 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" "community/embeddings: update oracleai.py" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! Adding oracle VECTOR_ARRAY_T support. - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. Tests are not impacted. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Done. Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-06-03 12:38:51 -07:00
maang-h	13140dc4ff	community[patch]: Update the default api_url and reqeust_body of sparkllm embedding (#22136 ) - Description: When I was running the SparkLLMTextEmbeddings, app_id, api_key and api_secret are all correct, but it cannot run normally using the current URL. ```python # example from langchain_community.embeddings import SparkLLMTextEmbeddings embedding= SparkLLMTextEmbeddings( spark_app_id="my-app-id", spark_api_key="my-api-key", spark_api_secret="my-api-secret" ) embedding= "hello" print(spark.embed_query(text1)) ``` ![sparkembedding](https://github.com/langchain-ai/langchain/assets/55082429/11daa853-4f67-45b2-aae2-c95caa14e38c) So I updated the url and request body parameters according to [Embedding_api](https://www.xfyun.cn/doc/spark/Embedding_api.html), now it is runnable.	2024-06-03 12:38:11 -07:00
Yuwen Hu	ba0dca46d7	community[minor]: Add IPEX-LLM BGE embedding support on both Intel CPU and GPU (#22226 ) Description: [IPEX-LLM](https://github.com/intel-analytics/ipex-llm) is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low latency. This PR adds ipex-llm integrations to langchain for BGE embedding support on both Intel CPU and GPU. Dependencies: `ipex-llm`, `sentence-transformers` Contribution maintainer: @Oscilloscope98 tests and docs: - langchain/docs/docs/integrations/text_embedding/ipex_llm.ipynb - langchain/docs/docs/integrations/text_embedding/ipex_llm_gpu.ipynb - langchain/libs/community/tests/integration_tests/embeddings/test_ipex_llm.py --------- Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>	2024-06-03 12:37:10 -07:00
Dan	86509161b0	community: fix AzureSearch delete documents (#22315 ) Description Fix AzureSearch delete documents method by using FIELDS_ID variable instead of the hard coded "id" value Issue: This is linked to this issue: https://github.com/langchain-ai/langchain/issues/22314 Co-authored-by: dseban <dan.seban@neoxia.com>	2024-06-03 15:55:06 +00:00
Charles John	2d81a72884	community: fix missing `apify_api_token` field in ApifyWrapper (#22421 ) - Description: The `ApifyWrapper` class expects `apify_api_token` to be passed as a named parameter or set as an environment variable. But the corresponding field was missing in the class definition causing the argument to be ignored when passed as a named param. This patch fixes that.	2024-06-03 14:32:57 +00:00
Joan Fontanals	a7ae16f912	add `embed_image` API to JinaEmbedding (#22416 ) - Description: Add `embed_image` to JinaEmbedding to embed images - Twitter handle: https://x.com/JinaAI_	2024-06-03 10:23:37 -04:00
WU LIFU	86698b02a9	doc: fix wrong documentation on FAISS load_local function (#22310 ) ### Issue: #22299 ### descriptions The documentation appears to be wrong. When the user actually sets this parameter "asynchronous" to be True, it fails because the __init__ function of FAISS class doesn't allow this parameter. In fact, most of the class/instance functions of this class have both the sync/async version, so it looks like what we need is just to remove this parameter from the doc. Thank you for contributing to LangChain! - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. Co-authored-by: Lifu Wu <lifu@nextbillion.ai>	2024-05-30 15:15:04 +00:00
maang-h	596c062cba	community[patch]: Standardize qianfan model init args name (#22322 ) - Description: - Standardize qianfan chat model intialization arguments name - qianfan_ak (qianfan api key) -> api_key - qianfan_sk (qianfan secret key) -> secret_key - Delete unuse variable - Issue: #20085	2024-05-30 11:08:32 -04:00
Dobiichi-Origami	10b12e1c08	community: adding tool_call_id for every ToolCall (#22323 ) - Description: This PR contains a bugfix which result in malfunction of multi-turn conversation in QianfanChatEndpoint and adaption for ToolCall and ToolMessage	2024-05-30 10:59:08 -04:00
ccurme	f39e1a2288	community, docs: update token usage tracking callback + how-to guides (#22145 )	2024-05-29 17:00:47 -04:00
ChengZi	404d92ded0	milvus: New langchain_milvus package and new milvus features (#21077 ) New features: - New langchain_milvus package in partner - Milvus collection hybrid search retriever - Zilliz cloud pipeline retriever - Milvus Local guid - Rag-milvus template --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Signed-off-by: Jael Gu <mengjia.gu@zilliz.com> Co-authored-by: Jael Gu <mengjia.gu@zilliz.com> Co-authored-by: Jackson <jacksonxie612@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Erick Friis <erickfriis@gmail.com>	2024-05-28 08:24:20 -07:00
Will Higgins	83d10df78d	community[patch]: Update firecrawl api key name (#22183 ) Change 'FIREWALL' to 'FIRECRAWL' as I believe this may have been in error. Other docs refer to 'FIRECRAWL_API_KEY'. Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-27 21:39:29 +00:00
Pavlo Paliychuk	342df7cf83	community[minor]: Add Zep Cloud components + docs + examples (#21671 ) Thank you for contributing to LangChain! - [x] PR title: community: Add Zep Cloud components + docs + examples - [x] PR message: We have recently released our new zep-cloud sdks that are compatible with Zep Cloud (not Zep Open Source). We have also maintained our Cloud version of langchain components (ChatMessageHistory, VectorStore) as part of our sdks. This PRs goal is to port these components to langchain community repo, and close the gap with the existing Zep Open Source components already present in community repo (added ZepCloudMemory,ZepCloudVectorStore,ZepCloudRetriever). Also added a ZepCloudChatMessageHistory components together with an expression language example ported from our repo. We have left the original open source components intact on purpose as to not introduce any breaking changes. - Issue: - - Dependencies: Added optional dependency of our new cloud sdk `zep-cloud` - Twitter handle: @paulpaliychuk51 - [x] Add tests and docs - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.	2024-05-27 12:50:13 -07:00
Jan Soubusta	cccc8fbe2f	community[patch]: DuckDB VS - expose similarity, improve performance of from_texts (#20971 ) 3 fixes of DuckDB vector store: - unify defaults in constructor and from_texts (users no longer have to specify `vector_key`). - include search similarity into output metadata (fixes #20969) - significantly improve performance of `from_documents` Dependencies: added Pandas to speed up `from_documents`. I was thinking about CSV and JSON options, but I expect trouble loading JSON values this way and also CSV and JSON options require storing data to disk. Anyway, the poetry file for langchain-community already contains a dependency on Pandas. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-05-24 15:17:52 -07:00
Ameya Shenoy	8ba492ed6a	community[minor]: clickhouse -- ability to use secure connection (#22108 ) - Description: this PR gives clickhouse client the ability to use a secure connection to the clickhosue server - Issue: fixes #22082 - Dependencies: - - Twitter handle: `_codingcoffee_` Signed-off-by: Ameya Shenoy <shenoy.ameya@gmail.com> Co-authored-by: Shresth Rana <shresth@grapevine.in>	2024-05-24 17:30:22 +00:00
Rahul Triptahi	1a485f59b9	community[patch]: Put authorized identities behind a feature flag in SharepointLoader (#22125 ) Description: Put authorised identities behind a feature flag, load_auth. Documentation: N/A Unit tests: N/A --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2024-05-24 12:42:57 -04:00
sasha	1c9ceff503	community: add metadata to chain logging; (#22122 ) Hey, I'm Sasha. The SDK engineer from [Comet](https://comet.com). This PR updates the CometTracer class. Added metadata to CometTracerr. From now on, both chains and spans will send it.	2024-05-24 15:29:40 +00:00
Jirka Lhotka	7c0459faf2	community: Update costs of openai finetuned models (#22124 ) - Description: Update costs of finetuned models and add gpt-3-turbo-0125. Source: https://openai.com/api/pricing/ - Issue: N/A - Dependencies: None	2024-05-24 15:25:17 +00:00
Eugene Yurtsev	d3db83abe3	community[major]: lint for usage of xml library (#22132 ) * Lint for usage of standard xml library * Add forced opt-in for quip client * Actual security issue is with underlying QuipClient not LangChain integration (since the client is doing the parsing), but adding enforcement at the LangChain level.	2024-05-24 15:23:53 +00:00
Christophe Bornet	c838de5027	doc: Add doc for CassandraByteStore (#22126 ) Preview: https://langchain-git-fork-cbornet-doc-cassandrabytestore-langchain.vercel.app/v0.2/docs/integrations/stores/cassandra/	2024-05-24 10:57:55 -04:00
Eugene Yurtsev	2d693c484e	docs: fix some spelling mistakes caught by newest version of code spell (#22090 ) Going to merge this even though it doesn't pass all tests, and open a separate PR for the remaining spelling mistakes.	2024-05-23 16:59:11 -04:00
Pavel Zloi	fe26f937e4	community[minor]: ManticoreSearch engine added to vectorstore (#19117 ) Description: ManticoreSearch engine added to vectorstores Issue: no issue, just a new feature Dependencies: https://pypi.org/project/manticoresearch-dev/ Twitter handle: @EvilFreelancer - Example notebook with test integration: https://github.com/EvilFreelancer/langchain/blob/manticore-search-vectorstore/docs/docs/integrations/vectorstores/manticore_search.ipynb --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-23 13:56:18 -07:00
Bagatur	3d26807b92	community[patch]: Release. 0.2.1 (#22073 )	2024-05-23 10:40:32 -07:00
maang-h	9aba9e3e33	community[patch]: Update the default “API URL” and “MODEL” of sparkllm (#22070 ) - Description: When I was running the sparkllm, I found that the default parameters currently used could no longer run correctly. - original parameters & values: - spark_api_url: "wss://spark-api.xf-yun.com/v3.1/chat" - spark_llm_domain: "generalv3" ```python # example from langchain_community.chat_models import ChatSparkLLM spark = ChatSparkLLM(spark_app_id="my_app_id", spark_api_key="my_api_key", spark_api_secret="my_api_secret") spark.invoke("hello") ``` ![sparkllm](https://github.com/langchain-ai/langchain/assets/55082429/5369bfdf-4305-496a-bcf5-2d3f59d39414) So I updated them to 3.5 (same as sparkllm official website). After the update, they can be used normally. - new parameters & values: - spark_api_url: "wss://spark-api.xf-yun.com/v3.5/chat" - spark_llm_domain: "generalv3.5"	2024-05-23 12:25:20 -04:00
Martin Triska	2df8ac402a	community[minor]: Added propagation of document metadata from O365BaseLoader (#20663 ) Description: - Added propagation of document metadata from O365BaseLoader to FileSystemBlobLoader (O365BaseLoader uses FileSystemBlobLoader under the hood). - This is done by passing dictionary `metadata_dict`: key=filename and value=dictionary containing document's metadata - Modified `FileSystemBlobLoader` to accept the `metadata_dict`, use `mimetype` from it (if available) and pass metadata further into blob loader. Issue: - `O365BaseLoader` under the hood downloads documents to temp folder and then uses `FileSystemBlobLoader` on it. - However metadata about the document in question is lost in this process. In particular: - `mime_type`: `FileSystemBlobLoader` guesses `mime_type` from the file extension, but that does not work 100% of the time. - `web_url`: this is useful to keep around since in RAG LLM we might want to provide link to the source document. In order to work well with document parsers, we pass the `web_url` as `source` (`web_url` is ignored by parsers, `source` is preserved) Dependencies: None Twitter handle: @martintriska1 Please review @baskaryan --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-05-23 11:42:19 -04:00
Eugene Yurtsev	e5541d1da7	community[patch]: Update doc-string in CloudBlobLoader (#22069 ) Update doc-string	2024-05-23 15:31:41 +00:00
Philippe PRADOS	6dd621d636	community[minor]: Add CloudBlobLoader that supports loading data from cloud buckets (#21957 ) Thank you for contributing to LangChain! - [ ] PR title: "Add CloudBlobLoader" - community: Add CloudBlobLoader - [ ] PR message: Add cloud blob loader - Description: Langchain provides several approaches to read different file formats: Specific loaders (`CVSLoader`) or blob-compatible loaders (`FileSystemBlobLoader`). The only implementation proposed for BlobLoader is `FileSystemBlobLoader`. Many projects retrieve files from cloud storage. We propose a new implementation of `BlobLoader` to read files from the three cloud storage systems. The interface is strictly identical to `FileSystemBlobLoader`. The only difference is the constructor, which takes a cloud "url" object such as `s3://my-bucket`, `az://my-bucket`, or `gs://my-bucket`. By streamlining the process, this novel implementation eliminates the requirement to pre-download files from cloud storage to local temporary files (which are seldom removed). The code relies on the [CloudPathLib](https://cloudpathlib.drivendata.org/stable/) library to interpret cloud URLs. This has been added as an optional dependency. ```Python loader = CloudBlobLoader("s3://mybucket/id") for blob in loader.yield_blobs(): print(blob) ``` - [X] Dependencies: CloudPathLib - [X] Twitter handle: pprados - [X] Add tests and docs: Add unit test, but it's easy to convert to integration test, with some files in a cloud storage (see `test_cloud_blob_loader.py`) - [X] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. Hello from Paris @hwchase17. Can you review this PR? --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-05-23 10:59:55 -04:00
Christophe Bornet	74947ec894	community[minor]: Add Cassandra ByteStore (#22064 )	2024-05-23 10:46:23 -04:00

... 6 7 8 9 10 ...

1804 Commits