langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-28 22:14:18 +00:00

Author	SHA1	Message	Date
Rajendra Kadam	a6add89bd4	community[minor]: [PebbloSafeLoader] Implement content-size-based batching (#24871 ) - Title: [PebbloSafeLoader] Implement content-size-based batching in the classification flow(loader/doc API) - Description: - Implemented content-size-based batching in the loader/doc API, set to 100KB with no external configuration option, intentionally hard-coded to prevent timeouts. - Remove unused field(pb_id) from doc_metadata - Issue: NA - Dependencies: NA - Add tests and docs: Updated	2024-07-31 09:10:28 -04:00
TrumanYan	096b66db4a	community: replace it with Tencent Cloud SDK (#24172 ) Description: The old method will be discontinued; use the official SDK for more model options. Issue: None Dependencies: None Twitter handle: None Co-authored-by: trumanyan <trumanyan@tencent.com>	2024-07-31 09:05:38 -04:00
Erick Friis	1f5444817a	community: deprecate BedrockEmbeddings in favor of langchain-aws (#24846 )	2024-07-30 23:13:17 +00:00
Shailendra Mishra	f2d810b3c0	clob_bugfix... (#24813 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-07-30 12:44:04 -04:00
Anush	51b15448cc	community: Fix FastEmbedEmbeddings (#24462 ) ## Description This PR: - Fixes the validation error in `FastEmbedEmbeddings`. - Adds support for `batch_size`, `parallel` params. - Removes support for very old FastEmbed versions. - Updates the FastEmbed doc with the new params. Associated Issues: - Resolves #24039 - Resolves #https://github.com/qdrant/fastembed/issues/296	2024-07-30 12:42:46 -04:00
ccurme	73ec24fc56	docs[patch]: add toolkit template (#24791 )	2024-07-30 12:36:09 -04:00
Igor Drozdov	c2706cfb9e	feat(community): add tools support for litellm (#23906 ) I used the following example to validate the behavior ```python from langchain_core.prompts import ChatPromptTemplate from langchain_core.runnables import ConfigurableField from langchain_anthropic import ChatAnthropic from langchain_community.chat_models import ChatLiteLLM from langchain_core.tools import tool from langchain.agents import create_tool_calling_agent, AgentExecutor @tool def multiply(x: float, y: float) -> float: """Multiply 'x' times 'y'.""" return x * y @tool def exponentiate(x: float, y: float) -> float: """Raise 'x' to the 'y'.""" return x**y @tool def add(x: float, y: float) -> float: """Add 'x' and 'y'.""" return x + y prompt = ChatPromptTemplate.from_messages([ ("system", "you're a helpful assistant"), ("human", "{input}"), ("placeholder", "{agent_scratchpad}"), ]) tools = [multiply, exponentiate, add] llm = ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0) # llm = ChatLiteLLM(model="claude-3-sonnet-20240229", temperature=0) agent = create_tool_calling_agent(llm, tools, prompt) agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True) agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", }) ``` `ChatAnthropic` version works: ``` > Entering new AgentExecutor chain... Invoking: `exponentiate` with `{'x': 5, 'y': 2.743}` responded: [{'text': 'To calculate 3 + 5^2.743, we can use the "exponentiate" and "add" tools:', 'type': 'text', 'index': 0}, {'id': 'toolu_01Gf54DFTkfLMJQX3TXffmxe', 'input': {}, 'name': 'exponentiate', 'type': 'tool_use', 'index': 1, 'partial_json': '{"x": 5, "y": 2.743}'}] 82.65606421491815 Invoking: `add` with `{'x': 3, 'y': 82.65606421491815}` responded: [{'id': 'toolu_01XUq9S56GT3Yv2N1KmNmmWp', 'input': {}, 'name': 'add', 'type': 'tool_use', 'index': 0, 'partial_json': '{"x": 3, "y": 82.65606421491815}'}] 85.65606421491815 Invoking: `add` with `{'x': 17.24, 'y': -918.1241}` responded: [{'text': '\n\nSo 3 + 5^2.743 = 85.66\n\nTo calculate 17.24 - 918.1241, we can use:', 'type': 'text', 'index': 0}, {'id': 'toolu_01BkXTwP7ec9JKYtZPy5JKjm', 'input': {}, 'name': 'add', 'type': 'tool_use', 'index': 1, 'partial_json': '{"x": 17.24, "y": -918.1241}'}] -900.8841[{'text': '\n\nTherefore, 17.24 - 918.1241 = -900.88', 'type': 'text', 'index': 0}] > Finished chain. ``` While `ChatLiteLLM` version doesn't. But with the changes in this PR, along with: - https://github.com/langchain-ai/langchain/pull/23823 - https://github.com/BerriAI/litellm/pull/4554 The result is _almost_ the same: ``` > Entering new AgentExecutor chain... Invoking: `exponentiate` with `{'x': 5, 'y': 2.743}` responded: To calculate 3 + 5^2.743, we can use the "exponentiate" and "add" tools: 82.65606421491815 Invoking: `add` with `{'x': 3, 'y': 82.65606421491815}` 85.65606421491815 Invoking: `add` with `{'x': 17.24, 'y': -918.1241}` responded: So 3 + 5^2.743 = 85.66 To calculate 17.24 - 918.1241, we can use: -900.8841 Therefore, 17.24 - 918.1241 = -900.88 > Finished chain. ``` If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-30 15:39:34 +00:00
David Robertson	bfb7f8d40a	Brave Search: Enhance search result details with extra snippets (#19209 ) Description: This update significantly improves the Brave Search Tool's utility within the LangChain library by enriching the search results it returns. The tool previously returned title, link, and snippet, with the snippet being a truncated 140-character description from the search engine. To make the search results more informative, this update enables extra_snippets by default and introduces additional result fields: title, link, description (enhancing and renaming the former snippet field), age, and snippets. The snippets field provides a list of strings summarizing the webpage, utilizing Brave's capability for more detailed search insights. This enhancement aims to make the search tool far more informative and beneficial for users. Issue: N/A Dependencies: No additional dependencies introduced. Twitter handle: @davidalexr987 Code Changes Summary: - Changed the default setting to include extra_snippets in search results. - Renamed the snippet field to description to accurately reflect its content and included an age field for search results. - Introduced a snippets field that lists webpage summaries, providing users with comprehensive search result insights. Backward Compatibility Note: The renaming of snippet to description improves the accuracy of the returned data field but may impact existing users who have developed integration's or analyses based on the snippet field. I believe this change is essential for clarity and utility, and it aligns better with the data provided by Brave Search. Additional Notes: This proposal focuses exclusively on the Brave Search package, without affecting other LangChain packages or introducing new dependencies.	2024-07-30 15:29:38 +00:00
Ben Chambers	435771fe74	[community]: Fix package name mismatch (#24824 ) - Description: fix a mismatch in pypi package names	2024-07-30 11:21:39 -04:00
maang-h	4bb1a11e02	community: Add MiniMaxChat bind_tools and structured output (#24310 ) - Description: - Add `bind_tools` method to support tool calling - Add `with_structured_output` method to support structured output	2024-07-29 15:51:52 -04:00
maang-h	bf685c242f	docs: Standardize QianfanEmbeddingsEndpoint (#24786 ) - Description: Standardize QianfanEmbeddingsEndpoint, include: - docstrings, the issue #21983 - model init arg names, the issue #20085	2024-07-29 13:19:24 -04:00
M. Ali	c086410677	fix docs typos (#23668 ) Thank you for contributing to LangChain! - [x] PR title: "docs: fix multiple typos" Co-authored-by: mohblnk <mohamed.ali@blnk.ai> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-29 16:10:55 +00:00
Pere Pasamonte	98175860ad	community: Fix AWS DocumentDB similarity_search when filter is None (#24777 ) Description Fixes DocumentDBVectorSearch similarity_search when no filter is used; it defaults to None but $match does not accept None, so changed default to empty {} before pipeline is created. Issue AWS DocumentDB similarity search does not work when no filter is used. Error msg: "the match filter must be an expression in an object" #24775 Dependencies No dependencies Twitter handle https://x.com/perepasamonte --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-29 15:32:05 +00:00
AmosDinh	c113682328	community:Add support for specifying document_loaders.firecrawl api url. (#24747 ) community:Add support for specifying document_loaders.firecrawl api url. Add support for specifying document_loaders.firecrawl api url. This is mainly to support the [self-hosting](https://github.com/mendableai/firecrawl/blob/main/SELF_HOST.md) option firecrawl provides. Eg. now I can specify localhost:.... The corresponding firecrawl class already provides functionality to pass the argument. See here: `4c9d62f6d3/apps/python-sdk/firecrawl/firecrawl.py (L29)` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-28 14:30:36 -04:00
Haijian Wang	cda3025ee1	Integrating the Yi family of models. (#24491 ) Thank you for contributing to LangChain! - [x] PR title: "community:add Yi LLM", "docs:add Yi Documentation" - [x] PR message: *Delete this entire checklist* and replace with - Description: This PR adds support for the Yi model to LangChain. - Dependencies: [langchain_core,requests,contextlib,typing,logging,json,langchain_community] - Twitter handle: 01.AI - [x] Add tests and docs: I've added the corresponding documentation to the relevant paths --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-07-26 10:57:33 -07:00
Marc Gibbons	cc451effd1	community[patch]: langchain_community.vectorstores.azuresearch Raise LangChainException instead of bare Exception (#23935 ) Raise `LangChainException` instead of `Exception`. This alleviates the need for library users to use bare try/except to handle exceptions raised by `AzureSearch`. Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-26 15:59:06 +00:00
Diverrez morgan	c4d2a53f18	community: creation score_threshold in flashrank_rerank.py (#24016 ) Description: add a optional score relevance threshold for select only coherent document, it's in complement of top_n Discussion: add relevance score threshold in flashrank_rerank document compressors #24013 Dependencies: no dependencies --------- Co-authored-by: Benjamin BERNARD <benjamin.bernard@openpathview.fr> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-26 13:34:39 +00:00
Cong Peng	190988d93e	community: Add parameter `allow_dangerous_requests` to `WebResearchRetriever.from_llm` construct (#24712 ) Description: To avoid ValueError when construct the retriever from method `from_llm()`.	2024-07-26 06:24:58 -07:00
monysun	5f593c172a	community: fix dashcope embeddings embed_query func post too much req to api (#24707 ) the fuc of embed_query of dashcope embeddings send a str param, and in the embed_with_retry func will send error content to api	2024-07-26 12:44:07 +00:00
yonarw	b65ac8d39c	community[minor]: Self query retriever for HANA Cloud Vector Engine (#24494 ) Description: - This PR adds a self query retriever implementation for SAP HANA Cloud Vector Engine. The retriever supports all operators except for contains. - Issue: N/A - Dependencies: no new dependencies added Add tests and docs: Added integration tests to: libs/community/tests/unit_tests/query_constructors/test_hanavector.py Documentation for self query retriever: /docs/integrations/retrievers/self_query/hanavector_self_query.ipynb --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-07-26 06:56:51 +00:00
nobbbbby	4f3b4fc7fe	community[patch]: Extend Baichuan model with tool support (#24529 ) Description: Expanded the chat model functionality to support tools in the 'baichuan.py' file. Updated module imports and added tool object handling in message conversions. Additional changes include the implementation of tool binding and related unit tests. The alterations offer enhanced model capabilities by enabling interaction with tool-like objects. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-25 23:20:44 -07:00
Rave Harpaz	ee399e3ec5	community[patch]: Add OCI Generative AI tool and structured output support (#24693 ) - [x] PR title: community: Add OCI Generative AI tool and structured output support - [x] PR message: - Description: adding tool calling and structured output support for chat models offered by OCI Generative AI services. This is an update to our last PR 22880 with changes in /langchain_community/chat_models/oci_generative_ai.py - Issue: NA - Dependencies: NA - Twitter handle: NA - [x] Add tests and docs: 1. we have updated our unit tests 2. we have updated our documentation under /docs/docs/integrations/chat/oci_generative_ai.ipynb - [x] Lint and test: `make format`, `make lint` and `make test` we run successfully --------- Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com> Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>	2024-07-25 23:19:00 -07:00
Yuki Watanabe	2b6a262f84	community[patch]: Replace `filters` argument to `filter` in DatabricksVectorSearch (#24530 ) The [DatabricksVectorSearch](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/databricks_vector_search.py#L21) class exposes similarity search APIs with argument `filters`, which is inconsistent with other VS classes who uses `filter` (singular). This PR updates the argument and add alias for backward compatibility. --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>	2024-07-25 21:20:18 -07:00
Sunish Sheth	59880a9147	community[patch]: mlflow handle empty chunk(#24689 )	2024-07-25 20:36:29 -07:00
Chaunte W. Lacewell	69eacaa887	Community[minor]: Update VDMS vectorstore (#23729 ) Description: - This PR exposes some functions in VDMS vectorstore, updates VDMS related notebooks, updates tests, and upgrade version of VDMS (>=0.0.20) Issue: N/A Dependencies: - Update vdms>=0.0.20	2024-07-25 22:13:04 -04:00
KyrianC	0fdbaf4a8d	community: fix ChatEdenAI + EdenAI Tools (#23715 ) Fixes for Eden AI Custom tools and ChatEdenAI: - add missing import in __init__ of chat_models - add `args_schema` to custom tools. otherwise '__arg1' would sometimes be passed to the `run` method - fix IndexError when no human msg is added in ChatEdenAI	2024-07-25 15:19:14 -04:00
maang-h	38d30e285a	docs: Standardize BaichuanTextEmbeddings docstrings (#24674 ) - Description: Standardize BaichuanTextEmbeddings docstrings. - Issue: the issue #21983	2024-07-25 12:12:00 -04:00
rick-SOPTIM	cd563fb628	community[minor]: passthrough auth parameter on requests to Ollama-LLMs (#24068 ) Thank you for contributing to LangChain! Description: This PR allows users of `langchain_community.llms.ollama.Ollama` to specify the `auth` parameter, which is then forwarded to all internal calls of `requests.request`. This works in the same way as the existing `headers` parameters. The auth parameter enables the usage of the given class with Ollama instances, which are secured by more complex authentication mechanisms, that do not only rely on static headers. An example are AWS API Gateways secured by the IAM authorizer, which expects signatures dynamically calculated on the specific HTTP request. Issue: Integrating a remote LLM running through Ollama using `langchain_community.llms.ollama.Ollama` only allows setting static HTTP headers with the parameter `headers`. This does not work, if the given instance of Ollama is secured with an authentication mechanism that makes use of dynamically created HTTP headers which for example may depend on the content of a given request. Dependencies: None Twitter handle: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-25 15:48:35 +00:00
Luca Dorigo	5fdbdd6bec	community[patch]: Fix invalid iohttp verify parameter (#24655 ) Should fix https://github.com/langchain-ai/langchain/issues/24654	2024-07-25 11:09:21 -04:00
Oleg Kulyk	4b1b7959a2	community[minor]: Add ScrapingAnt Loader Community Integration (#24514 ) Added [ScrapingAnt](https://scrapingant.com/) Web Loader integration. ScrapingAnt is a web scraping API that allows extracting web page data into accessible and well-formatted markdown. Description: Added ScrapingAnt web loader for retrieving web page data as markdown Dependencies: scrapingant-client Twitter: @WeRunTheWorld3 --------- Co-authored-by: Oleg Kulyk <oleg@scrapingant.com>	2024-07-24 21:11:43 -04:00
John	d59c656ea5	unstructured, community, initialize langchain-unstructured package (#22779 ) #### Update (2): A single `UnstructuredLoader` is added to handle both local and api partitioning. This loader also handles single or multiple documents. #### Changes in `community`: Changes here do not affect users. In the initial process of using the SDK for the API Loaders, the Loaders in community were refactored. Other changes include: The `UnstructuredBaseLoader` has a new check to see if both `mode="paged"` and `chunking_strategy="by_page"`. It also now has `Element.element_id` added to the `Document.metadata`. `UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. As such, now both directly inherit from `UnstructuredBaseLoader` and initialize their `file_path`/`file` attributes respectively and implement their own `_post_process_elements` methods. -------- #### Update: New SDK Loaders in a [partner package](https://python.langchain.com/v0.1/docs/contributing/integrations/#partner-package-in-langchain-repo) are introduced to prevent breaking changes for users (see discussion below). ##### TODO: - [x] Test docstring examples -------- - Description: UnstructuredAPIFileIOLoader and UnstructuredAPIFileLoader calls to the unstructured api are now made using the unstructured-client sdk. - New Dependencies: unstructured-client - [x] Add tests and docs: If you're adding a new integration, please include - [x] a test for the integration, preferably unit tests that do not rely on network access, - [x] update the description in `docs/docs/integrations/providers/unstructured.mdx` - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. TODO: - [x] Update https://python.langchain.com/v0.1/docs/integrations/document_loaders/unstructured_file/#unstructured-api - `langchain/docs/docs/integrations/document_loaders/unstructured_file.ipynb` - The description here needs to indicate that users should install `unstructured-client` instead of `unstructured`. Read over closely to look for any other changes that need to be made. - [x] Update the `lazy_load` method in `UnstructuredBaseLoader` to handle json responses from the API instead of just lists of elements. - This method may need to be overwritten by the API loaders instead of changing it in the `UnstructuredBaseLoader`. - [x] Update the documentation links in the class docstrings (the Unstructured documents have moved) - [x] Update Document.metadata to include `element_id` (see thread [here](https://unstructuredw-kbe4326.slack.com/archives/C044N0YV08G/p1718187499818419)) --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com>	2024-07-24 23:21:20 +00:00
Anindyadeep	12c3454fd9	[Community] PremAI Tool Calling Functionality (#23931 ) This PR is under WIP and adds the following functionalities: - [X] Supports tool calling across the langchain ecosystem. (However streaming is not supported) - [X] Update documentation	2024-07-24 09:53:58 -04:00
Vishnu Nandakumar	e271965d1e	community: retrievers: added capability for using Product Quantization as one of the retriever. (#22424 ) - [ ] Community: "Retrievers: Product Quantization" - [X] This PR adds Product Quantization feature to the retrievers to the Langchain Community. PQ is one of the fastest retrieval methods if the embeddings are rich enough in context due to the concepts of quantization and representation through centroids - Description: Adding PQ as one of the retrievers - Dependencies: using the package nanopq for this PR - Twitter handle: vishnunkumar_ - [X] Add tests and docs: If you're adding a new integration, please include - [X] Added unit tests for the same in the retrievers. - [] Will add an example notebook subsequently - [X] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ - done the same --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-24 13:52:15 +00:00
stydxm	b9bea36dd4	community: fix typo in warning message (#24597 ) - Description: This PR fixes a small typo in a warning message - Issue: ![](https://github.com/user-attachments/assets/5aa57724-26c5-49f6-8bc1-5a54bb67ed49) There were double `Use` and double `instead`	2024-07-24 13:19:07 +00:00
cüre	da06d4d7af	community: update finetuned model cost for 4o-mini (#24605 ) - Description: adds model price for. reference: https://openai.com/api/pricing/ - Issue: - - Dependencies: - - Twitter handle: cureef	2024-07-24 13:17:26 +00:00
ZhangShenao	ad18afc3ec	community[patch]: Fix param spelling error in `ElasticsearchChatMessageHistory` (#24589 ) Fix param spelling error in `ElasticsearchChatMessageHistory`	2024-07-23 19:29:42 -07:00
Aayush Kataria	0f45ac4088	LangChain Community: VectorStores: Azure Cosmos DB Filtered Vector Search (#24087 ) Thank you for contributing to LangChain! - This PR adds vector search filtering for Azure Cosmos DB Mongo vCore and NoSQL. - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-07-23 16:59:23 -07:00
Carlos André Antunes	325068bb53	community: Fix azure_openai.py (#24572 ) In some lines its trying to read a key that do not exists yet. In this cases I changed the direct access to dict.get() method - [ x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-07-23 16:22:21 -04:00
Ben Chambers	e80b0932ee	community[patch]: small fixes to link extractors (#24528 ) - Description: small fixes to imports / types in the link extraction work --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-23 14:28:06 +00:00
Morteza Hosseini	9e06991aae	community[patch]: Update URL to the 2markdown API (#24546 ) Update the URL to Markdown endpoint. API information is available here: https://2markdown.com/docs#url2md	2024-07-23 14:27:55 +00:00
maang-h	378db2e1a5	docs: Add RedisChatMessageHistory docstrings (#24548 ) - Description: Add `RedisChatMessageHistory ` rich docstrings. - Issue: the issue #21983 Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-23 14:23:46 +00:00
Ben Chambers	a5a3d28776	community[patch]: Remove targets_table from C* GraphVectorStore (#24502 ) - Description: Remove the unnecessary `targets_table` parameter	2024-07-22 22:09:36 -04:00
Alexander Golodkov	2a70a07aad	community[minor]: added new document loaders based on dedoc library (#24303 ) ### Description This pull request added new document loaders to load documents of various formats using [Dedoc](https://github.com/ispras/dedoc): - `DedocFileLoader` (determine file types automatically and parse) - `DedocPDFLoader` (for `PDF` and images parsing) - `DedocAPIFileLoader` (determine file types automatically and parse using Dedoc API without library installation) [Dedoc](https://dedoc.readthedocs.io) is an open-source library/service that extracts texts, tables, attached files and document structure (e.g., titles, list items, etc.) from files of various formats. The library is actively developed and maintained by a group of developers. `Dedoc` supports `DOCX`, `XLSX`, `PPTX`, `EML`, `HTML`, `PDF`, images and more. Full list of supported formats can be found [here](https://dedoc.readthedocs.io/en/latest/#id1). For `PDF` documents, `Dedoc` allows to determine textual layer correctness and split the document into paragraphs. ### Issue This pull request extends variety of document loaders supported by `langchain_community` allowing users to choose the most suitable option for raw documents parsing. ### Dependencies The PR added a new (optional) dependency `dedoc>=2.2.5` ([library documentation](https://dedoc.readthedocs.io)) to the `extended_testing_deps.txt` ### Twitter handle None ### Add tests and docs 1. Test for the integration: `libs/community/tests/integration_tests/document_loaders/test_dedoc.py` 2. Example notebook: `docs/docs/integrations/document_loaders/dedoc.ipynb` 3. Information about the library: `docs/docs/integrations/providers/dedoc.mdx` ### Lint and test Done locally: - `make format` - `make lint` - `make integration_tests` - `make docs_build` (from the project root) --------- Co-authored-by: Nasty <bogatenkova.anastasiya@mail.ru>	2024-07-23 02:04:53 +00:00
Ben Chambers	5ac936a284	community[minor]: add document transformer for extracting links (#24186 ) - Description: Add a DocumentTransformer for executing one or more `LinkExtractor`s and adding the extracted links to each document. - Issue: n/a - Depedencies: none --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-22 22:01:21 -04:00
Bagatur	236e957abb	core,groq,openai,mistralai,robocorp,fireworks,anthropic[patch]: Update BaseModel subclass and instance checks to handle both v1 and proper namespaces (#24417 ) After this PR chat models will correctly handle pydantic 2 with bind_tools and with_structured_output. ```python import pydantic print(pydantic.__version__) ``` 2.8.2 ```python from langchain_openai import ChatOpenAI from pydantic import BaseModel, Field class Add(BaseModel): x: int y: int model = ChatOpenAI().bind_tools([Add]) print(model.invoke('2 + 5').tool_calls) model = ChatOpenAI().with_structured_output(Add) print(type(model.invoke('2 + 5'))) ``` ``` [{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_PNUFa4pdfNOYXxIMHc6ps2Do', 'type': 'tool_call'}] <class '__main__.Add'> ``` ```python from langchain_openai import ChatOpenAI from pydantic.v1 import BaseModel, Field class Add(BaseModel): x: int y: int model = ChatOpenAI().bind_tools([Add]) print(model.invoke('2 + 5').tool_calls) model = ChatOpenAI().with_structured_output(Add) print(type(model.invoke('2 + 5'))) ``` ```python [{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_hhiHYP441cp14TtrHKx3Upg0', 'type': 'tool_call'}] <class '__main__.Add'> ``` Addresses issues: https://github.com/langchain-ai/langchain/issues/22782 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-22 20:07:39 +00:00
Naka Masato	884f76e05a	fix: load google credentials properly in GoogleDriveLoader (#12871 ) - Description: - Fix #12870: set scope in `default` func (ref: https://google-auth.readthedocs.io/en/master/reference/google.auth.html) - Moved the code to load default credentials to the bottom for clarity of the logic - Add docstring and comment for each credential loading logic - Issue: https://github.com/langchain-ai/langchain/issues/12870 - Dependencies: no dependencies change - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: @gymnstcs <!-- If no one reviews your PR within a few days, please @-mention one of @baskaryan, @eyurtsev, @hwchase17. --> --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-22 17:43:33 +00:00
Jorge Piedrahita Ortiz	10e3982b59	community: sambanova integration minor changes (#24503 ) - Minor changes in samabanova llm integration - default api - docstrings - minor changes in docs	2024-07-22 17:06:35 +00:00
maang-h	721f709dec	community: Improve QianfanChatEndpoint tool result to model (#24466 ) - Description: `QianfanChatEndpoint` When using tool result to answer questions, the content of the tool is required to be in Dict format. Of course, this can require users to return Dict format when calling the tool, but in order to be consistent with other Chat Models, I think such modifications are necessary.	2024-07-22 11:29:00 -04:00
ccurme	dcba7df2fe	community[patch]: deprecate langchain_community Chroma in favor of langchain_chroma (#24474 )	2024-07-22 11:00:13 -04:00
Mohammad Mohtashim	5ade0187d0	[Commutiy]: Prompts Fixed for ZERO_SHOT_REACT React Agent Type in `create_sql_agent` function (#23693 ) - Description: The correct Prompts for ZERO_SHOT_REACT were not being used in the `create_sql_agent` function. They were not using the specific `SQL_PREFIX` and `SQL_SUFFIX` prompts if client does not provide any prompts. This is fixed. - Issue: #23585 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-22 14:04:20 +00:00
ZhangShenao	0f6737cbfe	[Vector Store] Fix function `add_texts` in `TencentVectorDB` (#24469 ) Regardless of whether `embedding_func` is set or not, the 'text' attribute of document should be assigned, otherwise the `page_content` in the document of the final search result will be lost	2024-07-22 09:50:22 -04:00
clement.l	d98b830e4b	community: add flag to toggle progress bar (#24463 ) - Description: Add a flag to determine whether to show progress bar - Issue: n/a - Dependencies: n/a - Twitter handle: n/a --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-20 13:18:02 +00:00
chuanbei888	6b08a33fa4	community: fix QianfanChatEndpoint default model (#24464 ) the baidu_qianfan_endpoint has been changed from ERNIE-Bot-turbo to ERNIE-Lite-8K	2024-07-20 13:00:29 +00:00
maang-h	7b28359719	docs: Add ChatSparkLLM docstrings (#24449 ) - Description: - Add `ChatSparkLLM` docstrings, the issue #22296 - To support `stream` method	2024-07-19 20:19:14 -07:00
Erick Friis	f4ee3c8a22	infra: add min version testing to pr test flow (#24358 ) xfailing some sql tests that do not currently work on sqlalchemy v1 #22207 was very much not sqlalchemy v1 compatible. Moving forward, implementations should be compatible with both to pass CI	2024-07-19 22:03:19 +00:00
Bagatur	dda9438e87	community[patch]: gpt-4o-mini costs (#24421 )	2024-07-19 19:02:44 +00:00
Eugene Yurtsev	604dfe2d99	community[patch]: Force opt-in for WebResearchRetriever (CVE-2024-3095) (#24451 ) This PR addresses the issue raised by (CVE-2024-3095) https://huntr.com/bounties/e62d4895-2901-405b-9559-38276b6a5273 Unfortunately, we didn't do a good job writing the initial report. It's pointing at both the wrong package and the wrong code. The affected code is the Web Retriever not the AsyncHTMLLoader, and the WebRetriever lives in langchain-community The vulnerable code lives here: `0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L233-L233)` This PR adds a forced opt-in for users to make sure they are aware of the risk and can mitigate by configuring a proxy: `0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L84-L84)`	2024-07-19 18:51:35 +00:00
Asi Greenholts	372c27f2e5	community[minor]: [GoogleApiYoutubeLoader] Replace API used in _get_document_for_channel from search to playlistItem (#24034 ) - Description: Search has a limit of 500 results, playlistItems doesn't. Added a class in except clause to catch another common error. - Issue: None - Dependencies: None - Twitter handle: @TupleType --------- Co-authored-by: asi-cider <88270351+asi-cider@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 14:04:34 -04:00
Rafael Pereira	6a45bf9554	community[minor]: GraphCypherQAChain to accept additional inputs as provided by the user for cypher generation (#24300 ) Description: This PR introduces a change to the `cypher_generation_chain` to dynamically concatenate inputs. This improvement aims to streamline the input handling process and make the method more flexible. The change involves updating the arguments dictionary with all elements from the `inputs` dictionary, ensuring that all necessary inputs are dynamically appended. This will ensure that any cypher generation template will not require a new `_call` method patch. Issue: This PR fixes issue #24260.	2024-07-19 14:03:14 -04:00
Philippe PRADOS	f5856680fe	community[minor]: add mongodb byte store (#23876 ) The `MongoDBStore` can manage only documents. It's not possible to use MongoDB for an `CacheBackedEmbeddings`. With this new implementation, it's possible to use: ```python CacheBackedEmbeddings.from_bytes_store( underlying_embeddings=embeddings, document_embedding_cache=MongoDBByteStore( connection_string=db_uri, db_name=db_name, collection_name=collection_name, ), ) ``` and use MongoDB to cache the embeddings !	2024-07-19 13:54:12 -04:00
yabooung	07715f815b	community[minor]: Add ability to specify file encoding and json encoding for FileChatMessageHistory (#24258 ) Description: Add UTF-8 encoding support Issue: Inability to properly handle characters from certain languages (e.g., Korean) Fix: Implement UTF-8 encoding in FileChatMessageHistory --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 13:53:21 -04:00
Dristy Srivastava	020cc1cf3e	Community[minor]: Added checksum in while send data to pebblo-cloud (#23968 ) - Description: - Updated checksum in doc metadata - Sending checksum and removing actual content, while sending data to `pebblo-cloud` if `classifier-location `is `pebblo-cloud` in `/loader/doc` API - Adding `pb_id` i.e. pebblo id to doc metadata - Refactoring as needed. - Sending `content-checksum` and removing actual content, while sending data to `pebblo-cloud` if `classifier-location `is `pebblo-cloud` in `prmopt` API - Issue: NA - Dependencies: NA - Tests: Updated - Docs NA --------- Co-authored-by: dristy.cd <dristy@clouddefense.io>	2024-07-19 13:52:54 -04:00
keval dekivadiya	06f47678ae	community[minor]: Add TextEmbed Embedding Integration (#22946 ) Description: TextEmbed is a high-performance embedding inference server designed to provide a high-throughput, low-latency solution for serving embeddings. It supports various sentence-transformer models and includes the ability to deploy image and text embedding models. TextEmbed offers flexibility and scalability for diverse applications. - PyPI Package: [TextEmbed on PyPI](https://pypi.org/project/textembed/) - Docker Image: [TextEmbed on Docker Hub](https://hub.docker.com/r/kevaldekivadiya/textembed) - GitHub Repository: [TextEmbed on GitHub](https://github.com/kevaldekivadiya2415/textembed) PR Description This PR adds functionality for embedding documents and queries using the `TextEmbedEmbeddings` class. The implementation allows for both synchronous and asynchronous embedding requests to a TextEmbed API endpoint. The class handles batching and permuting of input texts to optimize the embedding process. Example Usage: ```python from langchain_community.embeddings import TextEmbedEmbeddings # Initialise the embeddings class embeddings = TextEmbedEmbeddings(model="your-model-id", api_key="your-api-key", api_url="your_api_url") # Define a list of documents documents = [ "Data science involves extracting insights from data.", "Artificial intelligence is transforming various industries.", "Cloud computing provides scalable computing resources over the internet.", "Big data analytics helps in understanding large datasets.", "India has a diverse cultural heritage." ] # Define a query query = "What is the cultural heritage of India?" # Embed all documents document_embeddings = embeddings.embed_documents(documents) # Embed the query query_embedding = embeddings.embed_query(query) # Print embeddings for each document for i, embedding in enumerate(document_embeddings): print(f"Document {i+1} Embedding:", embedding) # Print the query embedding print("Query Embedding:", query_embedding) --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-19 17:30:25 +00:00
Andrew Benton	f9d64d22e5	community[minor]: Add Riza Python/JS code execution tool (#23995 ) - Description: Add Riza Python/JS code execution tool - Issue: N/A - Dependencies: an optional dependency on the `rizaio` pypi package - Twitter handle: [@rizaio](https://x.com/rizaio) [Riza](https://riza.io) is a safe code execution environment for agent-generated Python and JavaScript that's easy to integrate into langchain apps. This PR adds two new tool classes to the community package.	2024-07-19 17:03:22 +00:00
Ben Chambers	3691701d58	community[minor]: Add keybert-based link extractor (#24311 ) - Description: Add a `KeybertLinkExtractor` for graph vectorstores. This allows extracting links from keywords in a Document and linking nodes that have common keywords. - Issue: None - Dependencies: None. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-19 12:25:07 -04:00
Ben Chambers	83f3d95ffa	community[minor]: GLiNER link extraction (#24314 ) - Description: This allows extracting links between documents with common named entities using [GLiNER](https://github.com/urchade/GLiNER). - Issue: None - Dependencies: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-19 15:34:54 +00:00
ccurme	f99369a54c	community[patch]: fix formatting (#24443 ) Somehow this got through CI: https://github.com/langchain-ai/langchain/pull/24363	2024-07-19 14:38:53 +00:00
Ben Chambers	242b085be7	Merge pull request #24315 * community: Add Hierarchy link extractor * add example * lint	2024-07-19 09:42:26 -04:00
Rhuan Barros	c3308f31bc	Merge pull request #24363 * important email fields	2024-07-19 09:41:20 -04:00
Han Sol Park	aade9bfde5	Mask API key for ChatOpenAI based chat_models (#14293 ) - Description: Mask API key for ChatOpenAi based chat_models (openai, azureopenai, anyscale, everlyai). Made changes to all chat_models that are based on ChatOpenAI since all of them assumes that openai_api_key is str rather than SecretStr. - Issue:: #12165 - Dependencies: N/A - Tag maintainer: @eyurtsev - Twitter handle: N/A --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-19 02:25:38 +00:00
Eun Hye Kim	07c5c60f63	community: fix tool appending logic and update planner prompt in OpenAPI agent toolkit (#24384 ) Description: - Updated the format for the 'Action' section in the planner prompt to ensure it must be one of the tools without additional words. Adjusted the phrasing from "should be" to "must be" for clarity and enforceability. - Corrected the tool appending logic in the `_create_api_controller_agent` function to ensure that `RequestsDeleteToolWithParsing` and `RequestsPatchToolWithParsing` are properly added to the tools list for "DELETE" and "PATCH" operations. Issue: #24382 Dependencies: None Twitter handle: @lunara_x --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-18 13:37:46 +00:00
Chen Xiabin	63c60a31f0	[fix] baidu qianfan AiMessage with usage_metadata (#24389 ) make AIMessage usage_metadata has error	2024-07-18 09:28:16 -04:00
Rajendra Kadam	1c65529fd7	community[minor]: [PebbloSafeLoader] Rename loader type and add SharePointLoader to supported loaders (#24393 ) Thank you for contributing to LangChain! - [x] PR title: [PebbloSafeLoader] Rename loader type and add SharePointLoader to supported loaders - Description: Minor fixes in the PebbloSafeLoader: - Renamed the loader type from `remote_db` to `cloud_folder`. - Added `SharePointLoader` to the list of loaders supported by PebbloSafeLoader. - Issue: NA - Dependencies: NA - [x] Add tests and docs: NA	2024-07-18 08:23:12 -04:00
Paolo Ráez	0dec72cab0	Community[patch]: Missing "stream" parameter in cloudflare_workersai (#23987 ) ### Description Missing "stream" parameter. Without it, you'd never receive a stream of tokens when using stream() or astream() ### Issue No existing issue available	2024-07-18 02:09:39 +00:00
Brice Fotzo	034a8c7c1b	community: support advanced text extraction options for pdf documents (#20265 ) Description: - Updated constructors in PyPDFParser and PyPDFLoader to handle `extraction_mode` and additional kwargs, aligning with the capabilities of `PageObject.extract_text()` from pypdf. - Added `test_pypdf_loader_with_layout` along with a corresponding example text file to validate layout extraction from PDFs. Issue: fixes #19735 Dependencies: This change requires updating the pypdf dependency from version 3.4.0 to at least 4.0.0. Additional changes include the addition of a new test test_pypdf_loader_with_layout and an example text file to ensure the functionality of layout extraction from PDFs aligns with the new capabilities. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-17 20:47:09 +00:00
Luis Moros	bcb5f354ad	community: Fix SQLDatabse.from_databricks issue when ran from Job (#24346 ) - Description: When SQLDatabase.from_databricks is ran from a Databricks Workflow job, line 205 (default_host = context.browserHostName) throws an ``AttributeError`` as the ``context`` object has no ``browserHostName`` attribute. The fix handles the exception and sets the ``default_host`` variable to null --------- Co-authored-by: lmorosdb <lmorosdb> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-17 12:40:12 -04:00
Rafael Pereira	cf28708e7b	Neo4j: Update with non-deprecated cypher methods, and new method to associate relationship embeddings (#23725 ) Description: At the moment neo4j wrapper is using setVectorProperty, which is deprecated ([link](https://neo4j.com/docs/operations-manual/5/reference/procedures/#procedure_db_create_setVectorProperty)). I replaced with the non-deprecated version. Neo4j recently introduced a new cypher method to associate embeddings into relations using "setRelationshipVectorProperty" method. In this PR I also implemented a new method to perform this association maintaining the same format used in the "add_embeddings" method which is used to associate embeddings into Nodes. I also included a test case for this new method.	2024-07-17 12:37:47 -04:00
maang-h	2a3288b15d	docs: Add ChatBaichuan docstrings (#24348 ) - Description: Add ChatBaichuan rich docstrings. - Issue: the issue #22296	2024-07-17 12:00:16 -04:00
bovlb	5caa381177	community[minor]: Add ApertureDB as a vectorstore (#24088 ) Thank you for contributing to LangChain! - [X] ApertureDB as vectorstore: "community: Add ApertureDB as a vectorestore" - Description:* this change provides a new community integration that uses ApertureData's ApertureDB as a vector store. - Issue: none - Dependencies: depends on ApertureDB Python SDK - Twitter handle: ApertureData - [X] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. Integration tests rely on a local run of a public docker image. Example notebook additionally relies on a local Ollama server. - [X] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ All lint tests pass. Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Gautam <gautam@aperturedata.io>	2024-07-16 09:32:59 -07:00
frob	c59e663365	community[patch]: Fix docstring for ollama parameter "keep_alive" (#23973 ) Fix doc-string for ollama integration	2024-07-16 14:48:38 +00:00
Rahul Raghavendra Choudhury	f5a38772a8	community[patch]: Update TavilySearch to use TavilyClient instead of the deprecated Client (#24270 ) On using TavilySearchAPIRetriever with any conversation chain getting error : `TypeError: Client.__init__() got an unexpected keyword argument 'api_key'` It is because the retreiver class is using the depreciated `Client` class, `TavilyClient` need to be used instead. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-07-16 13:35:28 +00:00
Chen Xiabin	69b1603173	baidu qianfan AiMessage with usage_metadata (#24288 ) add usage_metadata to qianfan AIMessage. Thanks	2024-07-16 09:30:50 -04:00
Dobiichi-Origami	7aeaa1974d	community[patch]: change the class of `qianfan_ak` and `qianfan_sk` parameters (#24293 ) - Description: we changed the class of two parameters to fix a bug, which causes validation failure when using QianfanEmbeddingEndpoint	2024-07-16 09:17:48 -04:00
Lage Ragnarsson	a3c10fc6ce	community: Add support for specifying hybrid search for Databricks vector search (#23528 ) Description: Databricks Vector Search recently added support for hybrid keyword-similarity search. See [usage examples](https://docs.databricks.com/en/generative-ai/create-query-vector-search.html#query-a-vector-search-endpoint) from their documentation. This PR updates the Langchain vectorstore interface for Databricks to enable the user to pass the query_type parameter to similarity_search to make use of this functionality. By default, there will not be any changes for existing users of this interface. To use the new hybrid search feature, it is now possible to do ```python # ... dvs = DatabricksVectorSearch(index) dvs.similarity_search("my search query", query_type="HYBRID") ``` Or using the retriever: ```python retriever = dvs.as_retriever( search_kwargs={ "query_type": "HYBRID", } ) retriever.invoke("my search query") ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-15 22:14:08 +00:00
Christopher Tee	5171ffc026	community(you): Integrate You.com conversational APIs (#23046 ) You.com is releasing two new conversational APIs — Smart and Research. This PR: - integrates those APIs with Langchain, as an LLM - streaming is supported If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-07-15 17:46:58 -04:00
maang-h	6c7d9f93b9	feat: Add ChatTongyi structured output (#24187 ) - Description: Add `with_structured_output` method to ChatTongyi to support structured output.	2024-07-15 15:57:21 -04:00
Chen Xiabin	8f4620f4b8	baidu qianfan streaming token_usage (#24117 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-15 19:52:31 +00:00
maang-h	9d97de34ae	community[patch]: Improve ChatBaichuan init args and role (#23878 ) - Description: Improve ChatBaichuan init args and role - ChatBaichuan adds `system` role - alias: `baichuan_api_base` -> `base_url` - `with_search_enhance` is deprecated - Add `max_tokens` argument	2024-07-15 15:17:00 -04:00
mrugank-wadekar	66bebeb76a	partners: add similarity search by image functionality to langchain_chroma partner package (#22982 ) - Description: This pull request introduces two new methods to the Langchain Chroma partner package that enable similarity search based on image embeddings. These methods enhance the package's functionality by allowing users to search for images similar to a given image URI. Also introduces a notebook to demonstrate it's use. - Issue: N/A - Dependencies: None - Twitter handle: @mrugank9009 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-15 18:48:22 +00:00
Carlos André Antunes	20151384d7	fix azure_openai.py: some keys do not exists (#24158 ) In some lines its trying to read a key that do not exists yet. In this cases I changed the direct access to dict.get() method Thank you for contributing to LangChain! - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-07-15 17:17:05 +00:00
Harold Martin	ccdaf14eff	docs: Spell check fixes (#24217 ) Description: Spell check fixes for docs, comments, and a couple of strings. No code change e.g. variable names. Issue: none Dependencies: none Twitter handle: hmartin	2024-07-15 15:51:43 +00:00
thehunmonkgroup	e8a21146d3	community[patch]: upgrade default model for ChatAnyscale (#24232 ) Old default `meta-llama/Llama-2-7b-chat-hf` no longer supported.	2024-07-15 11:34:59 -04:00
Bagatur	65321bf975	core[patch]: fix ToolCall "type" when streaming (#24218 )	2024-07-13 08:59:03 -07:00
Miroslav	aee55eda39	community: Skip Login to HuggubgFaceHub when token is not set (#21561 ) Thank you for contributing to LangChain! - [ ] HuggingFaceEndpoint: "Skip Login to HuggingFaceHub" - Where: langchain, community, llm, huggingface_endpoint - [ ] PR message: *Delete this entire checklist* and replace with - Description: Skip login to huggingface hub when when `huggingfacehub_api_token` is not set. This is needed when using custom `endpoint_url` outside of HuggingFaceHub. - Issue: the issue # it fixes https://github.com/langchain-ai/langchain/issues/20342 and https://github.com/langchain-ai/langchain/issues/19685 - Dependencies: None - [ ] Add tests and docs: 1. Tested with locally available TGI endpoint 2. Example Usage ```python from langchain_community.llms import HuggingFaceEndpoint llm = HuggingFaceEndpoint( endpoint_url='http://localhost:8080', server_kwargs={ "headers": {"Content-Type": "application/json"} } ) resp = llm.invoke("Tell me a joke") print(resp) ``` Also tested against HF Endpoints ```python from langchain_community.llms import HuggingFaceEndpoint huggingfacehub_api_token = "hf_xyz" repo_id = "mistralai/Mistral-7B-Instruct-v0.2" llm = HuggingFaceEndpoint( huggingfacehub_api_token=huggingfacehub_api_token, repo_id=repo_id, ) resp = llm.invoke("Tell me a joke") print(resp) ``` Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-12 22:10:32 +00:00
Jean Nshuti	d77d9bfc00	community[patch]: update typo document content returned from semanticscholar (#24175 ) Update "astract" -> abstract	2024-07-12 15:40:47 +00:00
Tomaz Bratanic	d3a2b9fae0	Fix neo4j type error on missing constraint information (#24177 ) If you use `refresh_schema=False`, then the metadata constraint doesn't exist. ATM, we used default `None` in the constraint check, but then `any` fails because it can't iterate over None value	2024-07-12 06:39:29 -04:00
thedavgar	ffe6ca986e	community: Fix Bug in Azure Search Vectorstore search asyncronously (#24081 ) Thank you for contributing to LangChain! Description: This PR fixes a bug described in the issue in #24064, when using the AzureSearch Vectorstore with the asyncronous methods to do search which is also the method used for the retriever. The proposed change includes just change the access of the embedding as optional because is it not used anywhere to retrieve documents. Actually, the syncronous methods of retrieval do not use the embedding neither. With this PR the code given by the user in the issue works. ```python vectorstore = AzureSearch( azure_search_endpoint=os.getenv("AI_SEARCH_ENDPOINT_SECRET"), azure_search_key=os.getenv("AI_SEARCH_API_KEY"), index_name=os.getenv("AI_SEARCH_INDEX_NAME_SECRET"), fields=fields, embedding_function=encoder, ) retriever = vectorstore.as_retriever(search_type="hybrid", k=2) await vectorstore.avector_search("what is the capital of France") await retriever.ainvoke("what is the capital of France") ``` Issue: The Azure Search Vectorstore is not working when searching for documents with asyncronous methods, as described in issue #24064 Dependencies: There are no extra dependencies required for this change. --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-07-11 18:32:19 -07:00
Jacob Lee	f1f1f75782	community[patch]: Make AzureML endpoint return AI messages for type assistant (#24085 )	2024-07-11 21:45:30 +02:00
Atul R	457677c1b7	community: Fixes use of ImagePromptTemplate with Ollama (#24140 ) Description: ImagePromptTemplate for Multimodal llms like llava when using Ollama Twitter handle: https://x.com/a7ulr Details: When using llava models / any ollama multimodal llms and passing images in the prompt as urls, langchain breaks with this error. ```python image_url_components = image_url.split(",") ^^^^^^^^^^^^^^^^^^^^ AttributeError: 'dict' object has no attribute 'split' ``` From the looks of it, there was bug where the condition did check for a `url` field in the variable but missed to actually assign it. This PR fixes ImagePromptTemplate for Multimodal llms like llava when using Ollama specifically. @hwchase17	2024-07-11 11:31:48 -07:00
Matt	8327925ab7	community:support additional Azure Search Options (#24134 ) - Description: Support additional kwargs options for the Azure Search client (Described here https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/README.md#configurations) - Issue: N/A - Dependencies: No additional Dependencies ---------	2024-07-11 18:22:36 +00:00
Eugene Yurtsev	08638ccc88	community[patch]: QianfanLLMEndpoint fix type information for the keys (#24128 ) Fix for issue: https://github.com/langchain-ai/langchain/issues/24126	2024-07-11 16:24:26 +00:00
maang-h	16e178a8c2	docs: Add MiniMaxChat docstrings (#24026 ) - Description: Add MiniMaxChat rich docstrings. - Issue: the issue #22296	2024-07-11 10:55:02 -04:00
Christophe Bornet	5fc5ef2b52	community[minor]: Add graph store extractors (#24065 ) This adds an extractor interface and an implementation for HTML pages. Extractors are used to create GraphVectorStore Links on loaded content. Twitter handle: cbornet_	2024-07-11 10:35:31 -04:00
maang-h	9bcf8f867d	docs: Add SQLChatMessageHistory docstring (#23978 ) - Description: Add SQLChatMessageHistory docstring. - Issue: the issue #21983 Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-07-11 14:24:28 +00:00
Rafael Pereira	092e9ee0e6	community[minor]: Neo4j Fixed similarity docs (#23913 ) Description: There was missing some documentation regarding the `filter` and `params` attributes in similarity search methods. --------- Co-authored-by: rpereira <rafael.pereira@criticalsoftware.com>	2024-07-11 10:16:48 -04:00
ccurme	975b6129f6	core[patch]: support conversion of runnables to tools (#23992 ) Open to other thoughts on UX. string input: ```python as_tool = retriever.as_tool() as_tool.invoke("cat") # [Document(...), ...] ``` typed dict input: ```python class Args(TypedDict): key: int def f(x: Args) -> str: return str(x["key"] * 2) as_tool = RunnableLambda(f).as_tool( name="my tool", description="description", # name, description are inferred if not supplied ) as_tool.invoke({"key": 3}) # "6" ``` for untyped dict input, allow specification of parameters + types ```python def g(x: Dict[str, Any]) -> str: return str(x["key"] * 2) as_tool = RunnableLambda(g).as_tool(arg_types={"key": int}) result = as_tool.invoke({"key": 3}) # "6" ``` Passing the `arg_types` is slightly awkward but necessary to ensure tool calls populate parameters correctly: ```python from typing import Any, Dict from langchain_core.runnables import RunnableLambda from langchain_openai import ChatOpenAI def f(x: Dict[str, Any]) -> str: return str(x["key"] * 2) runnable = RunnableLambda(f) as_tool = runnable.as_tool(arg_types={"key": int}) llm = ChatOpenAI().bind_tools([as_tool]) result = llm.invoke("Use the tool on 3.") tool_call = result.tool_calls[0] args = tool_call["args"] assert args == {"key": 3} as_tool.run(args) ``` Contrived (?) example with langgraph agent as a tool: ```python from typing import List, Literal from typing_extensions import TypedDict from langchain_openai import ChatOpenAI from langgraph.prebuilt import create_react_agent llm = ChatOpenAI(temperature=0) def magic_function(input: int) -> int: """Applies a magic function to an input.""" return input + 2 agent_1 = create_react_agent(llm, [magic_function]) class Message(TypedDict): role: Literal["human"] content: str agent_tool = agent_1.as_tool( arg_types={"messages": List[Message]}, name="Jeeves", description="Ask Jeeves.", ) agent_2 = create_react_agent(llm, [agent_tool]) ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-10 19:29:59 -04:00
Eugene Yurtsev	c4e149d4f1	community[patch]: Add linter to catch @root_validator (#24070 ) - Add linter to prevent further usage of vanilla root validator - Udpate remaining root validators	2024-07-10 14:51:03 +00:00
ccurme	9c6efadec3	community[patch]: propagate cost information to OpenAI callback (#23996 ) This is enabled following https://github.com/langchain-ai/langchain/pull/22716.	2024-07-10 14:50:35 +00:00
Ethan Yang	13855ef0c3	[HuggingFace Pipeline] add streaming support (#23852 )	2024-07-09 17:02:00 -04:00
Eugene Yurtsev	f765e8fa9d	core[minor],community[patch],standard-tests[patch]: Move InMemoryImplementation to langchain-core (#23986 ) This PR moves the in memory implementation to langchain-core. * The implementation remains importable from langchain-community. * Supporting utilities are marked as private for now.	2024-07-08 14:11:51 -07:00
Eugene Yurtsev	2c180d645e	core[minor],community[minor]: Upgrade all @root_validator() to @pre_init (#23841 ) This PR introduces a @pre_init decorator that's a @root_validator(pre=True) but with all the defaults populated!	2024-07-08 16:09:29 -04:00
Rajendra Kadam	8b84457b17	community[minor]: Support PGVector in PebbloRetrievalQA (#23874 ) - Description: Support PGVector in PebbloRetrievalQA - Identity and Semantic Enforcement support for PGVector - Refactor Vectorstore validation and name check - Clear the overridden identity and semantic enforcement filters - Issue: NA - Dependencies: NA - Tests: NA(already added) - Docs: Updated - Twitter handle: [@Raj__725](https://twitter.com/Raj__725)	2024-07-05 16:02:25 -04:00
Rajendra Kadam	ee8aa54f53	community[patch]: Fix source path mismatch in PebbloSafeLoader (#23857 ) Description: Fix for source path mismatch in PebbloSafeLoader. The fix involves storing the full path in the doc metadata in VectorDB Issue: NA, caught in internal testing Dependencies: NA Add tests: Updated tests	2024-07-05 15:24:17 -04:00
Christophe Bornet	42d049f618	core[minor]: Add Graph Store component (#23092 ) This PR introduces a GraphStore component. GraphStore extends VectorStore with the concept of links between documents based on document metadata. This allows linking documents based on a variety of techniques, including common keywords, explicit links in the content, and other patterns. This works with existing Documents, so it’s easy to extend existing VectorStores to be used as GraphStores. The interface can be implemented for any Vector Store technology that supports metadata, not only graph DBs. When retrieving documents for a given query, the first level of search is done using classical similarity search. Next, links may be followed using various traversal strategies to get additional documents. This allows documents to be retrieved that aren’t directly similar to the query but contain relevant information. 2 retrieving methods are added to the VectorStore ones : * traversal_search which gets all linked documents up to a certain depth * mmr_traversal_search which selects linked documents using an MMR algorithm to have more diverse results. If a depth of retrieval of 0 is used, GraphStore is effectively a VectorStore. It enables an easy transition from a simple VectorStore to GraphStore by adding links between documents as a second step. An implementation for Apache Cassandra is also proposed. See https://github.com/datastax/ragstack-ai/blob/main/libs/knowledge-store/notebooks/astra_support.ipynb for a notebook explaining how to use GraphStore and that shows that it can answer correctly to questions that a simple VectorStore cannot. Twitter handle: _cbornet	2024-07-05 12:24:10 -04:00
Eugene Yurtsev	6f08e11d7c	core[minor]: add upsert, streaming_upsert, aupsert, astreaming_upsert methods to the VectorStore abstraction (#23774 ) This PR rolls out part of the new proposed interface for vectorstores (https://github.com/langchain-ai/langchain/pull/23544) to existing store implementations. The PR makes the following changes: 1. Adds standard upsert, streaming_upsert, aupsert, astreaming_upsert methods to the vectorstore. 2. Updates `add_texts` and `aadd_texts` to be non required with a default implementation that delegates to `upsert` and `aupsert` if those have been implemented. The original `add_texts` and `aadd_texts` methods are problematic as they spread object specific information across document and *kwargs. (e.g., ids are not a part of the document) 3. Adds a default implementation to `add_documents` and `aadd_documents` that delegates to `upsert` and `aupsert` respectively. 4. Adds standard unit tests to verify that a given vectorstore implements a correct read/write API. A downside of this implementation is that it creates `upsert` with a very similar signature to `add_documents`. The reason for introducing `upsert` is to: Remove any ambiguities about what information is allowed in `kwargs`. Specifically kwargs should only be used for information common to all indexed data. (e.g., indexing timeout). *Allow inheriting from an anticipated generalized interface for indexing that will allow indexing `BaseMedia` (i.e., allow making a vectorstore for images/audio etc.) `add_documents` can be deprecated in the future in favor of `upsert` to make sure that users have a single correct way of indexing content. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-05 12:21:40 -04:00
Philippe PRADOS	289960bc60	community[patch]: Redis.delete should be a regular method not a static method (#23873 ) The `langchain_common.vectostore.Redis.delete()` must not be a `@staticmethod`. With the current implementation, it's not possible to have multiple instances of Redis vectorstore because all versions must share the `REDIS_URL`. It's not conform with the base class.	2024-07-05 12:04:58 -04:00
Klaudia Lemiec	a2082bc1f8	docs: Arxiv docs update (#23871 ) - [X] PR title - [X] PR message: *Delete this entire checklist* and replace with - Description: Update of docstrings and docpages - Issue: [22866](https://github.com/langchain-ai/langchain/issues/22866) - [X] Add tests and docs - [X] Lint and test	2024-07-05 11:43:51 -04:00
André Quintino	99b1467b63	community: add support for 'cloud' parameter in JiraAPIWrapper (#23057 ) - Description: Enhance JiraAPIWrapper to accept the 'cloud' parameter through an environment variable. This update allows more flexibility in configuring the environment for the Jira API. - Twitter handle: Andre_Q_Pereira --------- Co-authored-by: André Quintino <andre.quintino@tui.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-05 15:11:10 +00:00
wenngong	b1e90b3075	community: add model_name param valid for GPT4AllEmbeddings (#23867 ) Description: add model_name param valid for GPT4AllEmbeddings Issue: #23863 #22819 --------- Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-07-05 10:46:34 -04:00
volodymyr-memsql	a4eb6d0fb1	community: add SingleStoreDB semantic cache (#23218 ) This PR adds a `SingleStoreDBSemanticCache` class that implements a cache based on SingleStoreDB vector store, integration tests, and a notebook example. Additionally, this PR contains minor changes to SingleStoreDB vector store: - change add texts/documents methods to return a list of inserted ids - implement delete(ids) method to delete documents by list of ids - added drop() method to drop a correspondent database table - updated integration tests to use and check functionality implemented above CC: @baskaryan, @hwchase17 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	2024-07-05 09:26:06 -04:00
Igor Drozdov	bb597b1286	feat(community): add bind_tools function for ChatLiteLLM (#23823 ) It's a follow-up to https://github.com/langchain-ai/langchain/pull/23765 Now the tools can be bound by calling `bind_tools` ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.utils.function_calling import convert_to_openai_tool from langchain_community.chat_models import ChatLiteLLM class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") prompt = "Which city is hotter today and which is bigger: LA or NY?" # tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)] tools = [GetWeather, GetPopulation] llm = ChatLiteLLM(model="claude-3-sonnet-20240229").bind_tools(tools) ai_msg = llm.invoke(prompt) print(ai_msg.tool_calls) ``` If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. Co-authored-by: Igor Drozdov <idrozdov@gitlab.com>	2024-07-05 09:19:41 -04:00
Jiejun Tan	2be66a38d8	huggingface: Fix huggingface tei support (#22653 ) Update former pull request: https://github.com/langchain-ai/langchain/pull/22595. Modified `libs/partners/huggingface/langchain_huggingface/embeddings/huggingface_endpoint.py`, where the API call function does not match current [Text Embeddings Inference API](https://huggingface.github.io/text-embeddings-inference/#/Text%20Embeddings%20Inference/embed). One example is: ```json { "inputs": "string", "normalize": true, "truncate": false } ``` Parameters in `_model_kwargs` are not passed properly in the latest version. By the way, the issue [why cause 413? #50](https://github.com/huggingface/text-embeddings-inference/issues/50) might be solved.	2024-07-03 13:30:29 -07:00
Bagatur	a0c2281540	infra: update mypy 1.10, ruff 0.5 (#23721 ) ```python """python scripts/update_mypy_ruff.py""" import glob import tomllib from pathlib import Path import toml import subprocess import re ROOT_DIR = Path(__file__).parents[1] def main(): for path in glob.glob(str(ROOT_DIR / "libs/*/pyproject.toml"), recursive=True): print(path) with open(path, "rb") as f: pyproject = tomllib.load(f) try: pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = ( "^1.10" ) pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = ( "^0.5" ) except KeyError: continue with open(path, "w") as f: toml.dump(pyproject, f) cwd = "/".join(path.split("/")[:-1]) completed = subprocess.run( "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color", cwd=cwd, shell=True, capture_output=True, text=True, ) logs = completed.stdout.split("\n") to_ignore = {} for l in logs: if re.match("^(.)\:(\d+)\: error:.\[(.)\]", l): path, line_no, error_type = re.match( "^(.)\:(\d+)\: error:.\[(.*)\]", l ).groups() if (path, line_no) in to_ignore: to_ignore[(path, line_no)].append(error_type) else: to_ignore[(path, line_no)] = [error_type] print(len(to_ignore)) for (error_path, line_no), error_types in to_ignore.items(): all_errors = ", ".join(error_types) full_path = f"{cwd}/{error_path}" try: with open(full_path, "r") as f: file_lines = f.readlines() except FileNotFoundError: continue file_lines[int(line_no) - 1] = ( file_lines[int(line_no) - 1][:-1] + f" # type: ignore[{all_errors}]\n" ) with open(full_path, "w") as f: f.write("".join(file_lines)) subprocess.run( "poetry run ruff format .; poetry run ruff --select I --fix .", cwd=cwd, shell=True, capture_output=True, text=True, ) if __name__ == "__main__": main() ```	2024-07-03 10:33:27 -07:00
Oguz Vuruskaner	2a2c0d1a94	community[deepinfra]: fix tool call parsing. (#23162 ) This PR includes fix for DeepInfra tool call parsing.	2024-07-03 12:11:37 -04:00
maang-h	525109e506	feat: Implement ChatBaichuan asynchronous interface (#23589 ) - Description: Add interface to `ChatBaichuan` to support asynchronous requests - `_agenerate` method - `_astream` method --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-03 12:10:04 -04:00
Qingchuan Hao	5cd4083457	community: make bing web search as the only option (#23523 ) This PR make bing web search as the option for BingSearchAPIWrapper to facilitate and simply the user interface on Langchain. This is a follow-up work of https://github.com/langchain-ai/langchain/pull/23306.	2024-07-02 17:13:54 -04:00
maang-h	e4e28a6ff5	community[patch]: Fix MiniMaxChat validate_environment error (#23770 ) - Description: Fix some issues in MiniMaxChat - Fix `minimax_api_host` not in `values` error - Remove `minimax_group_id` from reading environment variables, the `minimax_group_id` no longer use in MiniMaxChat - Invoke callback prior to yielding token, the issus #16913	2024-07-02 13:23:32 -04:00
Eugene Yurtsev	46ff0f7a3c	community[patch]: Update @root_validators to use explicit pre=True or pre=False (#23737 )	2024-07-02 10:47:21 -04:00
Igor Drozdov	b664dbcc36	feat(community): add support for tool_calls response (#23765 ) When `model_kwargs={"tools": tools}` are passed to `ChatLiteLLM`, they are executed, but the response is not recognized correctly Let's add `tool_calls` to the `additional_kwargs` Thank you for contributing to LangChain! ## ChatAnthropic I used the following example to verify the output of llm with tools: ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_anthropic import ChatAnthropic class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") llm = ChatAnthropic(model="claude-3-sonnet-20240229") llm_with_tools = llm.bind_tools([GetWeather, GetPopulation]) ai_msg = llm_with_tools.invoke("Which city is hotter today and which is bigger: LA or NY?") print(ai_msg.tool_calls) ``` I get the following response: ```json [{'name': 'GetWeather', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_01UfDA89knrhw3vFV9X47neT'}, {'name': 'GetWeather', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01NrYVRYae7m7z7tBgyPb3Gd'}, {'name': 'GetPopulation', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_01EPFEpDgzL6vV2dTpD9SVP5'}, {'name': 'GetPopulation', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01B5J6tPJXgwwfhQX9BHP2dt'}] ``` ## LiteLLM Based on https://litellm.vercel.app/docs/completion/function_call ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.utils.function_calling import convert_to_openai_tool import litellm class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") prompt = "Which city is hotter today and which is bigger: LA or NY?" tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)] response = litellm.completion(model="claude-3-sonnet-20240229", messages=[{'role': 'user', 'content': prompt}], tools=tools) print(response.choices[0].message.tool_calls) ``` ```python [ChatCompletionMessageToolCall(function=Function(arguments='{"location": "Los Angeles, CA"}', name='GetWeather'), id='toolu_01HeDWV5vP7BDFfytH5FJsja', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "New York, NY"}', name='GetWeather'), id='toolu_01EiLesUSEr3YK1DaE2jxsQv', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "Los Angeles, CA"}', name='GetPopulation'), id='toolu_01Xz26zvkBDRxEUEWm9pX6xa', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "New York, NY"}', name='GetPopulation'), id='toolu_01SDqKnsLjvUXuBsgAZdEEpp', type='function')] ``` ## ChatLiteLLM When I try the following ```python from langchain_core.pydantic_v1 import BaseModel, Field from langchain_core.utils.function_calling import convert_to_openai_tool from langchain_community.chat_models import ChatLiteLLM class GetWeather(BaseModel): '''Get the current weather in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") class GetPopulation(BaseModel): '''Get the current population in a given location''' location: str = Field(..., description="The city and state, e.g. San Francisco, CA") prompt = "Which city is hotter today and which is bigger: LA or NY?" tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)] llm = ChatLiteLLM(model="claude-3-sonnet-20240229", model_kwargs={"tools": tools}) ai_msg = llm.invoke(prompt) print(ai_msg) print(ai_msg.tool_calls) ``` ```python content="Okay, let's find out the current weather and populations for Los Angeles and New York City:" response_metadata={'token_usage': Usage(prompt_tokens=329, completion_tokens=193, total_tokens=522), 'model': 'claude-3-sonnet-20240229', 'finish_reason': 'tool_calls'} id='run-748b7a84-84f4-497e-bba1-320bd4823937-0' [] ``` --- When I apply the changes of this PR, the output is ```json [{'name': 'GetWeather', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_017D2tGjiaiakB1HadsEFZ4e'}, {'name': 'GetWeather', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01WrDpJfVqLkPejWzonPCbLW'}, {'name': 'GetPopulation', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_016UKyYrVAV9Pz99iZGgGU7V'}, {'name': 'GetPopulation', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01Sgv1imExFX1oiR1Cw88zKy'}] ``` If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. Co-authored-by: Igor Drozdov <idrozdov@gitlab.com>	2024-07-02 10:42:08 -04:00
Eugene Yurtsev	338cef35b4	community[patch]: update @root_validator in utilities namespace (#23768 ) Update all utilities to use `pre=True` or `pre=False` https://github.com/langchain-ai/langchain/issues/22819	2024-07-02 14:33:01 +00:00
antonpibm	ffde8a6a09	Milvus vectorstore: fix pass ids as argument after upsert (#23761 ) Description: Milvus vectorstore supports both `add_documents` via the base class and `upsert` method which deletes and re-adds documents based on their ids Issue: Due to mismatch in the interfaces the ids used by `upsert` are neglected in `add_documents`, as `ids` are passed as argument in `upsert` but via `kwargs` is `add_documents` This caused exceptions and inconsistency in the DB, tested with `auto_id=False` Fix: pass `ids` via `kwargs` to `add_documents`	2024-07-02 13:45:30 +00:00
Eugene Yurtsev	d084172b63	community[patch]: root validator set explicit pre=False or pre=True (#23764 ) See issue: https://github.com/langchain-ai/langchain/issues/22819	2024-07-02 09:42:05 -04:00
mattthomps1	cc55823486	docs: updated PPLX model (#23723 ) Description: updated pplx docs to reference a currently [supported model](https://docs.perplexity.ai/docs/model-cards). pplx-70b-online ->llama-3-sonar-small-32k-online --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-07-02 08:48:49 -04:00
Jacob Lee	7791d92711	community[patch]: Fix requests alias for load_tools (#23734 ) CC @baskaryan	2024-07-01 15:02:14 -07:00
Eugene Yurtsev	f24e38876a	community[patch]: Update root_validators to use explicit pre=True or pre=False (#23736 )	2024-07-01 17:13:23 -04:00
Eugene Yurtsev	5d2262af34	community[patch]: Update root_validators to use pre=True or pre=False (#23731 ) Update root_validators in preparation for pydantic 2 migration.	2024-07-01 20:10:15 +00:00
maang-h	96af8f31ae	community[patch]: Invoke callback prior to yielding token (#23638 ) - Description: Invoke callback prior to yielding token in stream and astream methods for ChatZhipuAI. - Issue: the issue #16913	2024-07-01 18:12:24 +00:00
Valentin	bf402f902e	community: Fix LanceDB similarity search bug (#23591 ) Description: LanceDB didn't allow querying the database using similarity score thresholds because the metrics value was missing. This PR simply fixes that bug. Issue: not applicable Dependencies: none Twitter handle: not available --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-07-01 16:33:45 +00:00
Rafael Pereira	4b9517db85	Jira: Allow Jira access using only the token (#23708 ) - Description: At the moment the Jira wrapper only accepts the the usage of the Username and Password/Token at the same time. However Jira allows the connection using only is useful for enterprise context. Co-authored-by: rpereira <rafael.pereira@criticalsoftware.com>	2024-07-01 13:13:51 +00:00
Tim Van Wassenhove	24916c6703	community: Register pandas df in duckdb when creating vector_store (#23690 ) - Description: Register pandas df in duckdb when creating vector_store - Issue: Resolves #23308 - Dependencies: None - Twitter handle: @timvw Co-authored-by: Tim Van Wassenhove <tim.van.wassenhove@telenetgroup.be>	2024-07-01 09:12:06 -04:00
Bagatur	fc8fd49328	openai, anthropic, ...: with_structured_output to pass in explicit tool choice (#23645 ) ...community, mistralai, groq, fireworks part of #23644	2024-06-28 16:39:53 -07:00
Bagatur	381aedcc61	docs: standardize azure openai page (#23642 ) part of #22296	2024-06-28 15:15:41 -07:00
j pradhan	5f21eab491	community:perplexity[patch]: standardize init args (#21794 ) updated request_timeout default alias value per related docstring. Related to [20085](https://github.com/langchain-ai/langchain/issues/20085) Thank you for contributing to LangChain! --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-28 13:26:12 +00:00
mackong	11483b0fb8	community[patch]: set tool name for tongyi&qianfan llm (#22889 ) - Description: The name of ToolMessage is default to None, which makes tool message send to LLM likes ```json {"role": "tool", "tool_call_id": "", "content": "{\"time\": \"12:12\"}", "name": null} ``` But the name seems essential for some LLMs like TongYi Qwen. so we need to set the name use agent_action's tool value. - Issue: N/A - Dependencies: N/A	2024-06-28 09:17:05 -04:00
Leonid Ganeline	e4caa41aa9	community: docstrings `toolkits` (#23616 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-06-28 08:40:52 -04:00
ccurme	adf2dc13de	community: fix lint (#23611 )	2024-06-27 22:12:16 +00:00
NG Sai Prasanth	5e6d23f27d	community: Standardise tool import for arxiv & semantic scholar (#23578 ) - Description: Fixing the way users have to import Arxiv and Semantic Scholar - Issue: Changed to use `from langchain_community.tools.arxiv import ArxivQueryRun` instead of `from langchain_community.tools.arxiv.tool import ArxivQueryRun` - Dependencies: None - Twitter handle: Nope	2024-06-27 16:35:50 -04:00
Ayo Ayibiowu	c6f700b7cb	fix(community): allow support for disabling max_tokens args (#21534 ) This PR fixes an issue with not able to use unlimited/infinity tokens from the respective provider for the LiteLLM provider. This is an issue when working in an agent environment that the token usage can drastically increase beyond the initial value set causing unexpected behavior.	2024-06-27 16:28:59 -04:00
mackong	70834cd741	community[patch]: support convert FunctionMessage for Tongyi (#23569 ) Description: For function call agent with Tongyi, cause the AgentAction will be converted to FunctionMessage by `47f69fe0d8/libs/core/langchain_core/agents.py (L188)` But now Tongyi's convert_message_to_dict doesn't support FunctionMessage `47f69fe0d8/libs/community/langchain_community/chat_models/tongyi.py (L184-L207)` Then next round conversation will be failed by the TypeError exception. This patch adds the support to convert FunctionMessage for Tongyi. Issue: N/A Dependencies: N/A	2024-06-27 15:49:26 -04:00
maang-h	5070004e8a	docs: Update Tongyi ChatModel docstring (#23540 ) - Description: Update Tongyi ChatModel rich docstring - Issue: the issue #22296	2024-06-26 13:07:13 -04:00
yonarw	6d0ebbca1e	community: SAP HANA Vector Engine fix for latest HANA release (#23516 ) - Description: This PR fixes an issue with SAP HANA Cloud QRC03 version. In that version the number to indicate no length being set for a vector column changed from -1 to 0. The change in this PR support both behaviours (old/new). - Dependencies: No dependencies have been introduced. - Tests: The change is covered by previous unit tests.	2024-06-26 13:15:51 +00:00
Alireza Kashani	c39521b70d	Update grobid.py (#23399 ) fixed potential `IndexError: list index out of range` in case there is no title Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-06-26 09:11:02 -04:00
Qingchuan Hao	ee282a1d2e	community: add missing link (#23526 )	2024-06-26 09:06:28 -04:00
Nuradil	c93d9e66e4	Community: Update and fix ZenGuardTool docs and add ZenguardTool to init files (#23415 ) Thank you for contributing to LangChain! - [x] PR title: "community: update docs and add tool to init.py" - [x] PR message: - Description: Fixed some errors and comments in the docs and added our ZenGuardTool and additional classes to init.py for easy access when importing - Question: when will you update the langchain-community package in pypi to make our tool available? - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Thank you for review! --------- Co-authored-by: Baur <baur.krykpayev@gmail.com>	2024-06-25 19:26:32 +00:00
Qingchuan Hao	ad50702934	community: add default value to bing_search_url (#23306 ) bing_search_url is an endpoint to requests bing search resource and is normally invariant to users, we can give it the default value to simply the uesages of this utility/tool	2024-06-25 08:08:41 -04:00
wenngong	b33d2346db	community: FlashrankRerank support loading customer client (#23350 ) Description: FlashrankRerank Document compressor support loading customer client Issue: #23338 Co-authored-by: gongwn1 <gongwn1@lenovo.com>	2024-06-24 17:50:08 -04:00
maang-h	f58c40b4e3	docs: Update QianfanChatEndpoint ChatModel docstring (#23337 ) - Description: Update QianfanChatEndpoint ChatModel rich docstring - Issue: the issue #22296	2024-06-24 17:42:46 -04:00
Rahul Triptahi	9ef93ecd7c	community[minor]: Added classification_location parameter in PebbloSafeLoader. (#22565 ) Description: Add classifier_location feature flag. This flag enables Pebblo to decide the classifier location, local or pebblo-cloud. Unit Tests: N/A Documentation: N/A --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2024-06-24 17:30:38 -04:00
yuncliu	398b2b9c51	community[minor]: Add Ascend NPU optimized Embeddings (#20260 ) - Description: Add NPU support for embeddings --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-24 20:15:11 +00:00
RUO	2b87e330b0	community: fix issue with nested field extraction in MongodbLoader (#22801 ) Description: This PR addresses an issue in the `MongodbLoader` where nested fields were not being correctly extracted. The loader now correctly handles nested fields specified in the `field_names` parameter. Issue: Fixes an issue where attempting to extract nested fields from MongoDB documents resulted in `KeyError`. Dependencies: No new dependencies are required for this change. Twitter handle: (Optional, your Twitter handle if you'd like a mention when the PR is announced) ### Changes 1. Field Name Parsing: - Added logic to parse nested field names and safely extract their values from the MongoDB documents. 2. Projection Construction: - Updated the projection dictionary to include nested fields correctly. 3. Field Extraction: - Updated the `aload` method to handle nested field extraction using a recursive approach to traverse the nested dictionaries. ### Example Usage Updated usage example to demonstrate how to specify nested fields in the `field_names` parameter: ```python loader = MongodbLoader( connection_string=MONGO_URI, db_name=MONGO_DB, collection_name=MONGO_COLLECTION, filter_criteria={"data.job.company.industry_name": "IT", "data.job.detail": { "$exists": True }}, field_names=[ "data.job.detail.id", "data.job.detail.position", "data.job.detail.intro", "data.job.detail.main_tasks", "data.job.detail.requirements", "data.job.detail.preferred_points", "data.job.detail.benefits", ], ) docs = loader.load() print(len(docs)) for doc in docs: print(doc.page_content) ``` ### Testing Tested with a MongoDB collection containing nested documents to ensure that the nested fields are correctly extracted and concatenated into a single page_content string. ### Note This change ensures backward compatibility for non-nested fields and improves functionality for nested field extraction. ### Output Sample ```python print(docs[:3]) ``` ```shell # output sample: [ Document( # Here in this example, page_content is the combined text from the fields below # "position", "intro", "main_tasks", "requirements", "preferred_points", "benefits" page_content='all combined contents from the requested fields in the document', metadata={'database': 'Your Database name', 'collection': 'Your Collection name'} ), ... ] ``` --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-24 19:29:11 +00:00
Tomaz Bratanic	aeeda370aa	Sanitize backticks from neo4j labels and types for import (#23367 )	2024-06-24 19:05:31 +00:00
Rave Harpaz	f5ff7f178b	Add OCI Generative AI new model support (#22880 ) - [x] PR title: community: Add OCI Generative AI new model support - [x] PR message: - Description: adding support for new models offered by OCI Generative AI services. This is a moderate update of our initial integration PR 16548 and includes a new integration for our chat models under /langchain_community/chat_models/oci_generative_ai.py - Issue: NA - Dependencies: No new Dependencies, just latest version of our OCI sdk - Twitter handle: NA - [x] Add tests and docs: 1. we have updated our unit tests 2. we have updated our documentation including a new ipynb for our new chat integration - [x] Lint and test: `make format`, `make lint`, and `make test` run successfully --------- Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com> Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>	2024-06-24 14:48:23 -04:00
Baur	aa358f2be4	community: Add ZenGuard tool (#22959 ) Description This is the community integration of ZenGuard AI - the fastest guardrails for GenAI applications. ZenGuard AI protects against: - Prompts Attacks - Veering of the pre-defined topics - PII, sensitive info, and keywords leakage. - Toxicity - Etc. Twitter Handle : @zenguardai - [x] Add tests and docs: If you're adding a new integration, please include 1. Added an integration test 2. Added colab - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. --------- Co-authored-by: Nuradil <nuradil.maksut@icloud.com> Co-authored-by: Nuradil <133880216+yaksh0nti@users.noreply.github.com>	2024-06-24 17:40:56 +00:00
Mathis Joffre	60103fc4a5	community: Fix OVHcloud 401 Unauthorized on embedding. (#23260 ) They are now rejecting with code 401 calls from users with expired or invalid tokens (while before they were being considered anonymous). Thus, the authorization header has to be removed when there is no token. Related to: #23178 --------- Signed-off-by: Joffref <mariusjoffre@gmail.com>	2024-06-24 12:58:32 -04:00
Leonid Ganeline	987099cfcd	community: `toolkits` docstrings (#23286 ) Added missed docstrings. Formatted docstrings to the consistent form. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-22 14:37:52 +00:00
Rahul Triptahi	0cd3f93361	Enhance metadata of sharepointLoader. (#22248 ) Description: 2 feature flags added to SharePointLoader in this PR: 1. load_auth: if set to True, adds authorised identities to metadata 2. load_extended_metadata, adds source, owner and full_path to metadata Unit tests:N/A Documentation: To be done. --------- Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com> Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>	2024-06-21 17:03:38 -07:00
Rajendra Kadam	7ee2822ec2	community: Fix TypeError in PebbloRetrievalQA (#23170 ) Description: Fix "`TypeError: 'NoneType' object is not iterable`" when the auth_context is absent in PebbloRetrievalQA. The auth_context is optional; hence, PebbloRetrievalQA should work without it, but it throws an error at the moment. This PR fixes that issue. Issue: NA Dependencies: None Unit tests: NA --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-21 17:04:00 -04:00
Iurii Umnov	3b7b933aa2	community[minor]: OpenAPI agent. Add support for PUT, DELETE and PATCH (#22962 ) Description: Add PUT, DELETE and PATCH tools to tool list for OpenAPI agent if dangerous requests are allowed. Issue: https://github.com/langchain-ai/langchain/issues/20469	2024-06-21 20:44:23 +00:00
Guangdong Liu	3c42bf8d97	community(patch):Fix PineconeHynridSearchRetriever not having search_kwargs (#21577 ) - close #21521	2024-06-21 16:27:52 -04:00
Rahul Triptahi	4bb3d5c488	[community][quick-fix]: changed from blob.path to blob.path.name in 0365BaseLoader. (#22287 ) Description: file_metadata_ was not getting propagated to returned documents. Changed the lookup key to the name of the blob's path. Changed blob.path key to blob.path.name for metadata_dict key lookup. Documentation: N/A Unit tests: N/A Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-21 15:51:03 -04:00
mackong	360a70c8a8	core[patch]: fix no current event loop for sql history in async mode (#22933 ) - Description: When use RunnableWithMessageHistory/SQLChatMessageHistory in async mode, we'll get the following error: ``` Error in RootListenersTracer.on_chain_end callback: RuntimeError("There is no current event loop in thread 'asyncio_3'.") ``` which throwed by `ddfbca38df/libs/community/langchain_community/chat_message_histories/sql.py (L259)`. and no message history will be add to database. In this patch, a new _aexit_history function which will'be called in async mode is added, and in turn aadd_messages will be called. In this patch, we use `afunc` attribute of a Runnable to check if the end listener should be run in async mode or not. - Issue: #22021, #22022 - Dependencies: N/A	2024-06-21 10:39:47 -04:00
Zheng Robert Jia	a349fce880	docs[minor],community[patch]: Minor tutorial docs improvement, minor import error quick fix. (#22725 ) minor changes to module import error handling and minor issues in tutorial documents. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-06-20 15:36:49 -04:00
Luis Moros	d5be160af0	community[patch]: Fix sql_databse.from_databricks issue when ran from Job (#23224 ) Desscription: When the ``sql_database.from_databricks`` is executed from a Workflow Job, the ``context`` object does not have a "browserHostName" property, resulting in an error. This change manages the error so the "DATABRICKS_HOST" env variable value is used instead of stoping the flow Co-authored-by: lmorosdb <lmorosdb>	2024-06-20 19:34:15 +00:00
maang-h	bc4cd9c5cc	community[patch]: Update root_validators ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat, ChatSparkLLM, ChatZhipuAI (#22853 ) This PR updates root validators for: - ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat, ChatSparkLLM, ChatZhipuAI Issues #22819 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-20 16:36:41 +00:00
Leonid Ganeline	51e75cf59d	community: docstrings (#23202 ) Added missed docstrings. Format docstrings to the consistent format (used in the API Reference)	2024-06-20 11:08:13 -04:00
xyd	9b3a025f9c	fix https://github.com/langchain-ai/langchain/issues/23215 (#23216 ) fix bug The ZhipuAIEmbeddings class is not working. Co-authored-by: xu yandong <shaonian@acsx1.onexmail.com>	2024-06-20 13:04:50 +00:00
Michał Krassowski	710197e18c	community[patch]: restore compatibility with SQLAlchemy 1.x (#22546 ) - Description: Restores compatibility with SQLAlchemy 1.4.x that was broken since #18992 and adds a test run for this version on CI (only for Python 3.11) - Issue: fixes #19681 - Dependencies: None - Twitter handle: `@krassowski_m` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-19 17:58:57 +00:00
Jorge Piedrahita Ortiz	b3e53ffca0	community[patch]: sambanova llm integration improvement (#23137 ) - Description: sambanova sambaverse integration improvement: removed input parsing that was changing raw user input, and was making to use process prompt parameter as true mandatory	2024-06-19 10:30:14 -07:00
Jorge Piedrahita Ortiz	e162893d7f	community[patch]: update sambastudio embeddings (#23133 ) Description: update sambastudio embeddings integration, now compatible with generic endpoints and CoE endpoints	2024-06-19 10:26:56 -07:00
chenxi	505a2e8743	fix: MoonshotChat fails when setting the moonshot_api_key through the OS environment. (#23176 ) Close #23174 Co-authored-by: tianming <tianming@bytenew.com>	2024-06-19 16:28:24 +00:00
鹿鹿鹿鲨	6b46b5e9ce	community: add request_kwargs and expect TimeError AsyncHtmlLoader (#23068 ) - Description: add `request_kwargs` and expect `TimeError` in `_fetch` function for AsyncHtmlLoader. This allows you to fill in the kwargs parameter when using the `load()` method of the `AsyncHtmlLoader` class. Co-authored-by: Yucolu <yucolu@tencent.com>	2024-06-18 20:02:46 -07:00
Artem Mukhin	e271f75bee	docs: Fix URL formatting in deprecation warnings (#23075 ) Description Updated the URLs in deprecation warning messages. The URLs were previously written as raw strings and are now formatted to be clickable HTML links. Example of a broken link in the current API Reference: https://api.python.langchain.com/en/latest/chains/langchain.chains.openai_functions.extraction.create_extraction_chain_pydantic.html <img width="942" alt="Screenshot 2024-06-18 at 13 21 07" src="https://github.com/langchain-ai/langchain/assets/4854600/a1b1863c-cd03-4af2-a9bc-70375407fb00">	2024-06-18 14:49:58 -04:00
Gabriel Petracca	c6660df58e	community[minor]: Implement Doctran async execution (#22372 ) Description The DoctranTextTranslator has an async transform function that was not implemented because [the doctran library](https://github.com/psychic-api/doctran) uses a sync version of the `execute` method. - I implemented the `DoctranTextTranslator.atransform_documents()` method using `asyncio.to_thread` to run the function in a separate thread. - I updated the example in the Notebook with the new async version. - The performance improvements can be appreciated when a big document is divided into multiple chunks. Relates to: - Issue #14645: https://github.com/langchain-ai/langchain/issues/14645 - Issue #14437: https://github.com/langchain-ai/langchain/issues/14437 - https://github.com/langchain-ai/langchain/pull/15264 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-18 18:17:37 +00:00
nold	226802f0c4	community: add args_schema to SearxSearch (#22954 ) This change adds args_schema (pydantic BaseModel) to SearxSearchRun for correct schema formatting on LLM function calls Issue: currently using SearxSearchRun with OpenAI function calling returns the following error "TypeError: SearxSearchRun._run() got an unexpected keyword argument '__arg1' ". This happens because the schema sent to the LLM is "input: '{"__arg1":"foobar"}'" while the method should be called with the "query" parameter. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 17:27:39 +00:00
Finlay Macklon	616d06d7fe	community: glob multiple patterns when using DirectoryLoader (#22852 ) - Description: Updated community.langchain_community.document_loaders.directory.py to enable the use of multiple glob patterns in the `DirectoryLoader` class. Now, the glob parameter is of type `list[str] \| str` and still defaults to the same value as before. I updated the docstring of the class to reflect this, and added a unit test to community.tests.unit_tests.document_loaders.test_directory.py named `test_directory_loader_glob_multiple`. This test also shows an example of how to use the new functionality. - ~~Issue:~~Discussion Thread: https://github.com/langchain-ai/langchain/discussions/18559 - Dependencies: None - Twitter handle: N/a - [x] Add tests and docs - Added test (described above) - Updated class docstring - [x] Lint and test --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-06-18 09:24:50 -07:00
Raghav Dixit	55705c0f5e	LanceDB integration update (#22869 ) Added : - [x] relevance search (w/wo scores) - [x] maximal marginal search - [x] image ingestion - [x] filtering support - [x] hybrid search w reranking make test, lint_diff and format checked.	2024-06-17 20:54:26 -07:00
Chang Liu	62c8a67f56	community: add KafkaChatMessageHistory (#22216 ) Add chat history store based on Kafka. Files added: `libs/community/langchain_community/chat_message_histories/kafka.py` `docs/docs/integrations/memory/kafka_chat_message_history.ipynb` New issue to be created for future improvement: 1. Async method implementation. 2. Message retrieval based on timestamp. 3. Support for other configs when connecting to cloud hosted Kafka (e.g. add `api_key` field) 4. Improve unit testing & integration testing.	2024-06-17 20:34:01 -07:00
Lucas Tucker	e25a5966b5	docs: Standardize DocumentLoader docstrings (#22932 ) Standardizing DocumentLoader docstrings (of which there are many) This PR addresses issue #22866 and adds docstrings according to the issue's specified format (in the appendix) for files csv_loader.py and json_loader.py in langchain_community.document_loaders. In particular, the following sections have been added to both CSVLoader and JSONLoader: Setup, Instantiate, Load, Async load, and Lazy load. It may be worth adding a 'Metadata' section to the JSONLoader docstring to clarify how we want to extract the JSON metadata (using the `metadata_func` argument). The files I used to walkthrough the various sections were `example_2.json` from [HERE](https://support.oneskyapp.com/hc/en-us/articles/208047697-JSON-sample-files) and `hw_200.csv` from [HERE](https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html). --------- Co-authored-by: lucast2021 <lucast2021@headroyce.org> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-18 03:26:36 +00:00
Mohammad Mohtashim	60ba02f5db	[Community]: Fixed DDG DuckDuckGoSearchResults Docstring (#22968 ) - Description: A very small fix in the Docstring of `DuckDuckGoSearchResults` identified in the following issue. - Issue: #22961 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 03:16:24 +00:00
Eun Hye Kim	70761af8cf	community: Fix #22975 (Add SSL Verification Option to Requests Class in langchain_community) (#22977 ) - PR title: "community: Fix #22975 (Add SSL Verification Option to Requests Class in langchain_community)" - PR message: - Description: - Added an optional verify parameter to the Requests class with a default value of True. - Modified the get, post, patch, put, and delete methods to include the verify parameter. - Updated the _arequest async context manager to include the verify parameter. - Added the verify parameter to the GenericRequestsWrapper class and passed it to the Requests class. - Issue: This PR fixes issue #22975. - Dependencies: No additional dependencies are required for this change. - Twitter handle: @lunara_x You can check this change with below code. ```python from langchain_openai.chat_models import ChatOpenAI from langchain.requests import RequestsWrapper from langchain_community.agent_toolkits.openapi import planner from langchain_community.agent_toolkits.openapi.spec import reduce_openapi_spec with open("swagger.yaml") as f: data = yaml.load(f, Loader=yaml.FullLoader) swagger_api_spec = reduce_openapi_spec(data) llm = ChatOpenAI(model='gpt-4o') swagger_requests_wrapper = RequestsWrapper(verify=False) # modified point superset_agent = planner.create_openapi_agent(swagger_api_spec, swagger_requests_wrapper, llm, allow_dangerous_requests=True, handle_parsing_errors=True) superset_agent.run( "Tell me the number and types of charts and dashboards available." ) ``` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2024-06-18 03:12:40 +00:00
Mohammad Mohtashim	bf839676c7	[Community]: FIxed the DocumentDBVectorSearch `_similarity_search_without_score` (#22970 ) - Description: The PR #22777 introduced a bug in `_similarity_search_without_score` which was raising the `OperationFailure` error. The mistake was syntax error for MongoDB pipeline which has been corrected now. - Issue: #22770	2024-06-17 20:08:42 -07:00
Anders Swanson	aacc6198b9	community: OCI GenAI embedding batch size (#22986 ) Thank you for contributing to LangChain! - [x] PR title: "community: OCI GenAI embedding batch size" - [x] PR message: - Issue: #22985 - [ ] Add tests and docs: N/A - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Signed-off-by: Anders Swanson <anders.swanson@oracle.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-17 22:06:45 +00:00
Oguz Vuruskaner	dd25d08c06	community[minor]: add tool calling for DeepInfraChat (#22745 ) DeepInfra now supports tool calling for supported models. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-17 15:21:49 -04:00
maang-h	c6b7db6587	community: Add Baichuan Embeddings batch size (#22942 ) - Support batch size Baichuan updates the document, indicating that up to 16 documents can be imported at a time - Standardized model init arg names - baichuan_api_key -> api_key - model_name -> model	2024-06-17 14:11:04 -04:00
Shubham Pandey	56ac94e014	community[minor]: add `ChatSnowflakeCortex` chat model (#21490 ) Description: This PR adds a chat model integration for [Snowflake Cortex](https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions), which gives an instant access to industry-leading large language models (LLMs) trained by researchers at companies like Mistral, Reka, Meta, and Google, including [Snowflake Arctic](https://www.snowflake.com/en/data-cloud/arctic/), an open enterprise-grade model developed by Snowflake. Dependencies: Snowflake's [snowpark](https://pypi.org/project/snowflake-snowpark-python/) library is required for using this integration. Twitter handle: [@gethouseware](https://twitter.com/gethouseware) - [x] Add tests and docs: 1. integration tests: `libs/community/tests/integration_tests/chat_models/test_snowflake.py` 2. unit tests: `libs/community/tests/unit_tests/chat_models/test_snowflake.py` 3. example notebook: `docs/docs/integrations/chat/snowflake.ipynb` - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-06-17 09:47:05 -07:00
Christopher Tee	ada03dd273	community(you): Better support for You.com News API (#22622 ) ## Description While `YouRetriever` supports both You.com's Search and News APIs, news is supported as an afterthought. More specifically, not all of the News API parameters are exposed for the user, only those that happen to overlap with the Search API. This PR: - improves support for both APIs, exposing the remaining News API parameters while retaining backward compatibility - refactor some REST parameter generation logic - updates the docstring of `YouSearchAPIWrapper` - add input validation and warnings to ensure parameters are properly set by user - 🚨 Breaking: Limit the news results to `k` items If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-06-15 20:05:19 +00:00
maang-h	7a0af56177	docs: update ZhipuAI ChatModel docstring (#22934 ) - Description: Update ZhipuAI ChatModel rich docstring - Issue: the issue #22296	2024-06-15 09:12:21 -04:00
Bitmonkey	570d45b2a1	Update ollama.py with optional raw setting. (#21486 ) Ollama has a raw option now. https://github.com/ollama/ollama/blob/main/docs/api.md Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17. --------- Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com> Co-authored-by: isaac hershenson <ihershenson@hmc.edu>	2024-06-14 17:19:26 -07:00
caiyueliang	9944ad7f5f	community: 'Solve the issue where the _search function in ElasticsearchStore supports passing a query_vector parameter, but the parameter does not take effect. (#21532 ) Issue: When using the similarity_search_with_score function in ElasticsearchStore, I expected to pass in the query_vector that I have already obtained. I noticed that the _search function does support the query_vector parameter, but it seems to be ineffective. I am attempting to resolve this issue. Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>	2024-06-14 17:13:11 -07:00
Baskar Gopinath	c4f2bc9540	docs: Fix wrongly referenced class name in confluence.py (#22879 ) Fixes #22542 Changed ConfluenceReader to ConfluenceLoader	2024-06-14 14:00:48 -07:00
Philippe PRADOS	b61de9728e	community[minor]: Fix long_context_reorder.py async (#22839 ) Implement `async def atransform_documents( self, documents: Sequence[Document], **kwargs: Any ) -> Sequence[Document]` for `LongContextReorder`	2024-06-14 13:55:18 -04:00
Eugene Yurtsev	c72bcda4f2	community[major], experimental[patch]: Remove Python REPL from community (#22904 ) Remove the REPL from community, and suggest an alternative import from langchain_experimental. Fix for this issue: https://github.com/langchain-ai/langchain/issues/14345 This is not a bug in the code or an actual security risk. The python REPL itself is behaving as expected. The PR is done to appease blanket security policies that are just looking for the presence of exec in the code. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-14 17:53:29 +00:00
Eugene Yurtsev	9a877c7adb	community[patch]: SitemapLoader restrict depth of parsing sitemap (CVE-2024-2965) (#22903 ) This PR restricts the depth to which the sitemap can be parsed. Fix for: CVE-2024-2965	2024-06-14 13:04:40 -04:00
Eugene Yurtsev	4a77a3ab19	core[patch]: fix validation of @deprecated decorator (#22513 ) This PR moves the validation of the decorator to a better place to avoid creating bugs while deprecating code. Prevent issues like this from arising: https://github.com/langchain-ai/langchain/issues/22510 we should replace with a linter at some point that just does static analysis	2024-06-14 16:52:30 +00:00
Mohammad Mohtashim	d1b7a934aa	[Community]: HuggingFaceCrossEncoder `score` accounting for <not-relevant score,relevant score> pairs. (#22578 ) - Description: Some of the Cross-Encoder models provide scores in pairs, i.e., <not-relevant score (higher means the document is less relevant to the query), relevant score (higher means the document is more relevant to the query)>. However, the `HuggingFaceCrossEncoder` `score` method does not currently take into account the pair situation. This PR addresses this issue by modifying the method to consider only the relevant score if score is being provided in pair. The reason for focusing on the relevant score is that the compressors select the top-n documents based on relevance. - Issue: #22556 - Please also refer to this [comment](https://github.com/UKPLab/sentence-transformers/issues/568#issuecomment-729153075)	2024-06-14 08:28:24 -07:00
Thanh Nguyen	b5e2ba3a47	community[minor]: add chat model llamacpp (#22589 ) - PR title: [community] add chat model llamacpp - PR message: - Description: This PR introduces a new chat model integration with llamacpp_python, designed to work similarly to the existing ChatOpenAI model. + Work well with instructed chat, chain and function/tool calling. + Work with LangGraph (persistent memory, tool calling), will update soon - Dependencies: This change requires the llamacpp_python library to be installed. @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-14 14:51:43 +00:00
Anindyadeep	c417803908	community[minor]: Prem Templates (#22783 ) This PR adds the feature add Prem Template feature in ChatPremAI. Additionally it fixes a minor bug for API auth error when API passed through arguments.	2024-06-13 19:59:28 -07:00
maang-h	1055b9a309	community[minor]: Implement ZhipuAIEmbeddings interface (#22821 ) - Description: Implement ZhipuAIEmbeddings interface, include: - The `embed_query` method - The `embed_documents` method refer to [ZhipuAI Embedding-2](https://open.bigmodel.cn/dev/api#text_embedding) --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-06-13 19:45:11 -07:00
Isaac Francisco	75e966a2fa	docs, cli[patch]: document loaders doc template (#22862 ) From: https://github.com/langchain-ai/langchain/pull/22290 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-13 19:28:57 -07:00
Eugene Yurtsev	8f7cc73817	ci: Add script to check for pickle usage in community (#22863 ) Add script to check for pickle usage in community.	2024-06-13 16:13:15 -04:00
Eugene Yurtsev	77209f315e	community[patch]: FAISS VectorStore deserializer should be opt-in (#22861 ) FAISS deserializer uses pickle module. Users have to opt-in to de-serialize.	2024-06-13 15:48:13 -04:00
Isaac Francisco	869523ad72	[docs]: added info for TavilySearchResults (#22765 )	2024-06-13 12:14:11 -07:00
Isaac Francisco	345fd3a556	minor functionality change: adding API functionality to tavilysearch (#22761 )	2024-06-13 11:10:28 -07:00
Isaac Francisco	034257e9bf	docs: improved recursive url loader docs (#22648 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-13 11:09:35 -07:00
HyoJin Kang	b6bf2bb234	community[patch]: fix database uri type in SQLDatabase (#22661 ) Description sqlalchemy uses "sqlalchemy.engine.URL" type for db uri argument. Added 'URL' type for compatibility. Issue: None Dependencies: None --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-12 15:11:00 -04:00
Eugene Yurtsev	265e650e64	community[patch]: Update root_validators embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub, Toolkits: Connery, ChatModels: PAI_EAS, (#22828 ) This PR updates root validators for: * Embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub * Toolkits: Connery * ChatModels: PAI_EAS Following this issue: https://github.com/langchain-ai/langchain/issues/22819	2024-06-12 13:59:05 -04:00
JonZeolla	32ba8cfab0	community[minor]: implement huggingface show_progress consistently (#22682 ) - Description: This implements `show_progress` more consistently (i.e. it is also added to the `HuggingFaceBgeEmbeddings` object). - Issue: This implements `show_progress` more consistently in the embeddings huggingface classes. Previously this could have been set via `encode_kwargs`. - Dependencies: None - Twitter handle: @jonzeolla	2024-06-12 17:30:56 +00:00
mrhbj	a1268d9e9a	community[patch]: fix hunyuan message include chinese signature error (#22795 ) (#22796 ) … (#22795) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2024-06-12 12:30:22 -04:00
Mr. Lance E Sloan «UMich»	08c466c603	community[patch]: bugfix for `YoutubeLoader`'s `LINES` format (#22815 ) - Description: A change I submitted recently introduced a bug in `YoutubeLoader`'s `LINES` output format. In those conditions, curly braces ("`{}`") creates a set, not a dictionary. This bugfix explicitly specifies that a dictionary is created. - Issue: N/A - Dependencies: N/A - Twitter: lsloan_umich - Mastodon: [lsloan@mastodon.social](https://mastodon.social/@lsloan)	2024-06-12 12:29:34 -04:00
Philippe PRADOS	23c22fcbc9	langchain[minor]: Make EmbeddingsFilters async (#22737 ) Add native async implementation for EmbeddingsFilter	2024-06-12 12:27:26 -04:00
mrhbj	9212c9fcb8	community[patch]: fix hunyuan client json analysis (#22452 ) (#22767 ) Thank you for contributing to LangChain! - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-11 19:05:18 +00:00
Rohan Aggarwal	86e8224cf1	community[patch]: Support for old clients (Thin and Thick) Oracle Vector Store (#22766 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" Support for old clients (Thin and Thick) Oracle Vector Store - [ ] PR message: *Delete this entire checklist* and replace with Support for old clients (Thin and Thick) Oracle Vector Store - [ ] Add tests and docs: If you're adding a new integration, please include Have our own local tests --------- Co-authored-by: rohan.aggarwal@oracle.com <rohaagga@phoenix95642.dev3sub2phx.databasede3phx.oraclevcn.com>	2024-06-11 11:36:06 -07:00
Mr. Lance E Sloan «UMich»	84dc2dd059	community[patch]: Load YouTube transcripts (captions) as fixed-duration chunks with start times (#21710 ) - Description: Add a new format, `CHUNKS`, to `langchain_community.document_loaders.youtube.YoutubeLoader` which creates multiple `Document` objects from YouTube video transcripts (captions), each of a fixed duration. The metadata of each chunk `Document` includes the start time of each one and a URL to that time in the video on the YouTube website. I had implemented this for UMich (@umich-its-ai) in a local module, but it makes sense to contribute this to LangChain community for all to benefit and to simplify maintenance. - Issue: N/A - Dependencies: N/A - Twitter: lsloan_umich - Mastodon: [lsloan@mastodon.social](https://mastodon.social/@lsloan) With regards to tests and documentation, most existing features of the `YoutubeLoader` class are not tested. Only the `YoutubeLoader.extract_video_id()` static method had a test. However, while I was waiting for this PR to be reviewed and merged, I had time to add a test for the chunking feature I've proposed in this PR. I have added an example of using chunking to the `docs/docs/integrations/document_loaders/youtube_transcript.ipynb` notebook. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-11 17:44:36 +00:00
Aayush Kataria	71811e0547	community[minor]: Adds a vector store for Azure Cosmos DB for NoSQL (#21676 ) This PR add supports for Azure Cosmos DB for NoSQL vector store. Summary: Description: added vector store integration for Azure Cosmos DB for NoSQL Vector Store, Dependencies: azure-cosmos dependency, Tag maintainer: @hwchase17, @baskaryan @efriis @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-11 10:34:01 -07:00
Mohammad Mohtashim	36cad5d25c	[Community]: Added Metadata filter support for DocumentDB Vector Store (#22777 ) - Description: As pointed out in this issue #22770, DocumentDB `similarity_search` does not support filtering through metadata which this PR adds by passing in the parameter `filter`. Also this PR fixes a minor Documentation error. - Issue: #22770 --------- Co-authored-by: Erick Friis <erickfriis@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-11 16:37:53 +00:00
Dmitry Stepanov	912751e268	Ollama vision support (#22734 ) Description: Ollama vision with messages in OpenAI-style support `{ "image_url": { "url": ... } }` Issue: #22460 Added flexible solution for ChatOllama to support chat messages with images. Works when you provide either `image_url` as a string or as a dict with "url" inside (like OpenAI does). So it makes available to use tuples with `ChatPromptTemplate.from_messages()` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-11 16:10:19 +00:00
Jaeyeon Kim(김재연)	ce4e29ae42	community[minor]: fix redis store docstring and streamline initialization code (#22730 ) Thank you for contributing to LangChain! ### Description Fix the example in the docstring of redis store. Change the initilization logic and remove redundant check, enhance error message. ### Issue The example in docstring of how to use redis store was wrong. ![image](https://github.com/langchain-ai/langchain/assets/37469330/78c5d9ce-ee66-45b3-8dfe-ea29f125e6e9) ### Dependencies Nothing - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-11 14:08:05 +00:00
am-kinetica	ad101adec8	community[patch]: Kinetica Integrations handled error in querying; quotes in table names; updated gpudb API (#22724 ) - [ ] Miscellaneous updates and fixes: - Description: Handled error in querying; quotes in table names; updated gpudb API - Issue: Threw an error with an error message difficult to understand if a query failed or returned no records - Dependencies: Updated GPUDB API version to `7.2.0.9` @baskaryan @hwchase17	2024-06-11 10:01:26 -04:00
Mathis Joffre	ea43f40daf	community[minor]: Add support for OVHcloud AI Endpoints Embedding (#22667 ) Description: Add support for [OVHcloud AI Endpoints](https://endpoints.ai.cloud.ovh.net/) Embedding models. Inspired by: https://gist.github.com/gmasse/e1f99339e161f4830df6be5d0095349a Signed-off-by: Joffref <mariusjoffre@gmail.com>	2024-06-10 21:07:25 +00:00
Eugene Yurtsev	05d31a2f00	community[patch]: Add missing type annotations (#22758 ) Add missing type annotations to objects in community. These missing type annotations will raise type errors in pydantic 2.	2024-06-10 16:59:28 -04:00
Bharat Ramanathan	2b5631a6be	community[patch]: fix `WandbTracer` to work with new "RunV2" API (#22673 ) - Description: This PR updates the `WandbTracer` to work with the new RunV2 API so that wandb Traces logging works correctly for new LangChain versions. Here's an example [run](https://wandb.ai/parambharat/langchain-tracing/runs/wpm99ftq) from the existing tests - Issue: https://github.com/wandb/wandb/issues/7762 - Twitter handle: @ParamBharat _If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17._	2024-06-10 13:56:35 -07:00
Oguz Vuruskaner	f0f4532579	community[patch]: fix deepinfra inference (#22680 ) This PR includes: 1. Update of default model to LLama3. 2. Handle some 400x errors with more user friendly error messages. 3. Handle user errors.	2024-06-10 13:55:55 -07:00
Tomaz Bratanic	76a193decc	community[patch]: Add function response to graph cypher qa chain (#22690 ) LLMs struggle with Graph RAG, because it's different from vector RAG in a way that you don't provide the whole context, only the answer and the LLM has to believe. However, that doesn't really work a lot of the time. However, if you wrap the context as function response the accuracy is much better. btw... `union[LLMChain, Runnable]` is linting fun, that's why so many ignores	2024-06-10 13:52:17 -07:00
X-HAN	34edfe4a16	community[minor]: add Volcengine Rerank (#22700 ) Description: this PR adds Volcengine Rerank capability to Langchain, you can find Volcengine Rerank API from [here](https://www.volcengine.com/docs/84313/1254474) & [here](https://www.volcengine.com/docs/84313/1254605). [Volcengine](https://www.volcengine.com/) is a cloud service platform developed by ByteDance, the parent company of TikTok. You can obtain Volcengine API AK/SK from [here](https://www.volcengine.com/docs/84313/1254553). Dependencies: VolcengineRerank depends on `volcengine` python package. Twitter handle: my twitter/x account is https://x.com/LastMonopoly and I'd like a mention, thank you! Tests and docs 1. integration test: `test_volcengine_rerank.py` 2. example notebook: `volcengine_rerank.ipynb` Lint and test: I have run `make format`, `make lint` and `make test` from the root of the package I've modified.	2024-06-10 13:41:05 -07:00
Mohammad Mohtashim	c3cce98d86	community[patch]: Small Fix in OutlookMessageLoader (Close the Message once Open) (#22744 ) - Description: A very small fix where we close the message when it opened - Issue: #22729	2024-06-10 13:08:39 -07:00
Max Mulatz	058a64c563	Community[minor]: Add language parser for Elixir (#22742 ) Hi 👋 First off, thanks a ton for your work on this 💚 Really appreciate what you're providing here for the community. ## Description This PR adds a basic language parser for the [Elixir](https://elixir-lang.org/) programming language. The parser code is based upon the approach outlined in https://github.com/langchain-ai/langchain/pull/13318: it's using `tree-sitter` under the hood and aligns with all the other `tree-sitter` based parses added that PR. The `CHUNK_QUERY` I'm using here is probably not the most sophisticated one, but it worked for my application. It's a starting point to provide "core" parsing support for Elixir in LangChain. It enables people to use the language parser out in real world applications which may then lead to further tweaking of the queries. I consider this PR just the ground work. - Dependencies: requires `tree-sitter` and `tree-sitter-languages` from the extended dependencies - Twitter handle:`@bitcrowd` ## Checklist - [x] PR title: "package: description" - [x] Add tests and docs - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. <!-- If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. -->	2024-06-10 15:56:57 +00:00
Enzo Poggio	8f019e91d7	community[patch]: Use Custom Logger Instead of Root Logger in get_user_agent Function (#22691 ) ## Description This PR addresses a logging inconsistency in the `get_user_agent` function. Previously, the function was using the root logger to log a warning message when the "USER_AGENT" environment variable was not set. This bypassed the custom logger `log` that was created at the start of the module, leading to potential inconsistencies in logging behavior. Changes: - Replaced `logging.warning` with `log.warning` in the `get_user_agent` function to ensure that the custom logger is used. This change ensures that all logging in the `get_user_agent` function respects the configurations of the custom logger, leading to more consistent and predictable logging behavior. ## Dependencies None ## Issue None ## Tests and docs ☝🏻 see description ## `make format`, `make lint` & `cd libs/community; make test` ```shell > make format poetry run ruff format docs templates cookbook 1417 files left unchanged poetry run ruff check --select I --fix docs templates cookbook All checks passed! ``` ```shell > make lint poetry run ruff check docs templates cookbook All checks passed! poetry run ruff format docs templates cookbook --diff 1417 files already formatted poetry run ruff check --select I docs templates cookbook All checks passed! git grep 'from langchain import' docs/docs templates cookbook \| grep -vE 'from langchain import (hub)' && exit 1 \|\| exit 0 ``` ~cd libs/community; make test~ too much dependencies for integration ... ```shell > poetry run pytest tests/unit_tests .... ==== 884 passed, 466 skipped, 4447 warnings in 15.93s ==== ``` I choose you randomly : @ccurme	2024-06-08 02:33:07 +00:00
Philippe PRADOS	9aabb446c5	community[minor]: Add SQL storage implementation (#22207 ) Hello @eyurtsev - package: langchain-comminity - Description: Add SQL implementation for docstore. A new implementation, in line with my other PR ([async PGVector](https://github.com/langchain-ai/langchain-postgres/pull/32), [SQLChatMessageMemory](https://github.com/langchain-ai/langchain/pull/22065)) - Twitter handler: pprados --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Piotr Mardziel <piotrm@gmail.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-07 21:17:02 +00:00
Nithish Raghunandanan	f2f0e0e13d	couchbase: Add the initial version of Couchbase partner package (#22087 ) Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-06-07 14:04:08 -07:00
Cahid Arda Öz	6c07eb0c12	community[minor]: Add UpstashRatelimitHandler (#21885 ) Adding `UpstashRatelimitHandler` callback for rate limiting based on number of chain invocations or LLM token usage. For more details, see [upstash/ratelimit-py repository](https://github.com/upstash/ratelimit-py) or the notebook guide included in this PR. Twitter handle: @cahidarda --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-06-07 21:02:06 +00:00
Erick Friis	a24a9c6427	multiple: get rid of pyproject extras (#22581 ) They cause `poetry lock` to take a ton of time, and `uv pip install` can resolve the constraints from these toml files in trivial time (addressing problem with #19153) This allows us to properly upgrade lockfile dependencies moving forward, which revealed some issues that were either fixed or type-ignored (see file comments)	2024-06-06 15:45:22 -07:00
Xiangrui Meng	f26ab93df8	community: support Databricks Unity Catalog functions as LangChain tools (#22555 ) This PR adds support for using Databricks Unity Catalog functions as LangChain tools, which runs inside a Databricks SQL warehouse. * An example notebook is provided.	2024-06-06 09:38:50 -07:00
lucasiscovici	05bf98b2f9	community[patch]: pgvector replace nin_ by not_in (#22619 ) - [ ] community: "pgvector: replace nin_ by not_in" - [ ] PR message: nin_ do not exist in sqlalchemy orm, it's not_in	2024-06-06 12:17:22 -04:00
Isaac Francisco	ba3e219d83	community[patch]: recursive url loader fix and unit tests (#22521 ) Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-05 17:56:20 -07:00
Jeffrey Mak	5fc5ed463c	community[patch]:Support filter for AzureAISearchRetriever (#22303 ) Description: The AzureAISearchRetriever does not support the "$filter" argument offered in the AISearch API: https://learn.microsoft.com/en-us/rest/api/searchservice/documents/search-get?view=rest-searchservice-2023-11-01&tabs=HTTP The $filter allows filtering of indexes based on values in metadata. Issue: https://github.com/langchain-ai/langchain/issues/19885 Dependencies: No Twitter handle: @Jeffreym9M - [ ] Add tests and docs: Not relevant - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2024-06-05 16:53:19 -07:00
Isaac Francisco	148088a588	docs: duckduckgosearch options listed (#22568 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-06-05 23:29:47 +00:00
X-HAN	62f13f95e4	community[minor]: add DashScope Rerank (#22403 ) Description: this PR adds DashScope Rerank capability to Langchain, you can find DashScope Rerank API from [here](https://help.aliyun.com/document_detail/2780058.html?spm=a2c4g.2780059.0.0.6d995024FlrJ12) & [here](https://help.aliyun.com/document_detail/2780059.html?spm=a2c4g.2780058.0.0.63f75024cr11N9). [DashScope](https://dashscope.aliyun.com/) is the generative AI service from Alibaba Cloud (Aliyun). You can create DashScope API key from [here](https://bailian.console.aliyun.com/?apiKey=1#/api-key). Dependencies: DashScopeRerank depends on `dashscope` python package. Twitter handle: my twitter/x account is https://x.com/LastMonopoly and I'd like a mention, thanks you! Tests and docs 1. integration test: `test_dashscope_rerank.py` 2. example notebook: `dashscope_rerank.ipynb` Lint and test: I have run `make format`, `make lint` and `make test` from the root of the package I've modified. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-05 15:40:21 -07:00
Ethan Yang	29064848f9	[Community]add option to delete the prompt from HF output (#22225 ) This will help to solve pattern mismatching issue when parsing the output in Agent. https://github.com/langchain-ai/langchain/issues/21912	2024-06-05 18:38:54 -04:00
Bagatur	584a1e30ac	community[patch]: AzureSearch async functions (#22075 )	2024-06-05 14:39:54 -07:00
ccurme	af129974a3	community: update how OpenAIAssistantV2Runnable creates threads with tool_resources (#22549 ) https://github.com/langchain-ai/langchain/issues/22503	2024-06-05 14:19:41 -04:00

... 3 4 5 6 7 ...

1463 Commits