langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-07-07 05:30:39 +00:00

Author	SHA1	Message	Date
Michael Chin	e120378695	community: Additional AWS deprecations (#29447 ) Added deprecation warnings for a few more classes that were moved to the `langchain-aws` package: - [SageMaker Endpoint LLM](https://python.langchain.com/api_reference/aws/retrievers/langchain_aws.retrievers.bedrock.AmazonKnowledgeBasesRetriever.html) - [Amazon Kendra retriever](https://python.langchain.com/api_reference/aws/retrievers/langchain_aws.retrievers.kendra.AmazonKendraRetriever.html) - [Amazon Bedrock Knowledge Bases retriever](https://python.langchain.com/api_reference/aws/retrievers/langchain_aws.retrievers.bedrock.AmazonKnowledgeBasesRetriever.html)	2025-01-28 09:50:14 -05:00
Erick Friis	2d776351af	community: release 0.3.16 (#29452 )	2025-01-28 07:44:54 +00:00
Adrián Panella	1551d9750c	community(doc_loaders): allow any credential type in AzureAIDocumentI… (#29289 ) Allow any credential type in AzureAIDocumentIntelligence, not only `api_key`. This allows the use of any of the credential types integrated with AD. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-27 20:56:30 +00:00
Jorge Piedrahita Ortiz	3b886cdbb2	libs: add sambanova-langchain integration package (#29417 ) - Description: Add sambanova-langchain integration package as suggested in previous PRs --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-27 20:34:55 +00:00
Mohammad Anash	aba1fd0bd4	fixed similarity search with score error #29407 (#29413 ) Description: Fix TypeError in AzureSearch similarity_search_with_score by removing search_type from kwargs before passing to underlying requests. This resolves issue #29407 where search_type was being incorrectly passed through to Session.request(). Issue: #29407 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-27 20:34:42 +00:00
Teruaki Ishizaki	3fce78994e	community: Fixed the procedure of initializing pad_token_id (#29434 ) - Description: Add a check of the pad_token_id and eos_token_id of the model config. It seems that this is the same bug as the HuggingFace TGI bug. In addition, the source code of libs/partners/huggingface/langchain_huggingface/llms/huggingface_pipeline.py also requires similar changes. - Issue: #29431 - Dependencies: none - Twitter handle: tell14	2025-01-27 14:54:54 -05:00
Loris Alexandre	e4921239a6	community: missing mandatory parameter partition_key for AzureCosmosDBNoSqlVectorSearch (#29382 ) - Description: the `delete` function of AzureCosmosDBNoSqlVectorSearch uses `self._container.delete_item(document_id)`, which is missing the mandatory parameter `partition_key`. We use the class function `delete_document_by_id` to provide a default `partition_key`. - Issue: #29372 - Dependencies: None - Twitter handle: None Co-authored-by: Loris Alexandre <loris.alexandre@boursorama.fr>	2025-01-23 10:05:10 -05:00
Terry Tan	ec0ebb76f2	community: fix Google Scholar tool errors (#29371 ) Resolve https://github.com/langchain-ai/langchain/issues/27557	2025-01-23 10:03:01 -05:00
江同学呀	a1e62070d0	community: Fix the problem of error reporting when OCR extracts text from PDF. (#29378 ) - Description: The issue has been fixed where images could not be recognized from ```xObject[obj]["/Filter"]``` (whose value can be either a string or a list of strings) in the ```_extract_images_from_page()``` method. It also resolves the bug where vectorization by Faiss fails due to the failure of image extraction from a PDF containing only images```IndexError: list index out of range```. ![69a60f3f6bd474641b9126d74bb18f7e](https://github.com/user-attachments/assets/dc9e098d-2862-49f7-93b0-00f1056727dc) - Issue: Fix the following issues: [#15227 ](https://github.com/langchain-ai/langchain/issues/15227) [#22892 ](https://github.com/langchain-ai/langchain/issues/22892) [#26652 ](https://github.com/langchain-ai/langchain/issues/26652) [#27153 ](https://github.com/langchain-ai/langchain/issues/27153) Related issues: [#7067 ](https://github.com/langchain-ai/langchain/issues/7067) - Dependencies: None - Twitter handle: None --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-23 15:01:52 +00:00
Tim Mallezie	a13faab6b7	community: allow setting the gitlab url in the gitlab tool constructor (#29380 ) This PR exposes the GitLab URL so it can also be set in the constructor, instead of only through environment variables. This allows something like this: ``` # Create the GitLab API wrapper gitlab_api = GitLabAPIWrapper( gitlab_url=self.gitlab_url, gitlab_personal_access_token=self.gitlab_personal_access_token, gitlab_repository=self.gitlab_repository, gitlab_branch=self.gitlab_branch, gitlab_base_branch=self.gitlab_base_branch, ) ``` Previously, the URL could not be set in the constructor. Co-authored-by: Tim Mallezie <tim.mallezie@dropsolid.com>	2025-01-23 09:36:27 -05:00
Macs Dickinson	7378c955db	community: adds support for getting github releases for the configured repository (#29318 ) Description: adds support for the github tool to query github releases on the configured repository Issue: N/A Dependencies: N/A Twitter handle: @macsdickinson --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-22 15:45:52 +00:00
Bhav Sardana	d6a7aaa97d	community: Fix for Pydantic model validator of GoogleApiClient (#29346 ) - Description: Fix for Pydantic model validator for GoogleApiHandler - Issue: #29165 --------- Signed-off-by: Bhav Sardana <sardana.bhav@gmail.com>	2025-01-21 15:17:43 -05:00
Bagatur	536b44a47f	community[patch]: Release 0.3.15 (#29325 )	2025-01-21 03:10:07 +00:00
Bagatur	923e6fb321	core[patch]: 0.3.31 (#29320 )	2025-01-21 01:17:31 +00:00
Philippe PRADOS	4efc5093c1	community[minor]: Refactoring PyMuPDF parser, loader and add image blob parsers (#29063 ) * Adds BlobParsers for images. These implementations can take an image and produce one or more documents per image. This interface can be used for exposing OCR capabilities. * Update PyMuPDFParser and Loader to standardize metadata, handle images, improve table extraction etc. - Twitter handle: pprados This is one part of a larger Pull Request (PR) that is too large to be submitted all at once. This specific part focuses on preparing the update of all parsers. For more details, see [PR 28970](https://github.com/langchain-ai/langchain/pull/28970). --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-01-20 15:15:43 -05:00
CLOVA Studio 개발	7a95ffc775	community: fix some features on Naver ChatModel & embedding model 2 (#29243 ) ## Description - Respond to `NCP API Key` changes. - Fix the `ChatClovaX` `astream` function so that it raises `SSEError` when an error event occurs. - Add `token length` and `ai_filter` to ChatClovaX's `response_metadata`. - Update the documentation to reflect the NCP API Key changes. cc. @efriis @vbarda	2025-01-20 11:01:03 -05:00
Sangyun_LEE	5d64597490	docs: fix broken Appearance of langchain_community/document_loaders/recursive_url_loader API Reference (#29305 ) # PR message ## Description Fixed the broken appearance of the RecursiveUrlLoader API Reference. ### Before <p align="center"> <img width="750" alt="image" src="https://github.com/user-attachments/assets/f39df65d-b788-411d-88af-8bfa2607c00b" /> <img width="750" alt="image" src="https://github.com/user-attachments/assets/b8a92b70-4548-4b4a-965f-026faeebd0ec" /> </p> ### After <p align="center"> <img width="750" alt="image" src="https://github.com/user-attachments/assets/8ea28146-de45-42e2-b346-3004ec4dfc55" /> <img width="750" alt="image" src="https://github.com/user-attachments/assets/914c6966-4055-45d3-baeb-2d97eab06fe7" /> </p> ## Issue: N/A ## Dependencies None ## Twitter handle N/A # Add tests and docs Not applicable; this change only affects documentation. # Lint and test Ran make format, make lint, and make test to ensure no issues.	2025-01-20 10:56:59 -05:00
Mohammad Mohtashim	b5fbebb3c8	(Community): Changing the BaseURL and Model for MiniMax (#29299 ) - Description: Changed the Base Default Model and Base URL to correct versions. Plus added a more explicit exception if user provides an invalid API Key - Issue: #29278	2025-01-19 14:15:02 -05:00
ThomasSaulou	e9abe583b2	chatperplexity stream-citations in additional kwargs (#29273 ) chatperplexity stream-citations in additional kwargs --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-18 22:31:10 +00:00
TheSongg	1cd4d8d101	[langchain_community.llms.xinference]: Rewrite _stream() method and support stream() method in xinference.py (#29259 ) - [ ] PR title:[langchain_community.llms.xinference]: Rewrite _stream() method and support stream() method in xinference.py - [ ] PR message: Rewrite the _stream method so that the chain.stream() can be used to return data streams. chain = prompt \| llm chain.stream(input=user_input) - [ ] tests: from langchain_community.llms import Xinference from langchain.prompts import PromptTemplate llm = Xinference( server_url="http://0.0.0.0:9997", # replace your xinference server url model_uid={model_uid} # replace model_uid with the model UID return from launching the model stream = True ) prompt = PromptTemplate(input=['country'], template="Q: where can we visit in the capital of {country}? A:") chain = prompt \| llm chain.stream(input={'country': 'France'})	2025-01-17 20:31:59 -05:00
Luis Lopez	75663f2cae	community: Add cost per 1K tokens for fine-tuned model cached input (#29248 ) ### Description - Since the cost per 1K input tokens for a fine-tuned cached version of `gpt-4o-mini-2024-07-18` is not available in the `OpenAICallbackHandler`, it raises an error when making calls with such a model. - Add the price to the `MODEL_COST_PER_1K_TOKENS` dictionary. cc. @efriis	2025-01-16 15:19:26 -05:00
Junon	667d2a57fd	add mode arg to OBSFileLoader.load() method (#29246 ) - Description: add mode arg to OBSFileLoader.load() method - Issue: #29245 - Dependencies: no dependencies required for this change --------- Co-authored-by: Junon_Gz <junon_gz@qq.com>	2025-01-16 11:09:04 -05:00
Nadeem Sajjad	eaf2fb287f	community(pypdfloader): added page_label in metadata for pypdf loader (#29225 ) # Description ## Summary This PR adds support for handling multi-labeled page numbers in the PyPDFLoader. Some PDFs use complex page numbering systems where the actual content may begin after multiple introductory pages. The page_label field helps accurately reflect the document’s page structure, making it easier to handle such cases during document parsing. ## Motivation This feature improves document parsing accuracy by allowing users to access the actual page labels instead of relying only on the physical page numbers. This is particularly useful for documents where the first few pages have roman numerals or other non-standard page labels. ## Use Case This feature is especially useful for Retrieval-Augmented Generation (RAG) systems where users may reference page numbers when asking questions. Some PDFs have both labeled page numbers (like roman numerals for introductory sections) and index-based page numbers. For example, a user might ask: "What is mentioned on page 5?" The system can now check both: • Index-based page number (page) • Labeled page number (page_label) This dual-check helps improve retrieval accuracy. Additionally, the results can be validated with an agent or tool to ensure the retrieved pages match the user’s query contextually. ## Code Changes - Added a page_label field to the metadata of the Document class in PyPDFLoader. - Implemented support for retrieving page_label from the pdf_reader.page_labels. - Created a test case (test_pypdf_loader_with_multi_label_page_numbers) with a sample PDF containing multi-labeled pages (geotopo-komprimiert.pdf) [[Source of pdf](https://github.com/py-pdf/sample-files/blob/main/009-pdflatex-geotopo/GeoTopo-komprimiert.pdf)]. - Updated existing tests to ensure compatibility and verify page_label extraction. ## Tests Added - Added a new test case for a PDF with multi-labeled pages. - Verified both page and page_label metadata fields are correctly extracted. ## Screenshots <img width="549" alt="image" src="https://github.com/user-attachments/assets/65db9f5c-032e-4592-926f-824777c28f33" />	2025-01-15 14:18:07 -05:00
TheSongg	4867fe7ac8	[langchain_community.llms.xinference]: fix error in xinference.py (#29216 ) - [ ] PR title: [langchain_community.llms.xinference]: fix error in xinference.py - [ ] PR message: - The old code raised a ValidationError: pydantic_core._pydantic_core.ValidationError: 1 validation error for Xinference when importing Xinference from xinference.py. This issue has been resolved by adjusting its type and default value. File "/media/vdc/python/lib/python3.10/site-packages/pydantic/main.py", line 212, in __init__ validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self) pydantic_core._pydantic_core.ValidationError: 1 validation error for Xinference client Field required [type=missing, input_value={'server_url': 'http://10...t4', 'model_kwargs': {}}, input_type=dict] For further information visit https://errors.pydantic.dev/2.9/v/missing - [ ] tests: from langchain_community.llms import Xinference llm = Xinference( server_url="http://0.0.0.0:9997", # replace your xinference server url model_uid={model_uid} # replace model_uid with the model UID return from launching the model )	2025-01-15 10:11:26 -05:00
Syed Baqar Abbas	4278046329	[fix] Convert table names to list for compatibility in SQLDatabase (#29229 ) - [langchain_community.utilities.SQLDatabase] [fix] Convert table names to list for compatibility in SQLDatabase: - The issue #29227 is being fixed here - The "package" modified is community - The issue lay in this block of code: `44b41b699c/libs/community/langchain_community/utilities/sql_database.py (L72-L77)` - [langchain_community.utilities.SQLDatabase] [fix] Convert table names to list for compatibility in SQLDatabase: - Description: When the SQLDatabase is initialized, it runs `self._inspector.get_table_names(schema=schema)`, which is expected to return a list. However, with some connectors (such as snowflake) the data type returned could be another iterable. This results in a type error when concatenating the table_names to view_names. I have added explicit type casting to prevent this. - Issue: The issue #29227 is being fixed here - Dependencies: None - Twitter handle: @BaqarAbbas2001 ## Additional Information When the following method is called for a Snowflake database: `44b41b699c/libs/community/langchain_community/utilities/sql_database.py (L75)` Snowflake under the hood calls: ```python from snowflake.sqlalchemy.snowdialect import SnowflakeDialect SnowflakeDialect.get_table_names ``` This method returns a `dict_keys()` object, which cannot be concatenated with a list and results in a `TypeError` ### Relevant Library Versions - snowflake-sqlalchemy: 1.7.2 - snowflake-connector-python: 3.12.4 - sqlalchemy: 2.0.20 - langchain_community: 0.3.14	2025-01-15 10:00:03 -05:00
Jin Hyung Ahn	05554265b4	community: Fix ConfluenceLoader load() failure caused by deleted pages (#29232 ) ## Description This PR modifies the is_public_page function in ConfluenceLoader to prevent exceptions caused by deleted pages during the execution of ConfluenceLoader.process_pages(). Example scenario: Consider the following usage of ConfluenceLoader: ```python import os from langchain_community.document_loaders import ConfluenceLoader loader = ConfluenceLoader( url=os.getenv("BASE_URL"), token=os.getenv("TOKEN"), max_pages=1000, cql=f'type=page and lastmodified >= "2020-01-01 00:00"', include_restricted_content=False, ) # Raised Exception : HTTPError: Outdated version/old_draft/trashed? Cannot find content Please provide valid ContentId. documents = loader.load() ``` If a deleted page exists within the query result, the is_public_page function would previously raise an exception when calling get_all_restrictions_for_content, causing the loader.load() process to fail for all pages. By adding a pre-check for the page's "current" status, unnecessary API calls to get_all_restrictions_for_content for non-current pages are avoided. This fix ensures that such pages are skipped without affecting the rest of the loading process. ## Issue N/A (No specific issue number) ## Dependencies No new dependencies are introduced with this change. ## Twitter handle [@zenoengine](https://x.com/zenoengine)	2025-01-15 09:56:23 -05:00
Mohammad Mohtashim	21eb39dff0	[Community]: AzureOpenAIWhisperParser Authentication Fix (#29135 ) - Description: `AzureOpenAIWhisperParser` authentication fix as stated in the issue. - Issue: #29133	2025-01-15 09:44:53 -05:00
Michael Chin	d9b856abad	community: Deprecate Amazon Neptune resources in langchain-community (#29191 ) Related: https://github.com/langchain-ai/langchain-aws/pull/322 The legacy `NeptuneOpenCypherQAChain` and `NeptuneSparqlQAChain` classes are being replaced by the new LCEL format chains `create_neptune_opencypher_qa_chain` and `create_neptune_sparql_qa_chain`, respectively, in the `langchain_aws` package. This PR adds deprecation warnings to all Neptune classes and functions that have been migrated to `langchain_aws`. All relevant documentation has also been updated to replace `langchain_community` usage with the new `langchain_aws` implementations. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-14 10:23:34 -05:00
Tymon Żarski	689592f9bb	community: Fix rank-llm import paths for new 0.20.3 version (#29154 ) # PR title: "community: Fix rank-llm import paths for new 0.20.3 version" - The "community" package is being modified to handle updated import paths for the new `rank-llm` version. --- ## Description This PR updates the import paths for the `rank-llm` package to account for changes introduced in version `0.20.3`. The changes ensure compatibility with both pre- and post-revamp versions of `rank-llm`, specifically version `0.12.8`. Conditional imports are introduced based on the detected version of `rank-llm` to handle different path structures for `VicunaReranker`, `ZephyrReranker`, and `SafeOpenai`. ## Issue RankLLMRerank throws an error when used with GPT (and other models) when the rank-llm version is > 0.12.8 - #29156 ## Dependencies This change relies on the `packaging` and `pkg_resources` libraries to handle version checks. ## Twitter handle @tymzar	2025-01-13 10:22:14 -05:00
Priyansh Agrawal	c115c09b6d	community: add missing format specifier in error log in CubeSemanticLoader (#29172 ) - Description: Add a missing format specifier in an error log in `langchain_community.document_loaders.CubeSemanticLoader` - Issue: raises `TypeError: not all arguments converted during string formatting`	2025-01-13 09:32:57 -05:00
ThomasSaulou	349b5c91c2	fix chatperplexity: remove 'stream' from params in _stream method (#29173 ) quick fix chatperplexity: remove 'stream' from params in _stream method	2025-01-13 09:31:37 -05:00
LIU Yuwei	f980144e9c	community: add init for unstructured file loader (#29101 ) ## Description Add `__init__` for unstructured loader of epub/image/markdown/pdf/ppt/word to restrict the input type to `str` or `Path`. In the [signature](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.markdown.UnstructuredMarkdownLoader.html) these unstructured loaders receive `file_path: str \| List[str] \| Path \| List[Path]`, but actually they only receive `str` or `Path`. ## Issue None ## Dependencies No changes.	2025-01-13 09:26:00 -05:00
Joshua Campbell	00dcc44739	Langchain_community: Fix issue with missing backticks in arango client (#29110 ) - Description: Adds backticks to generate_schema function in the arango graph client - Issue: We experienced an issue with the generate schema function when talking to our arango database where these backticks were missing - Dependencies: none - Twitter handle: @anangelofgrace	2025-01-09 10:00:10 -05:00
LIU Yuwei	2b09f798e1	community: add init for `UnstructuredHTMLLoader` to solve pathlib paths (#29091 ) ## Description Add `__init__` for `UnstructuredHTMLLoader` to restrict the input type to `str` or `Path`, and convert `self.file_path` to `str` just like `UnstructuredXMLLoader` does. ## Issue Fix #29090 ## Dependencies No changes.	2025-01-08 10:19:27 -05:00
Jin Hyung Ahn	c8ca1cd42f	community: fix "confluence-loader" enable include_labels for documents loaded via CQL (#29089 ) ## Description This PR enables label inclusion for documents loaded via CQL in the confluence-loader. - Updated _lazy_load to pass the include_labels parameter instead of False in process_pages calls for documents loaded via CQL. - Ensured that labels can now be fetched and added to the metadata for documents queried with cql. ## Related Modification History This PR builds on the previous functionality introduced in [#28259](https://github.com/langchain-ai/langchain/pull/28259), which added support for including labels with the include_labels option. However, this functionality did not work as expected for CQL queries, and this PR fixes that issue. If the False handling was intentional due to another issue, please let me know. I have verified with our Confluence instance that this change allows labels to be correctly fetched for documents loaded via CQL. ## Issue Fixes #29088 ## Dependencies No changes. ## Twitter Handle [@zenoengine](https://x.com/zenoengine)	2025-01-08 10:16:39 -05:00
Inah Jeon	9d290abccd	partner: Update Upstage Model Names and Remove Deprecated Model (#29093 ) This PR updates model names in the upstage library to reflect the latest naming conventions and removes deprecated models. Changes: Renamed Models: - `solar-1-mini-chat` -> `solar-mini` - `solar-1-mini-embedding-query` -> `embedding-query` Removed Deprecated Models: - `layout-analysis` (replaced by `document-parse`) Reference: - https://console.upstage.ai/docs/getting-started/overview - https://github.com/langchain-ai/langchain-upstage/releases/tag/libs%2Fupstage%2Fv0.5.0	2025-01-08 10:13:22 -05:00
Philippe PRADOS	2921597c71	community[patch]: Refactoring PDF loaders: 01 prepare (#29062 ) - Refactoring PDF loaders step 1: "community: Refactoring PDF loaders to standardize approaches" - Description: Declare CloudBlobLoader in __init__.py. file_path is Union[str, PurePath] everywhere - Twitter handle: pprados This is one part of a larger Pull Request (PR) that is too large to be submitted all at once. This specific part focuses on preparing the update of all parsers. For more details, see [PR 28970](https://github.com/langchain-ai/langchain/pull/28970). @eyurtsev it's the start of a PR series.	2025-01-07 11:00:04 -05:00
Ashvin	20a715a103	community: Fix redundancy in code. (#29022 ) In my previous PR (#28953), I added an unwanted condition for validating the Azure ML Endpoint. In this PR, I have rectified the issue.	2025-01-06 12:58:16 -05:00
ccurme	3e618b16cd	community[patch]: release 0.3.14 (#29019 )	2025-01-03 15:34:24 -05:00
ccurme	0185010b88	community[patch]: additional check for prompt caching support (#29008 ) Prompt caching explicitly excludes `gpt-4o-2024-05-13`: https://platform.openai.com/docs/guides/prompt-caching Resolves https://github.com/langchain-ai/langchain/issues/28997	2025-01-03 10:14:07 -05:00
Ashvin	d26c102a5a	community: Update azureml endpoint (#28953 ) - In this PR, I have updated the AzureML Endpoint with the latest endpoint. - Description: I have changed the existing `/chat/completions` to `/models/chat/completions` in libs/community/langchain_community/llms/azureml_endpoint.py - Issue: #25702 --------- Co-authored-by: = <=>	2025-01-02 14:47:02 -05:00
Mohammad Mohtashim	0e74757b0a	(Community): `DuckDuckGoSearchAPIWrapper` backend changed from `api` to `auto` (#28961 ) - Description: `DuckDuckGoSearchAPIWrapper` default value for backend has been changed to avoid User Warning - Issue: #28957	2025-01-02 14:08:22 -05:00
minpeter	a873e0fbfb	community: update documentation and model IDs for FriendliAI provider (#28984 ) ### Description - In the example, remove `llama-2-13b-chat`, `mixtral-8x7b-instruct-v0-1`. - Fix llm friendli streaming implementation. - Update examples in documentation and remove duplicates. ### Issue N/A ### Dependencies None ### Twitter handle `@friendliai`	2025-01-02 12:15:59 -05:00
Mohammad Mohtashim	49a26c1fca	(Community): Fix Keyword argument for `AzureAIDocumentIntelligenceParser` (#28959 ) - Description: Fix the `body` keyword argument for `AzureAIDocumentIntelligenceParser` - Issue: #28948	2025-01-02 11:27:12 -05:00
ccurme	efc687a13b	community[patch]: fix instantiation for Slack tools (#28990 ) Believe the current implementation raises PydanticUserError following [this](https://github.com/pydantic/pydantic/releases/tag/v2.10.1) Pydantic release. Resolves https://github.com/langchain-ai/langchain/issues/28989	2025-01-02 16:14:17 +00:00
Bagatur	1c797ac68f	infra: speed up unit tests (#28974 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-02 04:13:08 +00:00
Andy Wermke	5940ed3952	community: Fix error handling bug in ChatDeepInfra (#28918 ) In the async ClientResponse, `response.text` is not a string property, but an asynchronous function returning a string.	2024-12-26 14:45:12 -05:00
zep.hyr	7b4d2d5d44	Community : Add cost information for missing OpenAI model (#28882 ) In the previous commit, the cached model key for this model was omitted. When using the "gpt-4o-2024-11-20" model, the token count in the callback appeared as 0, and the cost was recorded as 0. We add model and cost information so that the token count and cost can be displayed for the respective model. - The message before modification is as follows. ``` Tokens Used: 0 Prompt Tokens: 0 Prompt Tokens Cached: 0 Completion Tokens: 0 Reasoning Tokens: 0 Successful Requests: 0 Total Cost (USD): $0.0 ``` - The message after modification is as follows. ``` Tokens Used: 3783 Prompt Tokens: 3625 Prompt Tokens Cached: 2560 Completion Tokens: 158 Reasoning Tokens: 0 Successful Requests: 1 Total Cost (USD): $0.010642500000000001 ```	2024-12-26 14:28:31 -05:00
Adrián Panella	8d9907088b	community(azuresearch): allow to use any valid credential (#28873 ) Add option to use any valid credential type. Differentiates async cases needed by Azure Search. This could replace the use of a static token	2024-12-23 10:05:48 -05:00
Mohammad Mohtashim	41b6a86bbe	Community: LlamaCppEmbeddings `embed_documents` and `embed_query` (#28827 ) - Description: `embed_documents` and `embed_query` were raising the error stated in the issue. The `Llama` client returns the embeddings in a nested list, which the current implementation does not account for, and so the stated error is raised. - Issue: #28813 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-23 09:50:22 -05:00
Darien Schettler	32917a0b98	Update dataframe.py (#28871 ) community: optimize DataFrame document loader Description: Simplify the `lazy_load` method in the DataFrame document loader by combining text extraction and metadata cleanup into a single operation. This makes the code more concise while maintaining the same functionality. Issue: N/A Dependencies: None Twitter handle: N/A	2024-12-22 19:16:16 -05:00
yeounhak	f38fc89f35	community: Corrected aload func to be asynchronous from webBaseLoader (#28337 ) - Description: The aload function, contrary to its name, is not an asynchronous function, so it cannot work concurrently with other asynchronous functions. - Issue: #28336 - Test: Done - Docs: [here](`e0a95e5646/docs/docs/integrations/document_loaders/web_base.ipynb (L201)`) - Lint: All checks passed --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-20 14:42:52 -05:00
Jacob Mansdorfer	6d81137325	community: adding langchain-predictionguard partner package documentation (#28832 ) - [x] PR title: "community: adding langchain-predictionguard partner package documentation" - [x] PR message: - Description: This PR adds documentation for the langchain-predictionguard package to the main langchain repo, along with deprecating the current Prediction Guard LLMs package. The LLMs package was previously broken, so I also updated it one final time to allow it to continue working from this point onward. This enables users to chat with LLMs through the Prediction Guard ecosystem. - Package Links: - [PyPI](https://pypi.org/project/langchain-predictionguard/) - [Github Repo](https://www.github.com/predictionguard/langchain-predictionguard) - Issue: None - Dependencies: None - Twitter handle: [@predictionguard](https://x.com/predictionguard) - [x] Add tests and docs: All docs have been added for the partner package, and the current LLMs package test was updated to reflect changes. - [x] Lint and test: Linting tests are all passing. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-20 10:51:44 -05:00
Emmanuel Leroy	c8db5a19ce	langchain_community.chat_models.oci_generative_ai: Fix a bug when using optional parameters in tools (#28829 ) When using tools with optional parameters, the parameter `type` is no longer available since the langchain update to 0.3 (because of the pydantic upgrade?) and there is now an `anyOf` field instead. This results in the `type` being `None` in the chat request for the tool parameter, and the LLM call fails with the error: ``` oci.exceptions.ServiceError: {'target_service': 'generative_ai_inference', 'status': 400, 'code': '400', 'opc-request-id': '...', 'message': 'Parameter definition must have a type.', 'operation_name': 'chat' ... } ``` Example code that fails: ``` from langchain_community.chat_models.oci_generative_ai import ChatOCIGenAI from langchain_core.tools import tool from typing import Optional llm = ChatOCIGenAI( model_id="cohere.command-r-plus", service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com", compartment_id="ocid1.compartment.oc1...", auth_profile="your_profile", auth_type="API_KEY", model_kwargs={"temperature": 0, "max_tokens": 3000}, ) @tool def test(example: Optional[str] = None): """This is the tool to use to test things Args: example: example variable, defaults to None """ return "this is a test" llm_with_tools = llm.bind_tools([test]) result = llm_with_tools.invoke("can you make a test for g") ``` This PR sets the param type to `any` in that case, and fixes the problem. Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-19 22:17:34 +00:00
Anusha Karkhanis	26bdf40072	Langchain_Community: SQL LanguageParser (#28430 ) ## Description (This PR has contributions from @khushiDesai, @ashvini8, and @ssumaiyaahmed). This PR addresses Issue #11229 which addresses the need for SQL support in document parsing. This is integrated into the generic TreeSitter parsing library, allowing LangChain users to easily load codebases in SQL into smaller, manageable "documents." This pull request adds a new ```SQLSegmenter``` class, which provides the SQL integration. ## Issue Issue #11229: Add support for a variety of languages to LanguageParser ## Testing We created a file ```test_sql.py``` with several tests to ensure the ```SQLSegmenter``` is functional. Below are the tests we added: - ```def test_is_valid```: Checks SQL validity. - ```def test_extract_functions_classes```: Extracts individual SQL statements. - ```def test_simplify_code```: Simplifies SQL code with comments. --------- Co-authored-by: Syeda Sumaiya Ahmed <114104419+ssumaiyaahmed@users.noreply.github.com> Co-authored-by: ashvini hunagund <97271381+ashvini8@users.noreply.github.com> Co-authored-by: Khushi Desai <khushi.desai@advantawitty.com> Co-authored-by: Khushi Desai <59741309+khushiDesai@users.noreply.github.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-19 20:30:57 +00:00
Erick Friis	6526db4871	community: bump core (#28819 )	2024-12-19 06:41:53 -08:00
Vignesh A	4c9acdfbf1	Community : Add OpenAI prompt caching and reasoning tokens tracking (#27135 ) Added Token tracking for OpenAI's prompt caching and reasoning tokens Costs updated from https://openai.com/api/pricing/ usage example ```python from langchain_community.callbacks import get_openai_callback from langchain_openai import ChatOpenAI llm = ChatOpenAI(model_name="o1-mini",temperature=1) with get_openai_callback() as cb: response = llm.invoke("hi "*1500) print(cb) ``` Output ``` Tokens Used: 1720 Prompt Tokens: 1508 Prompt Tokens Cached: 1408 Completion Tokens: 212 Reasoning Tokens: 192 Successful Requests: 1 Total Cost (USD): $0.0049559999999999995 ``` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-19 09:31:13 -05:00
ScriptShi	97f1e1d39f	community: tablestore vector store check the dimension of the embedding when writing it to store. (#28812 ) Added some restrictions to a vectorstore I released in the community before.	2024-12-19 09:30:43 -05:00
Lu Peng	50afa7c4e7	community: add new parameter default_headers (#28700 ) Thank you for contributing to LangChain! - [x] PR title: "package: description" - "community: 1. add new parameter `default_headers` for oci model deployments and oci chat model deployments. 2. updated k parameter in OCIModelDeploymentLLM class." - [x] PR message: - Description: 1. add new parameters `default_headers` for oci model deployments and oci chat model deployments. 2. updated k parameter in OCIModelDeploymentLLM class. - [x] Add tests and docs: 1. unit tests 2. notebook --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-18 22:33:23 +00:00
Erick Friis	4dc28b43ac	community: release 0.3.13 (#28798 )	2024-12-18 21:58:46 +00:00
Mohammad Mohtashim	7c8f977695	Community: Fix `with_structured_output` for `ChatSambaNovaCloud` (#28796 ) - Description: The `kwargs` was being checked as None object which was causing the rest of code in `with_structured_output` not getting executed. The checking part has been fixed in this PR. - Issue: #28776	2024-12-18 14:35:06 -05:00
V.Prasanna kumar	684b146b18	Fixed adding float values into DynamoDB (#26562 ) - [x] PR title: Add float Message into Dynamo DB - community - Example: "community: Chat Message History - [x] PR message: - Description: pushing float values into DynamoDB raises an error; solved by converting them to str type - Issue: Float values are not getting pushed - Twitter handle: VpkPrasanna Have added a utility function for str conversion; let me know where to place it, happy to do a commit. This PR is from a discussion of #26543 @hwchase17 @baskaryan @efriis --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-18 13:45:00 -05:00
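A rough sketch of the kind of conversion the commit above describes. The name `floats_to_str` is hypothetical and the actual utility in the PR may differ; the idea is only that every Python `float` in a nested message payload becomes a string before the record is written.

```python
from typing import Any


def floats_to_str(value: Any) -> Any:
    """Recursively replace float values with their string form.

    Hypothetical helper; dicts and lists are walked, everything else
    is returned unchanged.
    """
    if isinstance(value, float):
        return str(value)
    if isinstance(value, dict):
        return {key: floats_to_str(item) for key, item in value.items()}
    if isinstance(value, list):
        return [floats_to_str(item) for item in value]
    return value


# Illustrative message payload (made-up fields).
message = {"type": "ai", "data": {"content": "hi", "score": 0.25, "tags": [1, 2.5]}}
converted = floats_to_str(message)
print(converted)
```

Integers and strings pass through untouched, so only the values that the DynamoDB write would reject are changed.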
Martin Triska	e6b41d081d	community: DocumentLoaderAsParser wrapper (#27749 ) ## Description This pull request introduces the `DocumentLoaderAsParser` class, which acts as an adapter to transform document loaders into parsers within the LangChain framework. The class enables document loaders that accept a `file_path` parameter to be utilized as blob parsers. This is particularly useful for integrating various document loading capabilities seamlessly into the LangChain ecosystem. When merged together with PR https://github.com/langchain-ai/langchain/pull/27716, it opens options for `SharePointLoader` / `OneDriveLoader` to process any filetype that has a document loader. ### Features - Flexible Parsing: The `DocumentLoaderAsParser` class can adapt any document loader that meets the criteria of accepting a `file_path` argument, allowing for lazy parsing of documents. - Compatibility: The class has been designed to work with various document loaders, making it versatile for different use cases. ### Usage Example To use the `DocumentLoaderAsParser`, you would initialize it with a suitable document loader class and any required parameters. Here’s an example of how to do this with the `UnstructuredExcelLoader`: ```python from langchain_community.document_loaders.blob_loaders import Blob from langchain_community.document_loaders.parsers.documentloader_adapter import DocumentLoaderAsParser from langchain_community.document_loaders.excel import UnstructuredExcelLoader # Initialize the parser adapter with UnstructuredExcelLoader xlsx_parser = DocumentLoaderAsParser(UnstructuredExcelLoader, mode="paged") # Use parser, for ex. pass it to MimeTypeBasedParser MimeTypeBasedParser( handlers={ "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": xlsx_parser } ) ``` - Dependencies: None - Twitter handle: @martintriska1 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-18 12:47:08 -05:00
Mohammad Mohtashim	d49df4871d	[Community]: Image Extraction Fixed for `PDFPlumberParser` (#28491 ) - Description: One-Bit Images was raising error which has been fixed in this PR for `PDFPlumberParser` - Issue: #28480 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-18 11:45:48 -05:00
binhnd102	f723a8456e	Fixes: community: fix LanceDB return no metadata (#27024 ) - [ x ] Fix when lancedb return table without metadata column - Description: Check the table schema, if not has metadata column, init the Document with metadata argument equal to empty dict - Issue: https://github.com/langchain-ai/langchain/issues/27005 - [ x ] Add tests and docs --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-18 15:21:28 +00:00
Rave Harpaz	986b752fc8	Add OCI Generative AI new model and structured output support (#28754 ) - [X] PR title: community: Add new model and structured output support - [X] PR message: - Description: add support for meta llama 3.2 image handling, and JSON mode for structured output - Issue: NA - Dependencies: NA - Twitter handle: NA - [x] Add tests and docs: 1. we have updated our unit tests, 2. no changes required for documentation. - [x] Lint and test: make format, make lint and make test we run successfully --------- Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-18 09:50:25 -05:00
David Pryce-Compson	ef24220d3f	community: adding haiku 3.5 and opus callbacks (#28783 ) Description: Adding new AWS Bedrock models and their respective costs to match https://aws.amazon.com/bedrock/pricing/ for the Bedrock callback Issue: Missing models for those that wish to try them out Dependencies: Nothing added Twitter handle: @David_Pryce and / or @JamfSoftware	2024-12-18 09:45:10 -05:00
Yudai Kotani	05a44797ee	langchain_community: Add default None values to DocumentAttributeValue class properties (#28785 ) Description: This PR addresses an issue where the DocumentAttributeValue class properties did not have default values of None. By explicitly setting the Optional attributes (DateValue, LongValue, StringListValue, and StringValue) to default to None, this change ensures the class functions as expected when no value is provided for these attributes. Changes Made: Added default None values to the following properties of the DocumentAttributeValue class: DateValue LongValue StringListValue StringValue Removed the invalid argument extra="allow" from the BaseModel inheritance. Dependencies: None. Twitter handle (optional): @__korikori1021 Checklist - [x] Verified that KendraRetriever works as expected after the changes. Co-authored-by: y1u0d2a1i <y.kotani@raksul.com>	2024-12-18 09:43:04 -05:00
Dong Shin	0b1359801e	community: add trust_env at web_base_loader (#28514 ) - Description: I am working to address a similar issue to the one mentioned in https://github.com/langchain-ai/langchain/pull/19499. Specifically, there is a problem with the Webbase loader used in open-webui, where it fails to load the proxy configuration. This PR aims to resolve that issue. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-17 21:18:16 -05:00
Frank Dai	e81433497b	community: support Confluence cookies (#28760 ) Description: Some confluence instances don't support personal access token, then cookie is a convenient way to authenticate. This PR adds support for Confluence cookies. Twitter handle: soulmachine	2024-12-17 12:16:36 -05:00
German Martin	3a1d05394d	community: Apache AGE wrapper. Ensure Node Uniqueness by ID. (#28759 ) Description: The Apache AGE graph integration incorrectly handled node merging, allowing duplicate nodes with different IDs but the same type and other properties. Unlike [Neo4j](`cdf6202156/libs/community/langchain_community/graphs/neo4j_graph.py (L47)`), [Memgraph](`cdf6202156/libs/community/langchain_community/graphs/memgraph_graph.py (L50)`), [Kuzu](`cdf6202156/libs/community/langchain_community/graphs/kuzu_graph.py (L253)`), and [Gremlin](`cdf6202156/libs/community/langchain_community/graphs/gremlin_graph.py (L165)`), it did not use the node ID as the primary identifier for merging. This inconsistency caused data integrity issues and unexpected behavior when users expected updates to specific nodes by ID. Solution: This PR modifies the `node_insert_query` to `MERGE` nodes based on label and ID only and updates properties with `SET`, aligning the behavior with other graph database integrations. The `_format_properties` method was also modified to handle id overrides. Impact: This fix ensures data integrity by preventing duplicate nodes, and provides a consistent behavior across graph database integrations.	2024-12-17 09:21:59 -05:00
gsa9989	cdf6202156	cosmosdbnosql: Added Cosmos DB NoSQL Semantic Cache Integration with tests and jupyter notebook (#24424 ) * Added Cosmos DB NoSQL Semantic Cache Integration with tests and jupyter notebook --------- Co-authored-by: Aayush Kataria <aayushkataria3011@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 21:57:05 -05:00
Brian Burgin	27a9056725	community: Fix ChatLiteLLMRouter runtime issues (#28163 ) Description: Fix ChatLiteLLMRouter ctor validation and model_name parameter Issue: #19356, #27455, #28077 Twitter handle: @bburgin_0	2024-12-16 18:17:39 -05:00
Erick Friis	d4b5e7ef22	community: recommend RedisVectorStore over Redis (#28749 )	2024-12-16 21:08:30 +00:00
Hiros	8f5e72de05	community: Correctly handle multi-element rich text (#25762 ) Description: - Add _concatenate_rich_text method to combine all elements in rich text arrays - Update load_page method to use _concatenate_rich_text for rich text properties - Ensure all text content is captured, including inline code and formatted text - Add unit tests to verify correct handling of multi-element rich text This fix prevents truncation of content after backticks or other formatting elements. Issue: Using Notion DB Loader, the text for `richtext` and `title` is truncated after 1st element was loaded as Notion Loader only read the first element. Dependencies: any dependencies required for this change None. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 20:20:27 +00:00
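The truncation described in the commit above, and the concatenation that avoids it, can be illustrated with a small sketch. It assumes each rich-text element is a dict with a `plain_text` key; `concatenate_rich_text` is a stand-in name and not necessarily identical to the loader's `_concatenate_rich_text` method.

```python
from typing import Any, Dict, List


def concatenate_rich_text(rich_text: List[Dict[str, Any]]) -> str:
    """Join the plain text of every element in a rich-text array.

    Stand-in for the method named in the commit; elements without a
    "plain_text" key contribute nothing.
    """
    return "".join(item.get("plain_text", "") for item in rich_text)


# A title split into three elements because the middle one is inline code.
title = [
    {"type": "text", "plain_text": "Run "},
    {"type": "text", "plain_text": "make test", "annotations": {"code": True}},
    {"type": "text", "plain_text": " first"},
]

# Reading only the first element loses everything after the formatting change.
first_only = title[0]["plain_text"] if title else ""
full_text = concatenate_rich_text(title)

print(repr(first_only))
print(repr(full_text))
```

Joining with an empty string keeps whatever spacing each element already carries, so formatted fragments are stitched back together without extra separators.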
Tari Yekorogha	d262d41cc0	community: added FalkorDB vector store support i.e implementation, test, docs an… (#26245 ) Description: Added support for FalkorDB Vector Store, including its implementation, unit tests, documentation, and an example notebook. The FalkorDB integration allows users to efficiently manage and query embeddings in a vector database, with relevance scoring and maximal marginal relevance search. The following components were implemented: - Core implementation for FalkorDBVector store. - Unit tests ensuring proper functionality and edge case coverage. - Example notebook demonstrating an end-to-end setup, search, and retrieval using FalkorDB. Twitter handle: @tariyekorogha --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 19:37:55 +00:00
Aaron Pham	12fced13f4	chore(community): update to OpenLLM 0.6 (#24609 ) Update to OpenLLM 0.6, which we decides to make use of OpenLLM's OpenAI-compatible endpoint. Thus, OpenLLM will now just become a thin wrapper around OpenAI wrapper. Signed-off-by: Aaron Pham <contact@aarnphm.xyz> --------- Signed-off-by: Aaron Pham <contact@aarnphm.xyz> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-16 14:30:07 -05:00
Lvlvko	5c17a4ace9	community: support Hunyuan Embedding (#23160 ) ## description - I refactor `Chathunyuan` using tencentcloud sdk because I found the original one can't work in my application - I add `HunyuanEmbeddings` using tencentcloud sdk - Both of them are extend the basic class of langchain. I have fully tested them in my application ## Dependencies - tencentcloud-sdk-python --------- Co-authored-by: centonhuang <centonhuang@tencent.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 19:27:19 +00:00
Harrison Chase	de7996c2ca	core: add kwargs support to VectorStore (#25934 ) has been missing the passthrough until now --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 18:57:57 +00:00
Lorenzo	b79a1156ed	community: correct return type of get_files_from_directory in github tool (#27885 ) ### About: - Description: the _get_files_from_directory_ method return a string, but it's used in other methods that expect a List[str] - Issue: None - Dependencies: None This pull request import a new method _list_files_ with the old logic of _get_files_from_directory_, but it return a List[str] at the end. The behavior of _ get_files_from_directory_ is not changed.	2024-12-16 10:30:33 -08:00
Sheepsta300	580a8d53f9	community: Add configurable `VisualFeatures` to the `AzureAiServicesImageAnalysisTool` (#27444 ) Thank you for contributing to LangChain! - [ ] PR title: community: Add configurable `VisualFeatures` to the `AzureAiServicesImageAnalysisTool` - [ ] PR message: - Description: The `AzureAiServicesImageAnalysisTool` is a good service and utilises the Azure AI Vision package under the hood. However, since the creation of this tool, new `VisualFeatures` have been added to allow the user to request other image specific information to be returned. Currently, the tool offers neither configuration of which features should be return nor does it offer any newer feature types. The aim of this PR is to address this and expose more of the Azure Service in this integration. - Dependencies: no new dependencies in the main class file, azure.ai.vision.imageanalysis added to extra test dependencies file. - [ ] Add tests and docs: If you're adding a new integration, please include 1. Although no tests exist for already implemented Azure Service tools, I've created 3 unit tests for this class that test initialisation and credentials, local file analysis and a test for the new changes/ features option. - [ ] Lint and test: All linting has passed. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 18:30:04 +00:00
Ana	ebab2ea81b	Fix Azure National Cloud authentication using token (RBAC) (Generated by Ana - AI SDE) (#25843 ) This pull request addresses the issue with authenticating Azure National Cloud using token (RBAC) in the AzureSearch vectorstore implementation. ## Changes - Modified the `_get_search_client` method in `azuresearch.py` to pass `additional_search_client_options` to the `SearchIndexClient` instance. ## Implementation Details The patch updates the `SearchIndexClient` initialization to include the `additional_search_client_options` parameter: ```python index_client: SearchIndexClient = SearchIndexClient( endpoint=endpoint, credential=credential, user_agent=user_agent, **additional_search_client_options ) ``` This change allows the `audience` parameter to be correctly passed when using Azure National Cloud, fixing the authentication issues with GovCloud & RBAC. This patch was generated by [Ana - AI SDE](https://openana.ai/), an AI-powered software development assistant. This is a fix for [Issue 25823](https://github.com/langchain-ai/langchain/issues/25823) --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-16 18:22:24 +00:00
chenzimin	169d419581	community: Remove all other keys in ChatLiteLLM and add api_key (#28097 ) Thank you for contributing to LangChain! - PR title: "community: Remove all other keys in ChatLiteLLM and add api_key" - PR message: Currently, no api_key are passed to LiteLLM, and LiteLLM only takes on api_key parameter. Therefore I removed all current `*_api_key` attributes (They are not used), and added `api_key` that is passed to ChatLiteLLM. - Should fix issue #27826 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 17:54:29 +00:00
German Martin	d5d18c62b3	community: Apache AGE wrapper additional edge cases. (#28151 ) Description: Current AGEGraph() implementation does some custom wrapping for graph queries. The method here is _wrap_query() as it parse the field from the original query to add some SQL context to it. This improves the current parsing logic to cover additional edge cases that are added to the test coverage, basically if any Node property name or value has the "return" literal in it will break the graph / SQL query. We discovered this while dealing with real world datasets, is not an uncommon scenario and I think it needs to be covered.	2024-12-16 11:28:01 -05:00
Rock2z	768e4a7fd4	[community][fix] Compatibility support to bump up wikibase-rest-api-client version (#27316 ) Description: This PR addresses the `TypeError: sequence item 0: expected str instance, FluentValue found` error when invoking `WikidataQueryRun`. The root cause was an incompatible version of the `wikibase-rest-api-client`, which caused the tool to fail when handling `FluentValue` objects instead of strings. The current implementation only supports `wikibase-rest-api-client<0.2`, but the latest version is `0.2.1`, where the current implementation breaks. Additionally, the error message advises users to install the latest version: [code reference](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/utilities/wikidata.py#L125C25-L125C32). Therefore, this PR updates the tool to support the latest version of `wikibase-rest-api-client`. Key changes: - Updated the handling of `FluentValue` objects to ensure compatibility with the latest `wikibase-rest-api-client`. - Removed the restriction to `wikibase-rest-api-client<0.2` and updated to support the latest version (`0.2.1`). Issue: Fixes [#24093](https://github.com/langchain-ai/langchain/issues/24093) – `TypeError: sequence item 0: expected str instance, FluentValue found`. Dependencies: - Upgraded `wikibase-rest-api-client` to the latest version to resolve the issue. --------- Co-authored-by: peiwen_zhang <peiwen_zhang@email.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 16:22:18 +00:00
André Quintino	a26c786bc5	community: refactor opensearch query constructor to use wildcard instead of match in the contain comparator (#26653 ) - Description: Changed the comparator to use a wildcard query instead of match. This modification allows for partial text matching on analyzed fields, which improves the flexibility of the search by performing full-text searches that aren't limited to exact matches. - Issue: The previous implementation used a match query, which performs exact matches on analyzed fields. This approach limited the search capabilities by requiring the query terms to align with the indexed text. The modification to use a wildcard query instead addresses this limitation. The wildcard query allows for partial text matching, which means the search can return results even if only a portion of the term matches the text. This makes the search more flexible and suitable for use cases where exact matches aren't necessary or expected, enabling broader full-text searches across analyzed fields. In short, the problem was that match queries were too restrictive, and the change to wildcard queries enhances the ability to perform partial matches. - Dependencies: none - Twitter handle: @Andre_Q_Pereira --------- Co-authored-by: André Quintino <andre.quintino@tui.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 11:16:34 -05:00
Davi Schumacher	0f9b4bf244	community[patch]: update dynamodb chat history to update instead of overwrite (#22397 ) Description: The current implementation of `DynamoDBChatMessageHistory` updates the `History` attribute for a given chat history record by first extracting the existing contents into memory, appending the new message, and then using the `put_item` method to put the record back. This has the effect of overwriting any additional attributes someone may want to include in the record, like chat session metadata. This PR suggests changing from using `put_item` to using `update_item` instead which will keep any other attributes in the record untouched. The change is backward compatible since 1. `update_item` is an "upsert" operation, creating the record if it doesn't already exist, otherwise updating it 2. It only touches the db insert call and passes the exact same information. The rest of the class is left untouched Dependencies: None Tests and docs: No unit tests currently exist for the `DynamoDBChatMessageHistory` class. This PR adds the file `libs/community/tests/unit_tests/chat_message_histories/test_dynamodb_chat_message_history.py` to test the `add_message` and `clear` methods. I wanted to use the moto library to mock DynamoDB calls but I could not get poetry to resolve it so I mocked those calls myself in the test. Therefore, no test dependencies were added. The change was tested on a test DynamoDB table as well. The first three images below show the current behavior. First a message is added to chat history, then a value is inserted in the record in some other attribute, and finally another message is added to the record, destroying the other attribute. 
![using_put_1_first_message](https://github.com/langchain-ai/langchain/assets/29493541/426acd62-fe29-42f4-b75f-863fb8b3fb21) ![using_put_2_add_attribute](https://github.com/langchain-ai/langchain/assets/29493541/f8a1c864-7114-4fe3-b487-d6f9252f8f92) ![using_put_3_second_message](https://github.com/langchain-ai/langchain/assets/29493541/8b691e08-755e-4877-8969-0e9769e5d28a) The next three images show the new behavior. Once again a value is added to an attribute other than the History attribute, but now when the followup message is added it does not destroy that other attribute. The History attribute itself is unaffected by this change. ![using_update_1_first_message](https://github.com/langchain-ai/langchain/assets/29493541/3e0d76ed-637e-41cd-82c7-01a86c468634) ![using_update_2_add_attribute](https://github.com/langchain-ai/langchain/assets/29493541/52585f9b-71a2-43f0-9dfc-9935aa59c729) ![using_update_3_second_message](https://github.com/langchain-ai/langchain/assets/29493541/f94c8147-2d6f-407a-9a0f-86b94341abff) The doc located at `docs/docs/integrations/memory/aws_dynamodb.ipynb` required no changes and was tested as well.	2024-12-16 10:38:00 -05:00
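The difference between the two write strategies in the commit above can be shown with an in-memory stand-in for the table. This is only a simulation under stated assumptions: no boto3 calls are made, a plain dict plays the role of the DynamoDB table, and `put_history`, `update_history` and `demo` are hypothetical names.

```python
from typing import Any, Callable, Dict, List

Table = Dict[str, Dict[str, Any]]


def put_history(table: Table, session_id: str, history: List[str]) -> None:
    """Put-style write: replaces the whole record for the key."""
    table[session_id] = {"SessionId": session_id, "History": history}


def update_history(table: Table, session_id: str, history: List[str]) -> None:
    """Update-style write (upsert): creates the record if missing and
    touches only the History attribute otherwise."""
    record = table.setdefault(session_id, {"SessionId": session_id})
    record["History"] = history


def demo(write: Callable[[Table, str, List[str]], None]) -> Dict[str, Any]:
    """Replay the three steps from the commit message with one strategy."""
    table: Table = {}
    write(table, "s1", ["hello"])                  # first message
    table["s1"]["Owner"] = "alice"                 # extra attribute added elsewhere
    write(table, "s1", ["hello", "how are you?"])  # follow-up message
    return table["s1"]


print(demo(put_history))
print(demo(update_history))
```

In the put-style run the `Owner` attribute disappears after the second message, while the update-style run keeps it and stores the same `History` value.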
Christophe Bornet	6ddd5dbb1e	community: Add FewShotSQLTool (#28232 ) The `FewShotSQLTool` gets some SQL query examples from a `BaseExampleSelector` for a given question. This is useful to provide [few-shot examples](https://python.langchain.com/docs/how_to/sql_prompting/#few-shot-examples) capability to an SQL agent. Example usage: ```python from langchain.agents.agent_toolkits.sql.prompt import SQL_PREFIX embeddings = OpenAIEmbeddings() example_selector = SemanticSimilarityExampleSelector.from_examples( examples, embeddings, AstraDB, k=5, input_keys=["input"], collection_name="lc_few_shots", token=ASTRA_DB_APPLICATION_TOKEN, api_endpoint=ASTRA_DB_API_ENDPOINT, ) few_shot_sql_tool = FewShotSQLTool( example_selector=example_selector, description="Input to this tool is the input question, output is a few SQL query examples related to the input question. Always use this tool before checking the query with sql_db_query_checker!" ) agent = create_sql_agent( llm=llm, db=db, prefix=SQL_PREFIX + "\nYou MUST get some example queries before creating the query.", extra_tools=[few_shot_sql_tool] ) result = agent.invoke({"input": "How many artists are there?"}) ``` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 15:37:21 +00:00
Mohammad Mohtashim	8d746086ab	Added `bind_tools` support for `ChatMLX` along with small fix in `_stream` (#28743 ) - Description: Added Support for `bind_tool` as requested in the issue. Plus two issue in `_stream` were fixed: - Corrected the Positional Argument Passing for `generate_step` - Accountability if `token` returned by `generate_step` is integer. - Issue: #28692	2024-12-16 09:52:49 -05:00
Jorge Piedrahita Ortiz	558b65ea32	community: SamabaStudio Tool Calling and Structured Output (#28025 ) Description: Add tool calling and structured output support for SambaStudio chat models, docs included --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 06:15:19 +00:00
clairebehue	fb44e74ca4	community: fix AzureSearch Oauth with azure_ad_access_token (#26995 ) Description: AzureSearch vector store: create a wrapper class on `azure.core.credentials.TokenCredential` (which is not-instantiable) to fix Oauth usage with `azure_ad_access_token` argument Issue: [the issue it fixes](https://github.com/langchain-ai/langchain/issues/26216) Dependencies: None - [x] Lint and test --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 05:56:45 +00:00
SirSmokeAlot	29305cd948	community: O365Toolkit - send_event - fixed timezone error (#25876 ) Description: Fixed formatting start and end time Issue: The old formatting resulted everytime in an timezone error Dependencies: / Twitter handle: / --------- Co-authored-by: Yannick Opitz <yannick.opitz@gob.de> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 05:32:28 +00:00
Aayush Kataria	d417e4b372	Community: Azure CosmosDB No Sql Vector Store: Full Text and Hybrid Search Support (#28716 ) - Added [full text](https://learn.microsoft.com/en-us/azure/cosmos-db/gen-ai/full-text-search) and [hybrid search](https://learn.microsoft.com/en-us/azure/cosmos-db/gen-ai/hybrid-search) support for Azure CosmosDB NoSql Vector Store - Added a new enum called CosmosDBQueryType which supports the following values: - VECTOR = "vector" - FULL_TEXT_SEARCH = "full_text_search" - FULL_TEXT_RANK = "full_text_rank" - HYBRID = "hybrid" - User now needs to provide this query_type to the similarity_search method for the vectorStore to make the correct query api call. - Added a couple of workarounds, as the FULL_TEXT_RANK and HYBRID query functions don't support parameterized queries right now. I have added TODOs in place, and will remove these workarounds by end of January. - Added necessary test cases and updated the - [x] Add tests and docs - [x] Lint and test --------- Co-authored-by: Erick Friis <erickfriis@gmail.com>	2024-12-15 13:26:32 -08:00
Mohammad Mohtashim	4c1871d9a8	community: Passing the `model_kwargs` correctly while maintaing backward compatability (#28439 ) - Description: `Model_Kwargs` was not being passed correctly to `sentence_transformers.SentenceTransformer` which has been corrected while maintaing backward compatability - Issue: #28436 --------- Co-authored-by: MoosaTae <sadhis.tae@gmail.com> Co-authored-by: Sadit Wongprayon <101176694+MoosaTae@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-15 20:34:29 +00:00
nhols	a3851cb3bc	community: FAISS vectorstore - consistent Document id field (#28728 ) make sure id field of Documents in `FAISS` docstore have the same id as values in `index_to_docstore_id`, implement `get_by_ids` method	2024-12-15 12:23:49 -08:00
Bagatur	a0534ae62a	community[patch]: Release 0.3.12 (#28725 )	2024-12-14 22:13:20 +00:00
Nawaf Alharbi	decd77c515	community: fix an issue with deepinfra integration (#28715 ) - [x] PR title: langchain: add URL parameter to ChatDeepInfra class - [x] PR message: add URL parameter to ChatDeepInfra class - Description: This PR introduces a url parameter to the ChatDeepInfra class in LangChain, allowing users to specify a custom URL. Previously, the URL for the DeepInfra API was hardcoded to "https://stage.api.deepinfra.com/v1/openai/chat/completions", which caused issues when the staging endpoint was not functional. The _url method was updated to return the value from the url parameter, enabling greater flexibility and addressing the problem. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-14 02:15:29 +00:00
Ben Chambers	008efada2c	[community]: Render documents to graphviz (#24830 ) - Description: Adds a helper that renders documents with the GraphVectorStore metadata fields to Graphviz for visualization. This is helpful for understanding and debugging. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-14 02:02:09 +00:00
Erick Friis	288f204758	docs, community: aerospike docs update (#28717 ) Co-authored-by: Jesse Schumacher <jschumacher@aerospike.com> Co-authored-by: Jesse S <jschmidt@aerospike.com> Co-authored-by: dylan <dwelch@aerospike.com>	2024-12-14 00:27:37 +00:00
Vimpas	337fed80a5	community: 🐛 PDF Filter Type Error (#27154 ) PR title: "community: fix PDF Filter Type Error" - Description: fix PDF Filter Type Error - Issue: #27153 - Dependencies: no - [x] Lint and test --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-13 23:30:29 +00:00


1918 Commits