langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-07-12 11:21:37 +00:00

Author	SHA1	Message	Date
Bae-ChangHyun	d8510270ee	community: add 'extract' mode to FireCrawlLoader for structured data extraction (#30242 ) Description: Added an 'extract' mode to FireCrawlLoader that enables structured data extraction from web pages. This feature allows users to extract structured data from a single URL or an entire website using Large Language Models (LLMs). More parameters and usage details are in the [firecrawl docs](https://docs.firecrawl.dev/features/extract-beta). Currently only one URL can be extracted at a time (this depends on firecrawl's extract method). Dependencies: No new dependencies required. Uses existing FireCrawl API capabilities. --------- Co-authored-by: chbae <chbae@gcsc.co.kr> Co-authored-by: ccurme <chester.curme@gmail.com>	2025-03-17 15:15:57 +00:00
Priyansh Agrawal	f54f14b747	community: cube document loader - do not load non-public dimensions and measures (#30286 ) - Description: Do not load non-public dimensions and measures (public: false) with Cube semantic loader - Issue: Currently, non-public dimensions and measures are loaded by the Cube document loader which leads to downstream applications using these which is not allowed by Cube.	2025-03-14 15:07:56 -04:00
homeffjy	2c99f12062	community[patch]: fix bilibili loader handling of multi-page content (#30283 ) Previously the loader would only extract subtitles from the first page of multi-page videos.	2025-03-14 14:53:03 -04:00
ccurme	a7ab5e8372	community[patch]: ChatPerplexity: track usage metadata (#30175 )	2025-03-07 23:25:05 +00:00
Hugh Gao	9b7b8e4a1a	community: make DashScope models support Partial Mode for text continuation. (#30108 ) ## Description make DashScope models support Partial Mode for text continuation. For text continuation in ChatTongYi, it supports text continuation with a prefix by adding a "partial" argument in AIMessage. The document is [Partial Mode ](https://help.aliyun.com/zh/model-studio/user-guide/partial-mode?spm=a2c4g.11186623.help-menu-2400256.d_1_0_0_8.211e5b77KMH5Pn&scm=20140722.H_2862210._.OR_help-T_cn~zh-V_1). The API example is: ```py import os import dashscope messages = [{ "role": "user", "content": "请对“春天来了，大地”这句话进行续写，来表达春天的美好和作者的喜悦之情" }, { "role": "assistant", "content": "春天来了，大地", "partial": True }] response = dashscope.Generation.call( api_key=os.getenv("DASHSCOPE_API_KEY"), model='qwen-plus', messages=messages, result_format='message', ) print(response.output.choices[0].message.content) ``` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-03-05 16:22:14 +00:00
Manthan Surkar	1ee8aceaee	community: fix Jira API wrapper failing initialization with cloud param (#30117 ) ### Description Converts the boolean `jira_cloud` parameter in the Jira API Wrapper to a string before initializing the Jira Client. Also adds tests for the same. ### Issue [Jira API Wrapper Bug](`8abb65e138/libs/community/langchain_community/utilities/jira.py (L47)`) ```python jira_cloud_str = get_from_dict_or_env(values, "jira_cloud", "JIRA_CLOUD") jira_cloud = jira_cloud_str.lower() == "true" ``` The above code has a bug where the value of `"jira_cloud"` is a boolean. If it is passed, calling `.lower()` on a boolean raises an error. Additionally, `False` cannot be passed explicitly since `get_from_dict_or_env` falls back to environment variables. Relevant code in `langchain_core`: [Source](https://github.com/thesmallstar/langchain/blob/master/.venv/lib/python3.13/site-packages/langchain_core/utils/env.py#L46) ```python if isinstance(key, str) and key in data and data[key]: # Here, data[key] is False ``` This PR fixes both issues. ### Twitter Handle [Manthan Surkar](https://x.com/manthan_surkar)	2025-03-05 10:49:25 -05:00
talos	9cd20080fc	community: Update SQLiteVec table trigger (#29914 ) Issue: This trigger can only be used by the first table created. Cannot create additional triggers for other tables. fixed: Update the trigger name so that it can be used for new tables. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-02-26 15:10:13 +00:00
Julien Elkaim	e586bffe51	community: Repair embeddings/llamacpp's embed_query method (#29935 ) Description: As commented on the commit [`41b6a86`](`41b6a86bbe`) it introduced a bug for when we do an embedding request and the model returns a non-nested list. Typically it's the case for model _nomic-embed-text_. - I added the unit test, and ran `make format`, `make lint` and `make test` from the `community` package. - No new dependency. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-02-23 19:32:17 +00:00
Jean-Philippe Dournel	ebe38baaf9	community/mlx_pipeline: fix crash at mlx call (#29915 ) - Description: Since mlx_lm 0.20, all calls to mlx crash because the way parameters are passed to the methods generate and generate_step was deprecated. The parameters top_p, temp, repetition_penalty and repetition_context_size are no longer passed directly to those methods but are wrapped into "sampler" and "logit_processor". - Dependencies: mlx_lm (optional) - Tests: I've added a new test to the existing test file: tests/integration_tests/llms/test_mlx_pipeline.py --------- Co-authored-by: Jean-Philippe Dournel <jp@insightkeeper.io>	2025-02-21 09:14:53 -05:00
Jorge Piedrahita Ortiz	3acf842e35	core: add sambanova chat models to load module mapping (#29855 ) - Description: add sambanova integration package chat models to load module mapping, to allow serialization and deserialization	2025-02-20 12:30:50 -05:00
Jan Heimes	60f58df5b3	community: add top_k as param to Needle Retriever (#29821 ) This PR adds top_k as a param to the Needle Retriever. By default we use top 10.	2025-02-16 08:30:52 -05:00
Mohammad Mohtashim	96ad09fa2d	(Community): Added API Key for Jina Search API Wrapper (#29622 ) - Description: Simple change for adding the API Key for Jina Search API Wrapper - Issue: #29596	2025-02-12 20:12:07 -08:00
Philippe PRADOS	beb75b2150	community[minor]: 05 - Refactoring PyPDFium2 parser (#29625 ) This is one part of a larger Pull Request (PR) that is too large to be submitted all at once. This specific part focuses on updating the PyPDFium2 parser. For more details, see https://github.com/langchain-ai/langchain/pull/28970.	2025-02-07 21:31:12 -05:00
Christophe Bornet	723031d548	community: Bump ruff version to 0.9 (#29206 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2025-02-08 01:21:10 +00:00
Christophe Bornet	30f6c9f5c8	community: Use Blockbuster to detect blocking calls in asyncio during tests (#29609 ) Same as https://github.com/langchain-ai/langchain/pull/29043 for langchain-community. Dependencies: - blockbuster (test) Twitter handle: cbornet_ Co-authored-by: Erick Friis <erick@langchain.dev>	2025-02-08 01:10:39 +00:00
ccurme	d172984c91	infra: migrate to uv (#29566 )	2025-02-06 13:36:26 -05:00
Philippe PRADOS	6ff0d5c807	community[minor]: 04 - Refactoring PDFMiner parser (#29526 ) This is one part of a larger Pull Request (PR) that is too large to be submitted all at once. This specific part focuses on updating the PDFMiner parser. For more details, see [PR 28970](https://github.com/langchain-ai/langchain/pull/28970). --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-02-05 21:08:27 -05:00
Hemant Rawat	db1693aa70	community: fix issue #29429 in age_graph.py (#29506 ) ## Description: This PR addresses issue #29429 by fixing the _wrap_query method in langchain_community/graphs/age_graph.py. The method now correctly handles Cypher queries with UNION and EXCEPT operators, ensuring that the fields in the SQL query are ordered as they appear in the Cypher query. Additionally, the method now properly handles cases where RETURN * is not supported. ### Issue: #29429 ### Dependencies: None ### Add tests and docs: Added unit tests in tests/unit_tests/graphs/test_age_graph.py to validate the changes. No new integrations were added, so no example notebook is necessary. Lint and test: Ran make format, make lint, and make test to ensure code quality and functionality.	2025-02-01 21:24:45 -05:00
ccurme	16a422f3fa	community: add standard tests for Perplexity (#29534 )	2025-02-01 17:02:57 -05:00
Philippe PRADOS	ceda8bc050	community[minor]: 03 - Refactoring PyPDF parser (#29330 ) This is one part of a larger Pull Request (PR) that is too large to be submitted all at once. This specific part focuses on updating the PyPDF parser. For more details, see [PR 28970](https://github.com/langchain-ai/langchain/pull/28970).	2025-01-31 10:05:07 -05:00
Julian Castro Pulgarin	b7e3e337b1	community: Fix YahooFinanceNewsTool to handle updated yfinance data structure (#29498 ) Description: Updates the YahooFinanceNewsTool to handle the current yfinance news data structure. The tool was failing with a KeyError due to changes in the yfinance API's response format. This PR updates the code to correctly extract news URLs from the new structure. Issue: #29495 Dependencies: No new dependencies required. Works with existing yfinance package. The changes maintain backwards compatibility while fixing the KeyError that users were experiencing. The modified code properly handles the new data structure where: - News type is now at `content.contentType` - News URL is now at `content.canonicalUrl.url` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-31 02:31:44 +00:00
Macs Dickinson	7378c955db	community: adds support for getting github releases for the configured repository (#29318 ) Description: adds support for the github tool to query github releases on the configured repository Issue: N/A Dependencies: N/A Twitter handle: @macsdickinson --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-22 15:45:52 +00:00
Philippe PRADOS	4efc5093c1	community[minor]: Refactoring PyMuPDF parser, loader and add image blob parsers (#29063 ) * Adds BlobParsers for images. These implementations can take an image and produce one or more documents per image. This interface can be used for exposing OCR capabilities. * Update PyMuPDFParser and Loader to standardize metadata, handle images, improve table extraction etc. - Twitter handle: pprados This is one part of a larger Pull Request (PR) that is too large to be submitted all at once. This specific part focuses to prepare the update of all parsers. For more details, see [PR 28970](https://github.com/langchain-ai/langchain/pull/28970). --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-01-20 15:15:43 -05:00
CLOVA Studio 개발	7a95ffc775	community: fix some features on Naver ChatModel & embedding model 2 (#29243 ) ## Description - Respond to the `NCP API Key` changes. - Fix the `ChatClovaX` `astream` function to raise `SSEError` when an error event occurs. - Add `token length` and `ai_filter` to ChatClovaX's `response_metadata`. - Update the documentation to reflect the NCP API Key changes. cc. @efriis @vbarda	2025-01-20 11:01:03 -05:00
ThomasSaulou	e9abe583b2	chatperplexity stream-citations in additional kwargs (#29273 ) chatperplexity stream-citations in additional kwargs --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-18 22:31:10 +00:00
Nadeem Sajjad	eaf2fb287f	community(pypdfloader): added page_label in metadata for pypdf loader (#29225 ) # Description ## Summary This PR adds support for handling multi-labeled page numbers in the PyPDFLoader. Some PDFs use complex page numbering systems where the actual content may begin after multiple introductory pages. The page_label field helps accurately reflect the document’s page structure, making it easier to handle such cases during document parsing. ## Motivation This feature improves document parsing accuracy by allowing users to access the actual page labels instead of relying only on the physical page numbers. This is particularly useful for documents where the first few pages have roman numerals or other non-standard page labels. ## Use Case This feature is especially useful for Retrieval-Augmented Generation (RAG) systems where users may reference page numbers when asking questions. Some PDFs have both labeled page numbers (like roman numerals for introductory sections) and index-based page numbers. For example, a user might ask: "What is mentioned on page 5?" The system can now check both: • Index-based page number (page) • Labeled page number (page_label) This dual-check helps improve retrieval accuracy. Additionally, the results can be validated with an agent or tool to ensure the retrieved pages match the user’s query contextually. ## Code Changes - Added a page_label field to the metadata of the Document class in PyPDFLoader. - Implemented support for retrieving page_label from the pdf_reader.page_labels. - Created a test case (test_pypdf_loader_with_multi_label_page_numbers) with a sample PDF containing multi-labeled pages (geotopo-komprimiert.pdf) [[Source of pdf](https://github.com/py-pdf/sample-files/blob/main/009-pdflatex-geotopo/GeoTopo-komprimiert.pdf)]. - Updated existing tests to ensure compatibility and verify page_label extraction. ## Tests Added - Added a new test case for a PDF with multi-labeled pages. 
- Verified both page and page_label metadata fields are correctly extracted.	2025-01-15 14:18:07 -05:00
Mohammad Mohtashim	21eb39dff0	[Community]: AzureOpenAIWhisperParser Authentication Fix (#29135 ) - Description: `AzureOpenAIWhisperParser` authentication fix as stated in the issue. - Issue: #29133	2025-01-15 09:44:53 -05:00
Philippe PRADOS	2921597c71	community[patch]: Refactoring PDF loaders: 01 prepare (#29062 ) - Refactoring PDF loaders step 1: "community: Refactoring PDF loaders to standardize approaches" - Description: Declare CloudBlobLoader in __init__.py. file_path is Union[str, PurePath] anywhere - Twitter handle: pprados This is one part of a larger Pull Request (PR) that is too large to be submitted all at once. This specific part focuses to prepare the update of all parsers. For more details, see [PR 28970](https://github.com/langchain-ai/langchain/pull/28970). @eyurtsev it's the start of a PR series.	2025-01-07 11:00:04 -05:00
minpeter	a873e0fbfb	community: update documentation and model IDs for FriendliAI provider (#28984 ) ### Description - In the example, remove `llama-2-13b-chat`, `mixtral-8x7b-instruct-v0-1`. - Fix llm friendli streaming implementation. - Update examples in documentation and remove duplicates. ### Issue N/A ### Dependencies None ### Twitter handle `@friendliai`	2025-01-02 12:15:59 -05:00
Bagatur	1c797ac68f	infra: speed up unit tests (#28974 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-02 04:13:08 +00:00
Mohammad Mohtashim	41b6a86bbe	Community: LlamaCppEmbeddings `embed_documents` and `embed_query` (#28827 ) - Description: `embed_documents` and `embed_query` were raising the error described in the issue. The cause was that the `Llama` client returns the embeddings in a nested list, which the current implementation did not account for. - Issue: #28813 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-23 09:50:22 -05:00
yeounhak	f38fc89f35	community: Corrected aload func to be asynchronous from webBaseLoader (#28337 ) - Description: The aload function, contrary to its name, is not an asynchronous function, so it cannot work concurrently with other asynchronous functions. - Issue: #28336 - Test: Done - Docs: [here](`e0a95e5646/docs/docs/integrations/document_loaders/web_base.ipynb (L201)`) - Lint: All checks passed --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-20 14:42:52 -05:00
Jacob Mansdorfer	6d81137325	community: adding langchain-predictionguard partner package documentation (#28832 ) - Description: This PR adds documentation for the langchain-predictionguard package to the main langchain repo and deprecates the current Prediction Guard LLMs package. The LLMs package was previously broken, so I also updated it one final time to allow it to continue working from this point onward. This enables users to chat with LLMs through the Prediction Guard ecosystem. - Package Links: - [PyPI](https://pypi.org/project/langchain-predictionguard/) - [Github Repo](https://www.github.com/predictionguard/langchain-predictionguard) - Issue: None - Dependencies: None - Twitter handle: [@predictionguard](https://x.com/predictionguard) - Add tests and docs: All docs have been added for the partner package, and the current LLMs package test was updated to reflect changes. - Lint and test: Linting tests are all passing. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-20 10:51:44 -05:00
Anusha Karkhanis	26bdf40072	Langchain_Community: SQL LanguageParser (#28430 ) ## Description (This PR has contributions from @khushiDesai, @ashvini8, and @ssumaiyaahmed). This PR addresses Issue #11229 which addresses the need for SQL support in document parsing. This is integrated into the generic TreeSitter parsing library, allowing LangChain users to easily load codebases in SQL into smaller, manageable "documents." This pull request adds a new ```SQLSegmenter``` class, which provides the SQL integration. ## Issue Issue #11229: Add support for a variety of languages to LanguageParser ## Testing We created a file ```test_sql.py``` with several tests to ensure the ```SQLSegmenter``` is functional. Below are the tests we added: - ```def test_is_valid```: Checks SQL validity. - ```def test_extract_functions_classes```: Extracts individual SQL statements. - ```def test_simplify_code```: Simplifies SQL code with comments. --------- Co-authored-by: Syeda Sumaiya Ahmed <114104419+ssumaiyaahmed@users.noreply.github.com> Co-authored-by: ashvini hunagund <97271381+ashvini8@users.noreply.github.com> Co-authored-by: Khushi Desai <khushi.desai@advantawitty.com> Co-authored-by: Khushi Desai <59741309+khushiDesai@users.noreply.github.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-19 20:30:57 +00:00
Vignesh A	4c9acdfbf1	Community : Add OpenAI prompt caching and reasoning tokens tracking (#27135 ) Added Token tracking for OpenAI's prompt caching and reasoning tokens Costs updated from https://openai.com/api/pricing/ usage example ```python from langchain_community.callbacks import get_openai_callback from langchain_openai import ChatOpenAI llm = ChatOpenAI(model_name="o1-mini",temperature=1) with get_openai_callback() as cb: response = llm.invoke("hi "*1500) print(cb) ``` Output ``` Tokens Used: 1720 Prompt Tokens: 1508 Prompt Tokens Cached: 1408 Completion Tokens: 212 Reasoning Tokens: 192 Successful Requests: 1 Total Cost (USD): $0.0049559999999999995 ``` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-19 09:31:13 -05:00
ScriptShi	97f1e1d39f	community: tablestore vector store check the dimension of the embedding when writing it to store. (#28812 ) Added some restrictions to a vectorstore I released in the community before.	2024-12-19 09:30:43 -05:00
Lu Peng	50afa7c4e7	community: add new parameter default_headers (#28700 ) - Description: 1. add new parameter `default_headers` for oci model deployments and oci chat model deployments. 2. updated k parameter in OCIModelDeploymentLLM class. - Add tests and docs: 1. unit tests 2. notebook --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-18 22:33:23 +00:00
binhnd102	f723a8456e	Fixes: community: fix LanceDB return no metadata (#27024 ) - [x] Fix the case where LanceDB returns a table without a metadata column - Description: Check the table schema; if it has no metadata column, initialize the Document with the metadata argument set to an empty dict - Issue: https://github.com/langchain-ai/langchain/issues/27005 - [x] Add tests and docs --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-18 15:21:28 +00:00
Frank Dai	e81433497b	community: support Confluence cookies (#28760 ) Description: Some Confluence instances don't support personal access tokens, in which case cookies are a convenient way to authenticate. This PR adds support for Confluence cookies. Twitter handle: soulmachine	2024-12-17 12:16:36 -05:00
German Martin	3a1d05394d	community: Apache AGE wrapper. Ensure Node Uniqueness by ID. (#28759 ) Description: The Apache AGE graph integration incorrectly handled node merging, allowing duplicate nodes with different IDs but the same type and other properties. Unlike [Neo4j](`cdf6202156/libs/community/langchain_community/graphs/neo4j_graph.py (L47)`), [Memgraph](`cdf6202156/libs/community/langchain_community/graphs/memgraph_graph.py (L50)`), [Kuzu](`cdf6202156/libs/community/langchain_community/graphs/kuzu_graph.py (L253)`), and [Gremlin](`cdf6202156/libs/community/langchain_community/graphs/gremlin_graph.py (L165)`), it did not use the node ID as the primary identifier for merging. This inconsistency caused data integrity issues and unexpected behavior when users expected updates to specific nodes by ID. Solution: This PR modifies the `node_insert_query` to `MERGE` nodes based on label and ID only and updates properties with `SET`, aligning the behavior with other graph database integrations. The `_format_properties` method was also modified to handle id overrides. Impact: This fix ensures data integrity by preventing duplicate nodes, and provides a consistent behavior across graph database integrations.	2024-12-17 09:21:59 -05:00
gsa9989	cdf6202156	cosmosdbnosql: Added Cosmos DB NoSQL Semantic Cache Integration with tests and jupyter notebook (#24424 ) * Added Cosmos DB NoSQL Semantic Cache Integration with tests and jupyter notebook --------- Co-authored-by: Aayush Kataria <aayushkataria3011@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 21:57:05 -05:00
Brian Burgin	27a9056725	community: Fix ChatLiteLLMRouter runtime issues (#28163 ) Description: Fix ChatLiteLLMRouter ctor validation and model_name parameter Issue: #19356, #27455, #28077 Twitter handle: @bburgin_0	2024-12-16 18:17:39 -05:00
Hiros	8f5e72de05	community: Correctly handle multi-element rich text (#25762 ) Description: - Add _concatenate_rich_text method to combine all elements in rich text arrays - Update load_page method to use _concatenate_rich_text for rich text properties - Ensure all text content is captured, including inline code and formatted text - Add unit tests to verify correct handling of multi-element rich text This fix prevents truncation of content after backticks or other formatting elements. Issue: With the Notion DB Loader, the text for `richtext` and `title` properties was truncated after the first element, because the loader only read the first element. Dependencies: None. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 20:20:27 +00:00
Tari Yekorogha	d262d41cc0	community: added FalkorDB vector store support i.e implementation, test, docs an… (#26245 ) Description: Added support for FalkorDB Vector Store, including its implementation, unit tests, documentation, and an example notebook. The FalkorDB integration allows users to efficiently manage and query embeddings in a vector database, with relevance scoring and maximal marginal relevance search. The following components were implemented: - Core implementation for FalkorDBVector store. - Unit tests ensuring proper functionality and edge case coverage. - Example notebook demonstrating an end-to-end setup, search, and retrieval using FalkorDB. Twitter handle: @tariyekorogha --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 19:37:55 +00:00
Aaron Pham	12fced13f4	chore(community): update to OpenLLM 0.6 (#24609 ) Update to OpenLLM 0.6, which makes use of OpenLLM's OpenAI-compatible endpoint. Thus, OpenLLM now becomes a thin wrapper around the OpenAI wrapper. Signed-off-by: Aaron Pham <contact@aarnphm.xyz> --------- Signed-off-by: Aaron Pham <contact@aarnphm.xyz> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-16 14:30:07 -05:00
Lvlvko	5c17a4ace9	community: support Hunyuan Embedding (#23160 ) ## description - I refactored `Chathunyuan` to use the tencentcloud sdk because I found the original one didn't work in my application - I added `HunyuanEmbeddings` using the tencentcloud sdk - Both of them extend the base classes of langchain. I have fully tested them in my application ## Dependencies - tencentcloud-sdk-python --------- Co-authored-by: centonhuang <centonhuang@tencent.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 19:27:19 +00:00
Harrison Chase	de7996c2ca	core: add kwargs support to VectorStore (#25934 ) has been missing the passthrough until now --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 18:57:57 +00:00
Sheepsta300	580a8d53f9	community: Add configurable `VisualFeatures` to the `AzureAiServicesImageAnalysisTool` (#27444 ) - Description: The `AzureAiServicesImageAnalysisTool` is a good service and utilises the Azure AI Vision package under the hood. However, since the creation of this tool, new `VisualFeatures` have been added to allow the user to request other image specific information to be returned. Currently, the tool offers neither configuration of which features should be returned nor any of the newer feature types. The aim of this PR is to address this and expose more of the Azure Service in this integration. - Dependencies: no new dependencies in the main class file, azure.ai.vision.imageanalysis added to extra test dependencies file. - Tests and docs: Although no tests exist for already implemented Azure Service tools, I've created 3 unit tests for this class that test initialisation and credentials, local file analysis and the new features option. - Lint and test: All linting has passed. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 18:30:04 +00:00
German Martin	d5d18c62b3	community: Apache AGE wrapper additional edge cases. (#28151 ) Description: The current AGEGraph() implementation does some custom wrapping for graph queries. The relevant method is _wrap_query(), which parses the fields from the original query to add some SQL context to it. This PR improves the parsing logic to cover additional edge cases, which are added to the test coverage: basically, any Node property name or value containing the "return" literal would break the graph / SQL query. We discovered this while dealing with real world datasets; it is not an uncommon scenario and I think it needs to be covered.	2024-12-16 11:28:01 -05:00
Davi Schumacher	0f9b4bf244	community[patch]: update dynamodb chat history to update instead of overwrite (#22397 ) Description: The current implementation of `DynamoDBChatMessageHistory` updates the `History` attribute for a given chat history record by first extracting the existing contents into memory, appending the new message, and then using the `put_item` method to put the record back. This has the effect of overwriting any additional attributes someone may want to include in the record, like chat session metadata. This PR suggests changing from using `put_item` to using `update_item` instead which will keep any other attributes in the record untouched. The change is backward compatible since 1. `update_item` is an "upsert" operation, creating the record if it doesn't already exist, otherwise updating it 2. It only touches the db insert call and passes the exact same information. The rest of the class is left untouched Dependencies: None Tests and docs: No unit tests currently exist for the `DynamoDBChatMessageHistory` class. This PR adds the file `libs/community/tests/unit_tests/chat_message_histories/test_dynamodb_chat_message_history.py` to test the `add_message` and `clear` methods. I wanted to use the moto library to mock DynamoDB calls but I could not get poetry to resolve it so I mocked those calls myself in the test. Therefore, no test dependencies were added. The change was tested on a test DynamoDB table as well. The first three images below show the current behavior. First a message is added to chat history, then a value is inserted in the record in some other attribute, and finally another message is added to the record, destroying the other attribute. 
![using_put_1_first_message](https://github.com/langchain-ai/langchain/assets/29493541/426acd62-fe29-42f4-b75f-863fb8b3fb21) ![using_put_2_add_attribute](https://github.com/langchain-ai/langchain/assets/29493541/f8a1c864-7114-4fe3-b487-d6f9252f8f92) ![using_put_3_second_message](https://github.com/langchain-ai/langchain/assets/29493541/8b691e08-755e-4877-8969-0e9769e5d28a) The next three images show the new behavior. Once again a value is added to an attribute other than the History attribute, but now when the followup message is added it does not destroy that other attribute. The History attribute itself is unaffected by this change. ![using_update_1_first_message](https://github.com/langchain-ai/langchain/assets/29493541/3e0d76ed-637e-41cd-82c7-01a86c468634) ![using_update_2_add_attribute](https://github.com/langchain-ai/langchain/assets/29493541/52585f9b-71a2-43f0-9dfc-9935aa59c729) ![using_update_3_second_message](https://github.com/langchain-ai/langchain/assets/29493541/f94c8147-2d6f-407a-9a0f-86b94341abff) The doc located at `docs/docs/integrations/memory/aws_dynamodb.ipynb` required no changes and was tested as well.	2024-12-16 10:38:00 -05:00
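
The difference between overwriting and updating a record, as described in commit 0f9b4bf244, can be sketched without AWS access. This is only an illustration under stated assumptions: a plain dictionary stands in for the DynamoDB table, and `put_history`, `update_history` and `demo` are hypothetical helpers that mimic the semantics of `put_item` and `update_item`; they are not the code from the PR.

```python
from typing import Any, Callable, Dict, List

# Assumption: a plain dict keyed by session id stands in for a DynamoDB table.
Table = Dict[str, Dict[str, Any]]


def put_history(table: Table, session_id: str, history: List[dict]) -> None:
    """Mimic put_item: the whole record is replaced, extra attributes are lost."""
    table[session_id] = {"SessionId": session_id, "History": history}


def update_history(table: Table, session_id: str, history: List[dict]) -> None:
    """Mimic update_item: upsert the record and touch only the History attribute."""
    record = table.setdefault(session_id, {"SessionId": session_id})
    record["History"] = history


def demo() -> Dict[str, bool]:
    """Report, per write strategy, whether an unrelated attribute survives."""
    first = {"type": "human", "content": "hi"}
    second = {"type": "ai", "content": "hello"}
    writers: Dict[str, Callable[[Table, str, List[dict]], None]] = {
        "put": put_history,
        "update": update_history,
    }
    survived = {}
    for name, write in writers.items():
        table: Table = {}
        write(table, "s1", [first])                  # first message
        table["s1"]["Metadata"] = {"user": "alice"}  # attribute added elsewhere
        write(table, "s1", [first, second])          # follow-up message
        survived[name] = "Metadata" in table["s1"]
    return survived


print(demo())  # {'put': False, 'update': True}
```

Both strategies store the same History list; only the replace-style write drops the extra `Metadata` attribute, which matches the behavior shown in the screenshots of that commit.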

668 Commits