langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-09-02 03:26:17 +00:00

Author	SHA1	Message	Date
Ella Charlaix	6f95db81b7	huggingface: Add IPEX models support (#29179 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-22 00:16:44 +00:00
Bhav Sardana	d6a7aaa97d	community: Fix for Pydantic model validator of GoogleApiClient (#29346 ) - [ ] PR message: Delete this entire checklist* and replace with - Description: Fix for pedantic model validator for GoogleApiHandler - Issue: the issue #29165 - [ ] Lint and test*: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. --------- Signed-off-by: Bhav Sardana <sardana.bhav@gmail.com>	2025-01-21 15:17:43 -05:00
Christophe Bornet	1c4ce7b42b	core: Auto-fix some docstrings (#29337 )	2025-01-21 13:29:53 -05:00
ccurme	86a0720310	fireworks[patch]: update model used in integration tests (#29342 ) No access to firefunction-v1 and -v2.	2025-01-21 11:05:30 -05:00
Hugo Berg	32c9c58adf	Community: fix missing f-string modifier in oai structured output parsing error (#29326 ) - Description: The ValueError raised on certain structured-outputs parsing errors, in langchain openai community integration, was missing a f-string modifier and so didn't produce useful outputs. This is a 2-line, 2-character change. - Issue: None open that this fixes - Dependencies: Nothing changed - Twitter handle: None - [X] Add tests and docs: There's nothing to add for. - [-] Lint and test: Happy to run this if you deem it necessary. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-21 14:26:38 +00:00
Nuno Campos	566915d7cf	core: fix call to get closure vars for partial-wrapped funcs (#29316 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, etc. is being modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2025-01-21 09:26:15 -05:00
ZhangShenao	33e22ccb19	[Doc] Improve api doc (#29324 ) - Fix doc description - Add static method decorator	2025-01-21 09:16:08 -05:00
Tugay Talha İçen	7b44c3384e	Docs: update huggingfacehub.ipynb (#29329 ) langchain -> langchain langchain-huggingface Updated the installation command from: %pip install --upgrade --quiet langchain sentence_transformers to: %pip install --upgrade --quiet langchain-huggingface sentence_transformers This resolves an import error in the notebook when using from langchain_huggingface.embeddings import HuggingFaceEmbeddings.	2025-01-21 09:12:22 -05:00
Bagatur	536b44a47f	community[patch]: Release 0.3.15 (#29325 ) langchain-community==0.3.15	2025-01-21 03:10:07 +00:00
Bagatur	ec5fae76d4	langchain[patch]: Release 0.3.15 (#29322 ) langchain==0.3.15	2025-01-21 02:24:11 +00:00
Bagatur	923e6fb321	core[patch]: 0.3.31 (#29320 ) langchain-core==0.3.31	2025-01-21 01:17:31 +00:00
Ikko Eltociear Ashimine	06456c1dcf	docs: update google_cloud_sql_mssql.ipynb (#29315 ) arbitary -> arbitrary	2025-01-20 16:11:08 -05:00
Ahmed Tammaa	d3ed9b86be	text-splitters[minor]: Replace lxml and XSLT with BeautifulSoup in HTMLHeaderTextSplitter for Improved Large HTML File Processing (#27678 ) This pull request updates the `HTMLHeaderTextSplitter` by replacing the `split_text_from_file` method's implementation. The original method used `lxml` and XSLT for processing HTML files, which caused `lxml.etree.xsltapplyerror maxhead` when handling large HTML documents due to limitations in the XSLT processor. Fixes #13149 By switching to BeautifulSoup (`bs4`), we achieve: - Improved Performance and Reliability: BeautifulSoup efficiently processes large HTML files without the errors associated with `lxml` and XSLT. - Simplified Dependencies: Removes the dependency on `lxml` and external XSLT files, relying instead on the widely used `beautifulsoup4` library. - Maintained Functionality: The new method replicates the original behavior, ensuring compatibility with existing code and preserving the extraction of content and metadata. Issue: This change addresses issues related to processing large HTML files with the existing `HTMLHeaderTextSplitter` implementation. It resolves problems where users encounter lxml.etree.xsltapplyerror maxhead due to large HTML documents. Dependencies: - BeautifulSoup (`beautifulsoup4`): The `beautifulsoup4` library is now used for parsing HTML content. - Installation: `pip install beautifulsoup4` Code Changes: Updated the `split_text_from_file` method in `HTMLHeaderTextSplitter` as follows: ```python def split_text_from_file(self, file: Any) -> List[Document]: """Split HTML file using BeautifulSoup. Args: file: HTML file path or file-like object. Returns: List of Document objects with page_content and metadata. """ from bs4 import BeautifulSoup from langchain.docstore.document import Document import bs4 # Read the HTML content from the file or file-like object if isinstance(file, str): with open(file, 'r', encoding='utf-8') as f: html_content = f.read() else: # Assuming file is a file-like object html_content = file.read() # Parse the HTML content using BeautifulSoup soup = BeautifulSoup(html_content, 'html.parser') # Extract the header tags and their corresponding metadata keys headers_to_split_on = [tag[0] for tag in self.headers_to_split_on] header_mapping = dict(self.headers_to_split_on) documents = [] # Find the body of the document body = soup.body if soup.body else soup # Find all header tags in the order they appear all_headers = body.find_all(headers_to_split_on) # If there's content before the first header, collect it first_header = all_headers[0] if all_headers else None if first_header: pre_header_content = '' for elem in first_header.find_all_previous(): if isinstance(elem, bs4.Tag): text = elem.get_text(separator=' ', strip=True) if text: pre_header_content = text + ' ' + pre_header_content if pre_header_content.strip(): documents.append(Document( page_content=pre_header_content.strip(), metadata={} # No metadata since there's no header )) else: # If no headers are found, return the whole content full_text = body.get_text(separator=' ', strip=True) if full_text.strip(): documents.append(Document( page_content=full_text.strip(), metadata={} )) return documents # Process each header and its associated content for header in all_headers: current_metadata = {} header_name = header.name header_text = header.get_text(separator=' ', strip=True) current_metadata[header_mapping[header_name]] = header_text # Collect all sibling elements until the next header of the same or higher level content_elements = [] for sibling in header.find_next_siblings(): if sibling.name in headers_to_split_on: # Stop at the next header break if isinstance(sibling, bs4.Tag): content_elements.append(sibling) # Get the text content of the collected elements current_content = '' for elem in content_elements: text = elem.get_text(separator=' ', strip=True) if text: current_content += text + ' ' # Create a Document if there is content if current_content.strip(): documents.append(Document( page_content=current_content.strip(), metadata=current_metadata.copy() )) else: # If there's no content, but we have metadata, still create a Document documents.append(Document( page_content='', metadata=current_metadata.copy() )) return documents ``` --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-01-20 16:10:37 -05:00
Christophe Bornet	989eec4b7b	core: Add ruff rule S101 (no assert) (#29267 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-01-20 20:24:31 +00:00
Christophe Bornet	e5d62c6ce7	core: Add ruff rule W293 (whitespaces) (#29272 )	2025-01-20 15:16:12 -05:00
Philippe PRADOS	4efc5093c1	community[minor]: Refactoring PyMuPDF parser, loader and add image blob parsers (#29063 ) * Adds BlobParsers for images. These implementations can take an image and produce one or more documents per image. This interface can be used for exposing OCR capabilities. * Update PyMuPDFParser and Loader to standardize metadata, handle images, improve table extraction etc. - Twitter handle: pprados This is one part of a larger Pull Request (PR) that is too large to be submitted all at once. This specific part focuses to prepare the update of all parsers. For more details, see [PR 28970](https://github.com/langchain-ai/langchain/pull/28970). --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-01-20 15:15:43 -05:00
Syed Baqar Abbas	f175319303	[feat] Added backwards compatibility for OllamaEmbeddings initialization (migration from `langchain_community.embeddings` to `langchain_ollama.embeddings` (#29296 ) - [feat] Added backwards compatibility for OllamaEmbeddings initialization (migration from `langchain_community.embeddings` to `langchain_ollama.embeddings`: "langchain_ollama" - Description: Given that `OllamaEmbeddings` from `langchain_community.embeddings` is deprecated, code is being shifted to ``langchain_ollama.embeddings`. However, this does not offer backward compatibility of initializing the parameters and `OllamaEmbeddings` object. - Issue: #29294 - Dependencies: None - Twitter handle: @BaqarAbbas2001 ## Additional Information Previously, `OllamaEmbeddings` from `langchain_community.embeddings` used to support the following options: `e9abe583b2/libs/community/langchain_community/embeddings/ollama.py (L125-L139)` However, in the new package `from langchain_ollama import OllamaEmbeddings`, there is no method to set these options. I have added these parameters to resolve this issue. This issue was also discussed in https://github.com/langchain-ai/langchain/discussions/29113	2025-01-20 11:16:29 -05:00
CLOVA Studio 개발	7a95ffc775	community: fix some features on Naver ChatModel & embedding model 2 (#29243 ) ## Description - Responding to `NCP API Key` changes. - To fix `ChatClovaX` `astream` function to raise `SSEError` when an error event occurs. - To add `token length` and `ai_filter` to ChatClovaX's `response_metadata`. - To update document for apply NCP API Key changes. cc. @efriis @vbarda	2025-01-20 11:01:03 -05:00
Sangyun_LEE	5d64597490	docs: fix broken Appearance of langchain_community/document_loaders/recursive_url_loader API Reference (#29305 ) # PR mesesage ## Description Fixed a broken Appearance of RecurisveUrlLoader API Reference. ### Before <p align="center"> <img width="750" alt="image" src="https://github.com/user-attachments/assets/f39df65d-b788-411d-88af-8bfa2607c00b" /> <img width="750" alt="image" src="https://github.com/user-attachments/assets/b8a92b70-4548-4b4a-965f-026faeebd0ec" /> </p> ### After <p align="center"> <img width="750" alt="image" src="https://github.com/user-attachments/assets/8ea28146-de45-42e2-b346-3004ec4dfc55" /> <img width="750" alt="image" src="https://github.com/user-attachments/assets/914c6966-4055-45d3-baeb-2d97eab06fe7" /> </p> ## Issue: N/A ## Dependencies None ## Twitter handle N/A # Add tests and docs Not applicable; this change only affects documentation. # Lint and test Ran make format, make lint, and make test to ensure no issues.	2025-01-20 10:56:59 -05:00
Hemant Rawat	6c52378992	Add Google-style docstring linting and update pyproject.toml (#29303 ) ### Description: This PR introduces Google-style docstring linting for the ModelLaboratory class in libs/langchain/langchain/model_laboratory.py. It also updates the pyproject.toml file to comply with the latest Ruff configuration standards (deprecating top-level lint settings in favor of lint). ### Changes include: - [x] Added detailed Google-style docstrings to all methods in ModelLaboratory. - [x] Updated pyproject.toml to move select and pydocstyle settings under the [tool.ruff.lint] section. - [x] Ensured all files pass Ruff linting. Issue: Closes #25154 ### Dependencies: No additional dependencies are required for this change. ### Checklist - [x] Files passes ruff linting. - [x] Docstrings conform to the Google-style convention. - [x] pyproject.toml updated to avoid deprecation warnings. - [x] My PR is ready to review, please review.	2025-01-19 14:37:21 -05:00
Mohammad Mohtashim	b5fbebb3c8	(Community): Changing the BaseURL and Model for MiniMax (#29299 ) - Description: Changed the Base Default Model and Base URL to correct versions. Plus added a more explicit exception if user provides an invalid API Key - Issue: #29278	2025-01-19 14:15:02 -05:00
ccurme	c20f7418c7	openai[patch]: fix Azure LLM test (#29302 ) The tokens I get are: ``` ['', '\n\n', 'The', ' sun', ' was', ' setting', ' over', ' the', ' horizon', ',', ' casting', ''] ``` so possibly an extra empty token is included in the output. lmk @efriis if we should look into this further. langchain-openai==0.3.1	2025-01-19 17:25:42 +00:00
ccurme	6b249a0dc2	openai[patch]: release 0.3.1 (#29301 )	2025-01-19 17:04:00 +00:00
ThomasSaulou	e9abe583b2	chatperplexity stream-citations in additional kwargs (#29273 ) chatperplexity stream-citations in additional kwargs --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-18 22:31:10 +00:00
Farzad Sharif	8fad9214c7	docs: fix qa_per_user.ipynb (#29290 ) # Description The `config` option was not passed to `configurable_retriever.invoke()`. Screenshot below. Fixed. <img width="731" alt="Screenshot 2025-01-18 at 11 59 28 AM" src="https://github.com/user-attachments/assets/21f30739-2abd-4150-b3ad-626ea9e3f96c" />	2025-01-18 16:24:31 -05:00
Vadim Rusin	2fb6fd7950	docs: fix broken link in JSONOutputParser reference (#29292 ) ### PR message: - Description: Fixed a broken link in the documentation for the `JSONOutputParser`. Updated the link to point to the correct reference: From: `https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JSONOutputParser.html` To: `https://python.langchain.com/api_reference/core/output_parsers/langchain_core.output_parsers.json.JsonOutputParser.html`. This ensures accurate navigation for users referring to the `JsonOutputParser` documentation. - Issue: N/A - Dependencies: None - Twitter handle: N/A ### Add tests and docs: Not applicable; this change only affects documentation. ### Lint and test: Ran `make format`, `make lint`, and `make test` to ensure no issues.	2025-01-18 16:17:34 -05:00
TheSongg	1cd4d8d101	[langchain_community.llms.xinference]: Rewrite _stream() method and support stream() method in xinference.py (#29259 ) - [ ] PR title:[langchain_community.llms.xinference]: Rewrite _stream() method and support stream() method in xinference.py - [ ] PR message: Rewrite the _stream method so that the chain.stream() can be used to return data streams. chain = prompt \| llm chain.stream(input=user_input) - [ ] tests: from langchain_community.llms import Xinference from langchain.prompts import PromptTemplate llm = Xinference( server_url="http://0.0.0.0:9997", # replace your xinference server url model_uid={model_uid} # replace model_uid with the model UID return from launching the model stream = True ) prompt = PromptTemplate(input=['country'], template="Q: where can we visit in the capital of {country}? A:") chain = prompt \| llm chain.stream(input={'country': 'France'})	2025-01-17 20:31:59 -05:00
Amaan	d4b9404fd6	docs: add langchain dappier tool integration notebook (#29265 ) Add tools to interact with Dappier APIs with an example notebook. For `DappierRealTimeSearchTool`, the tool can be invoked with: ```python from langchain_dappier import DappierRealTimeSearchTool tool = DappierRealTimeSearchTool() tool.invoke({"query": "What happened at the last wimbledon"}) ``` ``` At the last Wimbledon in 2024, Carlos Alcaraz won the title by defeating Novak Djokovic. This victory marked Alcaraz's fourth Grand Slam title at just 21 years old! 🎉🏆🎾 ``` For DappierAIRecommendationTool the tool can be invoked with: ```python from langchain_dappier import DappierAIRecommendationTool tool = DappierAIRecommendationTool( data_model_id="dm_01j0pb465keqmatq9k83dthx34", similarity_top_k=3, ref="sportsnaut.com", num_articles_ref=2, search_algorithm="most_recent", ) ``` ``` [{"author": "Matt Weaver", "image_url": "https://images.dappier.com/dm_01j0pb465keqmatq9k83dthx34...", "pubdate": "Fri, 17 Jan 2025 08:04:03 +0000", "source_url": "https://sportsnaut.com/chili-bowl-thursday-bell-column/", "summary": "The article highlights the thrilling unpredictability... ", "title": "Thursday proves why every lap of Chili Bowl..."}, {"author": "Matt Higgins", "image_url": "https://images.dappier.com/dm_01j0pb465keqmatq9k83dthx34...", "pubdate": "Fri, 17 Jan 2025 02:48:42 +0000", "source_url": "https://sportsnaut.com/new-york-mets-news-pete-alonso...", "summary": "The New York Mets are likely parting ways with star...", "title": "MLB insiders reveal New York Mets’ last-ditch..."}, {"author": "Jim Cerny", "image_url": "https://images.dappier.com/dm_01j0pb465keqmatq9k83dthx34...", "pubdate": "Fri, 17 Jan 2025 05:10:39 +0000", "source_url": "https://www.foreverblueshirts.com/new-york-rangers-news...", "summary": "The New York Rangers achieved a thrilling 5-3 comeback... ", "title": "Rangers score 3 times in 3rd period for stirring 5-3..."}] ``` The integration package can be found over here - https://github.com/DappierAI/langchain-dappier	2025-01-17 19:02:28 -05:00
ccurme	184ea8aeb2	anthropic[patch]: update tool choice type (#29276 ) langchain-anthropic==0.3.3	2025-01-17 15:26:33 -05:00
ccurme	ac52021097	anthropic[patch]: release 0.3.2 (#29275 ) langchain-anthropic==0.3.2	2025-01-17 19:48:31 +00:00
ccurme	c616b445f2	anthropic[patch]: support `parallel_tool_calls` (#29257 ) Need to: - Update docs - Decide if this is an explicit kwarg of bind_tools - Decide if this should be in standard test with flag for supporting	2025-01-17 19:41:41 +00:00
Erick Friis	628145b172	infra: fix api build (#29274 )	2025-01-17 10:41:59 -08:00
Zapiron	97a5bc7fc7	docs: Fixed typos and improve metadata explanation (#29266 ) Fix mini typos and made the explanation of metadata filtering clearer	2025-01-17 11:17:40 -05:00
Jun He	f0226135e5	docs: Remove redundant "%" (#29205 ) Before this commit, the copied command can't be used directly.	2025-01-17 14:30:58 +00:00
Michael Chin	36ff83a0b5	docs: Message history for Neptune chains (#29260 ) Expanded the Amazon Neptune documentation with new sections detailing usage of chat message history with the `create_neptune_opencypher_qa_chain` and `create_neptune_sparql_qa_chain` functions.	2025-01-17 09:06:17 -05:00
ccurme	d5360b9bd6	core[patch]: release 0.3.30 (#29256 ) langchain-core==0.3.30	2025-01-16 17:52:37 -05:00
Nuno Campos	595297e2e5	core: Add support for calls in get_function_nonlocals (#29255 ) Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, etc. is being modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.	2025-01-16 14:43:42 -08:00
Luis Lopez	75663f2cae	community: Add cost per 1K tokens for fine-tuned model cached input (#29248 ) ### Description - Since there is no cost per 1k input tokens for a fine-tuned cached version of `gpt-4o-mini-2024-07-18` is not available when using the `OpenAICallbackHandler`, it raises an error when trying to make calls with such model. - To add the price in the `MODEL_COST_PER_1K_TOKENS` dictionary cc. @efriis	2025-01-16 15:19:26 -05:00
Junon	667d2a57fd	add mode arg to OBSFileLoader.load() method (#29246 ) - Description: add mode arg to OBSFileLoader.load() method - Issue: #29245 - Dependencies: no dependencies required for this change --------- Co-authored-by: Junon_Gz <junon_gz@qq.com>	2025-01-16 11:09:04 -05:00
齐	c6388d736b	docs: fix typo in tool_results_pass_to_model.ipynb (how-to) (#29252 ) Description: fix typo. change word from `cals` to `calls` Issue: closes #29251 Dependencies: None Twitter handle: None	2025-01-16 11:05:28 -05:00
Erick Friis	4bc6cb759f	docs: update recommended code interpreters (#29236 ) unstable :(	2025-01-15 16:03:26 -08:00
Erick Friis	5eb4dc5e06	standard-tests: double messages test (#29237 )	2025-01-15 15:14:29 -08:00
Nithish Raghunandanan	1051fa5729	couchbase: Migrate couchbase partner package to different repo (#29239 ) Description: Migrate the couchbase partner package to [Couchbase-Ecosystem](https://github.com/Couchbase-Ecosystem/langchain-couchbase) org	2025-01-15 12:37:27 -08:00
Nadeem Sajjad	eaf2fb287f	community(pypdfloader): added page_label in metadata for pypdf loader (#29225 ) # Description ## Summary This PR adds support for handling multi-labeled page numbers in the PyPDFLoader. Some PDFs use complex page numbering systems where the actual content may begin after multiple introductory pages. The page_label field helps accurately reflect the document’s page structure, making it easier to handle such cases during document parsing. ## Motivation This feature improves document parsing accuracy by allowing users to access the actual page labels instead of relying only on the physical page numbers. This is particularly useful for documents where the first few pages have roman numerals or other non-standard page labels. ## Use Case This feature is especially useful for Retrieval-Augmented Generation (RAG) systems where users may reference page numbers when asking questions. Some PDFs have both labeled page numbers (like roman numerals for introductory sections) and index-based page numbers. For example, a user might ask: "What is mentioned on page 5?" The system can now check both: • Index-based page number (page) • Labeled page number (page_label) This dual-check helps improve retrieval accuracy. Additionally, the results can be validated with an agent or tool to ensure the retrieved pages match the user’s query contextually. ## Code Changes - Added a page_label field to the metadata of the Document class in PyPDFLoader. - Implemented support for retrieving page_label from the pdf_reader.page_labels. - Created a test case (test_pypdf_loader_with_multi_label_page_numbers) with a sample PDF containing multi-labeled pages (geotopo-komprimiert.pdf) [[Source of pdf](https://github.com/py-pdf/sample-files/blob/main/009-pdflatex-geotopo/GeoTopo-komprimiert.pdf)]. - Updated existing tests to ensure compatibility and verify page_label extraction. ## Tests Added - Added a new test case for a PDF with multi-labeled pages. - Verified both page and page_label metadata fields are correctly extracted. ## Screenshots <img width="549" alt="image" src="https://github.com/user-attachments/assets/65db9f5c-032e-4592-926f-824777c28f33" />	2025-01-15 14:18:07 -05:00
Mehdi	1a38948ee3	Mehdi zare/fmp data doc (#29219 ) Title: community: add Financial Modeling Prep (FMP) API integration Description: Adding LangChain integration for Financial Modeling Prep (FMP) API to enable semantic search and structured tool creation for financial data endpoints. This integration provides semantic endpoint search using vector stores and automatic tool creation with proper typing and error handling. Users can discover relevant financial endpoints using natural language queries and get properly typed LangChain tools for discovered endpoints. Issue: N/A Dependencies: fmp-data>=0.3.1 langchain-core>=0.1.0 faiss-cpu tiktoken Twitter handle: @mehdizarem Unit tests and example notebook have been added: Tests are in tests/integration_tests/est_tools.py and tests/unit_tests/test_tools.py Example notebook is in docs/tools.ipynb All format, lint and test checks pass: pytest mypy . Dependencies are imported within functions and not added to pyproject.toml. The changes are backwards compatible and only affect the community package. --------- Co-authored-by: mehdizare <mehdizare@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-15 15:31:01 +00:00
Mohammad Mohtashim	288613d361	(text-splitters): Small Fix in `_process_html` for HTMLSemanticPreservingSplitter to properly extract the metadata. (#29215 ) - Description: Include `main` in the list of elements whose child elements needs to be processed for splitting the HTML. - Issue: #29184	2025-01-15 10:18:06 -05:00
TheSongg	4867fe7ac8	[langchain_community.llms.xinference]: fix error in xinference.py (#29216 ) - [ ] PR title: [langchain_community.llms.xinference]: fix error in xinference.py - [ ] PR message: - The old code raised an ValidationError: pydantic_core._pydantic_core.ValidationError: 1 validation error for Xinference when import Xinference from xinference.py. This issue has been resolved by adjusting it's type and default value. File "/media/vdc/python/lib/python3.10/site-packages/pydantic/main.py", line 212, in __init__ validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self) pydantic_core._pydantic_core.ValidationError: 1 validation error for Xinference client Field required [type=missing, input_value={'server_url': 'http://10...t4', 'model_kwargs': {}}, input_type=dict] For further information visit https://errors.pydantic.dev/2.9/v/missing - [ ] tests: from langchain_community.llms import Xinference llm = Xinference( server_url="http://0.0.0.0:9997", # replace your xinference server url model_uid={model_uid} # replace model_uid with the model UID return from launching the model )	2025-01-15 10:11:26 -05:00
Kostadin Devedzhiev	bea5798b04	docs: Fix typo in retrievers documentation: 'An vectorstore' -> 'A vectorstore' (#29221 ) - [x] PR title: "docs: Fix typo in documentation" - [x] PR message: - Description: Fixed a typo in the documentation, changing "An vectorstore" to "A vector store" for grammatical accuracy. - Issue: N/A (no issue filed for this typo fix) - Dependencies: None - Twitter handle: N/A - [x] Add tests and docs: This is a minor documentation fix that doesn't require additional tests or example notebooks. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/	2025-01-15 10:10:14 -05:00
Sohaib Athar	d1cf10373b	Update elasticsearch_retriever.ipynb (#29223 ) docs: fix typo (connection) - Twitter handle: @ReallyVirtual	2025-01-15 10:09:51 -05:00
Syed Baqar Abbas	4278046329	[fix] Convert table names to list for compatibility in SQLDatabase (#29229 ) - [langchain_community.utilities.SQLDatabase] [fix] Convert table names to list for compatibility in SQLDatabase: - The issue #29227 is being fixed here - The "package" modified is community - The issue lied in this block of code: `44b41b699c/libs/community/langchain_community/utilities/sql_database.py (L72-L77)` - [langchain_community.utilities.SQLDatabase] [fix] Convert table names to list for compatibility in SQLDatabase: - Description: When the SQLDatabase is initialized, it runs a code `self._inspector.get_table_names(schema=schema)` which expects an output of list. However, with some connectors (such as snowflake) the data type returned could be another iterable. This results in a type error when concatenating the table_names to view_names. I have added explicit type casting to prevent this. - Issue: The issue #29227 is being fixed here - Dependencies: None - Twitter handle: @BaqarAbbas2001 ## Additional Information When the following method is called for a Snowflake database: `44b41b699c/libs/community/langchain_community/utilities/sql_database.py (L75)` Snowflake under the hood calls: ```python from snowflake.sqlalchemy.snowdialect import SnowflakeDialect SnowflakeDialect.get_table_names ``` This method returns a `dict_keys()` object which is incompatible to concatenate with a list and results in a `TypeError` ### Relevant Library Versions - snowflake-sqlalchemy: 1.7.2 - snowflake-connector-python: 3.12.4 - sqlalchemy: 2.0.20 - langchain_community: 0.3.14	2025-01-15 10:00:03 -05:00

1 2 3 4 5 ...

12478 Commits