langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-20 01:49:51 +00:00

Author	SHA1	Message	Date
Erick Friis	30badd7a32	packages: update mongodb folder (#29217 )	2025-01-15 02:01:06 +00:00
pm390	76172511fd	community: Additional parameters for OpenAIAssistantV2Runnable (#29207 ) Description: Adds parameters that are useful when working with OpenAIAssistantV2Runnable. The change is intended to let LangChain users set parameters that cannot be set in the Assistants UI (max_completion_tokens, max_prompt_tokens, parallel_tool_calls) as well as parameters that are useful for experimenting, such as top_p and temperature. This PR originated from the need to use parallel_tool_calls in LangChain. The parameter matters for OpenAI Assistants because strict mode is not respected unless it is set to False (https://platform.openai.com/docs/guides/function-calling#parallel-function-calling). > Note: Currently, if the model calls multiple functions in one turn then strict mode will be disabled for those calls. Issue: None Dependencies: openai	2025-01-14 15:53:37 -05:00
Bagatur	4ab04ad6be	docs: oai api ref nit (#29210 )	2025-01-14 17:55:16 +00:00
Michael Chin	d9b856abad	community: Deprecate Amazon Neptune resources in langchain-community (#29191 ) Related: https://github.com/langchain-ai/langchain-aws/pull/322 The legacy `NeptuneOpenCypherQAChain` and `NeptuneSparqlQAChain` classes are being replaced by the new LCEL format chains `create_neptune_opencypher_qa_chain` and `create_neptune_sparql_qa_chain`, respectively, in the `langchain_aws` package. This PR adds deprecation warnings to all Neptune classes and functions that have been migrated to `langchain_aws`. All relevant documentation has also been updated to replace `langchain_community` usage with the new `langchain_aws` implementations. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-14 10:23:34 -05:00
Erick Friis	c55af44711	anthropic: pydantic mypy plugin (#29144 )	2025-01-13 15:32:40 -08:00
ccurme	1bf6576709	cli[patch]: fix anchor links in templates (#29178 ) These are outdated and can break docs builds.	2025-01-13 18:28:18 +00:00
Christopher Varjas	e156b372fb	langchain: support api key argument with OpenAI moderation chain (#29140 ) Description: Makes it possible to instantiate `OpenAIModerationChain` with an `openai_api_key` argument only and no `OPENAI_API_KEY` environment variable defined. Issue: https://github.com/langchain-ai/langchain/issues/25176 Dependencies: `openai` --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2025-01-13 11:00:02 -05:00
Nikhil Shahi	335ca3a606	docs: add HyperbrowserLoader docs (#29143 ) ### Description This PR adds docs for the [langchain-hyperbrowser](https://pypi.org/project/langchain-hyperbrowser/) package. It includes a document loader that uses Hyperbrowser to scrape or crawl any urls and return formatted markdown or html content as well as relevant metadata. [Hyperbrowser](https://hyperbrowser.ai) is a platform for running and scaling headless browsers. It lets you launch and manage browser sessions at scale and provides easy to use solutions for any webscraping needs, such as scraping a single page or crawling an entire site. ### Issue None ### Dependencies None ### Twitter Handle `@hyperbrowser`	2025-01-13 10:45:39 -05:00
Tymon Żarski	689592f9bb	community: Fix rank-llm import paths for new 0.20.3 version (#29154 ) ## Description This PR updates the import paths for the `rank-llm` package to account for changes introduced in version `0.20.3`. The changes ensure compatibility with both pre- and post-revamp versions of `rank-llm`, specifically version `0.12.8`. Conditional imports are introduced based on the detected version of `rank-llm` to handle different path structures for `VicunaReranker`, `ZephyrReranker`, and `SafeOpenai`. ## Issue Using RankLLMRerank (with GPT, among other rerankers) throws an error when the rank-llm version is > 0.12.8 - #29156 ## Dependencies This change relies on the `packaging` and `pkg_resources` libraries to handle version checks. ## Twitter handle @tymzar	2025-01-13 10:22:14 -05:00
Andrew	0e3115330d	Add additional_instructions on openai assistant runs create. (#29164 ) - Description: In the functions `_create_run` and `_acreate_run`, the parameters passed to the creation of `openai.resources.beta.threads.runs` were limited. Source: ``` def _create_run(self, input: dict) -> Any: params = { k: v for k, v in input.items() if k in ("instructions", "model", "tools", "run_metadata") } return self.client.beta.threads.runs.create( input["thread_id"], assistant_id=self.assistant_id, **params, ) ``` - OpenAI Documentation ([createRun](https://platform.openai.com/docs/api-reference/runs/createRun)) - Full list of parameters `openai.resources.beta.threads.runs` ([source code](https://github.com/openai/openai-python/blob/main/src/openai/resources/beta/threads/runs/runs.py#L91)) - Issue: Fix #17574 Co-authored-by: ccurme <chester.curme@gmail.com>	2025-01-13 10:11:47 -05:00
ccurme	e4ceafa1c8	langchain[patch]: update extended tests for compatibility with langchain-openai==0.3 (#29174 )	2025-01-13 15:04:22 +00:00
Priyansh Agrawal	c115c09b6d	community: add missing format specifier in error log in CubeSemanticLoader (#29172 ) - Description: Add a missing format specifier in an error log in `langchain_community.document_loaders.CubeSemanticLoader` - Issue: raises `TypeError: not all arguments converted during string formatting`	2025-01-13 09:32:57 -05:00
ThomasSaulou	349b5c91c2	fix chatperplexity: remove 'stream' from params in _stream method (#29173 ) quick fix chatperplexity: remove 'stream' from params in _stream method	2025-01-13 09:31:37 -05:00
LIU Yuwei	f980144e9c	community: add init for unstructured file loader (#29101 ) ## Description Add `__init__` for unstructured loader of epub/image/markdown/pdf/ppt/word to restrict the input type to `str` or `Path`. In the [signature](https://python.langchain.com/api_reference/community/document_loaders/langchain_community.document_loaders.markdown.UnstructuredMarkdownLoader.html) these unstructured loaders receive `file_path: str \| List[str] \| Path \| List[Path]`, but actually they only receive `str` or `Path`. ## Issue None ## Dependencies No changes.	2025-01-13 09:26:00 -05:00
Erick Friis	bbc3e3b2cf	openai: disable streaming for o1 by default (#29147 ) Currently 400s https://community.openai.com/t/streaming-support-for-o1-o1-2024-12-17-resulting-in-400-unsupported-value/1085043 o1-mini and o1-preview stream fine	2025-01-11 02:24:11 +00:00
Isaac Francisco	62074bac60	replace all LANGCHAIN_ flags with LANGSMITH_ flags (#29120 )	2025-01-11 01:24:40 +00:00
Bagatur	5c2fbb5b86	docs: Update openai README.md (#29146 )	2025-01-10 17:24:16 -08:00
Erick Friis	0a54aedb85	anthropic: pdf integration test (#29142 )	2025-01-10 21:56:31 +00:00
ccurme	8de8519daf	tests[patch]: release 0.3.8 (#29141 )	2025-01-10 21:53:41 +00:00
Jiang	7d3fb21807	Add lindorm as new integration (#29123 ) The original PR was closed by mistake: [origin pr link](https://github.com/langchain-ai/langchain/pull/29085) --------- Co-authored-by: jiangzhijie <jiangzhijie.jzj@alibaba-inc.com>	2025-01-10 16:30:37 -05:00
ccurme	4819b500e8	pinecone[patch]: release 0.2.2 (#29139 )	2025-01-10 14:59:57 -05:00
Ashvin	46fd09ffeb	partner: Update aiohttp in langchain pinecone. (#28863 ) - partner: "Update Aiohttp for resolving vulnerability issue" - Description: I have updated the upper limit of aiohttp from `3.10` to `3.10.5` in the pyproject.toml file of langchain-pinecone. Hopefully this will resolve #28771 . Please review this as I'm quite unsure. --------- Co-authored-by: = <=> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-10 14:54:52 -05:00
ccurme	f3d370753f	xai[minor]: release 0.2 (#29132 ) Update `langchain-openai` to 0.3. See [release notes](https://github.com/langchain-ai/langchain/releases/tag/langchain-openai%3D%3D0.3.0) for details. Should only impact default values of `temperature`, `n`, and `max_retries`.	2025-01-10 11:47:27 -05:00
ccurme	6e63ccba84	openai[minor]: release 0.3 (#29100 ) ## Goal Solve the following problems with `langchain-openai`: - Structured output with `o1` [breaks out of the box](https://langchain.slack.com/archives/C050X0VTN56/p1735232400232099). - `with_structured_output` by default does not use OpenAI’s [structured output feature](https://platform.openai.com/docs/guides/structured-outputs). - We override API defaults for temperature and other parameters. ## Breaking changes: - Default method for structured output is changing to OpenAI’s dedicated [structured output feature](https://platform.openai.com/docs/guides/structured-outputs). For schemas specified via TypedDict or JSON schema, strict schema validation is disabled by default but can be enabled by specifying `strict=True`. - To recover previous default, pass `method="function_calling"` into `with_structured_output`. - Models that don’t support `method="json_schema"` (e.g., `gpt-4` and `gpt-3.5-turbo`, currently the default model for ChatOpenAI) will raise an error unless `method` is explicitly specified. - To recover previous default, pass `method="function_calling"` into `with_structured_output`. - Schemas specified via Pydantic `BaseModel` that have fields with non-null defaults or metadata (like min/max constraints) will raise an error. - To recover previous default, pass `method="function_calling"` into `with_structured_output`. - `strict` now defaults to False for `method="json_schema"` when schemas are specified via TypedDict or JSON schema. - To recover previous behavior, use `with_structured_output(schema, strict=True)` - Schemas specified via Pydantic V1 will raise a warning (and use `method="function_calling"`) unless `method` is explicitly specified. - To remove the warning, pass `method="function_calling"` into `with_structured_output`. - Streaming with default structured output method / Pydantic schema no longer generates intermediate streamed chunks. 
- To recover previous behavior, pass `method="function_calling"` into `with_structured_output`. - We no longer override default temperature (was 0.7 in LangChain, now will follow OpenAI, currently 1.0). - To recover previous behavior, initialize `ChatOpenAI` or `AzureChatOpenAI` with `temperature=0.7`. - Note: conceptually there is a difference between forcing a tool call and forcing a response format. Tool calls may have more concise arguments vs. generating content adhering to a schema. Prompts may need to be adjusted to recover desired behavior. --------- Co-authored-by: Jacob Lee <jacoblee93@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2025-01-10 10:50:32 -05:00
ccurme	815bfa1913	openai[patch]: support streaming with json_schema response format (#29044 ) - Stream JSON string content. Final chunk includes parsed representation (following OpenAI [docs](https://platform.openai.com/docs/guides/structured-outputs#streaming)). - Mildly (?) breaking change: if you were using streaming with `response_format` before, usage metadata will disappear unless you set `stream_usage=True`. ## Response format Before: ![Screenshot 2025-01-06 at 11 59 01 AM](https://github.com/user-attachments/assets/e54753f7-47d5-421d-b8f3-172f32b3364d) After: ![Screenshot 2025-01-06 at 11 58 13 AM](https://github.com/user-attachments/assets/34882c6c-2284-45b4-92f7-5b5b69896903) ## with_structured_output For pydantic output, behavior of `with_structured_output` is unchanged (except for warning disappearing), because we pluck the parsed representation straight from OpenAI, and OpenAI doesn't return it until the stream is completed. Open to alternatives (e.g., parsing from content or intermediate dict chunks generated by OpenAI). Before: ![Screenshot 2025-01-06 at 12 38 11 PM](https://github.com/user-attachments/assets/913d320d-f49e-4cbb-a800-b394ae817fd1) After: ![Screenshot 2025-01-06 at 12 38 58 PM](https://github.com/user-attachments/assets/f7a45dd6-d886-48a6-8d76-d0e21ca767c6)	2025-01-09 10:32:30 -05:00
Panos Vagenas	858f655a25	docs: add Docling loader docs (#29104 ) ### Description This adds the docs for the Docling document loader. [Docling](https://github.com/DS4SD/docling) parses PDF, DOCX, PPTX, HTML, and other formats into a rich unified representation including document layout, tables etc., making them ready for generative AI workflows like RAG. Some references: - https://research.ibm.com/blog/docling-generative-AI - https://www.redhat.com/en/blog/docling-missing-document-processing-companion-generative-ai - [Docling Technical Report](https://arxiv.org/abs/2408.09869) The introduced `DoclingLoader` enables users to: - use various document types in their LLM applications with ease and speed, and - leverage Docling's rich representation for advanced, document-native grounding. ### Issue Replacing PR #27987 as discussed with @efriis [here](https://github.com/langchain-ai/langchain/pull/27987#issuecomment-2489354930). ### Dependencies None --------- Signed-off-by: Panos Vagenas <35837085+vagenas@users.noreply.github.com>	2025-01-09 10:15:35 -05:00
Joshua Campbell	00dcc44739	Langchain_community: Fix issue with missing backticks in arango client (#29110 ) - Description: Adds backticks to generate_schema function in the arango graph client - Issue: We experienced an issue with the generate schema function when talking to our arango database where these backticks were missing - Dependencies: none - Twitter handle: @anangelofgrace	2025-01-09 10:00:10 -05:00
LIU Yuwei	2b09f798e1	community: add init for `UnstructuredHTMLLoader` to solve pathlib paths (#29091 ) ## Description Add `__init__` for `UnstructuredHTMLLoader` to restrict the input type to `str` or `Path`, and transfer the `self.file_path` to `str` just like `UnstructuredXMLLoader` does. ## Issue Fix #29090 ## Dependencies No changes.	2025-01-08 10:19:27 -05:00
Jin Hyung Ahn	c8ca1cd42f	community: fix "confluence-loader" enable include_labels for documents loaded via CQL (#29089 ) ## Description This PR enables label inclusion for documents loaded via CQL in the confluence-loader. - Updated _lazy_load to pass the include_labels parameter instead of False in process_pages calls for documents loaded via CQL. - Ensured that labels can now be fetched and added to the metadata for documents queried with cql. ## Related Modification History This PR builds on the previous functionality introduced in [#28259](https://github.com/langchain-ai/langchain/pull/28259), which added support for including labels with the include_labels option. However, this functionality did not work as expected for CQL queries, and this PR fixes that issue. If the False handling was intentional due to another issue, please let me know. I have verified with our Confluence instance that this change allows labels to be correctly fetched for documents loaded via CQL. ## Issue Fixes #29088 ## Dependencies No changes. ## Twitter Handle [@zenoengine](https://x.com/zenoengine)	2025-01-08 10:16:39 -05:00
Inah Jeon	9d290abccd	partner: Update Upstage Model Names and Remove Deprecated Model (#29093 ) This PR updates model names in the upstage library to reflect the latest naming conventions and removes deprecated models. Changes: Renamed Models: - `solar-1-mini-chat` -> `solar-mini` - `solar-1-mini-embedding-query` -> `embedding-query` Removed Deprecated Models: - `layout-analysis` (replaced by `document-parse`) Reference: - https://console.upstage.ai/docs/getting-started/overview - https://github.com/langchain-ai/langchain-upstage/releases/tag/libs%2Fupstage%2Fv0.5.0	2025-01-08 10:13:22 -05:00
Prashanth Rao	b1dafaef9b	Kùzu package integration docs (#29076 ) ## Langchain Kùzu ### Description This PR adds docs for the `langchain-kuzu` package [on PyPI](https://pypi.org/project/langchain-kuzu/) that was recently published, allowing Kùzu users to more easily use and work with LangChain QA chains. The package will also make it easier for the Kùzu team to continue supporting and updating the integration over future releases. ### Twitter Handle Please tag [@kuzudb](https://x.com/kuzudb) on Twitter once this PR is merged, so LangChain users can be notified! --------- Co-authored-by: Erick Friis <erickfriis@gmail.com>	2025-01-08 01:14:00 +00:00
Erick Friis	cc0f81f40f	partners/groq: release 0.2.3 (#29081 )	2025-01-07 23:36:51 +00:00
Erick Friis	fcc9cdd100	multiple: disable socket for unit tests (#29080 )	2025-01-07 15:31:50 -08:00
Erick Friis	539ebd5431	groq: user agent (#29079 )	2025-01-07 23:21:57 +00:00
Erick Friis	c5bee0a544	pinecone: bump core version (#29077 )	2025-01-07 20:23:33 +00:00
Cory Waddingham	ce9e9f9314	pinecone: Review pinecone tests (#29073 ) Title: langchain-pinecone: improve test structure and async handling Description: This PR improves the test infrastructure for the langchain-pinecone package by: 1. Implementing LangChain's standard test patterns for embeddings 2. Adding comprehensive configuration testing 3. Improving async test coverage 4. Fixing integration test issues with namespaces and async markers The changes make the tests more robust, maintainable, and aligned with LangChain's testing standards while ensuring proper async behavior in the embeddings implementation. Key improvements: - Added standard EmbeddingsTests implementation - Split custom configuration tests into a separate test class - Added proper async test coverage with pytest-asyncio - Fixed namespace handling in vector store integration tests - Improved test organization and documentation Dependencies: None (uses existing test dependencies) Tests and Documentation: - ✅ Added standard test implementation following LangChain's patterns - ✅ Added comprehensive unit tests for configuration and async behavior - ✅ All tests passing locally - No documentation changes needed (internal test improvements only) Twitter handle: N/A --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-07 11:46:30 -08:00
Philippe PRADOS	2921597c71	community[patch]: Refactoring PDF loaders: 01 prepare (#29062 ) - Refactoring PDF loaders step 1: "community: Refactoring PDF loaders to standardize approaches" - Description: Declare CloudBlobLoader in __init__.py. file_path is Union[str, PurePath] everywhere - Twitter handle: pprados This is one part of a larger Pull Request (PR) that is too large to be submitted all at once. This part focuses on preparing the update of all parsers. For more details, see [PR 28970](https://github.com/langchain-ai/langchain/pull/28970). @eyurtsev it's the start of a PR series.	2025-01-07 11:00:04 -05:00
ccurme	55677e31f7	text-splitters[patch]: release 0.3.5 (#29054 ) Resolves https://github.com/langchain-ai/langchain/issues/29053	2025-01-07 09:48:26 -05:00
Erick Friis	187131c55c	Revert "integrations[patch]: remove non-required chat param defaults" (#29048 ) Reverts langchain-ai/langchain#26730 discuss best way to release default changes (esp openai temperature)	2025-01-06 14:45:34 -08:00
Bagatur	3d7ae8b5d2	integrations[patch]: remove non-required chat param defaults (#26730 ) anthropic: - max_retries openai: - n - temperature - max_retries fireworks - temperature groq - n - max_retries - temperature mistral - max_retries - timeout - max_concurrent_requests - temperature - top_p - safe_mode --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-06 22:26:22 +00:00
UV	b9db8e9921	DOC: Improve human input prompt in FewShotChatMessagePromptTemplate example (#29023 ) Fixes #29010 This PR updates the example for FewShotChatMessagePromptTemplate by modifying the human input prompt to include a more descriptive and user-friendly question format ('What is {input}?') instead of just '{input}'. This change enhances clarity and usability in the documentation example. Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-06 12:29:15 -08:00
ccurme	1f78d4faf4	voyageai[patch]: release 0.1.4 (#29046 )	2025-01-06 20:20:19 +00:00
Eugene Evstafiev	6a152ce245	docs: add langchain-pull-md Markdown loader (#29024 ) - [x] PR title: "docs: add langchain-pull-md Markdown loader" - [x] PR message: - Description: This PR introduces the `langchain-pull-md` package to the LangChain community. It includes a new document loader that utilizes the pull.md service to convert URLs into Markdown format, particularly useful for handling web pages rendered with JavaScript frameworks like React, Angular, or Vue.js. This loader helps in efficient and reliable Markdown conversion directly from URLs without local rendering, reducing server load. - Issue: NA - Dependencies: requests >=2.25.1 - Twitter handle: https://x.com/eugeneevstafev?s=21 - [x] Add tests and docs: 1. Added unit tests to verify URL checking and conversion functionalities. 2. Created a comprehensive example notebook detailing the usage of the new loader. - [x] Lint and test: - Completed local testing using `make format`, `make lint`, and `make test` commands as per the LangChain contribution guidelines. Related Links: - [Package Repository](https://github.com/chigwell/langchain-pull-md) - [PyPI Package](https://pypi.org/project/langchain-pull-md/) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-06 19:32:43 +00:00
Ashvin	20a715a103	community: Fix redundancy in code. (#29022 ) In my previous PR (#28953), I added an unwanted condition for validating the Azure ML Endpoint. In this PR, I have rectified the issue.	2025-01-06 12:58:16 -05:00
Adrián Panella	acddfc772e	core: allow artifact in create_retriever_tool (#28903 ) Add option to return content and artifacts, to also be able to access the full info of the retrieved documents. They are returned as a list of dicts in the `artifacts` property if parameter `response_format` is set to `"content_and_artifact"`. Defaults to `"content"` to keep current behavior. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-03 22:10:31 +00:00
ccurme	3e618b16cd	community[patch]: release 0.3.14 (#29019 )	2025-01-03 15:34:24 -05:00
ccurme	18eb9c249d	langchain[patch]: release 0.3.14 (#29018 )	2025-01-03 15:15:44 -05:00
ccurme	8e50e4288c	core[patch]: release 0.3.29 (#29017 )	2025-01-03 14:58:39 -05:00
ccurme	85403bfa99	core[patch]: substantially speed up @deprecated (#29016 ) Resolves https://github.com/langchain-ai/langchain/issues/26918 Unit tests don't raise any additional `LangChainDeprecationWarning`. Would like guidance on how to test this more thoroughly if needed. Note: speed up for `bind_tools` path is shown below. This is redundant with the speedup in https://github.com/langchain-ai/langchain/pull/29015. I include it for demonstration purposes. Before: ![Screenshot 2025-01-03 at 12 54 50 PM](https://github.com/user-attachments/assets/87f289eb-4cad-4304-85f7-5c58c59080f1) After: ![Screenshot 2025-01-03 at 12 55 35 PM](https://github.com/user-attachments/assets/95ad0506-e1d1-4c5c-bb27-6a634d8810c9)	2025-01-03 14:38:53 -05:00
ccurme	4bb391fd4e	core[patch]: remove deprecated functions from tool binding hotpath (#29015 ) (Inspired by https://github.com/langchain-ai/langchain/issues/26918) We rely on some deprecated public functions in the hot path for tool binding (`convert_pydantic_to_openai_function`, `convert_python_function_to_openai_function`, and `format_tool_to_openai_function`). My understanding is that what is deprecated is not the functionality they implement, but use of them in the public API -- we expect to continue to rely on them. Here we update these functions to be private and not deprecated. We keep the public, deprecated functions as simple wrappers that can be safely deleted. The `@deprecated` wrapper adds considerable latency due to its use of the `inspect` module. This update speeds up `bind_tools` by a factor of ~100x: Before: ![Screenshot 2025-01-03 at 11 22 55 AM](https://github.com/user-attachments/assets/94b1c433-ce12-406f-b64c-ca7103badfe0) After: ![Screenshot 2025-01-03 at 11 23 41 AM](https://github.com/user-attachments/assets/02d0deab-82e4-45ca-8cc7-a20b91a5b5db) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-03 19:29:01 +00:00
Eugene Evstafiev	a86904e735	docs: fix typo (#29012 ) - Description: a minor typo fix - Issue: NA - Dependencies: NA - Twitter handle: NA	2025-01-03 09:52:24 -08:00
Erick Friis	919d1c7da6	box: remove box readme for api docs build (#29014 )	2025-01-03 09:50:04 -08:00
Erick Friis	d8bc556c94	packages: update box location (#29013 )	2025-01-03 09:45:13 -08:00
Amaan	8d7daa59fb	docs: add langchain dappier retriever integration notebooks (#28931 ) Add a retriever to interact with Dappier APIs with an example notebook. The retriever can be invoked with: ```python from langchain_dappier import DappierRetriever retriever = DappierRetriever( data_model_id="dm_01jagy9nqaeer9hxx8z1sk1jx6", k=5 ) retriever.invoke("latest tech news") ``` To retrieve 5 documents related to latest news in the tech sector. The included notebook also includes deeper details about controlling filters such as selecting a data model, number of documents to return, site domain reference, minimum articles from the reference domain, and search algorithm, as well as including the retriever in a chain. The integration package can be found over here - https://github.com/DappierAI/langchain-dappier	2025-01-03 10:21:41 -05:00
ccurme	0185010b88	community[patch]: additional check for prompt caching support (#29008 ) Prompt caching explicitly excludes `gpt-4o-2024-05-13`: https://platform.openai.com/docs/guides/prompt-caching Resolves https://github.com/langchain-ai/langchain/issues/28997	2025-01-03 10:14:07 -05:00
Tari Yekorogha	ba9dfd9252	docs: Add FalkorDB Chat Message History and Update Package Registry (#28914 ) This commit updates the documentation and package registry for the FalkorDB Chat Message History integration. Changes: - Added a comprehensive example notebook falkordb_chat_message_history.ipynb demonstrating how to use FalkorDB for session-based chat message storage. - Added a provider notebook for FalkorDB - Updated libs/packages.yml to register FalkorDB as an integration package, following LangChain's new guidelines for community integrations. Notes: - This update aligns with LangChain's process for registering new integrations via documentation updates and package registry modifications. - No functional or core package changes were made in this commit. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-02 15:46:47 -05:00
Ashvin	d26c102a5a	community: Update azureml endpoint (#28953 ) - In this PR, I have updated the AzureML Endpoint with the latest endpoint. - Description: I have changed the existing `/chat/completions` to `/models/chat/completions` in libs/community/langchain_community/llms/azureml_endpoint.py - Issue: #25702 --------- Co-authored-by: = <=>	2025-01-02 14:47:02 -05:00
ccurme	7c28321f04	core[patch]: fix deprecation admonition in API ref (#28992 ) Before: ![Screenshot 2025-01-02 at 1 49 30 PM](https://github.com/user-attachments/assets/cb30526a-fc0b-439f-96d1-962c226d9dc7) After: ![Screenshot 2025-01-02 at 1 49 38 PM](https://github.com/user-attachments/assets/32c747ea-6391-4dec-b778-df457695d197)	2025-01-02 14:37:55 -05:00
Mohammad Mohtashim	0e74757b0a	(Community): `DuckDuckGoSearchAPIWrapper` backend changed from `api` to `auto` (#28961 ) - Description: `DuckDuckGoSearchAPIWrapper` default value for backend has been changed to avoid User Warning - Issue: #28957	2025-01-02 14:08:22 -05:00
Mohammad Mohtashim	aa551cbcee	(Core) Small Change in Docstring for method `partial` for `BasePromptTemplate` (#28969 ) - Description: Very small change in Docstring for `BasePromptTemplate` - Issue: #28966	2025-01-02 12:16:30 -05:00
minpeter	a873e0fbfb	community: update documentation and model IDs for FriendliAI provider (#28984 ) ### Description - In the example, remove `llama-2-13b-chat`, `mixtral-8x7b-instruct-v0-1`. - Fix llm friendli streaming implementation. - Update examples in documentation and remove duplicates. ### Issue N/A ### Dependencies None ### Twitter handle `@friendliai`	2025-01-02 12:15:59 -05:00
Hrishikesh Kalola	437ec53e29	langchain.agents: corrected documentation (#28986 ) Description: This PR updates the codebase to reflect the deprecation of the AgentType feature. It includes the following changes: Documentation Update: Added a deprecation notice to the AgentType class comment. Provided a reference to the official LangChain migration guide for transitioning to LangGraph agents. Reference Link: https://python.langchain.com/docs/how_to/migrate_agent/ Twitter handle: @hrrrriiiishhhhh --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-01-02 12:13:42 -05:00
Mohammad Mohtashim	49a26c1fca	(Community): Fix Keyword argument for `AzureAIDocumentIntelligenceParser` (#28959 ) - Description: Fix the `body` keyword argument for `AzureAIDocumentIntelligenceParser` - Issue: #28948	2025-01-02 11:27:12 -05:00
ccurme	efc687a13b	community[patch]: fix instantiation for Slack tools (#28990 ) Believe the current implementation raises PydanticUserError following [this](https://github.com/pydantic/pydantic/releases/tag/v2.10.1) Pydantic release. Resolves https://github.com/langchain-ai/langchain/issues/28989	2025-01-02 16:14:17 +00:00
Yunlin Mao	c59093d67f	docs: add modelscope endpoint (#28941 ) ## Description To integrate ModelScope inference API endpoints for Embeddings, LLMs, and ChatModels, install the package `langchain-modelscope-integration` (as discussed in issue #28928 ). This is necessary because the package name `langchain-modelscope` was already registered by another party. ModelScope is a premier platform designed to connect model checkpoints with model applications. It provides the necessary infrastructure to share open models and promote model-centric development. For more information, visit the GitHub page: [ModelScope](https://github.com/modelscope).	2025-01-02 10:08:41 -05:00
Bagatur	1c797ac68f	infra: speed up unit tests (#28974 ) Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-02 04:13:08 +00:00
Morgante Pell	79fc9b6b04	cli: bump gritql version (#28981 ) Description: bump gritql dependency, to use new binary names from [here](https://github.com/getgrit/gritql/pull/565) Issue: fixes https://github.com/langchain-ai/langchain/issues/27822	2025-01-01 20:02:46 -08:00
Bagatur	edbe7d5f5e	core,anthropic[patch]: fix with_structured_output typing (#28950 )	2024-12-28 15:46:51 -05:00
dabzr	ffbe5b2106	partners: fix default value for stop_sequences in ChatGroq (#28924 ) - Description: This PR addresses an issue with the `stop_sequences` field in the `ChatGroq` class. Currently, the field is defined as: ```python stop: Optional[Union[List[str], str]] = Field(None, alias="stop_sequences") ``` This causes the language server (LSP) to raise an error indicating that the `stop_sequences` parameter must be implemented. The issue occurs because `Field(None, alias="stop_sequences")` is different compared to `Field(default=None, alias="stop_sequences")`. ![image](https://github.com/user-attachments/assets/bfc34cb1-c664-4c31-b856-8f18419c7350) To resolve the issue, the field is updated to: ```python stop: Optional[Union[List[str], str]] = Field(default=None, alias="stop_sequences") ``` While this issue does not affect runtime behavior, it ensures compatibility with LSPs and improves the development experience. - Issue: N/A - Dependencies: None	2024-12-26 16:43:34 -05:00
Andy Wermke	5940ed3952	community: Fix error handling bug in ChatDeepInfra (#28918 ) In the async ClientResponse, `response.text` is not a string property, but an asynchronous function returning a string.	2024-12-26 14:45:12 -05:00
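The fix above rests on `response.text` being an awaitable method on an async client response rather than a string attribute. A minimal sketch of the difference follows; `FakeResponse` and `describe_error` are hypothetical stand-ins so the example runs without network access, and this is not the actual `ChatDeepInfra` code.

```python
import asyncio


class FakeResponse:
    """Hypothetical stand-in for an async HTTP response (illustration only)."""

    status = 500

    async def text(self) -> str:
        # Like the async ClientResponse described above, this must be awaited.
        return "internal error"


async def describe_error(response: FakeResponse) -> str:
    # Buggy variant: f"{response.text}" would format the bound method object,
    # not the response body.
    body = await response.text()  # correct: await the coroutine to get a str
    return f"HTTP {response.status}: {body}"


message = asyncio.run(describe_error(FakeResponse()))
print(message)
```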
zep.hyr	7b4d2d5d44	Community : Add cost information for missing OpenAI model (#28882 ) In the previous commit, the cached model key for this model was omitted. When using the "gpt-4o-2024-11-20" model, the token count in the callback appeared as 0, and the cost was recorded as 0. We add model and cost information so that the token count and cost can be displayed for the respective model. - The message before modification is as follows. ``` Tokens Used: 0 Prompt Tokens: 0 Prompt Tokens Cached: 0 Completion Tokens: 0 Reasoning Tokens: 0 Successful Requests: 0 Total Cost (USD): $0.0 ``` - The message after modification is as follows. ``` Tokens Used: 3783 Prompt Tokens: 3625 Prompt Tokens Cached: 2560 Completion Tokens: 158 Reasoning Tokens: 0 Successful Requests: 1 Total Cost (USD): $0.010642500000000001 ```	2024-12-26 14:28:31 -05:00
Erick Friis	3726a944c0	docs: sorted by downloads [wip] (#28869 )	2024-12-23 13:13:35 -08:00
Andreas Motl	6352edf77f	docs: CrateDB: Register package `langchain-cratedb`, and add minimal "provider" documentation (#28877 ) Hi Erick. Coming back from a previous attempt, we now made a separate package for the CrateDB adapter, called `langchain-cratedb`, as advised. Other than registering the package within `libs/packages.yml`, this patch includes a minimal amount of documentation to accompany the advent of this new package. Let us know about any mistakes we made, or changes you would like to see. Thanks, Andreas. ## About - Description: Register a new database adapter package, `langchain-cratedb`, providing traditional vector store, document loader, and chat message history features for a start. - Addressed to: @efriis, @eyurtsev - References: GH-27710 - Preview: [Providers » More » CrateDB](https://langchain-git-fork-crate-workbench-register-la-4bf945-langchain.vercel.app/docs/integrations/providers/cratedb/) ## Status - PyPI: https://pypi.org/project/langchain-cratedb/ - GitHub: https://github.com/crate/langchain-cratedb - Documentation (CrateDB): https://cratedb.com/docs/guide/integrate/langchain/ - Documentation (LangChain): _This PR._ ## Backlog? Is this applicable for this kind of patch? > - [ ] Add tests and docs: If you're adding a new integration, please include > 1. a test for the integration, preferably unit tests that do not rely on network access, > 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. ## Q&A 1. Notebooks that use the LangChain CrateDB adapter are currently at [CrateDB LangChain Examples](https://github.com/crate/cratedb-examples/tree/main/topic/machine-learning/llm-langchain), and the documentation refers to them. Because they are derived from very old blueprints coming from LangChain 0.0.x times, we guess they need a refresh before adding them to `docs/docs/integrations`. 
Is it applicable to merge this minimal package registration + documentation patch, which already includes valid code snippets in `cratedb.mdx`, and add corresponding notebooks on behalf of a subsequent patch later? 2. How would it work getting into the tabular list of _Integration Packages_ enumerated on the [documentation entrypoint page about Providers](https://python.langchain.com/docs/integrations/providers/)? /cc Please also review, @ckurze, @wierdvanderhaar, @kneth, @simonprickett, if you can find the time. Thanks!	2024-12-23 10:55:44 -05:00
Wang Ran (汪然)	e5c9da3eb6	core[patch]: remove redundant imports (#28861 ) `Graph` has been imported at Line: 62	2024-12-23 10:31:23 -05:00
Adrián Panella	8d9907088b	community(azuresearch): allow to use any valid credential (#28873 ) Add option to use any valid credential type. Differentiates async cases needed by Azure Search. This could replace the use of a static token	2024-12-23 10:05:48 -05:00
Mohammad Mohtashim	41b6a86bbe	Community: LlamaCppEmbeddings `embed_documents` and `embed_query` (#28827 ) - Description: `embed_documents` and `embed_query` were raising the error described in the issue. The `Llama` client returns the embeddings in a nested list, which the previous implementation did not account for, so the stated error was raised. - Issue: #28813 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-23 09:50:22 -05:00
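One way to tolerate a nested embedding result is to unwrap a single outer list before converting to floats. The sketch below assumes the nesting is exactly one extra level; `unwrap_embedding` is a hypothetical helper, not the function used in the PR.

```python
from typing import List, Union

Vector = List[float]


def unwrap_embedding(raw: Union[Vector, List[Vector]]) -> Vector:
    """Hypothetical helper: return a flat float vector from a flat or singly nested list."""
    if raw and isinstance(raw[0], list):
        # Assumption: the client wrapped the vector in one extra list.
        return [float(x) for x in raw[0]]
    return [float(x) for x in raw]


flat = unwrap_embedding([0.1, 0.2, 0.3])
nested = unwrap_embedding([[0.1, 0.2, 0.3]])
print(flat == nested)
```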
Darien Schettler	32917a0b98	Update dataframe.py (#28871 ) community: optimize DataFrame document loader Description: Simplify the `lazy_load` method in the DataFrame document loader by combining text extraction and metadata cleanup into a single operation. This makes the code more concise while maintaining the same functionality. Issue: N/A Dependencies: None Twitter handle: N/A	2024-12-22 19:16:16 -05:00
yeounhak	f38fc89f35	community: Corrected aload func to be asynchronous from webBaseLoader (#28337 ) - Description: The aload function, contrary to its name, is not an asynchronous function, so it cannot work concurrently with other asynchronous functions. - Issue: #28336 - Test: Done - Docs: [here](`e0a95e5646/docs/docs/integrations/document_loaders/web_base.ipynb (L201)`) - Lint: All checks passed --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-20 14:42:52 -05:00
Mohammad Mohtashim	8cf5f20bb5	`required` tool_choice added for ChatHuggingFace (#28851 ) - Description: HuggingFace Inference Client V3 now supports `required` as tool_choice which has been added. - Issue: #28842	2024-12-20 12:06:04 -05:00
Sylvain DEPARTE	fcba567a77	partners: allow to set Prefix in AIMessage (for MistralAI) (#28846 ) Description: Added ability to set `prefix` attribute to prevent error : ``` httpx.HTTPStatusError: Error response 400 while fetching https://api.mistral.ai/v1/chat/completions: {"object":"error","message":"Expected last role User or Tool (or Assistant with prefix True) for serving but got assistant","type":"invalid_request_error","param":null,"code":null} ``` Co-authored-by: Sylvain DEPARTE <sylvain.departe@wizbii.com>	2024-12-20 11:09:45 -05:00
Jacob Mansdorfer	6d81137325	community: adding langchain-predictionguard partner package documentation (#28832 ) - [x] PR title: "community: adding langchain-predictionguard partner package documentation" - [x] PR message: - Description: This PR adds documentation for the langchain-predictionguard package to main langchain repo, along with deprecating current Prediction Guard LLMs package. The LLMs package was previously broken, so I also updated it one final time to allow it to continue working from this point onward. . This enables users to chat with LLMs through the Prediction Guard ecosystem. - Package Links: - [PyPI](https://pypi.org/project/langchain-predictionguard/) - [Github Repo](https://www.github.com/predictionguard/langchain-predictionguard) - Issue: None - Dependencies: None - Twitter handle: [@predictionguard](https://x.com/predictionguard) - [x] Add tests and docs: All docs have been added for the partner package, and the current LLMs package test was updated to reflect changes. - [x] Lint and test: Linting tests are all passing. --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-20 10:51:44 -05:00
ccurme	f0e858b4e3	core[patch]: release 0.3.28 (#28837 )	2024-12-19 17:52:32 -05:00
ccurme	137d1e9564	langchain[patch]: fix test following update to langchain-openai (#28838 )	2024-12-19 22:39:48 +00:00
Emmanuel Leroy	c8db5a19ce	langchain_community.chat_models.oci_generative_ai: Fix a bug when using optional parameters in tools (#28829 ) When using tools with optional parameters, the parameter `type` is no longer available since the langchain update to 0.3 (because of the pydantic upgrade?) and there is now an `anyOf` field instead. This results in the `type` being `None` in the chat request for the tool parameter, and the LLM call fails with the error: ``` oci.exceptions.ServiceError: {'target_service': 'generative_ai_inference', 'status': 400, 'code': '400', 'opc-request-id': '...', 'message': 'Parameter definition must have a type.', 'operation_name': 'chat' ... } ``` Example code that fails: ``` from langchain_community.chat_models.oci_generative_ai import ChatOCIGenAI from langchain_core.tools import tool from typing import Optional llm = ChatOCIGenAI( model_id="cohere.command-r-plus", service_endpoint="https://inference.generativeai.us-chicago-1.oci.oraclecloud.com", compartment_id="ocid1.compartment.oc1...", auth_profile="your_profile", auth_type="API_KEY", model_kwargs={"temperature": 0, "max_tokens": 3000}, ) @tool def test(example: Optional[str] = None): """This is the tool to use to test things Args: example: example variable, defaults to None """ return "this is a test" llm_with_tools = llm.bind_tools([test]) result = llm_with_tools.invoke("can you make a test for g") ``` This PR sets the param type to `any` in that case, and fixes the problem. Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-19 22:17:34 +00:00
Bagatur	c3ccd93c12	patch openai json mode test (#28831 )	2024-12-19 21:43:32 +00:00
Bagatur	ce6748dbfe	xfail openai image token count test (#28828 )	2024-12-19 21:23:30 +00:00
Anusha Karkhanis	26bdf40072	Langchain_Community: SQL LanguageParser (#28430 ) ## Description (This PR has contributions from @khushiDesai, @ashvini8, and @ssumaiyaahmed). This PR addresses Issue #11229 which addresses the need for SQL support in document parsing. This is integrated into the generic TreeSitter parsing library, allowing LangChain users to easily load codebases in SQL into smaller, manageable "documents." This pull request adds a new ```SQLSegmenter``` class, which provides the SQL integration. ## Issue Issue #11229: Add support for a variety of languages to LanguageParser ## Testing We created a file ```test_sql.py``` with several tests to ensure the ```SQLSegmenter``` is functional. Below are the tests we added: - ```def test_is_valid```: Checks SQL validity. - ```def test_extract_functions_classes```: Extracts individual SQL statements. - ```def test_simplify_code```: Simplifies SQL code with comments. --------- Co-authored-by: Syeda Sumaiya Ahmed <114104419+ssumaiyaahmed@users.noreply.github.com> Co-authored-by: ashvini hunagund <97271381+ashvini8@users.noreply.github.com> Co-authored-by: Khushi Desai <khushi.desai@advantawitty.com> Co-authored-by: Khushi Desai <59741309+khushiDesai@users.noreply.github.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-19 20:30:57 +00:00
Bagatur	a7f2148061	openai[patch]: Release 0.2.14 (#28826 )	2024-12-19 11:56:44 -08:00
Bagatur	1378ddfa5f	openai[patch]: type reasoning_effort (#28825 )	2024-12-19 19:36:49 +00:00
Erick Friis	6a37899b39	core: dont mutate tool_kwargs during tool run (#28824 ) fixes https://github.com/langchain-ai/langchain/issues/24621	2024-12-19 18:11:56 +00:00
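The commit message gives no detail beyond the title, so the following is only a general illustration of the "do not mutate the caller's kwargs" pattern: copy the dict before removing or adding keys. `run_tool` and the `run_manager` key are hypothetical names chosen for the example, not taken from the fix itself.

```python
from typing import Any, Callable, Dict


def run_tool(func: Callable[..., Any], tool_kwargs: Dict[str, Any]) -> Any:
    """Hypothetical runner that strips an internal key without touching the caller's dict."""
    kwargs = dict(tool_kwargs)       # shallow copy; the caller's dict stays intact
    kwargs.pop("run_manager", None)  # hypothetical internal-only key
    return func(**kwargs)


def add(a: int, b: int) -> int:
    return a + b


caller_kwargs = {"a": 1, "b": 2, "run_manager": object()}
result = run_tool(add, caller_kwargs)
print(result, sorted(caller_kwargs))
```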
Qun	033ac41760	fix crash when using create_xml_agent with parameterless function as … (#26002 ) When using `create_xml_agent` or `create_json_chat_agent` to create an agent, and the function corresponding to the tool is a parameterless function, the `XMLAgentOutputParser` or `JSONAgentOutputParser` will parse the tool input into an empty string, and `BaseTool` will parse it into a positional argument. The program then crashes because a parameterless function is invoked with a positional argument. Specifically, the code below raises StopIteration in [_parse_input](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/tools/base.py#L419) ```python from langchain import hub from langchain.agents import AgentExecutor, create_json_chat_agent, create_xml_agent from langchain_openai import ChatOpenAI prompt = hub.pull("hwchase17/react-chat-json") llm = ChatOpenAI() # agent = create_xml_agent(llm, tools, prompt) agent = create_json_chat_agent(llm, tools, prompt) agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True) agent_executor.invoke(......) ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-19 13:00:46 -05:00
Luke	f69695069d	text_splitters: Add HTMLSemanticPreservingSplitter (#25911 ) Description: Current HTML splitters rely on secondary use of the `RecursiveCharacterSplitter` to further chunk the document into manageable chunks. The issue with this is that it fails to maintain important structures such as tables, lists, etc. within HTML. This implementation of an HTML splitter allows the user to define a maximum chunk size, HTML elements to preserve in full, options to preserve `<a>` href links in the output, and custom handlers. The core splitting begins with headers, similar to `HTMLHeaderSplitter`. If these sections exceed the length of the `max_chunk_size`, further recursive splitting is triggered. During this splitting, elements listed for preservation are excluded from the splitting process. This can cause chunks to be slightly larger than the max size, depending on preserved length. However, all contextual relevance of the preserved item remains intact. Custom Handlers: Sometimes, companies such as Atlassian have custom HTML elements that are not parsed by default with `BeautifulSoup`. Custom handlers allow a user to provide a function to be run whenever a specific html tag is encountered. This allows the user to preserve and gather information within custom html tags that `bs4` will potentially miss during extraction. Dependencies: User will need to install `bs4` in their project to utilise this class. I have also added in `how_to` and unit tests, which require `bs4` to run, otherwise they will be skipped. Flowchart of process: ![HTMLSemanticPreservingSplitter](https://github.com/user-attachments/assets/20873c36-22ed-4c80-884b-d3c6f433f5a7) --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-19 12:09:22 -05:00
Tommaso De Lorenzo	24bfa062bf	langchain: add support for Google Anthropic Vertex AI model garden provider in init_chat_model (#28177 ) Simple modification to add support for anthropic models deployed in Google Vertex AI model garden in `init_chat_model` importing `ChatAnthropicVertex` - [v] Lint and test	2024-12-19 12:06:21 -05:00
Erick Friis	ff7b01af88	anthropic: less pydantic for client (#28823 )	2024-12-19 08:00:02 -08:00
Erick Friis	f1d783748a	anthropic: sdk bump (#28820 )	2024-12-19 15:39:21 +00:00
Erick Friis	907f36a6e9	fireworks: fix lint (#28821 )	2024-12-19 15:36:36 +00:00
Erick Friis	6526db4871	community: bump core (#28819 )	2024-12-19 06:41:53 -08:00
Vignesh A	4c9acdfbf1	Community : Add OpenAI prompt caching and reasoning tokens tracking (#27135 ) Added Token tracking for OpenAI's prompt caching and reasoning tokens Costs updated from https://openai.com/api/pricing/ usage example ```python from langchain_community.callbacks import get_openai_callback from langchain_openai import ChatOpenAI llm = ChatOpenAI(model_name="o1-mini",temperature=1) with get_openai_callback() as cb: response = llm.invoke("hi "*1500) print(cb) ``` Output ``` Tokens Used: 1720 Prompt Tokens: 1508 Prompt Tokens Cached: 1408 Completion Tokens: 212 Reasoning Tokens: 192 Successful Requests: 1 Total Cost (USD): $0.0049559999999999995 ``` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-19 09:31:13 -05:00
ScriptShi	97f1e1d39f	community: tablestore vector store check the dimension of the embedding when writing it to store. (#28812 ) Added some restrictions to a vectorstore I released in the community before.	2024-12-19 09:30:43 -05:00
Wang Ran (汪然)	f48755d35b	core: typo `Utilities for tests.` -> `Utilities for pydantic.` (#28814 ) Description: typo	2024-12-19 09:26:17 -05:00
Wang Ran (汪然)	51b8ddaf10	core: typo in runnable (#28815 ) Description: Typo	2024-12-19 09:25:57 -05:00
Erick Friis	3b036a1cf2	partners/fireworks: release 0.2.6 (#28805 )	2024-12-18 22:48:35 +00:00
Erick Friis	4eb8bf7793	partners/anthropic: release 0.3.1 (#28801 )	2024-12-18 22:45:38 +00:00
Lu Peng	50afa7c4e7	community: add new parameter default_headers (#28700 ) - [x] PR message: - Description: 1. Add new parameter `default_headers` for OCI model deployments and OCI chat model deployments. 2. Update the `k` parameter in the `OCIModelDeploymentLLM` class. - [x] Add tests and docs: 1. unit tests 2. notebook --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-18 22:33:23 +00:00
Erick Friis	cc616de509	partners/xai: release 0.1.1 (#28806 )	2024-12-18 22:15:24 +00:00
Erick Friis	ba8c1b0d8c	partners/groq: release 0.2.2 (#28804 )	2024-12-18 22:12:02 +00:00
Erick Friis	a119cae5bd	partners/mistralai: release 0.2.4 (#28803 )	2024-12-18 22:11:48 +00:00
Erick Friis	514d78516b	partners/ollama: release 0.2.2 (#28802 )	2024-12-18 22:11:08 +00:00
Bagatur	68940dd0d6	openai[patch]: Release 0.2.13 (#28800 )	2024-12-18 22:08:47 +00:00
Erick Friis	4dc28b43ac	community: release 0.3.13 (#28798 )	2024-12-18 21:58:46 +00:00
Bagatur	557f63c2e6	core[patch]: Release 0.3.27 (#28799 )	2024-12-18 21:58:03 +00:00
Bagatur	4a531437bb	core[patch], openai[patch]: Handle OpenAI developer msg (#28794 ) - Convert developer openai messages to SystemMessage - store additional_kwargs={"__openai_role__": "developer"} so that the correct role can be reconstructed if needed - update ChatOpenAI to read in openai_role --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-18 21:54:07 +00:00
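The round trip described above (developer message in, system message stored, developer role restored on the way out) can be sketched with plain dicts. `to_internal` and `to_openai` are hypothetical helpers for illustration, and the dict shapes are simplified; only the `__openai_role__` key comes from the commit message.

```python
from typing import Any, Dict


def to_internal(message: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical: map an OpenAI-style message to a simplified internal form."""
    if message["role"] == "developer":
        # Store as a system message, but remember the original role.
        return {
            "type": "system",
            "content": message["content"],
            "additional_kwargs": {"__openai_role__": "developer"},
        }
    return {"type": message["role"], "content": message["content"], "additional_kwargs": {}}


def to_openai(message: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical: rebuild the OpenAI-style message, restoring a remembered role."""
    role = message["additional_kwargs"].get("__openai_role__", message["type"])
    return {"role": role, "content": message["content"]}


original = {"role": "developer", "content": "Answer tersely."}
internal = to_internal(original)
restored = to_openai(internal)
print(internal["type"], restored["role"])
```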
Erick Friis	079f1d93ab	langchain: release 0.3.13 (#28797 )	2024-12-18 12:32:00 -08:00
Yuxin Chen	3256b5d6ae	text-splitters: fix state persistence issue in ExperimentalMarkdownSyntaxTextSplitter (#28373 ) - Description: This PR resolves an issue with the `ExperimentalMarkdownSyntaxTextSplitter` class, which retains the internal state across multiple calls to the `split_text` method. This behaviour caused an unintended accumulation of chunks in `self` variables, leading to incorrect outputs when processing multiple Markdown files sequentially. - Modified `libs\text-splitters\langchain_text_splitters\markdown.py` to reset the relevant internal attributes at the start of each `split_text` invocation. This ensures each call processes the input independently. - Added unit tests in `libs\text-splitters\tests\unit_tests\test_text_splitters.py` to verify the fix and ensure the state does not persist across calls. - Issue: Fixes [#26440](https://github.com/langchain-ai/langchain/issues/26440). - Dependencies: No additional dependencies are introduced with this change. - [x] Unit tests were added to verify the changes. - [x] Updated documentation where necessary. - [x] Ran `make format`, `make lint`, and `make test` to ensure compliance with project standards. --------- Co-authored-by: Angel Chen <angelchen396@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-18 20:27:59 +00:00
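The state-reset pattern described above can be shown on a toy class. `ToySplitter` is a hypothetical stand-in, not `ExperimentalMarkdownSyntaxTextSplitter`; it only demonstrates why per-call state must be cleared at the start of `split_text`.

```python
from typing import List


class ToySplitter:
    """Hypothetical splitter that keeps its chunks on the instance."""

    def __init__(self) -> None:
        self.chunks: List[str] = []

    def split_text(self, text: str) -> List[str]:
        # Reset per-call state first; without this line, chunks from earlier
        # calls would accumulate and leak into later results.
        self.chunks = []
        for block in text.split("\n\n"):
            block = block.strip()
            if block:
                self.chunks.append(block)
        return self.chunks


splitter = ToySplitter()
first = splitter.split_text("# A\n\nalpha")
second = splitter.split_text("# B")
print(first, second)
```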
Mohammad Mohtashim	7c8f977695	Community: Fix `with_structured_output` for `ChatSambaNovaCloud` (#28796 ) - Description: `kwargs` was being checked against `None`, which prevented the rest of the code in `with_structured_output` from being executed. The check has been fixed in this PR. - Issue: #28776	2024-12-18 14:35:06 -05:00
V.Prasanna kumar	684b146b18	Fixed adding float values into DynamoDB (#26562 ) - [x] PR title: Add float Message into Dynamo DB - community - Example: "community: Chat Message History" - [x] PR message: - Description: Pushing float values into DynamoDB raised an error; this is solved by converting them to `str`. - Issue: Float values are not getting pushed - Twitter handle: VpkPrasanna I have added a utility function for the str conversion; let me know where to place it, and I am happy to make a commit. This PR comes from the discussion in #26543 @hwchase17 @baskaryan @efriis --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-18 13:45:00 -05:00
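A float-to-string conversion of the kind described above could look like the recursive helper below. `floats_to_str` is a hypothetical name, and where the PR's own utility function lives is not stated in the message; this is a sketch of the idea only.

```python
from typing import Any


def floats_to_str(value: Any) -> Any:
    """Hypothetical helper: recursively replace float values with their str form."""
    if isinstance(value, float):
        return str(value)
    if isinstance(value, dict):
        return {key: floats_to_str(item) for key, item in value.items()}
    if isinstance(value, list):
        return [floats_to_str(item) for item in value]
    return value  # ints, strings, None, etc. pass through unchanged


message = {"content": "hi", "score": 0.5, "usage": {"latency": 1.25, "tokens": 3}}
converted = floats_to_str(message)
print(converted)
```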
William FH	50ea1c3ea3	[Core] respect tracing project name cvar (#28792 )	2024-12-18 10:02:02 -08:00
Martin Triska	e6b41d081d	community: DocumentLoaderAsParser wrapper (#27749 ) ## Description This pull request introduces the `DocumentLoaderAsParser` class, which acts as an adapter to transform document loaders into parsers within the LangChain framework. The class enables document loaders that accept a `file_path` parameter to be utilized as blob parsers. This is particularly useful for integrating various document loading capabilities seamlessly into the LangChain ecosystem. When merged together with PR https://github.com/langchain-ai/langchain/pull/27716, it opens options for `SharePointLoader` / `OneDriveLoader` to process any filetype that has a document loader. ### Features - Flexible Parsing: The `DocumentLoaderAsParser` class can adapt any document loader that meets the criteria of accepting a `file_path` argument, allowing for lazy parsing of documents. - Compatibility: The class has been designed to work with various document loaders, making it versatile for different use cases. ### Usage Example To use the `DocumentLoaderAsParser`, you would initialize it with a suitable document loader class and any required parameters. Here’s an example of how to do this with the `UnstructuredExcelLoader`: ```python from langchain_community.document_loaders.blob_loaders import Blob from langchain_community.document_loaders.parsers.documentloader_adapter import DocumentLoaderAsParser from langchain_community.document_loaders.excel import UnstructuredExcelLoader # Initialize the parser adapter with UnstructuredExcelLoader xlsx_parser = DocumentLoaderAsParser(UnstructuredExcelLoader, mode="paged") # Use parser, for ex. pass it to MimeTypeBasedParser MimeTypeBasedParser( handlers={ "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet": xlsx_parser } ) ``` - Dependencies: None - Twitter handle: @martintriska1 
--------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-18 12:47:08 -05:00
Erick Friis	9b024d00c9	text-splitters: release 0.3.4 (#28795 )	2024-12-18 09:44:36 -08:00
Erick Friis	5cf965004c	core: release 0.3.26 (#28793 )	2024-12-18 17:28:42 +00:00
Mohammad Mohtashim	d49df4871d	[Community]: Image Extraction Fixed for `PDFPlumberParser` (#28491 ) - Description: One-bit images were raising an error in `PDFPlumberParser`; this PR fixes that. - Issue: #28480 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-18 11:45:48 -05:00
binhnd102	f723a8456e	Fixes: community: fix LanceDB return no metadata (#27024 ) - [ x ] Fix the case where LanceDB returns a table without a metadata column - Description: Check the table schema; if it has no metadata column, initialize the Document with the metadata argument set to an empty dict - Issue: https://github.com/langchain-ai/langchain/issues/27005 - [ x ] Add tests and docs --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-18 15:21:28 +00:00
ANSARI MD AAQIB AHMED	91d28ef453	Add langchain-yt-dlp Document Loader Documentation (#28775 ) ## Overview This PR adds documentation for the `langchain-yt-dlp` package, a YouTube document loader that uses `yt-dlp` for YouTube video metadata extraction. ## Changes - Added documentation notebook for YoutubeLoader - Updated packages.yml to include langchain-yt-dlp ## Motivation The existing LangChain YoutubeLoader was unable to fetch YouTube metadata due to changes in YouTube's structure. This package resolves those issues by leveraging the `yt-dlp` library. ## Features - Reliable YouTube metadata extraction ## Related - Package Repository: https://github.com/aqib0770/langchain-yt-dlp - PyPI Package: https://pypi.org/project/langchain-yt-dlp/ --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-18 10:16:50 -05:00
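The schema check described above amounts to falling back to an empty dict when the table has no metadata column. The sketch below uses plain dicts in place of LanceDB rows and LangChain `Document` objects; `row_to_document` and the column names are assumptions made for the example.

```python
from typing import Any, Dict, List


def row_to_document(row: Dict[str, Any], columns: List[str]) -> Dict[str, Any]:
    """Hypothetical: build a document-like dict, tolerating a missing metadata column."""
    # Only read the metadata field when the table schema actually has it.
    metadata = row["metadata"] if "metadata" in columns else {}
    return {"page_content": row["text"], "metadata": metadata}


with_meta = row_to_document({"text": "a", "metadata": {"src": "x"}}, ["text", "metadata"])
without_meta = row_to_document({"text": "b"}, ["text"])
print(with_meta, without_meta)
```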
GITHUBear	33b1fb95b8	partners: langchain-oceanbase Integration (#28782 ) Hi, langchain team! I'm a maintainer of [OceanBase](https://github.com/oceanbase/oceanbase). With the integration guidance, I create a python lib named [langchain-oceanbase](https://github.com/oceanbase/langchain-oceanbase) to integrate `Oceanbase Vector Store` with `Langchain`. So I'd like to add the required docs. I will appreciate your feedback. Thank you! --------- Signed-off-by: shanhaikang.shk <shanhaikang.shk@oceanbase.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-18 14:51:49 +00:00
Rave Harpaz	986b752fc8	Add OCI Generative AI new model and structured output support (#28754 ) - [X] PR title: community: Add new model and structured output support - [X] PR message: - Description: add support for meta llama 3.2 image handling, and JSON mode for structured output - Issue: NA - Dependencies: NA - Twitter handle: NA - [x] Add tests and docs: 1. we have updated our unit tests, 2. no changes required for documentation. - [x] Lint and test: `make format`, `make lint` and `make test` were run successfully --------- Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-18 09:50:25 -05:00
David Pryce-Compson	ef24220d3f	community: adding haiku 3.5 and opus callbacks (#28783 ) Description: Adding new AWS Bedrock models and their respective costs to match https://aws.amazon.com/bedrock/pricing/ for the Bedrock callback Issue: Missing models for those that wish to try them out Dependencies: Nothing added Twitter handle: @David_Pryce and / or @JamfSoftware	2024-12-18 09:45:10 -05:00
Yudai Kotani	05a44797ee	langchain_community: Add default None values to DocumentAttributeValue class properties (#28785 ) Description: This PR addresses an issue where the DocumentAttributeValue class properties did not have default values of None. By explicitly setting the Optional attributes (DateValue, LongValue, StringListValue, and StringValue) to default to None, this change ensures the class functions as expected when no value is provided for these attributes. Changes Made: Added default None values to the following properties of the DocumentAttributeValue class: DateValue LongValue StringListValue StringValue Removed the invalid argument extra="allow" from the BaseModel inheritance. Dependencies: None. Twitter handle (optional): @__korikori1021 Checklist - [x] Verified that KendraRetriever works as expected after the changes. Co-authored-by: y1u0d2a1i <y.kotani@raksul.com>	2024-12-18 09:43:04 -05:00
Satyam Kumar	90f7713399	refactor: improve docstring parsing logic for Google style (#28730 ) Description: Improved the `_parse_google_docstring` function in `langchain/core` to support parsing multi-paragraph descriptions before the `Args:` section while maintaining compliance with Google-style docstring guidelines. This change ensures better handling of docstrings with detailed function descriptions. Issue: Fixes #28628 Dependencies: None. Twitter handle: @isatyamks --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-18 09:35:19 -05:00
Dong Shin	0b1359801e	community: add trust_env at web_base_loader (#28514 ) - Description: I am working to address a similar issue to the one mentioned in https://github.com/langchain-ai/langchain/pull/19499. Specifically, there is a problem with the Webbase loader used in open-webui, where it fails to load the proxy configuration. This PR aims to resolve that issue. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-17 21:18:16 -05:00
Erick Friis	be738aa7de	packages: enable vertex api build (#28773 )	2024-12-17 11:31:14 -08:00
Bagatur	ac278cbe8b	core[patch]: export InjectedToolCallId (#28772 )	2024-12-17 19:29:20 +00:00
Bagatur	e4d3ccf62f	json mode standard test (#25497 ) Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-17 18:47:34 +00:00
Frank Dai	e81433497b	community: support Confluence cookies (#28760 ) Description: Some Confluence instances don't support personal access tokens; in that case, cookies are a convenient way to authenticate. This PR adds support for Confluence cookies. Twitter handle: soulmachine	2024-12-17 12:16:36 -05:00
ccurme	b745281eec	anthropic[patch]: increase timeouts for integration tests (#28767 ) Some tests consistently ran into the 10s limit in CI.	2024-12-17 15:47:17 +00:00
Vinit Kudva	a00258ec12	chroma: fix persistence if client_settings is passed in (#25199 ) …ent path given. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-17 10:03:02 -05:00
Omri Eliyahu Levy	f8883a1321	partners/voyageai: enable setting output dimension (#28740 ) Voyage has introduced voyage-3-large and voyage-code-3, which feature different output dimensions by leveraging a technique called "Matryoshka Embeddings" (see blog - https://blog.voyageai.com/2024/12/04/voyage-code-3/). These two models are available in various sizes: [256, 512, 1024, 2048] (https://docs.voyageai.com/docs/embeddings#model-choices). This PR adds the option to set the required output dimension.	2024-12-17 10:02:00 -05:00
German Martin	3a1d05394d	community: Apache AGE wrapper. Ensure Node Uniqueness by ID. (#28759 ) Description: The Apache AGE graph integration incorrectly handled node merging, allowing duplicate nodes with different IDs but the same type and other properties. Unlike [Neo4j](`cdf6202156/libs/community/langchain_community/graphs/neo4j_graph.py (L47)`), [Memgraph](`cdf6202156/libs/community/langchain_community/graphs/memgraph_graph.py (L50)`), [Kuzu](`cdf6202156/libs/community/langchain_community/graphs/kuzu_graph.py (L253)`), and [Gremlin](`cdf6202156/libs/community/langchain_community/graphs/gremlin_graph.py (L165)`), it did not use the node ID as the primary identifier for merging. This inconsistency caused data integrity issues and unexpected behavior when users expected updates to specific nodes by ID. Solution: This PR modifies the `node_insert_query` to `MERGE` nodes based on label and ID only and updates properties with `SET`, aligning the behavior with other graph database integrations. The `_format_properties` method was also modified to handle id overrides. Impact: This fix ensures data integrity by preventing duplicate nodes, and provides a consistent behavior across graph database integrations.	2024-12-17 09:21:59 -05:00
gsa9989	cdf6202156	cosmosdbnosql: Added Cosmos DB NoSQL Semantic Cache Integration with tests and jupyter notebook (#24424 ) * Added Cosmos DB NoSQL Semantic Cache Integration with tests and jupyter notebook --------- Co-authored-by: Aayush Kataria <aayushkataria3011@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 21:57:05 -05:00
Brian Burgin	27a9056725	community: Fix ChatLiteLLMRouter runtime issues (#28163 ) Description: Fix ChatLiteLLMRouter ctor validation and model_name parameter Issue: #19356, #27455, #28077 Twitter handle: @bburgin_0	2024-12-16 18:17:39 -05:00
Mikhail Khludnev	00deacc67e	docs, external: introduce `langchain-localai` (#28751 ) Referring to https://github.com/mkhludnev/langchain-localai --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 22:22:37 +00:00
Erick Friis	d4b5e7ef22	community: recommend RedisVectorStore over Redis (#28749 )	2024-12-16 21:08:30 +00:00
Hiros	8f5e72de05	community: Correctly handle multi-element rich text (#25762 ) Description: - Add _concatenate_rich_text method to combine all elements in rich text arrays - Update load_page method to use _concatenate_rich_text for rich text properties - Ensure all text content is captured, including inline code and formatted text - Add unit tests to verify correct handling of multi-element rich text This fix prevents truncation of content after backticks or other formatting elements. Issue: With the Notion DB Loader, the text for `richtext` and `title` was truncated after the first element, because the loader only read the first element. Dependencies: None. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 20:20:27 +00:00
Antonio Lanza	b2102b8cc4	text-splitters: Inconsistent results with `NLTKTextSplitter`'s `add_start_index=True` (#27782 ) This PR closes #27781 # Problem The current implementation of `NLTKTextSplitter` is using `sent_tokenize`. However, this `sent_tokenize` doesn't handle chars between 2 tokenized sentences... hence, this behavior throws errors when we are using `add_start_index=True`, as described in issue #27781. In particular: ```python from nltk.tokenize import sent_tokenize output1 = sent_tokenize("Innovation drives our success. Collaboration fosters creative solutions. Efficiency enhances data management.", language="english") print(output1) output2 = sent_tokenize("Innovation drives our success. Collaboration fosters creative solutions. Efficiency enhances data management.", language="english") print(output2) >>> ['Innovation drives our success.', 'Collaboration fosters creative solutions.', 'Efficiency enhances data management.'] >>> ['Innovation drives our success.', 'Collaboration fosters creative solutions.', 'Efficiency enhances data management.'] ``` # Solution With this new `use_span_tokenize` parameter, we can use NLTK to create sentences (with `span_tokenize`), but also add extra chars to be sure that we still can map the chunks to the original text. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Erick Friis <erickfriis@gmail.com>	2024-12-16 19:53:15 +00:00
Tari Yekorogha	d262d41cc0	community: added FalkorDB vector store support i.e implementation, test, docs an… (#26245 ) Description: Added support for FalkorDB Vector Store, including its implementation, unit tests, documentation, and an example notebook. The FalkorDB integration allows users to efficiently manage and query embeddings in a vector database, with relevance scoring and maximal marginal relevance search. The following components were implemented: - Core implementation for FalkorDBVector store. - Unit tests ensuring proper functionality and edge case coverage. - Example notebook demonstrating an end-to-end setup, search, and retrieval using FalkorDB. Twitter handle: @tariyekorogha --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 19:37:55 +00:00
Aaron Pham	12fced13f4	chore(community): update to OpenLLM 0.6 (#24609 ) Update to OpenLLM 0.6, in which we decided to make use of OpenLLM's OpenAI-compatible endpoint. Thus, OpenLLM now becomes a thin wrapper around the OpenAI wrapper. Signed-off-by: Aaron Pham <contact@aarnphm.xyz> --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-16 14:30:07 -05:00
Lvlvko	5c17a4ace9	community: support Hunyuan Embedding (#23160 ) ## description - I refactored `Chathunyuan` to use the tencentcloud sdk because the original did not work in my application - I added `HunyuanEmbeddings` using the tencentcloud sdk - Both extend the base classes of langchain. I have fully tested them in my application ## Dependencies - tencentcloud-sdk-python --------- Co-authored-by: centonhuang <centonhuang@tencent.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 19:27:19 +00:00
Harrison Chase	de7996c2ca	core: add kwargs support to VectorStore (#25934 ) has been missing the passthrough until now --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 18:57:57 +00:00
Lorenzo	b79a1156ed	community: correct return type of get_files_from_directory in github tool (#27885 ) ### About: - Description: the _get_files_from_directory_ method returns a string, but it is used in other methods that expect a List[str] - Issue: None - Dependencies: None This pull request adds a new method _list_files_ with the old logic of _get_files_from_directory_, but it returns a List[str] at the end. The behavior of _get_files_from_directory_ is not changed.	2024-12-16 10:30:33 -08:00
Sheepsta300	580a8d53f9	community: Add configurable `VisualFeatures` to the `AzureAiServicesImageAnalysisTool` (#27444 ) Description: The `AzureAiServicesImageAnalysisTool` uses the Azure AI Vision package under the hood. Since the tool was created, new `VisualFeatures` have been added that let the user request other image-specific information. Currently, the tool neither lets users configure which features are returned nor offers any of the newer feature types. This PR addresses that and exposes more of the Azure service in this integration. Dependencies: no new dependencies in the main class file; azure.ai.vision.imageanalysis added to the extra test dependencies file. Tests: Although no tests exist for the already implemented Azure Service tools, I've created 3 unit tests for this class that cover initialisation and credentials, local file analysis, and the new features option. Lint: All linting has passed. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 18:30:04 +00:00
Erick Friis	1c120e9615	core: xml output parser tags docstring (#28745 )	2024-12-16 18:25:16 +00:00

6434 Commits
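
The `NLTKTextSplitter` commit above (b2102b8cc4) turns on the difference between tokenizing into sentence strings and tokenizing into character spans. The sketch below illustrates that idea only. It is not the code from the PR: `sentence_spans` is a hypothetical regex-based stand-in for a span tokenizer, the sample text with irregular whitespace is made up for illustration, and no NLTK call is used.

```python
import re
from typing import List, Tuple


def sentence_spans(text: str) -> List[Tuple[int, int]]:
    """Hypothetical stand-in for a span tokenizer.

    Returns (start, end) character offsets of each sentence. A sentence
    here is simply a run of text ending in '.', '!' or '?'.
    """
    pattern = r"[^.!?\s][^.!?]*[.!?]"
    return [(m.start(), m.end()) for m in re.finditer(pattern, text)]


def split_with_start_index(text: str) -> List[Tuple[int, str]]:
    """Return (start_index, sentence) pairs that map back to the original text."""
    return [(start, text[start:end]) for start, end in sentence_spans(text)]


# Assumed sample input: sentences separated by irregular whitespace.
text = (
    "Innovation drives our success.   "
    "Collaboration fosters creative solutions.\n"
    "Efficiency enhances data management."
)

chunks = split_with_start_index(text)

# Span-based offsets always point at the sentence in the original text.
for start, sentence in chunks:
    assert text[start:start + len(sentence)] == sentence

# A naive offset that assumes one space between sentences drifts as soon
# as the separator is anything else.
naive_second_start = len(chunks[0][1]) + 1
print(naive_second_start, chunks[1][0])
```

With sentence strings alone, the separator characters are lost, so a start index has to be guessed. With spans, the offsets come straight from the original text and the separators can be recovered from the gaps between consecutive spans.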