langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-11 22:04:37 +00:00

Author	SHA1	Message	Date
Eugene Yurtsev	21ab1dc675	Merge branch 'master' of github.com:xzq-xu/langchain into xzq-xu/master	2025-03-28 13:56:49 -04:00
Eugene Yurtsev	22cee5d983	x	2025-03-28 13:56:10 -04:00
Eugene Yurtsev	a14d8b103b	Merge branch 'master' into master	2025-03-28 13:53:58 -04:00
Eugene Yurtsev	6d22f40a0b	x	2025-03-28 13:51:06 -04:00
Philippe PRADOS	92189c8b31	community[patch]: Handle gray scale images in ImageBlobParser (Fixes 30261 and 29586) (#30493 ) Fix [29586](https://github.com/langchain-ai/langchain/issues/29586) and [30261](https://github.com/langchain-ai/langchain/pull/30261)	2025-03-28 10:15:40 -04:00
Philippe Prados	cefe702216	Fix dependencies, after this [PR](https://github.com/jsvine/pdfplumber/pull/1285/files )	2025-03-28 08:45:43 +01:00
小豆豆学长	1f0686db80	community: add netmind integration (#30149 ) Co-authored-by: yanrujing <rujing.yan@protagonist-ai.com> Co-authored-by: ccurme <chester.curme@gmail.com>	2025-03-27 15:27:04 -04:00
Kyungho Byoun	e6b6c07395	community: add HANA dialect to SQLDatabase (#30475 ) This PR adds support for the HANA dialect to SQLDatabase, a wrapper class for SQLAlchemy. Currently, the schema name cannot be set when using HANA DB with LangChain, and no message is shown to the user, which makes it hard to figure out why the SQL does not work as expected. Reference document for setting the session schema in HANA DB: - [SET SCHEMA Statement (Session Management)](https://help.sap.com/docs/SAP_HANA_PLATFORM/4fe29514fd584807ac9f2a04f6754767/20fd550375191014b886a338afb4cd5f.html)	2025-03-27 15:19:50 -04:00
Eugene Yurtsev	1cf91a2386	docs: fix llms-txt (#30528 ) * Fix trailing slashes * Fix chat model integration links	2025-03-27 19:02:44 +00:00
Christophe Bornet	e181d43214	core: Bump ruff version to 0.11 (#30519 ) Changes are from the new TC006 rule: https://docs.astral.sh/ruff/rules/runtime-cast-value/ TC006 is auto-fixed.	2025-03-27 13:01:49 -04:00
ccurme	59908f04d4	fireworks: release 0.2.9 (#30527 )	2025-03-27 16:04:20 +00:00
ccurme	05482877be	mistralai: release 0.2.10 (#30526 )	2025-03-27 16:01:40 +00:00
Andras L Ferenczi	63673b765b	Fix: Enable max_retries Parameter in ChatMistralAI Class (#30448 ) partners: Enable max_retries in ChatMistralAI Description - This pull request reactivates the retry logic in the completion_with_retry method of the ChatMistralAI class, restoring the intended functionality of the previously ineffective max_retries parameter. New unit test that mocks failed/successful retry calls and an integration test to confirm end-to-end functionality. Issue - Closes #30362 Dependencies - No additional dependencies required Co-authored-by: andrasfe <andrasf94@gmail.com>	2025-03-27 11:53:44 -04:00
Lakindu Boteju	3aa080c2a8	Fix typos in pdfminer and pymupdf documentations (#30513 ) This pull request includes fixes in documentation for PDF loaders to correct the names of the loaders and the required installations. The most important changes include updating the loader names and installation instructions in the Jupyter notebooks. Documentation fixes: * [`docs/docs/integrations/document_loaders/pdfminer.ipynb`](diffhunk://#diff-a4a0561cd4a6e876ea34b7182de64a452060b921bb32d37b02e6a7980a41729bL34-R34): Changed references from `PyMuPDFLoader` to `PDFMinerLoader` and updated the installation instructions to replace `pymupdf` with `pdfminer`. [[1]](diffhunk://#diff-a4a0561cd4a6e876ea34b7182de64a452060b921bb32d37b02e6a7980a41729bL34-R34) [[2]](diffhunk://#diff-a4a0561cd4a6e876ea34b7182de64a452060b921bb32d37b02e6a7980a41729bL63-R63) [[3]](diffhunk://#diff-a4a0561cd4a6e876ea34b7182de64a452060b921bb32d37b02e6a7980a41729bL330-R330) * [`docs/docs/integrations/document_loaders/pymupdf.ipynb`](diffhunk://#diff-8487995f457e33daa2a08fdcff3b42e144eca069eeadfad5651c7c08cce7a5cdL292-R292): Corrected the loader name from `PDFPlumberLoader` to `PyMuPDFLoader`.	2025-03-27 11:29:11 -04:00
Miguel Grinberg	14b7d790c1	docs: Restore accidentally deleted docs on Elasticsearch strategies (#30521 ) Description: Adding back a section of the Elasticsearch vectorstore documentation that was deleted in [this commit]([`a72fddbf8d (diff-4988344c6ccc08191f89ac1ebf1caab5185e13698d7567fde5352038cd950d77)`)). The only change I've made is to update the example RRF request, which was out of date.	2025-03-27 11:27:20 -04:00
ccurme	0b2244ea88	Revert "docs: restore some content to Elasticsearch integration page" (#30523 ) Reverts langchain-ai/langchain#30522 in favor of https://github.com/langchain-ai/langchain/pull/30521.	2025-03-27 15:12:36 +00:00
ccurme	80064893c1	docs: restore some content to Elasticsearch integration page (#30522 ) https://github.com/langchain-ai/langchain/pull/24858 standardized vector store integration pages, but deleted some content. Here we merge some of the old content back in. We use this version as a reference: `2c798622cd/docs/docs/integrations/vectorstores/elasticsearch.ipynb`	2025-03-27 11:07:19 -04:00
Keiichi Hirobe	956b09f468	core[patch]: stop deleting records with "scoped_full" when doc is empty (#30520 ) Fix a bug that causes `scoped_full` in index to delete records when there are no input docs.	2025-03-27 11:04:34 -04:00
Christophe Bornet	b28a474e79	core[patch]: Add ruff rules for PLW (Pylint Warnings) (#29288 ) See https://docs.astral.sh/ruff/rules/#warning-w_1 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-03-27 10:26:12 +00:00
xzq.xu	92dc3f7341	format test lint passed	2025-03-27 13:44:59 +08:00
xzq.xu	d0a9808148	modify test name	2025-03-27 13:34:51 +08:00
xzq.xu	ed2428f902	add a unit test	2025-03-27 12:43:16 +08:00
David Sánchez Sánchez	75823d580b	community: fix perplexity response parameters not being included in model response (#30440 ) This pull request includes enhancements to the `perplexity.py` file in the `chat_models` module, focusing on improving the handling of additional keyword arguments (`additional_kwargs`) in message processing methods. Additionally, new unit tests have been added to ensure the correct inclusion of citations, images, and related questions in the `additional_kwargs`. Issue: resolves https://github.com/langchain-ai/langchain/issues/30439 Enhancements to `perplexity.py`: * [`libs/community/langchain_community/chat_models/perplexity.py`](diffhunk://#diff-d3e4d7b277608683913b53dcfdbd006f0f4a94d110d8b9ac7acf855f1f22207fL208-L212): Modified the `_convert_delta_to_message_chunk`, `_stream`, and `_generate` methods to handle `additional_kwargs`, which include citations, images, and related questions. [[1]](diffhunk://#diff-d3e4d7b277608683913b53dcfdbd006f0f4a94d110d8b9ac7acf855f1f22207fL208-L212) [[2]](diffhunk://#diff-d3e4d7b277608683913b53dcfdbd006f0f4a94d110d8b9ac7acf855f1f22207fL277-L286) [[3]](diffhunk://#diff-d3e4d7b277608683913b53dcfdbd006f0f4a94d110d8b9ac7acf855f1f22207fR324-R331) New unit tests: * [`libs/community/tests/unit_tests/chat_models/test_perplexity.py`](diffhunk://#diff-dab956d79bd7d17a0f5dea3f38ceab0d583b43b63eb1b29138ee9b6b271ba1d9R119-R275): Added new tests `test_perplexity_stream_includes_citations_and_images` and `test_perplexity_stream_includes_citations_and_related_questions` to verify that the `stream` method correctly includes citations, images, and related questions in the `additional_kwargs`.	2025-03-26 22:28:08 -04:00
Eugene Yurtsev	7664874a0d	docs: llms-txt (#30506 ) First just verifying it's included in the manifest	2025-03-26 22:21:59 -04:00
Adeel Ehsan	d7d0bca2bc	docs: add vectara to libs package yml (#30504 )	2025-03-26 16:47:53 -04:00
ccurme	3781144710	docs: update doc on token usage tracking (#30505 )	2025-03-26 16:13:45 -04:00
ccurme	a9b1e1b177	openai: release 0.3.11 (#30503 )	2025-03-26 19:24:37 +00:00
ccurme	8119a7bc5c	openai[patch]: support streaming token counts in AzureChatOpenAI (#30494 ) When OpenAI originally released `stream_options` to enable token usage during streaming, it was not supported in AzureOpenAI. It is now supported. Like the [OpenAI SDK](`f66d2e6fdc/src/openai/resources/completions.py (L68)`), ChatOpenAI does not return usage metadata during streaming by default (which adds an extra chunk to the stream). The OpenAI SDK requires users to pass `stream_options={"include_usage": True}`. ChatOpenAI implements a convenience argument `stream_usage: Optional[bool]`, and an attribute `stream_usage: bool = False`. Here we extend this to AzureChatOpenAI by moving the `stream_usage` attribute and `stream_usage` kwarg (on `_(a)stream`) from ChatOpenAI to BaseChatOpenAI. --- Additional consideration: we must be sensitive to the number of users using BaseChatOpenAI to interact with other APIs that do not support the `stream_options` parameter. Suppose OpenAI in the future updates the default behavior to stream token usage. Currently, BaseChatOpenAI only passes `stream_options` if `stream_usage` is True, so there would be no way to disable this new default behavior. To address this, we could update the `stream_usage` attribute to `Optional[bool] = None`, but this is technically a breaking change (as currently values of False are not passed to the client). IMO: if / when this change happens, we could accompany it with this update in a minor bump. --- Related previous PRs: - https://github.com/langchain-ai/langchain/pull/22628 - https://github.com/langchain-ai/langchain/pull/22854 - https://github.com/langchain-ai/langchain/pull/23552 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-03-26 15:16:37 -04:00
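
The opt-in behaviour described in the commit above (only send `stream_options` when `stream_usage` is true) can be sketched as below. This is an illustrative sketch, not the code in `BaseChatOpenAI`: `build_stream_payload` and its `override` argument are hypothetical names; only the `stream_options={"include_usage": True}` shape comes from the commit message.

```python
from typing import Any, Dict, List, Optional


def build_stream_payload(
    messages: List[Dict[str, str]],
    stream_usage: bool = False,
    override: Optional[bool] = None,
) -> Dict[str, Any]:
    """Build a streaming request payload (hypothetical helper).

    `stream_usage` plays the role of the instance attribute and `override`
    the role of a per-call keyword argument; the per-call value wins when set.
    """
    payload: Dict[str, Any] = {"messages": messages, "stream": True}
    include_usage = stream_usage if override is None else override
    # Only add stream_options when usage was requested, so APIs that do not
    # understand the parameter never receive it.
    if include_usage:
        payload["stream_options"] = {"include_usage": True}
    return payload
```

Because a false value is never sent, this sketch shares the limitation the commit message notes: there is no way to explicitly disable usage streaming if a provider later enables it by default.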
Adeel Ehsan	56629ed87b	docs: updated the docs for vectara (#30398 ) Description: Vectara has moved to a LangChain partner package; this updates the docs accordingly.	2025-03-26 15:02:21 -04:00
ccurme	f68eaab44f	tests: release 0.3.17 (#30502 )	2025-03-26 18:56:54 +00:00
Louis Auneau	0b532a4ed0	community: Azure Document Intelligence parser features not available fixed (#30370 ) - Description: The Azure Document Intelligence OCR solution has a features parameter that enables capabilities such as high-resolution document analysis and key-value pair extraction. In the LangChain parser, these could be provided as an `analysis_feature` parameter to the constructor, which passed it on to the `DocumentIntelligenceClient`. However, according to the `DocumentIntelligenceClient` [API Reference](https://learn.microsoft.com/en-us/python/api/azure-ai-documentintelligence/azure.ai.documentintelligence.documentintelligenceclient?view=azure-python), this is not a valid constructor parameter. It was therefore removed and is instead stored as a parser property that is used in the `features` parameter of `begin_analyze_document` (see [API Reference](https://learn.microsoft.com/en-us/python/api/azure-ai-formrecognizer/azure.ai.formrecognizer.documentanalysisclient?view=azure-python#azure-ai-formrecognizer-documentanalysisclient-begin-analyze-document)). I also removed the check for "Supported features", since all features are supported out of the box. I did not check whether the provided `str` actually corresponds to the Azure package's enumeration of features, since the `ValueError` raised when creating the enumeration object is explicit enough. One last caveat: some features are not supported for some kinds of documents. This is documented in the Microsoft documentation, and the exceptions raised are also explicit. - Issue: N/A - Dependencies: No - Twitter handle: @Louis___A --------- Co-authored-by: Louis Auneau <louis@handshakehealth.co>	2025-03-26 14:40:14 -04:00
Really Him	fbd2e10703	docs: hide jsx in llm chain tutorial (#30187 ) ## Description: The Jupyter notebooks in the docs section are extremely useful and critical for widespread adoption of LangChain amongst new developers. However, because they are also converted to MDX and used to build the HTML for the Docusaurus site, they contain JSX code that degrades readability when opened in a "notebook" setting (local notebook server, google colab, etc.). For instance, here we see the website, with a nice React tab component for installation instructions (`pip` vs `conda`): ![Screenshot 2025-03-07 at 2 07 15 PM](https://github.com/user-attachments/assets/a528d618-f5a0-4d2e-9aed-16d4b8148b5a) Now, here is the same notebook viewed in colab: ![Screenshot 2025-03-07 at 2 08 41 PM](https://github.com/user-attachments/assets/87acf5b7-a3e0-46ac-8126-6cac6eb93586) Note that the text following "To install LangChain run:" contains snippets of JSX code that is (i) confusing, (ii) bad for readability, (iii) potentially misleading for a novice developer, who might take it literally to mean that "to install LangChain I should run `import Tabs from...`" and then an ill-formed command which mixes the `pip` and `conda` installation instructions. Ideally, we would like to have a system that presents a similar/equivalent UI when viewing the notebooks on the documentation site, or when interacting with them in a notebook setting - or, at a minimum, we should not present ill-formed JSX snippets to someone trying to execute the notebooks. As the documentation itself states, running the notebooks yourself is a great way to learn the tools. Therefore, these distracting and ill-formed snippets are contrary to that goal. ## Fixes: * Comment out the JSX code inside the notebook `docs/tutorials/llm_chain` with a special directive `<!-- HIDE_IN_NB` (closed with `HIDE_IN_NB -->`). This makes the JSX code "invisible" when viewed in a notebook setting. 
* Add a custom preprocessor that runs process_cell and just erases these comment strings. This makes sure they are rendered when converted to MDX. * Minor tweak: Refactor some of the Markdown instructions into an executable codeblock for better experience when running as a notebook. * Minor tweak: Optionally try to get the environment variables from a `.env` file in the repo so the user doesn't have to enter it every time. Depends on the user installing `python-dotenv` and adding their own `.env` file. * Add an environment variable for "LANGSMITH_PROJECT" (default="default"), per the LangSmith docs, so a local user can target a specific project in their LangSmith account. NOTE: If this PR is approved, and the maintainers agree with the general goal of aligning the notebook execution experience and the doc site UI, I would plan to implement this on the rest of the JSX snippets that are littered in the notebooks. NOTE: I wasn't able to/don't know how to run the linkcheck Makefile commands. - [X] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ --------- Co-authored-by: Really Him <hesereallyhim@proton.me>	2025-03-26 14:22:33 -04:00
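
The marker-erasing step described in the commit above can be sketched as a plain string transformation. This is an assumption-laden illustration rather than the preprocessor in the repository: `strip_hide_markers` is a hypothetical name, and only the two marker strings are taken from the commit message.

```python
# Marker strings quoted in the commit message.
HIDE_OPEN = "<!-- HIDE_IN_NB"
HIDE_CLOSE = "HIDE_IN_NB -->"


def strip_hide_markers(cell_source: str) -> str:
    """Erase the comment markers but keep the JSX between them.

    In a notebook the markers turn the JSX into an invisible HTML comment;
    erasing them before MDX conversion makes the JSX render again.
    """
    return cell_source.replace(HIDE_OPEN, "").replace(HIDE_CLOSE, "")
```

A markdown cell such as `<!-- HIDE_IN_NB` / `import Tabs from '@theme/Tabs';` / `HIDE_IN_NB -->` would come out with only the `import` line (plus surrounding newlines) left in place.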
Philippe PRADOS	8e5d2a44ce	community[patch]: update PyPDFParser to take into account filters returned as arrays (#30489 ) Image parsing fails because the object extracted for `/Filter` is sometimes an array and sometimes a string. Fix [Issue 30098](https://github.com/langchain-ai/langchain/issues/30098)	2025-03-26 14:16:54 -04:00
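
One way to cope with a `/Filter` entry that may be either a single name or an array is to normalize it to a list before inspecting it. The sketch below is a guess at the general technique, not the actual change in `PyPDFParser`; `normalize_filters` is a hypothetical helper and the inputs are plain Python values standing in for PDF objects.

```python
from typing import Any, List


def normalize_filters(filter_value: Any) -> List[str]:
    """Return the /Filter entry as a list of filter names (hypothetical helper)."""
    if filter_value is None:
        # No /Filter entry at all.
        return []
    if isinstance(filter_value, (list, tuple)):
        # Array form, e.g. ["/FlateDecode", "/DCTDecode"].
        return [str(item) for item in filter_value]
    # Single-name form, e.g. "/DCTDecode".
    return [str(filter_value)]
```

Downstream code can then test membership (for example `"/DCTDecode" in normalize_filters(value)`) without caring which form the PDF used.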
ccurme	422ba4cde5	infra: handle flaky tests (#30501 )	2025-03-26 13:28:56 -04:00
ccurme	9a80be7bb7	core[patch]: release 0.3.49 (#30500 )	2025-03-26 13:26:32 -04:00
ccurme	299b222c53	mistral[patch]: check types in adding model_name to response_metadata (#30499 )	2025-03-26 16:30:09 +00:00
ccurme	22d1a7d7b6	standard-tests[patch]: require model_name in response_metadata if returns_usage_metadata (#30497 ) We are implementing a token-counting callback handler in `langchain-core` that is intended to work with all chat models supporting usage metadata. The callback will aggregate usage metadata by model, which requires each response to include the model name in its metadata. To support this, if a model `returns_usage_metadata`, we check that it includes a string model name under the `"model_name"` key of its `response_metadata`. More context: https://github.com/langchain-ai/langchain/pull/30487	2025-03-26 12:20:53 -04:00
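
To illustrate why a string `"model_name"` is needed, here is a rough sketch of aggregating usage per model. It is not the callback handler in `langchain-core`: `aggregate_usage` is a hypothetical function, responses are modelled as plain dicts, and the token field names (`input_tokens`, `output_tokens`, `total_tokens`) are assumptions.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def aggregate_usage(responses: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, int]]:
    """Sum integer token counts per model name (hypothetical helper)."""
    totals: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for response in responses:
        model_name = response.get("response_metadata", {}).get("model_name")
        usage = response.get("usage_metadata")
        # Without a string model name there is no key to aggregate under,
        # which is the situation the standard test is meant to rule out.
        if not isinstance(model_name, str) or not usage:
            continue
        for key, value in usage.items():
            if isinstance(value, int):
                totals[model_name][key] += value
    return {name: dict(counts) for name, counts in totals.items()}
```

Keying the totals by model name keeps counts from different models separate instead of mingling them into one sum.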
Ante Javor	20f82502e5	Community: Add Memgraph integration docs (#30457 ) Description: Since we just implemented [langchain-memgraph](https://pypi.org/project/langchain-memgraph/) integration, we are adding basic docs to [your site based on this comment](https://github.com/langchain-ai/langchain/pull/30197#pullrequestreview-2671616410) from @ccurme . Twitter handle: [@memgraphdb](https://x.com/memgraphdb) --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-03-26 11:58:09 -04:00
xzq.xu	913c8b71d9	format import	2025-03-26 23:34:38 +08:00
xzq.xu	7e3dea5db8	add a new-line	2025-03-26 23:32:07 +08:00
xzq.xu	d602141ab1	remove unused e	2025-03-26 23:10:41 +08:00
xzq.xu	dd9031fc82	_prep_run_args, tool_input copy, Exception	2025-03-26 23:06:43 +08:00
xzq.xu	3382b0d8ea	_prep_run_args, tool_input copy	2025-03-26 22:56:32 +08:00
xzq.xu	e90abce577	Merge remote-tracking branch 'origin/master'	2025-03-26 22:42:15 +08:00
xzq.xu	c127ae9d26	fix the format	2025-03-26 22:41:58 +08:00
xzq.xu	65ecc22606	# Fix: Prevent run_manager from being added to state object	2025-03-26 22:36:31 +08:00
Philippe PRADOS	e73e5d087c	Merge branch 'master' into pprados/06-pdfplumber	2025-03-26 15:03:07 +01:00
Philippe Prados	09c4c1f867	Fix images parser	2025-03-26 15:01:16 +01:00
ccurme	7e62e3a137	core[patch]: store model names on usage callback handler (#30487 ) So we avoid mingling tokens from different models.	2025-03-25 21:26:09 -04:00
ccurme	32827765bf	core[patch]: mark usage callback handler as beta (#30486 )	2025-03-25 23:25:57 +00:00

13194 Commits