langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-17 00:17:47 +00:00

Author	SHA1	Message	Date
Mason Daugherty	5599c59d4a	chore: formatting across codebase (#32456 ) To prevent polluting future PRs	2025-08-07 22:09:26 -04:00
Mason Daugherty	5e9eb19a83	chore: update branch with changes from master (#32277 ) Co-authored-by: Maxime Grenu <69890511+cluster2600@users.noreply.github.com> Co-authored-by: Claude <claude@anthropic.com> Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: jmaillefaud <jonathan.maillefaud@evooq.ch> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: tanwirahmad <tanwirahmad@users.noreply.github.com> Co-authored-by: Christophe Bornet <cbornet@hotmail.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com> Co-authored-by: niceg <79145285+growmuye@users.noreply.github.com> Co-authored-by: Chaitanya varma <varmac301@gmail.com> Co-authored-by: dishaprakash <57954147+dishaprakash@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Kanav Bansal <13186335+bansalkanav@users.noreply.github.com> Co-authored-by: Aleksandr Filippov <71711753+alex-feel@users.noreply.github.com> Co-authored-by: Alex Feel <afilippov@spotware.com>	2025-07-28 10:39:41 -04:00
Christophe Bornet	03e8327e01	core: Ruff preview fixes (#31877 ) Auto-fixes from `uv run ruff check --fix --unsafe-fixes --preview` --------- Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-07-07 13:02:40 -04:00
Christophe Bornet	a8f2ddee31	core: Add ruff rules RUF (#29353 ) See https://docs.astral.sh/ruff/rules/#ruff-specific-rules-ruf Mostly: * [RUF022](https://docs.astral.sh/ruff/rules/unsorted-dunder-all/) (unsorted `__all__`) * [RUF100](https://docs.astral.sh/ruff/rules/unused-noqa/) (unused noqa) * [RUF021](https://docs.astral.sh/ruff/rules/parenthesize-chained-operators/) (parenthesize-chained-operators) * [RUF015](https://docs.astral.sh/ruff/rules/unnecessary-iterable-allocation-for-first-element/) (unnecessary-iterable-allocation-for-first-element) * [RUF005](https://docs.astral.sh/ruff/rules/collection-literal-concatenation/) (collection-literal-concatenation) * [RUF046](https://docs.astral.sh/ruff/rules/unnecessary-cast-to-int/) (unnecessary-cast-to-int) --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-05-15 15:43:57 -04:00
Sydney Runkle	7263011b24	perf[core]: remove unnecessary model validators (#31238 ) * Remove unnecessary cast of id -> str (can do with a field setting) * Remove unnecessary `set_text` model validator (can be done with a computed field - though we had to make some changes to the `Generation` class to make this possible) Before: ~2.4s Blue circles represent time spent in custom validators :( <img width="1337" alt="Screenshot 2025-05-14 at 10 10 12 AM" src="https://github.com/user-attachments/assets/bb4f477f-4ee3-4870-ae93-14ca7f197d55" /> After: ~2.2s <img width="1344" alt="Screenshot 2025-05-14 at 10 11 03 AM" src="https://github.com/user-attachments/assets/99f97d80-49de-462f-856f-9e7e8662adbc" /> We still want to optimize the backwards compatible tool calls model validator, though I think this might involve breaking changes, so wanted to separate that into a different PR. This is circled in green.	2025-05-14 10:20:22 -07:00
Sydney Runkle	75e50a3efd	core[patch]: Raise `AttributeError` (instead of `ModuleNotFoundError`) in custom `__getattr__` (#30905 ) Follow up to https://github.com/langchain-ai/langchain/pull/30769, fixing the regression reported [here](https://github.com/langchain-ai/langchain/pull/30769#issuecomment-2807483610), thanks @krassowski for the report! Fix inspired by https://github.com/PrefectHQ/prefect/pull/16172/files Other changes: * Using tuples for `__all__`, except in `output_parsers` bc of a list namespace conflict * Using a helper function for imports due to repeated logic across `__init__.py` files becoming hard to maintain. Co-authored-by: Michał Krassowski <5832902+krassowski@users.noreply.github.com>	2025-04-17 14:15:28 -04:00
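One plausible reason the exception type matters, sketched under assumptions: `hasattr()` and three-argument `getattr()` only swallow `AttributeError`, so a module-level `__getattr__` that raises anything else breaks feature probing. The module name `demo` and the helper `build_module` below are hypothetical and are not taken from the langchain codebase.

```python
import types


def build_module(error_type: type) -> types.ModuleType:
    """Create a throwaway module whose __getattr__ always raises error_type."""
    mod = types.ModuleType("demo")

    def __getattr__(name: str):
        raise error_type(f"module 'demo' has no attribute {name!r}")

    # Module-level __getattr__ (PEP 562) is looked up in the module's __dict__.
    mod.__getattr__ = __getattr__
    return mod


well_behaved = build_module(AttributeError)
misbehaving = build_module(ModuleNotFoundError)

# AttributeError is the signal that hasattr()/getattr() understand.
probe_ok = hasattr(well_behaved, "missing")
fallback = getattr(well_behaved, "missing", "default")

# Any other exception type escapes from hasattr() instead of yielding False.
try:
    hasattr(misbehaving, "missing")
    leaked = False
except ModuleNotFoundError:
    leaked = True
```

Here `probe_ok` is `False` and `fallback` is `"default"`, while `leaked` is `True` because the `ModuleNotFoundError` propagates to the caller.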
Sydney Runkle	edb6a23aea	core[lint]: fix issue with unused ignore in `__init__.py` files (#30825 ) Fixing a race condition between https://github.com/langchain-ai/langchain/pull/30769 and https://github.com/langchain-ai/langchain/pull/30737	2025-04-14 17:57:00 +00:00
Sydney Runkle	4f69094b51	core[performance]: use custom `__getattr__` in `__init__.py` files for lazy imports (#30769 ) Most easily reviewed with the "hide whitespace" option toggled. Seeing 10-50% speed ups in import time for common structures 🚀 The general purpose of this PR is to lazily import structures within `langchain_core.XXX_module.__init__.py` so that we're not eagerly importing expensive dependencies (`pydantic`, `requests`, etc). Analysis of flamegraphs generated with `importtime` motivated these changes. For example, the one below demonstrates that importing `HumanMessage` accidentally triggered imports for `importlib.metadata`, `requests`, etc. There's still much more to do on this front, and we can start digging into our own internal code for optimizations now that we're less concerned about external imports. <img width="1210" alt="Screenshot 2025-04-11 at 1 10 54 PM" src="https://github.com/user-attachments/assets/112a3fe7-24a9-4294-92c1-d5ae64df839e" /> I've tracked the improvements with some local benchmarks:

## `pytest-benchmark` results

| Name | Before (s) | After (s) | Delta (s) | % Change |
|-----------------------------|------------|-----------|-----------|----------|
| Document | 2.8683 | 1.2775 | -1.5908 | -55.46% |
| HumanMessage | 2.2358 | 1.1673 | -1.0685 | -47.79% |
| ChatPromptTemplate | 5.5235 | 2.9709 | -2.5526 | -46.22% |
| Runnable | 2.9423 | 1.7793 | -1.163 | -39.53% |
| InMemoryVectorStore | 3.1180 | 1.8417 | -1.2763 | -40.93% |
| RunnableLambda | 2.7385 | 1.8745 | -0.864 | -31.55% |
| tool | 5.1231 | 4.0771 | -1.046 | -20.42% |
| CallbackManager | 4.2263 | 3.4099 | -0.8164 | -19.32% |
| LangChainTracer | 3.8394 | 3.3101 | -0.5293 | -13.79% |
| BaseChatModel | 4.3317 | 3.8806 | -0.4511 | -10.41% |
| PydanticOutputParser | 3.2036 | 3.2995 | 0.0959 | 2.99% |
| InMemoryRateLimiter | 0.5311 | 0.5995 | 0.0684 | 12.88% |

Note the lack of change for `InMemoryRateLimiter` and `PydanticOutputParser` is just random noise, I'm getting comparable numbers locally.

## Local CodSpeed results

We're still working on configuring CodSpeed on CI. The local usage produced similar results.	2025-04-14 08:57:54 -04:00
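The lazy-import idea described in this commit can be sketched with a module-level `__getattr__` (PEP 562). This is a minimal, self-contained illustration rather than the code from the PR: the helper `make_lazy_module`, the module name `lazy_demo`, and the name-to-module mapping are all assumptions, and stdlib modules stand in for the expensive dependencies.

```python
import importlib
import types


def make_lazy_module(name: str, mapping: dict[str, str]) -> types.ModuleType:
    """Build a module that imports each exported name on first access.

    `mapping` is a hypothetical table of public name -> defining module.
    """
    module = types.ModuleType(name)

    def __getattr__(attr_name: str):
        if attr_name not in mapping:
            raise AttributeError(f"module {name!r} has no attribute {attr_name!r}")
        # The import cost is paid here, on first access, not at package import.
        target = importlib.import_module(mapping[attr_name])
        value = getattr(target, attr_name)
        # Cache on the module so later lookups bypass __getattr__ entirely.
        setattr(module, attr_name, value)
        return value

    module.__getattr__ = __getattr__
    module.__all__ = tuple(mapping)
    return module


lazy_demo = make_lazy_module("lazy_demo", {"dumps": "json", "OrderedDict": "collections"})

resolved_before = "dumps" in vars(lazy_demo)  # nothing resolved yet
encoded = lazy_demo.dumps({"a": 1})           # first access triggers the import
resolved_after = "dumps" in vars(lazy_demo)   # now cached on the module
```

In a real package the same function would live at the top level of `__init__.py`; building the module dynamically here only keeps the sketch runnable as a single file.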
Christophe Bornet	f241fd5c11	core: Add ruff rules RET (#29384 ) See https://docs.astral.sh/ruff/rules/#flake8-return-ret All auto-fixes	2025-04-02 16:59:56 -04:00
Christophe Bornet	88b4233fa1	core: Add ruff rules D (docstring) (#29406 ) This ensures that the code is properly documented: https://docs.astral.sh/ruff/rules/#pydocstyle-d Related to #21983	2025-04-01 13:15:45 -04:00
Christophe Bornet	e181d43214	core: Bump ruff version to 0.11 (#30519 ) Changes are from the new TC006 rule: https://docs.astral.sh/ruff/rules/runtime-cast-value/ TC006 is auto-fixed.	2025-03-27 13:01:49 -04:00
Christophe Bornet	9e6ffd1264	core: Add ruff rules PTH (pathlib) (#29338 ) See https://docs.astral.sh/ruff/rules/#flake8-use-pathlib-pth Co-authored-by: ccurme <chester.curme@gmail.com>	2025-02-28 13:22:20 -05:00
Christophe Bornet	b3885c124f	core: Add ruff rules TC (#29268 ) See https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc Some fixes done for TC001,TC002 and TC003 but these rules are excluded since they don't play well with Pydantic. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-02-26 19:39:05 +00:00
Christophe Bornet	d31ec8810a	core: Add ruff rules for error messages (EM) (#26965 ) All auto-fixes Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-07 22:12:28 +00:00
Christophe Bornet	db8845a62a	core: Add ruff rules for pycodestyle Warning (W) (#26964 ) All auto-fixes.	2024-09-30 09:31:43 -04:00
ccurme	bba7af903b	core[patch]: set default on Blob (#26787 ) Resolves https://github.com/langchain-ai/langchain/issues/26781	2024-09-23 18:55:56 +00:00
Christophe Bornet	a47b332841	core: Put Python version as a project requirement so it is considered by ruff (#26608 ) Ruff doesn't know about the python version in `[tool.poetry.dependencies]`. It can get it from `project.requires-python`. Notes:
* poetry seems to have issues getting the python constraints from `requires-python` and using `python` in per dependency constraints. So I had to duplicate the info. I will open an issue on poetry.
* `inspect.isclass()` doesn't work correctly with `GenericAlias` (`list[...]`, `dict[..., ...]`) on Python <3.11 so I added some `not isinstance(type, GenericAlias)` checks:

Python 3.11
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
False
```

Python 3.9
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
True
```

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-09-18 14:37:57 +00:00
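The `not isinstance(type, GenericAlias)` guard mentioned in this commit can be sketched as a small helper that gives the same answer on every supported Python version. The function name `is_plain_class` is hypothetical; only `inspect.isclass` and `types.GenericAlias` come from the standard library.

```python
import inspect
from types import GenericAlias


def is_plain_class(obj: object) -> bool:
    """Return True for real classes, False for parameterized aliases like list[str]."""
    # On Python < 3.11, inspect.isclass(list[str]) is True, so the alias
    # has to be excluded explicitly.
    return inspect.isclass(obj) and not isinstance(obj, GenericAlias)


plain = is_plain_class(list)
aliased = is_plain_class(list[str])
nested = is_plain_class(dict[str, list[int]])
```

`plain` is `True`, while `aliased` and `nested` are `False` regardless of the interpreter version.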
Christophe Bornet	3a99467ccb	core[patch]: Add ruff rule UP006(use PEP585 annotations) (#26574 ) * Added rule `UP006` now that Pydantic is v2+ --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-09-17 21:22:50 +00:00
Erick Friis	c2a3021bb0	multiple: pydantic 2 compatibility, v0.3 (#26443 ) Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com> Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com> Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com> Co-authored-by: ZhangShenao <15201440436@163.com> Co-authored-by: Friso H. Kingma <fhkingma@gmail.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Nuno Campos <nuno@langchain.dev> Co-authored-by: Morgante Pell <morgantep@google.com>	2024-09-13 14:38:45 -07:00
Christophe Bornet	ee98da4f4e	core[patch]: Add UP(upgrade) ruff rules (#25358 )	2024-08-22 16:29:22 -07:00
Leonid Ganeline	5fcf2ef7ca	core: docstrings `documents` (#23506 ) Added missed docstrings. Formatted docstrings to the consistent form.	2024-07-16 10:43:54 -04:00
Harold Martin	ccdaf14eff	docs: Spell check fixes (#24217 ) Description: Spell check fixes for docs, comments, and a couple of strings. No code change e.g. variable names. Issue: none Dependencies: none Twitter handle: hmartin	2024-07-15 15:51:43 +00:00
Bagatur	a0c2281540	infra: update mypy 1.10, ruff 0.5 (#23721 )

```python
"""python scripts/update_mypy_ruff.py"""
import glob
import re
import subprocess
import tomllib
from pathlib import Path

import toml

ROOT_DIR = Path(__file__).parents[1]

# Matches mypy output lines of the form "path:line: error: message  [code]".
# The quantifiers were lost in the rendered commit message and are restored here.
MYPY_ERROR_RE = re.compile(r"^(.*)\:(\d+)\: error:.*\[(.*)\]")


def main():
    for path in glob.glob(str(ROOT_DIR / "libs/*/pyproject.toml"), recursive=True):
        print(path)
        with open(path, "rb") as f:
            pyproject = tomllib.load(f)
        # Bump the pinned versions; skip packages without these dependency groups.
        try:
            pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = (
                "^1.10"
            )
            pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = (
                "^0.5"
            )
        except KeyError:
            continue
        with open(path, "w") as f:
            toml.dump(pyproject, f)
        cwd = "/".join(path.split("/")[:-1])
        completed = subprocess.run(
            "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )
        logs = completed.stdout.split("\n")

        # Collect the error codes reported for each (file, line) pair.
        to_ignore = {}
        for log_line in logs:
            match = MYPY_ERROR_RE.match(log_line)
            if match:
                error_path, line_no, error_type = match.groups()
                if (error_path, line_no) in to_ignore:
                    to_ignore[(error_path, line_no)].append(error_type)
                else:
                    to_ignore[(error_path, line_no)] = [error_type]
        print(len(to_ignore))

        # Append a "type: ignore" comment to every offending line.
        for (error_path, line_no), error_types in to_ignore.items():
            all_errors = ", ".join(error_types)
            full_path = f"{cwd}/{error_path}"
            try:
                with open(full_path, "r") as f:
                    file_lines = f.readlines()
            except FileNotFoundError:
                continue
            file_lines[int(line_no) - 1] = (
                file_lines[int(line_no) - 1][:-1] + f"  # type: ignore[{all_errors}]\n"
            )
            with open(full_path, "w") as f:
                f.write("".join(file_lines))

        subprocess.run(
            "poetry run ruff format .; poetry run ruff --select I --fix .",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )


if __name__ == "__main__":
    main()
```
	2024-07-03 10:33:27 -07:00
Eugene Yurtsev	e800f6bb57	core[minor]: Create BaseMedia object (#23639 ) This PR implements a BaseContent object from which Document and Blob objects will inherit proposed here: https://github.com/langchain-ai/langchain/pull/23544 Alternative: Create a base object that only has an identifier and no metadata. For now decided against it, since that refactor can be done at a later time. It also feels a bit odd since our IDs are optional at the moment. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-01 15:07:30 -04:00
Eugene Yurtsev	96b72edac8	core[minor]: Add optional ID field to Document schema (#23411 ) This PR adds an optional ID field to the document schema. # 1. Optional or Required - An optional field will require additional checking for the type in user code (annoying). - However, vectorstores currently don't respect this field. So if we make it required and start returning random UUIDs that might be even more confusing to users. Proposal: Start with Optional and convert to Required (with default set to uuid4()) in 1-2 major releases. # 2. Override __str__ or generic solution in prompts Overriding __str__ as a simple way to avoid changing user code that relies on default str(document) in prompts. I considered rolling out a more general solution in prompts (https://github.com/langchain-ai/langchain/pull/8685), but to do that we need to: 1. Make things serializable 2. The more general solution would likely need to be backwards compatible as well 3. It's unclear that one wants to format a List[int] in the same way as List[Document]. The former should be `,` separated (likely), the latter should be `---` separated (likely). Proposal: Start with __str__ override and focus on the vectorstore APIs, we generalize prompts later	2024-06-27 12:15:58 -04:00
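The two proposals in this commit (an optional `id` plus a `__str__` override that keeps prompt output stable) can be illustrated with a plain dataclass. This is a sketch under assumptions, not the langchain `Document` class: the name `SketchDocument` and the exact string format are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SketchDocument:
    """Hypothetical stand-in for a document schema with an optional ID."""

    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)
    id: Optional[str] = None  # optional, so existing callers keep working

    def __str__(self) -> str:
        # Deliberately omit `id`: code that interpolates str(doc) into a
        # prompt sees the same text whether or not an ID was assigned.
        if self.metadata:
            return f"page_content={self.page_content!r} metadata={self.metadata!r}"
        return f"page_content={self.page_content!r}"


without_id = SketchDocument("hello", {"source": "a.txt"})
with_id = SketchDocument("hello", {"source": "a.txt"}, id="doc-1")
same_prompt_text = str(without_id) == str(with_id)
```

`same_prompt_text` is `True`, while `repr()` of the two objects still differs, so the ID remains visible for debugging.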
Eugene Yurtsev	883e90d06e	core[patch]: Add an example to the Document schema doc-string (#23131 ) Add an example to the document schema	2024-06-19 11:35:30 -04:00
Eugene Yurtsev	a34e650f8b	core[patch]: Add doc-string to document compressor (#23085 )	2024-06-19 11:03:49 -04:00
Harrison Chase	d7c607ca00	core[minor]: move document compressor base (#17910 )	2024-02-26 17:20:50 -08:00
Leonid Ganeline	2f2b77602e	docs: modules descriptions (#17844 ) Several `core` modules do not have descriptions, like the [agent](https://api.python.langchain.com/en/latest/core_api_reference.html#module-langchain_core.agents) module. - Added missed module descriptions. The descriptions are mostly copied from the `langchain` or `community` package modules.	2024-02-21 15:58:21 -08:00
Bagatur	2a510c71a0	core[patch]: doc init positional args (#16854 )	2024-02-02 10:24:16 -08:00
Nuno Campos	eb5e250188	Propagate context vars in all classes/methods - Any direct usage of ThreadPoolExecutor or asyncio.run_in_executor needs manual handling of context vars	2023-12-29 12:34:03 -08:00
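The manual handling this commit refers to can be sketched with `contextvars.copy_context()`: the submitting thread snapshots its context and the worker runs the callable inside that snapshot. The variable `request_id` and the helper `submit_with_context` are hypothetical names, and this is not the implementation from the commit.

```python
import contextvars
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical context variable standing in for tracing/callback state.
request_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="unset"
)


def read_request_id() -> str:
    return request_id.get()


def submit_with_context(executor: ThreadPoolExecutor, fn, *args, **kwargs) -> Future:
    """Submit fn so that it runs inside a copy of the caller's context."""
    ctx = contextvars.copy_context()  # snapshot taken in the submitting thread
    return executor.submit(ctx.run, fn, *args, **kwargs)


request_id.set("abc-123")
with ThreadPoolExecutor(max_workers=1) as pool:
    propagated = submit_with_context(pool, read_request_id).result()
```

`propagated` is `"abc-123"` because the worker reads the variable from the copied context. A bare `pool.submit(read_request_id)` does not carry the caller's context explicitly, so what it sees depends on how the interpreter initializes worker threads.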
Harrison Chase	f5befe3b89	manual mapping (#14422 )	2023-12-08 16:29:33 -08:00
Bagatur	32d087fcb8	REFACTOR: combine core documents files (#13733 )	2023-11-22 10:10:26 -08:00

33 Commits