langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-03-18 11:07:36 +00:00

Author	SHA1	Message	Date
weiii668	5517ef37fb	docs(core): add docstrings to internal helper functions (#34525 ) Co-authored-by: weiii668 <your-email@example.com> Co-authored-by: Mason Daugherty <github@mdrxy.com> Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-12-30 21:58:00 -06:00
Mason Daugherty	47b79c30c0	chore(docs): fix a few refs syntax errors (#34044 ) missing whitespace for some admonitions	2025-11-22 00:58:21 -05:00
Christophe Bornet	2bfbc29ccc	chore(core): fix some ruff TC rules (#33929 ) fix some ruff TC rules but still don't enforce them as Pydantic model fields use type annotations at runtime.	2025-11-12 14:07:19 -05:00
Mason Daugherty	d40e340479	chore: attribute package change versions (#33854 ) Needed to disambiguate for within inherited docs	2025-11-06 16:57:30 -05:00
Mason Daugherty	e5e1d6c705	style: more refs work (#33707 )	2025-10-28 14:43:28 -04:00
Mason Daugherty	d9e659ca4f	style: even more refs work (#33619 )	2025-10-21 01:09:52 -04:00
Mason Daugherty	1d2273597a	docs: more fixes for refs (#33554 )	2025-10-16 22:54:16 -04:00
Mason Daugherty	26e0a00c4c	style: more work for refs (#33508 ) Largely: - Remove explicit `"Default is x"` since new refs show default inferred from sig - Inline code (useful for eventual parsing) - Fix code block rendering (indentations)	2025-10-15 18:46:55 -04:00
Mason Daugherty	5f9e3e33cd	style: remove `Defaults to None` (#33404 )	2025-10-09 17:27:35 -04:00
Christophe Bornet	f405a2c57d	chore(core): remove arg types from docstrings (#33388 ) * Remove types args * Remove types from Returns * Remove types from Yield * Replace `kwargs` by `**kwargs` when needed	2025-10-09 13:13:23 -04:00
Mason Daugherty	d13823043d	style: monorepo pass for refs (#33359 ) * Delete some double backticks previously used by Sphinx (not done everywhere yet) * Fix some code blocks / dropdowns Ignoring CLI CI for now	2025-10-08 18:41:39 -04:00
Mason Daugherty	6ea03ab46c	style(core): drop python `39` linting target for 3.10 (#33286 )	2025-10-05 23:22:34 -04:00
Mason Daugherty	ae5b105d11	docs: v1 docs updates (#33173 ) Co-authored-by: Mohammad Mohtashim <45242107+keenborder786@users.noreply.github.com> Co-authored-by: Caspar Broekhuizen <caspar@langchain.dev> Co-authored-by: ccurme <chester.curme@gmail.com> Co-authored-by: Christophe Bornet <cbornet@hotmail.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Sadra Barikbin <sadraqazvin1@yahoo.com> Co-authored-by: Vadym Barda <vadim.barda@gmail.com>	2025-10-02 18:46:26 -04:00
Mason Daugherty	5bea28393d	docs: standardize `.. code-block` directive usage (#33122 ) and fix typos	2025-09-25 16:49:56 -04:00
Ali Ismail	d5a4abf960	docs(core): remove duplicate 'the' in indexing/api.py (#32924 ) Description: Fixes a small typo in `_get_document_with_hash` inside `libs/core/langchain_core/indexing/api.py`. Issue: N/A (no related issue) Dependencies: None	2025-09-12 15:49:54 -04:00
Christophe Bornet	16420cad71	chore(core): fix some pydocs to use google-style (#32764 ) Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-09-08 17:52:17 +00:00
Christophe Bornet	f4e83e0ad8	chore(core): fix some docstrings (from DOC preview rule) (#32833 ) * Add `Raises` sections * Add `Returns` sections * Add `Yields` sections --------- Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-09-08 15:44:15 +00:00
Christophe Bornet	5840dad40b	chore(core): enable ruff docstring-code-format (#32834 ) See https://docs.astral.sh/ruff/settings/#format_docstring-code-format --------- Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-09-08 15:13:50 +00:00
Mason Daugherty	c31236264e	chore: formatting across codebase (#32466 )	2025-08-08 10:20:10 -04:00
Aleksandr Filippov	f0b6baa0ef	fix(core): track within-batch deduplication in indexing num_skipped count (#32273 ) Description: Fixes incorrect `num_skipped` count in the LangChain indexing API. The current implementation only counts documents that already exist in RecordManager (cross-batch duplicates) but fails to count documents removed during within-batch deduplication via `_deduplicate_in_order()`. This PR adds tracking of the original batch size before deduplication and includes the difference in `num_skipped`, ensuring that `num_added + num_skipped` equals the total number of input documents. Issue: Fixes incorrect document count reporting in indexing statistics Dependencies: None Fixes #32272 --------- Co-authored-by: Alex Feel <afilippov@spotware.com>	2025-07-28 09:58:51 -04:00
Christophe Bornet	03e8327e01	core: Ruff preview fixes (#31877 ) Auto-fixes from `uv run ruff check --fix --unsafe-fixes --preview` --------- Co-authored-by: Mason Daugherty <mason@langchain.dev>	2025-07-07 13:02:40 -04:00
Eugene Yurtsev	9164e6f906	core[patch]: Add additional hashing options to indexing API, warn on SHA-1 (#31649 ) Add additional hashing options to the indexing API, warn on SHA-1 Requires: - Bumping langchain-core version - bumping min langchain-core in langchain --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2025-06-24 14:44:06 -04:00
Shkarupa Alex	671e4fd114	langchain[patch]: Allow async indexing code to work for vectorstores that only defined sync delete (#30869 ) `aindex` function should check not only `adelete` method, but `delete` method too PR title: "core: fix async indexing issue with adelete/delete checking" PR message: Currently `langchain.indexes.aindex` checks if vector store has overrided adelete method. But due to `adelete` default implementation store can have just `delete` overrided to make `adelete` working. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-05-16 15:10:25 -04:00
Christophe Bornet	a8f2ddee31	core: Add ruff rules RUF (#29353 ) See https://docs.astral.sh/ruff/rules/#ruff-specific-rules-ruf Mostly: * [RUF022](https://docs.astral.sh/ruff/rules/unsorted-dunder-all/) (unsorted `__all__`) * [RUF100](https://docs.astral.sh/ruff/rules/unused-noqa/) (unused noqa) * [RUF021](https://docs.astral.sh/ruff/rules/parenthesize-chained-operators/) (parenthesize-chained-operators) * [RUF015](https://docs.astral.sh/ruff/rules/unnecessary-iterable-allocation-for-first-element/) (unnecessary-iterable-allocation-for-first-element) * [RUF005](https://docs.astral.sh/ruff/rules/collection-literal-concatenation/) (collection-literal-concatenation) * [RUF046](https://docs.astral.sh/ruff/rules/unnecessary-cast-to-int/) (unnecessary-cast-to-int) --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-05-15 15:43:57 -04:00
Lope Ramos	b8ae2de169	langchain-core[patch]: `Incremental` record manager deletion should be batched (#31206 ) Description: Before this commit, if one record is batched in more than 32k rows for sqlite3 >= 3.32 or more than 999 rows for sqlite3 < 3.31, the `record_manager.delete_keys()` will fail, as we are creating a query with too many variables. This commit ensures that we are batching the delete operation leveraging the `cleanup_batch_size` as it is already done for `full` cleanup. Added unit tests for incremental mode as well on different deleting batch size.	2025-05-14 11:38:21 -04:00
Sydney Runkle	75e50a3efd	core[patch]: Raise `AttributeError` (instead of `ModuleNotFoundError`) in custom `__getattr__` (#30905 ) Follow up to https://github.com/langchain-ai/langchain/pull/30769, fixing the regression reported [here](https://github.com/langchain-ai/langchain/pull/30769#issuecomment-2807483610), thanks @krassowski for the report! Fix inspired by https://github.com/PrefectHQ/prefect/pull/16172/files Other changes: * Using tuples for `__all__`, except in `output_parsers` bc of a list namespace conflict * Using a helper function for imports due to repeated logic across `__init__.py` files becoming hard to maintain. Co-authored-by: Michał Krassowski < krassowski 5832902+krassowski@users.noreply.github.com>"	2025-04-17 14:15:28 -04:00
Christophe Bornet	a4ca1fe0ed	core: Remove some noqa (#30855 )	2025-04-15 13:08:40 -04:00
Sydney Runkle	edb6a23aea	core[lint]: fix issue with unused ignore in `__init__.py` files (#30825 ) Fixing a race condition between https://github.com/langchain-ai/langchain/pull/30769 and https://github.com/langchain-ai/langchain/pull/30737	2025-04-14 17:57:00 +00:00
Sydney Runkle	4f69094b51	core[performance]: use custom `__getattr__` in `__init__.py` files for lazy imports (#30769 ) Most easily reviewed with the "hide whitespace" option toggled. Seeing 10-50% speed ups in import time for common structures 🚀 The general purpose of this PR is to lazily import structures within `langchain_core.XXX_module.__init__.py` so that we're not eagerly importing expensive dependencies (`pydantic`, `requests`, etc). Analysis of flamegraphs generated with `importtime` motivated these changes. For example, the one below demonstrates that importing `HumanMessage` accidentally triggered imports for `importlib.metadata`, `requests`, etc. There's still much more to do on this front, and we can start digging into our own internal code for optimizations now that we're less concerned about external imports. <img width="1210" alt="Screenshot 2025-04-11 at 1 10 54 PM" src="https://github.com/user-attachments/assets/112a3fe7-24a9-4294-92c1-d5ae64df839e" /> I've tracked the improvements with some local benchmarks: ## `pytest-benchmark` results \| Name \| Before (s) \| After (s) \| Delta (s) \| % Change \| \|-----------------------------\|------------\|-----------\|-----------\|----------\| \| Document \| 2.8683 \| 1.2775 \| -1.5908 \| -55.46% \| \| HumanMessage \| 2.2358 \| 1.1673 \| -1.0685 \| -47.79% \| \| ChatPromptTemplate \| 5.5235 \| 2.9709 \| -2.5526 \| -46.22% \| \| Runnable \| 2.9423 \| 1.7793 \| -1.163 \| -39.53% \| \| InMemoryVectorStore \| 3.1180 \| 1.8417 \| -1.2763 \| -40.93% \| \| RunnableLambda \| 2.7385 \| 1.8745 \| -0.864 \| -31.55% \| \| tool \| 5.1231 \| 4.0771 \| -1.046 \| -20.42% \| \| CallbackManager \| 4.2263 \| 3.4099 \| -0.8164 \| -19.32% \| \| LangChainTracer \| 3.8394 \| 3.3101 \| -0.5293 \| -13.79% \| \| BaseChatModel \| 4.3317 \| 3.8806 \| -0.4511 \| -10.41% \| \| PydanticOutputParser \| 3.2036 \| 3.2995 \| 0.0959 \| 2.99% \| \| InMemoryRateLimiter \| 0.5311 \| 0.5995 \| 0.0684 \| 12.88% \| Note the lack of change for `InMemoryRateLimiter` and `PydanticOutputParser` is just random noise, I'm getting comparable numbers locally. ## Local CodSpeed results We're still working on configuring CodSpeed on CI. The local usage produced similar results.	2025-04-14 08:57:54 -04:00
Christophe Bornet	42944f3499	core: Improve mypy config (#30737 ) * Cleanup mypy config * Add mypy `strict` rules except `disallow_any_generics`, `warn_return_any` and `strict_equality` (TODO) * Add mypy `strict_byte` rule * Add mypy support for PEP702 `@deprecated` decorator * Bump mypy version to 1.15 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-04-11 16:35:13 -04:00
Sydney Runkle	fdc2b4bcac	core[lint]: Use 3.9 formatting for docs and tests (#30780 ) Looks like `pyupgrade` was already used here but missed some docs and tests. This helps to keep our docs looking professional and up to date. Eventually, we should lint / format our inline docs.	2025-04-11 10:39:25 -04:00
Christophe Bornet	4cc7bc6c93	core: Add ruff rules PLR (#30696 ) Add ruff rules [PLR](https://docs.astral.sh/ruff/rules/#refactor-plr) Except PLR09xxx and PLR2004. Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2025-04-09 15:15:38 -04:00
Christophe Bornet	98f0016fc2	core: Add ruff rules ARG (#30732 ) See https://docs.astral.sh/ruff/rules/#flake8-unused-arguments-arg	2025-04-09 14:39:36 -04:00
Christophe Bornet	f241fd5c11	core: Add ruff rules RET (#29384 ) See https://docs.astral.sh/ruff/rules/#flake8-return-ret All auto-fixes	2025-04-02 16:59:56 -04:00
Christophe Bornet	4f8ea13cea	core: Add ruff rules PERF (#29375 ) See https://docs.astral.sh/ruff/rules/#perflint-perf	2025-04-01 13:34:56 -04:00
Christophe Bornet	88b4233fa1	core: Add ruff rules D (docstring) (#29406 ) This ensures that the code is properly documented: https://docs.astral.sh/ruff/rules/#pydocstyle-d Related to #21983	2025-04-01 13:15:45 -04:00
Christophe Bornet	e181d43214	core: Bump ruff version to 0.11 (#30519 ) Changes are from the new TC006 rule: https://docs.astral.sh/ruff/rules/runtime-cast-value/ TC006 is auto-fixed.	2025-03-27 13:01:49 -04:00
Keiichi Hirobe	956b09f468	core[patch]: stop deleting records with "scoped_full" when doc is empty (#30520 ) Fix a bug that causes `scoped_full` in index to delete records when there are no input docs.	2025-03-27 11:04:34 -04:00
Christophe Bornet	b3885c124f	core: Add ruff rules TC (#29268 ) See https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc Some fixes done for TC001,TC002 and TC003 but these rules are excluded since they don't play well with Pydantic. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2025-02-26 19:39:05 +00:00
Christophe Bornet	e4a78dfc2a	core: Bump ruff version to 0.9 (#29201 ) Also run some preview autofix and formatting --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2025-01-22 00:20:09 +00:00
Christophe Bornet	1c4ce7b42b	core: Auto-fix some docstrings (#29337 )	2025-01-21 13:29:53 -05:00
Keiichi Hirobe	67fd554512	core[patch]: throw exception indexing code if deletion fails in vectorstore (#28103 ) The delete methods in the VectorStore and DocumentIndex interfaces return a status indicating the result. Therefore, we can assume that their implementations don't throw exceptions but instead return a result indicating whether the delete operations have failed. The current implementation doesn't check the returned value, so I modified it to throw an exception when the operation fails. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-12-13 16:14:27 -05:00
Keiichi Hirobe	258b3be5ec	core[minor]: add new clean up strategy "scoped_full" to indexing (#28505 ) ~Note that this PR is now Draft, so I didn't add change to `aindex` function and didn't add test codes for my change. After we have an agreement on the direction, I will add commits.~ `batch_size` is very difficult to decide because setting a large number like >10000 will impact VectorDB and RecordManager, while setting a small number will delete records unnecessarily, leading to redundant work, as the `IMPORTANT` section says. On the other hand, we can't use `full` because the loader returns just a subset of the dataset in our use case. I guess many people are in the same situation as us. So, as one of the possible solutions for it, I would like to introduce a new argument, `scoped_full_cleanup`. This argument will be valid only when `claneup` is Full. If True, Full cleanup deletes all documents that haven't been updated AND that are associated with source ids that were seen during indexing. Default is False. This change keeps backward compatibility. --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-12-13 20:35:25 +00:00
Eugene Yurtsev	ce90b25313	core[patch]: Update error message in indexing code for unreachable code assertion (#28712 ) Minor update for error message that should never be triggered	2024-12-13 20:21:14 +00:00
Keiichi Hirobe	da28cf1f54	core[patch]: Reverts PR #25754 and add unit tests (#28702 ) I reported the bug 2 weeks ago here: https://github.com/langchain-ai/langchain/issues/28447 I believe this is a critical bug for the indexer, so I submitted a PR to revert the change and added unit tests to prevent similar bugs from being introduced in the future. @eyurtsev Could you check this?	2024-12-13 15:13:06 -05:00
Eric Pinzur	eadc2f6a90	core: added DeleteResponse to the module (#28069 ) Description: * added `DeleteResponse` to the `langchain_core.indexing` module, for implementing DocumentIndex classes.	2024-11-13 11:08:08 -05:00
ZhangShenao	ad0387ac97	Improvement [docs] Improve api docs (#27787 ) - Add missing param - Remove unused param --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-10-31 16:56:44 +00:00
Erick Friis	265e0a164a	core: add flake8-bandit (S) ruff rules to core (#27368 ) Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-10-24 22:33:41 +00:00
Christophe Bornet	d31ec8810a	core: Add ruff rules for error messages (EM) (#26965 ) All auto-fixes Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-07 22:12:28 +00:00
João Carlos Ferra de Almeida	780ce00dea	core[minor]: add kwargs to index and aindex functions for custom vector_field support (#26998 ) Added `kwargs` parameters to the `index` and `aindex` functions in `libs/core/langchain_core/indexing/api.py`. This allows users to pass additional arguments to the `add_documents` and `aadd_documents` methods, enabling the specification of a custom `vector_field`. For example, users can now use `vector_field="embedding"` when indexing documents in `OpenSearchVectorStore` --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-10-07 14:52:50 -04:00

1 2

67 Commits