langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-06-02 04:58:46 +00:00

Chris Papademetrious 305d74c67a core: implement a batch_size parameter for CacheBackedEmbeddings (#18070)		2024-03-19 18:55:43 +00:00

Description: Currently, `CacheBackedEmbeddings` computes vectors for all uncached documents before updating the store. This pull request updates the embedding computation loop to compute embeddings in batches, updating the store after each batch.

I noticed this when I tried `CacheBackedEmbeddings` on our 30k document set and the cache directory hadn't appeared on disk after 30 minutes.

The motivation is to minimize compute/data loss when problems occur:

* If there is a transient embedding failure (e.g. a network outage at the embedding endpoint triggers an exception), at least the completed vectors are written to the store instead of being discarded.
* If there is an issue with the store (e.g. no write permissions), the condition is detected early without computing (and discarding!) all the vectors.

Issue: Implements enhancement #18026.

Testing: I was unable to run unit tests; details in [this post](https://github.com/langchain-ai/langchain/discussions/15019#discussioncomment-8576684).

---------

Signed-off-by: chrispy <chrispy@synopsys.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
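
The batching behavior described in the commit message above can be illustrated with a minimal sketch. This is not the actual `CacheBackedEmbeddings` code: the names `batched`, `embed_with_cache`, and `embed_fn` are hypothetical, and a plain `dict` stands in for the document embedding store. It only shows the idea of writing each batch to the store before computing the next one, so that a later failure does not discard earlier results.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Vector = List[float]


def batched(items: Iterable[str], size: Optional[int]) -> Iterator[List[str]]:
    """Yield lists of at most `size` items; a single batch if `size` is None."""
    it = iter(items)
    if size is None:
        batch = list(it)
        if batch:
            yield batch
        return
    while batch := list(islice(it, size)):
        yield batch


def embed_with_cache(
    texts: List[str],
    embed_fn: Callable[[List[str]], List[Vector]],  # hypothetical embedder
    store: Dict[str, Vector],                       # dict used as a stand-in store
    batch_size: Optional[int] = None,
) -> List[Vector]:
    """Embed only the uncached texts, persisting each batch as it completes."""
    # De-duplicate while keeping order, then drop texts already in the store.
    missing = [t for t in dict.fromkeys(texts) if t not in store]
    for batch in batched(missing, batch_size):
        vectors = embed_fn(batch)
        # Write immediately: if a later batch raises, this work is kept,
        # and a store problem would surface on the first batch.
        store.update(zip(batch, vectors))
    return [store[t] for t in texts]
```

With `batch_size=None` the sketch falls back to the old behavior of embedding everything in one call; a smaller batch size trades a few more store writes for less lost work when the embedder or the store fails partway through.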
_api	core[patch]: deprecation docstring with lib (#18350)	2024-03-01 00:44:13 +00:00
beta	core: Updated docstring for Context class (#19079)	2024-03-18 21:15:14 -07:00
callbacks	[Enhancement] Add support for directly providing a run_id (#18990)	2024-03-18 15:03:04 -07:00
document_loaders	community: If load() has been overridden, use it in default lazy_load() (#18690)	2024-03-07 11:52:19 -05:00
documents	core[minor]: move document compressor base (#17910)	2024-02-26 17:20:50 -08:00
embeddings	core[minor]: moved fake llms and embeddings to core (#19226)	2024-03-18 10:01:26 -07:00
example_selectors	docs: modules descriptions (#17844)	2024-02-21 15:58:21 -08:00
language_models	core[patch]: Pass sync run manager for sync stream fallback in astream (#19280)	2024-03-19 16:32:33 +00:00
load	core[patch]: Change structured prompt lc id to match js (#19099)	2024-03-14 20:02:52 -07:00
messages	core[minor]: generation info on msg (#18592)	2024-03-12 04:43:17 +00:00
output_parsers	core: Fix Exception handling in XMLOutputParser (#19126)	2024-03-18 21:08:32 -07:00
outputs	docs: modules descriptions (#17844)	2024-02-21 15:58:21 -08:00
prompts	core[patch]: Change structured prompt lc id to match js (#19099)	2024-03-14 20:02:52 -07:00
pydantic_v1	Separate out langchain_core package (#13577)	2023-11-20 13:09:30 -08:00
runnables	code[patch]: Add in code documentation to core Runnable with_retry method (docs only) (#19192)	2024-03-19 12:52:29 -04:00
tracers	core[major]: On Tool End Observation Casting Fix (#18798)	2024-03-11 10:59:04 -04:00
utils	core: implement a batch_size parameter for CacheBackedEmbeddings (#18070)	2024-03-19 18:55:43 +00:00
__init__.py	core[patch], community[patch]: mark runnable context, lc load as beta (#15603)	2024-01-05 17:54:26 -05:00
agents.py	docs: modules descriptions (#17844)	2024-02-21 15:58:21 -08:00
caches.py	docs: modules descriptions (#17844)	2024-02-21 15:58:21 -08:00
chat_history.py	docs: modules descriptions (#17844)	2024-02-21 15:58:21 -08:00
chat_sessions.py	docs: modules descriptions (#17844)	2024-02-21 15:58:21 -08:00
env.py	core[patch]: update langchain-core runtime library name (#14884)	2023-12-20 14:35:48 -08:00
exceptions.py	docs: modules descriptions (#17844)	2024-02-21 15:58:21 -08:00
globals.py	core[patch]: Move `globals` to a module instead of a package (non breaking change) (#19159)	2024-03-19 12:29:12 -04:00
memory.py	docs: modules descriptions (#17844)	2024-02-21 15:58:21 -08:00
prompt_values.py	docs: modules descriptions (#17844)	2024-02-21 15:58:21 -08:00
py.typed	core[minor], langchain[patch], experimental[patch]: Added missing `py.typed` to `langchain_core` (#14143)	2023-12-01 19:15:23 -08:00
retrievers.py	[Enhancement] Add support for directly providing a run_id (#18990)	2024-03-18 15:03:04 -07:00
stores.py	core: upgrade mypy to recent mypy (#18753)	2024-03-07 15:25:19 -05:00
sys_info.py	docs: modules descriptions (#17844)	2024-02-21 15:58:21 -08:00
tools.py	[Enhancement] Add support for directly providing a run_id (#18990)	2024-03-18 15:03:04 -07:00
vectorstores.py	docs: modules descriptions (#17844)	2024-02-21 15:58:21 -08:00