langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-06-03 05:34:01 +00:00

matt haigh a4896da2a0 Experimental: Add other threshold types to SemanticChunker (#16807)	2024-02-26 13:50:48 -08:00

Description: Adding different threshold types to the semantic chunker. I've had much better and more predictable performance when using standard deviations instead of percentiles.

![image](https://github.com/langchain-ai/langchain/assets/44395485/066e84a8-460e-4da5-9fa1-4ff79a1941c5)

For all the documents I've tried, the distribution of distances looks similar to the above: roughly normal with a positive skew. All skews I've seen are less than 1, which would explain why standard deviations perform well, but I've included IQR if anyone wants something more robust.

Also, by using the percentile method backwards, you can declare the number of clusters and use semantic chunking to get an 'optimal' splitting.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
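The three threshold types named in the commit message can be sketched roughly as follows. This is an illustrative standard-library version, not the code in `text_splitter.py`: the function names, the `amount` parameter, and the choice of population standard deviation are all assumptions made for the example. `distances` stands for the list of embedding distances between consecutive sentences, and a split is placed wherever a distance exceeds the threshold.

```python
import statistics


def percentile(values, pct):
    """Percentile (pct in 0..100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * pct / 100
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def breakpoint_threshold(distances, kind, amount):
    """Return a distance cutoff for one of three hypothetical threshold kinds."""
    if kind == "percentile":
        # amount is a percentile, e.g. 95
        return percentile(distances, amount)
    if kind == "standard_deviation":
        # amount is the number of standard deviations above the mean
        return statistics.mean(distances) + amount * statistics.pstdev(distances)
    if kind == "interquartile":
        # amount scales the interquartile range added to the mean
        iqr = percentile(distances, 75) - percentile(distances, 25)
        return statistics.mean(distances) + amount * iqr
    raise ValueError(f"unknown threshold kind: {kind}")


def breakpoints(distances, threshold):
    """Indices whose distance exceeds the threshold (split after these)."""
    return [i for i, d in enumerate(distances) if d > threshold]


def breakpoints_for_n_chunks(distances, n_chunks):
    """'Backwards' use: pick the n_chunks - 1 largest distances as splits."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    k = min(n_chunks - 1, len(distances))
    by_size = sorted(range(len(distances)), key=lambda i: distances[i], reverse=True)
    return sorted(by_size[:k])


if __name__ == "__main__":
    demo = [0.1, 0.2, 0.15, 0.9, 0.12, 0.18, 0.8, 0.11]
    for kind, amount in [("percentile", 50), ("standard_deviation", 1.0), ("interquartile", 1.5)]:
        cutoff = breakpoint_threshold(demo, kind, amount)
        print(kind, round(cutoff, 4), breakpoints(demo, cutoff))
    print("3 chunks:", breakpoints_for_n_chunks(demo, 3))
```

The multipliers in the demo are arbitrary; which values work well depends on the distance distribution of the documents being split.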
agents	experimental[patch]: fix zero-shot pandas agent (#17442)	2024-02-12 21:58:35 -08:00
autonomous_agents	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
chat_models	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
comprehend_moderation	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
cpal	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
data_anonymizer	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
fallacy_removal	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
generative_agents	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
graph_transformers	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
llm_bash	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
llm_symbolic_math	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
llms	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
open_clip	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
openai_assistant	Move OAI assistants to langchain and add callbacks (#13236)	2023-11-13 17:42:07 -08:00
pal_chain	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
plan_and_execute	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
prompt_injection_identifier	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
prompts	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
pydantic_v1	`poetry lock` the experimental package. (#9478)	2023-08-22 14:09:35 -04:00
recommenders	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
retrievers	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
rl_chain	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
smart_llm	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
sql	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
synthetic_data	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
tabular_synthetic_data	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
tools	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
tot	experimental: docstrings update (#18048)	2024-02-23 21:24:16 -05:00
utilities	Clean up deprecated agents and update __init__ in experimental (#12231)	2023-10-27 13:52:50 -04:00
__init__.py	Add version to langchain_experimental (#11613)	2023-10-10 11:17:41 -04:00
py.typed	Add `py.typed` file to `langchain-experimental`. (#9557)	2023-08-21 15:37:16 -04:00
text_splitter.py	Experimental: Add other threshold types to SemanticChunker (#16807)	2024-02-26 13:50:48 -08:00