langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-02-21 22:56:05 +00:00

Author	SHA1	Message	Date
Sam Partee	dee06ff2d8	interim commit for cache fix	2023-08-25 15:44:12 -07:00
Sam Partee	03d9a23c97	Merge branch 'fix-semantic-cache' into redis-refactor	2023-08-25 15:24:52 -07:00
Sam Partee	e0dcf56e28	Address Embeddings interface and retriever tests	2023-08-25 15:24:13 -07:00
Sam Partee	a99288987e	Fix semantic cache	2023-08-25 02:26:37 -07:00
Sam Partee	23a3705c6f	minor context_key bug fix	2023-08-25 02:24:10 -07:00
Sam Partee	6bbbe82088	Address from_existing issues	2023-08-24 23:19:23 -07:00
Bagatur	b8060a621a	lint	2023-08-24 08:04:32 -07:00
Sam Partee	48b7df56c5	Fix some linting errors	2023-08-24 03:07:15 -07:00
Sam Partee	8acc988c84	Merge branch 'master' into redis-refactor	2023-08-24 02:51:09 -07:00
Sam Partee	215d944f52	Add metadata cleaning, docstrings, docs Add the ability to clean the metadata before it goes into redis enabling document_loaders that return lists of strings to create categorical values for Tags in Redis indices. Also, added docstrings and updated the jupyter notebook	2023-08-24 02:43:14 -07:00
Leonid Ganeline	c19888c12c	⏳ docstrings: `vectorstores` consistency (#9349) ⏳ - updated the top-level descriptions to a consistent format; - changed several `ValueError` to `ImportError` in the import cases; - changed the format of several internal functions from "name" to "_name". So, these functions are not shown in the Top-level API Reference page (with lists of classes/functions)	2023-08-23 23:17:05 -07:00
Kim Minjong	d0ff0db698	Update ChatOpenAI._stream to respect finish_reason (#9672) Currently, ChatOpenAI._stream does not reflect finish_reason to generation_info. Change it to reflect that. Same patch as https://github.com/langchain-ai/langchain/pull/9431, but also applies to _stream.	2023-08-23 22:58:14 -07:00
Patrick Loeber	5990651070	Add new document_loader: AssemblyAIAudioTranscriptLoader (#9667) This PR adds a new document loader `AssemblyAIAudioTranscriptLoader` that allows to transcribe audio files with the [AssemblyAI API](https://www.assemblyai.com) and loads the transcribed text into documents. - Add new document_loader with class `AssemblyAIAudioTranscriptLoader` - Add optional dependency `assemblyai` - Add unit tests (using a Mock client) - Add docs notebook This is the equivalent to the JS integration already available in LangChain.js. See the [LangChain JS docs AssemblyAI page](https://js.langchain.com/docs/modules/data_connection/document_loaders/integrations/web_loaders/assemblyai_audio_transcription). At its simplest, you can use the loader to get a transcript back from an audio file like this: ```python from langchain.document_loaders.assemblyai import AssemblyAIAudioTranscriptLoader loader = AssemblyAIAudioTranscriptLoader(file_path="./testfile.mp3") docs = loader.load() ``` To use it, it needs the `assemblyai` python package installed, and the environment variable `ASSEMBLYAI_API_KEY` set with your API key. Alternatively, the API key can also be passed as an argument. Twitter handles to shout out if so kindly 🙇 [@AssemblyAI](https://twitter.com/AssemblyAI) and [@patloeber](https://twitter.com/patloeber) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-08-23 22:51:19 -07:00
Eugene Yurtsev	9e1dbd4b49	x	2023-08-23 22:51:49 -04:00
Eugene Yurtsev	b88dfcb42a	Add indexing support (#9614) This PR introduces a persistence layer to help with indexing workflows into vectostores. The indexing code helps users to: 1. Avoid writing duplicated content into the vectostore 2. Avoid over-writing content if it's unchanged Importantly, this keeps on working even if the content being written is derived via a set of transformations from some source content (e.g., indexing children documents that were derived from parent documents by chunking.) The two main components are: 1. Persistence layer that keeps track of which keys were updated and when. Keeping track of the timestamp of updates, allows to clean up old content safely, and with minimal complexity. 2. HashedDocument which is used to hash the contents (including metadata) of the documents. We rely on the hashes for identifying duplicates. The indexing code works with ANY document loader. To add transformations to the documents, users for now can add a custom document loader that composes an existing loader together with document transformers. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-23 21:41:38 -04:00
刘方瑞	c215481531	Update default index type and metric type for MyScale vector store (#9353) We update the default index type from `IVFFLAT` to `MSTG`, a new vector type developed by MyScale.	2023-08-23 18:26:29 -07:00
Joshua Sundance Bailey	a9c86774da	Anthropic: Allow the use of kwargs consistent with ChatOpenAI. (#9515) - Description: ~~Creates a new root_validator in `_AnthropicCommon` that allows the use of `model_name` and `max_tokens` keyword arguments.~~ Adds pydantic field aliases to support `model_name` and `max_tokens` as keyword arguments. Ultimately, this makes `ChatAnthropic` more consistent with `ChatOpenAI`, making the two classes more interchangeable for the developer. - Issue: https://github.com/langchain-ai/langchain/issues/9510 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-23 18:23:21 -07:00
Bagatur	342087bdfa	fix integration test imports (#9669)	2023-08-23 16:47:01 -07:00
Keras Conv3d	cbaea8d63b	tair fix distance_type error, and add hybrid search (#9531) - fix: distance_type error, - feature: Tair add hybrid search --------- Co-authored-by: thw <hanwen.thw@alibaba-inc.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-23 16:38:31 -07:00
Eugene Yurtsev	cd81e8a8f2	Add exclude to GenericLoader.from_file_system (#9539) support exclude param in GenericLoader.from_filesystem --------- Co-authored-by: Kyle Pancamo <50267605+KylePancamo@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-23 16:09:10 -07:00
Jacob Lee	278ef0bdcf	Adds ChatOllama (#9628) @rlancemartin --------- Co-authored-by: Adilkhan Sarsen <54854336+adolkhan@users.noreply.github.com> Co-authored-by: Kim Minjong <make.dirty.code@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-23 13:02:26 -07:00
Nuno Campos	20ce283fa7	Format	2023-08-23 20:03:35 +01:00
Nuno Campos	6424b3cde0	Add another test	2023-08-23 20:02:35 +01:00
William FH	da18e177f1	Update libs/langchain/langchain/schema/runnable/base.py Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-08-23 20:00:16 +01:00
Nuno Campos	c326751085	Lint	2023-08-23 20:00:16 +01:00
Nuno Campos	6d19709b65	RunnableLambda, if func returns a Runnable, run it	2023-08-23 20:00:16 +01:00
Nuno Campos	677da6a0fd	Add support for async funcs in RunnableSequence	2023-08-23 19:54:48 +01:00
Nuno Campos	1751fe114d	Add one more test	2023-08-23 19:52:13 +01:00
Nuno Campos	882b97cfd2	Lint	2023-08-23 19:50:20 +01:00
Nuno Campos	3ddabe8b2c	Code review	2023-08-23 19:48:33 +01:00
Nuno Campos	fdcd50aab4	Extend test	2023-08-23 19:48:33 +01:00
Nuno Campos	9777c2801d	Update method and docstring	2023-08-23 19:48:33 +01:00
Nuno Campos	93bbf67afc	WIP Add test Add test Lint	2023-08-23 19:48:33 +01:00
Nuno Campos	c184be5511	Use a shared executor for all parallel calls	2023-08-23 19:48:33 +01:00
Nuno Campos	db4b256a28	Add error for batch of 0	2023-08-23 19:39:46 +01:00
Nuno Campos	3458489936	Lint	2023-08-23 19:39:46 +01:00
Nuno Campos	e420bf22b6	Lint	2023-08-23 19:39:46 +01:00
Nuno Campos	cc83f54694	L:int	2023-08-23 19:39:46 +01:00
Nuno Campos	d414d47c78	Use a shared executor for all parallel calls	2023-08-23 19:39:46 +01:00
Bagatur	a40c12bb88	Update the nlpcloud connector after some changes on the NLP Cloud API (#9586) - Description: remove some text generation deprecated parameters and update the embeddings doc, - Tag maintainer: @rlancemartin	2023-08-23 11:35:08 -07:00
Bagatur	e2e582f1f6	Fixed source key name for docugami loader (#8598) The Docugami loader was not returning the source metadata key. This was triggering this exception when used with retrievers, per https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/schema/prompt_template.py#L193C1-L195C41 The fix is simple and just updates the metadata key name for the document each chunk is sourced from, from "name" to "source" as expected. I tested by running the python notebook that has an end to end scenario in it. Tagging DataLoader maintainers @rlancemartin @eyurtsev	2023-08-23 11:24:55 -07:00
karynzv	5508baf1eb	Add CrateDB prompt (#9657) Adds a prompt template for the CrateDB SQL dialect.	2023-08-23 13:33:37 -04:00
Bagatur	a8e8a31b41	Merge branch 'master' into bagatur/locals_in_config	2023-08-23 10:26:11 -07:00
Bagatur	ef2500584c	fmt	2023-08-23 10:15:45 -07:00
Zizhong Zhang	8a03836160	docs: fix PromptGuard docs (#9659) Fix PromptGuard docs. Noticed several trivial issues on the docs when integrating the new class. cc @baskaryan	2023-08-23 10:04:53 -07:00
Guy Korland	39a5d02225	Cleanup of ruff warnings use isinstance() instead of type() (#9655) Minor cosmetic PR just cleanup of `ruff` warnings use `isinstance()` instead of `type()`	2023-08-23 07:14:31 -07:00
Joseph McElroy	2a06e7b216	ElasticsearchStore: improve error logging for adding documents (#9648) Not obvious what the error is when you cannot index. This pr adds the ability to log the first errors reason, to help the user diagnose the issue. Also added some more documentation for when you want to use the vectorstore with an embedding model deployed in elasticsearch. Credit: @elastic and @phoey1	2023-08-23 07:04:09 -07:00
Julien Salinas	f1072cc31f	Merge branch 'master' into master	2023-08-23 14:42:40 +02:00
Sam Partee	0fbff0238d	interim commit	2023-08-22 23:48:15 -07:00
Sam Partee	8adaa7805e	interim commit	2023-08-22 21:20:17 -07:00

579 Commits