langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-02-04 16:20:16 +00:00

Author	SHA1	Message	Date
William Fu-Hinthorn	b09df253e3	maybe	2023-06-30 12:48:59 -07:00
William FH	e4625846e5	Add Flyte Callback Handler (#6139 ) (#6986 ) Signed-off-by: Samhita Alla <aallasamhita@gmail.com> Co-authored-by: Samhita Alla <aallasamhita@gmail.com>	2023-06-30 12:25:22 -07:00
Bagatur	e3b7effc8f	Beef up import test (#6979 )	2023-06-30 09:26:05 -07:00
Bagatur	1ce9ef3828	Rm pytz dep (#6978 )	2023-06-30 09:24:01 -07:00
Davis Chase	eb180e321f	Page per class-style api reference (#6560 ) can make it prettier, but what do we think of overall structure? https://api.python.langchain.com/en/dev2049-page_per_class/api_ref.html --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-06-30 09:23:32 -07:00
William FH	64039b9f11	Promptlayer Callback (#6975 ) Co-authored-by: Saleh Hindi <saleh.hindi.one@gmail.com> Co-authored-by: jped <jonathanped@gmail.com>	2023-06-30 08:32:42 -07:00
William FH	13c62cf6b1	Arthur Callback (#6972 ) Co-authored-by: Max Cembalest <115359769+arthuractivemodeling@users.noreply.github.com>	2023-06-30 07:48:02 -07:00
William FH	8c73037dff	Simplify eval arg names (#6944 ) It'll be easier to switch between these if the names of predictions are consistent	2023-06-30 07:47:53 -07:00
Bagatur	8f5eca236f	release v220 (#6962 )	2023-06-30 06:52:09 -07:00
Bagatur	60b0d6ea35	Bagatur/openllm ensure available (#6960 ) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-30 00:54:23 -07:00
Siraj Aizlewood	521c6f0233	Provided default values for tags and inheritable_tags args in BaseRun… (#6858 ) When running AsyncCallbackManagerForChainRun (from langchain.callbacks.manager import AsyncCallbackManagerForChainRun), provided default values of empty lists for tags and inheritable_tags in manager.py BaseRunManager. - Description: In manager.py, `BaseRunManager`, default values were provided for the `__init__` args `tags` and `inheritable_tags`. They default to empty lists (`[]`). - Issue: When trying to use Nvidia NeMo Guardrails with LangChain, the following exception was raised:	2023-06-29 22:01:08 -07:00
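The defaulting pattern described in #6858 can be sketched as follows. This is a minimal illustration, not the actual `BaseRunManager` code: `RunManagerSketch` and its `run_id` argument are hypothetical, and using `None` as the sentinel (rather than a literal `[]` default) is an assumption made here so that instances never share one list.

```python
from typing import List, Optional


class RunManagerSketch:
    """Hypothetical stand-in for a run manager; not LangChain's class."""

    def __init__(
        self,
        run_id: str,
        tags: Optional[List[str]] = None,
        inheritable_tags: Optional[List[str]] = None,
    ) -> None:
        self.run_id = run_id
        # Normalize omitted arguments to fresh empty lists so that callers
        # which do not pass tags still work, and no list is shared.
        self.tags = list(tags) if tags is not None else []
        self.inheritable_tags = (
            list(inheritable_tags) if inheritable_tags is not None else []
        )


# A caller that supplies no tags at all now constructs successfully.
manager = RunManagerSketch("run-1")
```

Copying the incoming lists also keeps later mutations by the caller from leaking into the manager.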
Davis Chase	bd6a0ee9e9	Redirect vecstores (#6948 )	2023-06-29 19:22:21 -07:00
Davis Chase	f780678910	Add back in clickhouse mongo vecstore notebooks (#6949 )	2023-06-29 19:21:47 -07:00
Jacob Lee	73831ef3d8	Change code block color scheme (#6945 ) Adds contrast, makes code blocks more readable.	2023-06-29 19:21:11 -07:00
Tahjyei Thompson	7d8830f707	Add `OpenAIMultiFunctionsAgent` to import list in agents directory (#6824 ) - Added OpenAIMultiFunctionsAgent to the import list of the Agents directory --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-29 18:34:26 -07:00
Matt Florence	0f6737735d	Order messages in PostgresChatMessageHistory (#6830 ) Fixes issue: https://github.com/hwchase17/langchain/issues/6829 This guarantees message history is in the correct order. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-29 18:10:28 -07:00
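A rough sketch of why the explicit ordering in #6830 matters, using the standard-library `sqlite3` module as a stand-in for Postgres. The table name `message_store` and its columns are assumptions for illustration only; without an `ORDER BY`, SQL makes no guarantee about row order.

```python
import sqlite3

# In-memory database used purely as a stand-in for a Postgres connection.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE message_store ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, session_id TEXT, message TEXT)"
)
for text in ["hello", "how are you?", "fine, thanks"]:
    conn.execute(
        "INSERT INTO message_store (session_id, message) VALUES (?, ?)",
        ("session-1", text),
    )


def load_messages(connection, session_id):
    """Return messages for one session in insertion order."""
    rows = connection.execute(
        # ORDER BY on the auto-incrementing id makes the order explicit.
        "SELECT message FROM message_store WHERE session_id = ? ORDER BY id",
        (session_id,),
    ).fetchall()
    return [row[0] for row in rows]


history = load_messages(conn, "session-1")
```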
lucasiscovici	e9950392dd	Add password to PyPDF loader and parser (#6908 ) Add password to PyPDF loader and parser --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-29 17:35:50 -07:00
Zander Chase	429f4dbe4d	Add Input Mapper in run_on_dataset (#6894 ) If you create a dataset from runs and run the same chain or llm on it later, it usually works great. If you have an agent dataset and want to run a different agent on it, or have more complex schema, it's hard for us to automatically map these values every time. This PR lets you pass in an input_mapper function that converts the example inputs to whatever format your model expects	2023-06-29 16:53:49 -07:00
Lei Pan	76d03f398d	support max_chunk_bytes in OpensearchVectorSearch to pass down to bulk (#6855 ) Support the `max_chunk_bytes` kwarg and pass it down to the `bulk` helper, in order to support the request limits in Opensearch locally and in AWS. @rlancemartin, @eyurtsev	2023-06-29 15:50:08 -07:00
Hashem Alsaket	5861770a53	Updated QA notebook (#6801 ) Description: `all_metadatas` was not defined, `OpenAIEmbeddings` was not imported, Issue: #6723, Dependencies: lark, Tag maintainer: @vowelparrot , @dev2049 --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-29 15:41:53 -07:00
Kacper Łukawski	140ba682f1	Support named vectors in Qdrant (#6871 ) # Description This PR makes it possible to use named vectors from Qdrant in Langchain. That was requested multiple times, as people want to reuse externally created collections in Langchain. It doesn't change anything for the existing applications. The changes were covered with some integration tests and included in the docs. ## Example ```python Qdrant.from_documents( docs, embeddings, location=":memory:", collection_name="my_documents", vector_name="custom_vector", ) ``` ### Issue: #2594 Tagging @rlancemartin & @eyurtsev. I'd appreciate your review.	2023-06-29 15:14:22 -07:00
bradcrossen	9ca1cf003c	Re-add Support for SQLAlchemy <1.4 (#6895 ) Support for SQLAlchemy 1.3 was removed in version 0.0.203 by change #6086. Re-adding support. - Description: Imports SQLAlchemy Row at class creation time instead of at init to support SQLAlchemy <1.4. This is the only breaking change and was introduced in version 0.0.203 #6086. A similar change was merged before: https://github.com/hwchase17/langchain/pull/4647 - Dependencies: Reduces SQLAlchemy dependency to > 1.3 - Tag maintainer: @rlancemartin, @eyurtsev, @hwchase17, @wangxuqi --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-29 14:49:35 -07:00
corranmac	20c6ade2fc	Grobid parser for Scientific Articles from PDF (#6729 ) ### Scientific Article PDF Parsing via Grobid `Description:` This change adds the GrobidParser class, which uses the Grobid library to parse scientific articles into a universal XML format containing the article title, references, sections, section text etc. The GrobidParser uses a local Grobid server to return PDF documents as XML and parses the XML to optionally produce documents of individual sentences or of whole paragraphs. Metadata includes the text, paragraph number, pdf relative bboxes, pages (text may overlap over two pages), section title (Introduction, Methodology etc), section_number (e.g. 1.1, 2.3), the title of the paper and finally the file path. Grobid parsing is useful beyond standard pdf parsing as it accurately outputs sections and paragraphs within them. This allows for post-filtering of results for specific sections, e.g. limiting results to the methodology or results section. While sections are split via headings, ideally they could be classified specifically into introduction, methodology, results, discussion, conclusion. I'm currently experimenting with chatgpt-3.5 for this function, which could later be implemented as a textsplitter. `Dependencies:` For use, the grobid repo must be cloned and Java must be installed; for Colab this is: ``` !apt-get install -y openjdk-11-jdk -q !update-alternatives --set java /usr/lib/jvm/java-11-openjdk-amd64/bin/java !git clone https://github.com/kermitt2/grobid.git os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-11-openjdk-amd64" os.chdir('grobid') !./gradlew clean install ``` Once installed, the server is run on localhost:8070 via ``` get_ipython().system_raw('nohup ./gradlew run > grobid.log 2>&1 &') ``` @rlancemartin, @eyurtsev Twitter Handle: @Corranmac Grobid Demo Notebook is [here](https://colab.research.google.com/drive/1X-St_mQRmmm8YWtct_tcJNtoktbdGBmd?usp=sharing).
--------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-29 14:29:29 -07:00
Baichuan Sun	6157bdf9d9	Add API Header for Amazon API Gateway Authentication (#6902 ) Add API Headers support for Amazon API Gateway to enable Authentication using DynamoDB.	2023-06-29 12:58:07 -07:00
Wey Gu	1c66aa6d56	chore: NebulaGraph prompt optimization (#6904 ) While preparing a demo project for NebulaGraphQAChain, I found that the prompt needed to be optimized a little bit. Please @hwchase17 kindly help review. Thanks!	2023-06-29 12:57:39 -07:00
Harrison Chase	0ba175e13f	move octo notebook (#6901 )	2023-06-29 12:20:55 -07:00
Stefano Lottini	75fb9d2fdc	Cassandra support for chat history using CassIO library (#6771 ) ### Overview This PR aims at building on #4378, expanding the capabilities and building on top of the `cassIO` library to interface with the database (as opposed to using the core drivers directly). Usage of `cassIO` (a library abstracting Cassandra access for ML/GenAI-specific purposes) is already established since #6426 was merged, so no new dependencies are introduced. In the same spirit, we try to unify the interface for using Cassandra instances throughout LangChain: all our appreciation of the work by @jj701 notwithstanding, who paved the way for this incremental work (thank you!), we identified a few reasons for changing the way a `CassandraChatMessageHistory` is instantiated. Advocating a syntax change is something we don't take lightly, so we add some explanations about this below. Additionally, this PR expands on integration testing, enables use of Cassandra's native Time-to-Live (TTL) features and improves the phrasing around the notebook example and the short "integrations" documentation paragraph. We would kindly request @hwchase to review (since this is an elaboration and proposed improvement of #4378, which had the same reviewer). ### About the __init__ breaking changes There are [many](https://docs.datastax.com/en/developer/python-driver/3.28/api/cassandra/cluster/) options when creating the `Cluster` object, and new ones might be added at any time. Choosing some of them and exposing them as `__init__` parameters of `CassandraChatMessageHistory` will prove to be insufficient for at least some users. On the other hand, working through `kwargs` or adding a long, long list of arguments to `__init__` is not a desirable option either. For this reason (as done in #6426), we propose that whoever instantiates the Chat Message History class provide a Cassandra `Session` object, ready to use.
This also enables easier injection of mocks and usage of Cassandra-compatible connections (such as those to the cloud database DataStax Astra DB, obtained with a different set of init parameters than `contact_points` and `port`). We feel that a breaking change might still be acceptable since LangChain is at `0.*`. However, while maintaining that the approach we propose will be more flexible in the future, room could be made for a "compatibility layer" that respects the current init method. Honestly, we would do that only if there are strong reasons for it, as that would entail an additional maintenance burden. ### Other changes We propose to remove the keyspace creation from the class code for two reasons: first, production Cassandra instances often employ RBAC so that the database user reading/writing from tables does not necessarily (and generally shouldn't) have permission to create keyspaces, and second, programmatic keyspace creation is not a best practice (it should be done more or less manually, with extra care about schema mismatches among nodes, etc). Removing this (usually unnecessary) operation from the `__init__` path would also improve initialization performance (shorter time). We suggest, likewise, to remove the `__del__` method (which would close the database connection), for the following reason: it is the recommended best practice to create a single Cassandra `Session` object throughout an application (it is a resource-heavy object capable of handling concurrency internally), so in case Cassandra is used in other ways by the app there is the risk of truncating the connection for all usages when the history instance is destroyed. Moreover, the `Session` object, in typical applications, is best left to garbage-collect itself automatically.
As mentioned above, we defer the actual database I/O to the `cassIO` library, which is designed to encode practices optimized for LLM applications (among others) without the need to expose LangChain developers to the internals of CQL (Cassandra Query Language). CassIO is already employed by LangChain's Vector Store support for Cassandra. We added a few more connection options in the companion notebook example (most notably, Astra DB) to encourage usage by anyone who cannot run their own Cassandra cluster. We surface the `ttl_seconds` option for automatic handling of an expiration time for chat history messages, a likely useful feature given that very old messages generally may lose their importance. We elaborated a bit more on the integration testing (Time-to-live, separation of "session ids", ...). ### Remarks from linter & co. We reinstated `cassio` as a dependency both in the "optional" group and in the "integration testing" group of `pyproject.toml`. This might not be the right thing to do, in which case the author of this PR offers his apologies (lack of confidence with Poetry - happy to be pointed in the right direction, though!). During linter tests, we were hit by some errors which appear unrelated to the code in the PR.
We left them as they are and report them here for awareness: ``` langchain/vectorstores/mongodb_atlas.py:137: error: Argument 1 to "insert_many" of "Collection" has incompatible type "List[Dict[str, Sequence[object]]]"; expected "Iterable[Union[MongoDBDocumentType, RawBSONDocument]]" [arg-type] langchain/vectorstores/mongodb_atlas.py:186: error: Argument 1 to "aggregate" of "Collection" has incompatible type "List[object]"; expected "Sequence[Mapping[str, Any]]" [arg-type] langchain/vectorstores/qdrant.py:16: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:19: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:20: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:22: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:23: error: Name "grpc" is not defined [name-defined] ``` In the same spirit, we observe that to even get `import langchain` to run, it seems that a `pip install bs4` is missing from the minimal package installation path. Thank you!	2023-06-29 10:50:34 -07:00
Zander Chase	f5663603cf	Throw error if evaluation key not present (#6874 )	2023-06-29 10:30:39 -07:00
Zander Chase	be164b20d8	Accept any single input (#6888 ) If I upload a dataset with a single input and output column, we should be able to let the chain prepare the input without having to maintain a strict dataset format.	2023-06-29 10:29:16 -07:00
Harrison Chase	8502117f62	bump version to 219 (#6899 )	2023-06-28 23:48:42 -07:00
Pablo	6370808d41	Adding support for async (_acall) for VertexAICommon LLM (#5588 ) # Adding support for async (_acall) for VertexAICommon LLM This PR implements the `_acall` method under `_VertexAICommon`. Because VertexAI itself does not provide an async interface, I implemented it via a ThreadPoolExecutor that can delegate execution of VertexAI calls to other threads. Twitter handle: @polecitoem : ) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: fyi - @agola11 for async functionality fyi - @Ark-kun from VertexAI	2023-06-28 23:07:41 -07:00
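The approach described in #5588, wrapping a blocking call so it can be awaited, might look roughly like the sketch below. It uses only the standard library; `blocking_predict` is a hypothetical stand-in for a synchronous SDK call and `apredict` is an illustrative name, not the actual `_acall` implementation.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def blocking_predict(prompt: str, temperature: float = 0.0) -> str:
    """Hypothetical synchronous client call that blocks the calling thread."""
    time.sleep(0.01)
    return f"echo: {prompt}"


# Shared pool that runs the blocking calls off the event loop thread.
_executor = ThreadPoolExecutor(max_workers=4)


async def apredict(prompt: str, **kwargs) -> str:
    """Await the blocking call by delegating it to a worker thread."""
    loop = asyncio.get_running_loop()
    call = partial(blocking_predict, prompt, **kwargs)
    return await loop.run_in_executor(_executor, call)


async def main():
    # Both calls run concurrently; gather returns results in argument order.
    return await asyncio.gather(apredict("a"), apredict("b"))


results = asyncio.run(main())
```

The event loop stays responsive while the worker threads wait, at the cost of one thread per in-flight call.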
Mike Salvatore	cbd759aaeb	Fix inconsistent logging_and_data_dir parameter in AwaDB (#6775 ) ## Description Tag maintainer: @rlancemartin, @eyurtsev ### log_and_data_dir `AwaDB.__init__()` accepts a parameter named `log_and_data_dir`. But `AwaDB.from_texts()` and `AwaDB.from_documents()` accept a parameter named `logging_and_data_dir`. This inconsistency in this parameter name can lead to confusion on the part of the caller. This PR renames `logging_and_data_dir` to `log_and_data_dir` to make all functions consistent with the constructor. ### embedding `AwaDB.__init__()` accepts a parameter named `embedding_model`. But `AwaDB.from_texts()` and `AwaDB.from_documents()` accept a parameter named `embeddings`. This inconsistency in this parameter name can lead to confusion on the part of the caller. This PR renames `embedding_model` to `embeddings` to make AwaDB's constructor consistent with the classmethod "constructors" as specified by `VectorStore` abstract base class.	2023-06-28 23:06:52 -07:00
Harrison Chase	3ac08c3de4	Harrison/octo ml (#6897 ) Co-authored-by: Bassem Yacoube <125713079+AI-Bassem@users.noreply.github.com> Co-authored-by: Shotaro Kohama <khmshtr28@gmail.com> Co-authored-by: Rian Dolphin <34861538+rian-dolphin@users.noreply.github.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Shashank Deshpande <shashankdeshpande18@gmail.com>	2023-06-28 23:04:11 -07:00
Jiří Moravčík	a6b40b73e5	Add `call_actor_task` to the Apify integration (#6862 ) A user has been testing the Apify integration inside langchain and was not able to run saved Actor tasks. This PR adds support for calling saved Actor tasks on the Apify platform to the existing integration. The structure is very similar to that of calling Actors.	2023-06-28 22:13:47 -07:00
Shashank Deshpande	99cfe192da	added example notebook - use custom functions with openai agent (#6865 )	2023-06-28 22:07:33 -07:00
Rian Dolphin	2e39ede848	add with score option for max marginal relevance (#6867 ) ### Adding the functionality to return the scores with retrieved documents when using the max marginal relevance - Description: Add the method `max_marginal_relevance_search_with_score_by_vector` to the FAISS wrapper. Functionality operates the same as `similarity_search_with_score_by_vector` except that it uses the max marginal relevance retrieval framework, as in the `max_marginal_relevance_search_by_vector` method. - Dependencies: None - Tag maintainer: @rlancemartin @eyurtsev - Twitter handle: @RianDolphin --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-28 22:00:34 -07:00
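For readers unfamiliar with max marginal relevance (MMR), the idea of returning scores alongside the selected items can be sketched in plain Python. This is a generic illustration under stated assumptions, not the FAISS wrapper's code: `mmr_with_scores` and `cosine` are hypothetical helpers, and cosine similarity is used as the score here, whereas a vector store may report a distance instead.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def mmr_with_scores(query, candidates, k=2, lambda_mult=0.5):
    """Pick up to k candidates by MMR; return (index, query similarity) pairs."""
    query_sims = [cosine(query, c) for c in candidates]
    selected = []
    while len(selected) < min(k, len(candidates)):
        best_idx, best_value = -1, -math.inf
        for i, sim in enumerate(query_sims):
            if i in selected:
                continue
            # Redundancy: highest similarity to anything already selected.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            value = lambda_mult * sim - (1 - lambda_mult) * redundancy
            if value > best_value:
                best_idx, best_value = i, value
        selected.append(best_idx)
    return [(i, query_sims[i]) for i in selected]


query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
# A low lambda_mult favors diversity over closeness to the query.
diverse = mmr_with_scores(query, candidates, k=2, lambda_mult=0.3)
```

With `lambda_mult=1.0` the selection reduces to plain similarity ranking, which is a quick sanity check for any MMR implementation.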
Shotaro Kohama	398e4cd2dc	Update `langchain.chains.create_extraction_chain_pydantic` to parse results successfully (#6887 ) - Description: - The current code uses `PydanticSchema.schema()` and `_get_extraction_function` at the same time. As a result, a response from OpenAI has two nested `info`, and `PydanticAttrOutputFunctionsParser` fails to parse it. This PR will use the pydantic class given as an arg instead. - Issue: no related issue yet - Dependencies: no dependency change - Tag maintainer: @dev2049 - Twitter handle: @shotarok28	2023-06-28 21:57:41 -07:00
Eduard van Valkenburg	57f370cde9	PowerBI Toolkit additional logs (#6881 ) Added some additional logs to better be able to troubleshoot and understand the performance of the call to PBI vs the rest of the work.	2023-06-28 18:16:41 -07:00
Robert Lewis	c9c8d2599e	Update Zapier Jupyter notebook to include brief OAuth example (#6892 ) Description: Adds a brief example of using an OAuth access token with the Zapier wrapper. Also links to the Zapier documentation to learn more about OAuth flows. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-28 18:06:22 -07:00
Zhicheng Geng	16b11bda83	Use `getLogger` instead of `basicConfig` in `multi_query.py` (#6891 ) Remove `logging.basicConfig`, which turns on logging. Use `getLogger` instead	2023-06-28 18:06:10 -07:00
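The difference behind #6891 can be sketched with the standard `logging` module. The logger name `multi_query_sketch` and the `generate_queries` helper are illustrative assumptions; the point is that a library module obtains a named logger and leaves configuration to the application, instead of calling `logging.basicConfig` at import time.

```python
import logging

# Creating a named logger attaches no handlers and sets no levels, so
# importing this module leaves the application's logging setup untouched.
logger = logging.getLogger("multi_query_sketch")


def generate_queries(question):
    """Hypothetical helper that logs what it produced."""
    queries = [question, f"{question} (rephrased)"]
    logger.info("Generated queries: %s", queries)
    return queries


queries = generate_queries("what is a retriever?")
```

An application that wants to see these messages can opt in, for example with `logging.getLogger("multi_query_sketch").setLevel(logging.INFO)` plus a handler of its choosing.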
Davis Chase	f07dd02b50	Docs /redirects (#6790 ) Auto-generated a bunch of redirects from initial docs refactor commit	2023-06-28 17:07:53 -07:00
Harrison Chase	e5611565b7	bump version to 218 (#6857 )	2023-06-27 23:36:37 -07:00
Yaohui Wang	9d1bd18596	feat (documents): add LarkSuite document loader (#6420 ) ### Summary This PR adds a LarkSuite (FeiShu) document loader. > [LarkSuite](https://www.larksuite.com/) is an enterprise collaboration platform developed by ByteDance. ### Tests - an integration test case is added - an example notebook showing usage is added. [Notebook preview](https://github.com/yaohui-wyh/langchain/blob/master/docs/extras/modules/data_connection/document_loaders/integrations/larksuite.ipynb) ### Who can review? - PTAL @eyurtsev @hwchase17 --------- Co-authored-by: Yaohui Wang <wangyaohui.01@bytedance.com>	2023-06-27 23:08:05 -07:00
Jingsong Gao	a435a436c1	feat(document_loaders): add tencent cos directory and file loader (#6401 ) - add tencent cos directory and file support for document-loader #### Who can review? @eyurtsev	2023-06-27 23:07:20 -07:00
Ninely	d6cd0deaef	feat: Add streaming only final aiter of agent (#6274 ) #### Add streaming only final async iterator of agent This callback returns an async iterator and only streams the final output of an agent. #### Who can review? Tag maintainers/contributors who might be interested: @agola11	2023-06-27 23:06:25 -07:00
Shashank Deshpande	1db266b20d	Update link in apis.mdx (#6812 )	2023-06-27 23:00:26 -07:00
Lance Martin	3f9900a864	Create MultiQueryRetriever (#6833 ) Distance-based vector database retrieval embeds (represents) queries in high-dimensional space and finds similar embedded documents based on "distance". But retrieval may produce different results with subtle changes in query wording or if the embeddings do not capture the semantics of the data well. Prompt engineering / tuning is sometimes done to manually address these problems, but can be tedious. The `MultiQueryRetriever` automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query. For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the `MultiQueryRetriever` might be able to overcome some of the limitations of the distance-based retrieval and get a richer set of results. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-27 22:59:40 -07:00
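The "unique union across all queries" step described above can be sketched without any LangChain dependency. Everything here is an illustrative assumption: `multi_query_retrieve`, the lambda query generator (standing in for the LLM), and the dictionary-backed retriever are hypothetical, and documents are plain strings.

```python
def multi_query_retrieve(question, generate_queries, retrieve):
    """Retrieve for every generated query and merge results without duplicates."""
    seen = set()
    merged = []
    for query in generate_queries(question):
        for doc in retrieve(query):
            # Keep the first occurrence only, preserving retrieval order.
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Hypothetical stand-ins: a fixed rephrasing instead of an LLM, and a
# dictionary lookup instead of a vector store.
fake_index = {
    "what is mmr?": ["doc-a", "doc-b"],
    "explain maximal marginal relevance": ["doc-b", "doc-c"],
}
docs = multi_query_retrieve(
    "what is mmr?",
    lambda q: [q, "explain maximal marginal relevance"],
    lambda q: fake_index.get(q, []),
)
```

Here the second phrasing surfaces `doc-c`, which the original wording alone would have missed.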
Tim Asp	3ca1a387c2	Web Loader: Add proxy support (#6792 ) Proxies are helpful, especially when you start querying against more anti-bot websites. [Proxy services](https://developers.oxylabs.io/advanced-proxy-solutions/web-unblocker/making-requests) (of which there are many) and `requests` make it easy to rotate IPs to prevent banning by just passing along a simple dict to `requests`. CC @rlancemartin, @eyurtsev	2023-06-27 22:27:49 -07:00
Ayan Bandyopadhyay	f92ccf70fd	Update to the latest Psychic python library version (#6804 ) Update the Psychic document loader to use the latest `psychicapi` python library version: `0.8.0`	2023-06-27 22:26:38 -07:00
Hun-soo Jung	f3d178f600	Specify utilities package in SerpAPIWrapper docstring (#6821 ) - Description: Specify utilities package in SerpAPIWrapper docstring - Issue: Not an issue - Dependencies: (n/a) - Tag maintainer: @dev2049 - Twitter handle: (n/a)	2023-06-27 22:26:20 -07:00
Matt Robinson	dd2a151543	Docs/unstructured api key (#6781 ) ### Summary The Unstructured API will soon begin requiring API keys. This PR updates the Unstructured integrations docs with instructions on how to generate Unstructured API keys. ### Reviewers @rlancemartin @eyurtsev @hwchase17	2023-06-27 16:54:15 -07:00
Matthew Plachter	d6664af0ee	add async to zapier nla tools (#6791 ) - Description: Add Async functionality to Zapier NLA Tools - Issue: n/a - Dependencies: n/a - Tag maintainer: - Agents / Tools / Toolkits: @vowelparrot - Async: @agola11	2023-06-27 16:53:35 -07:00
Neil Neuwirth	efe0d39c6a	Adjusted OpenAI cost calculation (#6798 ) Added parentheses to ensure the division operation is performed before multiplication. This now calculates the cost by first dividing the number of tokens by 1000 (to get the number of 1k-token units) and then multiplying the result by the model's cost per 1k tokens @agola11	2023-06-27 16:53:06 -07:00
Ian	b4c196f785	fix pinecone delete bug (#6816 ) The implementation of delete in the Pinecone vector store omits the namespace, which causes the delete to fail	2023-06-27 16:50:17 -07:00
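The order of operations described in #6798 can be written out as a small function. This is a sketch only: `token_cost` and the price used below are hypothetical and are not taken from the LangChain source.

```python
def token_cost(num_tokens: int, cost_per_1k_tokens: float) -> float:
    """Cost of `num_tokens` at a price quoted per 1,000 tokens."""
    # Parentheses make the order explicit: convert to 1k-token units first.
    return cost_per_1k_tokens * (num_tokens / 1000)


# Hypothetical price of 0.002 per 1k tokens, 2,500 tokens used.
example = token_cost(2500, 0.002)
```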
Janos Tolgyesi	f1070de038	WebBaseLoader: optionally raise exception in the case of http error (#6823 ) - Description: this PR adds the possibility to raise an exception in the case the http request did not return a 2xx status code. This is particularly useful in the situation when the url points to a non-existent web page, the server returns a http status of 404 NOT FOUND, but WebBaseLoader anyway parses and returns the http body of the error message. - Dependencies: none, - Tag maintainer: @rlancemartin, @eyurtsev, - Twitter handle: jtolgyesi	2023-06-27 16:43:59 -07:00
rafael	ef72a7cf26	rail_parser: Allow creation from pydantic (#6832 ) Adds a way to create the guardrails output parser from a pydantic model.	2023-06-27 16:40:52 -07:00
Augustine Theodore	a980095efc	Enhancement : Ignore deleted messages and media in WhatsAppChatLoader (#6839 ) - Description: Ignore deleted messages and media - Issue: #6838 - Dependencies: No new dependencies - Tag maintainer: @rlancemartin, @eyurtsev	2023-06-27 16:36:55 -07:00
Robert Lewis	74848aafea	Zapier - Add better error messaging for 401 responses (#6840 ) Description: When a 401 response is given back by Zapier, hint to the end user why that may have occurred - If an API Key was initialized with the wrapper, ask them to check their API Key value - if an access token was initialized with the wrapper, ask them to check their access token or verify that it doesn't need to be refreshed. Tag maintainer: @dev2049	2023-06-27 16:35:42 -07:00
Matt Robinson	b24472eae3	feat: Add `UnstructuredOrgModeLoader` (#6842 ) ### Summary Adds `UnstructuredOrgModeLoader` for processing [Org-mode](https://en.wikipedia.org/wiki/Org-mode) documents. ### Testing ```python from langchain.document_loaders import UnstructuredOrgModeLoader loader = UnstructuredOrgModeLoader( file_path="example_data/README.org", mode="elements" ) docs = loader.load() print(docs[0]) ``` ### Reviewers - @rlancemartin - @eyurtsev - @hwchase17	2023-06-27 16:34:17 -07:00
Piyush Jain	e53995836a	Added missing attribute value object (#6849 ) ## Description Adds a missing type class for [AdditionalResultAttributeValue](https://docs.aws.amazon.com/kendra/latest/APIReference/API_AdditionalResultAttributeValue.html). Fixes validation failure for the query API that have `AdditionalAttributes` in the response. cc @dev2049 cc @zhichenggeng	2023-06-27 16:30:11 -07:00
Cristóbal Carnero Liñán	e494b0a09f	feat (documents): add a source code loader based on AST manipulation (#6486 ) #### Summary A new approach to loading source code is implemented: Each top-level function and class in the code is loaded into separate documents. Then, an additional document is created with the top-level code, but without the already loaded functions and classes. This could improve the accuracy of QA chains over source code. For instance, having this script: ``` class MyClass: def __init__(self, name): self.name = name def greet(self): print(f"Hello, {self.name}!") def main(): name = input("Enter your name: ") obj = MyClass(name) obj.greet() if __name__ == '__main__': main() ``` The loader will create three documents with this content: First document: ``` class MyClass: def __init__(self, name): self.name = name def greet(self): print(f"Hello, {self.name}!") ``` Second document: ``` def main(): name = input("Enter your name: ") obj = MyClass(name) obj.greet() ``` Third document: ``` # Code for: class MyClass: # Code for: def main(): if __name__ == '__main__': main() ``` A threshold parameter is added to control whether small scripts are split in this way or not. At this moment, only Python and JavaScript are supported. The appropriate parser is determined by examining the file extension. #### Tests This PR adds: - Unit tests - Integration tests #### Dependencies Only one dependency was added as optional (needed for the JavaScript parser). #### Documentation A notebook is added showing how the loader can be used. #### Who can review? @eyurtsev @hwchase17 --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-27 15:58:47 -07:00
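The splitting scheme described in #6486 can be approximated with the standard-library `ast` module. This is a minimal sketch under stated assumptions, not the loader added by the PR: `split_top_level` is a hypothetical helper, it handles Python only, and it ignores decorators and the threshold parameter.

```python
import ast
from typing import List

SOURCE = '''\
class MyClass:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")


def main():
    name = input("Enter your name: ")
    obj = MyClass(name)
    obj.greet()


if __name__ == '__main__':
    main()
'''


def split_top_level(source: str) -> List[str]:
    """Return one chunk per top-level def/class, plus a simplified remainder."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks: List[str] = []
    remainder = list(lines)
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            start, end = node.lineno - 1, node.end_lineno  # end_lineno needs Python 3.8+
            chunks.append("\n".join(lines[start:end]))
            # Replace the definition with a one-line marker in the remainder.
            remainder[start] = f"# Code for: {lines[start]}"
            for i in range(start + 1, end):
                remainder[i] = None
    chunks.append("\n".join(line for line in remainder if line is not None))
    return chunks


documents = split_top_level(SOURCE)
```

For the example script this yields three chunks: the class, the `main` function, and the remaining top-level code with `# Code for:` markers in place of the extracted definitions.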
Robert Lewis	da462d9dd4	Zapier update oauth support (#6780 ) Description: Update documentation to 1) point to updated documentation links at Zapier.com (we've revamped our help docs and paths), and 2) To provide clarity how to use the wrapper with an access token for OAuth support Demo: Initializing the Zapier Wrapper with an OAuth Access Token `ZapierNLAWrapper(zapier_nla_oauth_access_token="<redacted>")` Using LangChain to resolve the current weather in Vancouver BC leveraging Zapier NLA to lookup weather by coords. ``` > Entering new chain... I need to use a tool to get the current weather. Action: The Weather: Get Current Weather Action Input: Get the current weather for Vancouver BC Observation: {"coord__lon": -123.1207, "coord__lat": 49.2827, "weather": [{"id": 802, "main": "Clouds", "description": "scattered clouds", "icon": "03d", "icon_url": "http://openweathermap.org/img/wn/03d@2x.png"}], "weather[]icon_url": ["http://openweathermap.org/img/wn/03d@2x.png"], "weather[]icon": ["03d"], "weather[]id": [802], "weather[]description": ["scattered clouds"], "weather[]main": ["Clouds"], "base": "stations", "main__temp": 71.69, "main__feels_like": 71.56, "main__temp_min": 67.64, "main__temp_max": 76.39, "main__pressure": 1015, "main__humidity": 64, "visibility": 10000, "wind__speed": 3, "wind__deg": 155, "wind__gust": 11.01, "clouds__all": 41, "dt": 1687806607, "sys__type": 2, "sys__id": 2011597, "sys__country": "CA", "sys__sunrise": 1687781297, "sys__sunset": 1687839730, "timezone": -25200, "id": 6173331, "name": "Vancouver", "cod": 200, "summary": "scattered clouds", "_zap_search_was_found_status": true} Thought: I now know the current weather in Vancouver BC. Final Answer: The current weather in Vancouver BC is scattered clouds with a temperature of 71.69 and wind speed of 3 ```	2023-06-27 11:46:32 -07:00
Joshua Carroll	24e4ae95ba	Initial Streamlit callback integration doc (md) (#6788 ) Description: Add a documentation page for the Streamlit Callback Handler integration (#6315) Notes: - Implemented as a markdown file instead of a notebook since example code runs in a Streamlit app (happy to discuss / consider alternatives now or later) - Contains an embedded Streamlit app -> https://mrkl-minimal.streamlit.app/ Currently this app is hosted out of a Streamlit repo but we're working to migrate the code to a LangChain owned repo ![streamlit_docs](https://github.com/hwchase17/langchain/assets/116604821/0b7a6239-361f-470c-8539-f22c40098d1a) cc @dev2049 @tconkling	2023-06-27 11:43:49 -07:00
Harrison Chase	8392ca602c	bump version to 217 (#6831 )	2023-06-27 09:39:56 -07:00
Ismail Pelaseyed	fcb3a64799	Add support for passing headers and search params to openai openapi chain (#6782 ) - Description: add support for passing headers and search params to OpenAI OpenAPI chains. - Issue: n/a - Dependencies: n/a - Tag maintainer: @hwchase17 - Twitter handle: @pelaseyed --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-27 09:09:03 -07:00
Zander Chase	e1fdb67440	Update description in Evals notebook (#6808 )	2023-06-27 00:26:49 -07:00
Zander Chase	ad028bbb80	Permit Constitutional Principles (#6807 ) In the criteria evaluator.	2023-06-27 00:23:54 -07:00
Zander Chase	6ca383ecf6	Update to RunOnDataset helper functions to accept evaluator callbacks (#6629 ) Also improve docstrings and update the tracing datasets notebook to focus on "debug, evaluate, monitor"	2023-06-26 23:58:13 -07:00
WaseemH	7ac9b22886	`RecusiveUrlLoader` to `RecursiveUrlLoader` (#6787 )	2023-06-26 23:12:14 -07:00
Mshoven	4535b0b41e	🎯Bug: format the url and path_params (#6755 ) - Description: format the url and path_params correctly, - Issue: #6753, - Dependencies: None, - Tag maintainer: @vowelparrot, - Twitter handle: @0xbluesecurity	2023-06-26 23:03:57 -07:00
Zander Chase	07d802d088	Don't raise error if parent not found (#6538 ) Done so that you can pass in a run from the low level api	2023-06-26 22:57:52 -07:00
Leonid Ganeline	49c864fa18	docs: vectorstore upgrades 2 (#6796 ) updated vectorstores/ notebooks; added new integrations into ecosystem/integrations/ @dev2049 @rlancemartin, @eyurtsev	2023-06-26 22:55:04 -07:00
Zander Chase	d7dbf4aefe	Clean up agent trajectory interface (#6799 ) - Enable reference - Enable not specifying tools at the start - Add methods with keywords	2023-06-26 22:54:04 -07:00
Zander Chase	cc60fed3be	Add a Pairwise Comparison Chain (#6703 ) Notebook shows preference scoring between two chains and reports wilson score interval + p value I think I'll add the option to insert ground truth labels but doesn't have to be in this PR	2023-06-26 20:47:41 -07:00
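For reference, the Wilson score interval mentioned in #6703 can be computed as below. This is a generic sketch of the standard formula, not the notebook's code; `wilson_interval` and the sample counts are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(wins: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95% confidence)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = wins / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


# Hypothetical outcome: chain A preferred in 14 of 20 pairwise comparisons.
low, high = wilson_interval(wins=14, trials=20)
```

If the interval excludes 0.5, the observed preference is unlikely to be a coin flip at the chosen confidence level.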
Hakan Tekgul	2928b080f6	Update arize_callback.py - bug fix (#6784 ) - Description: Bug Fix - Added a step variable to keep track of prompts - Issue: Bug from internal Arize testing - The prompts and responses that are ingested were not mapped correctly - Dependencies: N/A	2023-06-26 16:49:46 -07:00
Zander Chase	c460b04c64	Update String Evaluator (#6615 ) - Add protocol for `evaluate_strings` - Move the criteria evaluator out so it's not restricted to being applied on traced runs	2023-06-26 14:16:14 -07:00
AaaCabbage	b3f8324de9	feat: fix the Chinese characters in the solution content will be conv… (#6734 ) Fix an issue where Chinese characters in the solution content were converted to ASCII escape sequences, resulting in an abnormally large number of tokens Co-authored-by: qixin <qixin@fintec.ai>	2023-06-26 13:14:48 -07:00
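The effect described in #6734 is easy to reproduce with the standard-library `json` module. The commit message does not say which serializer was involved, so treating `json.dumps` as the cause is an assumption made for illustration.

```python
import json

payload = {"solution": "你好，世界"}

# Default behaviour: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(payload)

# ensure_ascii=False keeps the original characters, so the string stays short.
readable = json.dumps(payload, ensure_ascii=False)
```

The escaped form is several times longer than the readable one, which inflates the token count when the string is sent to a model.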
Chris Pappalardo	70f7c2bb2e	align chroma vectorstore get with chromadb to enable where filtering (#6686 ) allows for where filtering on collection via get - Description: aligns langchain chroma vectorstore get with underlying [chromadb collection get](https://github.com/chroma-core/chroma/blob/main/chromadb/api/models/Collection.py#L103) allowing for where filtering, etc. - Issue: NA - Dependencies: none - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @pappanaka	2023-06-26 10:51:20 -07:00
Zander Chase	9ca3b4645e	Add support for tags in chain group context manager (#6668 ) Lets you specify local and inheritable tags in the group manager. Also, add more verbose docstrings for our reference docs.	2023-06-26 10:37:33 -07:00
Harrison Chase	d1bcc58beb	bump version to 216 (#6770 )	2023-06-26 09:46:19 -07:00
Zander Chase	6d30acffcb	Fix breaking tags (#6765 ) Fix tags change that broke old way of initializing agent Closes #6756	2023-06-26 09:28:11 -07:00
James Croft	ba622764cb	Improve performance when retrieving Notion DB pages (#6710 )	2023-06-26 05:46:09 -07:00
Richy Wang	ec8247ec59	Fixed bug in AnalyticDB Vector Store caused by upgrade SQLAlchemy version (#6736 )	2023-06-26 05:35:25 -07:00
Santiago Delgado	d84a3bcf7a	Office365 Tool (#6306 ) #### Background With the development of [structured tools](https://blog.langchain.dev/structured-tools/), the LangChain team expanded the platform's functionality to meet the needs of new applications. The GMail tool, empowered by structured tools, now supports multiple arguments and powerful search capabilities, demonstrating LangChain's ability to interact with dynamic data sources like email servers. #### Challenge The current GMail tool only supports GMail, while users often utilize other email services like Outlook in Office365. Additionally, the proposed calendar tool in PR https://github.com/hwchase17/langchain/pull/652 only works with Google Calendar, not Outlook. #### Changes This PR implements an Office365 integration for LangChain, enabling seamless email and calendar functionality with a single authentication process. #### Future Work With the core Office365 integration complete, future work could include integrating other Office365 tools such as Tasks and Address Book. #### Who can review? @hwchase17 or @vowelparrot can review this PR #### Appendix @janscas, I utilized your [O365](https://github.com/O365/python-o365) library extensively. Given the rising popularity of LangChain and similar AI frameworks, the convergence of libraries like O365 and tools like this one is likely. So, I wanted to keep you updated on our progress. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-26 02:59:09 -07:00
Xiaochao Dong	a15afc102c	Relax the action input check for actions that require no input (#6357 ) When the tool requires no input, the LLM often gives something like this: ```json { "action": "just_do_it" } ``` I have attempted to enhance the prompt, but it doesn't appear to be functioning effectively. Therefore, I believe we should consider easing the check a little bit. Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>	2023-06-26 02:30:17 -07:00
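A relaxed check along the lines of #6357 might look like the sketch below. The `action_input` key name and the `parse_action` helper are assumptions for illustration and may not match the parser changed in the PR.

```python
import json
from typing import Any, Dict, Tuple


def parse_action(llm_output: str) -> Tuple[str, Any]:
    """Parse an agent action blob, tolerating a missing `action_input`."""
    blob: Dict[str, Any] = json.loads(llm_output)
    # `action` stays mandatory; `action_input` (assumed key name) falls back to {}.
    return blob["action"], blob.get("action_input", {})


action, action_input = parse_action('{"action": "just_do_it"}')
```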
Ethan Bowen	cc33bde74f	Confluence added (#6432 ) Adding Confluence to Jira tool. Can create a page in Confluence with this PR. If accepted, will extend functionality to Bitbucket and additional Confluence features. --------- Co-authored-by: Ethan Bowen <ethan.bowen@slalom.com>	2023-06-26 02:28:04 -07:00
Surya Nudurupati	2aeb8e7dbc	Improved Documentation: Eliminating Redundancy in the Introduction.mdx (#6360 ) When the documentation was originally written there was a redundant typing of the word "using the"	2023-06-26 02:27:36 -07:00
rajib	0f6ef048d2	The openai_info.py does not have gpt-35-turbo which is the underlying Azure Open AI model name (#6321 ) Since this model name is not there in the list MODEL_COST_PER_1K_TOKENS, when we use get_openai_callback(), for gpt 3.5 model in Azure AI, we do not get the cost of the tokens. This will fix this issue #### Who can review? @hwchase17 @agola11 Co-authored-by: rajib76 <rajib76@yahoo.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-26 02:16:39 -07:00
ArchimedesFTW	fe941cb54a	Change tags(str) to tags(dict) in mlflow_callback.py docs (#6473 ) Fixes #6472 #### Who can review? @agola11	2023-06-26 02:12:23 -07:00
0xcrusher	9187d2f3a9	Fixed caching bug for Multiple Caching types by correctly checking types (#6746 ) - Fixed an issue where some caching types check the wrong types, hence not allowing caching to work Maintainer responsibilities: - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev	2023-06-26 01:14:32 -07:00
Harrison Chase	e9877ea8b1	Tiktoken override (#6697 )	2023-06-26 00:49:32 -07:00
Gabriel Altay	f9771700e4	prevent DuckDuckGoSearchAPIWrapper from consuming top result (#6727 ) remove the `next` call that checks for None on the results generator	2023-06-25 19:54:15 -07:00
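The pitfall fixed in #6727 is generic to Python generators: peeking with `next` removes the item it returns. The sketch below uses a hypothetical `search_results` generator in place of the real search client.

```python
from typing import Iterator, List


def search_results() -> Iterator[str]:
    """Stand-in for a lazy search-results generator."""
    yield from ["top result", "second result", "third result"]


# Buggy pattern: peeking with next() removes the first item from the stream.
results = search_results()
after_peek: List[str] = []
if next(results, None) is not None:
    after_peek = list(results)

# Safer pattern: materialize first, then check for emptiness.
all_results: List[str] = list(search_results())
```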
Pau Ramon Revilla	87802c86d9	Added a MHTML document loader (#6311 ) MHTML is a very interesting format since it's used both for emails and for archived webpages. Some scraping projects want to store pages on disk to process them later, and MHTML is perfect for that use case. This is heavily inspired by the beautifulsoup html loader, but extracts the html part from the mhtml file. --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-25 13:12:08 -07:00
Janos Tolgyesi	05eec99269	beautifulsoup get_text kwargs in WebBaseLoader (#6591 ) # beautifulsoup get_text kwargs in WebBaseLoader - Description: this PR introduces an optional `bs_get_text_kwargs` parameter to `WebBaseLoader` constructor. It can be used to pass kwargs to the downstream BeautifulSoup.get_text call. The most common usage might be to pass a custom text separator, as seen also in `BSHTMLLoader`. - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: jtolgyesi	2023-06-25 12:42:27 -07:00
Matt Robinson	be68f6f8ce	feat: Add `UnstructuredRSTLoader` (#6594 ) ### Summary Adds an `UnstructuredRSTLoader` for loading [reStructuredText](https://en.wikipedia.org/wiki/ReStructuredText) file. ### Testing ```python from langchain.document_loaders import UnstructuredRSTLoader loader = UnstructuredRSTLoader( file_path="example_data/README.rst", mode="elements" ) docs = loader.load() print(docs[0]) ``` ### Reviewers - @hwchase17 - @rlancemartin - @eyurtsev	2023-06-25 12:41:57 -07:00
Chip Davis	b32cc01c9f	feat: added tqdm progress bar to UnstructuredURLLoader (#6600 ) - Description: Adds a simple progress bar with tqdm when using UnstructuredURLLoader. Exposes new parameter `show_progress_bar`. Very simple PR. - Issue: N/A - Dependencies: N/A - Tag maintainer: @rlancemartin @eyurtsev --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-25 12:41:25 -07:00
Augustine Theodore	afc292e58d	Fix WhatsAppChatLoader : Enable parsing additional formats (#6663 ) - Description: Updated regex to support a new format that was observed when whatsapp chat was exported. - Issue: #6654 - Dependencies: No new dependencies - Tag maintainer: @rlancemartin, @eyurtsev	2023-06-25 12:08:43 -07:00
Sumanth Donthula	3e30a5d967	updated sql_database.py for returning sorted table names. (#6692 ) Added code to get the tables info in sorted order in methods get_usable_table_names and get_table_info. Linked to Issue: #6640	2023-06-25 12:04:24 -07:00
刘方瑞	9d1b3bab76	Fix Typo in LangChain MyScale Integration Doc (#6705 ) - Description: Fix Typo in LangChain MyScale Integration Doc @hwchase17	2023-06-25 11:54:00 -07:00
sudolong	408c8d0178	fix chroma _similarity_search_with_relevance_scores missing `kwargs` … (#6708 ) Issue: https://github.com/hwchase17/langchain/issues/6707	2023-06-25 11:53:42 -07:00
Zander Chase	d89e10d361	Fix Multi Functions Agent Tracing (#6702 ) Confirmed it works now: https://dev.langchain.plus/public/0dc32ce0-55af-432e-b09e-5a1a220842f5/r	2023-06-25 10:39:04 -07:00
Harrison Chase	1742db0c30	bump version to 215 (#6719 )	2023-06-25 08:52:51 -07:00
Ankush Gola	e1b801be36	split up batch llm calls into separate runs (#5804 )	2023-06-24 21:03:31 -07:00
Davis Chase	1da99ce013	bump v214 (#6694 )	2023-06-24 14:23:11 -07:00
Lance Martin	dd36adc0f4	Make bs4 a local import in recursive_url_loader.py (#6693 ) Resolve https://github.com/hwchase17/langchain/issues/6679	2023-06-24 13:54:10 -07:00
Harrison Chase	ef4c7b54ef	bump to version 213 (#6688 )	2023-06-24 11:56:37 -07:00
UmerHA	068142fce2	Add caching to BaseChatModel (issue #1644 ) (#5089 ) # Add caching to BaseChatModel Fixes #1644 (Sidenote: While testing, I noticed we have multiple implementations of Fake LLMs, used for testing. I consolidated them.) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: Models - @hwchase17 - @agola11 Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) | Discord: RicChilligerDude#7589 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-24 11:45:09 -07:00
Harrison Chase	c289cc891a	Harrison/optional ids opensearch (#6684 ) Co-authored-by: taekimsmar <66041442+taekimsmar@users.noreply.github.com>	2023-06-24 09:19:57 -07:00
Hrag Balian	2518e6c95b	Session deletion method in motorhead memory (#6609 ) Motorhead Memory module didn't support deletion of a session. Added a method to enable deletion. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-23 21:27:42 -07:00
Baichuan Sun	9fbe346860	Amazon API Gateway hosted LLM (#6673 ) This PR adds a new LLM class for the Amazon API Gateway hosted LLM. The PR also includes example notebooks for using the LLM class in an Agent chain. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-23 21:27:25 -07:00
Davis Chase	fa1bb873e2	Fix openapi parameter parsing (#6676 ) Ensure parameters are json serializable, related to #6671	2023-06-23 21:19:12 -07:00
Akash	b7e1c54947	Just corrected a small inconsistency on a doc page (#6603 ) ### Just corrected a small inconsistency on a doc page (not exactly a typo, per se) - Description: There was inconsistency due to the use of single quotes at one place on the [Sequential Chains](https://python.langchain.com/docs/modules/chains/foundational/sequential_chains) page of the docs, - Issue: NA, - Dependencies: NA, - Tag maintainer: @dev2049, - Twitter handle: kambleakash0	2023-06-23 16:09:29 -07:00
Davis Chase	2da1aab50b	Wiki loader lint (#6670 )	2023-06-23 16:05:42 -07:00
Leonid Ganeline	1c81883d42	added docstrings where they missed (#6626 ) This PR targets the `API Reference` documentation. - Several classes and functions were missing `docstrings`. These docstrings were added. - In several places this ``` except ImportError: raise ValueError( ``` was replaced with ``` except ImportError: raise ImportError( ```	2023-06-23 15:49:44 -07:00
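The exception change in #6626 follows a common pattern for optional dependencies. The helper below is a hypothetical sketch of that pattern (`import_optional` is not a LangChain function): re-raising `ImportError` rather than `ValueError` lets callers catch the failure precisely.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, pip_name: str) -> ModuleType:
    """Import an optional dependency, raising ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Keep the exception type as ImportError and chain the original cause.
        raise ImportError(
            f"Could not import {module_name}. Install it with `pip install {pip_name}`."
        ) from exc


json_module = import_optional("json", "json")
```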
Shashank	3364e5818b	Changed generate_prompt.py (#6644 ) Modified regex for Fix: ValueError: Could not parse output	2023-06-23 15:48:33 -07:00
Davis Chase	f1e1ac2a01	chroma nb close img tag (#6669 )	2023-06-23 15:41:54 -07:00
eLafo	db8b13df4c	adds doc_content_chars_max argument to WikipediaLoader (#6645 ) # Description It adds a new initialization param in `WikipediaLoader` so we can override the `doc_content_chars_max` param used in `WikipediaAPIWrapper` under the hood, e.g: ```python from langchain.document_loaders import WikipediaLoader # doc_content_chars_max is the new init param loader = WikipediaLoader(query="python", doc_content_chars_max=90000) ``` ## Decisions `doc_content_chars_max` default value will be 4000, because it's the current value I have added pycode comments # Issue #6639 # Dependencies None # Twitter handle [@elafo](https://twitter.com/elafo)	2023-06-23 15:22:09 -07:00
Davis Chase	5e5b30b74f	openapi -> openai nit (#6667 )	2023-06-23 15:09:02 -07:00
Jeff Huber	2acf109c4b	update chroma notebook (#6664 ) @rlancemartin I updated the notebook for Chroma to hopefully be a lot easier for users.	2023-06-23 15:03:06 -07:00
Eduard van Valkenburg	48381f1f78	PowerBI: catch outdated token (#6634 ) This adds just a small tweak to catch the error that says the token is expired rather than retrying.	2023-06-23 15:01:08 -07:00
Piyush Jain	b1de927f1b	Kendra retriever api (#6616 ) ## Description Replaces [Kendra Retriever](https://github.com/hwchase17/langchain/blob/master/langchain/retrievers/aws_kendra_index_retriever.py) with an updated version that uses the new [retriever API](https://docs.aws.amazon.com/kendra/latest/dg/searching-retrieve.html) which is better suited for retrieval augmented generation (RAG) systems. Note: This change requires the latest version (1.26.159) of boto3 to work. `pip install -U boto3` to upgrade the boto3 version. cc @hupe1980 cc @dev2049	2023-06-23 14:59:35 -07:00
ChrisLovejoy	4e5d78579b	fix minor typo in vector_db_qa.mdx (#6604 ) - Description: minor typo fixed - doesn't instead of does. No other changes.	2023-06-23 14:57:37 -07:00
Ikko Eltociear Ashimine	73da193a4b	Fix typo in myscale_self_query.ipynb (#6601 )	2023-06-23 14:57:12 -07:00
Saarthak Maini	ba256b23f2	Fix Typo (#6595 ) Resolves #6582	2023-06-23 14:56:54 -07:00
kourosh hakhamaneshi	f6fdabd20b	Fix ray-project/Aviary integration (#6607 ) - Description: The Aviary integration's URL has changed. This PR fixes the integration for that change and also makes providing the input URL to the API optional (since it can be set via environment variables). - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2023-06-23 14:49:53 -07:00
northern-64bit	dbe1d029ec	Fix grammar mistake in base.py in planners (#6611 ) Fix a typo in `langchain/experimental/plan_and_execute/planners/base.py`, by changing "Given input, decided what to do." to "Given input, decide what to do." This is in the docstring for functions running LLM chains which shall create a plan, "decided" does not make any sense in this context.	2023-06-23 14:47:10 -07:00
Aaron Pham	082976d8d0	fix(docs): broken link for OpenLLM (#6622 ) This link for the notebook of OpenLLM is not migrated to the new format Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-23 13:59:17 -07:00
Davis Chase	fe828185ed	Dev2049/bump 212 (#6665 )	2023-06-23 13:48:02 -07:00
Hassan Ouda	9e52134d30	ChatVertexAI broken - Fix error with sending context in params (#6652 ) Vertex AI chat is currently broken because context is passed in params and chat.send_message doesn't accept it as a parameter. - Closes issue [ChatVertexAI Error: _ChatSessionBase.send_message() got an unexpected keyword argument 'context' #6610](https://github.com/hwchase17/langchain/issues/6610)	2023-06-23 13:38:21 -07:00
Lance Martin	c2b25c17c5	Recursive URL loader (#6455 ) We may want to load all URLs under a root directory. For example, let's look at the [LangChain JS documentation](https://js.langchain.com/docs/). This has many interesting child pages that we may want to read in bulk. Of course, the `WebBaseLoader` can load a list of pages. But, the challenge is traversing the tree of child pages and actually assembling that list! We do this using the `RecusiveUrlLoader`. This also gives us the flexibility to exclude some children (e.g., the `api` directory with > 800 child pages).	2023-06-23 13:09:00 -07:00
Lance Martin	be02572d58	Add delete and ensure add_texts performs upsert (w/ ID optional) (#6126 ) ## Goal We want to ensure consistency across vectordbs: 1/ add `delete` by ID method to the base vectorstore class 2/ ensure `add_texts` performs `upsert` with ID optionally passed ## Testing - [x] Pinecone: notebook test w/ `langchain_test` vectorstore. - [x] Chroma: Review by @jeffchuber, notebook test w/ in memory vectorstore. - [x] Supabase: Review by @copple, notebook test w/ `langchain_test` table. - [x] Weaviate: Notebook test w/ `langchain_test` index. - [x] Elastic: Reviewed by @vestal. Notebook test w/ `langchain_test` table. - [ ] Redis: Asked for review from owner of recent `delete` method https://github.com/hwchase17/langchain/pull/6222	2023-06-23 13:03:10 -07:00
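The two goals listed in #6126 (delete by ID, and `add_texts` acting as an upsert when IDs are passed) can be illustrated with a toy in-memory store. `InMemoryStore` is hypothetical and stores raw text only; it is not the base vectorstore class and does no embedding.

```python
import uuid
from typing import Dict, Iterable, List, Optional


class InMemoryStore:
    """Toy store illustrating upsert-on-add and delete-by-ID semantics."""

    def __init__(self) -> None:
        self._texts: Dict[str, str] = {}

    def add_texts(self, texts: Iterable[str], ids: Optional[List[str]] = None) -> List[str]:
        texts = list(texts)
        if ids is None:
            ids = [str(uuid.uuid4()) for _ in texts]  # generate IDs when none are passed
        for doc_id, text in zip(ids, texts):
            self._texts[doc_id] = text  # same ID overwrites: an upsert
        return ids

    def delete(self, ids: Iterable[str]) -> None:
        for doc_id in ids:
            self._texts.pop(doc_id, None)  # ignore IDs that are not present


store = InMemoryStore()
store.add_texts(["first draft"], ids=["doc-1"])
store.add_texts(["second draft"], ids=["doc-1"])  # replaces the text under doc-1
auto_ids = store.add_texts(["no id given"])
store.delete(["doc-1"])
```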
Lance Martin	393f469eb3	Create merge loader that combines documents from a set of loaders (#6659 ) Simple utility loader that combines documents from a set of specified loaders.	2023-06-23 13:02:48 -07:00
Davis Chase	6988039975	openapi_openai docstring (#6661 )	2023-06-23 11:38:33 -07:00
Davis Chase	b25933b607	bump 211 (#6660 )	2023-06-23 11:10:48 -07:00
Davis Chase	e013459b18	Openapi to openai (#6658 )	2023-06-23 11:00:34 -07:00
Davis Chase	b062a3f938	bump 210 (#6656 )	2023-06-23 09:37:58 -07:00
Alejandra De Luna	980c865174	fix: remove callbacks arg from Tool and StructuredTool inferred schema (#6483 ) Fixes #5456 This PR removes the `callbacks` argument from a tool's schema when creating a `Tool` or `StructuredTool` with the `from_function` method and `infer_schema` is set to `True`. The `callbacks` argument is now removed in the `create_schema_from_function` and `_get_filtered_args` methods. As suggested by @vowelparrot, this fix provides a straightforward solution that minimally affects the existing implementation. A test was added to verify that this change enables the expected use of `Tool` and `StructuredTool` when using a `CallbackManager` and inferring the tool's schema. - @hwchase17	2023-06-23 01:48:27 -07:00
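The idea behind #6483 (leave `callbacks` out of the schema inferred from a function signature) can be sketched with `inspect`. `inferred_args` and `search` are hypothetical names; the real fix lives in `create_schema_from_function` and `_get_filtered_args`, whose internals are not reproduced here.

```python
import inspect
from typing import Callable, Dict, Tuple


def inferred_args(
    func: Callable, excluded: Tuple[str, ...] = ("callbacks",)
) -> Dict[str, inspect.Parameter]:
    """Collect a function's parameters, skipping names that are not tool inputs."""
    signature = inspect.signature(func)
    return {
        name: param
        for name, param in signature.parameters.items()
        if name not in excluded
    }


def search(query: str, limit: int = 5, callbacks=None) -> str:
    """Example tool function that accepts a callbacks argument."""
    return f"{query}:{limit}"


args = inferred_args(search)
```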
Zander Chase	b4fe7f3a09	Session to project (#6249 ) Sessions are being renamed to projects in the tracer	2023-06-23 01:11:01 -07:00
Zander Chase	9c09861946	Add tags in agent initialization (#6559 ) Add better docstrings for agent executor as well Inspo: https://github.com/hwchase17/langchainjs/pull/1722 ![image](https://github.com/hwchase17/langchain/assets/130414180/d11662bc-0c0e-4166-9ff3-354d41a9144a)	2023-06-22 22:35:00 -07:00
Lance Martin	6e69bfbb28	Loader for OpenCityData and minor cleanups to Pandas, Airtable loaders (#6301 ) Many cities have open data portals for events like crime, traffic, etc. Socrata provides an API for many, including SF (e.g., see [here](https://dev.socrata.com/foundry/data.sfgov.org/tmnf-yvry)). This is a new data loader for city data that uses Socrata API.	2023-06-22 22:20:42 -07:00
Christoph Kahl	9d42621fa4	added redis method to delete entries by keys (#6222 ) In addition to my last pr (return keys of added entries), we also need a method to delete the entries by keys. @dev2049	2023-06-22 13:26:47 -07:00
Tim Conkling	c28990d871	StreamlitCallbackHandler (#6315 ) A new implementation of `StreamlitCallbackHandler`. It formats Agent thoughts into Streamlit expanders. You can see the handler in action here: https://langchain-mrkl.streamlit.app/ Per a discussion with Harrison, we'll be adding a `StreamlitCallbackHandler` implementation to an upcoming [Streamlit](https://github.com/streamlit/streamlit) release as well, and will be updating it as we add new LLM- and LangChain-specific features to Streamlit. The idea with this PR is that the LangChain `StreamlitCallbackHandler` will "auto-update" in a way that keeps it forward- (and backward-) compatible with Streamlit. If the user has an older Streamlit version installed, the LangChain `StreamlitCallbackHandler` will be used; if they have a newer Streamlit version that has an updated `StreamlitCallbackHandler`, that implementation will be used instead. (I'm opening this as a draft to get the conversation going and make sure we're on the same page. We're really excited to land this into LangChain!) #### Who can review? @agola11, @hwchase17	2023-06-22 13:14:28 -07:00
Nuno Campos	74ac6fb6b9	Allow callback handlers to opt into being run inline (#6424 ) This is useful eg for callback handlers that use context vars (like open telemetry) See https://github.com/hwchase17/langchain/pull/6095	2023-06-22 11:36:19 -07:00
Harrison Chase	a9108c1809	add mongo (HOLD) (#6437 ) do not merge in	2023-06-22 11:08:12 -07:00
Lance Martin	30f7288082	MD header text splitter returns Documents (#6571 ) Return `Documents` from MD header text splitter to simplify UX. Updates the test as well as example notebooks.	2023-06-22 09:25:38 -07:00
Rogério Chaves	3436da65a4	Fix callback forwarding in async plan method for OpenAI function agent (#6584 ) The callback argument was missing, which prevented callbacks from working properly when the agent was used asynchronously.	2023-06-22 08:18:31 -07:00
Davis Chase	b909bc8b58	bump 209 (#6593 )	2023-06-22 08:18:19 -07:00
minhajul-clarifai	6e57306a13	Clarifai integration (#5954 ) # Changes This PR adds [Clarifai](https://www.clarifai.com/) integration to Langchain. Clarifai is an end-to-end AI Platform. Clarifai offers users the ability to use many types of LLM (OpenAI, Cohere, etc., and other open source models). A Clarifai app can also be treated as a vector database to upload and retrieve data. The integration includes: - Clarifai LLM integration: Clarifai supports many types of language model that users can utilize for their application - Clarifai VectorDB: A Clarifai application can hold data and embeddings. You can run semantic search with the embeddings #### Before submitting - [x] Added integration test for LLM - [x] Added integration test for VectorDB - [x] Added notebook for LLM - [x] Added notebook for VectorDB Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-22 08:00:15 -07:00
Jeroen Van Goey	7f6f5c2a6a	Add missing word in comment (#6587 ) Changed ``` # Do this so we can exactly what's going on under the hood ``` to ``` # Do this so we can see exactly what's going on under the hood ```	2023-06-22 07:54:28 -07:00
Davis Chase	d50de2728f	Add AzureML endpoint LLM wrapper (#6580 ) ### Description We have added a new LLM integration `azureml_endpoint` that allows users to leverage models from the AzureML platform. Microsoft recently announced the release of [Azure Foundation Models](https://learn.microsoft.com/en-us/azure/machine-learning/concept-foundation-models?view=azureml-api-2) which users can find in the AzureML Model Catalog. The Model Catalog contains a variety of open source and Hugging Face models that users can deploy on AzureML. The `azureml_endpoint` allows LangChain users to use the deployed Azure Foundation Models. ### Dependencies No added dependencies were required for the change. ### Tests Integration tests were added in `tests/integration_tests/llms/test_azureml_endpoint.py`. ### Notebook A Jupyter notebook demonstrating how to use `azureml_endpoint` was added to `docs/modules/llms/integrations/azureml_endpoint_example.ipynb`. ### Twitters [Prakhar Gupta](https://twitter.com/prakhar_in) [Matthew DeGuzman](https://twitter.com/matthew_d13) --------- Co-authored-by: Matthew DeGuzman <91019033+matthewdeguzman@users.noreply.github.com> Co-authored-by: prakharg-msft <75808410+prakharg-msft@users.noreply.github.com>	2023-06-22 01:46:01 -07:00
Davis Chase	4fabd02d25	Add OpenLLM wrapper(#6578 ) LLM wrapper for models served with OpenLLM --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> Authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> Co-authored-by: Chaoyu <paranoyang@gmail.com>	2023-06-22 01:18:14 -07:00
Brendan Graham	d718f3b6d0	feat: interfaces for async embeddings, implement async openai (#6563 ) Since it seems like #6111 will be blocked for a bit, I've forked @tyree731's fork and implemented the requested changes. This change adds two methods to the base Embeddings class, aembed_query and aembed_documents, which are the async equivalents of embed_query and embed_documents respectively. This slightly rounds out async support within langchain, with an initial implementation of this functionality for openai. Implements https://github.com/hwchase17/langchain/issues/6109 --------- Co-authored-by: Stephen Tyree <tyree731@gmail.com>	2023-06-21 23:16:33 -07:00
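A rough sketch of the interface described in #6563, under stated assumptions: the class below is a toy stand-in for the base `Embeddings` class, the default async methods raising `NotImplementedError` is an assumption rather than the commit's confirmed behavior, and `LengthEmbeddings` is a hypothetical implementation used only to make the example runnable.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import List


class Embeddings(ABC):
    """Toy base class; not LangChain's actual definition."""

    @abstractmethod
    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Embed a batch of documents synchronously."""

    @abstractmethod
    def embed_query(self, text: str) -> List[float]:
        """Embed a single query synchronously."""

    async def aembed_documents(self, texts: List[str]) -> List[List[float]]:
        # Assumed default: subclasses opt in to async support.
        raise NotImplementedError

    async def aembed_query(self, text: str) -> List[float]:
        raise NotImplementedError


class LengthEmbeddings(Embeddings):
    """Hypothetical embedding that maps a text to its length."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self.embed_query(t) for t in texts]

    def embed_query(self, text: str) -> List[float]:
        return [float(len(text))]

    async def aembed_documents(self, texts: List[str]) -> List[List[float]]:
        # Run the per-text coroutines concurrently; gather keeps input order.
        return list(await asyncio.gather(*(self.aembed_query(t) for t in texts)))

    async def aembed_query(self, text: str) -> List[float]:
        await asyncio.sleep(0)  # stand-in for a non-blocking network call
        return self.embed_query(text)


vectors = asyncio.run(LengthEmbeddings().aembed_documents(["a", "bcd"]))
```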
ljeagle	ca24dc2d5f	Upgrade the version of AwaDB and add some new interfaces (#6565 ) 1. upgrade the version of AwaDB 2. add some new interfaces 3. fix bug of packing page content error @dev2049 please review, thanks! --------- Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-06-21 23:15:18 -07:00
Harrison Chase	937a7e93f2	add motherduck docs (#6572 )	2023-06-21 23:13:45 -07:00
Muhammad Vaid	ae81b96b60	Detailed using the Twilio tool to send messages with 3rd party apps incl. WhatsApp (#6562 ) Everything needed to support sending messages over WhatsApp Business Platform (GA), Facebook Messenger (Public Beta) and Google Business Messages (Private Beta) was present. Just added some details on leveraging it.	2023-06-21 19:26:50 -07:00
Kenzie Mihardja	b8d78424ab	Change Data Loader Namespace (#6568 ) Description: Update the artifact name of the xml file and the namespaces. Co-authored with @tjaffri Co-authored-by: Kenzie Mihardja <kenzie@docugami.com>	2023-06-21 19:24:04 -07:00
Gengliang Wang	0673245d0c	Remove duplicate databricks entries in ecosystem integrations (#6569 ) Currently, there are two Databricks entries in https://python.langchain.com/docs/ecosystem/integrations/ The reason is that there are duplicated notebooks for Databricks integration: * https://github.com/hwchase17/langchain/blob/master/docs/extras/ecosystem/integrations/databricks.ipynb * https://github.com/hwchase17/langchain/blob/master/docs/extras/ecosystem/integrations/databricks/databricks.ipynb This PR is to remove the second one for simplicity.	2023-06-21 19:14:33 -07:00
Suri Chen	14b9418cc5	Fix whatsappchatloader - enable parsing new datetime format on WhatsApp chat (#6555 ) - Description: observed new format on WhatsApp exported chat - example: `[2023/5/4, 16:17:13] ~ Carolina: 🥺` - Dependencies: no additional dependencies required - Tag maintainer: @rlancemartin, @eyurtsev	2023-06-21 19:11:49 -07:00
Zander Chase	5322bac5fc	Wait for all futures (#6554 ) - Expose method to wait for all futures - Wait for submissions in the run_on_dataset functions to ensure runs are fully submitted before cleaning up	2023-06-21 18:20:17 -07:00
HenriZuber	e0605b464b	feat: faiss filter from list (#6537 ) ### Feature Using FAISS on a retrievalQA task, I found myself wanting to allow in multiple sources. From what I understood, the filter feature takes in a dict of form {key: value} which then will check in the metadata for the exact value linked to that key. I added some logic to be able to pass a list which will be checked against instead of an exact value. Passing an exact value will also work. Here's an example of how I could then use it in my own project: ``` pdfs_to_filter_in = ["file_A", "file_B"] filter_dict = { "source": [f"source_pdfs/{pdf_name}.pdf" for pdf_name in pdfs_to_filter_in] } retriever = db.as_retriever() retriever.search_kwargs = {"filter": filter_dict} ``` I added an integration test based on the other ones I found in `tests/integration_tests/vectorstores/test_faiss.py` under `test_faiss_with_metadatas_and_list_filter()`. It doesn't feel like this is worthy of its own notebook or doc, but I'm open to suggestions if needed. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 10:49:01 -07:00
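The list-valued filter described in #6537 can be sketched in plain Python. This is an illustration of the matching rule only, not the FAISS vector store's actual code; `matches_filter` is a hypothetical helper name.

```python
from typing import Any, Dict


def matches_filter(metadata: Dict[str, Any], filter_dict: Dict[str, Any]) -> bool:
    """Return True when every filter key matches the document metadata.

    A filter value may be a single value (exact match) or a list
    (match when the metadata value is any member of the list).
    """
    for key, wanted in filter_dict.items():
        actual = metadata.get(key)
        if isinstance(wanted, list):
            if actual not in wanted:
                return False
        elif actual != wanted:
            return False
    return True


allowed = {"source": ["source_pdfs/file_A.pdf", "source_pdfs/file_B.pdf"]}
keep = matches_filter({"source": "source_pdfs/file_A.pdf"}, allowed)
drop = matches_filter({"source": "source_pdfs/file_C.pdf"}, allowed)
```

Passing a single value instead of a list falls through to the exact-match branch, which mirrors the note that exact values keep working.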
Davis Chase	00a7403236	update pr tmpl (#6552 )	2023-06-21 10:03:52 -07:00
Jeroen Van Goey	57b5f42847	Remove unintended double negation in docstring (#6541 ) Small typo fix. `ImportError: If importing vertexai SDK didn't not succeed.` -> `ImportError: If importing vertexai SDK did not succeed.`.	2023-06-21 10:01:28 -07:00
Andrey E. Vedishchev	a2a0715bd4	Minor Grammar Fixes in Docs and Comments (#6536 ) Just some grammar fixes: I found "retriver" instead of "retriever" in several comments across the documentation and in the comments. I fixed it. Co-authored-by: andrey.vedishchev <andrey.vedishchev@rgigroup.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 09:53:31 -07:00
dirtysalt	57cc3d1d3d	[Feature][VectorStore] Support StarRocks as vector db (#6119 ) Here are some examples to use StarRocks as vectordb ``` from langchain.vectorstores import StarRocks from langchain.vectorstores.starrocks import StarRocksSettings embeddings = OpenAIEmbeddings() # configure starrocks settings settings = StarRocksSettings() settings.port = 41003 settings.host = '127.0.0.1' settings.username = 'root' settings.password = '' settings.database = 'zya' # to fill new embeddings docsearch = StarRocks.from_documents(split_docs, embeddings, config = settings) # or to use already-built embeddings in database. docsearch = StarRocks(embeddings, settings) ``` #### Who can review? @dev2049 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 09:02:33 -07:00
Zander Chase	7a4ff424fc	Relax string input mapper check (#6544 ) for run evaluator. It could be that an evaluator doesn't need the output	2023-06-21 08:01:42 -07:00
Harrison Chase	ace442b992	bump to ver 208 (#6540 )	2023-06-21 07:32:36 -07:00
Harrison Chase	53c1f120a8	Harrison/multi tool (#6518 )	2023-06-21 07:19:52 -07:00
Naman Modi	37a89918e0	Infino integration for simplified logs, metrics & search across LLM data & token usage (#6218 ) ### Integration of Infino with LangChain for Enhanced Observability This PR aims to integrate [Infino](https://github.com/infinohq/infino), an open source observability platform written in rust for storing metrics and logs at scale, with LangChain, providing users with a streamlined and efficient method of tracking and recording LangChain experiments. By incorporating Infino into LangChain, users will be able to gain valuable insights and easily analyze the behavior of their language models. #### Please refer to the following files related to integration: - `InfinoCallbackHandler`: A [callback handler](https://github.com/naman-modi/langchain/blob/feature/infino-integration/langchain/callbacks/infino_callback.py) specifically designed for storing chain responses within Infino. - Example `infino.ipynb` file: A comprehensive notebook named [infino.ipynb](https://github.com/naman-modi/langchain/blob/feature/infino-integration/docs/extras/modules/callbacks/integrations/infino.ipynb) has been included to guide users on effectively leveraging Infino for tracking LangChain requests. - [Integration Doc](https://github.com/naman-modi/langchain/blob/feature/infino-integration/docs/extras/ecosystem/integrations/infino.mdx) for Infino integration. By integrating Infino, LangChain users will gain access to powerful visualization and debugging capabilities. Infino enables easy tracking of inputs, outputs, token usage, execution time of LLMs. This comprehensive observability ensures a deeper understanding of individual executions and facilitates effective debugging. Co-authors: @vinaykakade @savannahar68 --------- Co-authored-by: Vinay Kakade <vinaykakade@gmail.com>	2023-06-21 01:38:20 -07:00
Elijah Tarr	e0f468f6c1	Update model token mappings/cost to include 0613 models (#6122 ) Add `gpt-3.5-turbo-16k` to model token mappings, as per the following new OpenAI blog post: https://openai.com/blog/function-calling-and-other-api-updates Fixes #6118 Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 01:37:16 -07:00
Jakub Misiło	5d149e4d50	Fix issue with non-list `To` header in GmailSendMessage Tool (#6242 ) Fixing the problem of feeding `str` instead of `List[str]` to the email tool. Fixes #6234 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 01:25:49 -07:00
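A small sketch of the kind of normalization #6242 describes, accepting either a `str` or a `List[str]` for recipients. The helper name `normalize_recipients` is hypothetical and this is not the Gmail tool's actual code; the `join` comparison only shows one way a bare string can go wrong when list handling is applied to it.

```python
from typing import List, Union


def normalize_recipients(to: Union[str, List[str]]) -> List[str]:
    """Wrap a single address in a list; copy an existing list unchanged."""
    if isinstance(to, str):
        return [to]
    return list(to)


# Joining a bare string iterates over its characters, which garbles the header.
garbled = ", ".join("a@example.com")
header = ", ".join(normalize_recipients("a@example.com"))
```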
Anubhav Bindlish	94c7899257	Integrate Rockset as Vectorstore (#6216 ) This PR adds Rockset as a vectorstore for langchain. [Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/) is a real time OLAP database which provides a fast and efficient vector search functionality. Further since it is entirely schemaless, it can store metadata in separate columns thereby allowing fast metadata filters during vector similarity search (as opposed to storing the entire metadata in a single JSON column). It currently supports three distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and `DOT_PRODUCT`. This PR adds `rockset` client as an optional dependency. We would love a twitter shoutout, our handle is https://twitter.com/RocksetCloud --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 01:22:27 -07:00
ElReyZero	ab7ecc9c30	Feat: Add a prompt template parameter to qa with structure chains (#6495 ) This pull request introduces a new feature to the LangChain QA Retrieval Chains with Structures. The change involves adding a prompt template as an optional parameter for the RetrievalQA chains that utilize the recently implemented OpenAI Functions. The main purpose of this enhancement is to provide users with the ability to input a more customizable prompt to the chain. By introducing a prompt template as an optional parameter, users can tailor the prompt to their specific needs and context, thereby improving the flexibility and effectiveness of the RetrievalQA chains. ## Changes Made - Created a new optional parameter, "prompt", for the RetrievalQA with structure chains. - Added an example to the RetrievalQA with sources notebook. My twitter handle is @El_Rey_Zero --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 00:23:36 -07:00
Mircea Pasoi	2e024823d2	Add async support for HuggingFaceTextGenInference (#6507 ) Adding support for async calls in `HuggingFaceTextGenInference` Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-20 23:12:24 -07:00
Hassan Ouda	456ca3d587	Be able to use Codey models on Vertex AI (#6354 ) Added the functionality to leverage 3 new Codey models from Vertex AI: - code-bison - Code generation using the existing LLM integration - code-gecko - Code completion using the existing LLM integration - codechat-bison - Code chat using the existing chat_model integration --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-20 23:11:54 -07:00
囧囧	0fce8ef178	Add KuzuQAChain (#6454 ) This PR adds `KuzuGraph` and `KuzuQAChain` for interacting with [Kùzu database](https://github.com/kuzudb/kuzu). Kùzu is an in-process property graph database management system (GDBMS) built for query speed and scalability. The `KuzuGraph` and `KuzuQAChain` provide the same functionality as the existing integrations with NebulaGraph and Neo4j and enable query generation and question answering over a Kùzu database. A notebook example and a simple test case have also been added. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-20 22:07:00 -07:00
Chanin Nantasenamat	6e07283dd5	Update index.mdx (#6326 ) #### Fix Added the mention of "store" amongst the tasks that the data connection module can perform aside from the existing 3 (load, transform and query). Particularly, this implies the generation of embeddings vectors and the creation of vector stores.	2023-06-20 21:40:20 -07:00
Zander Chase	ffa4ff1a2e	Export trajectory eval fn (#6509 ) from the run_evaluators dir	2023-06-20 21:18:28 -07:00
TheOnlyWayUp	bb437646fc	typo(llamacpp.ipynb): 'condiser' -> 'consider' (#6474 )	2023-06-20 18:48:25 -07:00
northern-64bit	7492060525	Fix typo in docstring of format_tool_to_openai_function (#6479 ) Fixes typo "open AI" to "OpenAI" in docstring of `format_tool_to_openai_function` in `langchain/tools/convert_to_openai.py`.	2023-06-20 18:42:30 -07:00
Davis Chase	b3c49e94a0	Make streamlit import optional (#6510 )	2023-06-20 18:41:59 -07:00
Daniel McDonald	cece8c8bf0	Fixed: 'readible' -> readable (#6492 ) Hello there👋 I have made a pull request to fix a small typo.	2023-06-20 18:39:59 -07:00
hsparmar	834c3378af	Documentation Fix: Correct the example code output in the prompt templates doc (#6496 ) Documentation is showing the wrong example output for the prompt templates code snippet. This PR fixes that issue.	2023-06-20 17:21:09 -07:00
Davis Chase	c91cf68754	Fix link (#6501 )	2023-06-20 14:44:22 -07:00
Davis Chase	3298bf4f00	docs/fix links (#6498 )	2023-06-20 14:06:50 -07:00
Lance Martin	ae6196507d	Update notebook for MD header splitter and create new cookbook (#6399 ) Move MD header text splitter example to its own cookbook.	2023-06-20 13:53:41 -07:00
Stefano Lottini	22af93d851	Vector store support for Cassandra (#6426 ) This addresses #6291 adding support for using Cassandra (and compatible databases, such as DataStax Astra DB) as a [Vector Store](https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor(ANN)+Vector+Search+via+Storage-Attached+Indexes). A new class `Cassandra` is introduced, which complies with the contract and interface for a vector store, along with the corresponding integration test, a sample notebook and modified dependency toml. Dependencies: the implementation relies on the library `cassio`, which simplifies interacting with Cassandra for ML- and LLM-oriented workloads. CassIO, in turn, uses the `cassandra-driver` low-level drivers to communicate with the database. The former is added as optional dependency (+ in `extended_testing`), the latter was already in the project. Integration testing relies on a locally-running instance of Cassandra. [Here](https://cassio.org/more_info/#use-a-local-vector-capable-cassandra) a detailed description can be found on how to compile and run it (at the time of writing the feature has not made it yet to a release). During development of the integration tests, I added a new "fake embedding" class for what I consider a more controlled way of testing the MMR search method. Likewise, I had to amend what looked like a glitch in the behaviour of `ConsistentFakeEmbeddings` whereby an `embed_query` call would have bypassed storage of the requested text in the class cache for use in later repeated invocations. @dev2049 might be the right person to tag here for a review. Thank you! --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-20 10:46:20 -07:00
Harrison Chase	cac6e45a67	improve documentation on base chain (#6468 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-06-20 10:34:57 -07:00
Zeeland	ad7089a6d0	fix: change ddg to DDGS (#6480 ) This commit updates the duckduckgo search utility by using a more accurate name in the import statement.	2023-06-20 10:15:05 -07:00
Davis Chase	8cd5f65a6f	release 207 (#6488 )	2023-06-20 10:14:29 -07:00
zhaoshengbo	ab44c24333	Add Alibaba Cloud OpenSearch as a new vector store (#6154 ) Hello Folks, Thanks for creating and maintaining this great project. I'm excited to submit this PR to add Alibaba Cloud OpenSearch as a new vector store. OpenSearch is a one-stop platform to develop intelligent search services. OpenSearch was built based on the large-scale distributed search engine developed by Alibaba. OpenSearch serves more than 500 business cases in Alibaba Group and thousands of Alibaba Cloud customers. OpenSearch helps develop search services in different search scenarios, including e-commerce, O2O, multimedia, the content industry, communities and forums, and big data query in enterprises. OpenSearch provides the vector search feature. In specific scenarios, especially test question search and image search scenarios, you can use the vector search feature together with the multimodal search feature to improve the accuracy of search results. This PR includes: an AlibabaCloudOpenSearch class that can connect to the Alibaba Cloud OpenSearch instance; adding embeddings and metadata into an OpenSearch datasource; querying by squared Euclidean distance and metadata; integration tests; an IPython notebook and docs. I have read your contributing guidelines. And I have passed the tests below - [x] make format - [x] make lint - [x] make coverage - [x] make test --------- Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>	2023-06-20 10:07:40 -07:00
Davis Chase	b7ad4c4c30	fix openai qa chain (#6487 )	2023-06-20 10:01:13 -07:00
thehunmonkgroup	10adec5f1b	add FunctionMessage support to `_convert_dict_to_message()` in OpenAI chat model (#6382 ) Already supported in the reverse operation in `_convert_message_to_dict()`, this just provides parity. @hwchase17 @agola11 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-20 08:25:55 -07:00
Harrison Chase	7414e9d196	bump version to 206 (#6465 )	2023-06-19 23:05:09 -07:00
Hubert	22601b0b63	fix neo4j schema query (#6381 ) Fixes #6380 #### Who can review? @hwchase17 --------- Co-authored-by: HubertKl <HubertKl>	2023-06-19 22:48:35 -07:00
Gavin	b0d80c4b3e	Update serpapi.py Support baidu list type answer_box (#6386 ) Support baidu list type answer_box From [this document](https://serpapi.com/baidu-answer-box), we can know that the answer_box attribute returned by the Baidu interface is a list, and the list contains only one Object, but an error will occur when the current code is executed. So when answer_box is a list, we reset res["answer_box"] so that the code can execute successfully.	2023-06-19 22:48:18 -07:00
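The list handling described in #6386 might look roughly like the sketch below. The commit text only says that `res["answer_box"]` is reset when it is a list; unwrapping to the first element is an assumption here, and `normalize_answer_box` is a hypothetical helper name, not a function in `serpapi.py`.

```python
from typing import Any, Dict


def normalize_answer_box(res: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the response whose answer_box is a dict, not a list."""
    box = res.get("answer_box")
    if isinstance(box, list):
        res = dict(res)  # shallow copy so the caller's response is untouched
        # Assumption: the single object inside the list is the real answer box.
        res["answer_box"] = box[0] if box else {}
    return res


baidu_like = {"answer_box": [{"answer": "42"}]}
normalized = normalize_answer_box(baidu_like)
```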
Bryce Drennan	384fa43fc3	fix: llm caching for replicate (#6396 ) Caching wasn't accounting for which model was used so a result for the first executed model would return for the same prompt on a different model. This was because `Replicate._identifying_params` did not include the `model` parameter. FYI - @cbh123 - @hwchase17 - @agola11	2023-06-19 22:47:59 -07:00
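The caching bug in #6396 can be illustrated with a toy cache key. This is a sketch, not LangChain's cache implementation: `cache_key` is a hypothetical helper, the parameter values are made up, and the only point is that a key built from identifying parameters cannot tell models apart unless `model` is among them.

```python
from typing import Any, Dict, Tuple


def cache_key(prompt: str, identifying_params: Dict[str, Any]) -> Tuple[str, str]:
    """Build a cache key from the prompt and the LLM's identifying parameters."""
    return (prompt, str(sorted(identifying_params.items())))


prompt = "Tell me a joke"

# Before the fix: the model name is absent, so two models share one key.
collide_a = cache_key(prompt, {"temperature": 0.5})
collide_b = cache_key(prompt, {"temperature": 0.5})

# After the fix: the model name is part of the key, so results stay separate.
distinct_a = cache_key(prompt, {"model": "model-a", "temperature": 0.5})
distinct_b = cache_key(prompt, {"model": "model-b", "temperature": 0.5})
```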
Zeeland	8a604b93ab	feat: use latest duckduckgo_search API to call (#6409 ) # Provider the latest duckduckgo_search API The Git commit contents involve two files related to some DuckDuckGo query operations, and an upgrade of the DuckDuckGo module to version 3.8.3. A suitable commit message could be "Upgrade DuckDuckGo module to version 3.8.3, including query operations". Specifically, in the duckduckgo_search.py file, a DDGS() class instance is newly added to replace the previous ddg() function, and the time parameter name in the get_snippets() and results() methods is changed from "time" to "timelimit" to accommodate recent changes. In the pyproject.toml file, the duckduckgo-search module is upgraded to version 3.8.3. [duckduckgo_search readme attention](https://github.com/deedy5/duckduckgo_search): Versions before v2.9.4 no longer work as of May 12, 2023 ## Who can review? @vowelparrot --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:47:39 -07:00
Harrison Chase	9eec7c3206	Harrison/unstructured page number (#6464 ) Co-authored-by: Reza Sanaie <reza@sanaie.ca>	2023-06-19 22:31:43 -07:00
Alonso Silva Allende	b82ddf9cfb	Improve error message (#6275 ) When using OpenAI models like 'text-davinci-002' or 'text-davinci-003', the agent doesn't work and the message is 'Only supported with OpenAI models.' The error message should be 'Only supported with ChatOpenAI models.' My Twitter handle is @alonsosilva #### Who can review? @hwchase17 Co-authored-by: SILVA Alonso <alonso.silva@nokia-bell-labs.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:21:01 -07:00
zengbo	7e5f5ebf86	Fix the issue where ANTHROPIC_API_URL set in environment is not takin… (#6400 ) I apologize for the error: the 'ANTHROPIC_API_URL' environment variable doesn't take effect if the 'anthropic_api_url' parameter has a default value. #### Who can review? Models - @hwchase17 - @agola11	2023-06-19 22:20:36 -07:00
Grayson Adkins	9f5f747dc3	Fix broken links in autonomous agents docs (#6398 ) Fixes broken links here: https://python.langchain.com/docs/use_cases/autonomous_agents.html #### Who can review? Tag maintainers/contributors who might be interested: Agents / Tools / Toolkits - @hwchase17	2023-06-19 22:20:00 -07:00
volodymyr-memsql	d2e9b621ab	Update SinglStoreDB vectorstore (#6423 ) 1. Introduced new distance strategies support: DOT_PRODUCT and EUCLIDEAN_DISTANCE for enhanced flexibility. 2. Implemented a feature to filter results based on metadata fields. 3. Incorporated connection attributes specifying "langchain python sdk" usage for enhanced traceability and debugging. 4. Expanded the suite of integration tests for improved code reliability. 5. Updated the existing notebook with the usage example @dev2049 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:08:58 -07:00
Avinash Raj	6efd5fa2b9	Fix for #6431 - chatprompt template with partial variables giing validation error (#6456 ) After recent changes, ChatPromptTemplate does not accept partial variables. This PR should fix that issue. Fixes #6431 #### Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:08:15 -07:00
Harrison Chase	02c0a1e77e	Harrison/functions in retrieval (#6463 )	2023-06-19 22:07:58 -07:00
Swapnil Sharma	dc4ffa8d9b	Incorrect argument count handling (#5543 ) Throwing ToolException when incorrect arguments are passed to tools so that the agent can course correct them. # Incorrect argument count handling I was facing an error where the agent passed incorrect arguments to tools. As per the discussions going around, I started throwing ToolException to allow the model to course correct. ## Who can review? Community members can review the PR once tests pass. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:06:20 -07:00
kYLe	3a58c4c3a0	Fixed a link typo /-/route -> /-/routes. and change endpoint format (#6186 ) Fixes a link typo from `/-/route` to `/-/routes` and changes the endpoint format from `f"{self.anyscale_service_url}/{self.anyscale_service_route}"` to `f"{self.anyscale_service_url}{self.anyscale_service_route}"`. Also adds documentation about the format of the endpoint. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:05:54 -07:00
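The endpoint format change in #6186 is easiest to see with concrete values. The sketch below assumes the service route already begins with a slash, which the new format implies but the commit message does not state. The URL and route values are made up for illustration.

```python
def old_endpoint(service_url: str, service_route: str) -> str:
    """Previous format: always inserts a slash between URL and route."""
    return f"{service_url}/{service_route}"


def new_endpoint(service_url: str, service_route: str) -> str:
    """New format: the route is expected to carry its own leading slash."""
    return f"{service_url}{service_route}"


url = "https://service.example.com"  # hypothetical service URL
route = "/my-route"                  # hypothetical route with a leading slash

before = old_endpoint(url, route)
after = new_endpoint(url, route)
```

Under that assumption the old format yields a doubled slash before the route, while the new one produces a clean path.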
Leonid Ganeline	03b16ed2b1	docs `retrievers` fixes (#6299 ) Fixed several inconsistencies: - file names and notebook titles should be similar; otherwise the ToC on the [retrievers page](https://python.langchain.com/en/latest/modules/indexes/retrievers.html) and the left ToC tab differ. For example, `Self-querying with Chroma` is currently not sorted alphabetically because its file is named `chroma_self_query.ipynb` - `Stringing compressors and document transformers...` demoted from `#` to `##`; otherwise it appears in the ToC. - several formatting problems #### Who can review? @hwchase17 @dev2049 Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:04:35 -07:00
M. Tolga Cangöz	bccee85c8f	Update introduction.mdx (#6425 ) Fix typo	2023-06-19 22:04:09 -07:00
Nir Gazit	95b77a5215	Fix Custom LLM Agent example (#6429 ) The `CustomOutputParser` needs to throw `OutputParserException` when it fails to parse the response from the agent, so that the executor can [catch it and retry](`be9371ca8f/langchain/agents/agent.py (L767)`) when `handle_parsing_errors=True`. #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17	2023-06-19 22:03:58 -07:00
ykerus	b697bbb5b5	Remove backticks without clear purpose from docs (#6442 ) #### Description - Removed two backticks surrounding the phrase "chat messages as" - This phrase stood out among other formatted words/phrases such as `prompt`, `role`, `PromptTemplate`, etc., which all seem to have a clear function. - `chat messages as`, formatted as such, confused me while reading, leading me to believe the backticks were misplaced. #### Who can review? @hwchase17	2023-06-19 22:03:38 -07:00
Dhruvil Shah	9494623869	Update web_base.ipynb (#6430 ) Minor new line character in the markdown. Also, this option is not yet in the latest version of LangChain (0.0.190) from Conda. Maybe in the next update. @eyurtsev @hwchase17	2023-06-19 21:43:35 -07:00
Wenchen Li	76ae9da9db	Add `_similarity_search_with_relevance_scores` in `Pinecone` (#6446 ) Just so it is consistent with other `VectorStore` classes. This is a follow-up of #6056, which also discussed potentially adding `similarity_search_by_vector_returning_embeddings`; we will continue that discussion here. Potentially related: #6286 #### Who can review? Tag maintainers/contributors who might be interested: @rlancemartin	2023-06-19 21:36:40 -07:00
Ismail Pelaseyed	d4e8e0f5ab	Add example for question answering over documents with OpenAI Function Agent (#6448 ) This PR adds an example of doing question answering over documents using OpenAI Function Agents. #### Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 21:35:45 -07:00
Andrey Avtomonov	68a675cc68	Remove extra word in the introduction documentation (#6450 ) Removed an extra word in the introduction documentation, a simple typo	2023-06-19 21:31:17 -07:00
Ankush Gola	a9246333fd	fix anthropic chat model mutating input list (#6457 ) Fixes: ChatAnthropic was mutating the input message list during formatting, which isn't ideal because you could be changing the behavior for other chat models when using the same input.	2023-06-19 21:30:52 -07:00
Zander Chase	bc0af67aaf	Add Trajectory Eval RunEvaluator (#6449 )	2023-06-19 21:11:50 -07:00
Hakan Tekgul	6a157cf8bb	Update arize_callback.py (#6433 ) Arize released a new Generative LLM Model Type, adjusting the callback function to new logging. Added arize imports, please delete if not necessary. Specifically, this change makes sure that the prompt and response pairs from LangChain agents are logged into Arize as a Generative LLM model, instead of our previous categorical model. In order to do this, the callback function collects the necessary data and passes it into Arize using the Python Pandas SDK. The Arize library, specifically pandas.logger, is an additional dependency. Notebook For Test: https://docs.arize.com/arize/resources/integrations/langchain Who can review? Tag maintainers/contributors who might be interested: @hwchase17 - project lead Tracing / Callbacks @agola11	2023-06-19 18:33:49 -07:00
Zander Chase	00f276d23f	Run eval in eval mode (#6447 ) For the `run_on_dataset` sessions	2023-06-19 18:31:38 -07:00
Harrison Chase	1300a4bc8c	expose docs chains (#6453 )	2023-06-19 17:18:54 -07:00
Harrison Chase	286452c7f0	remove mongo	2023-06-19 10:04:14 -07:00
David Duong	be9371ca8f	Include placeholder value for all secrets, not just kwargs (#6421 ) Mirror PR for https://github.com/hwchase17/langchainjs/pull/1696 Secrets passed via environment variables should be present in the serialised chain	2023-06-19 15:41:45 +01:00
Harrison Chase	df40cd233f	bump version to 205 (#6410 )	2023-06-18 23:21:26 -07:00
Harrison Chase	e9c2b280db	Harrison/refactor functions (#6408 )	2023-06-18 23:13:42 -07:00
Harrison Chase	6a4a950a3c	changes to llm chain (#6328 ) - return raw and full output (but keep run shortcut method functional) - change output parser to take in generations (good for working with messages) - add output parser to base class, always run (default to same as current) --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-06-18 22:49:47 -07:00
Davis Chase	d3c2eab0b3	Docs nit (#6350 )	2023-06-18 20:58:12 -07:00
Davis Chase	af96de6552	fix prod docs build (#6402 )	2023-06-18 20:56:12 -07:00
Fei Wang	50556f3b35	support memory for functions (#6165 ) #### Before submitting Add memory support for `OpenAIFunctionsAgent` like `StructuredChatAgent`. #### Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 19:00:40 -07:00
Dhruvil Shah	b2b9ded12f	Update web_base.py _fetch() method For SiteMapLoader (#6256 ) A must-include for SiteMap Loader to avoid the SSL verification error. On some websites (for example https://researchadmin.asu.edu/sitemap.xml), setting `verify` to False via `sitemap_loader.requests_kwargs = {"verify": False}` does not bypass SSL verification. We need this merge to tell the Session to use a connector with a specific argument about SSL (for SiteMap SSL verification): `if not self.request_kwargs['verify']: connector = aiohttp.TCPConnector(ssl=False) else: connector = None` Fixes #5483 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 18:34:18 -07:00
Harrison Chase	10bff4ecc4	Harrison/chroma fix (#6390 ) Co-authored-by: Junu Moon(Fran) <francomoon7@gmail.com>	2023-06-18 18:33:26 -07:00
Harrison Chase	5c1fa3e70e	Harrison/typesense fix (#6391 ) Co-authored-by: Gaurav Chauhan <2796gaurav@gmail.com> Co-authored-by: gaurav <gaurav.chauhan1@rksv.in>	2023-06-18 18:33:15 -07:00
Harrison Chase	5ccebce777	rm pandas from arize (#6392 )	2023-06-18 18:33:04 -07:00
matias-biatoz	3b7c4c51d5	Added gpt-3.5-turbo 0613 16k and 16k-0613 pricing (#6287 ) @agola11 Issue #6193 I added the new pricing for the new models. Also, now gpt-3.5-turbo got split into "input" and "output" pricing. It currently does not support that.	2023-06-18 18:32:20 -07:00
Ly Nguyen	1e0af59f69	- Fix pass system_message argument in new feature openai_functions_agent (#6297 ) The system_message argument could not be passed; the prompt always showed the default message "System: You are a helpful AI assistant." ``` system_message = SystemMessage( content="You are an AI that provides information to Human regarding documentation." ) agent = initialize_agent( tools, llm=openai_llm_chat, agent=AgentType.OPENAI_FUNCTIONS, system_message=system_message, agent_kwargs={ "system_message": system_message, }, verbose=False, ) ```	2023-06-18 17:54:00 -07:00
georgian	e64bafed3a	Fixes typo in Vectara.similarity_search (#6277 ) Fixes a simple typo. @hwchase17 @dev2049 Co-authored-by: Georgian Sarghi <georgian.sarghi@gmail.com>	2023-06-18 17:48:54 -07:00
Ted	112695e4da	Iterate through filtered file types instead of all listed files (#6258 ) # Iterate through filtered file types instead of all listed files Fixes https://github.com/hwchase17/langchain/issues/6257 https://github.com/hwchase17/langchain/pull/4926 originally added the functionality to filter by file type, storing the filtered files in `_files` https://github.com/hwchase17/langchain/pull/5220 removed the functionality when adding code to filter trashed files by using the `files` variables instead of the `_files` variable. This PR simply adds the functionality back by using `_files` again. #### Who can review? @hwchase17 - project lead @eyurtsev	2023-06-18 17:47:58 -07:00
Dhruvil Shah	ba90e3c990	Update web_base.ipynb for guiding purposes (#6248 ) To bypass SSL verification errors during fetching, you can include the `verify=False` parameter. This markdown proves useful, especially for beginners in the field of web scraping. Fixes #6079 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:47:10 -07:00
Dhruvil Shah	92f05a67a4	Add markdown to specify important arguments (#6246 ) To bypass SSL verification errors during web scraping, you can include the ssl_verify=False parameter along with the headers parameter. This combination of arguments proves useful, especially for beginners in the field of web scraping. Fixes #1829 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:47:00 -07:00
ikebo	ca7a44d024	add max_context_size property in BaseOpenAI (#6239 ) Hi, I made a small improvement to BaseOpenAI. I added a max_context_size attribute to BaseOpenAI so that we can get the max context size directly instead of only getting the maximum token size of the prompt through the max_tokens_for_prompt method. Who can review? @hwchase17 @agola11 I followed the [Common Tasks](`c7db9febb0/.github/CONTRIBUTING.md`); all tests passed. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:46:35 -07:00
Jan Pawellek	3e3ed8c5c9	Fix LLM types so that they can be loaded from config dicts (#6235 ) LLM configurations can be loaded from a Python dict (or JSON file deserialized as dict) using the [load_llm_from_config](`8e1a7a8646/langchain/llms/loading.py (L12)`) function. However, the type string in the `type_to_cls_dict` lookup dict differs from the type string defined in some LLM classes. This means that the LLM object can be saved, but not loaded again, because the type strings differ.	2023-06-18 17:46:22 -07:00
Shu	46782ad79b	Fixed an unhandled error that was raised when DynamoDB did not have any chat history. (#6141 ) The current version of chat history with DynamoDB doesn't correctly handle the case when a table has no chat history. This change fixes the error handling. Fixes https://github.com/hwchase17/langchain/issues/6088 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17	2023-06-18 17:39:19 -07:00
Cameron Vetter	2286204354	Correct AzureSearch Vector Store not applying search_kwargs when searching (#6132 ) Fixes #6131 Simply passes kwargs forward from similarity_search to helper functions so that search_kwargs are applied to search as originally intended. See bug for repro steps. #### Who can review? @hwchase17 @dev2049 Twitter: poshporcupine	2023-06-18 17:39:06 -07:00
Pierre Dulac	395a2a3724	Fix typo in the CAI critique prompt (#6123 ) Very small typo in the Constitutional AI critique default prompt. The negation "If there is no material critique of ..." is used two times, should be used only on the first one. Cheers, Pierre	2023-06-18 17:38:56 -07:00
Hao Chen	38057f0d2e	Fix latest clickhouse vector schema change (#6385 ) Fixes https://github.com/hwchase17/langchain/issues/6208 #### Who can review? VectorStores / Retrievers / Memory - @dev2049	2023-06-18 17:34:53 -07:00
Davit Buniatyan	1ab9dc8293	[hotfix] Deep Lake fails on newer version due to hardcode (#6383 ) Hot Fixes for Deep Lake [would highly appreciate expedited review] * deeplake version was hardcoded and since deeplake upgraded the integration fails with confusing error * an additional integration test fixed due to embedding function * Additionally fixed docs for code understanding links after docs upgraded * notebook removal of public parameter to make sure code understanding notebook works #### Who can review? @hwchase17 @dev2049 --------- Co-authored-by: Davit Buniatyan <d@activeloop.ai>	2023-06-18 17:33:49 -07:00
hp0404	6aa7b04f79	Fix integration tests for Faiss vector store (#6281 ) Fixes #5807 (issue) #### Who can review? Tag maintainers/contributors who might be interested: @dev2049	2023-06-18 17:25:49 -07:00
Chakib Benziane	ddd518a161	searx_search: updated tools and doc (#6276 ) - Allows using the same wrapper to create multiple tools ```python wrapper = SearxSearchWrapper(searx_host="**") github_tool = SearxSearchResults(name="Github", wrapper=wrapper, kwargs = { "engines": ["github"], }) arxiv_tool = SearxSearchResults(name="Arxiv", wrapper=wrapper, kwargs = { "engines": ["arxiv"] }) ``` - Updated link to searx documentation Agents / Tools / Toolkits - @hwchase17	2023-06-18 17:23:12 -07:00
ju-bezdek	e2f36ee608	OpenAI functions dont work with async streaming... #6225 (#6226 ) Related to https://github.com/hwchase17/langchain/issues/6225 Just copied the implementation from the `generate` function to `agenerate` and tested it. Didn't run any official tests, though. Fixes #6225 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17, @agola11	2023-06-18 17:05:16 -07:00
Jan Pawellek	ea6a5b03e0	Fix output final text for HuggingFaceTextGenInference when streaming (#6211 ) The LLM integration [HuggingFaceTextGenInference](https://github.com/hwchase17/langchain/blob/master/langchain/llms/huggingface_text_gen_inference.py) already has streaming support. However, when streaming is enabled, it always returns an empty string as the final output text when the LLM is finished. This is because `text` is instantiated with an empty string and never updated. This PR fixes the collection of the final output text by concatenating new tokens.	2023-06-18 17:01:15 -07:00
Tomaz Bratanic	b3bccabc66	Add option to save/load graph cypher QA (#6219 ) Similar to https://github.com/hwchase17/langchain/pull/5818 Added the functionality to save/load Graph Cypher QA Chain due to a user reporting the following error > raise NotImplementedError("Saving not supported for this chain type.")\nNotImplementedError: Saving not supported for this chain type.\n'	2023-06-18 17:00:27 -07:00
Harrison Chase	495128ba95	Harrison/functions docs improvements (#6389 ) Co-authored-by: Sumanth Donthula <46747610+sumanthdonthula@users.noreply.github.com>	2023-06-18 16:57:33 -07:00
Leonid Ganeline	c7ca350cd3	Fix class promotion (#6187 ) In LangChain, all module classes are enumerated in the `__init__.py` file of the corresponding module, but some classes were missing from it. This PR: - added the missing classes to the module `__init__.py` files - sorted the `__init__.py:__all__` variable value (a list of the class names) - renamed `langchain.tools.sql_database.tool.QueryCheckerTool` to `QuerySQLCheckerTool` because it conflicted with `langchain.tools.spark_sql.tool.QueryCheckerTool` - changes to `pyproject.toml`: - added `pgvector` to `pyproject.toml:extended_testing` - added `pandas` to `pyproject.toml:[tool.poetry.group.test.dependencies]` - commented out `streamlit` from `callbacks/__init__.py`, because `streamlit` now requires Python >=3.7, !=3.9.7 - fixed duplicate names in `tools` - fixed the corresponding unit tests #### Who can review? @hwchase17 @dev2049	2023-06-18 16:55:18 -07:00
Harrison Chase	c0c2fd0782	Harrison/zep mem (#6388 ) Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2023-06-18 16:53:35 -07:00
Harrison Chase	b7159c15cc	Harrison/metaphor search fix (#6387 ) Co-authored-by: jeffzwang <jeffreyzhiyuanwang@gmail.com>	2023-06-18 16:53:24 -07:00
Harrison Chase	9bf5b0defa	Harrison/myscale self query (#6376 ) Co-authored-by: Fangrui Liu <fangruil@moqi.ai> Co-authored-by: 刘方瑞 <fangrui.liu@outlook.com> Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>	2023-06-18 16:53:10 -07:00
Harrison Chase	bd8d418a95	Merge branch 'master' of github.com:hwchase17/langchain	2023-06-18 16:45:49 -07:00
Harrison Chase	3a75d59c3d	searx - docs	2023-06-18 16:45:42 -07:00
MIDORIBIN	5be465bd86	Fixed PermissionError on windows (#6170 ) Fixed a PermissionError that occurred when downloading PDF files via HTTP in BasePDFLoader on Windows. When downloading PDF files via HTTP, BasePDFLoader uses NamedTemporaryFile, and a file created this way cannot be opened a second time on Windows ([Python Doc](https://docs.python.org/3.9/library/tempfile.html#tempfile.NamedTemporaryFile)). So we now create a temporary directory with TemporaryDirectory and place the downloaded file there. The temporary directory is deleted in the destructor. Fixes #2698 #### Who can review? Tag maintainers/contributors who might be interested: - @eyurtsev - @hwchase17	2023-06-18 16:39:57 -07:00
xleven	4fc7939848	fix link of callbacks on modules page (#6323 ) The [Callbacks](https://python.langchain.com/docs/modules/callbacks/getting_started/) link on the [Modules](https://python.langchain.com/docs/modules/) page led to a "Page Not Found".	2023-06-18 15:08:12 -07:00
Vijay	2b3b4e0f60	Add the ability to run the map_reduce chains process results step as async (#6181 ) This adds the ability to attach an AsyncCallbackManager (handler) to the reducer chain, which can then stream tokens via the `async def on_llm_new_token` callback method. Fixes issue [5532](https://github.com/hwchase17/langchain/issues/5532) @hwchase17 @agola11 The following code snippet shows how this change would be used to enable `reduce_llm` with streaming support in a `map_reduce` chain. I have tested this change and it works for the streaming use-case of reducer responses. I am happy to share more information if this solution makes sense. ``` AsyncHandler .......................... class StreamingLLMCallbackHandler(AsyncCallbackHandler): """Callback handler for streaming LLM responses.""" def __init__(self, websocket): self.websocket = websocket # This callback method is to be executed in async async def on_llm_new_token(self, token: str, **kwargs: Any) -> None: resp = ChatResponse(sender="bot", message=token, type="stream") await self.websocket.send_json(resp.dict()) Chain .......... 
stream_handler = StreamingLLMCallbackHandler(websocket) stream_manager = AsyncCallbackManager([stream_handler]) streaming_llm = ChatOpenAI( streaming=True, callback_manager=stream_manager, verbose=False, temperature=0, ) main_llm = OpenAI( temperature=0, verbose=False, ) doc_chain = load_qa_chain( llm=main_llm, reduce_llm=streaming_llm, chain_type="map_reduce", callback_manager=manager ) qa_chain = ConversationalRetrievalChain( retriever=vectorstore.as_retriever(), combine_docs_chain=doc_chain, question_generator=question_generator, callback_manager=manager, ) # Here `acall` will trigger `acombine_docs` on `map_reduce` which should then call `_aprocess_result` which in turn will call `self.combine_document_chain.arun` hence async callback will be awaited result = await qa_chain.acall( {"question": question, "chat_history": chat_history} ) ```	2023-06-18 13:19:56 -07:00
Alvaro Bartolome	e0dea577ee	Extend `ArgillaCallbackHandler` support (#6153 ) Hi again @agola11! 🤗 ## What's in this PR? After playing around with different chains we noticed that some chains were using different `output_key`s and we were just handling some, so we've extended the support to any output, either if it's a Python list or a string. Kudos to @dvsrepo for spotting this! --------- Co-authored-by: Daniel Vila Suero <daniel@argilla.io>	2023-06-18 11:18:33 -07:00
Harrison Chase	a8cb9ee013	Harrison/gdrive enhancements (#6375 ) Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>	2023-06-18 11:07:23 -07:00
rafael	ebfffaa38f	Guardrails output parser: Pass LLM api for reasking (#6089 ) Fixes https://github.com/ShreyaR/guardrails/issues/155 Enables guardrails reasking by specifying an LLM api in the output parser.	2023-06-18 10:50:20 -07:00
Davis Chase	ec850e607f	bump 203 (#6372 )	2023-06-18 09:20:47 -07:00
Lance Martin	370becdfc2	Add self query retriever example with MD header splitting (#6359 ) Flesh out the notebook example for `MarkdownHeaderTextSplitter`	2023-06-17 21:40:20 -07:00
Lance Martin	2c97fbabbd	Update MD header text splitter notebook (#6339 ) Highlight use case for maintaining header groups when splitting.	2023-06-17 13:19:27 -07:00
Harrison Chase	a2bbe3dda4	Harrison/mmr support for opensearch (#6349 ) Co-authored-by: Mehmet Öner Yalçın <oneryalcin@gmail.com>	2023-06-17 12:22:37 -07:00
Davis Chase	2eea5d4cb4	Add ignore vercel preview script (#6320 ) Skip building the docs preview for any branch that doesn't start with `__docs__`. Will eventually update to look at code diff directories, but patching for now.	2023-06-17 11:17:08 -07:00
Harrison Chase	7a48d9ee82	Merge branch 'master' of github.com:hwchase17/langchain	2023-06-17 11:16:19 -07:00
Kenny	e30fdffd1e	Add new openai 0613 model costs (#6110 ) Added costs for gpt-4-32k-0613, gpt-4-0613, gpt-3.5-turbo-16k, gpt-3.5-turbo-0613, and gpt-3.5-turbo-16k-0613 to openai_info callback based on this [OpenAI post](https://openai.com/blog/function-calling-and-other-api-updates) @agola11	2023-06-17 11:11:47 -07:00
Dhruvil Shah	2eec687474	update web_base.py to have verify option (#6107 ) We propose an enhancement to the web-based loader initialize method by introducing a "verify" option. This enhancement addresses the issue of SSL verification errors encountered on certain web pages. By providing users with the option to set the verify parameter to False, we offer greater flexibility and control. ### Fixes #6079 #### Who can review? @eyurtsev @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-17 11:10:48 -07:00
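The `verify` option described in this commit can be illustrated with the standard library alone. The sketch below is not the loader's code: `make_ssl_context` is a hypothetical helper, and it uses Python's `ssl` module in place of whatever HTTP client the loader relies on.

```python
import ssl


def make_ssl_context(verify: bool = True) -> ssl.SSLContext:
    """Build an SSL context, optionally with certificate checks disabled.

    Hypothetical helper for illustration; not part of the loader.
    """
    ctx = ssl.create_default_context()
    if not verify:
        # Hostname checking must be turned off before verify_mode
        # can be lowered to CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

Disabling verification removes protection against man-in-the-middle attacks, so a flag like this is best kept as an explicit opt-out for hosts that are known to have broken certificates.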
Harrison Chase	680d6bbbf8	fix titles in documentation	2023-06-17 11:09:11 -07:00
Nuno Campos	e194dc5306	Make lckwargs private (#6344 )	2023-06-17 19:08:25 +01:00
Harrison Chase	8cfb52ddbb	fix spelling	2023-06-17 11:06:54 -07:00
zengbo	5d5298087f	Custom Anthropic API URL (#6221 ) [Feature] User can custom the Anthropic API URL #### Who can review? Tag maintainers/contributors who might be interested: Models - @hwchase17 - @agola11	2023-06-17 11:01:29 -07:00
Harrison Chase	61e4a1adf9	Harrison/faiss score (#6341 ) Co-authored-by: Frank Stein <16441059+simonfromla@users.noreply.github.com> Co-authored-by: Sims Juju <sims@Ju.lan>	2023-06-17 11:00:47 -07:00
Harrison Chase	42a28ac1ba	Harrison/error zero tools (#6340 ) Co-authored-by: Juhee Kim <46583939+juppytt@users.noreply.github.com>	2023-06-17 11:00:35 -07:00
Slawomir Gonet	eef62bf4e9	qdrant: search by vector (#6043 ) Added support to `search_by_vector` to Qdrant Vector store. ### Who can review VectorStores / Retrievers / Memory - @dev2049	2023-06-17 09:44:28 -07:00
Mark	b7ba7e8a7b	Allow GoogleDrive to authenticate via application default credentials on Cloud Run/GCE etc without service key (#6035 ) @eyurtsev The existing GoogleDrive implementation always needs a service account to be available at the credentials location. When running on GCP services such as Cloud Run, a service account already exists in the metadata of the service, so no physical key is necessary. This change adds a check to see if it is running in such an environment, and uses that authentication instead. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-17 09:44:17 -07:00
lonestriker	6f36f0f930	Add oobabooga/text-generation-webui support as a llm (#5997 ) Add oobabooga/text-generation-webui support as an LLM. Currently, supports using text-generation-webui's non-streaming API interface. Allows users who already have text-gen running to use the same models with langchain. #### Before submitting Simple usage, similar to existing LLM supported: ``` from langchain.llms import TextGen llm = TextGen(model_url = "http://localhost:5000") ``` #### Who can review? @hwchase17 - project lead --------- Co-authored-by: Hien Ngo <Hien.Ngo@adia.ae>	2023-06-17 09:42:15 -07:00
Richy Wang	444ca3f669	Improve AnalyticDB Vector Store implementation without affecting user (#6086 ) The AnalyticDB VectorStore previously used two tables to store documents; using a single table appears to be a better approach. This commit improves the AnalyticDB VectorStore implementation without affecting user behavior: 1. Streamline the `post_init` behavior by creating a single table with vector indexing. 2. Update the `add_texts` API for document insertion. 3. Optimize `similarity_search_with_score_by_vector` to retrieve results directly from the table. 4. Implement `_similarity_search_with_relevance_scores`. 5. Add an `embedding_dimension` parameter to support embedding functions with different dimensions. Users can continue using the API as before. The previously added test cases are sufficient to cover this commit.	2023-06-17 09:36:31 -07:00
Ja-sonYun	cdd1d78bf2	make modelname_to_contextsize as a staticmethod (#6040 ) Fixes #6039 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @agola11	2023-06-17 09:13:08 -07:00
Saba Sturua	427551eabf	DocArray as a Retriever (#6031 ) ## DocArray as a Retriever [DocArray](https://github.com/docarray/docarray) is an open-source tool for managing your multi-modal data. It offers flexibility to store and search through your data using various document index backends. This PR introduces `DocArrayRetriever` - which works with any available backend and serves as a retriever for Langchain apps. Also, I added 2 notebooks: DocArray Backends - intro to all 5 currently supported backends, how to initialize, index, and use them as a retriever DocArray Usage - showcasing what additional search parameters you can pass to create versatile retrievers Example: ```python from docarray.index import InMemoryExactNNIndex from docarray import BaseDoc, DocList from docarray.typing import NdArray from langchain.embeddings.openai import OpenAIEmbeddings from langchain.retrievers import DocArrayRetriever # define document schema class MyDoc(BaseDoc): description: str description_embedding: NdArray[1536] embeddings = OpenAIEmbeddings() # create documents descriptions = ["description 1", "description 2"] desc_embeddings = embeddings.embed_documents(texts=descriptions) docs = DocList[MyDoc]( [ MyDoc(description=desc, description_embedding=embedding) for desc, embedding in zip(descriptions, desc_embeddings) ] ) # initialize document index with data db = InMemoryExactNNIndex[MyDoc](docs) # create a retriever retriever = DocArrayRetriever( index=db, embeddings=embeddings, search_field="description_embedding", content_field="description", ) # find the relevant document doc = retriever.get_relevant_documents("action movies") print(doc) ``` #### Who can review? @dev2049 --------- Signed-off-by: jupyterjazz <saba.sturua@jina.ai>	2023-06-17 09:09:33 -07:00
Masafumi Mori	7bb437146d	fix links to prompt templates and example selectors (#6332 ) Fixes # links to prompt templates and example selectors on the [Prompts](https://python.langchain.com/docs/modules/model_io/prompts/) page are invalid. #### Before submitting Just a small note that I tried to run `make docs_clean` and other related commands before PR written [here](https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md#build-documentation-locally), it gives me an error: ```bash langchain % make docs_clean Traceback (most recent call last): File "/Users/masafumi/Downloads/langchain/.venv/bin/make", line 5, in <module> from scripts.proto import main ModuleNotFoundError: No module named 'scripts' make: *** [docs_clean] Error 1 # Poetry (version 1.5.1) # Python 3.9.13 ``` I couldn't figure out how to fix this, so I didn't run those command. But links should work. #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 Similar issue #6323 Co-authored-by: masafumimori <m.masafumimori@outlook.com>	2023-06-17 09:07:14 -07:00
Francisco Ingham	83eea230f3	changed height in the nb example (#6327 ) changed height in the example to a more reasonable number (from 9 feet to 6 feet)	2023-06-17 00:05:48 -07:00
James O'Dwyer	0475d015fe	Handle Managed Motorhead Data Key (#6169 ) # Handle Managed Motorhead Data Key Managed motorhead will return a payload with a `data` key. we need to handle this to properly access messages from the server.	2023-06-16 20:36:18 -07:00
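A minimal sketch of the handling this commit describes, assuming the managed server wraps its response as `{"data": {...}}` and that the message list lives under a `messages` key. Both the helper name `extract_messages` and the `messages` key are assumptions for illustration, not the actual client code.

```python
from typing import Any, Dict, List


def extract_messages(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the message list from a raw or `data`-wrapped response.

    Hypothetical helper; the "messages" key is an assumed payload shape.
    """
    # Managed servers are assumed to nest the body under "data";
    # self-hosted servers are assumed to return the body directly.
    body = payload.get("data", payload)
    return body.get("messages", [])
```

Falling back to the payload itself keeps one code path for both response shapes.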
Luke Stanley	364f8e7b5d	Better Entity Memory code documentation (#6318 ) Just adds some comments and docstring improvements. There was some behaviour that was quite unclear to me at first like: - "when do things get updated?" - "why are there only entity names and no summaries?" - "why do the entity names disappear?" Now it can be much more obvious to many. I am lukestanley on Twitter.	2023-06-16 18:08:44 -07:00
Harrison Chase	af18413d97	Harrison/deeplake new features (#6263 ) Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-16 17:53:55 -07:00
Davis Chase	6640293087	fix eval guide links (#6319 )	2023-06-16 17:53:46 -07:00
ljeagle	ad324a39ae	Improve the performance of add_texts interface and upgrade the AwaDB from 0.3.2 to 0.3.3 (#6316 ) 1. Changed the implementation of add_texts interface for the AwaDB vector store in order to improve the performance 2. Upgrade the AwaDB from 0.3.2 to 0.3.3 --------- Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-06-16 16:50:01 -07:00
Davis Chase	24b2af5218	nit (#6305 )	2023-06-16 16:21:27 -07:00
Pierre Alexandre SCHEMBRI	9ca11c06b7	Fixes #6282 (#6283 ) Fixes #6282 1 liner to fix default http headers not passed by `LLMRequestsChain`	2023-06-16 16:21:01 -07:00
Davis Chase	23cdebddc4	Del linkcheck readme (#6317 )	2023-06-16 16:18:45 -07:00
Brigit Murtaugh	ccd916babe	Update dev container (#6189 ) Fixes https://github.com/hwchase17/langchain/issues/6172 As described in https://github.com/hwchase17/langchain/issues/6172, I'd love to help update the dev container in this project. Summary of changes: - Dev container now builds (the current container in this repo won't build for me) - Dockerfile updates - Update image to our [currently-maintained Python image](https://github.com/devcontainers/images/tree/main/src/python/.devcontainer) (`mcr.microsoft.com/devcontainers/python`) rather than the deprecated image from vscode-dev-containers - Move Dockerfile to root of repo - in order for `COPY` to work properly, it needs the files (in this case, `pyproject.toml` and `poetry.toml`) in the same directory - devcontainer.json updates - Removed `customizations` and `remoteUser` since they should be covered by the updated image in the Dockerfile - Update comments - Update docker-compose.yaml to properly point to updated Dockerfile - Add a .gitattributes to avoid line ending conversions, which can result in hundreds of pending changes ([info](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files)) - Add a README in the .devcontainer folder and info on the dev container in the contributing.md Outstanding questions: - Is it expected for `poetry install` to take some time? It takes about 30 minutes for this dev container to finish building in a Codespace, but a user should only have to experience this once. Through some online investigation, this doesn't seem unusual - Versions of poetry newer than 1.3.2 failed every time - based on some of the guidance in contributing.md and other online resources, it seemed changing poetry versions might be a good solution. 
1.3.2 is from Jan 2023 --------- Co-authored-by: bamurtaugh <brmurtau@microsoft.com> Co-authored-by: Samruddhi Khandale <samruddhikhandale@github.com>	2023-06-16 15:42:14 -07:00
Davis Chase	03b5891cf7	more redirect (#6314 )	2023-06-16 14:43:59 -07:00
Davis Chase	eaee492dbc	basic redirect (#6309 )	2023-06-16 13:39:58 -07:00
Davis Chase	d2243757a3	update readme (#6304 )	2023-06-16 12:27:16 -07:00
Davis Chase	2f47e5c766	update api link (#6303 )	2023-06-16 12:18:17 -07:00
Davis Chase	d558bcfad8	rm ignore_vercel (#6302 )	2023-06-16 12:06:58 -07:00
Davis Chase	87e502c6bc	Doc refactor (#6300 ) Co-authored-by: jacoblee93 <jacoblee93@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-16 11:52:56 -07:00
Harrison Chase	94c82a189d	bump to 202 (#6262 )	2023-06-16 06:52:36 -07:00
hp0404	b01cf0dd54	ArxivAPIWrapper - doc_content_chars_max (#6063 ) This PR refactors the ArxivAPIWrapper class making `doc_content_chars_max` parameter optional. Additionally, tests have been added to ensure the functionality of the doc_content_chars_max parameter. Fixes #6027 (issue)	2023-06-15 22:16:42 -07:00
Daniel King	a9b97aa6f4	Update output format of MosaicML endpoint to be more flexible (#6060 ) There will likely be another change or two coming over the next couple weeks as we stabilize the API, but putting this one in now which just makes the integration a bit more flexible with the response output format. ``` (langchain) danielking@MML-1B940F4333E2 langchain % pytest tests/integration_tests/llms/test_mosaicml.py tests/integration_tests/embeddings/test_mosaicml.py =================================================================================== test session starts =================================================================================== platform darwin -- Python 3.10.11, pytest-7.3.1, pluggy-1.0.0 rootdir: /Users/danielking/github/langchain configfile: pyproject.toml plugins: asyncio-0.20.3, mock-3.10.0, dotenv-0.5.2, cov-4.0.0, anyio-3.6.2 asyncio: mode=strict collected 12 items tests/integration_tests/llms/test_mosaicml.py ...... [ 50%] tests/integration_tests/embeddings/test_mosaicml.py ...... [100%] =================================================================================== slowest 5 durations =================================================================================== 4.76s call tests/integration_tests/llms/test_mosaicml.py::test_retry_logic 4.74s call tests/integration_tests/llms/test_mosaicml.py::test_mosaicml_llm_call 4.13s call tests/integration_tests/llms/test_mosaicml.py::test_instruct_prompt 0.91s call tests/integration_tests/llms/test_mosaicml.py::test_short_retry_does_not_loop 0.66s call tests/integration_tests/llms/test_mosaicml.py::test_mosaicml_extra_kwargs =================================================================================== 12 passed in 19.70s =================================================================================== ``` #### Who can review? @hwchase17 @dev2049	2023-06-15 22:15:39 -07:00
JaysonAlbert	50d9c7d5a4	Fix: change the chatgpt plugin retriever metadata format (#5920 ) The current implementation puts the doc itself as the metadata, but the document returned by the chatgpt plugin retriever already has a `metadata` field, so it's better to use that instead. The original code throws the following exception when using `RetrievalQAWithSourcesChain`, because it cannot find the field `metadata`: ```python Exception has occurred: ValueError (note: full exception trace is shown but execution is paused at: _run_module_as_main) Document prompt requires documents to have metadata variables: ['source']. Received document with missing metadata: ['source']. File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/base.py", line 27, in format_document raise ValueError( File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 65, in <listcomp> doc_strings = [format_document(doc, self.document_prompt) for doc in docs] File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 65, in _get_inputs doc_strings = [format_document(doc, self.document_prompt) for doc in docs] File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 85, in combine_docs inputs = self._get_inputs(docs, **kwargs) File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/base.py", line 84, in _call output, extra_return_dict = self.combine_docs( File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/base.py", line 140, in __call__ raise e ``` Additionally, the `metadata` field in the `chatgpt plugin retriever` has these fields by default: ```json { "source": "file", //email, file or chat "source_id": "filename.docx", // the filename "url": "", ...
} ``` so, we should set `source_id` to `source` in the langchain metadata. ```python metadata = d.pop("metadata", d) if(metadata.get("source_id")): metadata["source"] = metadata.pop("source_id") ``` #### Who can review? @dev2049 --------- Co-authored-by: wangjie <wangjie@htffund.com>	2023-06-15 22:04:45 -07:00
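The flattened snippet in this commit message can be read as the following standalone function. The wrapper name `normalize_plugin_metadata` is hypothetical, and the sample document shape mirrors the JSON shown in the message rather than a verified API response.

```python
from typing import Any, Dict


def normalize_plugin_metadata(d: Dict[str, Any]) -> Dict[str, Any]:
    """Pull `metadata` out of a retrieved document and rename `source_id`.

    Hypothetical wrapper around the two statements quoted in the commit.
    """
    # Fall back to the whole document when it has no "metadata" field.
    metadata = d.pop("metadata", d)
    if metadata.get("source_id"):
        # Use the filename as `source`, which
        # RetrievalQAWithSourcesChain expects to find.
        metadata["source"] = metadata.pop("source_id")
    return metadata
```

Note that the rename overwrites the plugin's original `source` value (`"file"`, `"email"`, or `"chat"`) with the filename.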
Harrison Chase	e67b26eee9	Harrison/openai functions (#6261 ) Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>	2023-06-15 21:54:39 -07:00
Harrison Chase	6aafb46807	Harrison/openai functions (#6223 ) Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>	2023-06-15 21:43:33 -07:00
Zander Chase	bc9b8c8239	Improve Error Message for failed callback (#6247 ) Include the handler class name in the warning	2023-06-15 19:18:37 -07:00
Alon Roth	0013256e81	Support chat history persistence in AutoGPT (#5716 ) Short Description Added a new argument to AutoGPT class which allows to persist the chat history to a file. Changes 1. Removed the `self.full_message_history: List[BaseMessage] = []` 2. Replaced it with `chat_history_memory` which can take any subclasses of `BaseChatMessageHistory` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-15 17:49:03 -07:00
Martin Antos	1913320cbe	Feature/add acreom loader (#5780 ) adding new loader for [acreom](https://acreom.com) vaults. It's based on the Obsidian loader with some additional text processing for acreom specific markdown elements. @eyurtsev please take a look! --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-15 11:53:00 -07:00
Zander Chase	ae76e473e1	Add Tags for LLMs (#6229 ) - [x] Add tracing tags to LLMs + Chat Models (both inheritable and local) - [x] Add tags for the run_on_dataset helper function(s)	2023-06-15 11:24:11 -07:00
Harrison Chase	8e1a7a8646	bump version to 201 (#6233 )	2023-06-15 08:28:47 -07:00
Harrison Chase	e82687ddf4	Harrison/use functions agent (#6185 ) Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>	2023-06-15 08:18:50 -07:00
Ryo Kanazawa	7d2b946d0b	Fix typo `pandocs` to `pandoc` (#6203 ) Fixes https://github.com/hwchase17/langchain/issues/6204 ### Context A typo in the spelling of `pandoc`. #### Who can review? @hwchase17	2023-06-15 08:18:27 -07:00
Kyle Roth	c7db9febb0	count tokens for new OpenAI model versions (#6195 ) Trying to call `ChatOpenAI.get_num_tokens_from_messages` returns the following error for the newly announced models `gpt-3.5-turbo-0613` and `gpt-4-0613`: ``` NotImplementedError: get_num_tokens_from_messages() is not presently implemented for model gpt-3.5-turbo-0613.See https://github.com/openai/openai-python/blob/main/chatml.md for information on how messages are converted to tokens. ``` This adds support for counting tokens for those models, by counting tokens the same way they're counted for the previous versions of `gpt-3.5-turbo` and `gpt-4`. #### reviewers - @hwchase17 - @agola11	2023-06-15 06:16:03 -07:00
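One way to picture "counting tokens the same way they're counted for the previous versions" is a lookup that maps a dated model name onto its base model. The sketch below is an assumption about the shape of such a fallback, not the code in `ChatOpenAI`; `token_counting_base_model` and its prefix rule are hypothetical.

```python
def token_counting_base_model(model: str) -> str:
    """Map a dated model name onto the base model whose counting rules it reuses.

    Hypothetical helper; the prefix matching is an assumption for illustration.
    """
    for base in ("gpt-3.5-turbo", "gpt-4"):
        # Accept the base name itself or any suffixed variant such as "-0613".
        if model == base or model.startswith(base + "-"):
            return base
    raise NotImplementedError(
        f"get_num_tokens_from_messages() is not presently implemented for model {model}."
    )
```

For example, `token_counting_base_model("gpt-4-0613")` resolves to `"gpt-4"`, while an unrelated model name still raises `NotImplementedError`.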
xu0o0	7ad13cdbdb	feat: add content_format param to ConfluenceLoader.load() (#5922 ) The Confluence API supports different formats of page content. The storage format is the raw XML representation for storage. The view format is the HTML representation for viewing, with macros rendered as though the page is viewed by users. Add the `content_format` parameter to `ConfluenceLoader.load()` to specify the content format; it is set to `ContentFormat.STORAGE` by default. #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-14 16:56:28 -07:00
0xJordan	c5a46e7435	feat: Add support for the Solidity language (#6054 ) ## Add Solidity programming language support for code splitter. Twitter: @0xjord4n_ #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17	2023-06-14 14:25:02 -07:00
Nuno Campos	17c4ec4812	Add docs for tags (#6155 )	2023-06-14 14:01:58 -07:00
thiswillbeyourgithub	4a649e3b14	typo: 'following following' to 'following' (#6163 ) Co-authored-by: thiswillbeyourgithub <github@32mail.33mail.com>	2023-06-14 10:58:47 -07:00
Maciej Bryński	8a44c879c6	Update readthedocs_documentation.ipynb (#6148 ) Minor fix in documentation. Change URL in wget call to proper one.	2023-06-14 07:21:48 -07:00
Zander Chase	e0e3ef1c57	Update Name (#6136 )	2023-06-13 22:25:36 -07:00
Zander Chase	4555ad5d1f	Add Run Collector Callback (#6133 ) Add a callback handler that can collect nested run objects. Useful for evaluation.	2023-06-13 22:17:37 -07:00
Harrison Chase	6ac120f299	bump ver to 200 (#6130 )	2023-06-13 19:33:51 -07:00
Harrison Chase	e41f0b341c	add functions agent (#6113 )	2023-06-13 18:51:01 -07:00
Zander Chase	b3b155d488	Return session name in runner response (#6112 ) Makes it easier to then run evals w/o thinking about specifying a session	2023-06-13 16:59:43 -07:00
Harrison Chase	e74733ab9e	support streaming for functions (#6115 )	2023-06-13 15:26:26 -07:00
Nuno Campos	11ab0be11a	Add support for tags (#5898 )	2023-06-13 12:30:59 -07:00
Harrison Chase	1281fdf0f2	Harrison/notebook functions (#6103 )	2023-06-13 10:52:54 -07:00
Harrison Chase	34ebb29726	bump version to 199 (#6102 )	2023-06-13 10:50:33 -07:00
Wenchen Li	f9edf76e7c	Implement `max_marginal_relevance_search` in `VectorStore` of Pinecone (#6056 ) This adds implementation of MMR search in pinecone; and I have two semi-related observations about this vector store class: - Maybe we should also have a `similarity_search_by_vector_returning_embeddings` like in supabase, but it's not in the base `VectorStore` class so I didn't implement - Talking about the base class, there's `similarity_search_with_relevance_scores`, but in pinecone it is called `similarity_search_with_score`; maybe we should consider renaming it to align with other `VectorStore` base and sub classes (or add that as an alias for backward compatibility) #### Who can review? Tag maintainers/contributors who might be interested: - VectorStores / Retrievers / Memory - @dev2049	2023-06-13 10:46:45 -07:00
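For readers unfamiliar with MMR (maximal marginal relevance), the selection step can be sketched in plain Python as below. This is a generic illustration of the technique, not the Pinecone integration's code: the function names, the cosine similarity choice, and the `lambda_mult` weighting are assumptions.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(
    query: Sequence[float],
    candidates: List[Sequence[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return indices of up to k candidates, trading relevance for diversity."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Penalize similarity to anything that was already picked.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lambda_mult=1.0` the ranking reduces to plain similarity search; lower values favor candidates that differ from those already chosen.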
Harrison Chase	970b2f9d38	convert tools to openai (#6100 )	2023-06-13 10:40:49 -07:00
Harrison Chase	292accde2b	support functions (#6099 )	2023-06-13 10:32:58 -07:00
Lance Martin	ee3d0513ad	Add tests and update notebook for MarkdownHeaderTextSplitter (#6069 ) Add test and update notebook for `MarkdownHeaderTextSplitter`.	2023-06-13 09:07:52 -07:00
Keshav Kumar	8fdf88b8e3	Fix for ModuleNotFoundError while running langchain-server. Issue #5833 (#6077 ) This PR fixes the error `ModuleNotFoundError: No module named 'langchain.cli'` Fixes https://github.com/hwchase17/langchain/issues/5833 (issue)	2023-06-13 08:37:07 -07:00
Zander Chase	0c52275bdb	Use Run object from SDK (#6067 ) Update the Run object in the tracer to extend that in the SDK to include the parameters necessary for tracking/tracing	2023-06-13 07:14:11 -07:00
Harrison Chase	cde1e8739a	turn off repr (#6078 )	2023-06-12 22:45:24 -07:00
Nuno Campos	a9b3b2e327	Enable serialization for anthropic (#6049 )	2023-06-12 22:39:10 -07:00
Harrison Chase	6ac5d80286	propagate kwargs fully (#6076 )	2023-06-12 22:37:55 -07:00
Harrison Chase	ec1a2adf9c	improve tools (#6062 )	2023-06-12 22:19:03 -07:00
Julius Lipp	5b6bbf4ab2	Add embaas document extraction api endpoints (#6048 ) # Introduces embaas document extraction api endpoints In this PR, we add support for embaas document extraction endpoints to Text Embedding Models (with LLMs, in different PRs coming). We currently offer the MTEB leaderboard top performers, will continue to add top embedding models, and will soon add support for customers to deploy their own models. Additional documentation and information can be found [here](https://embaas.io). While developing this integration, I closely followed the patterns established by other langchain integrations. Nonetheless, if there are any aspects that require adjustments or if there's a better way to present a new integration, let me know! :) Additionally, I fixed some docs in the embeddings integration. Related PR: #5976 #### Who can review? DataLoaders - @eyurtsev	2023-06-12 19:13:52 -07:00
Zander Chase	2f0088039d	Log tracer errors (#6066 ) Example (would log several times if not for the helper fn; previously, no logs were emitted because of multithreading) ![image](https://github.com/hwchase17/langchain/assets/130414180/070d25ae-1f06-4487-9617-0a6f66f3f01e)	2023-06-12 17:13:49 -07:00
Lance Martin	b023f0c0f2	Text splitter for Markdown files by header (#5860 ) This creates a new kind of text splitter for markdown files. The user can supply a set of headers that they want to split the file on. We define a new text splitter class, `MarkdownHeaderTextSplitter`, that does a few things: (1) For each line, it determines the associated set of user-specified headers (2) It groups lines with common headers into splits See notebook for example usage and test cases.	2023-06-12 15:46:42 -07:00
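The two steps described in the commit above (tag each line with its enclosing headers, then group lines that share the same headers) can be sketched in plain Python. This is a minimal illustration, not the `MarkdownHeaderTextSplitter` implementation: the function name `split_by_headers`, the `(prefix, name)` header format, and the returned dictionary shape are all assumptions made for this example.

```python
from typing import Dict, List, Tuple


def split_by_headers(markdown: str, headers: List[Tuple[str, str]]) -> List[Dict]:
    """Group lines of a Markdown string by the headers they fall under.

    `headers` is a hypothetical list of (prefix, name) pairs,
    e.g. [("#", "H1"), ("##", "H2")].
    """
    # Try longer prefixes first so "## x" is not mistaken for "# x".
    ordered = sorted(headers, key=lambda h: len(h[0]), reverse=True)
    level_of = {name: len(prefix) for prefix, name in headers}

    current: Dict[str, str] = {}  # headers that apply to the current line
    splits: List[Dict] = []

    for line in markdown.splitlines():
        stripped = line.strip()
        matched = False
        for prefix, name in ordered:
            if stripped.startswith(prefix + " "):
                level = len(prefix)
                # A new header closes every header at the same or a deeper level.
                current = {n: v for n, v in current.items() if level_of[n] < level}
                current[name] = stripped[len(prefix):].strip()
                matched = True
                break
        if matched or not stripped:
            continue  # header lines and blank lines carry no content
        if splits and splits[-1]["metadata"] == current:
            # Same header set as the previous line: extend that split.
            splits[-1]["content"] += "\n" + stripped
        else:
            splits.append({"content": stripped, "metadata": dict(current)})
    return splits
```

Each returned split carries the header values that were in effect for its lines, so a downstream consumer can tell which section a chunk came from.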
Jens Madsen	2c91f0d750	chore: speed up integration test by using smaller model (#6044 ) Adds a new parameter `relative_chunk_overlap` for the `SentenceTransformersTokenTextSplitter` constructor. The parameter sets the chunk overlap using a relative factor, e.g. for a model where the token limit is 100, a `relative_chunk_overlap=0.5` implies that `chunk_overlap=50`. Tag maintainers/contributors who might be interested: @hwchase17, @dev2049	2023-06-12 13:27:10 -07:00
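The arithmetic behind `relative_chunk_overlap` in the commit above can be written out as a small helper. This is a sketch of the stated relationship only; the helper name `overlap_from_relative`, the accepted range, and the choice to truncate to an integer are assumptions, not details taken from the commit.

```python
def overlap_from_relative(token_limit: int, relative_chunk_overlap: float) -> int:
    """Convert a relative overlap factor into an absolute token overlap."""
    # Assumption: the factor must leave at least some non-overlapping tokens.
    if not 0.0 <= relative_chunk_overlap < 1.0:
        raise ValueError("relative_chunk_overlap must be in [0, 1)")
    # Assumption: fractional results are truncated toward zero.
    return int(token_limit * relative_chunk_overlap)


# The example from the commit message: limit 100, factor 0.5.
chunk_overlap = overlap_from_relative(100, 0.5)
```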
Harrison Chase	5922742d56	comment out	2023-06-12 10:57:31 -07:00
Harrison Chase	681ba6d520	embaas title	2023-06-12 08:00:14 -07:00
Ben Flast	7a5e36f3f5	Mongo db doc fix (#6042 ) I missed a few errors in my initial fix @hwchase1. Thanks!	2023-06-12 07:29:27 -07:00
Harrison Chase	289e9aeb9d	bump ver to 198 (#6026 )	2023-06-11 21:32:45 -07:00
Harrison Chase	d1561b74eb	Harrison/cognitive search (#6011 ) Co-authored-by: Fabrizio Ruocco <ruoccofabrizio@gmail.com>	2023-06-11 21:15:42 -07:00
wenmeng zhou	bb7ac9edb5	add dashscope text embedding (#5929 ) #### What I do Adding embedding api for [DashScope](https://help.aliyun.com/product/610100.html), which is the DAMO Academy's multilingual text unified vector model based on the LLM base. It caters to multiple mainstream languages worldwide and offers high-quality vector services, helping developers quickly transform text data into high-quality vector data. Currently supported languages include Chinese, English, Spanish, French, Portuguese, Indonesian, and more. #### Who can review? Models - @hwchase17 - @agola11 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-11 21:14:20 -07:00
Ben Flast	010d0bfeea	Update MongoDB Atlas support docs (#6022 ) Updating MongoDB Atlas support docs @hwchase17 let me know if you have any questions	2023-06-11 20:57:15 -07:00
Harrison Chase	e05997c25e	Harrison/hologres (#6012 ) Co-authored-by: Changgeng Zhao <changgeng@nyu.edu> Co-authored-by: Changgeng Zhao <zhaochanggeng.zcg@alibaba-inc.com>	2023-06-11 20:56:51 -07:00
ljeagle	c5bce4a465	add from_documents interface in awadb vector store (#6023 ) added new interface from_documents in awadb vector store @dev2049 --------- Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-06-11 19:35:03 -07:00
Zander Chase	2c9619bc1d	Remove from PR template (#6018 )	2023-06-11 19:34:26 -07:00
ju-bezdek	18f5c985d9	Langchain decorators (#6017 ) Added description of LangChain Decorators ✨ into the integration section #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17	2023-06-11 19:32:24 -07:00
Zander Chase	a197acfcd3	Update check (#6020 ) We were assigning the name as None in on_chat_model_start then not updating, resulting in a validation error.	2023-06-11 17:59:09 -07:00
Nuno Campos	18af149e91	nc/load (#5733 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-11 15:51:28 -07:00
Zander Chase	614cff89bc	I before E (#6015 )	2023-06-11 15:45:12 -07:00
Harrison Chase	a7227ee01b	Harrison/embaas (#6010 ) Co-authored-by: Julius Lipp <43986145+juliuslipp@users.noreply.github.com>	2023-06-11 13:35:14 -07:00
xu0o0	232faba796	fix: TypeError when loading confluence pages by cql (#5878 ) The Confluence loader uses the wrong API (`Confluence.cql()` provided by `atlassian-python-api`) to load pages by CQL. `Confluence.cql()` is a wrapper of the `/rest/api/search` API which searches for entities in Confluence. To search for pages in Confluence, the loader can use the `/rest/api/content/search` API. #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev #### References ##### Cloud API https://developer.atlassian.com/cloud/confluence/rest/v1/api-group-content/#api-wiki-rest-api-content-search-get https://developer.atlassian.com/cloud/confluence/rest/v1/api-group-search/#api-wiki-rest-api-search-get ##### Server API https://docs.atlassian.com/ConfluenceServer/rest/8.3.1/#api/content-search https://docs.atlassian.com/ConfluenceServer/rest/8.3.1/#api/search	2023-06-11 13:23:22 -07:00
Akhil Vempali	d7d629911b	feat: ✨ Added filtering option to FAISS vectorstore (#5966 ) Inspired by the filtering capability available in ChromaDB, added the same functionality to the FAISS vectorstore as well. Since FAISS does not have a built-in method of filtering, used the approach suggested in this [thread](https://github.com/facebookresearch/faiss/issues/1079) Langchain Issue inspiration: https://github.com/hwchase17/langchain/issues/4572 - [x] Added filtering capability to semantic similarity and MMR - [x] Added test cases for filtering in `tests/integration_tests/vectorstores/test_faiss.py` #### Who can review? Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049 - @hwchase17	2023-06-11 13:20:03 -07:00
Jiaping(JP) Zhang	6e90406e0f	[APIChain] enhance the robustness of url (#6008 ) When I used the APIChain, it sometimes failed during the intermediate step of generating the API URL and calling the `request` function. After some digging, I found that the URL sometimes includes a space at the beginning, like `%20https://...api.com`, which causes the internal `self.requests_wrapper.get` function to fail. Adding a little string preprocessing with `.strip` to remove the space seems to improve the robustness of the APIChain, so that it can send the request and retrieve the API result more reliably. #### Who can review? @vowelparrot	2023-06-11 13:13:57 -07:00
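The failure mode in the commit above comes from leading whitespace in a generated URL, which shows up as `%20` once the URL is encoded. A minimal sketch of the kind of preprocessing described, assuming the URL arrives as a plain string; the helper name `clean_generated_url` and the example URLs are hypothetical, and the actual change in the PR may look different.

```python
def clean_generated_url(raw_url: str) -> str:
    """Remove leading and trailing whitespace from a model-generated URL."""
    # str.strip() with no arguments removes spaces, tabs and newlines
    # from both ends and leaves the interior of the string untouched.
    return raw_url.strip()


# A generated URL with a stray leading space and trailing newline.
url = clean_generated_url(" https://example.com/api?q=weather\n")
```

Stripping before the request is sent keeps the fix local to the chain, so the request wrapper itself does not need to tolerate malformed input.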
Ikko Eltociear Ashimine	c868a3eef3	Update databricks.md (#6006 ) HuggingFace -> Hugging Face	2023-06-11 13:13:33 -07:00
Harrison Chase	20e9ce8a62	bump version to 197 (#6007 )	2023-06-11 10:14:57 -07:00
Harrison Chase	704d56e241	support kwargs (#5990 )	2023-06-11 10:09:22 -07:00
Mark Pors	b934677a81	Obey handler.raise_error in _ahandle_event_for_handler (#6001 ) Obey `handler.raise_error` in `_ahandle_event_for_handler` Exceptions for async callbacks were only logged as warnings, also when `raise_error = True` #### Who can review? @hwchase17 @agola11	2023-06-11 09:49:26 -07:00
Harrison Chase	2d038b57b2	Harrison/arxiv fix (#5993 ) Co-authored-by: Juanjo do Olmo <87780148+SimplyJuanjo@users.noreply.github.com>	2023-06-11 09:48:09 -07:00
Vincent	0b740c9baa	add ocr_languages param for ConfluenceLoader.load() (#5823 ) @eyurtsev When a Confluence document contains attachments whose content is not in English, the extracted text is garbled. This is mainly because no lang parameter is passed to pytesseract, which supports only English by default. So I added the ocr_languages parameter to ConfluenceLoader.load() to support multiple languages.	2023-06-10 16:51:04 -07:00
Thomas B	ac3e6e3944	Fix IndexError in RecursiveCharacterTextSplitter (#5902 ) Fixes (not reported) an error that may occur in some cases in the RecursiveCharacterTextSplitter. An empty `new_separators` array ([]) would end up in the else path of the condition below and used in a function where it is expected to be non empty. ```python if new_separators is None: ... else: # _split_text() expects this array to be non-empty! other_info = self._split_text(s, new_separators) ``` resulting in an `IndexError` ```python def _split_text(self, text: str, separators: List[str]) -> List[str]: """Split incoming text and return chunks.""" final_chunks = [] # Get appropriate separator to use > separator = separators[-1] E IndexError: list index out of range langchain/text_splitter.py:425: IndexError ``` #### Who can review? @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 16:48:53 -07:00
Satheesh Valluru	d2270a2261	Fix: Grammar fix in documentation (#5925 ) Fix for grammatical errors in the documentation of `vectorstore`. @vowelparrot	2023-06-10 16:43:36 -07:00
Jens Madsen	1250cd4630	fix: use model token limit not tokenizer ditto (#5939 ) This fixes a token limit bug in the SentenceTransformersTokenTextSplitter. Previously, the token limit was taken from the tokenizer used by the model, on the assumption that the two limits are equal. That assumption was false: for some models, the token limit of the tokenizer (from `AutoTokenizer.from_pretrained`) does not equal the token limit of the model. Therefore, the token limit of the text splitter is now taken from the sentence transformers model token limit. Twitter: @plasmajens #### Who can review? @hwchase17 and/or @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 16:36:03 -07:00
Ofer Mendelevitch	f8cf09a230	Update to Vectara integration (#5950 ) This PR updates the Vectara integration (@hwchase17 ): * Adds reuse of requests.session to improve efficiency and speed. * Utilizes Vectara's low-level API (instead of standard API) to better match user's specific chunking with LangChain * Now add_texts puts all the texts into a single Vectara document so indexing is much faster. * Updated variable names from alpha to lambda_val (to be consistent with Vectara docs) and added n_context_sentence so it's available to use if needed. * Updates to documentation and tests --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 16:27:01 -07:00
qued	e4224a396b	feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955 ) # Unstructured XML Loader Adds an `UnstructuredXMLLoader` class for .xml files. Works with unstructured>=0.6.7. A plain text representation of the text with the XML tags will be available under the `page_content` attribute in the doc. ### Testing ```python from langchain.document_loaders import UnstructuredXMLLoader loader = UnstructuredXMLLoader( "example_data/factbook.xml", ) docs = loader.load() ``` ## Who can review? @hwchase17 @eyurtsev	2023-06-10 16:24:42 -07:00
Lance Martin	21bd16bb59	Create Airtable loader (#5958 ) Create document loader for Airtable	2023-06-10 15:43:18 -07:00
Harrison Chase	9218684759	Add a new vector store - AwaDB (#5971 ) (#5992 ) Added the AwaDB vector store, a wrapper over AwaDB that can be used as vector storage and has an efficient similarity search. Added integration tests for the vector store. Added a jupyter notebook with the example. Deleted an unneeded empty file and resolved the conflict (https://github.com/hwchase17/langchain/pull/5886). Please check, Thanks! @dev2049 @hwchase17 --------- Co-authored-by: ljeagle <vincent_jieli@yeah.net> Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-06-10 15:42:32 -07:00
Tomaz Bratanic	d5819a7ca7	Add additional parameters to Graph Cypher Chain (#5979 ) Based on the inspiration from the SQL chain, the following three parameters are added to Graph Cypher Chain. - top_k: Limited the number of results from the database to be used as context - return_direct: Return database results without transforming them to natural language - return_intermediate_steps: Return intermediate steps	2023-06-10 14:39:55 -07:00
Daniel Grittner	0ca37e613c	Fix handling of missing action & input for async MRKL agent (#5985 ) Hi, This is a fix for https://github.com/hwchase17/langchain/pull/5014. That PR did not add the ability to self-solve the ValueError(f"Could not parse LLM output: {llm_output}") error for `_atake_next_step`.	2023-06-10 14:38:20 -07:00
Harrison Chase	ca1afa7213	add test for structured tools (#5989 )	2023-06-10 14:37:26 -07:00
constDave	5f356b9993	Fixed typo missing "use" (#5991 ) Fixed a simple typo on https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/vectorstore.html where the word "use" was missing.	2023-06-10 14:31:58 -07:00
Kaarthik Andavar	d6f5d0c6b1	Fix: SnowflakeLoader returning empty documents (#5967 ) Fix SnowflakeLoader's Behavior of Returning Empty Documents Description: This PR addresses the issue where the SnowflakeLoader was consistently returning empty documents. After investigation, it was found that the query method within the SnowflakeLoader was not properly fetching and processing the data. Changes: 1. Modified the query method in SnowflakeLoader to handle data fetch and processing more accurately. 2. Enhanced error handling within the SnowflakeLoader to catch and log potential issues that may arise during data loading. Impact: This fix will ensure the SnowflakeLoader reliably returns the expected documents instead of empty ones, improving the efficiency and reliability of data processing tasks in the LangChain project. Before Fix: `[ Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}) ]` After Fix: `[Document(page_content='CUSTOMER_ID: 1\nFIRST_NAME: John\nLAST_NAME: Doe\nEMAIL: john.doe@example.com\nPHONE: 555-123-4567\nADDRESS: 123 Elm St, San Francisco, CA 94102', metadata={}), Document(page_content='CUSTOMER_ID: 2\nFIRST_NAME: Jane\nLAST_NAME: Doe\nEMAIL: jane.doe@example.com\nPHONE: 555-987-6543\nADDRESS: 456 Oak St, San Francisco, CA 94103', metadata={}), Document(page_content='CUSTOMER_ID: 3\nFIRST_NAME: Michael\nLAST_NAME: Smith\nEMAIL: michael.smith@example.com\nPHONE: 555-234-5678\nADDRESS: 789 Pine St, San Francisco, CA 94104', metadata={}), Document(page_content='CUSTOMER_ID: 4\nFIRST_NAME: Emily\nLAST_NAME: Johnson\nEMAIL: emily.johnson@example.com\nPHONE: 555-345-6789\nADDRESS: 321 Maple St, San Francisco, CA 
94105', metadata={}), Document(page_content='CUSTOMER_ID: 5\nFIRST_NAME: David\nLAST_NAME: Williams\nEMAIL: david.williams@example.com\nPHONE: 555-456-7890\nADDRESS: 654 Birch St, San Francisco, CA 94106', metadata={}), Document(page_content='CUSTOMER_ID: 6\nFIRST_NAME: Emma\nLAST_NAME: Jones\nEMAIL: emma.jones@example.com\nPHONE: 555-567-8901\nADDRESS: 987 Cedar St, San Francisco, CA 94107', metadata={}), Document(page_content='CUSTOMER_ID: 7\nFIRST_NAME: Oliver\nLAST_NAME: Brown\nEMAIL: oliver.brown@example.com\nPHONE: 555-678-9012\nADDRESS: 147 Cherry St, San Francisco, CA 94108', metadata={}), Document(page_content='CUSTOMER_ID: 8\nFIRST_NAME: Sophia\nLAST_NAME: Davis\nEMAIL: sophia.davis@example.com\nPHONE: 555-789-0123\nADDRESS: 369 Walnut St, San Francisco, CA 94109', metadata={}), Document(page_content='CUSTOMER_ID: 9\nFIRST_NAME: James\nLAST_NAME: Taylor\nEMAIL: james.taylor@example.com\nPHONE: 555-890-1234\nADDRESS: 258 Hawthorn St, San Francisco, CA 94110', metadata={}), Document(page_content='CUSTOMER_ID: 10\nFIRST_NAME: Isabella\nLAST_NAME: Wilson\nEMAIL: isabella.wilson@example.com\nPHONE: 555-901-2345\nADDRESS: 963 Aspen St, San Francisco, CA 94111', metadata={})] ` Tests: All unit and integration tests have been run and passed successfully. Additional tests were added to validate the new behavior of the SnowflakeLoader. Checklist: - [x] Code changes are covered by tests - [x] Code passes `make format` and `make lint` - [x] This PR does not introduce any breaking changes Please review and let me know if any changes are required.	2023-06-10 13:03:50 -07:00
Harrison Chase	62ec10a7f5	bump version to 196 (#5988 )	2023-06-10 09:06:35 -07:00
German Martin	736a1819aa	LOTR: Lord of the Retrievers. A retriever that merges several retrievers together, applying document_formatters to them. (#5798 ) "One Retriever to merge them all, One Retriever to expose them, One Retriever to bring them all and in and process them with Document formatters." Hi @dev2049! Here bothering people again! I'm using this simple idea to merge the output of several retrievers into one. I'm aware of DocumentCompressorPipeline and ContextualCompressionRetriever, but I don't think they allow us to do something like this. I also had trouble getting the pipeline working. Please correct me if I'm wrong. This allows some sort of "retrieval" preprocessing, after which the curated results can be used anywhere you could use a retriever. My use case is to generate different indexes with different embeddings and sources for more colorful results, then filter them with one or many document formatters. I saw some people looking for something like this here: https://github.com/hwchase17/langchain/issues/3991 and something similar here: https://github.com/hwchase17/langchain/issues/5555 This is just a proposal; I know I'm missing tests, etc. If you think this idea is worthwhile, I can work on tests and anything you want to change. Let me know! --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 08:41:02 -07:00
Lance Martin	f3e7ac0a2c	Add load() to snowflake loader (#5956 ) Quick fix for recently added [snowflake data loader](https://github.com/hwchase17/langchain/pull/5825/files).	2023-06-09 11:27:29 -07:00
Harrison Chase	3678cba0be	bump ver to 195 (#5949 )	2023-06-09 09:17:08 -07:00
Harrison Chase	7af186fddf	fixes to docs (#5919 )	2023-06-09 09:15:53 -07:00
Kacper Łukawski	7cc200766e	Expose full params in Qdrant (#5947 ) # Expose full params in Qdrant There were many questions regarding supporting some additional parameters in Qdrant integration. Qdrant supports many vector search optimizations that were impossible to use directly through the integration before. That includes: 1. Possibility to manipulate collection params while using `Qdrant.from_texts`. The PR allows setting things such as quantization, HNSW config, optimizers config, etc. That makes it consistent with raw `QdrantClient`. 2. Extended options while searching. It includes HNSW options, exact search, score threshold filtering, and read consistency in distributed mode. After merging that PR, #4858 might also be closed. ## Who can review? VectorStores / Retrievers / Memory @dev2049 @hwchase17	2023-06-09 08:56:32 -07:00
Rubén Martínez	db7ef635c0	Add support for the endpoint URL in DynamoDBChatMessageHistory (#5836 ) This PR adds the possibility of specifying the endpoint URL to AWS in the DynamoDBChatMessageHistory, so that it is possible to target not only the AWS cloud services, but also a local installation. Specifying the endpoint URL, which is normally not done when addressing the cloud services, is very helpful when targeting a local instance (like [Localstack](https://localstack.cloud/)) when running local tests. Fixes #5835 #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-08 23:21:11 -07:00
Lior	0eb1bc1a02	Fix the issue where the parameters passed to VertexAI were ignored #5889 (#5891 ) Fixes #5889 and fixes the name of the argument in init_vertexai @hwchase17 @agola11 Co-authored-by: Lior Durahly <lior.durahly@superwise.ai>	2023-06-08 23:15:22 -07:00
Fei Wang	63fcf41bea	Fix openai proxy error (#5914 ) Fixes proxy error. Since openai does not parse proxy parameters and uses openai.proxy directly, the proxy method needs to be modified. `7610c5adfa/openai/api_requestor.py (LL90)` #### Who can review? @hwchase17 - project lead Models - @hwchase17 - @agola11 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-08 23:15:06 -07:00
felpigeon	2791a753bf	Add start index to metadata in TextSplitter (#5912 ) #### Add start index to metadata in TextSplitter - Modified method `create_documents` to track start position of each chunk - The `start_index` is included in the metadata if the `add_start_index` parameter in the class constructor is set to `True` This enables referencing back to the original document, particularly useful when a specific chunk is retrieved. #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev @agola11	2023-06-08 23:09:32 -07:00
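The idea of tracking each chunk's start position, as described in the commit above, can be sketched without any library code. This is an illustration under stated assumptions, not the `create_documents` implementation: the function name `attach_start_index` and the returned dictionary layout are hypothetical, and it assumes the chunks appear in the text in order.

```python
from typing import Dict, List


def attach_start_index(text: str, chunks: List[str]) -> List[Dict]:
    """Record where each chunk begins in the original text."""
    results: List[Dict] = []
    cursor = -1  # start offset of the previous chunk
    for chunk in chunks:
        # Search just past the previous start so overlapping or repeated
        # chunks resolve to their next occurrence, not an earlier one.
        cursor = text.find(chunk, cursor + 1)
        results.append({"page_content": chunk, "metadata": {"start_index": cursor}})
    return results


docs = attach_start_index(
    "alpha beta gamma beta",
    ["alpha beta", "beta gamma", "gamma beta"],
)
```

With the offset stored in metadata, a retrieved chunk can be mapped back to its location in the source document, for example to show surrounding context.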
Philip Kiely - Baseten	a09a0e3511	Baseten integration (#5862 ) This PR adds a Baseten integration. I've done my best to follow the contributor's guidelines and add docs, an example notebook, and an integration test modeled after similar integrations' test. Please let me know if there is anything I can do to improve the PR. When it is merged, please tag https://twitter.com/basetenco and https://twitter.com/philip_kiely as contributors (the note on the PR template said to include Twitter accounts)	2023-06-08 23:05:57 -07:00
Tamara Lazarevic	0ce8745928	Fix typo (#5894 )	2023-06-08 23:05:22 -07:00
Andrew Grangaard	d8ae925425	arxiv: Correct name of search client attribute to 'arxiv_search' from incorrect 'arxiv_client' (#5917 ) + this private attribute is referenced as `arxiv_search` in internal usage and is set when verifying the environment twitter: @spazm #### Who can review? Any of @hwchase17, @leo-gan, or @bongsang might be interested in reviewing. + Mismatch between `arxiv_client` attribute vs `arxiv_search` in validation and usage is present in the initial commit by @hwchase17. + @leo-gan has made most of the edits. + @bongsang implemented pdf download.	2023-06-08 22:49:11 -07:00
sergiolrinditex	fe8bbc2da7	Create snowflake Loader (#5825 ) --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-08 22:03:00 -07:00
Zander Chase	77c286cf02	Use LCP Client in Tracer (#5908 ) Move the LCP calls to the client.	2023-06-08 21:15:14 -07:00
Frank Hübner	3ec6400d70	Feature/add AWS Kendra Index Retriever (#5856 ) adding a new retriever for AWS Kendra @dev2049 please take a look!	2023-06-08 15:44:09 -07:00
Piyush Jain	a6ebffb695	Fixes model arguments for amazon models (#5896 ) Fixes #5713 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @agola11 @aarora79 @rsgrewal-aws	2023-06-08 14:16:01 -07:00
小铭	767fa91eae	Fix the shortcut conflict for document page search (#5874 ) Fixes the documentation page opening both search and Mendable when pressing Ctrl+K. I have changed the shortcut for Mendable to Ctrl+J. #### Who can review? @hwchase17	2023-06-08 14:15:19 -07:00
Zander Chase	5f74db4500	Update run eval imports in init (#5858 )	2023-06-08 10:44:36 -07:00
warjiang	511c12dd39	fix: update qa_chain doc for "chai_type" (#5877 ) The `load_qa_with_sources_chain` method already supports four chain types, including `map_rerank`. Update the documentation to prevent any misunderstandings 😀. ![image](https://github.com/hwchase17/langchain/assets/6478745/325260b2-6121-4900-aef9-001febff811a) No linked issue; this only updates the documentation. #### Who can review? @hwchase17	2023-06-08 07:32:51 -07:00
Harrison Chase	893d20f735	bump version to 194 (#5866 )	2023-06-07 22:47:48 -07:00
Harrison Chase	35cfd25db3	Harrison/nebula graph (#5865 ) Co-authored-by: Wey Gu <weyl.gu@gmail.com> Co-authored-by: chenweisomebody <chenweisomebody@gmail.com>	2023-06-07 21:56:43 -07:00
Harrison Chase	658f8bdee7	Harrison/fauna loader (#5864 ) Co-authored-by: Shadid12 <Shadid12@users.noreply.github.com>	2023-06-07 21:32:23 -07:00
Liang Zhang	5518f24ec3	Implement saving and loading of RetrievalQA chain (#5818 ) Fixes #3983 Mimicking what we do for saving and loading the VectorDBQA chain, I added the logic for the RetrievalQA chain. Also added a unit test. I did not find how we test other chains for their saving and loading functionality, so I just added a file with one test case. Let me know if there are recommended ways to test it. #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 21:07:13 -07:00
Liang Zhang	b93638ef1e	Refactor and update databricks integration page (#5575 )	2023-06-07 20:45:47 -07:00
volodymyr-memsql	a1549901ce	Added SingleStoreDB Vector Store (#5619 ) - Added `SingleStoreDB` vector store, which is a wrapper over the SingleStore DB database, that can be used as a vector storage and has an efficient similarity search. - Added integration tests for the vector store - Added jupyter notebook with the example @dev2049 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 20:45:33 -07:00
jjzhuo	78aa59c68b	Fix serialization issue with W&B (#5693 ) The chain input_documents are not displaying properly in W&B, due to serialization issue: <img width="1164" alt="Screenshot 2023-06-04 at 11 58 26 AM" src="https://github.com/hwchase17/langchain/assets/134809928/f31f14f6-0935-4cca-9913-6760cd40eadf"> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 20:44:59 -07:00
Alec Flett	ec0dd6e34a	propagate callbacks to ConversationalRetrievalChain (#5572 ) # Allow callbacks to monitor ConversationalRetrievalChain I ran into an issue where load_qa_chain was not passing the callbacks down to the child LLM chains, and so made sure that callbacks are propagated. There are probably more improvements to do here but this seemed like a good place to stop. Note that I saw a lot of references to callbacks_manager, which seems to be deprecated. I left that code alone for now. ## Who can review? Tag maintainers/contributors who might be interested: @agola11	2023-06-07 20:25:21 -07:00
Jeff Vestal	3294774148	Add knn and query search field options to ElasticKnnSearch (#5641 ) in the `ElasticKnnSearch` class added 2 arguments that were not exposed properly `knn_search` added: - `vector_query_field: Optional[str] = 'vector'` -- vector_query_field: Field name to use in knn search if not default 'vector' `knn_hybrid_search` added: - `vector_query_field: Optional[str] = 'vector'` -- vector_query_field: Field name to use in knn search if not default 'vector' - `query_field: Optional[str] = 'text'` -- query_field: Field name to use in search if not default 'text' Fixes # https://github.com/hwchase17/langchain/issues/5633 cc: @dev2049 @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 20:19:14 -07:00
Mark Marryatt	cef79ca579	Fix exporting GCP Vertex Matching Engine from vectorstores (#5793 ) The Vertex Matching Engine docs include [the line](`b177a29d3f/docs/modules/indexes/vectorstores/examples/matchingengine.ipynb (L32)`) `from langchain.vectorstores import MatchingEngine` which doesn't work as it wasn't added to the vectorstores module exports. - @dev2049	2023-06-07 19:45:33 -07:00
Dave Ingram	106364a45c	Update to Getting Started docs page for Memory (#5855 ) Simply fixing a small typo in the memory page. Also removed an extra code block at the end of the file. Along the way, the current outputs seem to have changed in a few places so left that for posterity, and updated the number of runs which seems harmless, though I can clean that up if preferred.	2023-06-07 19:45:21 -07:00
bnassivet	9355e3f5f5	qdrant vector store - search with relevancy scores (#5781 ) Implementation of similarity_search_with_relevance_scores for the Qdrant vector store. As implemented, the method is also compatible with other capabilities such as filtering. Integration tests updated. #### Who can review? Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	2023-06-07 19:26:40 -07:00
Ning Ren	f15763518a	docs: add Shale Protocol integration guide (#5814 ) This PR adds documentation for Shale Protocol's integration with LangChain. [Shale Protocol](https://shaleprotocol.com) provides forever-free production-ready inference APIs to the open-source community. We have global data centers and plan to support all major open LLMs (estimated ~1,000 by 2025). The team consists of software and ML engineers, AI researchers, designers, and operators across North America and Asia. Combined together, the team has 50+ years experience in machine learning, cloud infrastructure, software engineering and product development. Team members have worked at places like Google and Microsoft. #### Who can review? Tag maintainers/contributors who might be interested: - @hwchase17 - @agola11 --------- Co-authored-by: Karen Sheng <46656667+karensheng@users.noreply.github.com>	2023-06-07 19:25:59 -07:00
Duarte OC	137da7e4b6	Update microsoft loader example with docx2txt dependency (#5832 ) @eyurtsev	2023-06-07 19:21:48 -07:00
Aidan Holland	9f4b720a63	Add additional VertexAI Params (#5837 ) ## Changes - Added the `stop` param to the `_VertexAICommon` class so it can be set at llm initialization ## Example Usage ```python VertexAI( # ... temperature=0.15, max_output_tokens=128, top_p=1, top_k=40, stop=["\n```"], ) ``` ## Possible Reviewers - @hwchase17 - @agola11	2023-06-07 19:20:37 -07:00
Eduard van Valkenburg	76fcd96dae	Add logging in PBI tool (#5841 ) Add some logging into the powerbi tool so that you can see the queries being sent to PBI and attempts to correct them. #### Who can review? Tag maintainers/contributors who might be interested: @vowelparrot	2023-06-07 19:19:21 -07:00
Matt Robinson	11fec7d4d1	feat: Add `UnstructuredCSVLoader` for CSV files (#5844 ) ### Summary Adds an `UnstructuredCSVLoader` for loading CSVs. One advantage of using `UnstructuredCSVLoader` relative to the standard `CSVLoader` is that if you use `UnstructuredCSVLoader` in `"elements"` mode, an HTML representation of the table will be available in the metadata. #### Who can review? @hwchase17 @eyurtsev	2023-06-07 19:18:01 -07:00
Soos3D	0b4a51930c	Add how to use a custom scraping function with the sitemap loader. (#5847 ) Hi! I just added an example of how to use a custom scraping function with the sitemap loader. I recently used this feature and had to dig in the source code to find it. I thought it might be useful to other devs to have an example in the Jupyter Notebook directly. I only added the example to the documentation page. @eyurtsev I was not able to run the lint. Please let me know if I have to do anything else. I know this is a very small contribution, but I hope it will be valuable. My Twitter handle is @web3Dav3.	2023-06-07 19:16:51 -07:00
Yessen Kanapin	c66755b661	Add DeepInfra embeddings integration with tests and examples, better exception handling for Deep Infra LLM (#5854 ) #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 - project lead - @agola11 --------- Co-authored-by: Yessen Kanapin <yessen@deepinfra.com>	2023-06-07 19:14:30 -07:00
ugfly1210	4d8cda1c3b	FIX: backslash escaped (#5815 ) LatexTextSplitter needs to use escaped separators, such as "\n\\\chapter"; otherwise it reports an error: (re.error: bad escape \c at position 1 (line 2, column 1)) #### Who can review? @hwchase17 @dev2049 Co-authored-by: Pang <ugfly@qq.com>	2023-06-07 16:01:07 -07:00
Zander Chase	3af36943e8	Rm extraneous args to the trace group helper (#5801 ) These are being ignored	2023-06-07 13:09:29 -07:00
whysage	8ef7274ee6	feat: issue-5712 add sleep tool (#5715 ) Fixes # 5712 added sleep tool	2023-06-07 09:39:02 -07:00
Zander Chase	d9fcc45d05	Add in the async methods and link the run id (#5810 )	2023-06-07 08:27:44 -07:00
Harrison Chase	ce7c11625f	bump version to 193 (#5838 )	2023-06-07 07:38:57 -07:00
warjiang	5a207cce8f	fix: fullfill openai params when embedding (#5821 ) Fixes #5822 I upgraded my langchain library by running `pip install -U langchain`, and the version is 0.0.192. But I found that `openai.api_base` is not working. I use the Azure OpenAI service as the OpenAI backend, so `openai.api_base` is very important for me. I compared tag/0.0.192 with tag/0.0.191 and found that: ![image](https://github.com/hwchase17/langchain/assets/6478745/e183fdb2-8224-45c9-b3b4-26d62823999a) The OpenAI params were moved inside the `_invocation_params` function and are used in some OpenAI invocations: ![image](https://github.com/hwchase17/langchain/assets/6478745/5a55a048-5fa9-4bf4-aaef-3902226bec5e) ![image](https://github.com/hwchase17/langchain/assets/6478745/85b8cebc-eeb8-4538-a525-814719c8f8df) but some cases are still not covered, such as: ![image](https://github.com/hwchase17/langchain/assets/6478745/e0297620-f2b2-4f4f-98bd-d0ed19022dac) #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 07:32:57 -07:00
Harrison Chase	b3ae6bcd3f	bump ver to 192 (#5812 )	2023-06-06 22:23:11 -07:00
Harrison Chase	5468528748	rm docs mongo (#5811 )	2023-06-06 22:22:44 -07:00
Andrew Switlyk	69f4ffb851	Update adding_memory.ipynb (#5806 ) just change "to" to "too" so it matches the above prompt	2023-06-06 22:10:53 -07:00
Sun bin	2be4fbb835	add doc about reusing MongoDBAtlasVectorSearch (#5805 ) DOC: add doc about reusing MongoDBAtlasVectorSearch #### Who can review? Anyone authorized.	2023-06-06 22:10:36 -07:00
bnassivet	062c3c00a2	fixed faiss integ tests (#5808 ) Fixes # 5807 Realigned tests with implementation. Also reinforced folder uniqueness for the test_faiss_local_save_load test using a date-time suffix. #### Before submitting - Integration test updated - formatting and linting ok (locally) #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 - project lead VectorStores / Retrievers / Memory - @dev2049	2023-06-06 22:07:27 -07:00
SvMax	92b87c2fec	added support for different types in ResponseSchema class (#5789 ) I added support for specifying different types with ResponseSchema objects: ## before ` extracted_info = ResponseSchema(name="extracted_info", description="List of extracted information") ` generates the following doc: ```json\n{\n\t\"extracted_info\": string // List of extracted information}``` This leads GPT to create a JSON with only one string in the specified field even if you requested a List in the description. ## now `extracted_info = ResponseSchema(name="extracted_info", type="List[string]", description="List of extracted information") ` generates the following doc: ```json\n{\n\t\"extracted_info\": List[string] // List of extracted information}``` This way the model responds better to the prompt, generating an array of strings. Tag maintainers/contributors who might be interested: Agents / Tools / Toolkits @vowelparrot Don't know who can be interested, I suppose this is a tool, so I tagged you vowelparrot, anyway, it's a minor change, and shouldn't impact any other part of the framework.	2023-06-06 22:00:48 -07:00
Harrison Chase	3954bcf396	WIP: openai settings (#5792 ) [] need to test more [] make sure they arent saved when serializing [] do for embeddings	2023-06-06 21:57:58 -07:00
Alex Lee	b7999a9bc1	Add UTF-8 json ouput support while langchain.debug is set to True. (#5802 ) Before: <img width="984" alt="image" src="https://github.com/hwchase17/langchain/assets/4317474/2b0807b4-a1d6-4df2-87cc-92b1c8e10534"> After: <img width="992" alt="image" src="https://github.com/hwchase17/langchain/assets/4317474/128c2c7d-2ed5-4c95-954d-b0964c83526a"> Thanks in advance. @agola11	2023-06-06 21:56:33 -07:00
kourosh hakhamaneshi	a0d847f636	[Docs][Hotfix] Fix broken links (#5800 ) Some links were broken from the previous merge. This PR fixes them. Tested locally. Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2023-06-06 17:17:16 -07:00
Zander Chase	217b5cc72d	Base RunEvaluator Chain (#5750 ) Clean up a bit and only implement the QA and reference free implementations from https://github.com/hwchase17/langchain/pull/5618	2023-06-06 16:42:15 -07:00
Lance Martin	4092fd21dc	YoutubeAudioLoader and updates to OpenAIWhisperParser (#5772 ) This introduces the `YoutubeAudioLoader`, which will load blobs from a YouTube url and write them. Blobs are then parsed by `OpenAIWhisperParser()`, as shown in this [PR](https://github.com/hwchase17/langchain/pull/5580), but we extend the parser to split audio such that each chunk meets the 25MB OpenAI size limit. As shown in the notebook, this enables a very simple UX: ``` # Transcribe the video to text loader = GenericLoader(YoutubeAudioLoader([url],save_dir),OpenAIWhisperParser()) docs = loader.load() ``` Tested on full set of Karpathy lecture videos: ``` # Karpathy lecture videos urls = ["https://youtu.be/VMj-3S1tku0", "https://youtu.be/PaCmpygFfXo", "https://youtu.be/TCH_1BHY58I", "https://youtu.be/P6sfmUTpUmc", "https://youtu.be/q8SA3rM6ckI", "https://youtu.be/t3YJ5hKiMQ0", "https://youtu.be/kCc8FmEb1nY"] # Directory to save audio files save_dir = "~/Downloads/YouTube" # Transcribe the videos to text loader = GenericLoader(YoutubeAudioLoader(urls,save_dir),OpenAIWhisperParser()) docs = loader.load() ```	2023-06-06 15:15:08 -07:00
Gengliang Wang	2a4b32dee2	Revise DATABRICKS_API_TOKEN as DATABRICKS_TOKEN (#5796 ) In the [Databricks integration](https://python.langchain.com/en/latest/integrations/databricks.html) and [Databricks LLM](https://python.langchain.com/en/latest/modules/models/llms/integrations/databricks.html) docs, we suggested that users set the environment variable `DATABRICKS_API_TOKEN`. However, this is inconsistent with other Databricks libraries. To make it consistent, this PR changes the variable from `DATABRICKS_API_TOKEN` to `DATABRICKS_TOKEN`. After the changes, `DATABRICKS_API_TOKEN` no longer appears in the docs: ``` $ git grep DATABRICKS_API_TOKEN\|wc -l 0 $ git grep DATABRICKS_TOKEN\|wc -l 8 ``` cc @hwchase17 @dev2049 @mengxr since you have reviewed the previous PRs.	2023-06-06 14:22:49 -07:00
Paul-Emile Brotons	daf3e99b96	fixing from_documents method of the MongoDB Atlas vector store (#5794 ) Fixed a bug in the from_documents method: Collection objects do not implement truth value testing or bool(). @dev2049	2023-06-06 14:22:23 -07:00
Ankush Gola	b177a29d3f	support returning run info for llms, chat models and chains (#5666 ) returning the run id is important for accessing the run later on	2023-06-06 10:07:46 -07:00
Yoann Poupart	65111eb2b3	Attribute support for html tags (#5782 ) # What does this PR do? Change the HTML tags so that a tag with attributes can be found. ## Before submitting - [x] Tests added - [x] CI/CD validated ### Who can review? Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.	2023-06-06 09:27:37 -07:00
Zander Chase	0cfaa76e45	Set Falsey (#5783 ) Seems natural to try to disable logging by setting `MY_VAR=false` rather than unsetting (especially once you've already set it in the background)	2023-06-06 09:26:38 -07:00
Harrison Chase	2ae2d6cd1d	fix ver 191 (#5784 )	2023-06-06 09:17:23 -07:00
Zander Chase	204a73c1d9	Use client from LCP-SDK (#5695 ) - Remove the client implementation (this breaks backwards compatibility for existing testers. I could keep the stub in that file if we want, but not many people are using it yet) - Add SDK as dependency - Update the 'run_on_dataset' method to be a function that optionally accepts a client as an argument - Remove the langchain plus server implementation (you get it for free with the SDK now) We could make the SDK optional for now, but the plan is to use w/in the tracer so it would likely become a hard dependency at some point.	2023-06-06 06:51:05 -07:00
Harrison Chase	08e2352f7b	bump ver 191 (#5766 )	2023-06-05 20:54:08 -07:00
berkedilekoglu	f907b62526	Scores are explained in vectorestore docs (#5613 ) # Scores in Vector Stores' Docs Are Explained The following vector stores can return scores with similar documents by using `similarity_search_with_score`: - chroma - docarray_hnsw - docarray_in_memory - faiss - myscale - qdrant - supabase - vectara - weaviate However, in the documentation, these scores were either not explained at all or explained in a way that could lead to misunderstandings (e.g., FAISS). For instance, in the FAISS document: if we treat the score returned by the function as a similarity score, we would conclude that a document with a higher score is more similar to the source document. However, since the scores returned by the function are distance scores, smaller scores correspond to more similar documents. For the libraries other than Vectara, I documented the scores they use after checking the source libraries. Since I couldn't be certain about the score metric used by Vectara, I didn't make any changes to its documentation. The links mentioned in Vectara's documentation were broken by updates, so I replaced them with working ones. VectorStores / Retrievers / Memory - @dev2049 my twitter: [berkedilekoglu](https://twitter.com/berkedilekoglu) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 20:39:49 -07:00
Adil Ansari	233b52735e	feat: Support for `Tigris` Vector Database for vector search (#5703 ) ### Changes - New vector store integration - [Tigris](https://tigrisdata.com) - Adds [tigrisdb](https://pypi.org/project/tigrisdb/) optional dependency - Example notebook demonstrating usage Fixes #5535 Closes tigrisdata/tigris-client-python#40 #### Twitter handles We'd love a shoutout on our [@TigrisData](https://twitter.com/TigrisData) and [@adilansari](https://twitter.com/adilansari) twitter handles #### Who can review? @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 20:39:16 -07:00
Edrick Da Corte Henriquez	38dabdbb3a	Update tutorials.md (#5761 ) # Added an overview of LangChain modules Aimed at introducing newcomers to LangChain's main modules :) Twitter handle is @edrick_dch ## Who can review? @eyurtsev	2023-06-05 20:37:11 -07:00
Ankush Gola	84a46753ab	Tracing Group (#5326 ) Add context manager to group all runs under a virtual parent --------- Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>	2023-06-05 19:18:43 -07:00
Ilya	d5b1608216	fix markdown text splitter horizontal lines (#5625 ) Fixes #5614 #### Issue The `**` combination produces an exception when used as a separator in `re.split`. Instead `\\\` should be used for regex expressions. #### Who can review? @eyurtsev	2023-06-05 16:40:26 -07:00
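A minimal sketch of the regex issue described in the commit above, using only Python's standard `re` module. The sample text and the exact separator pattern are assumptions chosen for illustration; the separators actually used by the splitter are not shown in this log.

```python
import re

text = "First section\n***\nSecond section"

# An unescaped run of asterisks is not a valid pattern: `*` is a quantifier
# with nothing before it to repeat, so compiling the pattern raises re.error.
try:
    re.split("***", text)
    unescaped_ok = True
except re.error:
    unescaped_ok = False

# Escaping each asterisk makes the pattern match the literal characters.
parts = re.split(r"\n\*\*\*+\n", text)
print(unescaped_ok, parts)
```

`re.escape` is another way to build a literal pattern from an arbitrary separator string.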
Harrison Chase	25487fa5ee	Harrison/youtube multi language (#5758 ) Co-authored-by: rafly lesmana <raflylesmana111@gmail.com>	2023-06-05 16:38:07 -07:00
Shelby Jenkins	2dcda8a8ac	Strips whitespace and \n from loc before filtering urls from sitemap (#5728 ) Fixes #5699 #### Who can review? Tag maintainers/contributors who might be interested: @woodworker @LeSphax @johannhartmann --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 16:33:55 -07:00
Harrison Chase	98dd6d068a	cohere retries (#5757 ) …719) A minor update to retry the Cohere API call in case of errors using tenacity, as is done for OpenAI LLMs. #### Who can review? @hwchase17, @agola11 --------- Co-authored-by: Sagar Sapkota <22609549+sagar-spkt@users.noreply.github.com>	2023-06-05 16:28:58 -07:00
M Waleed Kadous	5124c1e0d9	Add aviary support (#5661 ) Aviary is an open source toolkit for evaluating and deploying open source LLMs. You can find out more about it at http://github.com/ray-project/aviary. You can try it out at [aviary.anyscale.com](http://aviary.anyscale.com). This code adds support for Aviary in LangChain. To minimize dependencies, it connects directly to the HTTP endpoint. The current implementation is not accelerated and uses the default implementation of `predict` and `generate`. It includes a test and a simple example. @hwchase17 and @agola11 could you have a look at this? --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-05 16:28:42 -07:00
felpigeon	a47c8618ec	Add class attribute "return_generated_question" to class "BaseConversationalRetrievalChain" (#5749 ) Adding a class attribute "return_generated_question" to class "BaseConversationalRetrievalChain". If set to `True`, the chain's output has a key "generated_question" with the question generated by the sub-chain `question_generator` as the value. This way the generated question can be logged. #### Who can review? @dev2049 @vowelparrot	2023-06-05 16:10:12 -07:00
Leonid Ganeline	87ad4fc4b2	docs: updated `ecosystem/dependents` (#5753 ) updated `ecosystem/dependents` data (it was updated 2+ weeks ago) #### Who can review? @hwchase17 @eyurtsev @dev2049	2023-06-05 16:09:55 -07:00
Leonid Ganeline	92a5f00ffb	docs: `ecosystem/integrations` update 5 (#5752 ) - added missed integration to `docs/ecosystem/integrations/` - updated notebooks to consistent format: changed titles, file names; added descriptions #### Who can review? @hwchase17 @dev2049	2023-06-05 16:08:55 -07:00
Lance Martin	aea090045b	Create OpenAIWhisperParser for generating Documents from audio files (#5580 ) # OpenAIWhisperParser This PR creates a new parser, `OpenAIWhisperParser`, that uses the [OpenAI Whisper model](https://platform.openai.com/docs/guides/speech-to-text/quickstart) to perform transcription of audio files to text (`Documents`). Please see the notebook for usage.	2023-06-05 15:51:13 -07:00
Hao Chen	a4c9053d40	Integrate Clickhouse as Vector Store (#5650 ) #### Description This PR integrates the open source version of ClickHouse as a vector store, as it is easy both for local development and for adoption of LangChain by enterprises that already have large-scale ClickHouse deployments. ClickHouse is an open source real-time OLAP database with full SQL support and a wide range of functions to assist users in writing analytical queries. Some of these functions and data structures perform distance operations between vectors, [enabling ClickHouse to be used as a vector database](https://clickhouse.com/blog/vector-search-clickhouse-p1). Recently added ClickHouse capabilities like [Approximate Nearest Neighbour (ANN) indices](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/annindexes) support faster approximate matching of vectors and are a promising development for further enhancing the vector matching capabilities of ClickHouse. In LangChain, some ClickHouse based commercial variant vector stores like [Chroma](https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/chroma.py) and [MyScale](https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/myscale.py), etc. are already integrated, but for enterprises with large-scale ClickHouse cluster deployments it will be more straightforward to upgrade existing ClickHouse infra instead of moving to another similar vector store solution, so we believe it's a valid requirement to integrate the open source version of ClickHouse as a vector store. As `clickhouse-connect` is already included by other integrations, this PR won't include any new dependencies. #### Before submitting 1. Added a test for the integration: https://github.com/haoch/langchain/blob/clickhouse/tests/integration_tests/vectorstores/test_clickhouse.py 2. Added an example notebook and document showing its use: * Notebook: https://github.com/haoch/langchain/blob/clickhouse/docs/modules/indexes/vectorstores/examples/clickhouse.ipynb * Doc: https://github.com/haoch/langchain/blob/clickhouse/docs/integrations/clickhouse.md #### Who can review? @hwchase17 @dev2049 Could you please help review? --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-05 13:32:04 -07:00
Gustavo Brian	2f2d27fd82	Error in documentation: Chroma constructor (#5731 ) Chroma("langchain_store", embeddings.embed_query) must be Chroma("langchain_store", embeddings)	2023-06-05 13:30:58 -07:00
George Geddes	019eb13681	Fix a typo in the documentation for the Slack document loader (#5745 ) Fixes a typo I noticed while reading the docs.	2023-06-05 13:30:24 -07:00
Andrew Grangaard	450eb91fe2	Removes unnecessary backslash escaping for backticks in python (#5751 ) Fixed python deprecation warning: DeprecationWarning: invalid escape sequence '`' backticks (`) do not have special meaning in python strings and should not be escaped. -- @spazm on twitter ### Who can review: @nfcampos ported this change from javascript, @hwchase17 wrote the original STRUCTURED_FORMAT_INSTRUCTIONS,	2023-06-05 13:30:11 -07:00
Daniel Chalef	0551bc90a5	Zep Hybrid Search (#5742 ) Zep now supports persisting custom metadata with messages and hybrid search across both message embeddings and structured metadata. This PR implements custom metadata and enhancements to the `ZepChatMessageHistory` and `ZepRetriever` classes to implement this support. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049 --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	2023-06-05 12:59:28 -07:00
Tomaz Bratanic	a0ea6f6b6b	Cypher search: Check if generated Cypher is provided in backticks (#5541 ) # Check if generated Cypher code is wrapped in backticks Some LLMs, like VertexAI, like to explain how they generated the Cypher statement and wrap the actual code in three backticks: ![Screenshot from 2023-06-01 08-08-23](https://github.com/hwchase17/langchain/assets/19948365/1d8eecb3-d26c-4882-8f5b-6a9bc7e93690) I have observed a similar pattern with OpenAI chat models in conversational settings, where multiple user and assistant messages are provided to the LLM to generate Cypher statements, and the LLM then wants to apologize for previous steps or explain its thoughts. Interestingly, both OpenAI and VertexAI wrap the code in three backticks if they are doing any explaining or apologizing. Checking whether the generated Cypher is wrapped in backticks seems like low-hanging fruit for expanding the Cypher search to other LLMs and conversational settings.	2023-06-05 12:48:13 -07:00
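The backtick check described in the commit above could look roughly like the following sketch. `extract_cypher` is a hypothetical helper written for this illustration, not necessarily the function the PR added, and the sample reply is invented.

```python
import re

FENCE = "`" * 3  # three backticks, built at runtime to keep this example readable
FENCED_BLOCK = re.compile(FENCE + r"(.*?)" + FENCE, re.DOTALL)


def extract_cypher(text: str) -> str:
    """Return the contents of the first triple-backtick block, or the text itself."""
    match = FENCED_BLOCK.search(text)
    return match.group(1).strip() if match else text.strip()


reply = (
    "Sorry for the earlier mistake. Here is the query:\n"
    + FENCE + "\nMATCH (n) RETURN count(n)\n" + FENCE
    + "\nIt counts all nodes."
)
print(extract_cypher(reply))
print(extract_cypher("MATCH (n) RETURN n"))
```

A reply with no backticks passes through unchanged, so models that return bare Cypher keep working.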
Abhijeet Malamkar	1a9ac3b1f9	Adding support to save multiple memories at a time. Cuts save time by … (#5172 ) # Adding support to save multiple memories at a time. Cuts save time by more than half ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049 @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-05 12:47:48 -07:00
kourosh hakhamaneshi	625717daa8	docs: Added Deploying LLMs into production + a new ecosystem (#4047 ) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Co-authored-by: Kamil Kaczmarek <kaczmarek.poczta@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 12:47:27 -07:00
Ralph Schlosser	74f8e603d9	Addresses GPT4All wrapper model_type attribute issues #5720 . (#5743 ) Fixes #5720. A more in-depth discussion is in my comment here: https://github.com/hwchase17/langchain/issues/5720#issuecomment-1577047018 In a nutshell, there has been a subtle change in the latest version of GPT4All's Python bindings. The change I submitted yesterday is compatible with this version; however, this version is as yet unreleased, and thus the code change breaks LangChain's wrapper under the currently released version of GPT4All. This pull request proposes a backwards-compatible solution.	2023-06-05 12:45:29 -07:00
Harrison Chase	d0d89d39ef	bump version to 190 (#5704 )	2023-06-04 20:04:50 -07:00
mheguy-stingray	b64c39dfe7	top_k and top_p transposed in vertexai (#5673 ) Fix transposed properties in vertexai model Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-04 16:59:53 -07:00
Tobias Herbold	3fb0e4872a	sqlalchemy MovedIn20Warning declarative_base DEPRICATION fix (#5676 ) Fix for the deprecated sqlalchemy declarative_base import: ``` MovedIn20Warning: The ``declarative_base()`` function is now available as sqlalchemy.orm.declarative_base(). (deprecated since: 2.0) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9) Base = declarative_base() # type: Any ``` The import is wrapped in a try/except block to fall back to the old import if needed. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-04 16:52:52 -07:00
Jens Madsen	8d9e9e013c	refactor: extract token text splitter function (#5179 ) # Token text splitter for sentence transformers The current TokenTextSplitter only works with OpenAI models via the `tiktoken` package. This is not clear from the name `TokenTextSplitter`. In this (first) PR, a token-based text splitter for sentence transformer models is added. In the future I think we should work towards injecting a tokenizer into the TokenTextSplitter to make it more flexible. Could perhaps be reviewed by @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-04 14:41:44 -07:00
Nathan Azrak	26ec845921	Raise an exception in MKRL and Chat Output Parsers if parsing text which contains both an action and a final answer (#5609 ) Raises exception if OutputParsers receive a response with both a valid action and a final answer Currently, if an OutputParser receives a response which includes both an action and a final answer, it returns a FinalAnswer object. This allows the parser to accept responses which propose an action and hallucinate an answer without the action being parsed or taken by the agent. This PR changes the logic to: 1. store a variable checking whether a response contains the `FINAL_ANSWER_ACTION` (this is the easier condition to check). 2. store a variable checking whether the response contains a valid action 3. if both are present, raise a new exception stating that both are present 4. if an action is present, return an AgentAction 5. if an answer is present, return an AgentAnswer 6. if neither is present, raise the relevant exception based around the action format (these have been kept consistent with the prior exception messages) Disclaimer: * Existing mock data included strings which did include an action and an answer. This might indicate that prioritising returning AgentAnswer was always correct, and I am patching out desired behaviour? @hwchase17 to advise. Curious if there are allowed cases where this is not hallucinating, and we do want the LLM to output an action which isn't taken. * I have not passed `send_to_llm` through this new exception Fixes #5601 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 - project lead @vowelparrot	2023-06-04 14:40:49 -07:00
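The six-step logic listed in the commit above can be sketched as follows. This is a self-contained illustration, not the parser code from the PR: the `Action`, `Answer`, and `ParseError` classes, the `parse` function, and the assumed "Action: ... / Action Input: ..." text format are all hypothetical stand-ins.

```python
import re
from dataclasses import dataclass

FINAL_ANSWER_ACTION = "Final Answer:"
# Assumed action format for this sketch: "Action: <tool>" followed by "Action Input: <input>".
ACTION_PATTERN = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)", re.DOTALL
)


@dataclass
class Action:
    tool: str
    tool_input: str


@dataclass
class Answer:
    output: str


class ParseError(ValueError):
    """Raised when the text cannot be parsed unambiguously."""


def parse(text: str):
    includes_answer = FINAL_ANSWER_ACTION in text      # step 1
    action_match = ACTION_PATTERN.search(text)         # step 2
    if action_match and includes_answer:               # step 3
        raise ParseError("Output contains both an action and a final answer")
    if action_match:                                   # step 4
        return Action(action_match.group(1).strip(), action_match.group(2).strip())
    if includes_answer:                                # step 5
        return Answer(text.split(FINAL_ANSWER_ACTION)[-1].strip())
    raise ParseError("Output contains neither an action nor a final answer")  # step 6


print(parse("Thought: look it up\nAction: Search\nAction Input: weather in Paris"))
print(parse("Thought: done\nFinal Answer: 42"))
```

Checking for the ambiguous case first means a hallucinated answer can no longer mask an action that was never executed.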
Lucas Rodrigues	c112d7334d	Update MongoDBChatMessageHistory to create an index on SessionId (#5632 ) All the queries to the database are done based on the SessionId property, this will optimize how Mongo retrieves all messages from a session #### Who can review? Tag maintainers/contributors who might be interested: @dev2049	2023-06-04 14:39:56 -07:00
Jason Weill	6c11f94013	Retitles Bedrock doc to appear in correct alphabetical order in site nav (#5639 ) Fixes #5638. Retitles "Amazon Bedrock" page to "Bedrock" so that the Integrations section of the left nav is properly sorted in alphabetical order.	2023-06-04 14:39:25 -07:00
Will Smith	6e25e65085	SQL agent : Improved prompt engineering prevents agent guessing database column names. (#5671 ) @vowelparrot: Minor change to the SQL agent: tells the agent to introspect the schema of the most relevant tables. I found this dramatically decreases the chance that the agent wastes time guessing column names.	2023-06-04 14:39:00 -07:00
Nuhman Pk	8f98592ac9	Added Dependencies Status, Open issues and releases badges in Readme.md (#5681 ) [![Dependency Status](https://img.shields.io/librariesio/github/hwchase17/langchain)](https://libraries.io/github/hwchase17/langchain) [![Open Issues](https://img.shields.io/github/issues-raw/hwchase17/langchain)](https://github.com/hwchase17/langchain/issues) [![Release Notes](https://img.shields.io/github/release/hwchase17/langchain)](https://github.com/hwchase17/langchain/releases)	2023-06-04 14:30:52 -07:00
Harrison Chase	b9040669a0	Harrison/pipeline prompt (#5540 ) idea is to make prompts more composable	2023-06-04 14:29:37 -07:00
George Roberts	647210a4b9	Add args_schema to google_places tool (#5680 ) Tiny change to actually add the args_schema to the tool. @vowelparrot	2023-06-04 14:28:46 -07:00
Ralph Schlosser	8fea0529c1	This fixes issue #5651 - GPT4All wrapper loading issue (#5657 ) Fixes #5651 Small typo in wrapper code. Note the `model_type` parameter is currently unused by GPT4All. https://github.com/hwchase17/langchain/issues/5651	2023-06-04 07:21:16 -07:00
Jiayao Yu	6a3ceaa377	Support similarity_score_threshold retrieval with Chroma (#5655 ) Fixes https://github.com/hwchase17/langchain/issues/5067 Verified the following code now works correctly: ``` db = Chroma(persist_directory=index_directory(index_name), embedding_function=embeddings) retriever = db.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.4}) docs = retriever.get_relevant_documents(query) ```	2023-06-03 16:57:00 -07:00
Hao Chen	3e45b83065	Improve Error Messaging for APOC Procedure Failure in Neo4jGraph (#5547 ) ## Improve Error Messaging for APOC Procedure Failure in Neo4jGraph This commit revises the error message provided when the 'apoc.meta.data()' procedure fails. Previously, the message simply instructed the user to install the APOC plugin in Neo4j. The new error message is more specific. Also removed an unnecessary newline in the Cypher statement variable: `node_properties_query`. Fixes #5545 ## Who can review? - @vowelparrot - @dev2049	2023-06-03 16:56:39 -07:00
Ricardo Reis	33ea606f45	Update youtube.py - Fix metadata validation error in YoutubeLoader (#5479 ) This commit addresses a ValueError occurring when the YoutubeLoader class tries to add datetime metadata from a YouTube video's publish date. The error was happening because the ChromaDB metadata validation only accepts str, int, or float data types. In the `_get_video_info` method of the `YoutubeLoader` class, the publish date retrieved from the YouTube video was of datetime type. This commit fixes the issue by converting the datetime object to a string before adding it to the metadata dictionary. Additionally, this commit introduces error handling in the `_get_video_info` method to ensure that all metadata fields have valid values. If a metadata field is found to be None, a default value is assigned. This prevents potential errors during metadata validation when metadata fields are None. The file modified in this commit is youtube.py. ## Who can review? Community members can review the PR once tests pass. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-03 16:56:17 -07:00
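The metadata fix described in the YoutubeLoader commit above can be sketched as below. `sanitize_metadata` is a hypothetical helper for illustration; the default value `"Unknown"` and the date format are assumptions, not values taken from the commit.

```python
from datetime import datetime

# Types the commit message says the metadata validation accepts.
ALLOWED_TYPES = (str, int, float)


def sanitize_metadata(metadata: dict, default: str = "Unknown") -> dict:
    """Return a copy of `metadata` whose values are all str, int, or float."""
    clean = {}
    for key, value in metadata.items():
        if value is None:
            clean[key] = default                 # missing field gets a default
        elif isinstance(value, datetime):
            clean[key] = value.strftime("%Y-%m-%d %H:%M:%S")  # datetime to str
        elif isinstance(value, ALLOWED_TYPES):
            clean[key] = value
        else:
            clean[key] = str(value)              # fall back to a string form
    return clean


video_info = {
    "title": None,
    "publish_date": datetime(2023, 6, 3, 16, 56, 17),
    "view_count": 10,
}
print(sanitize_metadata(video_info))
```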
Shuqian	5af2c51e78	refactor: BaseStringMessagePromptTemplate from_template method (#5332 ) # refactor BaseStringMessagePromptTemplate from_template method Refactor the `from_template` method of the `BaseStringMessagePromptTemplate` class to allow passing keyword arguments to the `from_template` method of `PromptTemplate`. Enable the usage of arguments like `template_format`. In my scenario, I intend to utilize Jinja2 for formatting the human message prompt in the chat template. ## Who can review? Community members can review the PR once tests pass. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 16:55:58 -07:00
mbchang	d3bdb8ea6d	FileCallbackHandler (#5589 ) # like [StdoutCallbackHandler](https://github.com/hwchase17/langchain/blob/master/langchain/callbacks/stdout.py), but writes to a file When running experiments I have found myself wanting to log the outputs of my chains in a more lightweight way than using WandB tracing. This PR contributes a callback handler that writes to file what `StdoutCallbackHandler` would print. ## Example Notebook See the included `filecallbackhandler.ipynb` notebook for usage. Would it be better to include this notebook under `modules/callbacks` or under `integrations/`? ![image](https://github.com/hwchase17/langchain/assets/6439365/c624de0e-343f-4eab-a55b-8808a887489f) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @agola11	2023-06-03 16:48:48 -07:00
rajib	1c51d3db0f	Created fix for 5475 (#5659 ) Created fix for 5475 Currently in PGvector, we do not have any function that returns the instance of an existing store. The from_documents always adds embeddings and then returns the store. This fix is to add a function that will return the instance of an existing store Also changed the jupyter example for PGVector to show the example of using the function Fixes # 5475 #### Who can review? @dev2049 @hwchase17 --------- Co-authored-by: rajib76 <rajib76@yahoo.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 16:47:52 -07:00
Michael Landis	475007d63a	fix: correct momento chat history notebook typo and title (#5646 ) This PR corrects a minor typo in the Momento chat message history notebook and also expands the title from "Momento" to "Momento Chat History", in line with other chat history storage providers. #### Who can review? cc @dev2049 who reviewed the original integration	2023-06-03 16:39:27 -07:00
Paul-Emile Brotons	92f218207b	removing client+namespace in favor of collection (#5610 ) removing client+namespace in favor of collection for an easier instantiation and to be similar to the typescript library @dev2049	2023-06-03 16:27:31 -07:00
Harrison Chase	ad09367a92	Harrison/pubmed integration (#5664 ) Co-authored-by: younis basher <71520361+younis-ba@users.noreply.github.com> Co-authored-by: Younis Bashir <younis@omicmd.com>	2023-06-03 16:25:28 -07:00
Harrison Chase	9921f8cc3a	Harrison/update azure nb (#5665 ) Co-authored-by: NEWTON MALLICK <38786893+N-E-W-T-O-N@users.noreply.github.com>	2023-06-03 16:25:08 -07:00
C.J. Jameson	4e71a1702b	nit: pgvector python example notebook, fix variable reference (#5595 ) Fixes the pgvector python example notebook: one of the variables was not referencing anything ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	2023-06-03 15:29:34 -07:00
Leonid Ganeline	b201cfaa0f	docs `ecosystem/integrations` update 4 (#5590 ) # docs `ecosystem/integrations` update 4 Added missed integrations. Fixed inconsistencies. ## Who can review? @hwchase17 @dev2049	2023-06-03 15:29:03 -07:00
Davis Chase	ae3611730a	handle single arg to and/or (#5637 ) @ryderwishart @eyurtsev thoughts on handling this in the parser itself? related to #5570	2023-06-03 15:18:46 -07:00
khallbobo	934319fc28	Add parameters to send_message() call for vertexai chat models (PaLM2) (#5566 ) # Ensure parameters are used by vertexai chat models (PaLM2) The current version of the google aiplatform contains a bug where parameters for a chat model are not used as intended. See https://github.com/googleapis/python-aiplatform/issues/2263 Params can be passed both to start_chat() and send_message(); however, the parameters passed to start_chat() will not be used if send_message() is called without the overrides. This is due to the defaults in send_message() being global values rather than None (there is code in send_message() which would use the params from start_chat() if the param passed to send_message() evaluates to False, but that won't happen as the defaults are global values). Fixes # 5531 @hwchase17 @agola11	2023-06-03 15:17:38 -07:00
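The default-value pitfall described in the commit above can be illustrated with a minimal sketch. The `ChatSession` class, its method names, and `DEFAULT_TEMPERATURE` are hypothetical and are not the Vertex AI SDK; the sketch only shows why a concrete default in the per-call signature hides the session-level setting, and how a `None` default would let it through.

```python
from typing import Optional

DEFAULT_TEMPERATURE = 0.2  # hypothetical module-level default


class ChatSession:
    """Hypothetical stand-in for a chat session; not the Vertex AI SDK."""

    def __init__(self, temperature: float = DEFAULT_TEMPERATURE) -> None:
        self._temperature = temperature

    def send_message_buggy(
        self, text: str, temperature: float = DEFAULT_TEMPERATURE
    ) -> float:
        # The default is a truthy global value, so the fallback to the
        # session-level setting is never reached when the caller omits it.
        return temperature or self._temperature

    def send_message_fixed(
        self, text: str, temperature: Optional[float] = None
    ) -> float:
        # A None default makes "not provided" distinguishable from a real value.
        return temperature if temperature is not None else self._temperature


session = ChatSession(temperature=0.9)
buggy_value = session.send_message_buggy("hi")
fixed_value = session.send_message_fixed("hi")
```

Until the upstream defaults change, the workaround named in the commit title is to pass the parameters explicitly on every `send_message()` call.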
UmerHA	44ad9628c9	QuickFix for FinalStreamingStdOutCallbackHandler: Ignore new lines & white spaces (#5497 ) # Make FinalStreamingStdOutCallbackHandler more robust by ignoring new lines & white spaces `FinalStreamingStdOutCallbackHandler` doesn't work out of the box with `ChatOpenAI`, as it tokenizes slightly differently than `OpenAI`. The response of `OpenAI` contains the tokens `["\nFinal", " Answer", ":"]` while `ChatOpenAI` contains `["Final", " Answer", ":"]`. This PR makes `FinalStreamingStdOutCallbackHandler` more robust by ignoring new lines & white spaces when determining if the answer prefix has been reached. Fixes #5433 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: Tracing / Callbacks - @agola11 Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) \| Discord: RicChilligerDude#7589	2023-06-03 15:05:58 -07:00
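A minimal sketch of the whitespace-insensitive prefix check described in the commit above. The helper names are illustrative assumptions and this is not the handler's actual code; it only shows that stripping each token makes the two token sequences quoted in the message compare equal.

```python
from typing import List, Sequence

# Token sequence that marks the start of the final answer (taken from the message).
ANSWER_PREFIX_TOKENS = ["Final", "Answer", ":"]


def strip_tokens(tokens: Sequence[str]) -> List[str]:
    """Remove leading/trailing whitespace and newlines from every token."""
    return [token.strip() for token in tokens]


def reached_answer_prefix(
    last_tokens: Sequence[str],
    prefix: Sequence[str] = ANSWER_PREFIX_TOKENS,
) -> bool:
    """Compare the most recent tokens to the prefix, ignoring whitespace."""
    return strip_tokens(last_tokens) == strip_tokens(prefix)


openai_style = ["\nFinal", " Answer", ":"]
chat_style = ["Final", " Answer", ":"]
both_match = reached_answer_prefix(openai_style) and reached_answer_prefix(chat_style)
```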
Nathan Azrak	1f4abb265a	Adds the option to pass the original prompt into the AgentExecutor for PlanAndExecute agents (#5401 ) # Adds the option to pass the original prompt into the AgentExecutor for PlanAndExecute agents This PR allows the user to optionally specify that they wish for the original prompt/objective to be passed into the Executor agent used by the PlanAndExecute agent. This solves a potential problem where the plan is formed referring to some context contained in the original prompt, but which is not included in the current prompt. Currently, the prompt format given to the Executor is: ``` System: Respond to the human as helpfully and accurately as possible. You have access to the following tools: <Tool and Action Description> <Output Format Description> Begin! Reminder to ALWAYS respond with a valid json blob of a single action. Use tools if necessary. Respond directly if appropriate. Format is Action:```$JSON_BLOB```then Observation:. Thought: Human: <Previous steps> <Current step> ``` This PR changes the final part after `Human:` to optionally insert the objective: ``` Human: <objective> <Previous steps> <Current step> ``` I have given a specific example in #5400 where the context of a database path is lost, since the plan refers to the "given path". The PR has been linted and formatted. So that existing behaviour is not changed, I have defaulted the argument to `False` and added it as the last argument in the signature, so it does not cause issues for any users passing args positionally as opposed to using keywords. Happy to take any feedback or make required changes! Fixes #5400 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @vowelparrot --------- Co-authored-by: Nathan Azrak <nathan.azrak@gmail.com>	2023-06-03 14:59:09 -07:00
Felipe Ferreira	ae2cf1f598	Implements support for Personal Access Token Authentication in the ConfluenceLoader (#5385 ) # Implements support for Personal Access Token Authentication in the ConfluenceLoader Fixes #5191 Implements a new optional parameter for the ConfluenceLoader: `token`. This allows the use of personal access authentication when using the on-prem server version of Confluence. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev @Jflick58 Twitter Handle: felipe_yyc --------- Co-authored-by: Felipe <feferreira@ea.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 14:57:49 -07:00
Gardner Bickford	b81f98b8a6	Update confluence.py to return spaces between elements (#5383 ) # Update confluence.py to return spaces between elements like headers and links. Please see https://stackoverflow.com/questions/48913975/how-to-return-nicely-formatted-text-in-beautifulsoup4-when-html-text-is-across-m Given: ```html <address> 183 Main St<br>East Copper<br>Massachusetts<br>U S A<br> MA 01516-113 </address> ``` The document loader currently returns: ``` '183 Main StEast CopperMassachusettsU S A MA 01516-113' ``` After this change, the document loader will return: ``` 183 Main St East Copper Massachusetts U S A MA 01516-113 ``` @eyurtsev would you prefer this to be an option that can be passed in?	2023-06-03 14:57:25 -07:00
Zeeland	b72401b47b	pref: reduce DB query error rate (#5339 ) # Reduce DB query error rate If you use sql agent of `SQLDatabaseToolkit` to query data, it is prone to errors in query fields and often uses fields that do not exist in database tables for queries. However, the existing prompt does not effectively make the agent aware that there are problems with the fields they query. At this time, we urgently need to improve the prompt so that the agent realizes that they have queried non-existent fields and allows them to use the `schema_sql_db`, that is,` ListSQLDatabaseTool` first queries the corresponding fields in the table in the database, and then uses `QuerySQLDatabaseTool` for querying. There is a demo of my project to show this problem. Original Agent ```python def create_mysql_kit(): db = SQLDatabase.from_uri("mysql+pymysql://xxxxxxx") llm = OpenAI(temperature=0) toolkit = SQLDatabaseToolkit(db=db, llm=llm) agent_executor = create_sql_agent( llm=OpenAI(temperature=0), toolkit=toolkit, verbose=True ) agent_executor.run("Who are the users of sysuser in this system? Tell me the username of all users") if __name__ == '__main__': create_mysql_kit() ``` original output ``` > Entering new AgentExecutor chain... Action: list_tables_sql_db Action Input: "" Observation: app_sysrole_menus, app_bimfacemodel, app_project_users, app_measuringpointdata, auth_user, auth_user_groups, django_apscheduler_djangojobexecution, app_project, app_elementpoint, django_apscheduler_djangojob, django_content_type, app_sysrole, django_admin_log, app_bimfaceaccount, app_measuringpoint_warning_thresholds, app_measuringpoint, app_company, auth_group_permissions, app_sysuser, app_sysuser_companies, app_sysmenu, app_datawarningthreshold, auth_group, auth_permission, app_datawarningrecord, auth_user_user_permissions, app_bimfaceaccount_bimface_models, django_migrations, app_measuringitem, django_session Thought: I should query the app_sysuser table to get the users in the system. 
Action: query_sql_db Action Input: SELECT username FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT username FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT username FROM app_sysuser LIMIT 10; Observation: SELECT username FROM app_sysuser LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT username FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT username FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT name FROM app_sysuser LIMIT 10; Observation: SELECT name FROM app_sysuser LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT name FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'name' in 'field list'") [SQL: SELECT name FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT id, username FROM app_sysuser LIMIT 10; Observation: SELECT id, username FROM app_sysuser LIMIT 10; Thought: The query looks correct, so I should execute it. 
Action: query_sql_db Action Input: SELECT id, username FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'id' in 'field list'") [SQL: SELECT id, username FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT user_id, username FROM app_sysuser LIMIT 10; Observation: SELECT user_id, username FROM app_sysuser WHERE user_id IS NOT NULL AND user_id BETWEEN 0 AND 10 UNION ALL SELECT user_id, username FROM app_sysuser WHERE username IS NOT NULL ORDER BY user_id ASC LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT user_id, username FROM app_sysuser WHERE user_id IS NOT NULL AND user_id BETWEEN 0 AND 10 UNION ALL SELECT user_id, username FROM app_sysuser WHERE username IS NOT NULL ORDER BY user_id ASC LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT user_id, username FROM app_sysuser WHERE user_id IS NOT NULL AND user_id BETWEEN 0 AND 10 UNION ALL SELECT user_id, username FROM app_sysuser WHERE username IS NOT NULL ORDER BY user_id ASC LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT user_id, user_name FROM app_sysuser LIMIT 10; Observation: SELECT user_id, user_name FROM app_sysuser WHERE user_id NOT IN (NULL) AND user_name BETWEEN 'A' AND 'Z' UNION ALL SELECT user_id, user_name FROM app_sysuser WHERE user_id IS NOT NULL AND user_name LIKE '%a%' LIMIT 10; Thought: The query looks correct, so I should execute it. 
Action: query_sql_db Action Input: SELECT user_id, user_name FROM app_sysuser WHERE user_id NOT IN (NULL) AND user_name BETWEEN 'A' AND 'Z' UNION ALL SELECT user_id, user_name FROM app_sysuser WHERE user_id IS NOT NULL AND user_name LIKE '%a%' LIMIT 10; Observation: [] Thought: I now know the final answer. Final Answer: There are no users in the sysuser table. > Finished chain. Process finished with exit code 0 ``` As you can see, agent always use incorrect sql to query the data. Actually, the username field name of my `app_sysuser` table is `user_name`, not the `username` believed by the agent. Optimized agent output ``` > Entering new AgentExecutor chain... Action: list_tables_sql_db Action Input: "" Observation: app_project_users, django_content_type, app_measuringpoint_warning_thresholds, app_bimfaceaccount_bimface_models, django_session, app_sysuser, django_migrations, app_bimfacemodel, app_elementpoint, app_measuringpoint, django_apscheduler_djangojobexecution, auth_permission, app_sysuser_companies, app_measuringpointdata, app_measuringitem, app_sysrole_menus, django_admin_log, auth_group, django_apscheduler_djangojob, app_sysmenu, app_project, app_bimfaceaccount, app_datawarningthreshold, app_datawarningrecord, auth_user_groups, auth_user_user_permissions, auth_group_permissions, auth_user, app_company, app_sysrole Thought: I should query the app_sysuser table to get the usernames of all the users. Action: query_sql_db Action Input: SELECT username FROM app_sysuser Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT username FROM app_sysuser] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should use the schema_sql_db tool to get the correct table fields. 
Action: schema_sql_db Action Input: app_sysuser Observation: CREATE TABLE app_sysuser ( user_id INTEGER NOT NULL AUTO_INCREMENT, user_account VARCHAR(50) COLLATE utf8mb4_general_ci NOT NULL, user_password VARCHAR(255) COLLATE utf8mb4_general_ci NOT NULL, user_name VARCHAR(50) COLLATE utf8mb4_general_ci NOT NULL, user_avatar VARCHAR(255) COLLATE utf8mb4_general_ci, user_email VARCHAR(254) COLLATE utf8mb4_general_ci, user_phone VARCHAR(20) COLLATE utf8mb4_general_ci, user_create_time DATETIME(6) NOT NULL, user_update_time DATETIME(6) NOT NULL, user_status SMALLINT, role_id_id INTEGER, PRIMARY KEY (user_id), CONSTRAINT app_sysuser_role_id_id_51b50f39_fk_app_sysrole_role_id FOREIGN KEY(role_id_id) REFERENCES app_sysrole (role_id) )COLLATE utf8mb4_general_ci DEFAULT CHARSET=utf8mb4 ENGINE=InnoDB /* 3 rows from app_sysuser table: user_id user_account user_password user_name user_avatar user_email user_phone user_create_time user_update_time user_status role_id_id xxxxxxxxxxxxxx */ Thought: I should query the app_sysuser table to get the usernames of all the users. Action: query_sql_db Action Input: SELECT user_account FROM app_sysuser LIMIT 10 Observation: [('baiyun',), ('eatrice',), ('lisi',), ('pingxiang',), ('wangwu',), ('zeeland',), ('zsj',), ('zzw',)] Thought: I now know the final answer Final Answer: The usernames of the users in the sysuser table are baiyun, eatrice, lisi, pingxiang, wangwu, zeeland, zsj, and zzw. > Finished chain. Process finished with exit code 0 ``` I have tested about 10 related prompts and they all work properly, with a much lower error rate compared to before ## Who can review? @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 14:55:56 -07:00
mbchang	ce6dbe41a9	minor refactor GenerativeAgentMemory (#5315 ) # minor refactor of GenerativeAgentMemory - refactor `format_memories_detail` to be more reusable - modified prompts for getting topics for reflection and for generating insights - update `characters.ipynb` to reflect changes ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @vowelparrot @hwchase17 @dev2049	2023-06-03 14:53:14 -07:00
Leonid Ganeline	95c6ed0568	docs: `modules` pages simplified (#5116 ) # docs: modules pages simplified Fixed issue #5627. Merged several repetitive sections in the `modules` pages. Some texts that were hard to understand were also simplified. ## Who can review? @hwchase17 @dev2049	2023-06-03 14:44:32 -07:00
Chandan Routray	bc875a9df1	Fixed multi input prompt for MapReduceChain (#4979 ) # Fixed multi input prompt for MapReduceChain Added `kwargs` support for inner chains of `MapReduceChain` via the `from_params` method. Currently, initialising a `MapReduceChain` with the `from_params` method doesn't work if the prompt has multiple inputs. This happens because it uses `StuffDocumentsChain` and `MapReduceDocumentsChain` underneath, both of which require specifying `document_variable_name` if the `prompt` of their `llm_chain` has more than one `input`. With this PR, I have added support for passing their respective `kwargs` via the `from_params` method. ## Fixes https://github.com/hwchase17/langchain/issues/4752 ## Who can review? @dev2049 @hwchase17 @agola11 --------- Co-authored-by: imeckr <chandanroutray2012@gmail.com>	2023-06-03 14:41:03 -07:00
Matt Robinson	a97e4252e3	feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617 ) # Unstructured Excel Loader Adds an `UnstructuredExcelLoader` class for `.xlsx` and `.xls` files. Works with `unstructured>=0.6.7`. A plain text representation of the Excel file will be available under the `page_content` attribute in the doc. If you use the loader in `"elements"` mode, an HTML representation of the Excel file will be available under the `text_as_html` metadata key. Each sheet in the Excel document is its own document. ### Testing ```python from langchain.document_loaders import UnstructuredExcelLoader loader = UnstructuredExcelLoader( "example_data/stanley-cups.xlsx", mode="elements" ) docs = loader.load() ``` ## Who can review? @hwchase17 @eyurtsev	2023-06-03 12:44:12 -07:00
Leonid Ganeline	9a7488a5ce	fix import issue (#5636 ) # fix for the import issue Added document loader classes from [`figma`, `iugu`, `onedrive_file`] to the `document_loaders/__init__.py` imports. Also sorted `__all__`. Fixed issue #5623.	2023-06-02 14:58:41 -07:00
Zander Chase	20ec1173f4	Update Tracer Auth / Reduce Num Calls (#5517 ) Update the session creation and calls --------- Co-authored-by: Ankush Gola <ankush.gola@gmail.com>	2023-06-02 12:13:56 -07:00
Sean Morgan	949729ff5c	Fix bedrock llm boto3 client instantiation (#5629 ) Same issue as https://github.com/hwchase17/langchain/pull/5574	2023-06-02 12:04:49 -07:00
Caleb Ellington	c5a7a85a4e	fix chroma update_document to embed entire documents, fixes a character-wise embedding bug (#5584 ) # Chroma update_document full document embeddings bugfix Chroma update_document takes a single document, but treats the page_content string of that document as a list when getting the new document embedding. This is a two-fold problem, where the resulting embedding for the updated document is incorrect (it's only an embedding of the first character in the new page_content) and it calls the embedding function for every character in the new page_content string, using many tokens in the process. Fixes #5582 Co-authored-by: Caleb Ellington <calebellington@Calebs-MBP.hsd1.ca.comcast.net>	2023-06-02 11:12:48 -07:00
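The character-wise embedding bug described in the commit above can be reproduced with a small sketch. `CountingEmbeddings` is a hypothetical stand-in for an embedding class with an `embed_documents(texts)` method; it is not Chroma or LangChain code, and its "embedding" is just the text length so the effect is easy to see.

```python
from typing import List


class CountingEmbeddings:
    """Hypothetical embedding function that counts how many texts it embeds."""

    def __init__(self) -> None:
        self.calls = 0

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # One "embedding" per item; a string passed here is iterated per character.
        self.calls += len(texts)
        return [[float(len(text))] for text in texts]


page_content = "hello world"

# Buggy: the string is treated as a list, so each character is embedded
# and only the first character's embedding is kept.
buggy = CountingEmbeddings()
buggy_vector = buggy.embed_documents(page_content)[0]  # type: ignore[arg-type]

# Fixed: wrap the string in a list so the whole document is embedded once.
fixed = CountingEmbeddings()
fixed_vector = fixed.embed_documents([page_content])[0]
```

With a real embedding API, the buggy path would also issue one embedding request per character, which is the token cost the commit message mentions.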
Davis Chase	3c6fa9126a	bump 189 (#5620 )	2023-06-02 09:09:22 -07:00
Davis Chase	d784401215	Dev2049/add argilla callback (#5621 ) Co-authored-by: Alvaro Bartolome <alvarobartt@gmail.com> Co-authored-by: Daniel Vila Suero <daniel@argilla.io> Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com> Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>	2023-06-02 09:05:06 -07:00
Kacper Łukawski	71a7c16ee0	Fix: Qdrant ids (#5515 ) # Fix Qdrant ids creation There has been a bug in how the ids were created in the Qdrant vector store. They were previously calculated based on the texts. However, there are some scenarios in which two documents may have the same piece of text but different metadata, and that's a valid case. Deduplication should be done outside of insertion. It has been fixed and covered with the integration tests. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-02 08:57:34 -07:00
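A minimal sketch of why text-derived ids collide, as described in the commit above. The MD5-based `id_from_text` helper is an assumption chosen for illustration (the message only says the ids were "calculated based on the texts"), and `random_id` is one possible replacement, not necessarily the one used in the fix.

```python
import hashlib
import uuid


def id_from_text(text: str) -> str:
    """Hypothetical text-derived id: identical texts always map to the same id."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def random_id() -> str:
    """Id that does not depend on the document content."""
    return uuid.uuid4().hex


# Two valid documents: same text, different metadata.
documents = [
    ("same text", {"source": "a.txt"}),
    ("same text", {"source": "b.txt"}),
]

text_ids = [id_from_text(text) for text, _ in documents]
random_ids = [random_id() for _ in documents]
```

With text-derived ids the second document would overwrite the first on insertion; with content-independent ids both are stored, and deduplication is left to the caller.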
Jeff Vestal	d1f65d8dc1	Es knn index search 5346 (#5569 ) # Create elastic_vector_search.ElasticKnnSearch class This extends `langchain/vectorstores/elastic_vector_search.py` by adding a new class `ElasticKnnSearch` Features: - Allow creating an index with the `dense_vector` mapping compatible with kNN search - Store embeddings in index for use with kNN search (correct mapping creates HNSW data structure) - Perform approximate kNN search - Perform hybrid BM25 (`query{}`) + kNN (`knn{}`) search - Perform kNN search by either providing a `query_vector` or passing a hosted `model_id` to use query_vector_builder to automatically generate a query_vector at search time Connection options - Using `cloud_id` from Elastic Cloud - Passing elasticsearch client object Search options - query - k - query_vector - model_id - size - source - knn_boost (hybrid search) - query_boost (hybrid search) - fields This also adds examples to `docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb` Fixes # [5346](https://github.com/hwchase17/langchain/issues/5346) cc: @dev2049 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-02 08:40:35 -07:00
Davis Chase	8b3df18bcc	human approval callback (#5581 ) ![Screenshot 2023-06-01 at 2 39 40 PM](https://github.com/hwchase17/langchain/assets/130488702/769f1480-7e51-46d9-bcde-698d0b091803)	2023-06-02 06:59:33 -07:00
Zander Chase	6655f43282	Rm Template Title (#5616 ) Remove the redundant title from the PR template #### Before submitting	2023-06-02 06:54:55 -07:00
Bharat Ramanathan	28d6277396	docs(integration): update colab and external links in WandbTracing docs (#5602 ) # Update Wandb Tracking documentation This PR updates the Wandb Tracking documentation for formatting, updated broken links and colab notebook links --------- Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com>	2023-06-02 02:58:42 -07:00
Waldecir Santos	db45970a66	Fix SQLAlchemy truncating text when it is too big (#5206 ) # Fixes SQLAlchemy truncating the result if you have a big/text column with many chars. SQLAlchemy truncates columns if you try to convert a Row or Sequence to a string directly For comparison: - Before: ```[('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ... (2 characters truncated) ... hat is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]``` - After: ```[('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]``` ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: I'm not sure who to tag for chains, maybe @vowelparrot ?	2023-06-01 21:33:31 -04:00
Davis Chase	4c572ffe95	nit (#5578 )	2023-06-01 14:21:15 -07:00
sseide	001b147450	Documentation fixes (linting and broken links) (#5563 ) # Lint sphinx documentation and fix broken links This PR lints multiple warnings shown in generation of the project documentation (using "make docs_linkcheck" and "make docs_build"). Additionally documentation internal links to (now?) non-existent files are modified to point to existing documents as it seemed the new correct target. The documentation is not updated content wise. There are no source code changes. Fixes # (issue) - broken documentation links to other files within the project - sphinx formatting (linting) ## Before submitting No source code changes, so no new tests added. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 13:06:17 -07:00
Sean Morgan	8441cff1d7	Fix bedrock auth validation (#5574 ) https://github.com/hwchase17/langchain/pull/5523 has a small bug if client was not passed in constructor	2023-06-01 12:35:06 -07:00
Andrew Lei	6258f72a00	Add missing comma in conv chat agent prompt json (#5573 ) # Add missing comma in conversational chat agent prompt json Inspired by: https://github.com/hwchase17/langchainjs/pull/1498	2023-06-01 12:12:44 -07:00
Ikko Eltociear Ashimine	14a611775c	Fix typo in docugami.ipynb (#5571 ) # Fix typo in docugami.ipynb Fixed typo. infromation -> information	2023-06-01 11:45:56 -07:00
Blithe	80b3fdf2f7	make the elasticsearch api support versions below 8.x (#5495 ) The APIs for creating an index and searching in Elasticsearch versions below 8.x differ from those in 8.x, so using an Elasticsearch version below 8.x throws an error. This PR fixes the problem. Co-authored-by: gaofeng27692 <gaofeng27692@hundsun.com>	2023-06-01 10:58:20 -07:00
Davis Chase	6632188606	bump 188 (#5568 )	2023-06-01 08:50:54 -07:00
Davis Chase	6afb463e9b	Qdrant self query (#5567 ) Add self query abilities to qdrant vectorstore	2023-06-01 08:40:31 -07:00
Patrick Keane	47c2ec2d0b	Corrects inconsistently misspelled variable name. (#5559 ) Corrects a spelling error (of the word separator) in several variable names. Three cut/paste instances of this were corrected, amidst instances of it also being named properly, which would likely lead to issues for someone in the future. Here is one such example: ``` seperators = self.get_separators_for_language(Language.PYTHON) super().__init__(separators=seperators, **kwargs) ``` becomes ``` separators = self.get_separators_for_language(Language.PYTHON) super().__init__(separators=separators, **kwargs) ``` Make test results below: ``` ============================== 708 passed, 52 skipped, 27 warnings in 11.70s ============================== ```	2023-06-01 10:27:58 -04:00
Harrison Chase	342b671d05	add brave search util (#5538 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 01:11:51 -07:00
Davis Chase	983a213bdc	add maxcompute (#5533 ) cc @pengwork (fresh branch, no creds)	2023-06-01 00:54:42 -07:00
Bharat Ramanathan	22603d19e0	feat(integrations): Add WandbTracer (#4521 ) # WandbTracer This PR adds the `WandbTracer` and deprecates the existing `WandbCallbackHandler`. Added an example notebook under the docs section alongside the `LangchainTracer` Here's an example [colab](https://colab.research.google.com/drive/1pY13ym8ENEZ8Fh7nA99ILk2GcdUQu0jR?usp=sharing) with the same notebook and the [trace](https://wandb.ai/parambharat/langchain-tracing/runs/8i45cst6) generated from the colab run Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 00:01:19 -07:00
Leonid Ganeline	373ad49157	docs `ecosystem/integrations` update 3 (#5470 ) # docs: `ecosystem_integrations` update 3 Next cycle of updating the `ecosystem/integrations` * Added an integration `template` file * Added missed integration files * Fixed several document_loaders/notebooks ## Who can review? Is it possible to assign somebody to review PRs on docs? Thanks.	2023-05-31 17:54:05 -07:00
Aditi Viswanathan	bc66b3fb8d	make BaseEntityStore inherit from BaseModel (#5478 ) # Make BaseEntityStore inherit from BaseModel This enables initializing InMemoryEntityStore by optionally passing in a value for the store field. ## Who can review? It's a small change so I think any of the reviewers can review, but tagging @dev2049 who seems most relevant since the change relates to Memory.	2023-05-31 17:32:19 -07:00
Sheng Han Lim	3bae595182	Add texts with embeddings to PGVector wrapper (#5500 ) Similar to #1813 for faiss, this PR is to extend functionality to pass text and its vector pair to initialize and add embeddings to the PGVector wrapper. Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @dev2049	2023-05-31 17:31:52 -07:00
Tobias van der Werff	8d07ba0d51	Fix wrong class instantiation in docs MMR example (#5501 ) # Fix wrong class instantiation in docs MMR example When looking at the Maximal Marginal Relevance ExampleSelector example at https://python.langchain.com/en/latest/modules/prompts/example_selectors/examples/mmr.html, I noticed that there seems to be an error. Initially, the `MaxMarginalRelevanceExampleSelector` class is used as an `example_selector` argument to the `FewShotPromptTemplate` class. Then, according to the text, a comparison is made to regular similarity search. However, the `FewShotPromptTemplate` still uses the `MaxMarginalRelevanceExampleSelector` class, so the output is the same. To fix it, I added an instantiation of the `SemanticSimilarityExampleSelector` class, because this seems to be what is intended. ## Who can review? @hwchase17	2023-05-31 17:30:59 -07:00
Taras Tsugrii	b61f50665e	[retrievers][knn] Replace loop appends with list comprehension. (#5529 ) # Replace loop appends with list comprehension. It's much faster, more idiomatic and slightly more readable.	2023-05-31 16:57:24 -07:00
Taras Tsugrii	0ad76c3380	Replace loop appends with list comprehension. (#5528 ) # Replace loop appends with list comprehension. It's significantly faster because it avoids repeated method lookup. It's also more idiomatic and readable.	2023-05-31 16:56:13 -07:00
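A minimal sketch of the rewrite described in the two commits above, using a made-up `squares` example rather than the retriever code that was actually changed.

```python
from typing import Iterable, List


def squares_with_loop(values: Iterable[int]) -> List[int]:
    """Original style: build the list with repeated append calls."""
    result: List[int] = []
    for value in values:
        result.append(value * value)  # attribute lookup + call on every iteration
    return result


def squares_with_comprehension(values: Iterable[int]) -> List[int]:
    """Rewritten style: a single list comprehension."""
    return [value * value for value in values]


loop_result = squares_with_loop(range(5))
comprehension_result = squares_with_comprehension(range(5))
```

Both functions return the same list; any speed difference is best confirmed with `timeit` on the real workload.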
Timothy Ji	bd9e0f3934	Add param requests_kwargs for WebBaseLoader (#5485 ) # Add param `requests_kwargs` for WebBaseLoader Fixes # (issue) #5483 ## Who can review? @eyurtsev	2023-05-31 15:27:38 -07:00
Taras Tsugrii	359fb8fa3a	Replace list comprehension with generator. (#5526 ) # Replace list comprehension with generator. Since these strings can be fairly long, it's best not to construct an unnecessary temporary list just to pass it to `join`. Generators produce items one-by-one and even though they are slightly more expensive than lists in terms of CPU they are much more memory-friendly and slightly more readable.	2023-05-31 15:10:43 -07:00
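The change described in the commit above amounts to dropping the square brackets inside `join`. The sketch below uses a made-up `shout` transformation, not the code that was actually changed.

```python
from typing import Iterable


def shout_with_list(words: Iterable[str]) -> str:
    """Original style: build a temporary list, then join it."""
    return ", ".join([word.upper() for word in words])


def shout_with_generator(words: Iterable[str]) -> str:
    """Rewritten style: pass a generator expression straight to join."""
    return ", ".join(word.upper() for word in words)


list_result = shout_with_list(["a", "b", "c"])
generator_result = shout_with_generator(["a", "b", "c"])
```

The two calls produce identical strings. How much memory the generator form saves depends on the interpreter, so it is worth measuring before relying on it.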
Matt Robinson	4c8aad0d1b	docs: unstructured no longer requires installing detectron2 from source (#5524 ) # Update Unstructured docs to remove the `detectron2` install instructions Removes `detectron2` installation instructions from the Unstructured docs because installing `detectron2` is no longer required for `unstructured>=0.7.0`. The `detectron2` model now runs using the ONNX runtime. ## Who can review? @hwchase17 @eyurtsev	2023-05-31 15:03:21 -07:00
Rithwik Ediga Lakhamsani	d765d77e9b	Add minor fixes for PySpark Document Loader Docs (#5525 ) # Add minor fixes for PySpark Document Loader Docs Renamed "PySpack" to "PySpark" and executed the notebook to show outputs.	2023-05-31 15:02:57 -07:00
Taras Tsugrii	af41cdfc8b	Replace enumerate with zip. (#5527 ) # Replace enumerate with zip. It's more idiomatic and slightly more readable.	2023-05-31 15:02:23 -07:00
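A minimal sketch of the `enumerate`-to-`zip` rewrite named in the commit above, assuming the common case of walking two parallel lists. The function and variable names are illustrative only.

```python
from typing import Dict, List, Tuple


def pair_with_enumerate(
    texts: List[str], metadatas: List[Dict[str, str]]
) -> List[Tuple[str, Dict[str, str]]]:
    """Original style: index into the second list by position."""
    return [(text, metadatas[i]) for i, text in enumerate(texts)]


def pair_with_zip(
    texts: List[str], metadatas: List[Dict[str, str]]
) -> List[Tuple[str, Dict[str, str]]]:
    """Rewritten style: iterate both lists in lockstep."""
    return list(zip(texts, metadatas))


texts = ["first", "second"]
metadatas = [{"source": "a"}, {"source": "b"}]
enumerate_pairs = pair_with_enumerate(texts, metadatas)
zip_pairs = pair_with_zip(texts, metadatas)
```

One behavioural difference to keep in mind: when `metadatas` is shorter than `texts`, indexing raises `IndexError`, while `zip` silently stops at the shorter input.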
James O'Dwyer	226a7521ed	Add Managed Motorhead (#5507 ) # Add Managed Motorhead This change enables MotorheadMemory to utilize Metal's managed version of Motorhead. We can easily enable this by passing in an `api_key` and `client_id` in order to hit the managed url and access the memory api on Metal. Twitter: [@softboyjimbo](https://twitter.com/softboyjimbo) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049 @hwchase17 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 14:55:41 -07:00
Piyush Jain	5ffa924488	Skips creating boto client for Bedrock if passed in constructor (#5523 ) # Skips creating boto client if passed in constructor Current LLM and Embeddings class always creates a new boto client, even if one is passed in a constructor. This blocks certain users from passing in externally created boto clients, for example in SSO authentication. ## Who can review? @hwchase17 @jasondotparse @rsgrewal-aws <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-31 14:54:12 -07:00
Leonid Ganeline	6b47aaab82	added DeepLearing.AI course link (#5518 ) # added DeepLearing.AI course link ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: not @hwchase17 - hehe	2023-05-31 14:53:14 -07:00
Víctor Navarro Aránguiz	f39340ff6b	Add allow_download as attribute for GPT4All (#5512 ) # Added support for downloading the GPT4All model if it does not exist I've included the class attribute `allow_download` in the GPT4All class. By default, `allow_download` is set to False. ## Changes Made - Added a new attribute `allow_download` to the GPT4All class. - Updated the `validate_environment` method to pass the `allow_download` parameter to the GPT4All model constructor. ## Context This change provides more control over model downloading in the GPT4All class. Previously, if the model file was not found in the cache directory `~/.cache/gpt4all/`, the package returned the error "Failed to retrieve model (type=value_error)". Now, if `allow_download` is set to True, it will use the GPT4All package to download it. With the addition of the `allow_download` attribute, users can now choose whether the wrapper is allowed to download the model or not. ## Dependencies There are no new dependencies introduced by this change. It only utilizes existing functionality provided by the GPT4All package. ## Testing Since this is a minor change to the existing behavior, the existing test suite for the GPT4All package should cover this scenario. Co-authored-by: Vokturz <victornavarrrokp47@gmail.com>	2023-05-31 13:32:31 -07:00
Zander Chase	ea09c0846f	Add Feedback Methods + Evaluation examples (#5166 ) Add CRUD methods to interact with feedback endpoints + added eval examples to the notebook	2023-05-31 11:14:27 -07:00
Davis Chase	46b7181f13	bump 187 (#5504 )	2023-05-31 07:35:09 -07:00
Harrison Chase	f0ea77b230	add more vars to text splitter (#5503 )	2023-05-31 07:21:20 -07:00
Piyush Jain	562fdfc8f9	Bedrock llm and embeddings (#5464 ) # Bedrock LLM and Embeddings This PR adds a new LLM and an Embeddings class for the [Bedrock](https://aws.amazon.com/bedrock) service. The PR also includes example notebooks for using the LLM class in a conversation chain and embeddings usage in creating an embedding for a query and document. Note: AWS is doing a private release of the Bedrock service on 05/31/2023; users need to request access and be added to an allowlist in order to start using the Bedrock models and embeddings. Please use the [Bedrock Home Page](https://aws.amazon.com/bedrock) to request access and to learn more about the models available in Bedrock.	2023-05-31 07:17:01 -07:00
Harrison Chase	5ce74b5958	code splitter docs (#5480 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 07:11:53 -07:00
Harrison Chase	470b2822a3	Add matching engine vectorstore (#3350 ) Co-authored-by: Tom Piaggio <tomaspiaggio@google.com> Co-authored-by: scafati98 <jupyter@matchingengine.us-central1-a.c.scafati-joonix.internal> Co-authored-by: scafati98 <scafatieugenio@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 02:28:02 -07:00
Kacper Łukawski	8bcaca435a	Feature: Qdrant filters supports (#5446 ) # Support Qdrant filters Qdrant has an [extensive filtering system](https://qdrant.tech/documentation/concepts/filtering/) with rich type support. This PR makes it possible to use the filters in Langchain by passing an additional param to both the `similarity_search_with_score` and `similarity_search` methods. ## Who can review? @dev2049 @hwchase17 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 02:26:16 -07:00
Harrison Chase	f72bb966f8	Harrison/html splitter (#5468 ) Co-authored-by: David Revillas <26328973+r3v1@users.noreply.github.com>	2023-05-30 21:06:07 -07:00
Ankush Gola	1671c2afb2	py tracer fixes (#5377 )	2023-05-30 18:47:06 -07:00
Jose Ignacio Hervás Díaz	ce8b7a2a69	SQLite-backed Entity Memory (#5129 ) # SQLite-backed Entity Memory Following the initiative of https://github.com/hwchase17/langchain/pull/2397 I think it would be helpful to be able to persist Entity Memory on disk by default Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 18:39:47 -07:00
Jeff Vestal	46e181aa8b	Allow ElasticsearchEmbeddings to create a connection with ES Client object (#5321 ) This PR adds a new method `from_es_connection` to the `ElasticsearchEmbeddings` class allowing users to use Elasticsearch clusters outside of Elastic Cloud. Users can create an Elasticsearch Client object and pass that to the new function. The returned object is identical to the one returned by calling `from_credentials`
```
# Create Elasticsearch connection
es_connection = Elasticsearch(
    hosts=['https://es_cluster_url:port'],
    basic_auth=('user', 'password')
)

# Instantiate ElasticsearchEmbeddings using es_connection
embeddings = ElasticsearchEmbeddings.from_es_connection(
    model_id,
    es_connection,
)
```
I also added examples to the elasticsearch jupyter notebook Fixes # https://github.com/hwchase17/langchain/issues/5239 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 17:26:30 -07:00
Mark Pors	0a44bfdca3	Allow for async use of SelfAskWithSearchChain (#5394 ) # Allow for async use of SelfAskWithSearchChain Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 17:02:39 -07:00
Víctor Navarro Aránguiz	8121e04200	added n_threads functionality for gpt4all (#5427 ) # Added support for modifying the number of threads in the GPT4All model I have added the capability to modify the number of threads used by the GPT4All model. This allows users to adjust the model's parallel processing capabilities based on their specific requirements. ## Changes Made - Updated the `validate_environment` method to set the number of threads for the GPT4All model using the `values["n_threads"]` parameter from the `GPT4All` class constructor. ## Context Useful in scenarios where users want to optimize the model's performance by leveraging multi-threading capabilities. Please note that the `n_threads` parameter was included in the `GPT4All` class constructor but was previously unused. This change ensures that the specified number of threads is utilized by the model. ## Dependencies There are no new dependencies introduced by this change. It only utilizes existing functionality provided by the GPT4All package. ## Testing Since this is a minor change, testing is not required. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:31:30 -07:00
Blithe	e31705b5ab	convert the parameter 'text' to uppercase in the function 'parse' of the class BooleanOutputParser (#5397 ) When the LLM outputs 'yes|no', BooleanOutputParser can now parse it to 'True|False'; this fixes the ValueError in parse(). Fixes # (issue) #5396 https://github.com/hwchase17/langchain/issues/5396 --------- Co-authored-by: gaofeng27692 <gaofeng27692@hundsun.com>	2023-05-30 16:26:17 -07:00
Natalie	199cc700a3	Ability to specify credentials wihen using Google BigQuery as a data loader (#5466 ) # Adds ability to specify credentials when using Google BigQuery as a data loader Fixes #5465 . Adds ability to set credentials which must be of the `google.auth.credentials.Credentials` type. This argument is optional and will default to `None`. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:25:22 -07:00
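The case-normalization idea in #5397 could look roughly like the sketch below. `parse_boolean` and its default values are hypothetical stand-ins, not the actual `BooleanOutputParser` implementation.

```python
def parse_boolean(text: str, true_val: str = "YES", false_val: str = "NO") -> bool:
    """Map a yes/no style answer to a bool, ignoring case and surrounding whitespace."""
    # Uppercasing first is the point of the fix: 'yes', 'Yes' and 'YES' all match.
    cleaned = text.strip().upper()
    if cleaned not in (true_val, false_val):
        raise ValueError(
            f"Expected output to be {true_val} or {false_val}, received {cleaned!r}."
        )
    return cleaned == true_val
```

Without the `upper()` call, a lowercase `yes` would fall through to the `ValueError` branch, which matches the failure described in the commit message.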
Harrison Chase	eab4b4ccd7	add simple test for imports (#5461 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:24:27 -07:00
Janos Tolgyesi	1111f18eb4	Add maximal relevance search to SKLearnVectorStore (#5430 ) # Add maximal relevance search to SKLearnVectorStore This PR implements the maximum relevance search in SKLearnVectorStore. Twitter handle: jtolgyesi (I submitted also the original implementation of SKLearnVectorStore) ## Before submitting Unit tests are included. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:13:33 -07:00
Ayan Bandyopadhyay	8181f9e362	Update psychicapi version (#5471 ) Update [psychicapi](https://pypi.org/project/psychicapi/) python package dependency to the latest version 0.5. The newest python package version addresses breaking changes in the Psychic http api.	2023-05-30 15:55:22 -07:00
Kacper Łukawski	f93d256190	Feat: Add batching to Qdrant (#5443 ) # Add batching to Qdrant Several people requested a batching mechanism while uploading data to Qdrant. It is important, as there are some limits for the maximum size of the request payload, and without batching implemented in Langchain, users need to implement it on their own. This PR exposes a new optional `batch_size` parameter, so all the documents/texts are loaded in batches of the expected size (64, by default). The integration tests of Qdrant are extended to cover two cases: 1. Documents are sent in separate batches. 2. All the documents are sent in a single request.	2023-05-30 15:33:54 -07:00
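The batching behaviour described in #5443 can be sketched independently of Qdrant. The helper below is an assumption for illustration; only the default of 64 comes from the commit message.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int = 64) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls up to batch_size items without loading the rest.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Each yielded batch would then correspond to one upload request, which keeps every payload under the size limit mentioned above.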
Camille Van Hoffelen	80e133f16d	Added async _acall to FakeListLLM (#5439 ) # Added Async _acall to FakeListLLM FakeListLLM is handy when unit testing apps built with langchain. This allows the use of FakeListLLM inside concurrent code with [asyncio](https://docs.python.org/3/library/asyncio.html). I also changed the pydocstring which was out of date. ## Who can review? @hwchase17 - project lead @agola11 - async	2023-05-30 14:34:36 -07:00
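A minimal sketch of the idea behind #5439, assuming a fake model that returns canned responses in order. `FakeListModel`, `call`, and `acall` are hypothetical names and do not reproduce the LangChain `FakeListLLM` class.

```python
import asyncio
from typing import List


class FakeListModel:
    """Return pre-set responses in order, for use in unit tests."""

    def __init__(self, responses: List[str]) -> None:
        self.responses = responses
        self.index = 0

    def call(self, prompt: str) -> str:
        response = self.responses[self.index]
        self.index += 1
        return response

    async def acall(self, prompt: str) -> str:
        # The async variant reuses the sync logic; no real I/O happens.
        return self.call(prompt)


async def ask_twice(model: FakeListModel) -> List[str]:
    first = await model.acall("first question")
    second = await model.acall("second question")
    return [first, second]
```

Having an awaitable method lets the fake be dropped into code paths that are driven by `asyncio` without any network access.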
Leonid Ganeline	1f11f80641	docs: cleaning (#5413 ) # docs cleaning Changed docs to consistent format (probably, we need an official doc integration template): - ClearML - added product descriptions; changed title/headers - Rebuff - added product descriptions; changed title/headers - WhyLabs - added product descriptions; changed title/headers - Docugami - changed title/headers/structure - Airbyte - fixed title - Wolfram Alpha - added descriptions, fixed title - OpenWeatherMap - added product descriptions; changed title/headers - Unstructured - changed description ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 @dev2049	2023-05-30 13:58:16 -07:00
Matt Wells	1d861dc37a	MRKL output parser no longer breaks well formed queries (#5432 ) # Handles the edge scenario in which the action input is a well formed SQL query which ends with a quoted column There may be a cleaner option here (or indeed other edge scenarios) but this seems to robustly determine if the action input is likely to be a well formed SQL query in which we don't want to arbitrarily trim off `"` characters Fixes #5423 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Agents / Tools / Toolkits - @vowelparrot	2023-05-30 15:58:47 -04:00
Yoann Poupart	c1807d8408	`encoding_kwargs` for InstructEmbeddings (#5450 ) # What does this PR do? Bring support of `encode_kwargs` for `HuggingFaceInstructEmbeddings`, change the docstring example and add a test to illustrate with `normalize_embeddings`. Fixes #3605 (Similar to #3914) Use case:
```python
from langchain.embeddings import HuggingFaceInstructEmbeddings

model_name = "hkunlp/instructor-large"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}
hf = HuggingFaceInstructEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)
```
	2023-05-30 11:57:04 -07:00
Patrick Keane	e09afb4b44	Removes duplicated call from langchain/client/langchain.py (#5449 ) This removes duplicate code presumably introduced by a cut-and-paste error, spotted while reviewing the code in `langchain/client/langchain.py`. The original code had back-to-back occurrences of the following code block:
```
response = self._get(
    path,
    params=params,
)
raise_for_status_with_text(response)
```
	2023-05-30 11:52:46 -07:00
Jan Brinkmann	0d3a9d481f	Fixed docstring in faiss.py for load_local (#5440 ) # Fix for docstring in faiss.py vectorstore (load_local) The docstring should reflect that load_local loads something FROM the disk.	2023-05-30 11:41:00 -07:00
Davis Chase	4379bd4cbb	bump 186 (#5459 )	2023-05-30 10:47:59 -07:00
Davis Chase	2649b638dd	fix (#5457 )	2023-05-30 10:42:20 -07:00
Davis Chase	64b4165c8d	bump 185 (#5442 )	2023-05-30 08:08:11 -07:00
ByronHsu	9d658aaa5a	Add more code splitters (go, rst, js, java, cpp, scala, ruby, php, swift, rust) (#5171 ) As the title says, I added more code splitters. The implementation is trivial, so I didn't add separate tests for each splitter. Let me know if there are any concerns. Fixes # (issue) https://github.com/hwchase17/langchain/issues/5170 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev @hwchase17 --------- Signed-off-by: byhsu <byhsu@linkedin.com> Co-authored-by: byhsu <byhsu@linkedin.com>	2023-05-30 11:04:05 -04:00
Paul-Emile Brotons	a61b7f7e7c	adding MongoDBAtlasVectorSearch (#5338 ) # Add MongoDBAtlasVectorSearch for the python library Fixes #5337 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 07:59:01 -07:00
Harrison Chase	c4b502a470	Harrison/condense q llm (#5438 )	2023-05-30 07:15:37 -07:00
Lei Xu	ee57054d05	Rename and fix typo in lancedb (#5425 ) # Fix typo in LanceDB notebook filename	2023-05-30 00:24:17 -07:00
Zander Chase	26ff18575c	Set old LCTracer to default to port 8000 (#5381 ) Issue from: https://discord.com/channels/1038097195422978059/1069478035918688346/1112445980466483222	2023-05-29 22:42:53 -07:00
Harrison Chase	760632b292	Harrison/spark reader (#5405 ) Co-authored-by: Rithwik Ediga Lakhamsani <rithwik.ediga@databricks.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:23:17 -07:00
UmerHA	8259f9b7fa	DocumentLoader for GitHub (#5408 ) # Creates GitHubLoader (#5257) GitHubLoader is a DocumentLoader that loads issues and PRs from GitHub. Fixes #5257 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:11:21 -07:00
German Martin	0b3e0dd1d2	New Trello document loader (#4767 ) # Added New Trello loader class and documentation Simple Loader on top of py-trello wrapper. With a board name you can pull cards and do some field parameter tweaks on the load operation. I included documentation and examples. Included unit test cases using patch and a fixture for py-trello client class. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 19:47:56 -07:00
Harrison Chase	72f99ff953	Harrison/text splitter (#5417 ) adds support for keeping separators around when using recursive text splitter	2023-05-29 16:56:31 -07:00
小铭	cf5803e44c	Add ToolException that a tool can throw. (#5050 ) # Add ToolException that a tool can throw This is an optional exception that a tool throws when an execution error occurs. When this exception is thrown, the agent will not stop working, but will handle the exception according to the `handle_tool_error` variable of the tool, and the processing result will be returned to the agent as an observation and printed in pink on the console. It can be used like this:
```python
from langchain.schema import ToolException
from langchain import LLMMathChain, SerpAPIWrapper, OpenAI
from langchain.agents import AgentType, initialize_agent
from langchain.chat_models import ChatOpenAI
from langchain.tools import BaseTool, StructuredTool, Tool, tool

llm = ChatOpenAI(temperature=0)
llm_math_chain = LLMMathChain(llm=llm, verbose=True)


class Error_tool:
    def run(self, s: str):
        raise ToolException('The current search tool is not available.')


def handle_tool_error(error) -> str:
    return "The following errors occurred during tool execution:" + str(error)


search_tool1 = Error_tool()
search_tool2 = SerpAPIWrapper()
tools = [
    Tool.from_function(
        func=search_tool1.run,
        name="Search_tool1",
        description="useful for when you need to answer questions about current events.You should give priority to using it.",
        handle_tool_error=handle_tool_error,
    ),
    Tool.from_function(
        func=search_tool2.run,
        name="Search_tool2",
        description="useful for when you need to answer questions about current events",
        return_direct=True,
    ),
]
agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_tool_errors=handle_tool_error,
)
agent.run("Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?")
```
![image](https://github.com/hwchase17/langchain/assets/32786500/51930410-b26e-4f85-a1e1-e6a6fb450ada) ## Who can review? - @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:05:58 +00:00
Harrison Chase	cce731c3c2	bump version 184 (#5407 )	2023-05-29 07:53:32 -07:00
Harrison Chase	2da8c48be1	Harrison/datetime parser (#4693 ) Co-authored-by: Jacob Valdez <jacobfv@msn.com> Co-authored-by: Jacob Valdez <jacob.valdez@limboid.ai> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-05-29 07:52:30 -07:00
Leonid Ganeline	1837caa70d	docs: `ecosystem/integrations` update 1 (#5219 ) # docs: ecosystem/integrations update It is the first in a series of `ecosystem/integrations` updates. The ecosystem/integrations list is missing many integrations. I'm adding the missing integrations in a consistent format: 1. description of the integrated system 2. `Installation and Setup` section with `pip install ...`, Key setup, and other necessary settings 3. Sections like `LLM`, `Text Embedding Models`, `Chat Models`... with links to the corresponding examples and imports of the used classes. This PR covers new docs that are present in `docs/modules/models/text_embedding/examples` but missing from `ecosystem/integrations`. The next PRs will cover the next example sections. Also updated `integrations.rst`: added the `Dependencies` section with a link to the packages used in LangChain. ## Who can review? @hwchase17 @eyurtsev @dev2049	2023-05-29 07:25:17 -07:00
Leonid Ganeline	a3598193a0	docs: `ecosystem/integrations` update 2 (#5282 ) # docs: ecosystem/integrations update 2 #5219 - part 1 The second part of this update (parts are independent of each other! no overlap): - added diffbot.md - updated confluence.ipynb; added confluence.md - updated college_confidential.md - updated openai.md - added blackboard.md - added bilibili.md - added azure_blob_storage.md - added azlyrics.md - added aws_s3.md ## Who can review? @hwchase17 @agola11 @vowelparrot @dev2049	2023-05-29 07:19:43 -07:00
Eduard van Valkenburg	ccb6238de1	Implemented appending arbitrary messages (#5293 ) # Implemented appending arbitrary messages to the base chat message history, the in-memory and cosmos ones. As discussed, this is the alternative to #4480, with an `add_message` method added that takes a BaseMessage as input, so that the user can control what is in the base message, such as kwargs. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-29 07:18:59 -07:00
Harrison Chase	d6fb25c439	Harrison/prediction guard update (#5404 ) Co-authored-by: Daniel Whitenack <whitenack.daniel@gmail.com>	2023-05-29 07:14:59 -07:00
Harrison Chase	416c8b1da3	Harrison/deep infra (#5403 ) Co-authored-by: Yessen Kanapin <yessenzhar@gmail.com> Co-authored-by: Yessen Kanapin <yessen@deepinfra.com>	2023-05-29 07:10:50 -07:00
Timothy Ji	100d6655df	Reformat openai proxy setting as code (#5330 ) # Reformat the openai proxy setting as code Only affect the doc for openai Model - @hwchase17 - @agola11	2023-05-29 07:02:47 -07:00
Justin Flick	c09f8e4ddc	Add pagination for Vertex AI embeddings (#5325 ) Fixes #5316 --------- Co-authored-by: Justin Flick <jflick@homesite.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-29 06:57:41 -07:00
Harrison Chase	3e16468423	Harrison/llamacpp (#5402 ) Co-authored-by: Gavin S <gavinswanson@gmail.com>	2023-05-29 06:44:58 -07:00
Chandan Routray	642ae83d86	Removed deprecated llm attribute for load_chain (#5343 ) # Removed deprecated llm attribute for load_chain Currently `load_chain` for some chain types expect `llm` attribute to be present but `llm` is deprecated attribute for those chains and might not be persisted during their `chain.save`. Fixes #5224 [(issue)](https://github.com/hwchase17/langchain/issues/5224) ## Who can review? @hwchase17 @dev2049 --------- Co-authored-by: imeckr <chandanroutray2012@gmail.com>	2023-05-29 06:44:47 -07:00
Oleh Kuznetsov	f6615cac41	Update llamacpp demonstration notebook (#5344 ) # Update llamacpp demonstration notebook Add instructions to install with BLAS backend, and update the example of model usage. Fixes #5071. However, it is more like a prevention of similar issues in the future, not a fix, since there was no problem in the framework functionality ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @hwchase17 - @agola11	2023-05-29 06:43:26 -07:00
Martin Holecek	44b48d9518	Fix update_document function, add test and documentation. (#5359 ) # Fix for `update_document` Function in Chroma ## Summary This pull request addresses an issue with the `update_document` function in the Chroma class, as described in [#5031](https://github.com/hwchase17/langchain/issues/5031#issuecomment-1562577947). The issue was identified as an `AttributeError` raised when calling `update_document` due to a missing corresponding method in the `Collection` object. This fix refactors the `update_document` method in `Chroma` to correctly interact with the `Collection` object. ## Changes 1. Fixed the `update_document` method in the `Chroma` class to correctly call methods on the `Collection` object. 2. Added the corresponding test `test_chroma_update_document` in `tests/integration_tests/vectorstores/test_chroma.py` to reflect the updated method call. 3. Added an example and explanation of how to use the `update_document` function in the Jupyter notebook tutorial for Chroma. ## Test Plan All existing tests pass after this change. In addition, the `test_chroma_update_document` test case now correctly checks the functionality of `update_document`, ensuring that the function works as expected and updates the content of documents correctly. ## Reviewers @dev2049 This fix will ensure that users are able to use the `update_document` function as expected, without encountering the previous `AttributeError`. This will enhance the usability and reliability of the Chroma class for all users. Thank you for considering this pull request. I look forward to your feedback and suggestions.	2023-05-29 06:39:25 -07:00
Louis Amaudruz	e455ba4ed5	Add async support to routing chains (#5373 ) # Add async support for (LLM) routing chains Add asynchronous LLM calls support for the routing chains. More specifically: - Add async `aroute` function (i.e. async version of `route`) to the `RouterChain` which calls the routing LLM asynchronously - Implement the async `_acall` for the `LLMRouterChain` - Implement the async `_acall` function for `MultiRouteChain` which first calls asynchronously the routing chain with its new `aroute` function, and then calls asynchronously the relevant destination chain. ## Who can review? - @agola11	2023-05-29 06:37:26 -07:00
Gael Grosch	8b7721ebbb	fix: Blob.from_data mimetype is lost (#5395 ) # Fix lost mimetype when using Blob.from_data method The mimetype is lost due to a typo in the class attribute name Fixes # - (no issue opened but I can open one if needed) ## Changes * Fixed typo in name * Added unit-tests to validate the output Blob ## Review @eyurtsev	2023-05-29 06:36:50 -07:00
Jacob Lee	f77f27163d	Update PR template with Twitter handle request (#5382 ) # Updates PR template to request Twitter handle for shoutouts! Makes it easier for maintainers to show their appreciation 😄	2023-05-29 06:23:17 -07:00
Zander Chase	14099f1b93	Use Default Factory (#5380 ) We shouldn't be calling a constructor for a default value - should use default_factory instead. This is especially bad in this case since it requires an optional dependency and an API key to be set. Resolves #5361	2023-05-29 06:22:35 -07:00
Harrison Chase	6df90ad9fd	handle json parsing errors (#5371 ) adds test cases, consolidates a lot of PRs	2023-05-29 06:18:19 -07:00
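The `default_factory` point in #5380 can be sketched with the standard library. This uses `dataclasses` as a stand-in; the commit itself does not show which field helper was changed, and `ExpensiveClient` and `ToolConfig` are hypothetical names.

```python
from dataclasses import dataclass, field


class ExpensiveClient:
    """Stand-in for a client that needs an optional dependency and an API key."""

    instances_created = 0

    def __init__(self) -> None:
        ExpensiveClient.instances_created += 1


@dataclass
class ToolConfig:
    # default_factory defers construction until a ToolConfig is instantiated,
    # so merely importing or defining the class never builds a client.
    client: ExpensiveClient = field(default_factory=ExpensiveClient)
```

Writing `client: ExpensiveClient = ExpensiveClient()` instead would run the constructor once at class-definition time, which is the behaviour the commit message argues against.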
玄猫	99a1e3f3a3	Fix: Handle empty documents in ContextualCompressionRetriever (Issue #5304 ) (#5306 ) # Fix: Handle empty documents in ContextualCompressionRetriever (Issue #5304) Fixes #5304 Prevent cohere.error.CohereAPIError caused by an empty list of documents by adding a condition to check if the input documents list is empty in the compress_documents method. If the list is empty, return an empty list immediately, avoiding the error and unnecessary processing. @dev2049 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-28 13:19:34 -07:00
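The guard described in #5306 amounts to an early return. The sketch below is an assumption about its shape; `compress_documents` here takes a hypothetical `rerank` callable rather than the real Cohere client.

```python
from typing import Callable, List, Sequence


def compress_documents(
    documents: Sequence[str],
    query: str,
    rerank: Callable[[Sequence[str], str], List[str]],
) -> List[str]:
    """Return reranked documents, skipping the backend call for empty input."""
    if not documents:
        # Avoids sending an empty list to a backend that rejects it.
        return []
    return rerank(documents, query)
```

The early return also saves a network round trip when there is nothing to compress.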
os1ma	1366d070fc	Add path validation to DirectoryLoader (#5327 ) # Add path validation to DirectoryLoader This PR introduces a minor adjustment to the DirectoryLoader by adding validation for the path argument. Previously, if the provided path didn't exist or wasn't a directory, DirectoryLoader would return an empty document list due to the behavior of the `glob` method. This could potentially cause confusion for users, as they might expect a file-loading error instead. So, I've added two validations to the load method of the DirectoryLoader: - Raise a FileNotFoundError if the provided path does not exist - Raise a ValueError if the provided path is not a directory Due to the relatively small scope of these changes, a new issue was not created. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev	2023-05-28 15:31:23 -04:00
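The two checks listed in #5327 can be sketched with `pathlib`. `validate_directory` is a hypothetical helper, not the actual `DirectoryLoader.load` code, and the error messages are illustrative.

```python
from pathlib import Path
from typing import Union


def validate_directory(path: Union[str, Path]) -> Path:
    """Raise early instead of silently globbing over a bad path."""
    p = Path(path)
    if not p.exists():
        raise FileNotFoundError(f"Directory not found: '{path}'")
    if not p.is_dir():
        raise ValueError(f"Expected directory, got file: '{path}'")
    return p
```

Failing fast here replaces the earlier behaviour, where a wrong path produced an empty document list.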
Harrison Chase	ad7f4c0317	bump to 183 (#5372 )	2023-05-28 11:42:58 -07:00
Harrison Chase	b6927970f1	revert bad json (#5370 )	2023-05-28 10:22:02 -07:00
Matt Wells	9a5c9df809	Fixes iter error in FAISS add_embeddings call (#5367 ) # Remove re-use of iter within add_embeddings causing error As reported in https://github.com/hwchase17/langchain/issues/5336 there is an issue currently involving the attempted re-use of an iterator within the FAISS vectorstore adapter Fixes # https://github.com/hwchase17/langchain/issues/5336 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	2023-05-28 09:59:30 -07:00
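The class of bug named in #5367 can be reproduced in a few lines. This is a generic sketch of iterator re-use, not the FAISS adapter code; both function names are hypothetical.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[str, List[float]]


def split_pairs_buggy(text_embeddings: Iterable[Pair]):
    # The first pass exhausts a one-shot iterator...
    texts = [text for text, _ in text_embeddings]
    # ...so the second pass sees nothing when the input is not a list.
    embeddings = [embedding for _, embedding in text_embeddings]
    return texts, embeddings


def split_pairs_fixed(text_embeddings: Iterable[Pair]):
    # Materialize once, then iterate over the list as often as needed.
    pairs = list(text_embeddings)
    texts = [text for text, _ in pairs]
    embeddings = [embedding for _, embedding in pairs]
    return texts, embeddings
```

The buggy version works for lists and fails only for generators or `zip` objects, which is why such errors tend to surface late.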
Davis Chase	b705f260f4	bump 182 (#5364 )	2023-05-28 09:16:18 -07:00
Janos Tolgyesi	5f4552391f	Add SKLearnVectorStore (#5305 ) # Add SKLearnVectorStore This PR adds SKLearnVectorStore, a simple vector store based on NearestNeighbors implementations in the scikit-learn package. This provides a simple drop-in vector store implementation with minimal dependencies (scikit-learn is typically installed in a data scientist / ml engineer environment). The vector store can be persisted and loaded from json, bson and parquet format. SKLearnVectorStore has soft (dynamic) dependency on the scikit-learn, numpy and pandas packages. Persisting to bson requires the bson package, persisting to parquet requires the pyarrow package. ## Before submitting Integration tests are provided under `tests/integration_tests/vectorstores/test_sklearn.py` Sample usage notebook is provided under `docs/modules/indexes/vectorstores/examples/sklear.ipynb` Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-28 08:17:42 -07:00
Aymen Furter	e2742953a6	feat: support for shopping search in SerpApi (#5259 ) # Support for shopping search in SerpApi ## Who can review? @vowelparrot	2023-05-27 21:20:24 -07:00
Eduard van Valkenburg	1daa7068b2	added cosmos kwargs option (#5292 ) # Added the ability to pass kwargs to cosmos client constructor The cosmos client has a ton of options that can be set, so allowing those to be passed to the constructor from the chat memory constructor with this PR.	2023-05-27 21:19:40 -07:00
Kenton	881dfe8179	Sample Notebook for DynamoDB Chat Message History (#5351 ) # Sample Notebook for DynamoDB Chat Message History @dev2049 Adding a sample notebook for the DynamoDB Chat Message History class.	2023-05-27 21:16:24 -07:00
mbchang	f079cdf479	fix: remove empty lines that cause InvalidRequestError (#5320 ) # remove empty lines in GenerativeAgentMemory that cause InvalidRequestError in OpenAIEmbeddings Let's say the text given to `GenerativeAgent._parse_list` is
```
text = """
Insight 1: <insight 1>

Insight 2: <insight 2>
"""
```
This creates an `openai.error.InvalidRequestError: [''] is not valid under any of the given schemas - 'input'` because `GenerativeAgent.add_memory()` tries to add an empty string to the vectorstore. This PR fixes the issue by removing the empty line between `Insight 1` and `Insight 2` ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 @vowelparrot @dev2049	2023-05-27 21:15:03 -07:00
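One way to drop the empty entries described in #5320 is to filter blank lines while splitting. `parse_list` below is a hypothetical sketch and may differ from the actual `_parse_list` change.

```python
from typing import List


def parse_list(text: str) -> List[str]:
    """Split text into lines, discarding blank ones so no empty string is stored."""
    stripped_lines = (line.strip() for line in text.strip().split("\n"))
    return [line for line in stripped_lines if line]
```

With the blank line filtered out, nothing empty reaches the embedding call that raised the error quoted above.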
Deepak S V	c6e5d90eff	Fixing blank thoughts in verbose for "_Exception" Action (#5331 ) Fixed the issue of blank Thoughts being printed in verbose when `handle_parsing_errors=True`, as below: Before Fix: ``` Observation: There are 38175 accounts available in the dataframe. Thought: Observation: Invalid or incomplete response Thought: Observation: Invalid or incomplete response Thought: ``` After Fix: ``` Observation: There are 38175 accounts available in the dataframe. Thought:AI: { "action": "Final Answer", "action_input": "There are 38175 accounts available in the dataframe." } Observation: Invalid Action or Action Input format Thought:AI: { "action": "Final Answer", "action_input": "The number of available accounts is 38175." } Observation: Invalid Action or Action Input format ``` @vowelparrot currently I have set the colour of thought to green (same as the colour when `handle_parsing_errors=False`). If you want to change the colour of this "_Exception" case to red or something else (when `handle_parsing_errors=True`), feel free to change it in line 789.	2023-05-27 21:14:16 -07:00
DanConstantini	c49c6ac97a	Add Chainlit to deployment options (#5314 ) # Add Chainlit to deployment options Add [Chainlit](https://github.com/Chainlit/chainlit) as deployment options Used links to Github examples and Chainlit doc on the LangChain integration Co-authored-by: Dan Constantini <danconstantini@Dan-Constantini-MacBook.local>	2023-05-27 21:12:53 -07:00
Harrison Chase	5292e855c0	add enum output parser (#5165 )	2023-05-27 20:59:24 -07:00
Harrison Chase	179ddbe88b	add enum output parser (#5165 )	2023-05-27 20:58:23 -07:00
Leonid Ganeline	465a970724	docs: added link to LangChain Handbook (#5311 ) # added a link to LangChain Handbook ## Who can review? Community members can review the PR once tests pass.	2023-05-27 20:57:40 -07:00
Russ	6e974b5f04	Fix typos (#5323 ) # Documentation typo fixes Fixes # (issue) Simple typos in the blockchain .ipynb documentation	2023-05-26 18:55:21 -07:00
Michael Landis	f75f0dbad6	docs: improve flow of llm caching notebook (#5309 ) # docs: improve flow of llm caching notebook The notebook `llm_caching` demos various caching providers. In the previous version, there was setup common to all examples but under the `In Memory Caching` heading. If a user comes and only wants to try a particular example, they will run the common setup, then the cells for the specific provider they are interested in. Then they will get import and variable reference errors. This commit moves the common setup to the top to avoid this. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-26 13:34:11 -04:00
Eugene Yurtsev	0a8d6bc402	Add instructions to pyproject.toml (#5138 ) # Add instructions to pyproject.toml * Add instructions to pyproject.toml about how to handle optional dependencies. ## Before submitting ## Who can review? --------- Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com> Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>	2023-05-26 13:29:07 -04:00
Shukri	58e95cd11e	Better docs for weaviate hybrid search (#5290 ) # Better docs for weaviate hybrid search Fixes: NA ## Before submitting ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-26 09:30:41 -07:00
Davis Chase	641303a361	bump 181 (#5302 )	2023-05-26 08:44:19 -07:00
Leonid Kuligin	aa3c7b3271	Fixed passing creds to VertexAI LLM (#5297 ) # Fixed passing creds to VertexAI LLM Fixes #5279 It looks like we should drop a type annotation for Credentials. Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-05-26 08:31:02 -07:00
Eugene Yurtsev	a669abf16b	Update CONTRIBUTION guidelines and PR Template (#5140 ) # Update contribution guidelines and PR template This PR updates the contribution guidelines to include more information on how to handle optional dependencies. The PR template is updated to include a link to the contribution guidelines document.	2023-05-26 10:18:11 -04:00
Peng Qu	d481d887bc	Add an example to make the prompt more robust (#5291 ) # Add example to LLMMath to help with power operator Add example to LLMMath that helps the model to interpret `^` as the power operator rather than the python xor operator.	2023-05-26 09:32:35 -04:00
Xiangrui Meng	aec642febb	LLM wrapper for Databricks (#5142 ) This PR adds LLM wrapper for Databricks. It supports two endpoint types: * serving endpoint * cluster driver proxy app An integration notebook is included to show how it works. Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com> Co-authored-by: Gengliang Wang <gengliang@apache.org> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 19:19:37 -07:00
Ted Martinez	1cb6498fdb	Tedma4/twilio tool (#5136 ) # Add twilio sms tool --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 19:19:22 -07:00
Moonsik Kang	a0281f5acb	Fixed typo: 'ouput' to 'output' in all documentation (#5272 ) # Fixed typo: 'ouput' to 'output' in all documentation In this instance, the typo 'ouput' was amended to 'output' in all occurrences within the documentation. There are no dependencies required for this change.	2023-05-25 19:18:31 -07:00
Michael Landis	7047a2c1af	feat: add Momento as a standard cache and chat message history provider (#5221 ) # Add Momento as a standard cache and chat message history provider This PR adds Momento as a standard caching provider. Implements the interface, adds integration tests, and documentation. We also add Momento as a chat history message provider along with integration tests, and documentation. [Momento](https://www.gomomento.com/) is a fully serverless cache. Similar to S3 or DynamoDB, it requires zero configuration, infrastructure management, and is instantly available. Users sign up for free and get 50GB of data in/out for free every month. ## Before submitting ✅ We have added documentation, notebooks, and integration tests demonstrating usage. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 19:13:21 -07:00
Hassan Ouda	56ad56c812	Support bigquery dialect - SQL (#5261 ) Adds an if statement to handle the BigQuery SQL dialect. Previously, using the BigQuery dialect failed while running `SET search_path TO`. This PR adds a condition that sets the dataset as the schema parameter, which is equivalent to `SET search_path TO`. I have tested it and it works. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-25 18:19:17 -07:00
Abdelsalam ElTamawy	2ef5579eae	Added pipeline args to `HuggingFacePipeline.from_model_id` (#5268 ) The current `HuggingFacePipeline.from_model_id` does not allow passing pipeline arguments to the transformer pipeline. This PR enables adding important pipeline parameters, such as `max_new_tokens`. Before this PR, it was necessary to create the pipeline manually through Hugging Face transformers and then hand it to langchain. For example, instead of this ```py model_id = "gpt2" tokenizer = AutoTokenizer.from_pretrained(model_id) model = AutoModelForCausalLM.from_pretrained(model_id) pipe = pipeline( "text-generation", model=model, tokenizer=tokenizer, max_new_tokens=10 ) hf = HuggingFacePipeline(pipeline=pipe) ``` You can write this ```py hf = HuggingFacePipeline.from_model_id( model_id="gpt2", task="text-generation", pipeline_kwargs={"max_new_tokens": 10} ) ``` Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 17:54:52 -07:00
Davis Chase	f01dfe858d	OpenAI lint (#5273 ) Causing lint issues if you have openai installed, annoying for local dev	2023-05-25 16:20:06 -07:00
Nicholas Liu	7652d2abb0	Add Multi-CSV/DF support in CSV and DataFrame Toolkits (#5009 ) Add Multi-CSV/DF support in CSV and DataFrame Toolkits * CSV and DataFrame toolkits now accept list of CSVs/DFs * Add default prompts for many dataframes in `pandas_dataframe` toolkit Fixes #1958 Potentially fixes #4423 ## Testing * Add single and multi-dataframe integration tests for `pandas_dataframe` toolkit with permutations of `include_df_in_prompt` * Add single and multi-CSV integration tests for csv toolkit --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-25 14:23:11 -07:00
Alex Rothberg	3223a97dc6	Add visible_only and strict_mode options to ClickTool (#4088 ) Partially addresses: https://github.com/hwchase17/langchain/issues/4066	2023-05-25 14:10:39 -07:00
Ravindra Marella	b3988621c5	Add C Transformers for GGML Models (#5218 ) # Add C Transformers for GGML Models I created Python bindings for the GGML models: https://github.com/marella/ctransformers Currently it supports GPT-2, GPT-J, GPT-NeoX, LLaMA, MPT, etc. See [Supported Models](https://github.com/marella/ctransformers#supported-models). It provides a unified interface for all models: ```python from langchain.llms import CTransformers llm = CTransformers(model='/path/to/ggml-gpt-2.bin', model_type='gpt2') print(llm('AI is going to')) ``` It can be used with models hosted on the Hugging Face Hub: ```py llm = CTransformers(model='marella/gpt-2-ggml') ``` It supports streaming: ```py from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler llm = CTransformers(model='marella/gpt-2-ggml', callbacks=[StreamingStdOutCallbackHandler()]) ``` Please see [README](https://github.com/marella/ctransformers#readme) for more details. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 13:42:44 -07:00
Davis Chase	ca88b25da6	Zep sdk version (#5267 ) zep-python's sync methods no longer need an asyncio wrapper. This was causing issues with FastAPI deployment. Zep also now supports putting and getting of arbitrary message metadata. Bump zep-python version to v0.30 Remove nest-asyncio from Zep example notebooks. Modify tests to include metadata. --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org> Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2023-05-25 13:42:10 -07:00
Janil Wörst	5525602df0	Docs link custom agent page in getting started (#5250 ) # Docs: link custom agent page in getting started	2023-05-25 13:11:30 -07:00
Alon Diament	d3cd21ccf8	Fixed regression in JoplinLoader's get note url (#5265 ) Fixes a regression in JoplinLoader that was introduced during the code review (bad `page` wildcard in _get_note_url). ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049 @leo-gan	2023-05-25 13:10:10 -07:00
Davis Chase	3be9ba14f3	OpenSearch top k parameter fix (#5216 ) For most queries it's the `size` parameter that determines final number of documents to return. Since our abstractions refer to this as `k`, set this to be `k` everywhere instead of expecting a separate param. Would be great to have someone more familiar with OpenSearch validate that this is reasonable (e.g. that having `size` and what OpenSearch calls `k` be the same won't lead to any strange behavior). cc @naveentatikonda Closes #5212	2023-05-25 09:51:23 -07:00
Yves Maurer	88ed8e1cd6	Added the option of specifying a proxy for the OpenAI API (#5246 ) # Added the option of specifying a proxy for the OpenAI API Fixes #5243 Co-authored-by: Yves Maurer <>	2023-05-25 09:50:25 -07:00
mwinterde	9c0cb90997	Resolve error in StructuredOutputParser docs (#5240 ) # Resolve error in StructuredOutputParser docs The documentation for `StructuredOutputParser` is currently not reproducible; that is, `output_parser.parse(output)` raises an error because the LLM returns a response with an invalid format ```python _input = prompt.format_prompt(question="what's the capital of france") output = model(_input.to_string()) output # ? # # ```json # { # "answer": "Paris", # "source": "https://www.worldatlas.com/articles/what-is-the-capital-of-france.html" # } # ``` ``` This was fixed by adding a question mark to the prompt	2023-05-25 07:47:25 -07:00
Peng Qu	c7e2151a4b	remove extra "\n" to ensure that the format of the description, examp… (#5232 ) remove extra "\n" to ensure that the format of the description, example, and prompt&generation are completely consistent.	2023-05-25 07:46:39 -07:00
Davis Chase	15b17f9334	bump 180 (#5248 )	2023-05-25 07:09:50 -07:00
mwinterde	9e57be4b5c	Fix typo in docstring of RetryWithErrorOutputParser (#5244 )	2023-05-25 09:59:31 -04:00
Shukri	09e246f306	Weaviate: Add QnA with sources example (#5247 ) # Add QnA with sources example Fixes: see https://stackoverflow.com/questions/76207160/langchain-doesnt-work-with-weaviate-vector-database-getting-valueerror/76210017#76210017 ## Before submitting ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-25 09:58:33 -04:00
Archon	5cdd9ab7e1	Add MiniMax embeddings (#5174 ) - Add support for MiniMax embeddings Doc: [MiniMax embeddings](https://api.minimax.chat/document/guides/embeddings?id=6464722084cdc277dfaa966a) --------- Co-authored-by: Archon <archongum@outlook.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 06:57:49 -07:00
Eugene Yurtsev	5cfa72a130	Bibtex integration for document loader and retriever (#5137 ) # Bibtex integration Wrap bibtexparser to retrieve a list of docs from a bibtex file. * Get the metadata from the bibtex entries * `page_content` get from the local pdf referenced in the `file` field of the bibtex entry using `pymupdf` * If no valid pdf file, `page_content` set to the `abstract` field of the bibtex entry * Support Zotero flavour using regex to get the file path * Added usage example in `docs/modules/indexes/document_loaders/examples/bibtex.ipynb` --------- Co-authored-by: Sébastien M. Popoff <sebastien.popoff@espci.fr> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-25 00:21:31 -07:00
Ati Sharma	40b086d6e8	Allow to specify ID when adding to the FAISS vectorstore. (#5190 ) # Allow to specify ID when adding to the FAISS vectorstore This change allows unique IDs to be specified when adding documents / embeddings to a faiss vectorstore. - This reflects the current approach with the chroma vectorstore. - It allows rejection of inserts on duplicate IDs - will allow deletion / update by searching on deterministic ID (such as a hash). - If not specified, a random UUID is generated (as per previous behaviour, so non-breaking). This commit fixes #5065 and #3896 and should fix #2699 indirectly. I've tested adding and merging. Kindly tagging @Xmaster6y @dev2049 for review. --------- Co-authored-by: Ati Sharma <ati@agalmic.ltd> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-24 22:26:46 -07:00
Nicholas Liu	f0ea093de8	Change Default GoogleDriveLoader Behavior to not Load Trashed Files (issue #5104 ) (#5220 ) # Change Default GoogleDriveLoader Behavior to not Load Trashed Files (issue #5104) Fixes #5104 If you want the previous behavior of loading files that used to live in the folder but are now trashed, you can use the `load_trashed_files` parameter: ``` loader = GoogleDriveLoader( folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5", recursive=False, load_trashed_files=True ) ``` As not loading trashed files should be expected behavior, should we 1. even provide the `load_trashed_files` parameter? 2. add documentation? It feels like most users will stick with the default behavior ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: DataLoaders - @eyurtsev Twitter: [@nicholasliu77](https://twitter.com/nicholasliu77)	2023-05-24 22:26:17 -07:00
Keno	eff31a3361	Remove API key from docs (#5223 ) I found an API key for `serpapi_api_key` while reading the docs. It seems to have been modified very recently. Removed it in this PR @hwchase17 - project lead	2023-05-24 22:25:39 -07:00
maspotts	95c9aa1ccb	Create async copy of from_text() inside GraphIndexCreator. (#5214 ) Copies `GraphIndexCreator.from_text()` to make an async version called `GraphIndexCreator.afrom_text()`. This is (should be) a trivial change: it just adds a copy of `GraphIndexCreator.from_text()` which is async and awaits a call to `chain.apredict()` instead of `chain.predict()`. There is no unit test for GraphIndexCreator, and I did not create one, but this code works for me locally. @agola11 @hwchase17	2023-05-24 21:54:12 -07:00
Leonid Ganeline	2ad29f410d	fix a mistake in concepts.md (#5222 ) # fix a mistake in concepts.md ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:	2023-05-24 21:47:22 -07:00
Harrison Chase	a775aa6389	Harrison/vertex (#5049 ) Co-authored-by: Leonid Kuligin <kuligin@google.com> Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru> Co-authored-by: sasha-gitg <44654632+sasha-gitg@users.noreply.github.com> Co-authored-by: Justin Flick <Justinjayflick@gmail.com> Co-authored-by: Justin Flick <jflick@homesite.com>	2023-05-24 15:51:12 -07:00
Zander Chase	e6c4571191	Add 'status' command to get server status (#5197 ) Example: ``` $ langchain plus start --expose ... $ langchain plus status The LangChainPlus server is currently running. Service Status Published Ports langchain-backend Up 40 seconds 1984 langchain-db Up 41 seconds 5433 langchain-frontend Up 40 seconds 80 ngrok Up 41 seconds 4040 To connect, set the following environment variables in your LangChain application: LANGCHAIN_TRACING_V2=true LANGCHAIN_ENDPOINT=https://5cef-70-23-89-158.ngrok.io $ langchain plus stop $ langchain plus status The LangChainPlus server is not running. $ langchain plus start The LangChainPlus server is currently running. Service Status Published Ports langchain-backend Up 5 seconds 1984 langchain-db Up 6 seconds 5433 langchain-frontend Up 5 seconds 80 To connect, set the following environment variables in your LangChain application: LANGCHAIN_TRACING_V2=true LANGCHAIN_ENDPOINT=http://localhost:1984 ```	2023-05-24 21:43:16 +00:00
Zander Chase	e76e68b211	Add Delete Session Method (#5193 )	2023-05-24 21:06:03 +00:00
Zander Chase	66113c2a62	Log warning (#5192 ) Changes debug log to warning log when LC Tracer fails to instantiate	2023-05-24 21:05:13 +00:00
Ankush Gola	b7fcb35a39	add option to pass openai key to langchain plus command (#5213 )	2023-05-24 21:05:03 +00:00
Davis Chase	dcee8936c1	nit (#5208 )	2023-05-24 12:52:20 -07:00
Alon Diament	44abe925df	Add Joplin document loader (#5153 ) # Add Joplin document loader [Joplin](https://joplinapp.org/) is an open source note-taking app. Joplin has a [REST API](https://joplinapp.org/api/references/rest_api/) for accessing its local database. The proposed `JoplinLoader` uses the API to retrieve all notes in the database and their metadata. Joplin needs to be installed and running locally, and an access token is required. - The PR includes an integration test. - The PR includes an example notebook. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 12:31:55 -07:00
Rodrigo Siqueira	f10be072ff	Add Iugu document loader (#5162 ) Create IUGU loader --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 11:47:01 -07:00
ByronHsu	f0730c6489	Allow readthedoc loader to pass custom html tag (#5175 ) ## Description The html structure of readthedocs can differ. Currently, the html tag is hardcoded in the reader, and unable to fit into some cases. This pr includes the following changes: 1. Replace `find_all` with `find` because we just want one tag. 2. Provide `custom_html_tag` to the loader. 3. Add tests for readthedoc loader 4. Refactor code ## Issues See more in https://github.com/hwchase17/langchain/pull/2609. The problem was not completely fixed in that pr. --------- Signed-off-by: byhsu <byhsu@linkedin.com> Co-authored-by: byhsu <byhsu@linkedin.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 10:40:27 -07:00
Alexander Dibrov	d8eed6018f	Output parsing variation allowance (#5178 ) # Output parsing variation allowance for self-ask with search This change makes self-ask with search easier for Llama models to follow, as they tend toward returning 'Followup:' instead of 'Follow up:' despite an otherwise valid remaining output. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 10:39:09 -07:00
Matt Wells	c173bf1c62	Fixes scope of query Session in PGVector (#5194 ) `vectorstore.PGVector`: The transactional boundary should be increased to cover the query itself Currently, within the `similarity_search_with_score_by_vector` the transactional boundary (created via the `Session` call) does not include the select query being made. This can result in un-intended consequences when interacting with the PGVector instance methods directly --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 10:37:45 -07:00
Tommaso De Lorenzo	52714cedd4	fixing total cost finetuned model giving zero (#5144 ) # OpenAI finetuned model giving zero tokens cost Very simple fix to the previously committed solution for allowing finetuned OpenAI models. Improves #5127 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 10:04:08 -07:00
Harrison Chase	94cf391ef1	standardize json parsing (#5168 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 10:03:53 -07:00
Davis Chase	2b2176a3c1	tfidf retriever (#5114 ) Co-authored-by: vempaliakhil96 <vempaliakhil96@gmail.com>	2023-05-24 10:02:09 -07:00
Shukri	b00c77dc62	Improve weaviate vectorstore docs (#5201 ) # Improve weaviate vectorstore docs	2023-05-24 09:31:48 -07:00
Tomaz Bratanic	fd866d1801	Update Cypher QA prompt (#5173 ) # Improve Cypher QA prompt The current QA prompt is optimized for networkX answer generation, which returns all the possible triples. However, Cypher search is a bit more focused and doesn't necessarily return all the context information. For that reason, the model sometimes refuses to generate an answer even though the information is provided: ![Screenshot from 2023-05-24 08-36-23](https://github.com/hwchase17/langchain/assets/19948365/351cf9c1-2567-447c-91fd-284ae3fa1ccf) To fix this issue, I have updated the prompt. Interestingly, I tried many variations with fewer instructions and they didn't work properly. However, the current fix works nicely. ![Screenshot from 2023-05-24 08-37-25](https://github.com/hwchase17/langchain/assets/19948365/fc830603-e6ec-4a23-8a86-eaf572996014)	2023-05-24 08:31:30 -07:00
Zach Schillaci	aa14e223ee	Reuse `length_func` in `MapReduceDocumentsChain` (#5181 ) # Reuse `length_func` in `MapReduceDocumentsChain` Pretty straightforward refactor in `MapReduceDocumentsChain`. Reusing the local variable `length_func`, instead of the longer alternative `self.combine_document_chain.prompt_length`. @hwchase17	2023-05-24 08:28:37 -07:00
Harrison Chase	11c26ebb55	Harrison/modelscope (#5156 ) Co-authored-by: thomas-yanxin <yx20001210@163.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 08:06:45 -07:00
Davis Chase	2d5588c5f0	bump 179 (#5200 )	2023-05-24 07:55:27 -07:00
Saba Sturua	47e4ee4370	adjust docarray docstrings (#5185 ) Follow up of https://github.com/hwchase17/langchain/pull/5015 Thanks for catching this! Just a small PR to adjust couple of strings to these changes Signed-off-by: jupyterjazz <saba.sturua@jina.ai>	2023-05-24 07:50:35 -07:00
Jeff Vestal	cf19a2a59f	example usage (#5182 ) Adding example usage for elasticsearch knn embeddings [per](https://github.com/hwchase17/langchain/pull/3401#issuecomment-1548518389) https://github.com/hwchase17/langchain/blob/master/langchain/embeddings/elasticsearch.py	2023-05-24 07:47:15 -07:00
Ikko Eltociear Ashimine	fff21a0b35	Update rellm_experimental.ipynb (#5189 ) HuggingFace -> Hugging Face	2023-05-24 11:41:00 +00:00
Nolan Tremelling	faa26650c9	Beam (#4996 ) # Beam Calls the Beam API wrapper to deploy and make subsequent calls to an instance of the gpt2 LLM in a cloud deployment. Requires installation of the Beam library and registration of Beam Client ID and Client Secret. Additional calls can then be made through the instance of the large language model in your code or by calling the Beam API. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 01:25:18 -07:00
Ofer Mendelevitch	c81fb88035	Vectara (#5069 ) # Vectara Integration This PR provides integration with Vectara. Implemented here are: * langchain/vectorstore/vectara.py * tests/integration_tests/vectorstores/test_vectara.py * langchain/retrievers/vectara_retriever.py And two IPYNB notebooks to do more testing: * docs/modules/chains/index_examples/vectara_text_generation.ipynb * docs/modules/indexes/vectorstores/examples/vectara.ipynb --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-24 01:24:58 -07:00
Jason Bosco	9c4b43b494	Add Typesense vector store (#1674 ) Closes #931. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-23 23:20:45 -07:00
Leonid Ganeline	33929489b9	docs: added missed `document_loaders` examples (#5150 ) # DOCS added missed document_loader examples Added missed examples: `JSON`, `Open Document Format (ODT)`, `Wikipedia`, `tomarkdown`. Updated them to a consistent format. ## Who can review? @hwchase17 @dev2049	2023-05-23 21:56:41 -07:00
Daniel Quinteros	c111134a55	Clarification of the reference to the "get_text_legth" function in ge… (#5154 ) # Clarification of the reference to the "get_text_legth" function in getting_started.md Reference to the function "get_text_legth" in the documentation did not make sense. Comment added for clarification. @hwchase17	2023-05-23 20:43:38 -07:00
Daniel Quinteros	de4ef24f75	Docs: updated getting_started.md (#5151 ) # Docs: updated getting_started.md Just accommodating some unnecessary spaces in the example of "pass few shot examples to a prompt template". @vowelparrot	2023-05-23 20:43:26 -07:00
mbchang	b1b7f3541c	fix: fix current_time=Now bug for aadd_documents in TimeWeightedRetriever (#5155 ) # Same as PR #5045, but for async Fixes #4825 I had forgotten to update the asynchronous counterpart `aadd_documents` with the bug fix from PR #5045, so this PR also fixes `aadd_documents` too. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049	2023-05-23 20:31:45 -07:00
Jeremiah Lowin	925dd3e59e	Add async versions of predict() and predict_messages() (#4867 ) # Add async versions of predict() and predict_messages() #4615 introduced a unifying interface for "base" and "chat" LLM models via the new `predict()` and `predict_messages()` methods that allow both types of models to operate on string and message-based inputs, respectively. This PR adds async versions of the same (`apredict()` and `apredict_messages()`) that are identical except for their use of `agenerate()` in place of `generate()`, which means they repurpose all existing work on the async backend. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 (follows his work on #4615) @agola11 (async) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-23 17:22:49 -07:00
Junlin Zhou	9242998db1	Empty check before pop (#4929 ) # Check whether 'other' is empty before popping This PR could fix a potential 'popping empty set' error. Co-authored-by: Junlin Zhou <jlzhou@zjuici.com>	2023-05-23 16:46:50 -07:00
Daniel King	de6e6c764e	Add MosaicML inference endpoints (#4607 ) # Add MosaicML inference endpoints This PR adds support in langchain for MosaicML inference endpoints. We both serve a select few open source models, and allow customers to deploy their own models using our inference service. Docs are here (https://docs.mosaicml.com/en/latest/inference.html), and sign up form is here (https://forms.mosaicml.com/demo?utm_source=langchain). I'm not intimately familiar with the details of langchain, or the contribution process, so please let me know if there is anything that needs fixing or this is the wrong way to submit a new integration, thanks! I'm also not sure what the procedure is for integration tests. I have tested locally with my api key. ## Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-23 15:59:08 -07:00
Adheeban Manoharan	68f0d45485	Adding Weather Loader (#5056 ) Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-23 15:57:33 -07:00
Jeff Vestal	0b542a9706	Add ElasticsearchEmbeddings class for generating embeddings using Elasticsearch models (#3401 ) This PR introduces a new module, `elasticsearch_embeddings.py`, which provides a wrapper around Elasticsearch embedding models. The new ElasticsearchEmbeddings class allows users to generate embeddings for documents and query texts using a [model deployed in an Elasticsearch cluster](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding). ### Main features: 1. The ElasticsearchEmbeddings class initializes with an Elasticsearch connection object and a model_id, providing an interface to interact with the Elasticsearch ML client through [infer_trained_model](https://elasticsearch-py.readthedocs.io/en/v8.7.0/api.html?highlight=trained%20model%20infer#elasticsearch.client.MlClient.infer_trained_model) . 2. The `embed_documents()` method generates embeddings for a list of documents, and the `embed_query()` method generates an embedding for a single query text. 3. The class supports custom input text field names in case the deployed model expects a different field name than the default `text_field`. 4. The implementation is compatible with any model deployed in Elasticsearch that generates embeddings as output. ### Benefits: 1. Simplifies the process of generating embeddings using Elasticsearch models. 2. Provides a clean and intuitive interface to interact with the Elasticsearch ML client. 3. Allows users to easily integrate Elasticsearch-generated embeddings. Related issue https://github.com/hwchase17/langchain/issues/3400 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-23 14:50:33 -07:00
Theodore Rolle	754b5133e9	Improve PlanningOutputParser whitespace handling (#5143 ) Some LLMs produce numbered lists with leading whitespace, e.g. in response to "What is the sum of 2 and 3?": ``` Plan: 1. Add 2 and 3. 2. Given the above steps taken, please respond to the users original question. ``` This commit updates the PlanningOutputParser regex to ignore leading whitespace before the step number, enabling it to correctly parse this format.	2023-05-23 12:47:26 -07:00
Tommaso De Lorenzo	5002f3ae35	solving #2887 (#5127 ) # Allowing openAI fine-tuned models Very simple fix that checks whether a openAI `model_name` is a fine-tuned model when loading `context_size` and when computing call's cost in the `openai_callback`. Fixes #2887 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-23 11:18:03 -07:00
Myeongseop Kim	7a75bb2121	docs: fix minor typo + add wikipedia package installation part in human_input_llm.ipynb (#5118 ) # Fix typo + add wikipedia package installation part in human_input_llm.ipynb This PR 1. Fixes a typo ("the the human input LLM"), 2. Adds a wikipedia package installation part (in accordance with the `WikipediaQueryRun` [documentation](https://python.langchain.com/en/latest/modules/agents/tools/examples/wikipedia.html)) in `human_input_llm.ipynb` (`docs/modules/models/llms/examples/human_input_llm.ipynb`)	2023-05-23 10:59:30 -07:00
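
Commit 754b5133e9 in the log above describes relaxing a step-matching regex so that leading whitespace before a step number is ignored. The following is a minimal sketch of that idea only; it is not the pattern used by `PlanningOutputParser`, and `STEP_RE` and `split_steps` are hypothetical names introduced for illustration.

```python
import re
from typing import List

# Hypothetical pattern (an assumption, not langchain's actual regex):
# a step begins at the start of a line with optional whitespace,
# one or more digits, and a period.
STEP_RE = re.compile(r"^\s*\d+\.\s*", flags=re.MULTILINE)


def split_steps(text: str) -> List[str]:
    """Return the text of each numbered step, ignoring leading whitespace."""
    parts = STEP_RE.split(text)
    # parts[0] is whatever precedes the first step (for example "Plan:").
    return [p.strip() for p in parts[1:] if p.strip()]
```

Because the pattern is anchored with `^` in multiline mode, a number inside a sentence (such as the "3." that ends "Add 2 and 3.") is not treated as the start of a new step.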

1753 changed files with 108119 additions and 45906 deletions

									
										37

.devcontainer/README.md
									
										Normal file
									
												
				@@ -0,0 +1,37 @@

				# Dev container

				This project includes a [dev container](https://containers.dev/), which lets you use a container as a full-featured dev environment.

				You can use the dev container configuration in this folder to build and run the app without needing to install any of its tools locally! You can use it in [GitHub Codespaces](https://github.com/features/codespaces) or the [VS Code Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers).

				## GitHub Codespaces

				[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hwchase17/langchain)

				You may use the button above, or follow these steps to open this repo in a Codespace:

				1. Click the **Code** drop-down menu at the top of https://github.com/hwchase17/langchain.

				1. Click on the **Codespaces** tab.

				1. Click **Create codespace on master** .

				For more info, check out the [GitHub documentation](https://docs.github.com/en/free-pro-team@latest/github/developing-online-with-codespaces/creating-a-codespace#creating-a-codespace).

				## VS Code Dev Containers

				[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)

				If you already have VS Code and Docker installed, you can use the button above to get started. This will cause VS Code to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.

				You can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:

				1. If this is your first time using a development container, please ensure your system meets the pre-reqs (i.e. have Docker installed) in the [getting started steps](https://aka.ms/vscode-remote/containers/getting-started).

				2. Open a locally cloned copy of the code:

				   - Clone this repository to your local filesystem.

				   - Press <kbd>F1</kbd> and select the **Dev Containers: Open Folder in Container...** command.

				   - Select the cloned copy of this folder, wait for the container to start, and try things out!

				You can learn more in the [Dev Containers documentation](https://code.visualstudio.com/docs/devcontainers/containers).

				## Tips and tricks

				* If you are working with the same repository folder in a container and Windows, you'll want consistent line endings (otherwise you may see hundreds of changes in the SCM view). The `.gitattributes` file in the root of this repo will disable line ending conversion and should prevent this. See [tips and tricks](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files) for more info.

				* If you'd like to review the contents of the image used in this dev container, you can check it out in the [devcontainers/images](https://github.com/devcontainers/images/tree/main/src/python) repo.

									
										45

.devcontainer/devcontainer.json
									
												
				@@ -1,24 +1,26 @@

				// For format details, see https://aka.ms/devcontainer.json. For config options, see the

				// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-dockerfile

				// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-docker-compose

				{

					"dockerComposeFile": "./docker-compose.yaml",

					"service": "langchain",

					"workspaceFolder": "/workspaces/langchain",

					// Name for the dev container

					"name": "langchain",

					"customizations": {

						"vscode": {

							"extensions": [   

								"ms-python.python"

							],

							"settings": {

								"python.defaultInterpreterPath": "/home/vscode/langchain-py-env/bin/python3.11"

							}

						}

					},

					// Features to add to the dev container. More info: https://containers.dev/features.

					"features": {},

					// Point to a Docker Compose file

					"dockerComposeFile": "./docker-compose.yaml",

					// Required when using Docker Compose. The name of the service to connect to once running

					"service": "langchain",

					// The optional 'workspaceFolder' property is the path VS Code should open by default when

					// connected. This is typically a file mount in .devcontainer/docker-compose.yml

					"workspaceFolder": "/workspaces/${localWorkspaceFolderBasename}",

					// Prevent the container from shutting down

					"overrideCommand": true

					// Features to add to the dev container. More info: https://containers.dev/features

					// "features": {

					// 	"ghcr.io/devcontainers-contrib/features/poetry:2": {}

					// }

					// Use 'forwardPorts' to make a list of ports inside the container available locally.

					// "forwardPorts": [],

				@@ -26,8 +28,9 @@

					// Uncomment the next line to run commands after the container is created.

					// "postCreateCommand": "cat /etc/os-release",

					// Uncomment to connect as an existing user other than the container default. More info: https://aka.ms/dev-containers-non-root.

					// "remoteUser": "devcontainer"

					"remoteUser": "vscode",

					"overrideCommand": true

					// Configure tool-specific properties.

					// "customizations": {},

					// Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.

					// "remoteUser": "root"

				}

									
										7

.devcontainer/docker-compose.yaml
									
												
				@@ -2,10 +2,11 @@ version: '3'

				services:

				  langchain:

				    build:

				      dockerfile: .devcontainer/Dockerfile

				      context: ../ 

				      dockerfile: dev.Dockerfile

				      context: ..

				    volumes:

				      - ../:/workspaces/langchain

				   # Update this to wherever you want VS Code to mount the folder of your project

				      - ..:/workspaces:cached

				    networks:

				      - langchain-network 

				  #   environment:

3

.gitattributes vendored Normal file


@@ -0,0 +1,3 @@
 * text=auto eol=lf
 *.{cmd,[cC][mM][dD]} text eol=crlf
 *.{bat,[bB][aA][tT]} text eol=crlf

									
										43

.github/CONTRIBUTING.md
									
										vendored
									
												
				@@ -59,6 +59,8 @@ we do not want these to get in the way of getting good code into the codebase.

				## 🚀 Quick Start

				> **Note:** You can run this repository locally (which is described below) or in a [development container](https://containers.dev/) (which is described in the [.devcontainer folder](https://github.com/hwchase17/langchain/tree/master/.devcontainer)).

				This project uses [Poetry](https://python-poetry.org/) as a dependency manager. Check out Poetry's [documentation on how to install it](https://python-poetry.org/docs/#installation) on your system before proceeding.

				❗Note: If you use `Conda` or `Pyenv` as your environment / package manager, avoid dependency conflicts by doing the following first:

				@@ -115,8 +117,37 @@ To get a report of current coverage, run the following:

				make coverage

				```

				### Working with Optional Dependencies

				Langchain relies heavily on optional dependencies to keep the Langchain package lightweight.

				If you're adding a new dependency to Langchain, assume that it will be an optional dependency, and

				that most users won't have it installed.

				Users that do not have the dependency installed should be able to **import** your code without

				any side effects (no warnings, no errors, no exceptions). 

				To introduce the dependency to the pyproject.toml file correctly, please do the following: 

				1. Add the dependency to the main group as an optional dependency

				  ```bash

				  poetry add --optional [package_name]

				  ```

				2. Open pyproject.toml and add the dependency to the `extended_testing` extra

				3. Relock the poetry file to update the extra.

				  ```bash

				  poetry lock --no-update

				  ```

				4. Add a unit test that at the very least attempts to import the new code. Ideally the unit

				test makes use of lightweight fixtures to test the logic of the code.

				5. Please use the `@pytest.mark.requires(package_name)` decorator for any tests that require the dependency.
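
The requirement above, that code using an optional dependency can be imported without that dependency installed, is commonly met by deferring the import until the feature is actually used. The sketch below illustrates the pattern with the standard library only; `import_optional`, `HypotheticalLoader`, and `some_optional_package` are hypothetical names, not part of LangChain.

```python
import importlib
from types import ModuleType


def import_optional(package_name: str) -> ModuleType:
    """Import an optional dependency at call time, not at module import time."""
    try:
        return importlib.import_module(package_name)
    except ImportError as exc:
        # Fail only when the feature is used, with an actionable message.
        raise ImportError(
            f"Could not import `{package_name}`. "
            f"Install it with `pip install {package_name}`."
        ) from exc


class HypotheticalLoader:
    """Hypothetical integration that needs an optional package only when used."""

    def __init__(self, package_name: str = "some_optional_package") -> None:
        # Nothing is imported here, so constructing the class has no side effects.
        self.package_name = package_name

    def load(self) -> ModuleType:
        # The optional dependency is resolved on first use.
        return import_optional(self.package_name)
```

Importing the module that defines `HypotheticalLoader` never touches the optional package; only `load()` does, which is the point at which a missing dependency is reported.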

				### Testing

				See section about optional dependencies.

				#### Unit Tests

				Unit tests cover modular logic that does not require calls to outside APIs.

				To run unit tests:

				@@ -133,8 +164,20 @@ make docker_tests

				If you add new logic, please add a unit test.

				#### Integration Tests

				Integration tests cover logic that requires making calls to outside APIs (often integration with other services).

				**warning** Almost no tests should be integration tests. 

				  Tests that require making network connections make it difficult for other

				  developers to test the code.

				  Instead, favor relying on the `responses` library and/or `mock.patch` to mock

				  requests using small fixtures.
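
As an illustration of the advice above, here is a minimal sketch of a unit test that replaces a network call with `unittest.mock.patch` and a small in-memory fixture. `fetch_title` and the URL are hypothetical and exist only for this example.

```python
import json
import urllib.request
from unittest import mock


def fetch_title(url: str) -> str:
    """Hypothetical function under test: fetch JSON and return its 'title'."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read())["title"]


def test_fetch_title_without_network() -> None:
    # Small fixture standing in for the HTTP response object.
    fake = mock.MagicMock()
    fake.read.return_value = b'{"title": "hello"}'
    fake.__enter__.return_value = fake  # support the `with` statement

    # Patch the network call so no connection is attempted.
    with mock.patch("urllib.request.urlopen", return_value=fake) as patched:
        assert fetch_title("https://example.invalid/item") == "hello"
        patched.assert_called_once_with("https://example.invalid/item")
```

The test exercises the parsing logic without opening a connection, so it can run as a unit test on any developer's machine.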

				To run integration tests:

				```bash

									
										2

.github/ISSUE_TEMPLATE/bug-report.yml
									
										vendored
									
												
				@@ -46,7 +46,7 @@ body:

				        - @agola11

				        Tools / Toolkits

				        - @vowelparrot

				        - ...

				      placeholder: "@Username ..."

									
										60

.github/PULL_REQUEST_TEMPLATE.md
									
										vendored
									
												
				@@ -1,46 +1,26 @@

				# Your PR Title (What it does)

				<!-- Thank you for contributing to LangChain!

				<!--

				Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution.

				Replace this comment with:

				  - Description: a description of the change, 

				  - Issue: the issue # it fixes (if applicable),

				  - Dependencies: any dependencies required for this change,

				  - Tag maintainer: for a quicker response, tag the relevant maintainer (see below),

				  - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out!

				Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change.

				If you're adding a new integration, please include:

				  1. a test for the integration, preferably unit tests that do not rely on network access,

				  2. an example notebook showing its use.

				After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost.

				-->

				Maintainer responsibilities:

				  - General / Misc / if you don't know who to tag: @dev2049

				  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev

				  - Models / Prompts: @hwchase17, @dev2049

				  - Memory: @hwchase17

				  - Agents / Tools / Toolkits: @vowelparrot

				  - Tracing / Callbacks: @agola11

				  - Async: @agola11

				<!-- Remove if not applicable -->

				If no one reviews your PR within a few days, feel free to @-mention the same people again.

				Fixes # (issue)

				## Before submitting

				<!-- If you're adding a new integration, include an integration test and an example notebook showing its use! -->

				## Who can review?

				Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:

				<!-- For a quicker response, figure out the right person to tag with @

				        @hwchase17 - project lead

				        Tracing / Callbacks

				        - @agola11

				        Async

				        - @agola11

				        DataLoaders

				        - @eyurtsev

				        Models

				        - @hwchase17

				        - @agola11

				        Agents / Tools / Toolkits

				        - @vowelparrot

				        VectorStores / Retrievers / Memory

				        - @dev2049

				See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

				 -->

									
										38

.github/workflows/linkcheck.yml
									
										vendored
									
											
				@@ -1,38 +0,0 @@

				name: linkcheck

				on:

				  push:

				    branches: [master]

				  pull_request:

				    paths:

				      - 'docs/**'

				env:

				  POETRY_VERSION: "1.4.2"

				jobs:

				  build:

				    runs-on: ubuntu-latest

				    strategy:

				      matrix:

				        python-version:

				          - "3.11"

				    steps:

				      - uses: actions/checkout@v3

				      - name: Install poetry

				        run: |

				          pipx install poetry==$POETRY_VERSION

				      - name: Set up Python ${{ matrix.python-version }}

				        uses: actions/setup-python@v4

				        with:

				          python-version: ${{ matrix.python-version }}

				          cache: poetry

				      - name: Install dependencies

				        run: |

				          poetry install --with docs

				      - name: Build the docs

				        run: |

				          make docs_build

				      - name: Analyzing the docs with linkcheck

				        run: |

				          make docs_linkcheck

17

.gitignore vendored


@@ -73,6 +73,7 @@ instance/
 # Sphinx documentation
 docs/_build/
 docs/docs/_build/
 # PyBuilder
 target/
@@ -149,4 +150,18 @@ wandb/
 # integration test artifacts
 data_map*
 \[('_type', 'fake'), ('stop', None)]
 \[('_type', 'fake'), ('stop', None)]
 # Replit files
 *replit*
 node_modules
 docs/.yarn/
 docs/node_modules/
 docs/.docusaurus/
 docs/.cache-loader/
 docs/_dist
 docs/api_reference/_build
 docs/docs_skeleton/build
 docs/docs_skeleton/node_modules
 docs/docs_skeleton/yarn.lock

4

.gitmodules vendored Normal file


@@ -0,0 +1,4 @@
 [submodule "docs/_docs_skeleton"]
 	path = docs/_docs_skeleton
 	url = https://github.com/langchain-ai/langchain-shared-docs
 	branch = main

									
										7

.readthedocs.yaml
									
												
				@@ -9,10 +9,13 @@ build:

				  os: ubuntu-22.04

				  tools:

				    python: "3.11"

				  jobs:

				    pre_build:

				      - python docs/api_reference/create_api_rst.py

				# Build documentation in the docs/ directory with Sphinx

				sphinx:

				   configuration: docs/conf.py

				   configuration: docs/api_reference/conf.py

				# If using Sphinx, optionally build your docs in additional formats such as PDF

				# formats:

				@@ -23,4 +26,4 @@ python:

				   install:

				   - requirements: docs/requirements.txt

				   - method: pip

				     path: .

				     path: .

									
										3

Makefile
									
												
				@@ -10,6 +10,9 @@ coverage:

				clean: docs_clean

				docs_compile:

					poetry run nbdoc_build --srcdir $(srcdir)

				docs_build:

					cd docs && poetry run make html

									
										14

README.md
									
												
				@@ -2,9 +2,9 @@

				⚡ Building applications with LLMs through composability ⚡

				[![Release Notes](https://img.shields.io/github/release/hwchase17/langchain)](https://github.com/hwchase17/langchain/releases)

				[![lint](https://github.com/hwchase17/langchain/actions/workflows/lint.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/lint.yml)

				[![test](https://github.com/hwchase17/langchain/actions/workflows/test.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/test.yml)

				[![linkcheck](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml)

				[![Downloads](https://static.pepy.tech/badge/langchain/month)](https://pepy.tech/project/langchain)

				[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

				[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)

				@@ -12,6 +12,8 @@

				[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)

				[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hwchase17/langchain)

				[![GitHub star chart](https://img.shields.io/github/stars/hwchase17/langchain?style=social)](https://star-history.com/#hwchase17/langchain)

				[![Dependency Status](https://img.shields.io/librariesio/github/hwchase17/langchain)](https://libraries.io/github/hwchase17/langchain)

				[![Open Issues](https://img.shields.io/github/issues-raw/hwchase17/langchain)](https://github.com/hwchase17/langchain/issues)

				Looking for the JS/TS version? Check out [LangChain.js](https://github.com/hwchase17/langchainjs).

				@@ -33,22 +35,22 @@ This library aims to assist in the development of those types of applications. C

				**❓ Question Answering over specific documents**

				- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/question_answering.html)

				- [Documentation](https://python.langchain.com/docs/use_cases/question_answering/)

				- End-to-end Example: [Question Answering over Notion Database](https://github.com/hwchase17/notion-qa)

				**💬 Chatbots**

				- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/chatbots.html)

				- [Documentation](https://python.langchain.com/docs/use_cases/chatbots/)

				- End-to-end Example: [Chat-LangChain](https://github.com/hwchase17/chat-langchain)

				**🤖 Agents**

				- [Documentation](https://langchain.readthedocs.io/en/latest/modules/agents.html)

				- [Documentation](https://python.langchain.com/docs/modules/agents/)

				- End-to-end Example: [GPT+WolframAlpha](https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain)

				## 📖 Documentation

				Please see [here](https://langchain.readthedocs.io/en/latest/?) for full documentation on:

				Please see [here](https://python.langchain.com) for full documentation on:

				- Getting started (installation, setting up the environment, simple examples)

				- How-To examples (demos, integrations, helper functions)

				@@ -84,7 +86,7 @@ Memory refers to persisting state between calls of a chain/agent. LangChain prov

				[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.

				For more information on these concepts, please see our [full documentation](https://langchain.readthedocs.io/en/latest/).

				For more information on these concepts, please see our [full documentation](https://python.langchain.com).

				## 💁 Contributing

									
										11

.devcontainer/Dockerfile → dev.Dockerfile
									
												
				@@ -1,15 +1,15 @@

				# This is a Dockerfile for Developer Container

				# This is a Dockerfile for the Development Container

				# Use the Python base image

				ARG VARIANT="3.11-bullseye"

				FROM mcr.microsoft.com/vscode/devcontainers/python:0-${VARIANT} AS langchain-dev-base

				FROM mcr.microsoft.com/devcontainers/python:0-${VARIANT} AS langchain-dev-base

				USER vscode

				# Define the version of Poetry to install (default is 1.4.2)

				# Define the directory of python virtual environment

				ARG PYTHON_VIRTUALENV_HOME=/home/vscode/langchain-py-env \

				    POETRY_VERSION=1.4.2 

				    POETRY_VERSION=1.3.2

				ENV POETRY_VIRTUALENVS_IN_PROJECT=false \

				    POETRY_NO_INTERACTION=true 

				@@ -35,8 +35,7 @@ FROM langchain-dev-base AS langchain-dev-dependencies

				ARG PYTHON_VIRTUALENV_HOME

				# Copy only the dependency files for installation

				COPY pyproject.toml poetry.lock poetry.toml ./

				COPY pyproject.toml poetry.toml ./

				# Install the Poetry dependencies (this layer will be cached as long as the dependencies don't change)

				RUN poetry install --no-interaction --no-ansi --with dev,test,docs

				RUN poetry install --no-interaction --no-ansi --with dev,test,docs

									
										12

docs/.local_build.sh
									
										Executable file
									
												
				@@ -0,0 +1,12 @@

				mkdir _dist

				cp -r {docs_skeleton,snippets} _dist

				mkdir -p _dist/docs_skeleton/static/api_reference

				cd api_reference

				poetry run make html

				cp -r _build/* ../_dist/docs_skeleton/static/api_reference

				cd ..

				cp -r extras/* _dist/docs_skeleton/docs

				cd _dist/docs_skeleton

				poetry run nbdoc_build

				yarn install

				yarn start

									
										57

docs/additional_resources/tracing.md
									
											
				@@ -1,57 +0,0 @@

				# Tracing

				By enabling tracing in your LangChain runs, you’ll be able to more effectively visualize, step through, and debug your chains and agents.

				First, you should install tracing and set up your environment properly.

				You can use either a locally hosted version of this (uses Docker) or a cloud hosted version (in closed alpha).

				If you're interested in using the hosted platform, please fill out the form [here](https://forms.gle/tRCEMSeopZf6TE3b6).

				- [Locally Hosted Setup](../tracing/local_installation.md)

				- [Cloud Hosted Setup](../tracing/hosted_installation.md)

				## Tracing Walkthrough

				When you first access the UI, you should see a page with your tracing sessions.

				An initial one "default" should already be created for you.

				A session is just a way to group traces together.

				If you click on a session, it will take you to a page with no recorded traces that says "No Runs."

				You can create a new session with the new session form.

				![](../tracing/homepage.png)

				If we click on the `default` session, we can see that to start we have no traces stored.

				![](../tracing/default_empty.png)

				If we now start running chains and agents with tracing enabled, we will see data show up here.

				To do so, we can run [this notebook](../tracing/agent_with_tracing.ipynb) as an example.

				After running it, we will see an initial trace show up.

				![](../tracing/first_trace.png)

				From here we can explore the trace at a high level by clicking on the arrow to show nested runs.

				We can keep on clicking further and further down to explore deeper and deeper.

				![](../tracing/explore.png)

				We can also click on the "Explore" button of the top level run to dive even deeper.

				Here, we can see the inputs and outputs in full, as well as all the nested traces.

				![](../tracing/explore_trace.png)

				We can keep on exploring each of these nested traces in more detail.

				For example, here is the lowest level trace with the exact inputs/outputs to the LLM.

				![](../tracing/explore_llm.png)

				## Changing Sessions

				1. To initially record traces to a session other than `"default"`, you can set the `LANGCHAIN_SESSION` environment variable to the name of the session you want to record to:

				```python

				import os

				os.environ["LANGCHAIN_TRACING"] = "true"

				os.environ["LANGCHAIN_SESSION"] = "my_session" # Make sure this session actually exists. You can create a new session in the UI.

				```

				2. To switch sessions mid-script or mid-notebook, do NOT set the `LANGCHAIN_SESSION` environment variable. Instead: `langchain.set_tracing_callback_manager(session_name="my_session")`

0

docs/Makefile → docs/api_reference/Makefile


0

docs/_static/css/custom.css → docs/api_reference/_static/css/custom.css


1860

docs/api_reference/api_reference.rst Normal file


File diff suppressed because it is too large

									
										58

docs/conf.py → docs/api_reference/conf.py
									
												
				@@ -11,13 +11,14 @@

				# add these directories to sys.path here. If the directory is relative to the

				# documentation root, use os.path.abspath to make it absolute, like shown here.

				#

				# import os

				# import sys

				# sys.path.insert(0, os.path.abspath('.'))

				import os

				import sys

				import toml

				with open("../pyproject.toml") as f:

				sys.path.insert(0, os.path.abspath("."))

				with open("../../pyproject.toml") as f:

				    data = toml.load(f)

				# -- Project information -----------------------------------------------------

				@@ -45,26 +46,34 @@ extensions = [

				    "sphinx.ext.napoleon",

				    "sphinx.ext.viewcode",

				    "sphinxcontrib.autodoc_pydantic",

				    "myst_nb",

				    "sphinx_copybutton",

				    "sphinx_panels",

				    "IPython.sphinxext.ipython_console_highlighting",

				]

				source_suffix = [".ipynb", ".html", ".md", ".rst"]

				source_suffix = [".rst"]

				autodoc_pydantic_model_show_json = False

				autodoc_pydantic_field_list_validators = False

				autodoc_pydantic_config_members = False

				autodoc_pydantic_model_show_config_summary = False

				autodoc_pydantic_model_show_validator_members = False

				autodoc_pydantic_model_show_field_summary = False

				autodoc_pydantic_model_members = False

				autodoc_pydantic_model_undoc_members = False

				# autodoc_typehints = "signature"

				# autodoc_typehints = "description"

				autodoc_pydantic_model_show_validator_summary = False

				autodoc_pydantic_model_signature_prefix = "class"

				autodoc_pydantic_field_signature_prefix = "param"

				autodoc_member_order = "groupwise"

				autoclass_content = "both"

				autodoc_typehints_format = "short"

				autodoc_default_options = {

				    "members": True,

				    "show-inheritance": True,

				    "inherited-members": "BaseModel",

				    "undoc-members": True,

				    "special-members": "__call__",

				}

				# autodoc_typehints = "description"

				# Add any paths that contain templates here, relative to this directory.

				templates_path = ["_templates"]

				templates_path = ["templates"]

				# List of patterns, relative to source directory, that match files and

				# directories to ignore when looking for source files.

				@@ -77,20 +86,24 @@ exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

				# The theme to use for HTML and HTML Help pages.  See the documentation for

				# a list of builtin themes.

				#

				html_theme = "sphinx_book_theme"

				html_theme = "scikit-learn-modern"

				html_theme_path = ["themes"]

				html_theme_options = {

				    "path_to_docs": "docs",

				    "repository_url": "https://github.com/hwchase17/langchain",

				    "use_repository_button": True,

				# redirects dictionary maps from old links to new links

				html_additional_pages = {}

				redirects = {

				    "index": "api_reference",

				}

				for old_link in redirects:

				    html_additional_pages[old_link] = "redirects.html"

				html_context = {

				    "display_github": True,  # Integrate GitHub

				    "github_user": "hwchase17",  # Username

				    "github_repo": "langchain",  # Repo name

				    "github_version": "master",  # Version

				    "conf_py_path": "/docs/",  # Path in the checkout to the docs root

				    "conf_py_path": "/docs/api_reference",  # Path in the checkout to the docs root

				    "redirects": redirects,

				}

				# Add any paths that contain custom static files (such as style sheets) here,

				@@ -103,10 +116,9 @@ html_static_path = ["_static"]

				html_css_files = [

				    "css/custom.css",

				]

				html_use_index = False

				html_js_files = [

				    "js/mendablesearch.js",

				]

				nb_execution_mode = "off"

				myst_enable_extensions = ["colon_fence"]

				# generate autosummary even if no references

				autosummary_generate = True

									
										94

docs/api_reference/create_api_rst.py
									
										Normal file
									
												
				@@ -0,0 +1,94 @@

				"""Script for auto-generating api_reference.rst"""

				import glob

				import re

				from pathlib import Path

				ROOT_DIR = Path(__file__).parents[2].absolute()

				PKG_DIR = ROOT_DIR / "langchain"

				WRITE_FILE = Path(__file__).parent / "api_reference.rst"

				def load_members() -> dict:

				    members: dict = {}

				    for py in glob.glob(str(PKG_DIR) + "/**/*.py", recursive=True):

				        module = py[len(str(PKG_DIR)) + 1 :].replace(".py", "").replace("/", ".")

				        top_level = module.split(".")[0]

				        if top_level not in members:

				            members[top_level] = {"classes": [], "functions": []}

				        with open(py, "r") as f:

				            for line in f.readlines():

				                cls = re.findall(r"^class ([^_].*)\(", line)

				                members[top_level]["classes"].extend([module + "." + c for c in cls])

				                func = re.findall(r"^def ([^_].*)\(", line)

				                members[top_level]["functions"].extend([module + "." + f for f in func])

				    return members

				def construct_doc(members: dict) -> str:

				    full_doc = """\

				.. _api_reference:

				=============

				API Reference

				=============

				"""

				    for module, _members in sorted(members.items(), key=lambda kv: kv[0]):

				        classes = _members["classes"]

				        functions = _members["functions"]

				        if not (classes or functions):

				            continue

				        module_title = module.replace("_", " ").title()

				        if module_title == "Llms":

				            module_title = "LLMs"

				        section = f":mod:`langchain.{module}`: {module_title}"

				        full_doc += f"""\

				{section}

				{'=' * (len(section) + 1)}

				.. automodule:: langchain.{module}

				    :no-members:

				    :no-inherited-members:

				"""

				        if classes:

				            cstring = "\n    ".join(sorted(classes))

				            full_doc += f"""\

				Classes

				--------------

				.. currentmodule:: langchain

				.. autosummary::

				    :toctree: {module}

				    :template: class.rst

				    {cstring}

				"""

				        if functions:

				            fstring = "\n    ".join(sorted(functions))

				            full_doc += f"""\

				Functions

				--------------

				.. currentmodule:: langchain

				.. autosummary::

				    :toctree: {module}

				    {fstring}

				"""

				    return full_doc

				def main() -> None:

				    members = load_members()

				    full_doc = construct_doc(members)

				    with open(WRITE_FILE, "w") as f:

				        f.write(full_doc)

				if __name__ == "__main__":

				    main()
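
The member scan in `load_members` can be tried on an in-memory string instead of the package tree. The following is a minimal sketch, not the script itself: `scan_source`, `SAMPLE`, and the module name `prompts.example` are hypothetical, and the patterns are modeled on the script's with the captured name limited to word characters.

```python
import re

# Patterns modeled on load_members: a top-level "class"/"def" whose name does
# not start with an underscore. Limiting the name to \w* is this sketch's choice.
CLASS_RE = re.compile(r"^class ([^_]\w*)\(")
FUNC_RE = re.compile(r"^def ([^_]\w*)\(")


def scan_source(module: str, source: str) -> dict:
    """Collect public top-level classes and functions from source text."""
    found: dict = {"classes": [], "functions": []}
    for line in source.splitlines():
        found["classes"].extend(f"{module}.{c}" for c in CLASS_RE.findall(line))
        found["functions"].extend(f"{module}.{f}" for f in FUNC_RE.findall(line))
    return found


# Hypothetical module body used only for illustration.
SAMPLE = '''\
class PromptTemplate(BaseModel):
    def format(self, **kwargs):
        pass

class _Hidden(object):
    pass

def load_prompt(path):
    pass

def _private_helper(x):
    pass
'''

members = scan_source("prompts.example", SAMPLE)
print(members)
```

Indented methods and underscore-prefixed names are skipped because the patterns anchor at column zero and reject a leading underscore. Classes declared without parentheses (`class Foo:`) are also skipped, which matches the behavior of the patterns in the script.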

									
										8

docs/api_reference/index.rst
									
										Normal file
									
				@@ -0,0 +1,8 @@

				=============

				LangChain API

				=============

				.. toctree::

				    :maxdepth: 2

				    api_reference.rst

0

docs/make.bat → docs/api_reference/make.bat

27

docs/api_reference/templates/COPYRIGHT.txt Normal file

@@ -0,0 +1,27 @@
 Copyright (c) 2007-2023 The scikit-learn developers.
 All rights reserved.
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:
 * Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.
 * Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.
 * Neither the name of the copyright holder nor the names of its
   contributors may be used to endorse or promote products derived from
   this software without specific prior written permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
 FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

									
										28

docs/api_reference/templates/class.rst
									
										Normal file
									
				@@ -0,0 +1,28 @@

				:mod:`{{module}}`.{{objname}}

				{{ underline }}==============

				.. currentmodule:: {{ module }}

				.. autoclass:: {{ objname }}

				   {% block methods %}

				   {% if methods %}

				   .. rubric:: {{ _('Methods') }}

				   .. autosummary::

				   {% for item in methods %}

				      ~{{ name }}.{{ item }}

				   {%- endfor %}

				   {% endif %}

				   {% endblock %}

				   {% block attributes %}

				   {% if attributes %}

				   .. rubric:: {{ _('Attributes') }}

				   .. autosummary::

				   {% for item in attributes %}

				      ~{{ name }}.{{ item }}

				   {%- endfor %}

				   {% endif %}

				   {% endblock %}

									
										15

docs/api_reference/templates/redirects.html
									
										Normal file
									
				@@ -0,0 +1,15 @@

				{% set redirect = pathto(redirects[pagename]) %}

				<!DOCTYPE html>

				<html>

				  <head>

				    <meta charset="utf-8">

				    <meta name="viewport" content="width=device-width, initial-scale=1.0">

				    <meta http-equiv="Refresh" content="0; url={{ redirect }}" />

				    <meta name="Description" content="scikit-learn: machine learning in Python">

				    <link rel="canonical" href="{{ redirect }}" />

				    <title>scikit-learn: machine learning in Python</title>

				  </head>

				  <body>

				    <p>You will be automatically redirected to the <a href="{{ redirect }}">new location of this page</a>.</p>

				  </body>

				</html>

27

docs/api_reference/themes/COPYRIGHT.txt Normal file

@@ -0,0 +1,27 @@
 Copyright (c) 2007-2023 The scikit-learn developers.
 All rights reserved.
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:
 * Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.
 * Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.
 * Neither the name of the copyright holder nor the names of its
   contributors may be used to endorse or promote products derived from
   this software without specific prior written permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
 FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

									
										67

docs/api_reference/themes/scikit-learn-modern/javascript.html
									
										Normal file
									
				@@ -0,0 +1,67 @@

				<script>

				$(document).ready(function() {

				    /* Add a [>>>] button on the top-right corner of code samples to hide

				     * the >>> and ... prompts and the output and thus make the code

				     * copyable. */

				    var div = $('.highlight-python .highlight,' +

				                '.highlight-python3 .highlight,' +

				                '.highlight-pycon .highlight,' +

						'.highlight-default .highlight')

				    var pre = div.find('pre');

				    // get the styles from the current theme

				    pre.parent().parent().css('position', 'relative');

				    var hide_text = 'Hide prompts and outputs';

				    var show_text = 'Show prompts and outputs';

				    // create and add the button to all the code blocks that contain >>>

				    div.each(function(index) {

				        var jthis = $(this);

				        if (jthis.find('.gp').length > 0) {

				            var button = $('<span class="copybutton">&gt;&gt;&gt;</span>');

				            button.attr('title', hide_text);

				            button.data('hidden', 'false');

				            jthis.prepend(button);

				        }

				        // tracebacks (.gt) contain bare text elements that need to be

				        // wrapped in a span to work with .nextUntil() (see later)

				        jthis.find('pre:has(.gt)').contents().filter(function() {

				            return ((this.nodeType == 3) && (this.data.trim().length > 0));

				        }).wrap('<span>');

				    });

				    // define the behavior of the button when it's clicked

				    $('.copybutton').click(function(e){

				        e.preventDefault();

				        var button = $(this);

				        if (button.data('hidden') === 'false') {

				            // hide the code output

				            button.parent().find('.go, .gp, .gt').hide();

				            button.next('pre').find('.gt').nextUntil('.gp, .go').css('visibility', 'hidden');

				            button.css('text-decoration', 'line-through');

				            button.attr('title', show_text);

				            button.data('hidden', 'true');

				        } else {

				            // show the code output

				            button.parent().find('.go, .gp, .gt').show();

				            button.next('pre').find('.gt').nextUntil('.gp, .go').css('visibility', 'visible');

				            button.css('text-decoration', 'none');

				            button.attr('title', hide_text);

				            button.data('hidden', 'false');

				        }

				    });

					/*** Add permalink buttons next to glossary terms ***/

					$('dl.glossary > dt[id]').append(function() {

						return ('<a class="headerlink" href="#' +

							    this.getAttribute('id') +

							    '" title="Permalink to this term">¶</a>');

					});

				});

				</script>

				{%- if pagename != 'index' and pagename != 'documentation' %}

				    {% if theme_mathjax_path %}

				<script id="MathJax-script" async src="{{ theme_mathjax_path }}"></script>

				    {% endif %}

				{%- endif %}

									
										142

docs/api_reference/themes/scikit-learn-modern/layout.html
									
										Normal file
									
				@@ -0,0 +1,142 @@

				{# TEMPLATE VAR SETTINGS #}

				{%- set url_root = pathto('', 1) %}

				{%- if url_root == '#' %}{% set url_root = '' %}{% endif %}

				{%- if not embedded and docstitle %}

				  {%- set titlesuffix = " &mdash; "|safe + docstitle|e %}

				{%- else %}

				  {%- set titlesuffix = "" %}

				{%- endif %}

				{%- set lang_attr = 'en' %}

				<!DOCTYPE html>

				<!--[if IE 8]><html class="no-js lt-ie9" lang="{{ lang_attr }}" > <![endif]-->

				<!--[if gt IE 8]><!--> <html class="no-js" lang="{{ lang_attr }}" > <!--<![endif]-->

				<head>

				  <meta charset="utf-8">

				  {{ metatags }}

				  <meta name="viewport" content="width=device-width, initial-scale=1.0">

				  {% block htmltitle %}

				  <title>{{ title|striptags|e }}{{ titlesuffix }}</title>

				  {% endblock %}

				  <link rel="canonical" href="http://scikit-learn.org/stable/{{pagename}}.html" />

				  {% if favicon_url %}

				  <link rel="shortcut icon" href="{{ favicon_url|e }}"/>

				  {% endif %}

				  <link rel="stylesheet" href="{{ pathto('_static/css/vendor/bootstrap.min.css', 1) }}" type="text/css" />

				  {%- for css in css_files %}

				    {%- if css|attr("rel") %}

				  <link rel="{{ css.rel }}" href="{{ pathto(css.filename, 1) }}" type="text/css"{% if css.title is not none %} title="{{ css.title }}"{% endif %} />

				    {%- else %}

				  <link rel="stylesheet" href="{{ pathto(css, 1) }}" type="text/css" />

				    {%- endif %}

				  {%- endfor %}

				  <link rel="stylesheet" href="{{ pathto('_static/' + style, 1) }}" type="text/css" />

				<script id="documentation_options" data-url_root="{{ pathto('', 1) }}" src="{{ pathto('_static/documentation_options.js', 1) }}"></script>

				<script src="{{ pathto('_static/jquery.js', 1) }}"></script>

				{%- block extrahead %} {% endblock %}

				</head>

				<body>

				{% include "nav.html" %}

				{%- block content %}

				<div class="d-flex" id="sk-doc-wrapper">

				    <input type="checkbox" name="sk-toggle-checkbox" id="sk-toggle-checkbox">

				    <label id="sk-sidemenu-toggle" class="sk-btn-toggle-toc btn sk-btn-primary" for="sk-toggle-checkbox">Toggle Menu</label>

				    <div id="sk-sidebar-wrapper" class="border-right">

				      <div class="sk-sidebar-toc-wrapper">

				        <div class="btn-group w-100 mb-2" role="group" aria-label="rellinks">

				          {%- if prev %}

				            <a href="{{ prev.link|e }}" role="button" class="btn sk-btn-rellink py-1" sk-rellink-tooltip="{{ prev.title|striptags }}">Prev</a>

				          {%- else %}

				            <a href="#" role="button" class="btn sk-btn-rellink py-1 disabled">Prev</a>

				          {%- endif %}

				          {%- if parents -%}

				            <a href="{{ parents[-1].link|e }}" role="button" class="btn sk-btn-rellink py-1" sk-rellink-tooltip="{{ parents[-1].title|striptags }}">Up</a>

				          {%- else %}

				            <a href="#" role="button" class="btn sk-btn-rellink disabled py-1">Up</a>

				          {%- endif %}

				          {%- if next %}

				            <a href="{{ next.link|e }}" role="button" class="btn sk-btn-rellink py-1" sk-rellink-tooltip="{{ next.title|striptags }}">Next</a>

				          {%- else %}

				            <a href="#" role="button" class="btn sk-btn-rellink py-1 disabled">Next</a>

				          {%- endif %}

				        </div>

				        {%- if pagename != "install" %}

				        <div class="alert alert-warning p-1 mb-2" role="alert">

				          <p class="text-center mb-0">

				          <strong>LangChain {{ release }}</strong><br/>

				          </p>

				        </div>

				        {%- endif %}

				            {%- if meta and meta['parenttoc']|tobool %}

				            <div class="sk-sidebar-toc">

				            {% set nav = get_nav_object(maxdepth=3, collapse=True, numbered=True) %}

				              <ul>

				              {% for main_nav_item in nav %}

				              {% if main_nav_item.active %}

				              <li>

				                <a href="{{ main_nav_item.url }}" class="sk-toc-active">{{ main_nav_item.title }}</a>

				              </li>

				              <ul>

				              {% for nav_item in main_nav_item.children %}

				                <li>

				                  <a href="{{ nav_item.url }}" class="{% if nav_item.active %}sk-toc-active{% endif %}">{{ nav_item.title }}</a>

				                  {% if nav_item.children %}

				                  <ul>

				                    {% for inner_child in nav_item.children %}

				                      <li class="sk-toctree-l3">

				                        <a href="{{ inner_child.url }}">{{ inner_child.title }}</a>

				                      </li>

				                    {% endfor %}

				                  </ul>

				                  {% endif %}

				                </li>

				              {% endfor %}

				              </ul>

				              {% endif %}

				              {% endfor %}

				              </ul>

				            </div>

				            {%- elif meta and meta['globalsidebartoc']|tobool %}

				            <div class="sk-sidebar-toc sk-sidebar-global-toc">

				              {{ toctree(maxdepth=2, titles_only=True) }}

				            </div>

				            {%- else %}

				            <div class="sk-sidebar-toc">

				              {{ toc }}

				            </div>

				            {%- endif %}

				      </div>

				    </div>

				    <div id="sk-page-content-wrapper">

				      <div class="sk-page-content container-fluid body px-md-3" role="main">

				        {% block body %}{% endblock %}

				      </div>

				    <div class="container">

				      <footer class="sk-content-footer">

				        {%- if pagename != 'index' %}

				        {%- if show_copyright %}

				          {%- if hasdoc('copyright') %}

				            {% trans path=pathto('copyright'), copyright=copyright|e %}&copy; {{ copyright }}.{% endtrans %}

				          {%- else %}

				            {% trans copyright=copyright|e %}&copy; {{ copyright }}.{% endtrans %}

				          {%- endif %}

				        {%- endif %}

				        {%- if last_updated %}

				          {% trans last_updated=last_updated|e %}Last updated on {{ last_updated }}.{% endtrans %}

				        {%- endif %}

				        {%- if show_source and has_source and sourcename %}

				          <a href="{{ pathto('_sources/' + sourcename, true)|e }}" rel="nofollow">{{ _('Show this page source') }}</a>

				        {%- endif %}

				        {%- endif %}

				      </footer>

				    </div>

				  </div>

				</div>

				{%- endblock %}

				<script src="{{ pathto('_static/js/vendor/bootstrap.min.js', 1) }}"></script>

				{% include "javascript.html" %}

				</body>

				</html>

									
										85

docs/api_reference/themes/scikit-learn-modern/nav.html
									
										Normal file
									
				@@ -0,0 +1,85 @@

				{%- if pagename != 'index' and pagename != 'documentation' %}

				  {%- set nav_bar_class = "sk-docs-navbar" %}

				  {%- set top_container_cls = "sk-docs-container" %}

				{%- else %}

				  {%- set nav_bar_class = "sk-landing-navbar" %}

				  {%- set top_container_cls = "sk-landing-container" %}

				{%- endif %}

				{% if theme_link_to_live_contributing_page|tobool %}

				{# Link to development page for live builds #}

				  {%- set development_link = "https://scikit-learn.org/dev/developers/index.html" %}

				{# Open on a new development page in new window/tab for live builds #}

				  {%- set development_attrs = 'target="_blank" rel="noopener noreferrer"' %}

				{%- else %}

				  {%- set development_link = pathto('developers/index') %}

				  {%- set development_attrs = '' %}

				{%- endif %}

				{# title, link, link_attrs #}

				{%- set drop_down_navigation = [

				  ('Getting Started', pathto('getting_started'), ''),

				  ('Tutorial', pathto('tutorial/index'), ''),

				  ("What's new", pathto('whats_new/v' + version), ''),

				  ('Glossary', pathto('glossary'), ''),

				  ('Development', development_link, development_attrs),

				  ('FAQ', pathto('faq'), ''),

				  ('Support', pathto('support'), ''),

				  ('Related packages', pathto('related_projects'), ''),

				  ('Roadmap', pathto('roadmap'), ''),

				  ('Governance', pathto('governance'), ''),

				  ('About us', pathto('about'), ''),

				  ('GitHub', 'https://github.com/scikit-learn/scikit-learn', ''),

				  ('Other Versions and Download', 'https://scikit-learn.org/dev/versions.html', '')]

				-%}

				<nav id="navbar" class="{{ nav_bar_class }} navbar navbar-expand-md navbar-light bg-light py-0">

				  <div class="container-fluid {{ top_container_cls }} px-0">

				    {%- if logo_url %}

				      <a class="navbar-brand py-0" href="{{ pathto('index') }}">

				        <img

				          class="sk-brand-img"

				          src="{{ logo_url|e }}"

				          alt="logo"/>

				      </a>

				    {%- endif %}

				    <button

				      id="sk-navbar-toggler"

				      class="navbar-toggler"

				      type="button"

				      data-toggle="collapse"

				      data-target="#navbarSupportedContent"

				      aria-controls="navbarSupportedContent"

				      aria-expanded="false"

				      aria-label="Toggle navigation"

				    >

				      <span class="navbar-toggler-icon"></span>

				    </button>

				    <div class="sk-navbar-collapse collapse navbar-collapse" id="navbarSupportedContent">

				      <ul class="navbar-nav mr-auto">

				        <li class="nav-item">

				          <a class="sk-nav-link nav-link" href="{{ pathto('api_reference') }}">API</a>

				        </li>

				        <li class="nav-item">

				          <a class="sk-nav-link nav-link" target="_blank" rel="noopener noreferrer" href="https://python.langchain.com/">Python Docs</a>

				        </li>

				        {%- for title, link, link_attrs in drop_down_navigation %}

				        <li class="nav-item">

				          <a class="sk-nav-link nav-link nav-more-item-mobile-items" href="{{ link }}" {{ link_attrs }}>{{ title }}</a>

				        </li>

				        {%- endfor %}

				      </ul>

				      {%- if pagename != "search"%}

				      <div id="searchbox" role="search">

				          <div class="searchformwrapper">

				          <form class="search" action="{{ pathto('search') }}" method="get">

				            <input class="sk-search-text-input" type="text" name="q" aria-labelledby="searchlabel" />

				            <input class="sk-search-text-btn" type="submit" value="{{ _('Go') }}" />

				          </form>

				          </div>

				      </div>

				      {%- endif %}

				    </div>

				  </div>

				</nav>

									
										16

docs/api_reference/themes/scikit-learn-modern/search.html
									
										Normal file
									
				@@ -0,0 +1,16 @@

				{%- extends "basic/search.html" %}

				{% block extrahead %}

				  <script type="text/javascript" src="{{ pathto('_static/underscore.js', 1) }}"></script>

				  <script type="text/javascript" src="{{ pathto('searchindex.js', 1) }}" defer></script>

				  <script type="text/javascript" src="{{ pathto('_static/doctools.js', 1) }}"></script>

				  <script type="text/javascript" src="{{ pathto('_static/language_data.js', 1) }}"></script>

				  <script type="text/javascript" src="{{ pathto('_static/searchtools.js', 1) }}"></script>

				  <!-- <script type="text/javascript" src="{{ pathto('_static/sphinx_highlight.js', 1) }}"></script> -->

				  <script type="text/javascript">

				    $(document).ready(function() {

				      if (!Search.out) {

				        Search.init();

				      }

				    });

				  </script>

				{% endblock %}

1395

docs/api_reference/themes/scikit-learn-modern/static/css/theme.css Normal file

File diff suppressed because it is too large Load Diff

6

docs/api_reference/themes/scikit-learn-modern/static/css/vendor/bootstrap.min.css vendored Normal file

File diff suppressed because one or more lines are too long

6

docs/api_reference/themes/scikit-learn-modern/static/js/vendor/bootstrap.min.js vendored Normal file

File diff suppressed because one or more lines are too long

2

docs/api_reference/themes/scikit-learn-modern/static/js/vendor/jquery-3.6.3.slim.min.js vendored Normal file

File diff suppressed because one or more lines are too long

8

docs/api_reference/themes/scikit-learn-modern/theme.conf Normal file

@@ -0,0 +1,8 @@
 [theme]
 inherit = basic
 pygments_style = default
 stylesheet = css/theme.css
 [options]
 link_to_live_contributing_page = false
 mathjax_path =
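
For orientation, a theme declared this way is normally selected from the Sphinx `conf.py`. The fragment below is a hedged sketch of what that could look like, not the configuration shipped in this commit: the `themes` path and the option values are assumptions, while the option names are taken from the `[options]` section above.

```python
# Hypothetical excerpt of a Sphinx conf.py placed in docs/api_reference/.
# The option names under html_theme_options come from the [options] section
# of theme.conf; the values shown here are illustrative assumptions.

html_theme = "scikit-learn-modern"  # directory name of the theme
html_theme_path = ["themes"]  # searched relative to conf.py
html_theme_options = {
    "link_to_live_contributing_page": False,
    "mathjax_path": "",  # empty, so javascript.html skips the MathJax tag
}
```

Sphinx exposes each theme option to templates with a `theme_` prefix, which is why the templates in this theme read `theme_mathjax_path` and `theme_link_to_live_contributing_page`.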

7

docs/docs_skeleton/.gitignore vendored Normal file

@@ -0,0 +1,7 @@
 .yarn/
 node_modules/
 .docusaurus
 .cache-loader
 docs/api

									
										49

docs/docs_skeleton/README.md
									
										Normal file
									
				@@ -0,0 +1,49 @@

				# Website

				This website is built using [Docusaurus 2](https://docusaurus.io/), a modern static website generator.

				### Installation

				```

				$ yarn

				```

				### Local Development

				```

				$ yarn start

				```

				This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

				### Build

				```

				$ yarn build

				```

				This command generates static content into the `build` directory, which can be served using any static content hosting service.

				### Deployment

				Using SSH:

				```

				$ USE_SSH=true yarn deploy

				```

				Not using SSH:

				```

				$ GIT_USER=<Your GitHub username> yarn deploy

				```

				If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.

				### Continuous Integration

				Some common defaults for linting/formatting have been set for you. If you integrate your project with an open source Continuous Integration system (e.g. Travis CI, CircleCI), you may check for issues using the following command.

				```

				$ yarn ci

				```

									
										12

docs/docs_skeleton/babel.config.js
									
										Normal file
									
				@@ -0,0 +1,12 @@

				/**

				 * Copyright (c) Meta Platforms, Inc. and affiliates.

				 *

				 * This source code is licensed under the MIT license found in the

				 * LICENSE file in the root directory of this source tree.

				 *

				 * @format

				 */

				module.exports = {

				  presets: [require.resolve("@docusaurus/core/lib/babel/preset")],

				};

									
										76

docs/docs_skeleton/code-block-loader.js
									
										Normal file
									
				@@ -0,0 +1,76 @@

				/* eslint-disable prefer-template */

				/* eslint-disable no-param-reassign */

				// eslint-disable-next-line import/no-extraneous-dependencies

				const babel = require("@babel/core");

				const path = require("path");

				const fs = require("fs");

				/**

				 *

				 * @param {string|Buffer} content Content of the resource file

				 * @param {object} [map] SourceMap data consumable by https://github.com/mozilla/source-map

				 * @param {any} [meta] Meta data, could be anything

				 */

				async function webpackLoader(content, map, meta) {

				  const cb = this.async();

				  if (!this.resourcePath.endsWith(".ts")) {

				    cb(null, JSON.stringify({ content, imports: [] }), map, meta);

				    return;

				  }

				  try {

				    const result = await babel.parseAsync(content, {

				      sourceType: "module",

				      filename: this.resourcePath,

				    });

				    const imports = [];

				    result.program.body.forEach((node) => {

				      if (node.type === "ImportDeclaration") {

				        const source = node.source.value;

				        if (!source.startsWith("langchain")) {

				          return;

				        }

				        node.specifiers.forEach((specifier) => {

				          if (specifier.type === "ImportSpecifier") {

				            const local = specifier.local.name;

				            const imported = specifier.imported.name;

				            imports.push({ local, imported, source });

				          } else {

				            throw new Error("Unsupported import type");

				          }

				        });

				      }

				    });

				    imports.forEach((imp) => {

				      const { imported, source } = imp;

				      const moduleName = source.split("/").slice(1).join("_");

				      const docsPath = path.resolve(__dirname, "docs", "api", moduleName);

				      const available = fs.readdirSync(docsPath, { withFileTypes: true });

				      const found = available.find(

				        (dirent) =>

				          dirent.isDirectory() &&

				          fs.existsSync(path.resolve(docsPath, dirent.name, imported + ".md"))

				      );

				      if (found) {

				        imp.docs =

				          "/" + path.join("docs", "api", moduleName, found.name, imported);

				      } else {

				        throw new Error(

				          `Could not find docs for ${source}.${imported} in docs/api/`

				        );

				      }

				    });

				    cb(null, JSON.stringify({ content, imports }), map, meta);

				  } catch (err) {

				    cb(err);

				  }

				}

				module.exports = webpackLoader;
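
The path arithmetic in the loader is easy to misread, so here is a small Python sketch of the same mapping from an import source to a docs link. It is an illustration under assumptions: `module_name_from_source` and `docs_link` are hypothetical names, and `category` stands in for the subdirectory that the loader discovers by scanning `docs/api/<moduleName>/` on disk.

```python
from pathlib import PurePosixPath


def module_name_from_source(source: str) -> str:
    """Mirror of source.split("/").slice(1).join("_") in the loader."""
    return "_".join(source.split("/")[1:])


def docs_link(source: str, category: str, imported: str) -> str:
    """Build the kind of link the loader stores in imp.docs.

    `category` is a hypothetical input here; the loader finds it by looking
    for a directory that contains "<imported>.md".
    """
    module_name = module_name_from_source(source)
    return "/" + str(PurePosixPath("docs", "api", module_name, category, imported))


print(module_name_from_source("langchain/chat_models/openai"))
print(docs_link("langchain/llms/openai", "classes", "OpenAI"))
```

Unlike this sketch, the loader throws when no matching `.md` file exists, so a missing API page fails the build rather than producing a dead link.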

0

docs/_static/ApifyActors.png → docs/docs_skeleton/docs/_static/ApifyActors.png vendored

Image size: 559 KiB before and after the move.

0

docs/_static/DataberryDashboard.png → docs/docs_skeleton/docs/_static/DataberryDashboard.png vendored

Image size: 157 KiB before and after the move.

0

docs/_static/HeliconeDashboard.png → docs/docs_skeleton/docs/_static/HeliconeDashboard.png vendored

Image size: 235 KiB before and after the move.

0

docs/_static/HeliconeKeys.png → docs/docs_skeleton/docs/_static/HeliconeKeys.png vendored

Image size: 148 KiB before and after the move.

0

docs/_static/MetalDash.png → docs/docs_skeleton/docs/_static/MetalDash.png vendored

Image size: 3.5 MiB before and after the move.

BIN
docs/docs_skeleton/docs/_static/android-chrome-192x192.png vendored Normal file

Binary file not shown. Size: 18 KiB

BIN
docs/docs_skeleton/docs/_static/android-chrome-512x512.png vendored Normal file

Binary file not shown. Size: 85 KiB

BIN
docs/docs_skeleton/docs/_static/apple-touch-icon.png vendored Normal file

Binary file not shown. Size: 16 KiB

									
										21

docs/docs_skeleton/docs/_static/css/custom.css
									
										vendored
									
										Normal file
									
				@@ -0,0 +1,21 @@

				pre {

				  white-space: break-spaces;

				}

				@media (min-width: 1200px) {

				  .container,

				  .container-lg,

				  .container-md,

				  .container-sm,

				  .container-xl {

				    max-width: 2560px !important;

				  }

				}

				#my-component-root *, #headlessui-portal-root * {

				  z-index: 10000;

				}

				.content-container p {

				    margin: revert;

				}

BIN
docs/docs_skeleton/docs/_static/favicon-16x16.png vendored Normal file

Binary file not shown. Size: 542 B

BIN
docs/docs_skeleton/docs/_static/favicon-32x32.png vendored Normal file

Binary file not shown. Size: 1.2 KiB

BIN
docs/docs_skeleton/docs/_static/favicon.ico vendored Normal file

Binary file not shown. Size: 15 KiB

0

docs/_static/js/mendablesearch.js → docs/docs_skeleton/docs/_static/js/mendablesearch.js vendored

BIN
docs/docs_skeleton/docs/_static/lc_modules.jpg vendored Normal file

Binary file not shown. Size: 103 KiB

BIN
docs/docs_skeleton/docs/_static/parrot-chainlink-icon.png vendored Normal file

Binary file not shown. Size: 136 KiB

BIN
docs/docs_skeleton/docs/_static/parrot-icon.png vendored Normal file

Binary file not shown. Size: 34 KiB

8

docs/docs_skeleton/docs/ecosystem/integrations/index.mdx Normal file

@@ -0,0 +1,8 @@
 ---
 sidebar_position: 0
 ---
 # Integrations
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

5

docs/docs_skeleton/docs/get_started/installation.mdx Normal file

@@ -0,0 +1,5 @@
 # Installation
 import Installation from "@snippets/get_started/installation.mdx"
 <Installation/>

65

docs/docs_skeleton/docs/get_started/introduction.mdx Normal file

@@ -0,0 +1,65 @@
 ---
 sidebar_position: 0
 ---
 # Introduction
 **LangChain** is a framework for developing applications powered by language models. It enables applications that are:
 - **Data-aware**: connect a language model to other sources of data
 - **Agentic**: allow a language model to interact with its environment
 The main value props of LangChain are:
 1. **Components**: abstractions for working with language models, along with a collection of implementations for each abstraction. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
 2. **Off-the-shelf chains**: a structured assembly of components for accomplishing specific higher-level tasks
 Off-the-shelf chains make it easy to get started. For more complex applications and nuanced use-cases, components make it easy to customize existing chains or build new ones.
 ## Get started
 [Here’s](/docs/get_started/installation.html) how to install LangChain, set up your environment, and start building.
 We recommend following our [Quickstart](/docs/get_started/quickstart.html) guide to familiarize yourself with the framework by building your first LangChain application.
 _**Note**: These docs are for the LangChain [Python package](https://github.com/hwchase17/langchain). For documentation on [LangChain.js](https://github.com/hwchase17/langchainjs), the JS/TS version, [head here](https://js.langchain.com/docs)._
 ## Modules
 LangChain provides standard, extendable interfaces and external integrations for the following modules, listed from least to most complex:
 #### [Model I/O](/docs/modules/model_io/)
 Interface with language models
 #### [Data connection](/docs/modules/data_connection/)
 Interface with application-specific data
 #### [Chains](/docs/modules/chains/)
 Construct sequences of calls
 #### [Agents](/docs/modules/agents/)
 Let chains choose which tools to use given high-level directives
 #### [Memory](/docs/modules/memory/)
 Persist application state between runs of a chain
 #### [Callbacks](/docs/modules/callbacks/)
 Log and stream intermediate steps of any chain
 ## Examples, ecosystem, and resources
 ### [Use cases](/docs/use_cases/)
 Walkthroughs and best-practices for common end-to-end use cases, like:
 - [Chatbots](/docs/use_cases/chatbots/)
 - [Answering questions using sources](/docs/use_cases/question_answering/)
 - [Analyzing structured data](/docs/use_cases/tabular.html)
 - and much more...
 ### [Guides](/docs/guides/)
 Learn best practices for developing with LangChain.
 ### [Ecosystem](/docs/ecosystem/)
 LangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. Check out our growing list of [integrations](/docs/ecosystem/integrations/) and [dependent repos](/docs/ecosystem/dependents.html).
 ### [Additional resources](/docs/additional_resources/)
 Our community is full of prolific developers, creative builders, and fantastic teachers. Check out [YouTube tutorials](/docs/additional_resources/youtube.html) for great tutorials from folks in the community, and [Gallery](https://github.com/kyrolabs/awesome-langchain) for a list of awesome LangChain projects, compiled by the folks at [KyroLabs](https://kyrolabs.com).
 <h3><span style={{color:"#2e8555"}}> Support </span></h3>
 Join us on [GitHub](https://github.com/hwchase17/langchain) or [Discord](https://discord.gg/6adMQxSpJS) to ask questions, share feedback, meet other developers building with LangChain, and dream about the future of LLMs.
 ## API reference
 Head to the [reference](https://api.python.langchain.com) section for full documentation of all classes and methods in the LangChain Python package.

158

docs/docs_skeleton/docs/get_started/quickstart.mdx Normal file

@@ -0,0 +1,158 @@
 # Quickstart
 ## Installation
 To install LangChain run:
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 import Install from "@snippets/get_started/quickstart/installation.mdx"
 <Install/>
 For more details, see our [Installation guide](/docs/get_started/installation.html).
 ## Environment setup
 Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc. For this example, we'll use OpenAI's model APIs.
 import OpenAISetup from "@snippets/get_started/quickstart/openai_setup.mdx"
 <OpenAISetup/>
 ## Building an application
 Now we can start building our language model application. LangChain provides many modules that can be used to build language model applications. Modules can be used as stand-alones in simple applications and they can be combined for more complex use cases.
 ## LLMs
 #### Get predictions from a language model
 The basic building block of LangChain is the LLM, which takes in text and generates more text.
 As an example, suppose we're building an application that generates a company name based on a company description. In order to do this, we need to initialize an OpenAI model wrapper. In this case, since we want the outputs to be MORE random, we'll initialize our model with a HIGH temperature.
 import LLM from "@snippets/get_started/quickstart/llm.mdx"
 <LLM/>
 ## Chat models
 Chat models are a variation on language models. While chat models use language models under the hood, the interface they expose is a bit different: rather than expose a "text in, text out" API, they expose an interface where "chat messages" are the inputs and outputs.
 You can get chat completions by passing one or more messages to the chat model. The response will be a message. The types of messages currently supported in LangChain are `AIMessage`, `HumanMessage`, `SystemMessage`, and `ChatMessage` -- `ChatMessage` takes in an arbitrary role parameter. Most of the time, you'll just be dealing with `HumanMessage`, `AIMessage`, and `SystemMessage`.
 import ChatModel from "@snippets/get_started/quickstart/chat_model.mdx"
 <ChatModel/>
 ## Prompt templates
 Most LLM applications do not pass user input directly into an LLM. Usually they will add the user input to a larger piece of text, called a prompt template, that provides additional context on the specific task at hand.
 In the previous example, the text we passed to the model contained instructions to generate a company name. For our application, it'd be great if the user only had to provide the description of a company/product, without having to worry about giving the model instructions.
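As a rough illustration of the idea, here is a minimal sketch that uses plain Python string formatting rather than LangChain's own prompt classes. The template text and the `build_prompt` helper are hypothetical names chosen for this sketch: the template fixes the instructions, and the user only supplies the product description.

```python
# A minimal sketch of prompt templating with plain Python string formatting.
# TEMPLATE and build_prompt are illustrative names, not LangChain APIs.
TEMPLATE = "What is a good name for a company that makes {product}?"


def build_prompt(product: str) -> str:
    """Insert the user's input into the fixed instruction text."""
    return TEMPLATE.format(product=product)


prompt = build_prompt("colorful socks")
print(prompt)
```

The snippets in the tabs that follow show the LangChain way of doing the same thing.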
 import PromptTemplateLLM from "@snippets/get_started/quickstart/prompt_templates_llms.mdx"
 import PromptTemplateChatModel from "@snippets/get_started/quickstart/prompt_templates_chat_models.mdx"
 <Tabs>
     <TabItem value="llms" label="LLMs" default>
 With PromptTemplates this is easy! In this case our template would be very simple:
 <PromptTemplateLLM/>
 </TabItem>
 <TabItem value="chat_models" label="Chat models">
 Similar to LLMs, you can make use of templating by using a `MessagePromptTemplate`. You can build a `ChatPromptTemplate` from one or more `MessagePromptTemplate`s. You can use `ChatPromptTemplate`'s `format_messages` method to generate the formatted messages.
 Because this is generating a list of messages, it is slightly more complex than the normal prompt template which is generating only a string. Please see the detailed guides on prompts to understand more options available to you here.
 <PromptTemplateChatModel/>
     </TabItem>
 </Tabs>
 ## Chains
 Now that we've got a model and a prompt template, we'll want to combine the two. Chains give us a way to link (or chain) together multiple primitives, like models, prompts, and other chains.
 import ChainLLM from "@snippets/get_started/quickstart/chains_llms.mdx"
 import ChainChatModel from "@snippets/get_started/quickstart/chains_chat_models.mdx"
 <Tabs>
 <TabItem value="llms" label="LLMs" default>
 The simplest and most common type of chain is an LLMChain, which passes an input first to a PromptTemplate and then to an LLM. We can construct an LLM chain from our existing model and prompt template.
 <ChainLLM/>
 There we go, our first chain! Understanding how this simple chain works will set you up well for working with more complex chains.
 </TabItem>
 <TabItem value="chat_models" label="Chat models">
 The `LLMChain` can be used with chat models as well:
 <ChainChatModel/>
 </TabItem>
 </Tabs>
 ## Agents
 import AgentLLM from "@snippets/get_started/quickstart/agents_llms.mdx"
 import AgentChatModel from "@snippets/get_started/quickstart/agents_chat_models.mdx"
 Our first chain ran a pre-determined sequence of steps. To handle complex workflows, we need to be able to dynamically choose actions based on inputs.
 Agents do just this: they use a language model to determine which actions to take and in what order. Agents are given access to tools, and they repeatedly choose a tool, run the tool, and observe the output until they come up with a final answer.
 To load an agent, you need to choose a(n):
 - LLM/Chat model: The language model powering the agent.
 - Tool(s): A function that performs a specific duty. This can be things like: Google Search, Database lookup, Python REPL, other chains. For a list of predefined tools and their specifications, see the [Tools documentation](/docs/modules/agents/tools/).
 - Agent name: A string that references a supported agent class. An agent class is largely parameterized by the prompt the language model uses to determine which action to take. Because this notebook focuses on the simplest, highest level API, this only covers using the standard supported agents. If you want to implement a custom agent, see [here](/docs/modules/agents/how_to/custom_agent.html). For a list of supported agents and their specifications, see [here](/docs/modules/agents/agent_types/).
 For this example, we'll be using SerpAPI to query a search engine.
 You'll need to install the SerpAPI Python package:
 ```bash
 pip install google-search-results
 ```
 And set the `SERPAPI_API_KEY` environment variable.
 <Tabs>
 <TabItem value="llms" label="LLMs" default>
 <AgentLLM/>
 </TabItem>
 <TabItem value="chat_models" label="Chat models">
 Agents can also be used with chat models. You can initialize one using `AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION` as the agent type.
 <AgentChatModel/>
 </TabItem>
 </Tabs>
 ## Memory
 The chains and agents we've looked at so far have been stateless, but for many applications it's necessary to reference past interactions. This is clearly the case with a chatbot for example, where you want it to understand new messages in the context of past messages.
 The Memory module gives you a way to maintain application state. The base Memory interface is simple: it lets you update state given the latest run inputs and outputs and it lets you modify (or contextualize) the next input using the stored state.
 There are a number of built-in memory systems. The simplest of these is a buffer memory, which just prepends the last few inputs/outputs to the current input. We will use this in the example below.
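To make the two operations of the interface concrete (update state after a run, contextualize the next input), here is a minimal sketch in plain Python. `BufferMemory` and its method names are assumptions made for this illustration and are not the LangChain classes themselves.

```python
from collections import deque


class BufferMemory:
    """Hypothetical stand-in for a buffer memory; not a LangChain class."""

    def __init__(self, k: int = 2) -> None:
        # Keep only the k most recent (input, output) exchanges.
        self.turns = deque(maxlen=k)

    def save_context(self, user_input: str, output: str) -> None:
        """Update state with the latest run's input and output."""
        self.turns.append((user_input, output))

    def contextualize(self, next_input: str) -> str:
        """Prepend the stored exchanges to the next input."""
        history = "".join(f"Human: {i}\nAI: {o}\n" for i, o in self.turns)
        return f"{history}Human: {next_input}"


memory = BufferMemory(k=2)
memory.save_context("Hi, I'm Sam.", "Hello Sam!")
print(memory.contextualize("What is my name?"))
```

The bounded `deque` is one simple way to keep only the last few exchanges; older ones are dropped automatically.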
 import MemoryLLM from "@snippets/get_started/quickstart/memory_llms.mdx"
 import MemoryChatModel from "@snippets/get_started/quickstart/memory_chat_models.mdx"
 <Tabs>
 <TabItem value="llms" label="LLMs" default>
 <MemoryLLM/>
 </TabItem>
 <TabItem value="chat_models" label="Chat models">
 You can use Memory with chains and agents initialized with chat models. The main difference between this and Memory for LLMs is that rather than trying to condense all previous messages into a string, we can keep them as their own unique memory object.
 <MemoryChatModel/>
 </TabItem>
 </Tabs>

13

docs/docs_skeleton/docs/modules/agents/agent_types/chat_conversation_agent.mdx Normal file

@@ -0,0 +1,13 @@
 # Conversational
 This walkthrough demonstrates how to use an agent optimized for conversation. Other agents are often optimized for using tools to figure out the best response, which is not ideal in a conversational setting where you may want the agent to be able to chat with the user as well.
 import Example from "@snippets/modules/agents/agent_types/conversational_agent.mdx"
 <Example/>
 import ChatExample from "@snippets/modules/agents/agent_types/chat_conversation_agent.mdx"
 ## Using a chat model
 <ChatExample/>

57

docs/docs_skeleton/docs/modules/agents/agent_types/index.mdx Normal file

@@ -0,0 +1,57 @@
 ---
 sidebar_position: 0
 ---
 # Agent types
 ## Action agents
 Agents use an LLM to determine which actions to take and in what order.
 An action can either be using a tool and observing its output, or returning a response to the user.
 Here are the agents available in LangChain.
 ### [Zero-shot ReAct](/docs/modules/agents/agent_types/react.html)
 This agent uses the [ReAct](https://arxiv.org/pdf/2210.03629.pdf) framework to determine which tool to use
 based solely on the tool's description. Any number of tools can be provided.
 This agent requires that a description is provided for each tool.
 **Note**: This is the most general purpose action agent.
 ### [Structured input ReAct](/docs/modules/agents/agent_types/structured_chat.html)
 The structured tool chat agent is capable of using multi-input tools.
 Older agents are configured to specify an action input as a single string, but this agent can use a tool's argument
 schema to create a structured action input. This is useful for more complex tool usage, like precisely
 navigating around a browser.
 ### [OpenAI Functions](/docs/modules/agents/agent_types/openai_functions_agent.html)
 Certain OpenAI models (like gpt-3.5-turbo-0613 and gpt-4-0613) have been explicitly fine-tuned to detect when a
 function should be called and respond with the inputs that should be passed to the function.
 The OpenAI Functions Agent is designed to work with these models.
 ### [Conversational](/docs/modules/agents/agent_types/chat_conversation_agent.html)
 This agent is designed to be used in conversational settings.
 The prompt is designed to make the agent helpful and conversational.
 It uses the ReAct framework to decide which tool to use, and uses memory to remember the previous conversation interactions.
 ### [Self ask with search](/docs/modules/agents/agent_types/self_ask_with_search.html)
 This agent utilizes a single tool that should be named `Intermediate Answer`.
 This tool should be able to look up factual answers to questions. This agent
 is equivalent to the original [self ask with search paper](https://ofir.io/self-ask.pdf),
 where a Google search API was provided as the tool.
 ### [ReAct document store](/docs/modules/agents/agent_types/react_docstore.html)
 This agent uses the ReAct framework to interact with a docstore. Two tools must
 be provided: a `Search` tool and a `Lookup` tool (they must be named exactly so).
 The `Search` tool should search for a document, while the `Lookup` tool should look up
 a term in the most recently found document.
 This agent is equivalent to the
 original [ReAct paper](https://arxiv.org/pdf/2210.03629.pdf), specifically the Wikipedia example.
 ## [Plan-and-execute agents](/docs/modules/agents/agent_types/plan_and_execute.html)
 Plan and execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).

11

docs/docs_skeleton/docs/modules/agents/agent_types/openai_functions_agent.mdx Normal file

@@ -0,0 +1,11 @@
 # OpenAI functions
 Certain OpenAI models (like gpt-3.5-turbo-0613 and gpt-4-0613) have been fine-tuned to detect when a function should be called and respond with the inputs that should be passed to the function.
 In an API call, you can describe functions and have the model intelligently choose to output a JSON object containing arguments to call those functions.
 The goal of the OpenAI Function APIs is to more reliably return valid and useful function calls than a generic text completion or chat API.
 The OpenAI Functions Agent is designed to work with these models.
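The sketch below illustrates the general shape of this exchange without calling any API: a function is described with a JSON Schema, the model replies with a function name plus a JSON string of arguments, and the application decodes and dispatches it. The `get_weather` function, the `dispatch` helper, and the hard-coded model output are all invented for this illustration.

```python
import json

# A function description in the JSON Schema style that function calling uses.
# The function name and fields are invented for illustration.
WEATHER_FUNCTION = {
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> str:
    """Stub tool; a real one would call a weather service."""
    return f"It is sunny in {city}."


def dispatch(function_call: dict) -> str:
    """Decode the model's JSON arguments and call the matching function."""
    registry = {"get_weather": get_weather}
    arguments = json.loads(function_call["arguments"])
    return registry[function_call["name"]](**arguments)


# Hard-coded stand-in for what a function-calling model might return.
model_output = {"name": "get_weather", "arguments": '{"city": "Paris"}'}
print(dispatch(model_output))
```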
 import Example from "@snippets/modules/agents/agent_types/openai_functions_agent.mdx";
 <Example/>

11

docs/docs_skeleton/docs/modules/agents/agent_types/plan_and_execute.mdx Normal file

@@ -0,0 +1,11 @@
 # Plan and execute
 Plan and execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).
 The planning is almost always done by an LLM.
 The execution is usually done by a separate agent (equipped with tools).
 import Example from "@snippets/modules/agents/agent_types/plan_and_execute.mdx"
 <Example/>

15

docs/docs_skeleton/docs/modules/agents/agent_types/react.mdx Normal file

@@ -0,0 +1,15 @@
 # ReAct
 This walkthrough showcases using an agent to implement the [ReAct](https://react-lm.github.io/) logic.
 import Example from "@snippets/modules/agents/agent_types/react.mdx"
 <Example/>
 ## Using chat models
 You can also create ReAct agents that use chat models instead of LLMs as the agent driver.
 import ChatExample from "@snippets/modules/agents/agent_types/react_chat.mdx"
 <ChatExample/>

10

docs/docs_skeleton/docs/modules/agents/agent_types/structured_chat.mdx Normal file

@@ -0,0 +1,10 @@
 # Structured tool chat
 The structured tool chat agent is capable of using multi-input tools.
 Older agents are configured to specify an action input as a single string, but this agent can use the provided tools' `args_schema` to populate the action input.
 import Example from "@snippets/modules/agents/agent_types/structured_chat.mdx"
 <Example/>

									
2

docs/docs_skeleton/docs/modules/agents/how_to/_category_.yml Normal file

@@ -0,0 +1,2 @@
 label: 'How-to'
 position: 1

14

docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_agent.mdx Normal file

@@ -0,0 +1,14 @@
 # Custom LLM Agent
 This notebook goes through how to create your own custom LLM agent.
 An LLM agent consists of four parts:
 - PromptTemplate: This is the prompt template that can be used to instruct the language model on what to do
 - LLM: This is the language model that powers the agent
 - `stop` sequence: Instructs the LLM to stop generating as soon as this string is found
 - OutputParser: This determines how to parse the LLMOutput into an AgentAction or AgentFinish object
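To illustrate the output parser's role, here is a minimal sketch in plain Python. The `Action` and `Finish` tuples are hypothetical stand-ins for `AgentAction` and `AgentFinish`, and the `Action:` / `Action Input:` / `Final Answer:` text format is an assumption about how the prompt asks the model to respond.

```python
import re
from typing import NamedTuple, Union


class Action(NamedTuple):
    """Hypothetical stand-in for AgentAction."""
    tool: str
    tool_input: str


class Finish(NamedTuple):
    """Hypothetical stand-in for AgentFinish."""
    output: str


def parse_llm_output(text: str) -> Union[Action, Finish]:
    """Turn raw model text into either a tool call or a final answer."""
    if "Final Answer:" in text:
        return Finish(text.split("Final Answer:")[-1].strip())
    match = re.search(r"Action:\s*(.*?)\nAction Input:\s*(.*)", text, re.DOTALL)
    if match is None:
        raise ValueError(f"Could not parse LLM output: {text!r}")
    return Action(match.group(1).strip(), match.group(2).strip())


print(parse_llm_output("Thought: I need data\nAction: Search\nAction Input: LangChain"))
print(parse_llm_output("Thought: done\nFinal Answer: 42"))
```

With a format like this, a `stop` sequence placed right after the action input would keep the model from inventing the tool's result itself.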
 import Example from "@snippets/modules/agents/how_to/custom_llm_agent.mdx"
 <Example/>

14

docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_chat_agent.mdx Normal file

@@ -0,0 +1,14 @@
 # Custom LLM Agent (with a ChatModel)
 This notebook goes through how to create your own custom agent based on a chat model.
 An LLM chat agent consists of four parts:
 - PromptTemplate: This is the prompt template that can be used to instruct the language model on what to do
 - ChatModel: This is the language model that powers the agent
 - `stop` sequence: Instructs the LLM to stop generating as soon as this string is found
 - OutputParser: This determines how to parse the LLMOutput into an AgentAction or AgentFinish object
 import Example from "@snippets/modules/agents/how_to/custom_llm_chat_agent.mdx"
 <Example/>

16

docs/docs_skeleton/docs/modules/agents/how_to/mrkl.mdx Normal file

@@ -0,0 +1,16 @@
 # Replicating MRKL
 This walkthrough demonstrates how to replicate the [MRKL](https://arxiv.org/pdf/2205.00445.pdf) system using agents.
 This uses the example Chinook database.
 To set it up follow the instructions on https://database.guide/2-sample-databases-sqlite/, placing the `.db` file in a notebooks folder at the root of this repository.
 import Example from "@snippets/modules/agents/how_to/mrkl.mdx"
 <Example/>
 ## With a chat model
 import ChatExample from "@snippets/modules/agents/how_to/mrkl_chat.mdx"
 <ChatExample/>

51

docs/docs_skeleton/docs/modules/agents/index.mdx Normal file

@@ -0,0 +1,51 @@
 ---
 sidebar_position: 4
 ---
 # Agents
 Some applications require a flexible chain of calls to LLMs and other tools based on user input. The **Agent** interface provides the flexibility for such applications. An agent has access to a suite of tools, and determines which ones to use depending on the user input. Agents can use multiple tools, and use the output of one tool as the input to the next.
 There are two main types of agents:
 - **Action agents**: at each timestep, decide on the next action using the outputs of all previous actions
 - **Plan-and-execute agents**: decide on the full sequence of actions up front, then execute them all without updating the plan
 Action agents are suitable for small tasks, while plan-and-execute agents are better for complex or long-running tasks that require maintaining long-term objectives and focus. Often the best approach is to combine the dynamism of an action agent with the planning abilities of a plan-and-execute agent by letting the plan-and-execute agent use action agents to execute plans.
 For a full list of agent types see [agent types](/docs/modules/agents/agent_types/). Additional abstractions involved in agents are:
 - [**Tools**](/docs/modules/agents/tools/): the actions an agent can take. Which tools you give an agent depends heavily on what you want the agent to do
 - [**Toolkits**](/docs/modules/agents/toolkits/): wrappers around collections of tools that can be used together for a specific use case. For example, in order for an agent to
   interact with a SQL database it will likely need one tool to execute queries and another to inspect tables
 ## Action agents
 At a high-level an action agent:
 1. Receives user input
 2. Decides which tool, if any, to use and the tool input
 3. Calls the tool and records the output (also known as an "observation")
 4. Decides the next step using the history of tools, tool inputs, and observations
 5. Repeats 3-4 until it determines it can respond directly to the user
 Action agents are wrapped in **agent executors**, which are responsible for calling the agent, getting back an action and action input, calling the tool that the action references with the generated input, getting the output of the tool, and then passing all that information back into the agent to get the next action it should take.
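The executor loop described above can be sketched in a few lines of plain Python. Everything here is a stand-in for illustration: `fake_agent` plays the role of the language model, `TOOLS` holds one toy tool, and `run_executor` is a hypothetical name rather than the LangChain agent executor.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tools: each maps a string input to a string observation.
TOOLS: Dict[str, Callable[[str], str]] = {
    "calculator": lambda expr: str(sum(int(n) for n in expr.split("+"))),
}


def fake_agent(user_input: str, steps: List[Tuple[str, str, str]]) -> Tuple[str, str]:
    """Stand-in for the LLM: returns (tool_name, tool_input) or ("final", answer)."""
    if not steps:
        return "calculator", user_input
    return "final", f"The answer is {steps[-1][2]}"


def run_executor(user_input: str, max_steps: int = 5) -> str:
    """Loop: ask the agent for an action, run the tool, record the observation."""
    steps: List[Tuple[str, str, str]] = []
    for _ in range(max_steps):
        action, action_input = fake_agent(user_input, steps)
        if action == "final":
            return action_input
        observation = TOOLS[action](action_input)
        steps.append((action, action_input, observation))
    return "Stopped: step limit reached"


print(run_executor("2+3"))
```

The step limit is a design choice in this sketch: it guards against an agent that never decides to give a final answer.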
 Although an agent can be constructed in many ways, it typically involves these components:
 - **Prompt template**: Responsible for taking the user input and previous steps and constructing a prompt
   to send to the language model
 - **Language model**: Takes the prompt with user input and action history and decides what to do next
 - **Output parser**: Takes the output of the language model and parses it into the next action or a final answer
 ## Plan-and-execute agents
 At a high-level a plan-and-execute agent:
 1. Receives user input
 2. Plans the full sequence of steps to take
 3. Executes the steps in order, passing the outputs of past steps as inputs to future steps
 The most typical implementation is to have the planner be a language model, and the executor be an action agent. Read more [here](/docs/modules/agents/agent_types/plan_and_execute.html).
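As a minimal sketch of this flow in plain Python, the planner below returns all steps up front and the executor then runs them in order. `fake_planner` and `fake_executor` are hypothetical stand-ins for the language model planner and the action agent executor.

```python
from typing import List


def fake_planner(objective: str) -> List[str]:
    """Stand-in for the LLM planner: returns the full list of steps up front."""
    return [f"research {objective}", "summarize findings"]


def fake_executor(step: str, previous_outputs: List[str]) -> str:
    """Stand-in for an action agent that carries out one step."""
    return f"done: {step} (saw {len(previous_outputs)} earlier outputs)"


def plan_and_execute(objective: str) -> List[str]:
    """Plan once, then run each step in order, feeding earlier outputs forward."""
    outputs: List[str] = []
    for step in fake_planner(objective):
        outputs.append(fake_executor(step, outputs))
    return outputs


for line in plan_and_execute("solar panels"):
    print(line)
```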
 ## Get started
 import GetStarted from "@snippets/modules/agents/get_started.mdx"
 <GetStarted/>

10

docs/docs_skeleton/docs/modules/agents/toolkits/index.mdx Normal file

@@ -0,0 +1,10 @@
 ---
 sidebar_position: 3
 ---
 # Toolkits
 Toolkits are collections of tools that are designed to be used together for specific tasks and have convenience loading methods.
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

									
2

docs/docs_skeleton/docs/modules/agents/tools/how_to/_category_.yml Normal file

@@ -0,0 +1,2 @@
 label: 'How-to'
 position: 0

17

docs/docs_skeleton/docs/modules/agents/tools/index.mdx Normal file

@@ -0,0 +1,17 @@
 ---
 sidebar_position: 2
 ---
 # Tools
 Tools are interfaces that an agent can use to interact with the world.
 ## Get started
 Tools are functions that agents can use to interact with the world.
 These tools can be generic utilities (e.g. search), other chains, or even other agents.
 Currently, tools can be loaded with the following snippet:
 import GetStarted from "@snippets/modules/agents/tools/get_started.mdx"
 <GetStarted/>

									
1

docs/docs_skeleton/docs/modules/agents/tools/integrations/_category_.yml Normal file

@@ -0,0 +1 @@
 label: 'Integrations'

									
2

docs/docs_skeleton/docs/modules/callbacks/how_to/_category_.yml Normal file

@@ -0,0 +1,2 @@
 label: 'How-to'
 position: 0

10

docs/docs_skeleton/docs/modules/callbacks/index.mdx Normal file

@@ -0,0 +1,10 @@
 ---
 sidebar_position: 5
 ---
 # Callbacks
 LangChain provides a callbacks system that allows you to hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.
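The general pattern can be sketched in plain Python as follows. The handler below is loosely modeled on the `on_<event>` naming idea; its method signatures are simplified assumptions for this illustration, and `fake_llm` is a toy stand-in for a model call.

```python
from typing import List


class LoggingHandler:
    """Hypothetical handler; method names only mirror the on_<event> idea."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def on_llm_start(self, prompt: str) -> None:
        self.events.append(f"start: {prompt}")

    def on_llm_end(self, output: str) -> None:
        self.events.append(f"end: {output}")


def fake_llm(prompt: str, callbacks: List[LoggingHandler]) -> str:
    """Stand-in for a model call that notifies each handler at each stage."""
    for handler in callbacks:
        handler.on_llm_start(prompt)
    output = prompt.upper()
    for handler in callbacks:
        handler.on_llm_end(output)
    return output


handler = LoggingHandler()
fake_llm("hello", callbacks=[handler])
print(handler.events)
```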
 import GetStarted from "@snippets/modules/callbacks/get_started.mdx"
 <GetStarted/>

									
1

docs/docs_skeleton/docs/modules/callbacks/integrations/_category_.yml Normal file

@@ -0,0 +1 @@
 label: 'Integrations'

216

docs/docs_skeleton/docs/modules/callbacks/integrations/promptlayer.ipynb Normal file

@@ -0,0 +1,216 @@
 {
  "cells": [
   {
    "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "# PromptLayer\n",
     "\n",
     "<img src=\"https://promptlayer.com/logo.png\"  height=\"300\">\n"
    ]
   },
   {
    "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "[PromptLayer](https://promptlayer.com) is an observability platform for prompts and LLMs. In this guide we will go over how to set up the `PromptLayerCallbackHandler`. While PromptLayer does have LLMs that integrate directly with LangChain (e.g. [`PromptLayerOpenAI`](https://python.langchain.com/docs/modules/model_io/models/llms/integrations/promptlayer_openai)), this callback is an easier and more feature-rich way to integrate PromptLayer with any model on LangChain. \n",
     "\n",
     "This callback is also the recommended way to connect with PromptLayer when building Chains and Agents on LangChain."
    ]
   },
   {
    "attachments": {},
    "cell_type": "markdown",
    "metadata": {
     "tags": []
    },
    "source": [
     "## Installation and Setup"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
     "!pip install promptlayer --upgrade"
    ]
   },
   {
    "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "### Getting API Credentials\n",
     "\n",
     "If you have not already, create an account on [PromptLayer](https://www.promptlayer.com) and get an API key by clicking on the settings cog in the navbar.\n",
     "Set it as an environment variable called `PROMPTLAYER_API_KEY`.\n"
    ]
   },
   {
    "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "### Usage\n",
     "\n",
     "Getting started with `PromptLayerCallbackHandler` is fairly simple. It takes two optional arguments:\n",
     "1. `pl_tags` - an optional list of strings that will be tracked as tags on PromptLayer\n",
     "2. `pl_id_callback` - an optional function that will get a `promptlayer_request_id` as an argument. This id can be used with all of PromptLayer's tracking features to track metadata, scores, and prompt usage."
    ]
   },
   {
    "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "### Simple Example\n",
     "\n",
     "In this simple example we use `PromptLayerCallbackHandler` with `ChatOpenAI`. We add a PromptLayer tag named `chatopenai`"
    ]
   },
   {
    "cell_type": "code",
    "execution_count": 3,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
       "content=\"Sure, here's one:\\n\\nWhy did the tomato turn red?\\n\\nBecause it saw the salad dressing!\" additional_kwargs={} example=False\n"
      ]
     }
    ],
    "source": [
     "from langchain.chat_models import ChatOpenAI\n",
     "from langchain.schema import (\n",
     "    HumanMessage,\n",
     ")\n",
     "from langchain.callbacks import PromptLayerCallbackHandler\n",
     "\n",
     "chat_llm = ChatOpenAI(\n",
     "    temperature=0,\n",
     "    callbacks=[PromptLayerCallbackHandler(pl_tags=[\"chatopenai\"])],\n",
     ")\n",
     "llm_results = chat_llm(\n",
     "    [\n",
     "        HumanMessage(content=\"What comes after 1,2,3 ?\"),\n",
     "        HumanMessage(content=\"Tell me another joke?\"),\n",
     "    ]\n",
     ")\n",
     "print(llm_results)\n"
    ]
   },
   {
    "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "### Full Featured Example\n",
     "\n",
     "In this example we unlock more of the power of PromptLayer.\n",
     "\n",
     "We are using the Prompt Registry and fetching the prompt called `example`.\n",
     "\n",
     "We also define a `pl_id_callback` function that tracks a score, metadata, and the prompt used. Read more about tracking in [our docs](https://docs.promptlayer.com)."
    ]
   },
   {
    "cell_type": "code",
    "execution_count": 4,
    "metadata": {},
    "outputs": [
     {
      "name": "stdout",
      "output_type": "stream",
      "text": [
       "prompt layer id  6050929\n"
      ]
     },
     {
      "data": {
       "text/plain": [
        "'\\nToasterCo.'"
       ]
      },
      "execution_count": 4,
      "metadata": {},
      "output_type": "execute_result"
     }
    ],
    "source": [
     "from langchain.llms import OpenAI\n",
     "from langchain.callbacks import PromptLayerCallbackHandler\n",
     "import promptlayer\n",
     "\n",
     "def pl_id_callback(promptlayer_request_id):\n",
     "    print(\"prompt layer id \", promptlayer_request_id)\n",
     "    promptlayer.track.score(\n",
     "        request_id=promptlayer_request_id, score=100\n",
     "    )  # score is an integer 0-100\n",
     "    promptlayer.track.metadata(\n",
     "        request_id=promptlayer_request_id, metadata={\"foo\": \"bar\"}\n",
     "    )  # metadata is a dictionary of key value pairs that is tracked on PromptLayer\n",
     "    promptlayer.track.prompt(\n",
     "        request_id=promptlayer_request_id,\n",
     "        prompt_name=\"example\",\n",
     "        prompt_input_variables={\"product\": \"toasters\"},\n",
     "        version=1,\n",
     "    )\n",
     "\n",
     "\n",
     "openai_llm = OpenAI(\n",
     "    model_name=\"text-davinci-002\",\n",
     "    callbacks=[PromptLayerCallbackHandler(pl_id_callback=pl_id_callback)],\n",
     ")\n",
     "\n",
     "example_prompt = promptlayer.prompts.get(\"example\", version=1, langchain=True)\n",
     "openai_llm(example_prompt.format(product=\"toasters\"))"
    ]
   },
   {
    "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "That is all it takes! After setup, all your requests will show up on the PromptLayer dashboard.\n",
     "This callback also works with any LLM implemented on LangChain."
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": []
   }
  ],
  "metadata": {
   "kernelspec": {
    "display_name": "base",
    "language": "python",
    "name": "python3"
   },
   "language_info": {
    "codemirror_mode": {
     "name": "ipython",
     "version": 3
    },
    "file_extension": ".py",
    "mimetype": "text/x-python",
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
    "version": "3.8.8"
   },
   "vscode": {
    "interpreter": {
     "hash": "c4fe2cd85a8d9e8baaec5340ce66faff1c77581a9f43e6c45e85e09b6fced008"
    }
   }
  },
  "nbformat": 4,
  "nbformat_minor": 4
 }

7

docs/docs_skeleton/docs/modules/chains/additional/analyze_document.mdx Normal file

@@ -0,0 +1,7 @@
 # Analyze Document
 The AnalyzeDocumentChain can be used as an end-to-end chain. This chain takes in a single document, splits it up, and then runs it through a CombineDocumentsChain.
 import Example from "@snippets/modules/chains/additional/analyze_document.mdx"
 <Example/>

7

docs/docs_skeleton/docs/modules/chains/additional/constitutional_chain.mdx Normal file

@@ -0,0 +1,7 @@
 # Self-critique chain with constitutional AI
 The ConstitutionalChain is a chain that ensures the output of a language model adheres to a predefined set of constitutional principles. By incorporating specific rules and guidelines, the ConstitutionalChain filters and modifies the generated content to align with these principles, thus providing more controlled, ethical, and contextually appropriate responses. This mechanism helps maintain the integrity of the output while minimizing the risk of generating content that may violate guidelines, be offensive, or deviate from the desired context.
 import Example from "@snippets/modules/chains/additional/constitutional_chain.mdx"
 <Example/>

8

docs/docs_skeleton/docs/modules/chains/additional/index.mdx Normal file

@@ -0,0 +1,8 @@
 ---
 sidebar_position: 4
 ---
 # Additional
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

8

docs/docs_skeleton/docs/modules/chains/additional/moderation.mdx Normal file

@@ -0,0 +1,8 @@
 # Moderation
 This notebook walks through examples of how to use a moderation chain and several common ways of doing so. Moderation chains are useful for detecting text that could be hateful, violent, etc. They can be applied both to user input and to the output of a language model. Some API providers, like OpenAI, [specifically prohibit](https://beta.openai.com/docs/usage-policies/use-case-policy) you, or your end users, from generating some types of harmful content. To comply with this (and more generally to prevent your application from being harmful), you may often want to append a moderation chain to any LLMChains to make sure that any output the LLM generates is not harmful.
 If the content passed into the moderation chain is harmful, there is no single best way to handle it; the right choice probably depends on your application. Sometimes you may want to throw an error in the chain (and have your application handle it). Other times, you may want to return something to the user explaining that the text was harmful. There could even be other ways to handle it! We will cover all these ways in this walkthrough.
 import Example from "@snippets/modules/chains/additional/moderation.mdx"
 <Example/>

7

docs/docs_skeleton/docs/modules/chains/additional/multi_prompt_router.mdx Normal file

@@ -0,0 +1,7 @@
 # Dynamically selecting from multiple prompts
 This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects the prompt to use for a given input. Specifically we show how to use the `MultiPromptChain` to create a question-answering chain that selects the prompt which is most relevant for a given question, and then answers the question using that prompt.
 import Example from "@snippets/modules/chains/additional/multi_prompt_router.mdx"
 <Example/>

7

docs/docs_skeleton/docs/modules/chains/additional/multi_retrieval_qa_router.mdx Normal file

@@ -0,0 +1,7 @@
 # Dynamically selecting from multiple retrievers
 This notebook demonstrates how to use the `RouterChain` paradigm to create a chain that dynamically selects which Retrieval system to use. Specifically we show how to use the `MultiRetrievalQAChain` to create a question-answering chain that selects the retrieval QA chain which is most relevant for a given question, and then answers the question using it.
 import Example from "@snippets/modules/chains/additional/multi_retrieval_qa_router.mdx"
 <Example/>

13

docs/docs_skeleton/docs/modules/chains/additional/question_answering.mdx Normal file

@@ -0,0 +1,13 @@
 # Document QA
 Here we walk through how to use LangChain for question answering over a list of documents. Under the hood we'll be using our [Document chains](/docs/modules/chains/document/).
 import Example from "@snippets/modules/chains/additional/question_answering.mdx"
 <Example/>
 ## Document QA with sources
 import ExampleWithSources from "@snippets/modules/chains/additional/qa_with_sources.mdx"
 <ExampleWithSources/>

16

docs/docs_skeleton/docs/modules/chains/document/index.mdx Normal file

@@ -0,0 +1,16 @@
 ---
 sidebar_position: 2
 ---
 # Documents
 These are the core chains for working with Documents. They are useful for summarizing documents, answering questions over documents, extracting information from documents, and more.
 These chains all implement a common interface:
 import Interface from "@snippets/modules/chains/document/combine_docs.mdx"
 <Interface/>
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

5

docs/docs_skeleton/docs/modules/chains/document/map_reduce.mdx Normal file

@@ -0,0 +1,5 @@
 # Map reduce
 The map reduce documents chain first applies an LLM chain to each document individually (the Map step), treating the chain output as a new document. It then passes all the new documents to a separate combine documents chain to get a single output (the Reduce step). It can optionally first compress, or collapse, the mapped documents to make sure that they fit in the combine documents chain (which will often pass them to an LLM). This compression step is performed recursively if necessary.
 ![map_reduce_diagram](/img/map_reduce.jpg)
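
The map, collapse, and reduce steps described above can be sketched in plain Python. This is a minimal illustration and not LangChain's implementation: `map_reduce`, `truncating_llm`, the prompt strings, and the `max_chars` limit are all hypothetical, and the collapse step is written as a loop that merges outputs pairwise rather than as true recursion.

```python
from typing import Callable, List

LLM = Callable[[str], str]


def map_reduce(docs: List[str], llm: LLM, max_chars: int = 200) -> str:
    """Map each document through the LLM, then reduce the results to one output."""
    # Map step: apply the LLM to each document on its own.
    mapped = [llm(f"Summarize: {doc}") for doc in docs]

    # Optional collapse step: while the mapped outputs are too long to combine
    # in one call, merge them pairwise and check again.
    while len(mapped) > 1 and sum(len(m) for m in mapped) > max_chars:
        pairs = [mapped[i:i + 2] for i in range(0, len(mapped), 2)]
        mapped = [llm("Combine: " + " ".join(p)) if len(p) > 1 else p[0] for p in pairs]

    # Reduce step: one final call over everything that is left.
    return llm("Combine: " + " ".join(mapped))


def truncating_llm(prompt: str) -> str:
    # Hypothetical stand-in for a model: echoes the first 10 characters of the task text.
    return prompt.split(": ", 1)[1][:10]
```

A character budget stands in for a token limit here; a real chain would measure prompt length in tokens.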

5

docs/docs_skeleton/docs/modules/chains/document/map_rerank.mdx Normal file

@@ -0,0 +1,5 @@
 # Map re-rank
 The map re-rank documents chain runs an initial prompt on each document. The prompt not only tries to complete a task but also asks for a score indicating how certain the model is of its answer. The highest-scoring response is returned.
 ![map_rerank_diagram](/img/map_rerank.jpg)
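
As a rough sketch of the idea, assuming the per-document call returns an `(answer, score)` pair: the names `map_rerank` and `keyword_scorer` are hypothetical, and the scorer is a toy stand-in for an LLM that rates its own confidence.

```python
from typing import Callable, List, Tuple

ScoredLLM = Callable[[str, str], Tuple[str, float]]


def map_rerank(docs: List[str], question: str, scored_llm: ScoredLLM) -> str:
    """Answer the question against each document and keep the best-scored answer."""
    # Each call yields an (answer, score) pair for one document.
    results = [scored_llm(doc, question) for doc in docs]
    # Return the answer with the highest score (raises ValueError if docs is empty).
    best_answer, _best_score = max(results, key=lambda r: r[1])
    return best_answer


def keyword_scorer(doc: str, question: str) -> Tuple[str, float]:
    # Hypothetical stand-in for a model: the "answer" is the document itself and
    # the score counts how many question words appear in it.
    words = question.lower().split()
    score = sum(word in doc.lower() for word in words)
    return doc, float(score)
```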

12

docs/docs_skeleton/docs/modules/chains/document/refine.mdx Normal file

@@ -0,0 +1,12 @@
 ---
 sidebar_position: 1
 ---
 # Refine
 The refine documents chain constructs a response by looping over the input documents and iteratively updating its answer. For each document, it passes all non-document inputs, the current document, and the latest intermediate answer to an LLM chain to get a new answer.
 Since the Refine chain only passes a single document to the LLM at a time, it is well-suited for tasks that require analyzing more documents than can fit in the model's context.
 The obvious tradeoff is that this chain will make far more LLM calls than, for example, the Stuff documents chain.
 There are also certain tasks which are difficult to accomplish iteratively. For example, the Refine chain can perform poorly when documents frequently cross-reference one another or when a task requires detailed information from many documents.
 ![refine_diagram](/img/refine.jpg)
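
The loop described above can be sketched as follows. This is an illustration only: `refine` and the prompt wording are hypothetical, and `llm` is any callable that maps a prompt string to a response string.

```python
from typing import Callable, List


def refine(docs: List[str], question: str, llm: Callable[[str], str]) -> str:
    """Build an answer by updating it once per document."""
    answer = ""
    for doc in docs:
        # Each call sees the non-document input (the question), the current
        # document, and the latest intermediate answer.
        prompt = (
            f"Question: {question}\n"
            f"Existing answer: {answer}\n"
            f"New context: {doc}\n"
            "Refined answer:"
        )
        answer = llm(prompt)
    return answer
```

Note that the number of LLM calls equals the number of documents, which is the tradeoff mentioned above.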

12

docs/docs_skeleton/docs/modules/chains/document/stuff.mdx Normal file

@@ -0,0 +1,12 @@
 ---
 sidebar_position: 0
 ---
 # Stuff
 The stuff documents chain ("stuff" as in "to stuff" or "to fill") is the most straightforward of the document chains. It takes a list of documents, inserts them all into a prompt and passes that prompt to an LLM.
 This chain is well-suited for applications where documents are small and only a few are passed in for most calls.
 ![stuff_diagram](/img/stuff.jpg)
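
A minimal sketch of the same idea, assuming documents are plain strings: `stuff_documents` and the prompt layout are hypothetical and are not LangChain's API.

```python
from typing import Callable, List


def stuff_documents(docs: List[str], question: str, llm: Callable[[str], str]) -> str:
    """Insert every document into one prompt and make a single LLM call."""
    context = "\n\n".join(docs)
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)
```

Because everything goes into one prompt, the combined documents must fit in the model's context window.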

8

docs/docs_skeleton/docs/modules/chains/foundational/index.mdx Normal file

@@ -0,0 +1,8 @@
 ---
 sidebar_position: 1
 ---
 # Foundational
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

11

docs/docs_skeleton/docs/modules/chains/foundational/llm_chain.mdx Normal file

@@ -0,0 +1,11 @@
 # LLM
 An LLMChain is a simple chain that adds some functionality around language models. It is used widely throughout LangChain, including in other chains and agents.
 An LLMChain consists of a PromptTemplate and a language model (either an LLM or chat model). It formats the prompt template using the input key values provided (and also memory key values, if available), passes the formatted string to the LLM, and returns the LLM output.
 ## Get started
 import Example from "@snippets/modules/chains/foundational/llm_chain.mdx"
 <Example/>
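
The format-then-call flow described above can be approximated without LangChain. In this sketch the template is an ordinary Python format string, and `run_llm_chain` and its parameters are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Optional


def run_llm_chain(
    template: str,
    inputs: Dict[str, str],
    llm: Callable[[str], str],
    memory: Optional[Dict[str, str]] = None,
) -> str:
    """Format the template with input (and memory) values, then call the LLM."""
    # Memory values are merged first so that explicit inputs take precedence.
    values = {**(memory or {}), **inputs}
    prompt = template.format(**values)
    return llm(prompt)
```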

14

docs/docs_skeleton/docs/modules/chains/foundational/sequential_chains.mdx Normal file

@@ -0,0 +1,14 @@
 # Sequential
 <!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! Instead, edit the notebook w/the location & name as this file. -->
 The next step after calling a language model is to make a series of calls to a language model. This is particularly useful when you want to take the output from one call and use it as the input to another.
 In this notebook we will walk through some examples of how to do this, using sequential chains. Sequential chains allow you to connect multiple chains and compose them into pipelines that execute some specific scenario. There are two types of sequential chains:
 - `SimpleSequentialChain`: The simplest form of sequential chains, where each step has a singular input/output, and the output of one step is the input to the next.
 - `SequentialChain`: A more general form of sequential chains, allowing for multiple inputs/outputs.
 import Example from "@snippets/modules/chains/foundational/sequential_chains.mdx"
 <Example/>
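
The difference between the two types can be sketched with plain functions. This is an illustration of the data flow only; `simple_sequential` and `sequential` are hypothetical helpers, not the LangChain classes.

```python
from typing import Any, Callable, Dict, List, Tuple


def simple_sequential(steps: List[Callable[[str], str]], text: str) -> str:
    """Single input/output: the output of each step is the input to the next."""
    for step in steps:
        text = step(text)
    return text


def sequential(
    steps: List[Tuple[Callable[[Dict[str, Any]], Any], str]],
    inputs: Dict[str, Any],
) -> Dict[str, Any]:
    """Multiple inputs/outputs: each step reads named values and writes one new key."""
    values = dict(inputs)
    for fn, output_key in steps:
        values[output_key] = fn(values)
    return values
```

In the second form, later steps can read any earlier output by name, not just the most recent one.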

8

docs/docs_skeleton/docs/modules/chains/how_to/debugging.mdx Normal file

@@ -0,0 +1,8 @@
 # Debugging chains
 It can be hard to debug a `Chain` object solely from its output as most `Chain` objects involve a fair amount of input prompt preprocessing and LLM output post-processing.
 import Example from "@snippets/modules/chains/how_to/debugging.mdx"
 <Example/>

8

docs/docs_skeleton/docs/modules/chains/how_to/index.mdx Normal file

@@ -0,0 +1,8 @@
 ---
 sidebar_position: 0
 ---
 # How to
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

10

docs/docs_skeleton/docs/modules/chains/how_to/memory.mdx Normal file

@@ -0,0 +1,10 @@
 # Adding memory (state)
 Chains can be initialized with a Memory object, which will persist data across calls to the chain. This makes a Chain stateful.
 ## Get started
 import GetStarted from "@snippets/modules/chains/how_to/memory.mdx"
 <GetStarted/>
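
To make "stateful" concrete, here is a minimal sketch of a chain that keeps conversation history between calls. `StatefulChain` and its `Human:`/`AI:` prompt format are hypothetical; LangChain's Memory classes have their own interface.

```python
from typing import Callable, List


class StatefulChain:
    """Keeps a transcript across calls and prepends it to every prompt."""

    def __init__(self, llm: Callable[[str], str]) -> None:
        self.llm = llm
        self.history: List[str] = []

    def __call__(self, user_input: str) -> str:
        # Build the prompt from everything said so far plus the new input.
        prompt = "\n".join(self.history + [f"Human: {user_input}", "AI:"])
        reply = self.llm(prompt)
        # Persist this turn so the next call can see it.
        self.history += [f"Human: {user_input}", f"AI: {reply}"]
        return reply
```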

33

docs/docs_skeleton/docs/modules/chains/index.mdx Normal file

@@ -0,0 +1,33 @@
 ---
 sidebar_position: 2
 ---
 # Chains
 Using an LLM in isolation is fine for simple applications,
 but more complex applications require chaining LLMs - either with each other or with other components.
 LangChain provides the **Chain** interface for such "chained" applications. We define a Chain very generically as a sequence of calls to components, which can include other chains. The base interface is simple:
 import BaseClass from "@snippets/modules/chains/base_class.mdx"
 <BaseClass/>
 This idea of composing components together in a chain is simple but powerful. It drastically simplifies and makes more modular the implementation of complex applications, which in turn makes it much easier to debug, maintain, and improve your applications.
 For more specifics check out:
 - [How-to](/docs/modules/chains/how_to/) for walkthroughs of different chain features
 - [Foundational](/docs/modules/chains/foundational/) to get acquainted with core building block chains
 - [Document](/docs/modules/chains/document/) to learn how to incorporate documents into chains
 - [Popular](/docs/modules/chains/popular/) chains for the most common use cases
 - [Additional](/docs/modules/chains/additional/) to see some of the more advanced chains and integrations that you can use out of the box
 ## Why do we need chains?
 Chains allow us to combine multiple components together to create a single, coherent application. For example, we can create a chain that takes user input, formats it with a PromptTemplate, and then passes the formatted response to an LLM. We can build more complex chains by combining multiple chains together, or by combining chains with other components.
 ## Get started
 import GetStarted from "@snippets/modules/chains/get_started.mdx"
 <GetStarted/>

9

docs/docs_skeleton/docs/modules/chains/popular/api.mdx Normal file

@@ -0,0 +1,9 @@
 ---
 sidebar_position: 0
 ---
 # API chains
 APIChain enables using LLMs to interact with APIs to retrieve relevant information. Construct the chain by providing a question relevant to the provided API documentation.
 import Example from "@snippets/modules/chains/popular/api.mdx"
 <Example/>

14

docs/docs_skeleton/docs/modules/chains/popular/chat_vector_db.mdx Normal file

@@ -0,0 +1,14 @@
 ---
 sidebar_position: 2
 ---
 # Conversational Retrieval QA
 The ConversationalRetrievalQA chain builds on RetrievalQAChain to provide a chat history component.
 It first combines the chat history (either explicitly passed in or retrieved from the provided memory) and the question into a standalone question, then looks up relevant documents from the retriever, and finally passes those documents and the question to a question answering chain to return a response.
 To create one, you will need a retriever. In the below example, we will create one from a vector store, which can be created from embeddings.
 import Example from "@snippets/modules/chains/popular/chat_vector_db.mdx"
 <Example/>
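
The three steps described above (condense, retrieve, answer) can be sketched as follows. All names and prompt strings are hypothetical, `retrieve` is any callable returning a list of document strings, and skipping the condense step when there is no history is an assumption made for this sketch.

```python
from typing import Callable, List, Tuple


def conversational_retrieval_qa(
    question: str,
    chat_history: List[Tuple[str, str]],
    llm: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
) -> str:
    """Condense history and question, retrieve documents, then answer."""
    if chat_history:
        # Step 1: rewrite the follow-up as a standalone question.
        history = "\n".join(f"Human: {q}\nAI: {a}" for q, a in chat_history)
        standalone = llm(
            f"Chat history:\n{history}\nFollow-up: {question}\nStandalone question:"
        )
    else:
        # Assumption: with no history, the question is already standalone.
        standalone = question

    # Step 2: look up relevant documents for the standalone question.
    docs = retrieve(standalone)

    # Step 3: answer the question using the retrieved documents.
    context = "\n\n".join(docs)
    return llm(f"Context:\n{context}\n\nQuestion: {standalone}\nAnswer:")
```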

Compare commits

681 Commits v0.0.178 ... vwp/fix_pr

37 .devcontainer/README.md Normal file

45 .devcontainer/devcontainer.json

7 .devcontainer/docker-compose.yaml

3 .gitattributes vendored Normal file

43 .github/CONTRIBUTING.md vendored

2 .github/ISSUE_TEMPLATE/bug-report.yml vendored

60 .github/PULL_REQUEST_TEMPLATE.md vendored

38 .github/workflows/linkcheck.yml vendored

17 .gitignore vendored

4 .gitmodules vendored Normal file

7 .readthedocs.yaml

3 Makefile

14 README.md

11 .devcontainer/Dockerfile → dev.Dockerfile

12 docs/.local_build.sh Executable file

57 docs/additional_resources/tracing.md

0 docs/Makefile → docs/api_reference/Makefile

0 docs/_static/css/custom.css → docs/api_reference/_static/css/custom.css

1860 docs/api_reference/api_reference.rst Normal file

58 docs/conf.py → docs/api_reference/conf.py

94 docs/api_reference/create_api_rst.py Normal file

8 docs/api_reference/index.rst Normal file

0 docs/make.bat → docs/api_reference/make.bat

27 docs/api_reference/templates/COPYRIGHT.txt Normal file

28 docs/api_reference/templates/class.rst Normal file

15 docs/api_reference/templates/redirects.html Normal file

27 docs/api_reference/themes/COPYRIGHT.txt Normal file

67 docs/api_reference/themes/scikit-learn-modern/javascript.html Normal file

142 docs/api_reference/themes/scikit-learn-modern/layout.html Normal file

85 docs/api_reference/themes/scikit-learn-modern/nav.html Normal file

16 docs/api_reference/themes/scikit-learn-modern/search.html Normal file

1395 docs/api_reference/themes/scikit-learn-modern/static/css/theme.css Normal file

6 docs/api_reference/themes/scikit-learn-modern/static/css/vendor/bootstrap.min.css vendored Normal file

6 docs/api_reference/themes/scikit-learn-modern/static/js/vendor/bootstrap.min.js vendored Normal file

2 docs/api_reference/themes/scikit-learn-modern/static/js/vendor/jquery-3.6.3.slim.min.js vendored Normal file

8 docs/api_reference/themes/scikit-learn-modern/theme.conf Normal file

7 docs/docs_skeleton/.gitignore vendored Normal file

49 docs/docs_skeleton/README.md Normal file

12 docs/docs_skeleton/babel.config.js Normal file

76 docs/docs_skeleton/code-block-loader.js Normal file

0 docs/_static/ApifyActors.png → docs/docs_skeleton/docs/_static/ApifyActors.png vendored

0 docs/_static/DataberryDashboard.png → docs/docs_skeleton/docs/_static/DataberryDashboard.png vendored

0 docs/_static/HeliconeDashboard.png → docs/docs_skeleton/docs/_static/HeliconeDashboard.png vendored

0 docs/_static/HeliconeKeys.png → docs/docs_skeleton/docs/_static/HeliconeKeys.png vendored

0 docs/_static/MetalDash.png → docs/docs_skeleton/docs/_static/MetalDash.png vendored

BIN docs/docs_skeleton/docs/_static/android-chrome-192x192.png vendored Normal file

BIN docs/docs_skeleton/docs/_static/android-chrome-512x512.png vendored Normal file

BIN docs/docs_skeleton/docs/_static/apple-touch-icon.png vendored Normal file

21 docs/docs_skeleton/docs/_static/css/custom.css vendored Normal file

BIN docs/docs_skeleton/docs/_static/favicon-16x16.png vendored Normal file

BIN docs/docs_skeleton/docs/_static/favicon-32x32.png vendored Normal file

BIN docs/docs_skeleton/docs/_static/favicon.ico vendored Normal file

0 docs/_static/js/mendablesearch.js → docs/docs_skeleton/docs/_static/js/mendablesearch.js vendored

BIN docs/docs_skeleton/docs/_static/lc_modules.jpg vendored Normal file

BIN docs/docs_skeleton/docs/_static/parrot-chainlink-icon.png vendored Normal file

BIN docs/docs_skeleton/docs/_static/parrot-icon.png vendored Normal file

8 docs/docs_skeleton/docs/ecosystem/integrations/index.mdx Normal file

5 docs/docs_skeleton/docs/get_started/installation.mdx Normal file

65 docs/docs_skeleton/docs/get_started/introduction.mdx Normal file

158 docs/docs_skeleton/docs/get_started/quickstart.mdx Normal file

13 docs/docs_skeleton/docs/modules/agents/agent_types/chat_conversation_agent.mdx Normal file

57 docs/docs_skeleton/docs/modules/agents/agent_types/index.mdx Normal file

11 docs/docs_skeleton/docs/modules/agents/agent_types/openai_functions_agent.mdx Normal file

11 docs/docs_skeleton/docs/modules/agents/agent_types/plan_and_execute.mdx Normal file

15 docs/docs_skeleton/docs/modules/agents/agent_types/react.mdx Normal file

10 docs/docs_skeleton/docs/modules/agents/agent_types/structured_chat.mdx Normal file

2 docs/docs_skeleton/docs/modules/agents/how_to/_category_.yml Normal file

14 docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_agent.mdx Normal file

14 docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_chat_agent.mdx Normal file

16 docs/docs_skeleton/docs/modules/agents/how_to/mrkl.mdx Normal file

51 docs/docs_skeleton/docs/modules/agents/index.mdx Normal file

10 docs/docs_skeleton/docs/modules/agents/toolkits/index.mdx Normal file

2 docs/docs_skeleton/docs/modules/agents/tools/how_to/_category_.yml Normal file

17 docs/docs_skeleton/docs/modules/agents/tools/index.mdx Normal file

1 docs/docs_skeleton/docs/modules/agents/tools/integrations/_category_.yml Normal file

2 docs/docs_skeleton/docs/modules/callbacks/how_to/_category_.yml Normal file

10 docs/docs_skeleton/docs/modules/callbacks/index.mdx Normal file

1 docs/docs_skeleton/docs/modules/callbacks/integrations/_category_.yml Normal file

216

docs/docs_skeleton/docs/modules/callbacks/integrations/promptlayer.ipynb Normal file
