langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-02-04 08:10:25 +00:00

Author	SHA1	Message	Date
Eugene Yurtsev	e2f037529c	test	2023-07-25 09:51:19 -04:00
Leonid Ganeline	afc55a4fee	Refactored `requests` (#8203 ) Refactored `requests.py`. The same as https://github.com/langchain-ai/langchain/pull/7961 #8098 #8099 requests.py is in the root code folder. This creates the `langchain.requests: Requests` group on the API Reference navigation ToC, on the same level as Chains and Agents which is incorrect. Refactoring: - copied requests.py content into utils/requests.py - I added the backwards compatibility ref in the original requests.py. - updated imports to requests objects @hwchase17, @baskaryan	2023-07-24 21:23:59 -07:00
William FH	0a16b3d84b	Update Integrations links (#8206 )	2023-07-24 21:20:32 -07:00
Alex Stachowiak	a7efa95775	Update base chain type hints (#7680 ) Addresses #7578. `run()` can return dictionaries, Pydantic objects or strings, so the type hints should reflect that. See the chain from `create_structured_output_chain` for an example of a non-string return type from `run()`. I've updated the BaseLLMChain return type hint from `str` to `Any`. Although, the differences between `run()` and `__call__()` seem less clear now. CC: @baskaryan Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 21:16:41 -07:00
Ani peter benjamin	e58b1d7073	feat: temp fixed Could not parse LLM output on agents folder (#7746 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 19:20:37 -07:00
Dayuan Jiang	125ae6d9de	add Hybrid retriever that not require any external service (#8108 ) - Until now, hybrid search was limited to modules requiring external services, such as Weaviate/Pinecone Hybrid Search. However, I have developed a hybrid retriever that can merge a list of retrievers using the [Reciprocal Rank Fusion](https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf) algorithm. This new approach, similar to Weaviate hybrid search, does not require the initialization of any external service. - Dependencies: No - Twitter handle: dayuanjian21687 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 19:16:10 -07:00
Dario Ruben	04e45f9cde	Fixed grammar in LLM models documentation (#8210 ) Description: I fixed a typo in the documentation related to LLMs (https://python.langchain.com/docs/modules/model_io/models/llms/)	2023-07-24 19:14:32 -07:00
earonesty	59a7c5877a	Update supabase.py, add filter to query (matches latest supabase docs & js) (#7721 ) - Description: Update supabase to support optional filter argument (if present, used, if not, doesn't break things) - Tag maintainer: @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 19:13:52 -07:00
Aditya S	00de334f81	Fixed sparql SELECT and UPDATE query function (#7758 ) - Description: Changed "SELECT" and "UPDTAE" intent check from "=" to "in", - Issue: Based on my own testing, most of the LLM (StarCoder, NeoGPT3, etc..) doesn't return a single word response ("SELECT" / "UPDATE") through this modification, we can accomplish the same output without curated prompt engineering. - Dependencies: None - Tag maintainer: @baskaryan - Twitter handle: @aditya_0290 Thank you for maintaining this library, Keep up the good efforts. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 18:29:30 -07:00
William FH	3662aca7d4	Add async support for transform chain (#8205 )	2023-07-24 17:45:17 -07:00
Taqi Jaffri	8f158b72fc	Added stop sequence support to replicate (#8107 ) Stop sequences are useful if you are doing long-running completions and need to early-out rather than running for the full max_length... not only does this save inference cost on Replicate, it is also much faster if you are going to truncate the output later anyway. Other LLMs support stop sequences natively (e.g. OpenAI) but I didn't see this for Replicate so adding this via their prediction cancel method. Housekeeping: I ran `make format` and `make lint`, no issues reported in the files I touched. I did update the replicate integration test and ran `poetry run pytest tests/integration_tests/llms/test_replicate.py` successfully. Finally, I am @tjaffri https://twitter.com/tjaffri for feature announcement tweets... or if you could please tag @docugami https://twitter.com/docugami we would really appreciate that :-) Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>	2023-07-24 17:34:13 -07:00
glaze	f7ad14acfa	Add etherscan document loader (#7943 ) @rlancemartin The modification includes: * etherscanLoader * test_etherscan * document ipynb I have run the test, lint, format, and spell check. I do encounter a linting error on ipynb, I am not sure how to address that. ``` docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:55: error: Name "null" is not defined [name-defined] docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:76: error: Name "null" is not defined [name-defined] Found 2 errors in 1 file (checked 1 source file) ``` - Description: The Etherscan loader uses etherscan api to load transaction histories under specific accounts on Ethereum Mainnet. - No dependency is introduced by this PR. - Twitter handle: glazecl --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 17:09:16 -07:00
Julien Salinas	73d5cba308	Allow user to modify the GPU and language settings when using NLP Cloud (#7985 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 17:08:56 -07:00
Bagatur	483f6c2fe3	mv eval docs (#8209 )	2023-07-24 16:31:20 -07:00
Liu Ming	24f889f2bc	Change with_history option to False for ChatGLM by default (#8076 ) ChatGLM LLM integration will by default accumulate conversation history(with_history=True) to ChatGLM backend api, which is not expected in most cases. This PR set with_history=False by default, user should explicitly set llm.with_history=True to turn this feature on. Related PR: #8048 #7774 --------- Co-authored-by: mlot <limpo2000@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 15:46:02 -07:00
Mahip Soni	1f055775f8	Fixing issue with MSSQL connection (#8040 ) My team recently faced an issue while using MSSQL and passing a schema name. We noticed that "SET search_path TO {self.schema}" is being called for us, which is not a valid ms-sql query, and is specific to postgresql dialect. We were able to run it locally after this fix. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 15:45:40 -07:00
Anthony Mahanna	76102971c0	ArangoDB/AQL support for Graph QA Chain (#7880 ) Description: Serves as an introduction to LangChain's support for [ArangoDB](https://github.com/arangodb/arangodb), similar to https://github.com/hwchase17/langchain/pull/7165 and https://github.com/hwchase17/langchain/pull/4881 Issue: No issue has been created for this feature Dependencies: `python-arango` has been added as an optional dependency via the `CONTRIBUTING.md` guidelines Twitter handle: [at]arangodb - Integration test has been added - Notebook has been added: [graph_arangodb_qa.ipynb](https://github.com/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb) [![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb) ``` docker run -p 8529:8529 -e ARANGO_ROOT_PASSWORD= arangodb/arangodb ``` ``` pip install git+https://github.com/amahanna/langchain.git ``` ```python from arango import ArangoClient from langchain.chat_models import ChatOpenAI from langchain.graphs import ArangoGraph from langchain.chains import ArangoGraphQAChain db = ArangoClient(hosts="localhost:8529").db(name="_system", username="root", password="", verify=True) graph = ArangoGraph(db) chain = ArangoGraphQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph) chain.run("Is Ned Stark alive?") ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 15:16:52 -07:00
Adilkhan Sarsen	3e7d2a1b64	SelfQuery support for deeplake (#7888 ) Added support SelfQuery for Deeplake	2023-07-24 14:22:33 -07:00
Leonid Ganeline	c580c81cca	docstrings `experimental` (#7969 ) - added/changed docstring for `experimental` - added/changed docstrings for different artifacts - @baskaryan	2023-07-24 14:21:48 -07:00
Leonid Ganeline	3eb4112a1f	Refactored `example_generator` (#8099 ) Refactored `example_generator.py`. The same as #7961 `example_generator.py` is in the root code folder. This creates the `langchain.example_generator: Example Generator ` group on the API Reference navigation ToC, on the same level as `Chains` and `Agents` which is not correct. Refactoring: - moved `example_generator.py` content into `chains/example_generator.py` (not in `utils` because the `example_generator` has dependencies on other LangChain classes. It also doesn't work for moving into `utilities/`) - added the backwards compatibility ref in the original `example_generator.py` @hwchase17	2023-07-24 13:36:44 -07:00
Juan José Torres	1cc7d4c9eb	Update SageMaker Endpoint Embeddings docs to be up to date with current requirements (#8103 ) - Description: Simple change of the Class that ContentHandler inherits from. To create an object of type SagemakerEndpointEmbeddings, the property content_handler must be of type EmbeddingsContentHandler not ContentHandlerBase anymore, - Twitter handle: @Juanjo_Torres11 Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 13:35:06 -07:00
Leonid Ganeline	7cbe28ba9b	Refactored `input` (#8202 ) Refactored `input.py`. The same as https://github.com/langchain-ai/langchain/pull/7961 #8098 #8099 input.py is in the root code folder. This creates the `langchain.input: Input` group on the API Reference navigation ToC, on the same level as Chains and Agents which is incorrect. Refactoring: - copied input.py file into utils/input.py - I added the backwards compatibility ref in the original input.py. - changed several imports to a new ref @hwchase17, @baskaryan	2023-07-24 13:10:03 -07:00
Monty Evans	72eb4fa4e8	Change WebBaseLoader metadata parsing to set missing metadata to descriptive string instead of `None` (#8175 ) Solves #8174 & #3542 Co-authored-by: mevans <mevans@palantir.com>	2023-07-24 12:17:49 -07:00
Bagatur	1a7d8667c8	Bagatur/gateway chat (#8198 ) Signed-off-by: dbczumar <corey.zumar@databricks.com> Co-authored-by: dbczumar <corey.zumar@databricks.com>	2023-07-24 12:17:00 -07:00
Ettore Di Giacinto	ae28568e2a	Add embeddings for LocalAI (#8134 ) Description: This PR adds embeddings for LocalAI ( https://github.com/go-skynet/LocalAI ), a self-hosted OpenAI drop-in replacement. As LocalAI can re-use OpenAI clients it is mostly following the lines of the OpenAI embeddings, however when embedding documents, it just uses string instead of sending tokens as sending tokens is best-effort depending on the model being used in LocalAI. Sending tokens is also tricky as token id's can mismatch with the model - so it's safer to just send strings in this case. Partly related to: https://github.com/hwchase17/langchain/issues/5256 Dependencies: No new dependencies Twitter: @mudler_it --------- Signed-off-by: mudler <mudler@localai.io> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 12:16:49 -07:00
Mike Nitsenko	d983046f90	Extend Cube Semantic Loader functionality (#8186 ) PR Description: This pull request introduces several enhancements and new features to the `CubeSemanticLoader`. The changes include the following: 1. Added imports for the `json` and `time` modules. 2. Added new constructor parameters: `load_dimension_values`, `dimension_values_limit`, `dimension_values_max_retries`, and `dimension_values_retry_delay`. 3. Updated the class documentation with descriptions for the new constructor parameters. 4. Added a new private method `_get_dimension_values()` to retrieve dimension values from Cube's REST API. 5. Modified the `load()` method to load dimension values for string dimensions if `load_dimension_values` is set to `True`. 6. Updated the API endpoint in the `load()` method from the base URL to the metadata endpoint. 7. Refactored the code to retrieve metadata from the response JSON. 8. Added the `column_member_type` field to the metadata dictionary to indicate if a column is a measure or a dimension. 9. Added the `column_values` field to the metadata dictionary to store the dimension values retrieved from Cube's API. 10. Modified the `page_content` construction to include the column title and description instead of the table name, column name, data type, title, and description. These changes improve the functionality and flexibility of the `CubeSemanticLoader` class by allowing the loading of dimension values and providing more detailed metadata for each document. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 12:11:58 -07:00
Bagatur	82b8d8596c	bump lc241 exp3 (#8193 )	2023-07-24 11:52:44 -07:00
Leonid Ganeline	848454d1e7	Refactored `formatting` (#8191 ) Refactored `formatting.py`. The same as https://github.com/langchain-ai/langchain/pull/7961 #8098 #8099 formatting.py is in the root code folder. This creates the `langchain.formatting: Formatting` group on the API Reference navigation ToC, on the same level as Chains and Agents which is incorrect. Refactoring: - moved formatting.py content into utils/formatting.py - I did not add the backwards compatibility ref in the original formatting.py. It seems unnecessary. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 11:34:15 -07:00
Bagatur	4928f7a9f5	undo bump (#8192 )	2023-07-24 11:32:17 -07:00
Bagatur	14aa27b5f4	redirect (#8189 )	2023-07-24 10:45:12 -07:00
Bagatur	e7d64f8b15	Bagatur/vercel test 3 (#8188 )	2023-07-24 10:11:54 -07:00
Leonid Ganeline	120cdf813d	docstrings `memory` (#8018 ) docstrings `memory`: - added module summary - added missed docstrings - updated docstrings into consistent format - @baskaryan	2023-07-24 10:05:36 -07:00
Bagatur	026269bfa9	redirects (#8183 )	2023-07-24 08:32:49 -07:00
Bagatur	d5689d58ab	Bagatur/bump 241 (#8182 )	2023-07-24 07:47:40 -07:00
Harrison Chase	3caccf304c	Harrison/hugginggpt (#8162 ) Co-authored-by: Yongliang Shen <withsyl@163.com>	2023-07-24 07:36:24 -07:00
rajib	f3908627ed	changed to mlflow-ai-gateway in llms/__init__.py (#8114 ) - Description: In the llms/__init__.py, the key name is wrong for mlflowaigateway. It should be mlflow-ai-gateway - Issue: NA - Dependencies: NA - Tag maintainer: @hwchase17, @baskaryan - Twitter handle: na Without this fix, when we run the code for mlflowaigateway, we will get error as below ValueError: Loading mlflow-ai-gateway LLM not supported --------- Co-authored-by: rajib76 <rajib76@yahoo.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-23 23:30:46 -07:00
Bagatur	c8c8635dc9	mv module integrations docs (#8101 )	2023-07-23 23:23:16 -07:00
Adarsh Shirawalmath	8ea840432f	Generalize Comment on Streaming Support for LLM Implementations and add examples (#8115 ) The example provided demonstrates the usage of the HuggingFaceTextGenInference implementation with streaming enabled.	2023-07-23 22:59:59 -07:00
Gordon Clark	80b3ec5869	GitHub toolkit improvements (#8121 ) Fixes an issue with the github tool where the API returned special objects but the tool was expecting dictionaries. Also added proper docstrings to the GitHubAPIWraper methods and a (very basic) integration test. Maintainer responsibilities: - Agents / Tools / Toolkits: @hinthornw --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-23 20:17:53 -07:00
Harrison Chase	33fd6184ba	beef up getting started (#8139 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-23 19:57:43 -07:00
Lawrence Lim	fa8906a9b7	fix typo: Entity Summary Memory documentation (#8145 ) Fixed a small typo I came across in the Memory documentation.	2023-07-23 19:36:50 -07:00
shibuiwilliam	8f5000146c	add faiss test for score threshold (#8143 ) # What - Add faiss vector search test for score threshold - Fix failing faiss vector search test; filtering with list value is wrong. <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: Add faiss vector search test for score threshold; Fix failing faiss vector search test; filtering with list value is wrong. - Issue: None - Dependencies: None - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @MlopsJ Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-23 19:36:38 -07:00
Nolan	7686dabd36	Unbreak devcontainer (#8154 ) Codespaces and devcontainer was broken by the [repo restructure](https://github.com/langchain-ai/langchain/discussions/8043). - Description: Add libs/langchain to container so it can be built without error. - Issue: - - Dependencies: - - Tag maintainer: @hwchase17 @baskaryan - Twitter handle: @finnless The failed build log says: ``` #10 [langchain-dev-dependencies 2/2] RUN poetry install --no-interaction --no-ansi --with dev,test,docs #10 sha256:e850ee99fc966158bfd2d85e82b7c57244f47ecbb1462e75bd83b981a56a1929 2023-07-23 23:30:33.692Z: #10 0.827 #10 0.827 Directory libs/langchain does not exist 2023-07-23 23:30:33.738Z: #10 ERROR: executor failed running [/bin/sh -c poetry install --no-interaction --no-ansi --with dev,test,docs]: exit code: 1 ``` The new pyproject.toml imports from libs/langchain: `77bf75c236/pyproject.toml (L14-L16)` But libs/langchain is never added to the dev.Dockerfile: `77bf75c236/libs/langchain/dev.Dockerfile (L37-L39)`	2023-07-23 19:33:47 -07:00
Fielding Johnston	fb62f2be70	nit: small typo in evaluation module docs (#8155 ) Hopefully, this doesn't come across as nitpicky! That isn't the intention. I only noticed it, because I enjoy reading the documentation and when I hit a mental road bump it is usually due to a missing word or something =) @baskaryan	2023-07-23 18:25:14 -07:00
Harrison Chase	9205919ad2	actually use input key (#8136 )	2023-07-23 18:02:45 -07:00
Leonid Ganeline	670304a8b3	simplified nmspace (#8152 ) recreated #7894 (it is easy to recreate than resolve conflicts) A small refactoring to improve the API Reference Agents table @baskaryan	2023-07-23 18:02:20 -07:00
William FH	c5b50be225	Function calling logging fixup (#8153 ) Fix bad overwriting of "functions" arg in invocation params. Cleanup precedence in the dict Clean up some inappropriate types (mapping should be dict) Example: https://dev.smith.langchain.com/public/9a7a6817-1679-49d8-8775-c13916975aae/r ![image](https://github.com/langchain-ai/langchain/assets/13333726/94cd0775-b6ef-40c3-9e5a-3ab65e466ab9)	2023-07-23 18:01:33 -07:00
SlapDrone	961a0e200f	Implement AgentExecutorIterator (#6929 ) - Description: Implements a `.iter()` method for the `AgentExecutor` class. This allows hooking into and intercepting intermediate agent steps. - Issue: #6925 - Dependencies: None - Tag maintainer: @vowelparrot @agola11 - Twitter handle: @SlapDron3 @lacicocodes --------- Co-authored-by: Lacico <Lacicocodes@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-23 18:00:22 -07:00
Harrison Chase	77bf75c236	bump experimental to 002 (#8150 )	2023-07-23 09:22:39 -07:00
Harrison Chase	e46126eac6	add llamaapi (#8140 )	2023-07-23 09:16:16 -07:00
Harrison Chase	f0eb5db670	Harrison/agent intro (#8138 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-22 22:14:59 -07:00
Harrison Chase	cbf2fc8af8	prompt ergonomics (#7799 )	2023-07-22 14:19:17 -07:00
Samuel Berthe	d81d6e874f	doc(sqldatabasechain): use views when jsonb column description is not available (#8133 ) I think the PR diff is self explaining ;) @baskaryan	2023-07-22 11:30:04 -07:00
Harrison Chase	506b21bfc2	Update MIGRATE.md	2023-07-22 09:11:43 -07:00
Harrison Chase	9854d9e5cb	cr	2023-07-22 09:07:26 -07:00
Harrison Chase	9f3073d418	bump versions (#8129 )	2023-07-22 08:46:37 -07:00
Harrison Chase	86946a47a8	Harrison/add back in experimental (#8128 )	2023-07-22 08:27:29 -07:00
Karthik Raja A	8b08687fc4	MultiOn client toolkit (#8110 ) Addition of MultiOn Client Agent Toolkit Dependencies: multion pip package This PR consists of the following: - MultiOn utility,tools and integration with agent - sample jupyter notebook. Request @hwchase17 , @hinthornw --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-22 08:19:01 -07:00
Harrison Chase	aa0e69bc98	Harrison/official pre release (#8106 )	2023-07-21 18:44:32 -07:00
Philip Kiely - Baseten	95bcf68802	add kwargs support for Baseten models (#8091 ) This bugfix PR adds kwargs support to Baseten model invocations so that e.g. the following script works properly: ```python chatgpt_chain = LLMChain( llm=Baseten(model="MODEL_ID"), prompt=prompt, verbose=False, memory=ConversationBufferWindowMemory(k=2), llm_kwargs={"max_length": 4096} ) ```	2023-07-21 13:56:27 -07:00
Harrison Chase	8dcabd9205	bump releases rc0 (#8097 )	2023-07-21 13:54:57 -07:00
Bagatur	58f65fcf12	use top nav docs (#8090 )	2023-07-21 13:52:03 -07:00
Harrison Chase	0faba034b1	add experimental release action (#8096 )	2023-07-21 13:38:35 -07:00
Harrison Chase	d353d668e4	remove CVEs (#8092 ) This PR aims to move all code with CVEs into `langchain.experimental`. Note that we are NOT yet removing from the core `langchain` package - we will give people a week to migrate here. See MIGRATE.md for how to migrate Zero changes to functionality Vulnerabilities this addresses: PALChain: - https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5752409 - https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5759265 SQLDatabaseChain - https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5759268 `load_prompt` (Python files only) - https://security.snyk.io/vuln/SNYK-PYTHON-LANGCHAIN-5725807	2023-07-21 13:32:39 -07:00
Bagatur	08c658d3f8	fix api ref (#8083 )	2023-07-21 12:37:21 -07:00
Harrison Chase	344cbd9c90	update contributor guide (#8088 )	2023-07-21 12:01:05 -07:00
Harrison Chase	17c06ee456	cr	2023-07-21 10:48:00 -07:00
Harrison Chase	da04760de1	Harrison/move experimental (#8084 )	2023-07-21 10:36:28 -07:00
Harrison Chase	f35db9f43e	(WIP) set up experimental (#7959 )	2023-07-21 09:20:24 -07:00
c-bata	623b321e75	Fix `allowed_search_types` in `VectorStoreRetriever` (#8064 ) Unexpectedly changed at `6792a3557d` <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> I guess `allowed_search_types` is unexpectedly changed in `6792a3557d`, so that we cannot specify `similarity_score_threshold` here. ```python class VectorStoreRetriever(BaseRetriever): ... allowed_search_types: ClassVar[Collection[str]] = ( "similarity", "similarityatscore_threshold", "mmr", ) @root_validator() def validate_search_type(cls, values: Dict) -> Dict: """Validate search type.""" search_type = values["search_type"] if search_type not in cls.allowed_search_types: raise ValueError(...) if search_type == "similarity_score_threshold": ... # UNREACHABLE CODE ``` VectorStores Maintainers: @rlancemartin @eyurtsev	2023-07-21 08:39:36 -07:00
Bagatur	95e369b38d	bump 239 (#8077 )	2023-07-21 07:31:14 -07:00
William FH	c38965fcba	Add embedding and vectorstore provider info as tags (#8027 ) Example: https://smith.langchain.com/public/bcd3714d-abba-4790-81c8-9b5718535867/r The vectorstore implementations aren't super standardized yet, so just adding an optional embeddings property to pass in.	2023-07-20 22:40:01 -07:00
Mohammad Mohtashim	355b7d8b86	Getting SQL cmd directly from SQLDatabase Chain. (#7940 ) - Description: Get SQL Cmd directly generated by SQL-Database Chain without executing it in the DB engine. - Issue: #4853 - Tag maintainer: @hinthornw,@baskaryan --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-20 22:36:55 -07:00
Lance Martin	5a084e1b20	Async HTML loader and HTML2Text transformer (#8036 ) New HTML loader that asynchronously loader a list of urls. New transformer using [HTML2Text](https://github.com/Alir3z4/html2text/) for HTML to clean, easy-to-read plain ASCII text (valid Markdown).	2023-07-20 22:30:59 -07:00
Wey Gu	cf60cff1ef	feat: Add with_history option for chatglm (#8048 ) In certain 0-shot scenarios, the existing stateful language model can unintentionally send/accumulate the .history. This commit adds the "with_history" option to chatglm, allowing users to control the behavior of .history and prevent unintended accumulation. Possible reviewers @hwchase17 @baskaryan @mlot Refer to discussion over this thread: https://twitter.com/wey_gu/status/1681996149543276545?s=20	2023-07-20 22:25:37 -07:00
Harrison Chase	1f3b987860	Harrison/GitHub toolkit (#8047 ) Co-authored-by: Trevor Dobbertin <trevordobbertin@gmail.com>	2023-07-20 22:24:55 -07:00
Leonid Ganeline	ae8bc9e830	Refactored `sql_database` (#7945 ) The `sql_database.py` is unnecessarily placed in the root code folder. A similar code is usually placed in the `utilities/`. As a byproduct of this placement, the sql_database is [placed on the top level of classes in the API Reference](https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.sql_database) which is confusing and not correct. - moved the `sql_database.py` from the root code folder to the `utilities/` @baskaryan --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-20 22:17:55 -07:00
William FH	dc9d6cadab	Dedup methods (#8049 )	2023-07-20 22:13:22 -07:00
Harrison Chase	f99f497b2c	Harrison/predibase (#8046 ) Co-authored-by: Abhay Malik <32989166+Abhay-765@users.noreply.github.com>	2023-07-20 19:26:50 -07:00
Jacob Lee	56c6ab1715	Fix bad docs sidebar header (#7966 ) Quick fix for: <img width="283" alt="Screenshot 2023-07-19 at 2 49 44 PM" src="https://github.com/hwchase17/langchain/assets/6952323/91e4868c-b75e-413d-9f8f-d34762abf164"> CC @baskaryan	2023-07-20 19:06:57 -07:00
Wian Stipp	ebc5ff2948	HuggingFaceTextGenInference bug fix: Multiple values for keyword argument (#8044 ) Fixed the bug causing: `TypeError: generate() got multiple values for keyword argument 'stop_sequences'` ```python res = await self.async_client.generate( prompt, self._default_params, stop_sequences=stop, kwargs, ) ``` The above throws an error because stop_sequences is in also in the self._default_params. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 19:05:08 -07:00
Kacper Łukawski	ed6a5532ac	Implement async support in Qdrant local mode (#8001 ) I've extended the support of async API to local Qdrant mode. It is faked but allows prototyping without spinning a container. The tests are improved to test the in-memory case as well. @baskaryan @rlancemartin @eyurtsev @agola11	2023-07-20 19:04:33 -07:00
Bagatur	7717c24fc4	fix redis cache chat model (#8041 ) Redis cache currently stores model outputs as strings. Chat generations have Messages which contain more information than just a string. Until Redis cache supports fully storing messages, cache should not interact with chat generations.	2023-07-20 19:00:05 -07:00
Taqi Jaffri	973593c5c7	Added streaming support to Replicate (#8045 ) Streaming support is useful if you are doing long-running completions or need interactivity e.g. for chat... adding it to replicate, using a similar pattern to other LLMs that support streaming. Housekeeping: I ran `make format` and `make lint`, no issues reported in the files I touched. I did update the replicate integration test but ran into some issues, specifically: 1. The original test was failing for me due to the model argument not being specified... perhaps this test is not regularly run? I fixed it by adding a call to the lightweight hello world model which should not be burdensome for replicate infra. 2. I couldn't get the `make integration_tests` command to pass... a lot of failures in other integration tests due to missing dependencies... however I did make sure the particluar test file I updated does pass, by running `poetry run pytest tests/integration_tests/llms/test_replicate.py` Finally, I am @tjaffri https://twitter.com/tjaffri for feature announcement tweets... or if you could please tag @docugami https://twitter.com/docugami we would really appreciate that :-) Tagging model maintainers @hwchase17 @baskaryan Thank for all the awesome work you folks are doing. --------- Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>	2023-07-20 18:59:54 -07:00
Piyush Jain	31b7ddc12c	Neptune graph and openCypher QA Chain (#8035 ) ## Description This PR adds a graph class and an openCypher QA chain to work with the Amazon Neptune database. ## Dependencies `requests` which is included in the LangChain dependencies. ## Maintainers for Review @krlawrence @baskaryan ### Twitter handle pjain7	2023-07-20 18:56:47 -07:00
Leonid Ganeline	995220b797	Refactored `math_utils` (#7961 ) `math_utils.py` is in the root code folder. This creates the `langchain.math_utils: Math Utils` group on the API Reference navigation ToC, on the same level with `Chains` and `Agents` which is not correct. Refactoring: - created the `utils/` folder - moved `math_utils.py` to `utils/math.py` - moved `utils.py` to `utils/utils.py` - split `utils.py` into `utils.py, env.py, strings.py` - added module description @baskaryan	2023-07-20 18:55:43 -07:00
Paolo Picello	5137f40dd6	Update mongodb_atlas.py docstrings (#8033 ) Hi all, I just added the "index_name" parameter to the docstrings for mongodb_atlas.py (it is missing in the [public doc page](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.mongodb_atlas.MongoDBAtlasVectorSearch.html#langchain-vectorstores-mongodb-atlas-mongodbatlasvectorsearch). Thanks	2023-07-20 17:35:07 -07:00
felixocker	9226fda58b	fix: create schema description from URIs and str w/out rdflib warnings (#8025 ) - Description: fix to avoid rdflib warnings when concatenating URIs and strings to create the text snippet for the knowledge graph's schema. @marioscrock pointed this out in a comment related to #7165 - Issue: None, but the problem was mentioned as a comment in #7165 - Dependencies: None - Tag maintainer: Related to memory -> @hwchase17, maybe @baskaryan as it is a fix	2023-07-20 15:55:19 -07:00
Emory Petermann	7239d57a53	Update Golden integration documentation (#8030 ) fixes some typos and cleans up onboarding for golden, thank you! @hinthornw	2023-07-20 15:53:44 -07:00
Jonathon Belotti	021bb9be84	Update Modal.com integration docs (#8014 ) Hey, I'm a Modal Labs engineer and I'm making this docs update after getting a user question in [our beta Slack space](https://join.slack.com/t/modalbetatesters/shared_invite/zt-1xl9gbob8-1QDgUY7_PRPg6dQ49hqEeQ) about the Langchain integration docs. 🔗 [Modal beta-testers link to docs discussion thread](https://modalbetatesters.slack.com/archives/C031Z7DBQFL/p1689777700594819?thread_ts=1689775859.855849&cid=C031Z7DBQFL)	2023-07-20 15:53:06 -07:00
Jeffrey Wang	62d0475c29	Add Metaphor new field and reformat docs (#8022 ) This PR reformats our python notebook example and also adds a new field we have. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-07-20 15:50:54 -07:00
William FH	e2a99bd169	Different error strings (#8010 )	2023-07-20 09:58:25 -07:00
Bagatur	ec4f93b629	bump 238 (#8012 )	2023-07-20 09:21:15 -07:00
vrushankportkey	5f10d2ea1d	Add Portkey LLMOps integration (#7877 ) Integrating Portkey, which adds production features like caching, tracing, tagging, retries, etc. to langchain apps. - Dependencies: None - Twitter handle: https://twitter.com/portkeyai - test_portkey.py added for tests - example notebook added in new utilities folder in modules Also fixed a bug with OpenAIEmbeddings where headers weren't passing. cc @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 09:08:44 -07:00
Boris Nieuwenhuis	095937ad52	Add google place ID to google places tool response (#7789 ) - Description: this change will add the google place ID of the found location to the response of the GooglePlacesTool - Issue: Not applicable - Dependencies: no dependencies - Tag maintainer: @hinthornw - Twitter handle: Not applicable	2023-07-20 09:04:31 -07:00
Bagatur	7c24a6b9d1	Bagatur/apify (#8008 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> --------- Co-authored-by: Jiří Moravčík <jiri.moravcik@gmail.com> Co-authored-by: Jan Čurn <jan.curn@gmail.com>	2023-07-20 08:36:01 -07:00
Aiden Le	1d7414a371	Feature: Add openai_api_model attribute to Doctran models (#7868 ) - Description: Added the ability to define the open AI model. - Issue: Currently the Doctran instance uses gpt-4 by default, this does not work if the user has no access to gpt -4. - rlancemartin, @eyurtsev, @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 07:27:56 -07:00
Dwai Banerjee	d8c40253c3	Adding endpoint_url to embeddings/bedrock.py and updated docs (#7927 ) BedrockEmbeddings does not have endpoint_url so that switching to custom endpoint is not possible. I have access to Bedrock custom endpoint and cannot use BedrockEmbeddings --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 07:25:59 -07:00
Bagatur	ea028b66ab	undo vectstore memory bug (#8007 )	2023-07-20 07:25:23 -07:00
Mohammad Mohtashim	453d4c3a99	VectorStoreRetrieverMemory exclude additional input keys feature (#7941 ) - Description: Added a parameter in VectorStoreRetrieverMemory which filters the input given by the key when constructing the buffering the document for Vector. This feature is helpful if you have certain inputs apart from the VectorMemory's own memory_key that needs to be ignored e.g when using combined memory, we might need to filter the memory_key of the other memory, Please see the issue. - Issue: #7695 - Tag maintainer: @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 07:23:27 -07:00
Constantin Musca	d593833e4d	Add Golden Query Tool (#7930 ) Description: Golden Query is a wrapper on top of the [Golden Query API](https://docs.golden.com/reference/query-api) which enables programmatic access to query results on entities across Golden's Knowledge Base. For more information about Golden API, please see the [Golden API Getting Started](https://docs.golden.com/reference/getting-started) page. Issue: None Dependencies: requests(already present in project) Tag maintainer: @hinthornw Signed-off-by: Constantin Musca <constantin.musca@gmail.com>	2023-07-20 07:03:20 -07:00
eahova	aea97efe8b	Adding code to allow pandas to show all columns instead of truncating… (#7901 ) - Description: Adding code to set pandas dataframe to display all the columns. Otherwise, some data get truncated (it puts a "..." in the middle and just shows the first 4 and last 4 columns) and the LLM doesn't realize it isn't getting the full data. Default value is 8, so this helps Dataframes larger than that. - Issue: none - Dependencies: none - Tag maintainer: @hinthornw - Twitter handle: none	2023-07-20 07:02:01 -07:00
Santiago Delgado	c416dbe8e0	Amadeus Flight and Travel Search Tool (#7890 ) ## Background With the addition on email and calendar tools, LangChain is continuing to complete its functionality to automate business processes. ## Challenge One of the pieces of business functionality that LangChain currently doesn't have is the ability to search for flights and travel in order to book business travel. ## Changes This PR implements an integration with the [Amadeus](https://developers.amadeus.com/) travel search API for LangChain, enabling seamless search for flights with a single authentication process. ## Who can review? @hinthornw ## Appendix @tsolakoua and @minjikarin, I utilized your [amadeus-python](https://github.com/amadeus4dev/amadeus-python) library extensively. Given the rising popularity of LangChain and similar AI frameworks, the convergence of libraries like amadeus-python and tools like this one is likely. So, I wanted to keep you updated on our progress. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 06:59:29 -07:00
Hanit	ea149dbd89	Allowing outside parameters for Qdrant. (#7910 ) @baskaryan @rlancemartin, @eyurtsev	2023-07-20 06:58:54 -07:00
Sheik Irfan Basha	d6493590da	Add Verbose support (#7982 ) (#7984 ) - Description: Add verbose support for the extraction_chain - Issue: Fixes #7982 - Dependencies: NA - Twitter handle: sheikirfanbasha @hwchase17 and @agola11 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 06:52:13 -07:00
Junlin Zhou	812a1643db	chore(hf-text-gen): extract default params for reusing (#7929 ) This PR extract common code (default generation params) for `HuggingFaceTextGenInference`. Co-authored-by: Junlin Zhou <jlzhou@zjuici.com>	2023-07-20 06:49:12 -07:00
Yun Kim	54e02e4392	Add datadog-langchain integration doc (#7955 ) ## Description Added a doc about the [Datadog APM integration for LangChain](https://github.com/DataDog/dd-trace-py/pull/6137). Note that the integration is on `ddtrace`'s end and so no code is introduced/required by this integration into the langchain library. For that reason I've refrained from adding an example notebook (although I've added setup instructions for enabling the integration in the doc) as no code is technically required to enable the integration. Tagging @baskaryan as reviewer on this PR, thank you very much! ## Dependencies Datadog APM users will need to have `ddtrace` installed, but the integration is on `ddtrace` end and so does not introduce any external dependencies to the LangChain project. Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 06:44:58 -07:00
Wian Stipp	0ffb7fc10c	One Line Fix: missing text output with huggingface TGI LLM (#7972 ) Small bug fix. The async _call method was missing a line to return the generated text. @baskaryan	2023-07-20 06:44:29 -07:00
Jithin James	493cbc9410	docs: fix a couple of small indentation errors in the strings (#7951 ) Fixed a few indentations I came across in the docs @baskaryan	2023-07-20 06:34:01 -07:00
Bhashithe Abeysinghe	73901ef132	Added windows specific instructions to Llama.cpp documentation. (#8000 ) - Description: Added windows specific instructions on llama.cpp in the notebook file - Issue: #6356 - Dependencies: None - Tag maintainer: @baskaryan	2023-07-20 06:31:25 -07:00
Leonid Ganeline	24b26a922a	docstrings for `embeddings` (#7973 ) Added/updated docstrings for the `embeddings` @baskaryan	2023-07-20 06:26:44 -07:00
Leonid Ganeline	0613ed5b95	docstrings for `LLMs` (#7976 ) docstrings for the `llms/`: - added missed docstrings - update existing docstrings to consistent format (no `Wrappers`!) @baskaryan	2023-07-20 06:26:16 -07:00
Jeff Huber	5694e7b8cf	Update chroma notebook (#7978 ) Fix up the Chroma notebook - remove `.persist()` -- this is no longer in Chroma as of `0.4.0` - update output to match `0.4.0` - other cleanup work	2023-07-20 06:25:31 -07:00
Harutaka Kawamura	4a5894db47	Fix incorrect field name in MLflow AI Gateway config example (#7983 )	2023-07-20 06:24:59 -07:00
Kacper Łukawski	19e8472521	Add async Qdrant to async_agent.ipynb (#7993 ) I added Qdrant to the async API docs. This is the only vector store that supports full async API. @baskaryan @rlancemartin, @eyurtsev	2023-07-20 06:23:15 -07:00
Nuno Campos	8edb1db9dc	Fix key errors in weaviate hybrid retriever init (#7988 )	2023-07-20 06:22:18 -07:00
Harrison Chase	df84e1bb64	pass callbacks along baby ai (#7908 )	2023-07-19 22:40:33 -07:00
William FH	a4c5914c9a	Bump LS Version (#7970 )	2023-07-19 17:12:16 -07:00
Bagatur	5d021c0962	nb fix (#7962 )	2023-07-19 15:27:43 -07:00
Julien Salinas	3adab5e5be	Integrate NLP Cloud embeddings endpoint (#7931 ) Add embeddings for [NLPCloud](https://docs.nlpcloud.com/#embeddings). --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-19 15:27:34 -07:00
Bagatur	854a2be0ca	Add debugging guide (#7956 )	2023-07-19 14:15:11 -07:00
Brendan Collins	9aef79c2e3	Add Geopandas.GeoDataFrame Document Loader (#3817 ) Work in Progress. WIP Not ready... Adds Document Loader support for [Geopandas.GeoDataFrames](https://geopandas.org/) Example: - [x] stub out `GeoDataFrameLoader` class - [x] stub out integration tests - [ ] Experiment with different geometry text representations - [ ] Verify CRS is successfully added in metadata - [ ] Test effectiveness of searches on geometries - [ ] Test with different geometry types (point, line, polygon with multi-variants). - [ ] Add documentation --------- Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>	2023-07-19 12:14:41 -07:00
Lance Martin	dfc533aa74	Add llama-v2 to local document QA (#7952 )	2023-07-19 11:15:47 -07:00
Bagatur	d9b5bcd691	bump (#7948 )	2023-07-19 10:23:21 -07:00
Bagatur	f97535b33e	fix (#7947 )	2023-07-19 10:23:10 -07:00
Adilkhan Sarsen	7bb843477f	Removed kwargs from add_texts (#7595 ) Removing **kwargs argument from add_texts method in DeepLake vectorstore as it confuses users and doesn't fail when user is typing incorrect parameters. Also added small test to ensure the change is applies correctly. Guys could pls take a look: @rlancemartin, @eyurtsev, this is a small PR. Thx so much!	2023-07-19 09:23:49 -07:00
Bagatur	4d8b48bdb3	bump 236 (#7938 )	2023-07-19 07:51:40 -07:00
Harutaka Kawamura	f6839a8682	Add integration for MLflow AI Gateway (#7113 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> - Adds integration for MLflow AI Gateway (this will be shipped in MLflow 2.5 this week). Manual testing: ```sh # Move to mlflow repo cd /path/to/mlflow # install langchain pip install git+https://github.com/harupy/langchain.git@gateway-integration # launch gateway service mlflow gateway start --config-path examples/gateway/openai/config.yaml # Then, run the examples in this PR ```	2023-07-19 07:40:55 -07:00
David Preti	6792a3557d	Update openai.py compatibility with azure 2023-07-01-preview (#7937 ) Fixed missing "content" field in azure. Added a check for "content" in _dict (missing for azure api=2023-07-01-preview) @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-19 07:31:18 -07:00
王斌(Bin Wang)	b65102bdb2	fix: pgvector search_type of similarity_score_threshold not working (#7771 ) - Description: VectorStoreRetriever->similarity_score_threshold with search_type of "similarity_score_threshold" not working with the following two minor issues, - Issue: 1. In line 237 of `vectorstores/base.py`, "score_threshold" is passed to `_similarity_search_with_relevance_scores` as in the kwargs, while score_threshold is not a valid argument of this method. As a fix, before calling `_similarity_search_with_relevance_scores`, score_threshold is popped from kwargs. 2. In line 596 to 607 of `vectorstores/pgvector.py`, it's checking the distance_strategy against the string in Enum. However, self.distance_strategy will get the property of distance_strategy from line 316, where the callable function is passed. To solve this issue, self.distance_strategy is changed to self._distance_strategy to avoid calling the property method., - Dependencies: No, - Tag maintainer: @rlancemartin, @eyurtsev, - Twitter handle: No --------- Co-authored-by: Bin Wang <bin@arcanum.ai>	2023-07-19 07:20:52 -07:00
William FH	9d7e57f5c0	Docs Nit (#7918 )	2023-07-18 21:47:28 -07:00
Wilson Leao Neto	8bb33f2296	Exposes Kendra result item DocumentAttributes in the document metadata (#7781 ) - Description: exposes the ResultItem DocumentAttributes as document metadata with key 'document_attributes' and refactors AmazonKendraRetriever by providing a ResultItem base class in order to avoid duplicate code; - Tag maintainer: @3coins @hupe1980 @dev2049 @baskaryan - Twitter handle: wilsonleao ### Why? Some use cases depend on specific document attributes returned by the retriever in order to improve the quality of the overall completion and adjust what will be displayed to the user. For the sake of consistency, we need to expose the DocumentAttributes as document metadata so we are sure that we are using the values returned by the kendra request issued by langchain. I would appreciate your review @3coins @hupe1980 @dev2049. Thank you in advance! ### References - [Amazon Kendra DocumentAttribute](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DocumentAttribute.html) - [Amazon Kendra DocumentAttributeValue](https://docs.aws.amazon.com/kendra/latest/APIReference/API_DocumentAttributeValue.html) --------- Co-authored-by: Piyush Jain <piyushjain@duck.com>	2023-07-18 18:46:38 -07:00
Wilson Leao Neto	efa67ed0ef	fix #7782 : check title and excerpt separately for page_content (#7783 ) - Description: check title and excerpt separately for page_content so that if title is empty but excerpt is present, the page_content will only contain the excerpt - Issue: #7782 - Tag maintainer: @3coins @baskaryan - Twitter handle: wilsonleao	2023-07-18 18:46:23 -07:00
Leonid Ganeline	d92926cbc2	docstrings `chains` (#7892 ) Added/updated docstrings.	2023-07-18 18:25:42 -07:00
Leonid Ganeline	4a810756f8	docstrings `chains` (#7892 ) Added/updated docstrings. @baskaryan	2023-07-18 18:25:27 -07:00
Jarek Kazmierczak	f2ef3ff54a	Google Cloud Enterprise Search retriever (#7857 ) Added a retriever that encapsulated Google Cloud Enterprise Search. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 18:24:08 -07:00
Alonso Silva Allende	1152f4d48b	Allow chat models that do not return token usage (#7907 ) - Description: It allows to use chat models that do not return token usage - Issue: [#7900](https://github.com/hwchase17/langchain/issues/7900) - Dependencies: None - Tag maintainer: @agola11 @hwchase17 - Twitter handle: @alonsosilva --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>	2023-07-18 18:12:09 -07:00
Zizhong Zhang	bdf0c2267f	docs(custom_chain) fix typo (#7898 ) Fix typo in the document of custom_chain	2023-07-18 18:03:19 -07:00
Jeff Huber	2139d0197e	upgrade chroma to 0.4.0 (#7749 ) This should land Monday the 17th Chroma is upgrading from `0.3.29` to `0.4.0`. `0.4.0` is easier to build, more durable, faster, smaller, and more extensible. This comes with a few changes: 1. A simplified and improved client setup. Instead of having to remember weird settings, users can just do `EphemeralClient`, `PersistentClient` or `HttpClient` (the underlying direct `Client` implementation is also still accessible) 2. We migrated data stores away from `duckdb` and `clickhouse`. This changes the api for the `PersistentClient` that used to reference `chroma_db_impl="duckdb+parquet"`. Now we simply set `is_persistent=true`. `is_persistent` is set for you to `true` if you use `PersistentClient`. 3. Because we migrated away from `duckdb` and `clickhouse` - this also means that users need to migrate their data into the new layout and schema. Chroma is committed to providing extension notification and tooling around any schema and data migrations (for example - this PR!). After upgrading to `0.4.0` - if users try to access their data that was stored in the previous regime, the system will throw an `Exception` and instruct them how to use the migration assistant to migrate their data. The migration assitant is a pip installable CLI: `pip install chroma_migrate`. And is runnable by calling `chroma_migrate` -- TODO ADD here is a short video demonstrating how it works. Please reference the readme at [chroma-core/chroma-migrate](https://github.com/chroma-core/chroma-migrate) to see a full write-up of our philosophy on migrations as well as more details about this particular migration. Please direct any users facing issues upgrading to our Discord channel called [#get-help](https://discord.com/channels/1073293645303795742/1129200523111841883). We have also created a [email listserv](https://airtable.com/shrHaErIs1j9F97BE) to notify developers directly in the future about breaking changes. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 17:20:54 -07:00
Gergely Papp	10246375a5	Gpapp/chromadb (#7891 ) - Description: version check to make sure chromadb >=0.4.0 does not throw an error, and uses the default sqlite persistence engine when the directory is set, - Issue: the issue #7887 For attention of - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 17:03:42 -07:00
Lance Martin	41c841ec85	Add Llama-v2 to Llama.cpp notebook (#7913 )	2023-07-18 15:13:27 -07:00
Bagatur	b9639f6067	fix docs (#7911 )	2023-07-18 14:25:45 -07:00
Jeff Huber	dc8b790214	Improve vector store onboarding exp (#6698 ) This PR - fixes the `similarity_search_by_vector` example, makes the code run and adds the example to mirror `similarity_search` - reverts back to chroma from faiss to remove sharp edges / create a happy path for new developers. (1) real metadata filtering, (2) expected functionality like `update`, `delete`, etc to serve beyond the most trivial use cases @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 13:48:42 -07:00
Bagatur	25a2bdfb70	add pr template instructions (#7904 )	2023-07-18 13:22:28 -07:00
Hanit	0d23c0c82a	Allowing additional params for OpenAIEmbeddings. (#7752 ) (#7654) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 12:14:51 -07:00
Lance Martin	862268175e	Add llama-v2 to docs (#7893 )	2023-07-18 12:09:09 -07:00
TRY-ER	21d1c988a9	Try er/redis index retrieval retry00 (#7773 ) Replace this comment with: - Description: Modified the code to return the document id from the redis document search as metadata. - Issue: the issue # it fixes retrieval of id as metadata as string - Tag maintainer: @rlancemartin, @eyurtsev --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 10:49:50 -07:00
shibuiwilliam	177baef3a1	Add test for svm retriever (#7768 ) # What - This is to add unit test for svm retriever. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 09:57:24 -07:00
Filip Michalsky	69b9db2b5e	Notebook update: sales agent with tools (#7753 ) - Description: This is an update to a previously published notebook. Sales Agent now has access to tools, and this notebook shows how to use a Product Knowledge base to reduce hallucinations and act as a better sales person! - Issue: N/A - Dependencies: `chromadb openai tiktoken` - Tag maintainer: @baskaryan @hinthornw - Twitter handle: @FilipMichalsky	2023-07-18 09:53:12 -07:00
shibuiwilliam	f29a5d4bcc	add test for knn retriever (#7769 ) # What - This is to add test for knn retriever. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 09:52:11 -07:00
Orgil	75d3f1e5e6	remove unused import in voice assistant doc (#7757 ) Description: Removed unused import in voice_assistant doc. Tag maintainer: @baskaryan	2023-07-18 09:51:28 -07:00
maciej-skorupka	c6d1d6d7fc	feat: moving azure OpenAI API version to the latest 2023-05-15 (#7764 ) Moving to the latest non-preview Azure OpenAI API version=2023-05-15. The previous 2023-03-15-preview doesn't have support, SLA etc. For instance, OpenAI SDK has moved to this version https://github.com/openai/openai-python/releases/tag/v0.27.7 @baskaryan	2023-07-18 09:50:15 -07:00
satorioh	259a409998	docs(zilliz): connection_args add token description for serverless cl… (#7810 ) Description: Currently, Zilliz only support dedicated clusters using a pair of username and password for connection. Regarding serverless clusters, they can connect to them by using API keys( [ see official note detail](https://docs.zilliz.com/docs/manage-cluster-credentials)), so I add API key(token) description in Zilliz docs to make it more obvious and convenient for this group of users to better utilize Zilliz. No changes done to code. --------- Co-authored-by: Robin.Wang <3Jg$94sbQ@q1> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 09:31:39 -07:00
shibuiwilliam	235264a246	Add/test faiss (#7809 ) # What - Add missing test cases to faiss vectore stores	2023-07-18 08:30:35 -07:00
maciej-skorupka	5de7815310	docs: added comment from azure llm to azure chat about GPT-4 (#7884 ) Azure GPT-4 models can't be accessed via LLM model. It's easy to miss that and a lot of discussions about that are on the Internet. Therefore I added a comment in Azure LLM docs that mentions that and points to Azure Chat OpenAI docs. @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 08:05:41 -07:00
Leonid Ganeline	4a05b7f772	docstrings `prompts` (#7844 ) Added missed docstrings in `prompts` @baskaryan	2023-07-18 07:58:22 -07:00
Bill Zhang	dda11d2a05	WeaviateHybridSearchRetriever option to enable scores. (#7861 ) Description: This PR adds the option to retrieve scores and explanations in the WeaviateHybridSearchRetriever. This feature improves the usability of the retriever by allowing users to understand the scoring logic behind the search results and further refine their search queries. Issue: This PR is a solution to the issue #7855 Dependencies: This PR does not introduce any new dependencies. Tag maintainer: @rlancemartin, @eyurtsev I have included a unit test for the added feature, ensuring that it retrieves scores and explanations correctly. I have also included an example notebook demonstrating its use.	2023-07-18 07:57:17 -07:00
Leonid Ganeline	527210972e	docstrings `output_parsers` (#7859 ) Added/updated the docstrings from `output_parsers` @baskaryan	2023-07-18 07:51:44 -07:00
Jonathan Pedoeem	c460c29a64	Adding Docs for `PromptLayerCallbackHandler` (#7860 ) Here I am adding documentation for the `PromptLayerCallbackHandler`. When we created the initial PR for the callback handler the docs were causing issues, so we merged without the docs.	2023-07-18 07:51:16 -07:00
ljeagle	3902b85657	Add metadata and page_content filters of documents in AwaDB (#7862 ) 1. Add the metadata filter of documents. 2. Add the text page_content filter of documents 3. fix the bug of similarity_search_with_score Improvement and fix bug of AwaDB Fix the conflict https://github.com/hwchase17/langchain/pull/7840 @rlancemartin @eyurtsev Thanks! --------- Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-07-18 07:50:17 -07:00
German Martin	f1eaa9b626	Lost in the middle: We have been ordering documents the WRONG way. (for long context) (#7520 ) Motivation, it seems that when dealing with a long context and "big" number of relevant documents we must avoid using out of the box score ordering from vector stores. See: https://arxiv.org/pdf/2306.01150.pdf So, I added an additional parameter that allows you to reorder the retrieved documents so we can work around this performance degradation. The relevance respect the original search score but accommodates the lest relevant document in the middle of the context. Extract from the paper (one image speaks 1000 tokens): ![image](https://github.com/hwchase17/langchain/assets/1821407/fafe4843-6e18-4fa6-9416-50cc1d32e811) This seems to be common to all diff arquitectures. SO I think we need a good generic way to implement this reordering and run some test in our already running retrievers. It could be that my approach is not the best one from the architecture point of view, happy to have a discussion about that. For me this was the best place to introduce the change and start retesting diff implementations. @rlancemartin, @eyurtsev --------- Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-18 07:45:15 -07:00
Bagatur	6a32f93669	add ls link (#7847 )	2023-07-18 07:39:26 -07:00
Leonid Ganeline	17956ff08e	docstrings `agents` (#7866 ) Added/Updated docstrings for `agents` @baskaryan	2023-07-18 02:23:24 -07:00
William FH	c6f2d27789	Docs Nits (#7874 ) Add links to reference docs	2023-07-18 01:50:14 -07:00
William FH	3179ee3a56	Evals docs (#7460 ) Still don't have good "how to's", and the guides / examples section could be further pruned and improved, but this PR adds a couple examples for each of the common evaluator interfaces. - [x] Example docs for each implemented evaluator - [x] "how to make a custom evalutor" notebook for each low level APIs (comparison, string, agent) - [x] Move docs to modules area - [x] Link to reference docs for more information - [X] Still need to finish the evaluation index page - ~[ ] Don't have good data generation section~ - ~[ ] Don't have good how to section for other common scenarios / FAQs like regression testing, testing over similar inputs to measure sensitivity, etc.~	2023-07-18 01:00:01 -07:00
William FH	d87564951e	LS0010 (#7871 ) Bump langsmith version. Has some additional UX improvements	2023-07-18 00:28:37 -07:00
William FH	e294ba475a	Some mitigations for RCE in PAL chain (#7870 ) Some docstring / small nits to #6003 --------- Co-authored-by: BoazWasserman <49598618+boazwasserman@users.noreply.github.com> Co-authored-by: HippoTerrific <49598618+HippoTerrific@users.noreply.github.com> Co-authored-by: Or Raz <orraz1994@gmail.com>	2023-07-17 22:58:47 -07:00
Nicolas	46330da2e7	docs: Mendable: Fixes pretty sources not working (#7863 ) This new version fixes the"Verified Sources" display that got broken. Instead of displaying the full URL, it shows the title of the page the source is from.	2023-07-17 18:23:46 -07:00
Leonid Ganeline	f5ae8f1980	docstrings `tools` (#7848 ) Added docstrings in `tools`. @baskaryan	2023-07-17 17:50:19 -07:00
Leonid Ganeline	74b701f42b	docstrings `retrievers` (#7858 ) Added/updated docstrings `retrievers` @baskaryan	2023-07-17 17:47:17 -07:00
Jasper	5b4d53e8ef	Add text_content kwarg to BrowserlessLoader (#7856 ) Added keyword argument to toggle between getting the text content of a site versus its HTML when using the `BrowserlessLoader`	2023-07-17 17:02:19 -07:00
William FH	2aa3cf4e5f	update notebook (#7852 )	2023-07-17 14:46:42 -07:00
Matt Robinson	3c489be773	feat: optional post-processing for Unstructured loaders (#7850 ) ### Summary Adds a post-processing method for Unstructured loaders that allows users to optionally modify or clean extracted elements. ### Testing ```python from langchain.document_loaders import UnstructuredFileLoader from unstructured.cleaners.core import clean_extra_whitespace loader = UnstructuredFileLoader( "./example_data/layout-parser-paper.pdf", mode="elements", post_processors=[clean_extra_whitespace], ) docs = loader.load() docs[:5] ``` ### Reviewrs - @rlancemartin - @eyurtsev - @hwchase17	2023-07-17 12:13:05 -07:00
Bagatur	2a315dbee9	fix nb (#7843 )	2023-07-17 09:39:11 -07:00
Bagatur	3f1302a4ab	bump 235 (#7836 )	2023-07-17 09:37:20 -07:00
Mike Lambert	9cdea4e0e1	Update to Anthropic's claude-v2 (#7793 )	2023-07-17 08:55:49 -07:00
Bagatur	98c48f303a	fix (#7838 )	2023-07-17 07:53:11 -07:00
Bagatur	111bd7ddbe	specify comparators (#7805 )	2023-07-17 07:30:48 -07:00
Dayuan Jiang	ee40d37098	add bm25 module (#7779 ) - Description: Add a BM25 Retriever that do not need Elastic search - Dependencies: rank_bm25(if it is not installed it will be install by using pip, just like TFIDFRetriever do) - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: DayuanJian21687 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-17 07:30:17 -07:00
Liu Ming	fa0a9e502a	Add LLM for ChatGLM(2)-6B API (#7774 ) Description: Add LLM for ChatGLM-6B & ChatGLM2-6B API Related Issue: Will the langchain support ChatGLM? #4766 Add support for selfhost models like ChatGLM or transformer models #1780 Dependencies: No extra library install required. It wraps api call to a ChatGLM(2)-6B server(start with api.py), so api endpoint is required to run. Tag maintainer: @mlot Any comments on this PR would be appreciated. --------- Co-authored-by: mlot <limpo2000@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-17 07:27:17 -07:00
sseide	25e3d3f283	Support Redis Sentinel database connections (#5196 ) # Support Redis Sentinel database connections This PR adds the support to connect not only to Redis standalone servers but High Availability Replication sets too (https://redis.io/docs/management/sentinel/) Redis Replica Sets have on Master allowing to write data and 2+ replicas with read-only access to the data. The additional Redis Sentinel instances monitor all server and reconfigure the RW-Master on the fly if it comes unavailable. Therefore all connections must be made through the Sentinels the query the current master for a read-write connection. This PR adds basic support to also allow a redis connection url specifying a Sentinel as Redis connection. Redis documentation and Jupyter notebook with Redis examples are updated to mention how to connect to a redis Replica Set with Sentinels - Remark - i did not found test cases for Redis server connections to add new cases here. Therefor i tests the new utility class locally with different kind of setups to make sure different connection urls are working as expected. But no test case here as part of this PR.	2023-07-17 07:18:51 -07:00
Yifei Song	2e47412073	Add Xorbits agent (#7647 ) - [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source computing framework that makes it easy to scale data science and machine learning workloads in parallel. Xorbits can leverage multi cores or GPUs to accelerate computation on a single machine, or scale out up to thousands of machines to support processing terabytes of data. - This PR added support for the Xorbits agent, which allows langchain to interact with Xorbits Pandas dataframe and Xorbits Numpy array. - Dependencies: This change requires the Xorbits library to be installed in order to be used. `pip install xorbits` - Request for review: @hinthornw - Twitter handle: https://twitter.com/Xorbitsio	2023-07-17 07:09:51 -07:00
Ankush Gola	ff3aada0b2	minor langsmith notebook fixes (#7814 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-16 21:27:03 -07:00
William FH	ca79044948	Export Tracer from callbacks (#7812 ) Improve discoverability	2023-07-16 20:58:13 -07:00
William FH	beb38f4f4d	Share client in evaluation callback (#7807 ) Guarantee the evaluator traces go to same endpoint	2023-07-16 17:47:38 -07:00
William FH	1db13e8a85	Fix chat example output mapper (#7808 ) Was only serializing when no key was provided	2023-07-16 17:47:05 -07:00
William FH	c58d35765d	Add examples to docstrings (#7796 ) and: - remove dataset name from autogenerated project name - print out project name to view	2023-07-16 12:05:56 -07:00
William FH	ed97af423c	Accept LLM via constructor (#7794 )	2023-07-16 08:46:36 -07:00
Ankush Gola	c4ece52dac	update LangSmith notebook (#7767 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-15 21:05:09 -07:00
Kenny	0d058d4046	Add try except block to OpenAIWhisperParser (#7505 )	2023-07-15 15:42:00 -07:00
William FH	4cb9f1eda8	Update langsmith version (#7759 )	2023-07-15 12:01:41 -07:00
Lance Martin	1d06eee3b5	Fix ntbk link in docs (#7755 ) Minor fix to running to [docs](https://python.langchain.com/docs/use_cases/question_answering/local_retrieval_qa).	2023-07-15 09:11:18 -07:00
William FH	2e3d77c34e	Fix eval loader when overriding arguments (#7734 ) - Update the negative criterion descriptions to prevent bad predictions - Add support for normalizing the string distance - Fix potential json deserializing into float issues in the example mapper	2023-07-15 08:30:32 -07:00
Bagatur	c871c04270	bump 234 (#7754 )	2023-07-15 10:49:51 -04:00
Gordon Clark	96f3dff050	MediaWiki docloader improvements + unit tests (#5879 ) Starting over from #5654 because I utterly borked the poetry.lock file. Adds new paramerters for to the MWDumpLoader class: * skip_redirecst (bool) Tells the loader to skip articles that redirect to other articles. False by default. * stop_on_error (bool) Tells the parser to skip any page that causes a parse error. True by default. * namespaces (List[int]) Tells the parser which namespaces to parse. Contains namespaces from -2 to 15 by default. Default values are chosen to preserve backwards compatibility. Sample dump XML and full unit test coverage (with extended tests that pass!) also included! --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-15 10:49:36 -04:00
Xavier	4c8106311f	Add `pip install langsmith` for Quick Install part of README (#7694 ) Issue When I use conda to install langchain, a dependency error throwed - "ModuleNotFoundError: No module named 'langsmith'" Updated Run `pip install langsmith` when install langchain with conda Co-authored-by: xaver.xu <xavier.xu@batechworks.com>	2023-07-15 10:27:32 -04:00
Mohammad Mohtashim	b8b8a138df	Simple Import fix in Tools Exception Docs (#7740 ) Issue: #7720 @hinthornw	2023-07-15 10:25:34 -04:00
Nicolas	43f900fd38	docs: Mendable Search Improvements (#7744 ) - New pin-to-side (button). This functionality allows you to search the docs while asking the AI for questions - Fixed the search bar in Firefox that won't detect a mouse click - Fixes and improvements overall in the model's performance	2023-07-15 10:19:21 -04:00
rjarun8	b7c409152a	Document loader/debug (#7750 ) Description: Added debugging output in DirectoryLoader to identify the file being processed. Issue: [Need a trace or debug feature in Lanchain DirectoryLoader #7725](https://github.com/hwchase17/langchain/issues/7725) Dependencies: No additional dependencies are required. Tag maintainer: @rlancemartin, @eyurtsev This PR enhances the DirectoryLoader with debugging output to help diagnose issues when loading documents. This new feature does not add any dependencies and has been tested on a local machine.	2023-07-15 10:18:27 -04:00
Lance Martin	b015647e31	Add GPT4All embeddings (#7743 ) Support for [GPT4All embeddings](https://docs.gpt4all.io/gpt4all_python_embedding.html) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-15 10:04:29 -04:00
Chang Sau Sheong	b6a7f40ad3	added support for Google Images search (#7751 ) - Description: Added Google Image Search support for SerpAPIWrapper - Issue: NA - Dependencies: None - Tag maintainer: @hinthornw - Twitter handle: @sausheong --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-15 10:04:18 -04:00
Kacper Łukawski	1ff5b67025	Implement async API for Qdrant vector store (#7704 ) Inspired by #5550, I implemented full async API support in Qdrant. The docs were extended to mention the existence of asynchronous operations in Langchain. I also used that chance to restructure the tests of Qdrant and provided a suite of tests for the async version. Async API requires the GRPC protocol to be enabled. Thus, it doesn't work on local mode yet, but we're considering including the support to be consistent.	2023-07-15 09:33:26 -04:00
Bearnardd	275b926cf7	add missing import (#7730 ) Just a nit documentation fix @baskaryan	2023-07-14 20:03:23 -04:00
Bearnardd	9800c6051c	add support for truncate arg for HuggingFaceTextGenInference class (#7728 ) Fixes https://github.com/hwchase17/langchain/issues/7650 * add support for `truncate` argument of `HugginFaceTextGenInference` @baskaryan	2023-07-14 16:23:56 -04:00
Lorenzo	77e6bbe6f0	fix typo in deeplake.ipynb (#7718 ) - Fixing typos in deeplake documentation - @baskaryan	2023-07-14 13:38:31 -04:00
Samuel Berthe	2be3515a66	SQLDatabase: adding security disclamer (#7710 ) It might be obvious to most engineers, but I think everybody should be cautious when using such a chain. ![image](https://github.com/hwchase17/langchain/assets/2951285/a1df6567-9d56-4c12-98ea-767401ae2ac8)	2023-07-14 13:38:16 -04:00
William FH	fcf98dc4c1	Check for Tiktoken (#7705 )	2023-07-14 09:49:01 -07:00
Bagatur	bae93682f6	update docs (#7714 )	2023-07-14 11:49:09 -04:00
Bagatur	b065da6933	Bagatur/docs nit (#7712 )	2023-07-14 11:13:02 -04:00
Bagatur	87d81b6acc	Redirect old text splitter page (#7708 ) related to #7665	2023-07-14 11:12:18 -04:00
Aarav Borthakur	210296a71f	Integrate Rockset as a document loader (#7681 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> Integrate [Rockset](https://rockset.com/docs/) as a document loader. Issue: None Dependencies: Nothing new (rockset's dependency was already added [here](https://github.com/hwchase17/langchain/pull/6216)) Tag maintainer: @rlancemartin I have added a test for the integration and an example notebook showing its use. I ran `make lint` and everything looks good. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-14 07:58:13 -07:00
Bagatur	ad7d97670b	bump 233 (#7707 )	2023-07-14 10:38:13 -04:00
Samuel Berthe	7d4843fe84	feat(chains): adding ElasticsearchDatabaseChain for interacting with analytics database (#7686 ) This pull request adds a ElasticsearchDatabaseChain chain for interacting with analytics database, in the manner of the SQLDatabaseChain. Maintainer: @samber Twitter handler: samuelberthe --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-14 10:30:57 -04:00
Daniel	6d88b23ef7	Update pgembedding.ipynb (#7699 ) Update the extension name. It changed from pg_hnsw to pg_embedding. Thank you. I missed this in my previous commit.	2023-07-14 08:39:01 -04:00
Eric Speidel	663b0933e4	Allow passing auth objects in TextRequestsWrapper (#7701 ) - Description: This allows passing auth objects in request wrappers. Currently, we can handle auth by editing headers in the RequestsWrappers, but more complex auth methods, such as Kerberos, could be handled better by using existing functionality within the requests library. There are many authentication options supported both natively and by extensions, such as requests-kerberos or requests-ntlm. - Issue: Fixes #7542 - Dependencies: none Co-authored-by: eric.speidel@de.bosch.com <eric.speidel@de.bosch.com>	2023-07-14 08:38:24 -04:00
Nuno Campos	1e40427755	Enabled nesting chain group (#7697 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-14 10:03:16 +01:00
Leonid Kuligin	85e1c9b348	Added support for examples for VertexAI chat models. (#7636 ) #5278 Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-07-14 02:03:04 -04:00
Richy Wang	45bb414be2	Add LLM for Alibaba's Damo Academy's Tongyi Qwen API (#7477 ) - Add langchain.llms.Tonyi for text completion, in examples into the Tonyi Text API, - Add system tests. Note async completion for the Text API is not yet supported and will be included in a future PR. Dependencies: dashscope. It will be installed manually cause it is not need by everyone. Happy for feedback on any aspect of this PR @hwchase17 @baskaryan.	2023-07-14 01:58:22 -04:00
Lance Martin	6325a3517c	Make recursive loader yield while crawling (#7568 ) Support actual lazy_load since it can take a while to crawl larger directories.	2023-07-13 21:55:20 -07:00
UmerHA	82f3e32d8d	[Small upgrade] Allow document limit in AzureCognitiveSearchRetriever (#7690 ) Multiple people have asked in #5081 for a way to limit the documents returned from an AzureCognitiveSearchRetriever. This PR adds the `top_n` parameter to allow that. Twitter handle: [@UmerHAdil](twitter.com/umerHAdil)	2023-07-13 23:04:40 -04:00
AI-Chef	af6d333147	Fix same issue #7524 in FileCallbackHandler (#7687 ) Fix for Serializable class to include name, used in FileCallbackHandler as same issue #7524 Description: Fixes the Serializable class to include 'name' attribute (class_name) in the dict created, This is used in Callbacks, specifically the StdOutCallbackHandler, FileCallbackHandler. Issue: As described in issue #7524 Dependencies: None Tag maintainer: SInce this is related to the callback module, tagging @agola11 @idoru Comments: Glad to see issue #7524 fixed in pull #6124, but you forget to change the same place in FileCallbackHandler	2023-07-13 22:39:21 -04:00
Ben Perry	3874bb256e	Weaviate: Batch embed texts (#5903 ) When a custom Embeddings object is set, embed all given texts in a batch instead of passing them through individually. Any code calling add_texts can then appropriately size the chunks of texts that are passed through to take full advantage of the hardware it's running on.	2023-07-13 20:57:58 -04:00
Charles P	574698a5fb	Make so explicit class constructor is called in ElasticVectorSearch from_texts (#6199 ) Fixes #6198 ElasticKnnSearch.from_texts is actually ElasticVectorSearch.from_texts and throws because it calls ElasticKnnSearch constructor with the wrong arguments. Now ElasticKnnSearch has its own from_texts, which constructs a proper ElasticKnnSearch. --------- Co-authored-by: Charles Parker <charlesparker@FiltaMacbook.local> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 19:55:20 -04:00
Daniel	854f3fe9b1	Update pgembedding.ipynb (#7682 ) Correct links to the pg_embedding repository and the Neon documentation.	2023-07-13 19:54:07 -04:00
William FH	051fac1e66	Improve walkthrough links for sphinx (#7672 ) Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>	2023-07-13 16:08:31 -07:00
Bagatur	5db4dba526	add integrations hub link to docs (#7675 )	2023-07-13 18:44:10 -04:00
Kenton Parton	9124221d31	Fixed handling of absolute URLs in `RecursiveUrlLoader` (#7677 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Description This PR addresses a bug in the RecursiveUrlLoader class where absolute URLs were being treated as relative URLs, causing malformed URLs to be produced. The fix involves using the urljoin function from the urllib.parse module to correctly handle both absolute and relative URLs. @rlancemartin @eyurtsev --------- Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-13 15:34:00 -07:00
EllieRoseS	c087ce74f7	Added matching async load func to PlaywrightURLLoader (#5938 ) Fixes # (issue) The existing PlaywrightURLLoader load() function uses a synchronous browser which is not compatible with jupyter. This PR adds a sister function aload() which can be run insisde a notebook. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-13 17:51:38 -04:00
William FH	ae7714f1ba	Configure Tracer Workers (#7676 ) Mainline the tracer to avoid calling feedback before run is posted. Chose a bool over `max_workers` arg for configuring since we don't want to support > 1 for now anyway. At some point may want to manage the pool ourselves (ordering only really matters within a run and with parent runs)	2023-07-13 14:00:14 -07:00
Jasper	fbc97a77ed	add browserless loader (#7562 ) # Browserless Added support for Browserless' `/content` endpoint as a document loader. ### About Browserless Browserless is a cloud service that provides access to headless Chrome browsers via a REST API. It allows developers to automate Chromium in a serverless fashion without having to configure and maintain their own Chrome infrastructure. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-13 13:18:28 -07:00
mebstyne-msft	120c52589b	Enabled Azure Active Directory token-based auth access to OpenAI completions (#6313 ) With AzureOpenAI openai_api_type defaulted to "azure" the logic in utils' get_from_dict_or_env() function triggered by the root validator never looks to environment for the user's runtime openai_api_type values. This inhibits folks using token-based auth, or really any auth model other than "azure." By removing the "default" value, this allows environment variables to be pulled at runtime for the openai_api_type and thus enables the other api_types which are expected to work. --------- Co-authored-by: Ebo <mebstyne@microsoft.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-07-13 16:05:47 -04:00
frangin2003	c7b687e944	Simplify GraphQL Tool Initialization documentation by Removing 'llm' Argument (#7651 ) This PR is aimed at enhancing the clarity of the documentation in the langchain project. Description: In the graphql.ipynb file, I have removed the unnecessary 'llm' argument from the initialization process of the GraphQL tool (of type _EXTRA_OPTIONAL_TOOLS). The 'llm' argument is not required for this process. Its presence could potentially confuse users. This modification simplifies the understanding of tool initialization and minimizes potential confusion. Issue: Not applicable, as this is a documentation improvement. Dependencies: None. I kindly request a review from the following maintainer: @hinthornw, who is responsible for Agents / Tools / Toolkits. No new integration is being added in this PR, hence no need for a test or an example notebook. Please see the changes for more detail and let me know if any further modification is necessary.	2023-07-13 14:52:07 -04:00
William FH	aab2a7cd4b	Normalize Trajectory Eval Score (#7668 )	2023-07-13 09:58:28 -07:00
William FH	5f03cc3511	spelling nit (#7667 )	2023-07-13 09:12:57 -07:00
Bagatur	3dd0704e38	bump 232 (#7659 )	2023-07-13 10:32:39 -04:00
Tamas Molnar	24c1654208	Fix SQLAlchemy LLM cache clear (#7653 ) Fixes #7652 Description: This is a fix for clearing the cache for SQL Alchemy based LLM caches. The langchain.llm_cache.clear() did not take effect for SQLite cache. Reason: it didn't commit the deletion database change. See SQLAlchemy documentation for proper usage: https://docs.sqlalchemy.org/en/20/orm/session_basics.html#opening-and-closing-a-session https://docs.sqlalchemy.org/en/20/orm/session_basics.html#deleting @hwchase17 @baskaryan --------- Co-authored-by: Tamas Molnar <tamas.molnar@nagarro.com>	2023-07-13 09:39:04 -04:00
Bagatur	c17a80f11c	fix chroma updated upsert interface (#7643 ) new chroma release seems to not support empty dicts for metadata. related to #7633	2023-07-13 09:27:14 -04:00
William FH	a673a51efa	[Breaking] Update Evaluation Functionality (#7388 ) - Migrate from deprecated langchainplus_sdk to `langsmith` package - Update the `run_on_dataset()` API to use an eval config - Update a number of evaluators, as well as the loading logic - Update docstrings / reference docs - Update tracer to share single HTTP session	2023-07-13 02:13:06 -07:00
Sam Coward	224199083b	Fix missing chain classname in StdOutCallbackHandler.on_chain_start (#6124 ) Retrieves the name of the class from new location as of commit `18af149e91` Co-authored-by: Zander Chase <130414180+vowelparrot@users.noreply.github.com>	2023-07-13 03:05:36 -04:00
lucasiscovici	af3f401015	update base class of ListStepContainer to BaseStepContainer (#6232 ) update base class of ListStepContainer to BaseStepContainer Fixes #6231	2023-07-13 03:03:02 -04:00
Matt Adams	98e1bbfbbd	Add missing dependencies to apify.ipynb (#6331 ) Fixes errors caused by missing dependencies when running the notebook.	2023-07-13 03:02:23 -04:00
Ma Donghao	6f62e5461c	Update the parser regex of map_rerank (#6419 ) Sometimes the score responded by chatgpt would be like 'Respone example\nScore: 90 (fully answers the question, but could provide more detail on the specific error message)' For the score contains not only numbers, it raise a ValueError like Update the RegexParser from `.` to `\d` would help us to ignore the text after number. Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 03:01:42 -04:00
Bagatur	b08f903755	fix chroma init bug (#7639 )	2023-07-13 03:00:33 -04:00
Nir Gazit	f307ca094b	fix(memory): allow internal chains to use memory (#6769 ) Fixed #6768. This is a workaround only. I think a better longer-term solution is for chains to declare how many input variables they actually need (as opposed to ones that are in the prompt, where some may be satisfied by the memory). Then, a wrapping chain can check the input match against the actual input variables. @hwchase17	2023-07-13 02:47:44 -04:00
Francisco Ingham	488d2d5da9	Entity extraction improvements (#6342 ) Added fix to avoid irrelevant attributes being returned plus an example of extracting unrelated entities and an exampe of using an 'extra_info' attribute to extract unstructured data for an entity. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 02:16:05 -04:00
Nir Gazit	a8bbfb2da3	feat(agents): allow trimming of intermediate steps to last N (#6476 ) Added an option to trim intermediate steps to last N steps. This is especially useful for long-running agents. Users can explicitly specify N or provide a function that does custom trimming/manipulation on intermediate steps. I've mimicked the API of the `handle_parsing_errors` parameter.	2023-07-13 02:09:25 -04:00
Zeeland	92ef77da35	fix: remove useless variable k (#6524 ) remove useless variable k --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 01:58:36 -04:00
Bagatur	7f8ff2a317	add tagger nb (#7637 )	2023-07-13 01:48:23 -04:00
Sidchat95	c5e50c40c9	Fix Document Similarity Check with passed Threshold (#6845 ) Converting the Similarity obtained in the similarity_search_with_score_by_vector method whilst comparing to the passed threshold. This is because the passed threshold is a number between 0 to 1 and is already in the relevance_score_fn format. As of now, the function is comparing two different scoring parameters and that wouldn't work. Dependencies None Issue: Different scores being compared in similarity_search_with_score_by_vector method in FAISS. Tag maintainer @hwchase17 <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 01:30:47 -04:00
Jacob Ajit	a08baa97c5	Use modern OpenAI endpoints for embeddings (#6573 ) - Description: LangChain passes [engine](https://github.com/hwchase17/langchain/blob/master/langchain/embeddings/openai.py#L256) and not `model` as a field when making OpenAI requests. Within the `openai` Python library, for OpenAI requests, this [makes a call](https://github.com/openai/openai-python/blob/main/openai/api_resources/abstract/engine_api_resource.py#L58) to an endpoint of the form `https://api.openai.com/v1/engines/{engine_id}/embeddings`. These endpoints are [deprecated](https://help.openai.com/en/articles/6283125-what-happened-to-engines) in favor of endpoints of the format `https://api.openai.com/v1/embeddings`, where `model` is passed as a parameter in the request body. While these deprecated endpoints continue to function for now, they may not be supported indefinitely and should be avoided in favor of the newer API format. It appears that `engine` was passed in instead of `model` to make both Azure OpenAI and OpenAI calls work similarly. However, the inclusion of `engine` [causes](https://github.com/openai/openai-python/blob/main/openai/api_resources/abstract/engine_api_resource.py#L58) OpenAI to use the deprecated endpoint, requiring a diverging code path for Azure OpenAI calls where `engine` is passed in additionally (Azure OpenAI requires `engine` to specify a deployment, and can optionally take in `model`). In the long-term, it may be worth considering spinning off Azure OpenAI embeddings into a separate class for ease of use and maintenance, similar to the [implementation for chat models](https://github.com/hwchase17/langchain/blob/master/langchain/chat_models/azure_openai.py).	2023-07-13 01:23:17 -04:00
Jacob Lee	cdb93ab5ca	Adds OpenAI functions powered document metadata tagger (#7521 ) Adds a new document transformer that automatically extracts metadata for a document based on an input schema. I also moved `document_transformers.py` to `document_transformers/__init__.py` to group it with this new transformer - it didn't seem to cause issues in the notebook, but let me know if I've done something wrong there. Also had a linter issue I couldn't figure out: ``` MacBook-Pro:langchain jacoblee$ make lint poetry run mypy . docs/dist/conf.py: error: Duplicate module named "conf" (also at "./docs/api_reference/conf.py") docs/dist/conf.py: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#mapping-file-paths-to-modules for more info docs/dist/conf.py: note: Common resolutions include: a) using `--exclude` to avoid checking one of them, b) adding `__init__.py` somewhere, c) using `--explicit-package-bases` or adjusting MYPYPATH Found 1 error in 1 file (errors prevented further checking) make: *** [lint] Error 2 ``` @rlancemartin @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 01:12:41 -04:00
Jason Fan	8effd90be0	Add new types of document transformers (#7379 ) - Description: Add two new document transformers that translates documents into different languages and converts documents into q&a format to improve vector search results. Uses OpenAI function calling via the [doctran](https://github.com/psychic-api/doctran/tree/main) library. - Issue: N/A - Dependencies: `doctran = "^0.0.5"` - Tag maintainer: @rlancemartin @eyurtsev @hwchase17 - Twitter handle: @psychicapi or @jfan001 Notes - Adheres to the `DocumentTransformer` abstraction set by @dev2049 in #3182 - refactored `EmbeddingsRedundantFilter` to put it in a file under a new `document_transformers` module - Added basic docs for `DocumentInterrogator`, `DocumentTransformer` as well as the existing `EmbeddingsRedundantFilter` --------- Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-12 23:53:30 -04:00
Piyush Jain	f11d845dee	Fixed validation error when credentials_profile_name, or region_name is not passed (#7629 ) ## Summary This PR corrects the checks for credentials_profile_name, and region_name attributes. This was causing validation exceptions when either of these values were missing during creation of the retriever class. Fixes #7571 #### Requested reviewers: @baskaryan	2023-07-12 23:47:35 -04:00
Jamie Broomall	0e1d7a27c6	WhyLabsCallbackHandler updates (#7621 ) Updates to the WhyLabsCallbackHandler and example notebook - Update dependency to langkit 0.0.6 which defines new helper methods for callback integrations - Update WhyLabsCallbackHandler to use the new `get_callback_instance` so that the callback is mostly defined in langkit - Remove much of the implementation of the WhyLabsCallbackHandler here in favor of the callback instance This does not change the behavior of the whylabs callback handler implementation but is a reorganization that moves some of the implementation externally to our optional dependency package, and should make future updates easier. @agola11	2023-07-12 23:46:56 -04:00
Gaurang Pawar	53722dcfdc	Fixed a typo in pinecone_hybrid_search.ipynb (#7627 ) Fixed a small typo in documentation	2023-07-12 23:46:41 -04:00
Bagatur	1d4db1327a	fix openai structured chain with pydantic (#7622 ) should return pydantic class	2023-07-12 23:46:13 -04:00
Bagatur	ee70d4a0cd	mv tutorials (#7614 )	2023-07-12 17:33:36 -04:00
William FH	9b215e761e	Stop warning when parent run ID not present (#7611 )	2023-07-12 14:04:32 -07:00
William FH	2f848294cb	Rm Warning that Tracing is Experimental (#7612 )	2023-07-12 14:04:28 -07:00
Yaohui Wang	d85c33a5c3	Fix the markdown rendering issue with a code block inside a markdown code block (#6625 ) ### Description - Fix the markdown rendering issue with a code block inside a markdown, using a different number of backticks for the delimiters. Current doc site: <https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/code_splitter#markdown> After fix: <img width="480" alt="image" src="https://github.com/hwchase17/langchain/assets/3115235/d9921d59-64e6-4a34-9c62-79743667f528"> ### Who can review PTAL @dev2049 Co-authored-by: Yaohui Wang <wangyaohui.01@bytedance.com>	2023-07-12 16:29:25 -04:00
Yaroslav Halchenko	0d92a7f357	codespell: workflow, config + some (quite a few) typos fixed (#6785 ) Probably the most boring PR to review ;) Individual commits might be easier to digest --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2023-07-12 16:20:08 -04:00
Sam	931e68692e	Adds a chain around sympy for symbolic math (#6834 ) - Description: Adds a new chain that acts as a wrapper around Sympy to give LLMs the ability to do some symbolic math. - Dependencies: SymPy --------- Co-authored-by: sreiswig <sreiswig@github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-12 15:17:32 -04:00
Bharat Ramanathan	be29a6287d	feat: add model architecture back to wandb tracer (#6806 ) # Description This PR adds model architecture to the `WandbTracer` from the Serialized Run kwargs. This allows visualization of the calling parameters of an Agent, LLM and Tool in Weights & Biases. 1. Safely serialize the run objects to WBTraceTree model_dict 2. Refactors the run processing logic to be more organized. - Twitter handle: @parambharat --------- Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-12 15:00:18 -04:00
Alex Iribarren	adc96d60b6	Implement Function Callback tracer (#6835 ) Description: I wanted to be able to redirect debug output to a function, but it wasn't very easy. I figured it would make sense to implement a `FunctionCallbackHandler`, and reimplement `ConsoleCallbackHandler` as a subclass that calls the `print` function. Now I can create a simple subclass in my project that calls `logging.info` or whatever I need. Tag maintainer: @agola11 Twitter handle: `@andandaraalex`	2023-07-12 14:38:41 -04:00
Ducasse-Arthur	93a84f6182	Update bedrock.py - support of other endpoint url (esp. for users of … (#7592 ) Added an _endpoint_url_ attribute to Bedrock(LLM) class - I have access to Bedrock only via us-west-2 endpoint and needed to change the endpoint url, this could be useful to other users	2023-07-12 10:43:23 -04:00
Bagatur	22525bad65	bump 231 (#7584 )	2023-07-12 10:43:12 -04:00
Subsegment	6e1000dc8d	docs : Use more meaningful cnosdb examples (#7587 ) This change makes the ecosystem integrations cnosdb documentation more realistic and easy to understand. - change examples of question and table - modify typo and format	2023-07-12 10:31:55 -04:00
Samuel ROZE	f3c9bf5e4b	fix(typo): Clarify the point of `llm_chain` (#7593 ) Fixes a typo introduced in https://github.com/hwchase17/langchain/pull/7080 by @hwchase17. In the example (visible on [the online documentation](https://api.python.langchain.com/en/latest/chains/langchain.chains.conversational_retrieval.base.ConversationalRetrievalChain.html#langchain-chains-conversational-retrieval-base-conversationalretrievalchain)), the `llm_chain` variable is unused as opposed to being used for the question generator. This change makes it clearer.	2023-07-12 10:31:00 -04:00
Alec Flett	6cdd4b5edc	only add handlers if they are new (#7504 ) When using callbacks, there are times when callbacks can be added redundantly: for instance sometimes you might need to create an llm with specific callbacks, but then also create and agent that uses a chain that has those callbacks already set. This means that "callbacks" might get passed down again to the llm at predict() time, resulting in duplicate calls to the `on_llm_start` callback. For the sake of simplicity, I made it so that langchain never adds an exact handler/callbacks object in `add_handler`, thus avoiding the duplicate handler issue. Tagging @hwchase17 for callback review --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-12 03:48:29 -04:00
ausboss	50316f6477	Adding LLM wrapper for Kobold AI (#7560 ) - Description: add wrapper that lets you use KoboldAI api in langchain - Issue: n/a - Dependencies: none extra, just what exists in lanchain - Tag maintainer: @baskaryan - Twitter handle: @zanzibased --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-12 03:48:12 -04:00
Rohit Kumar Singh	603a0bea29	Fixes incorrect docstore creation in faiss.py (#7026 ) - Description: Current implementation assumes that the length of `texts` and `ids` should be same but if the passed `ids` length is not equal to the passed length of `texts`, current code `dict(zip(index_to_id.values(), documents))` is not failing or giving any warning and silently creating docstores only for the passed `ids` i.e. if `ids = ['A']` and `texts=["I love Open Source","I love langchain"]` then only one `docstore` will be created. But either two docstores should be created assuming same id value for all the elements of `texts` or an error should be raised. - Issue: My change fixes this by using dictionary comprehension instead of `zip`. This was if lengths of `ids` and `texts` mismatches an explicit `IndexError` will be raised. @rlancemartin, @eyurtsev	2023-07-12 03:35:49 -04:00
Tommy Hyeonwoo Kim	3f7213586e	add supported properties for notiondb document loader's metadata (#7570 ) fix #7569 add following properties for Notion DB document loader's metadata - `unique_id` - `status` - `people` @rlancemartin, @eyurtsev (Since this is a change related to `DataLoaders`) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-12 03:34:54 -04:00
Junlin Zhou	5f17c57174	Update chat agents' output parser to extract action by regex (#7511 ) Currently `ChatOutputParser` extracts actions by splitting the text on "```", and then load the second part as a json string. But sometimes the LLM will wrap the action in markdown code block like: ````markdown ```json { "action": "foo", "action_input": "bar" } ``` ```` Splitting text on "```" will cause `OutputParserException` in such case. This PR changes the behaviour to extract the `$JSON_BLOB` by regex, so that it can handle both ` ``` ``` ` and ` ```json ``` ` @hinthornw --------- Co-authored-by: Junlin Zhou <jlzhou@zjuici.com>	2023-07-12 03:12:02 -04:00
Bagatur	ebcb144342	unit test sqlalachemy (#7582 )	2023-07-12 03:03:16 -04:00
Harrison Chase	641fd74baa	Harrison/pg vector move (#7580 )	2023-07-12 02:22:34 -04:00
os1ma	2667ddc686	Fix `make docs_build` and related scripts (#7276 ) Description: a description of the change Fixed `make docs_build` and related scripts which caused errors. There are several changes. First, I made the build of the documentation and the API Reference into two separate commands. This is because it takes less time to build. The commands for documents are `make docs_build`, `make docs_clean`, and `make docs_linkcheck`. The commands for API Reference are `make api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`. It looked like `docs/.local_build.sh` could be used to build the documentation, so I used that. Since `.local_build.sh` was also building API Rerefence internally, I removed that process. `.local_build.sh` also added some Bash options to stop in error or so. Futher more added `cd "${SCRIPT_DIR}"` at the beginning so that the script will work no matter which directory it is executed in. `docs/api_reference/api_reference.rst` is removed, because which is generated by `docs/api_reference/create_api_rst.py`, and added it to .gitignore. Finally, the description of CONTRIBUTING.md was modified. Issue: the issue # it fixes (if applicable) https://github.com/hwchase17/langchain/issues/6413 Dependencies: any dependencies required for this change `nbdoc` was missing in group docs so it was added. I installed it with the `poetry add --group docs nbdoc` command. I am concerned if any modifications are needed to poetry.lock. I would greatly appreciate it if you could pay close attention to this file during the review. Tag maintainer - General / Misc / if you don't know who to tag: @baskaryan If this PR needs any additional changes, I'll be happy to make them! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 22:05:14 -04:00
Pharbie	74c28df363	Update Pinecone Upsert method usage (#7358 ) Description: Refactor the upsert method in the Pinecone class to allow for additional keyword arguments. This change adds flexibility and extensibility to the method, allowing for future modifications or enhancements. The upsert method now accepts the `**kwargs` parameter, which can be used to pass any additional arguments to the Pinecone index. This change has been made in both the `upsert` method in the `Pinecone` class and the `upsert` method in the `similarity_search_with_score` class method. Falls in line with the usage of the upsert method in [Pinecone-Python-Client](`4640c4cf27/pinecone/index.py (L73)`) Issue: [This feature request in Pinecone Repo](https://github.com/pinecone-io/pinecone-python-client/issues/184) Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - Memory: @hwchase17 --------- Co-authored-by: kwesi <22204443+yankskwesi@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>	2023-07-11 21:14:42 -04:00
Kazuki Maeda	5c3fe8b0d1	Enhance Makefile with 'format_diff' Option and Improved Readability (#7394 ) ### Description: This PR introduces a new option format_diff to the existing Makefile. This option allows us to apply the formatting tools (Black and isort) only to the changed Python and ipynb files since the last commit. This will make our development process more efficient as we only format the codes that we modify. Along with this change, comments were added to make the Makefile more understandable and maintainable. ### Issue: N/A ### Dependencies: Add dependency to black. ### Tag maintainer: @baskaryan ### Twitter handle: [kzk_maeda](https://twitter.com/kzk_maeda) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 21:03:17 -04:00
Bagatur	2babe3069f	Revert pinecone v4 support (#7566 ) Revert `9d13dcd`	2023-07-11 20:58:59 -04:00
schop-rob	e811c5e8c6	Add OpenAI organization ID to docs (#7398 ) Description: I added an example of how to reference the OpenAI API Organization ID, because I couldn't find it before. In the example, it is mentioned how to achieve this using environment variables as well as parameters for the OpenAI()-class Issue: - Dependencies: - Twitter @schop-rob	2023-07-11 20:51:58 -04:00
Kenny	8741e55e7c	Template formats documentation (#7404 ) Simple addition to the documentation, adding the correct import statement & showcasing using Python FStrings.	2023-07-11 18:24:24 -04:00
Fielding Johnston	00c466627a	minor bug fix: properly await AsyncRunManager's method call in MulitRouteChain (#7487 ) This simply awaits `AsyncRunManager`'s method call in `MulitRouteChain`. Noticed this while playing around with Langchain's implementation of `MultiPromptChain`. @baskaryan cheers --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 18:18:47 -04:00
tonomura	cc0585af42	Improvement/add finish reason to generation info in chat open ai (#7478 ) Description: ChatOpenAI model does not return finish_reason in generation_info. Issue: #2702 Dependencies: None Tag maintainer: @baskaryan Thank you --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 18:12:57 -04:00
Junlin Zhou	b96ac13f3d	Minor update to reference other sql tool by tool names instead of hard coded string. (#7514 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> Currently there are 4 tools in SQL agent-toolkits, and 2 of them have reference to the other 2. This PR change the reference from hard coded string to `{tool.name}` Co-authored-by: Junlin Zhou <jlzhou@zjuici.com>	2023-07-11 17:44:23 -04:00
OwenElliott	9cb2347453	Fix broken link from Marqo Ecosystem (#7510 ) Small fix to a link from the Marqo page in the ecosystem. The link was not updated correctly when the documentation structure changed to html pages instead of links to notebooks.	2023-07-11 17:15:15 -04:00
Matt Robinson	c4d53f98dc	docs: update unstructured docstrings (#7561 ) ### Summary Updates the docstrings in the Unstructured document loaders to display more useful information on the integrations page.	2023-07-11 17:12:05 -04:00
Ben Auffarth	2c2f0e15a6	clarify about api key (#7540 ) I found it unclear, where to get the API keys for JinaChat. Mentioning this in the docstring should be helpful. #7490 Twitter handle: benji1a @delgermurun	2023-07-11 16:46:06 -04:00
Jona Sassenhagen	0ea7224535	[Minor] Remove tagger from spacy sentencizer (#7534 ) @svlandeg gave me a tip for how to improve a bit on https://github.com/hwchase17/langchain/pull/7442 for some extra speed and memory gains. The tagger isn't needed for sentencization, so can be disabled too.	2023-07-11 16:43:46 -04:00
Kacper Łukawski	1f83b5f47e	Reuse the existing collection if configured properly in Qdrant.from_texts (#7530 ) This PR changes the behavior of `Qdrant.from_texts` so the collection is reused if not requested to recreate it. Previously, calling `Qdrant.from_texts` or `Qdrant.from_documents` resulted in removing the old data which was confusing for many.	2023-07-11 16:24:35 -04:00
Leonid Kuligin	6674b33cf5	Added support for chat_history (#7555 ) #7469 Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-07-11 15:27:26 -04:00
Felix Brockmeier	406a9dc11f	Add notebook example for Lemon AI NLP Workflow Automation (#7556 ) - Description: Added notebook to LangChain docs that explains how to use Lemon AI NLP Workflow Automation tool with Langchain - Issue: not applicable - Dependencies: not applicable - Tag maintainer: @agola11 - Twitter handle: felixbrockm	2023-07-11 15:15:11 -04:00
Lance Martin	9e067b8cc9	Add env setup (#7550 ) Include setup	2023-07-11 09:48:40 -07:00
Bagatur	3c4338470e	bump 230 (#7544 )	2023-07-11 11:24:08 -04:00
Bagatur	d2137eea9f	fix cpal docs (#7545 )	2023-07-11 11:07:45 -04:00
Boris	9129318466	CPAL (#6255 ) # Causal program-aided language (CPAL) chain ## Motivation This builds on the recent [PAL](https://arxiv.org/abs/2211.10435) to stop LLM hallucination. The problem with the [PAL](https://arxiv.org/abs/2211.10435) approach is that it hallucinates on a math problem with a nested chain of dependence. The innovation here is that this new CPAL approach includes causal structure to fix hallucination. For example, using the below word problem, PAL answers with 5, and CPAL answers with 13. "Tim buys the same number of pets as Cindy and Boris." "Cindy buys the same number of pets as Bill plus Bob." "Boris buys the same number of pets as Ben plus Beth." "Bill buys the same number of pets as Obama." "Bob buys the same number of pets as Obama." "Ben buys the same number of pets as Obama." "Beth buys the same number of pets as Obama." "If Obama buys one pet, how many pets total does everyone buy?" The CPAL chain represents the causal structure of the above narrative as a causal graph or DAG, which it can also plot, as shown below. ![complex-graph](https://github.com/hwchase17/langchain/assets/367522/d938db15-f941-493d-8605-536ad530f576) . The two major sections below are: 1. Technical overview 2. Future application Also see [this jupyter notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb) doc. ## 1. Technical overview ### CPAL versus PAL Like [PAL](https://arxiv.org/abs/2211.10435), CPAL intends to reduce large language model (LLM) hallucination. The CPAL chain is different from the PAL chain for a couple of reasons. * CPAL adds a causal structure (or DAG) to link entity actions (or math expressions). * The CPAL math expressions are modeling a chain of cause and effect relations, which can be intervened upon, whereas for the PAL chain math expressions are projected math identities. PAL's generated python code is wrong. It hallucinates when complexity increases. ```python def solution(): """Tim buys the same number of pets as Cindy and Boris.Cindy buys the same number of pets as Bill plus Bob.Boris buys the same number of pets as Ben plus Beth.Bill buys the same number of pets as Obama.Bob buys the same number of pets as Obama.Ben buys the same number of pets as Obama.Beth buys the same number of pets as Obama.If Obama buys one pet, how many pets total does everyone buy?""" obama_pets = 1 tim_pets = obama_pets cindy_pets = obama_pets + obama_pets boris_pets = obama_pets + obama_pets total_pets = tim_pets + cindy_pets + boris_pets result = total_pets return result # math result is 5 ``` CPAL's generated python code is correct. ```python story outcome data name code value depends_on 0 obama pass 1.0 [] 1 bill bill.value = obama.value 1.0 [obama] 2 bob bob.value = obama.value 1.0 [obama] 3 ben ben.value = obama.value 1.0 [obama] 4 beth beth.value = obama.value 1.0 [obama] 5 cindy cindy.value = bill.value + bob.value 2.0 [bill, bob] 6 boris boris.value = ben.value + beth.value 2.0 [ben, beth] 7 tim tim.value = cindy.value + boris.value 4.0 [cindy, boris] query data { "question": "how many pets total does everyone buy?", "expression": "SELECT SUM(value) FROM df", "llm_error_msg": "" } # query result is 13 ``` Based on the comments below, CPAL's intended location in the library is `experimental/chains/cpal` and PAL's location is`chains/pal`. ### CPAL vs Graph QA Both the CPAL chain and the Graph QA chain extract entity-action-entity relations into a DAG. The CPAL chain is different from the Graph QA chain for a few reasons. * Graph QA does not connect entities to math expressions * Graph QA does not associate actions in a sequence of dependence. * Graph QA does not decompose the narrative into these three parts: 1. Story plot or causal model 4. Hypothetical question 5. Hypothetical condition ### Evaluation Preliminary evaluation on simple math word problems shows that this CPAL chain generates less hallucination than the PAL chain on answering questions about a causal narrative. Two examples are in [this jupyter notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb) doc. ## 2. Future application ### "Describe as Narrative, Test as Code" The thesis here is that the Describe as Narrative, Test as Code approach allows you to represent a causal mental model both as code and as a narrative, giving you the best of both worlds. #### Why describe a causal mental mode as a narrative? The narrative form is quick. At a consensus building meeting, people use narratives to persuade others of their causal mental model, aka. plan. You can share, version control and index a narrative. #### Why test a causal mental model as a code? Code is testable, complex narratives are not. Though fast, narratives are problematic as their complexity increases. The problem is LLMs and humans are prone to hallucination when predicting the outcomes of a narrative. The cost of building a consensus around the validity of a narrative outcome grows as its narrative complexity increases. Code does not require tribal knowledge or social power to validate. Code is composable, complex narratives are not. The answer of one CPAL chain can be the hypothetical conditions of another CPAL Chain. For stochastic simulations, a composable plan can be integrated with the [DoWhy library](https://github.com/py-why/dowhy). Lastly, for the futuristic folk, a composable plan as code allows ordinary community folk to design a plan that can be integrated with a blockchain for funding. An explanation of a dependency planning application is [here.](https://github.com/borisdev/cpal-llm-chain-demo) --- Twitter handle: @boris_dev --------- Co-authored-by: Boris Dev <borisdev@Boriss-MacBook-Air.local>	2023-07-11 10:11:21 -04:00
Alejandra De Luna	2e4047e5e7	feat: support generate as an early stopping method for `OpenAIFunctionsAgent` (#7229 ) This PR proposes an implementation to support `generate` as an `early_stopping_method` for the new `OpenAIFunctionsAgent` class. The motivation behind is to facilitate the user to set a maximum number of actions the agent can take with `max_iterations` and force a final response with this new agent (as with the `Agent` class). The following changes were made: - The `OpenAIFunctionsAgent.return_stopped_response` method was overwritten to support `generate` as an `early_stopping_method` - A boolean `with_functions` parameter was added to the `OpenAIFunctionsAgent.plan` method This way the `OpenAIFunctionsAgent.return_stopped_response` method can call the `OpenAIFunctionsAgent.plan` method with `with_function=False` when the `early_stopping_method` is set to `generate`, making a call to the LLM with no functions and forcing a final response from the `"assistant"`. - Relevant maintainer: @hinthornw - Twitter handle: @aledelunap --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 09:25:02 -04:00
Hashem Alsaket	1dd4236177	Fix HF endpoint returns blank for text-generation (#7386 ) Description: Current `_call` function in the `langchain.llms.HuggingFaceEndpoint` class truncates response when `task=text-generation`. Same error discussed a few days ago on Hugging Face: https://huggingface.co/tiiuae/falcon-40b-instruct/discussions/51 Issue: Fixes #7353 Tag maintainer: @hwchase17 @baskaryan @hinthornw --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 03:06:05 -04:00
Lance Martin	4a94f56258	Minor edits to QA docs (#7507 ) Small clean-ups	2023-07-10 22:15:05 -07:00
Raymond Yuan	5171c3bcca	Refactor vector storage to correctly handle relevancy scores (#6570 ) Description: This pull request aims to support generating the correct generic relevancy scores for different vector stores by refactoring the relevance score functions and their selection in the base class and subclasses of VectorStore. This is especially relevant with VectorStores that require a distance metric upon initialization. Note many of the current implenetations of `_similarity_search_with_relevance_scores` are not technically correct, as they just return `self.similarity_search_with_score(query, k, **kwargs)` without applying the relevant score function Also includes changes associated with: https://github.com/hwchase17/langchain/pull/6564 and https://github.com/hwchase17/langchain/pull/6494 See more indepth discussion in thread in #6494 Issue: https://github.com/hwchase17/langchain/issues/6526 https://github.com/hwchase17/langchain/issues/6481 https://github.com/hwchase17/langchain/issues/6346 Dependencies: None The changes include: - Properly handling score thresholding in FAISS `similarity_search_with_score_by_vector` for the corresponding distance metric. - Refactoring the `_similarity_search_with_relevance_scores` method in the base class and removing it from the subclasses for incorrectly implemented subclasses. - Adding a `_select_relevance_score_fn` method in the base class and implementing it in the subclasses to select the appropriate relevance score function based on the distance strategy. - Updating the `__init__` methods of the subclasses to set the `relevance_score_fn` attribute. - Removing the `_default_relevance_score_fn` function from the FAISS class and using the base class's `_euclidean_relevance_score_fn` instead. - Adding the `DistanceStrategy` enum to the `utils.py` file and updating the imports in the vector store classes. - Updating the tests to import the `DistanceStrategy` enum from the `utils.py` file. --------- Co-authored-by: Hanit <37485638+hanit-com@users.noreply.github.com>	2023-07-10 20:37:03 -07:00
Lance Martin	bd0c6381f5	Minor update to clarify map-reduce custom prompt usage (#7453 ) Update docs for map-reduce custom prompt usage	2023-07-10 16:43:44 -07:00
Lance Martin	28d2b213a4	Update landing page for "question answering over documents" (#7152 ) Improve documentation for a central use-case, qa / chat over documents. This will be merged as an update to `index.mdx` [here](https://python.langchain.com/docs/use_cases/question_answering/). Testing w/ local Docusaurus server: ``` From `docs` directory: mkdir _dist cp -r {docs_skeleton,snippets} _dist cp -r extras/* _dist/docs_skeleton/docs cd _dist/docs_skeleton yarn install yarn start ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-10 14:15:13 -07:00
William FH	dd648183fa	Rm create_project line (#7486 ) not needed	2023-07-10 10:49:55 -07:00
Leonid Ganeline	5eec74d9a5	docstrings `document_loaders` 3 (#6937 ) - Updated docstrings for `document_loaders` - Mass update `"""Loader that loads` to `"""Loads` @baskaryan - please, review	2023-07-10 08:56:53 -07:00
Stanko Kuveljic	9d13dcd17c	Pinecone: Add V4 support (#7473 )	2023-07-10 08:39:47 -07:00
Adilkhan Sarsen	5debd5043e	Added deeplake use case examples of the new features (#6528 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> 1. Added use cases of the new features 2. Done some code refactoring --------- Co-authored-by: Ivo Stranic <istranic@gmail.com>	2023-07-10 07:04:29 -07:00
Bagatur	9b615022e2	bump 229 (#7467 )	2023-07-10 04:38:55 -04:00
Kazuki Maeda	92b4418c8c	Datadog logs loader (#7356 ) ### Description Created a Loader to get a list of specific logs from Datadog Logs. ### Dependencies `datadog_api_client` is required. ### Twitter handle [kzk_maeda](https://twitter.com/kzk_maeda) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-10 04:27:55 -04:00
Yifei Song	7d29bb2c02	Add Xorbits Dataframe as a Document Loader (#7319 ) - [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source computing framework that makes it easy to scale data science and machine learning workloads in parallel. Xorbits can leverage multi cores or GPUs to accelerate computation on a single machine, or scale out up to thousands of machines to support processing terabytes of data. - This PR added support for the Xorbits document loader, which allows langchain to leverage Xorbits to parallelize and distribute the loading of data. - Dependencies: This change requires the Xorbits library to be installed in order to be used. `pip install xorbits` - Request for review: @rlancemartin, @eyurtsev - Twitter handle: https://twitter.com/Xorbitsio Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-10 04:24:47 -04:00
Sergio Moreno	21a353e9c2	feat: ctransformers support async chain (#6859 ) - Description: Adding async method for CTransformers - Issue: I've found impossible without this code to run Websockets inside a FastAPI micro service and a CTransformers model. - Tag maintainer: Not necessary yet, I don't like to mention directly - Twitter handle: @_semoal	2023-07-10 04:23:41 -04:00
Paul-Emile Brotons	d2cf0d16b3	adding max_marginal_relevance_search method to MongoDBAtlasVectorSearch (#7310 ) Adding a maximal_marginal_relevance method to the MongoDBAtlasVectorSearch vectorstore enhances the user experience by providing more diverse search results Issue: #7304	2023-07-10 04:04:19 -04:00
Bagatur	04cddfba0d	Add lark import error (#7465 )	2023-07-10 03:21:23 -04:00
Matt Robinson	bcab894f4e	feat: Add `UnstructuredTSVLoader` (#7367 ) ### Summary Adds an `UnstructuredTSVLoader` for TSV files. Also updates the doc strings for `UnstructuredCSV` and `UnstructuredExcel` loaders. ### Testing ```python from langchain.document_loaders.tsv import UnstructuredTSVLoader loader = UnstructuredTSVLoader( file_path="example_data/mlb_teams_2012.csv", mode="elements" ) docs = loader.load() ```	2023-07-10 03:07:10 -04:00
Ronald Li	490f4a9ff0	Fixes KeyError in AmazonKendraRetriever initializer (#7464 ) ### Description argument variable client is marked as required in commit `81e5b1ad36` which breaks the default way of initialization providing only index_id. This commit avoid KeyError exception when it is initialized without a client variable ### Dependencies no dependency required	2023-07-10 03:02:36 -04:00
Jona Sassenhagen	7ffc431b3a	Add spacy sentencizer (#7442 ) `SpacyTextSplitter` currently uses spacy's statistics-based `en_core_web_sm` model for sentence splitting. This is a good splitter, but it's also pretty slow, and in this case it's doing a lot of work that's not needed given that the spacy parse is then just thrown away. However, there is also a simple rules-based spacy sentencizer. Using this is at least an order of magnitude faster than using `en_core_web_sm` according to my local tests. Also, spacy sentence tokenization based on `en_core_web_sm` can be sped up in this case by not doing the NER stage. This shaves some cycles too, both when loading the model and when parsing the text. Consequently, this PR adds the option to use the basic spacy sentencizer, and it disables the NER stage for the current approach, which is kept as the default. Lastly, when extracting the tokenized sentences, the `text` attribute is called directly instead of doing the string conversion, which is IMO a bit more idiomatic.	2023-07-10 02:52:05 -04:00
charosen	50a9fcccb0	feat(module): add param ids to ElasticVectorSearch.from_texts method (#7425 ) # add param ids to ElasticVectorSearch.from_texts method. - Description: add param ids to ElasticVectorSearch.from_texts method. - Issue: NA. It seems `add_texts` already supports passing in document ids, but param `ids` is omitted in `from_texts` classmethod, - Dependencies: None, - Tag maintainer: @rlancemartin, @eyurtsev please have a look, thanks ``` # ElasticVectorSearch add_texts def add_texts( self, texts: Iterable[str], metadatas: Optional[List[dict]] = None, refresh_indices: bool = True, ids: Optional[List[str]] = None, kwargs: Any, ) -> List[str]: ... ``` ``` # ElasticVectorSearch from_texts @classmethod def from_texts( cls, texts: List[str], embedding: Embeddings, metadatas: Optional[List[dict]] = None, elasticsearch_url: Optional[str] = None, index_name: Optional[str] = None, refresh_indices: bool = True, kwargs: Any, ) -> ElasticVectorSearch: ``` Co-authored-by: charosen <charosen@bupt.cn>	2023-07-10 02:25:35 -04:00
James Yin	a5fd8873b1	fix: type hint of get_chat_history in BaseConversationalRetrievalChain (#7461 ) The type hint of `get_chat_history` property in `BaseConversationalRetrievalChain` is incorrect. @baskaryan	2023-07-10 02:14:00 -04:00
nikkie	dfc3f83b0f	docs(vectorstores/integrations/chroma): Fix loading and saving (#7437 ) - Description: Fix loading and saving code about Chroma - Issue: the issue #7436 - Dependencies: - - Twitter handle: https://twitter.com/ftnext	2023-07-10 02:05:15 -04:00
Daniel Chalef	c7f7788d0b	Add ZepMemory; improve ZepChatMessageHistory handling of metadata; Fix bugs (#7444 ) Hey @hwchase17 - This PR adds a `ZepMemory` class, improves handling of Zep's message metadata, and makes it easier for folks building custom chains to persist metadata alongside their chat history. We've had plenty confused users unfamiliar with ChatMessageHistory classes and how to wrap the `ZepChatMessageHistory` in a `ConversationBufferMemory`. So we've created the `ZepMemory` class as a light wrapper for `ZepChatMessageHistory`. Details: - add ZepMemory, modify notebook to demo use of ZepMemory - Modify summary to be SystemMessage - add metadata argument to add_message; add Zep metadata to Message.additional_kwargs - support passing in metadata	2023-07-10 01:53:49 -04:00
Saurabh Chaturvedi	8f8e8d701e	Fix info about YouTube (#7447 ) (Unintentionally mean 😅) nit: YouTube wasn't created by Google, this PR fixes the mention in docs.	2023-07-10 01:52:55 -04:00
Leonid Ganeline	560c4dfc98	docstrings: `docstore` and `client` (#6783 ) updated docstrings in `docstore/` and `client/` @baskaryan	2023-07-09 01:34:28 -04:00
Jeroen Van Goey	f5bd88757e	Fix typo (#7416 ) `quesitons` -> `questions`.	2023-07-09 00:54:48 -04:00
Alejandro Garrido Mota	ea9c3cc9c9	Fix syntax erros in documentation (#7409 ) - Description: Tiny documentation fix. In Python, when defining function parameters or providing arguments to a function or class constructor, we do not use the `:` character. - Issue: N/A - Dependencies: N/A, - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @mogaal	2023-07-08 19:52:01 -04:00
Nolan	5da9f9abcb	docs(agents/toolkits): Fix error in document_comparison_toolkit.ipynb (#7417 ) Replace this comment with: - Description: Removes unneeded output warning in documentation at https://python.langchain.com/docs/modules/agents/toolkits/document_comparison_toolkit - Issue: - - Dependencies: - - Tag maintainer: @baskaryan - Twitter handle: @finnless	2023-07-08 19:51:08 -04:00
nikkie	2eb4a2ceea	docs(retrievers/get-started): Fix broken state_of_the_union.txt link (#7399 ) Thank you for this awesome library. - Description: Fix broken link in documentation - Issue: - https://python.langchain.com/docs/modules/data_connection/retrievers/#get-started - the URL: https://github.com/hwchase17/langchain/blob/master/docs/modules/state_of_the_union.txt - I think the right one is https://github.com/hwchase17/langchain/blob/master/docs/extras/modules/state_of_the_union.txt - Dependencies: - - Tag maintainer: @baskaryan - Twitter handle: -	2023-07-08 11:11:05 -04:00
Delgermurun	e7420789e4	improve description of JinaChat (#7397 ) very small doc string change in the `JinaChat` class.	2023-07-08 10:57:11 -04:00
Bagatur	26c86a197c	bump 228 (#7393 )	2023-07-08 03:05:20 -04:00
SvMax	1d649b127e	Added param to return only a structured json from the get_format_instructions method (#5848 ) I just added a parameter to the method get_format_instructions, to return directly the JSON instructions without the leading instruction sentence. I'm planning to use it to define the structure of a JSON object passed in input, the get_format_instructions(). --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-08 02:57:26 -04:00
Bagatur	362bc301df	fix jina (#7392 )	2023-07-08 02:41:54 -04:00
Delgermurun	a1603fccfb	integrate JinaChat (#6927 ) Integration with https://chat.jina.ai/api. It is OpenAI compatible API. - Twitter handle: [https://twitter.com/JinaAI_](https://twitter.com/JinaAI_) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-08 02:17:04 -04:00
William FH	4ba7396f96	Add single run eval loader (#7390 ) Plus - add evaluation name to make string and embedding validators work with the run evaluator loader. - Rm unused root validator	2023-07-07 23:06:49 -07:00
Roger Yu	633b673b85	Update pinecone.ipynb (#7382 ) Fix typo	2023-07-08 01:48:03 -04:00
Oleg Zabluda	4d697d3f24	Allow passing custom prompts to GraphIndexCreator (#7381 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-08 01:47:53 -04:00
William FH	612a74eb7e	Make Ref Example Threadsafe (#7383 ) Have noticed transient ref example misalignment. I believe this is caused by the logic of assigning an example within the thread executor rather than before.	2023-07-07 21:50:42 -07:00
William FH	4789c99bc2	Add String Distance and Embedding Evaluators (#7123 ) Add a string evaluator and pairwise string evaluator implementation for: - Embedding distance - String distance Update docs	2023-07-07 21:44:31 -07:00
ljeagle	fb6e63dc36	Upgrade the AwaDB from 0.3.5 to 0.3.6 (#7363 )	2023-07-07 20:41:17 -07:00
William FH	c5edbea34a	Load Run Evaluator (#7101 ) Current problems: 1. Evaluating LLMs or Chat models isn't smooth. Even specifying 'generations' as the output inserts a redundant list into the eval template 2. Configuring input / prediction / reference keys in the `get_qa_evaluator` function is confusing. Unless you are using a chain with the default keys, you have to specify all the variables and need to reason about whether the key corresponds to the traced run's inputs, outputs or the examples inputs or outputs. Proposal: - Configure the run evaluator according to a model. Use the model type and input/output keys to assert compatibility where possible. Only need to specify a reference_key for certain evaluators (which is less confusing than specifying input keys) When does this work: - If you have your langchain model available (assumed always for run_on_dataset flow) - If you are evaluating an LLM, Chat model, or chain - If the LLM or chat models are traced by langchain (wouldn't work if you add an incompatible schema via the REST API) When would this fail: - Currently if you directly create an example from an LLM run, the outputs are generations with all the extra metadata present. A simple `example_key` and dumping all to the template could make the evaluations unreliable - Doesn't help if you're not using the low level API - If you want to instantiate the evaluator without instantiating your chain or LLM (maybe common for monitoring, for instance) -> could also load from run or run type though What's ugly: - Personally think it's better to load evaluators one by one since passing a config down is pretty confusing. - Lots of testing needs to be added - Inconsistent in that it makes a separate run and example input mapper instead of the original `RunEvaluatorInputMapper`, which maps a run and example to a single input. Example usage running the for an LLM, Chat Model, and Agent. ``` # Test running for the string evaluators evaluator_names = ["qa", "criteria"] model = ChatOpenAI() configured_evaluators = load_run_evaluators_for_model(evaluator_names, model=model, reference_key="answer") run_on_dataset(ds_name, model, run_evaluators=configured_evaluators) ``` <details> <summary>Full code with dataset upload</summary> ``` ## Create dataset from langchain.evaluation.run_evaluators.loading import load_run_evaluators_for_model from langchain.evaluation import load_dataset import pandas as pd lcds = load_dataset("llm-math") df = pd.DataFrame(lcds) from uuid import uuid4 from langsmith import Client client = Client() ds_name = "llm-math - " + str(uuid4())[0:8] ds = client.upload_dataframe(df, name=ds_name, input_keys=["question"], output_keys=["answer"]) ## Define the models we'll test over from langchain.llms import OpenAI from langchain.chat_models import ChatOpenAI from langchain.agents import initialize_agent, AgentType from langchain.tools import tool llm = OpenAI(temperature=0) chat_model = ChatOpenAI(temperature=0) @tool def sum(a: float, b: float) -> float: """Add two numbers""" return a + b def construct_agent(): return initialize_agent( llm=chat_model, tools=[sum], agent=AgentType.OPENAI_MULTI_FUNCTIONS, ) agent = construct_agent() # Test running for the string evaluators evaluator_names = ["qa", "criteria"] models = [llm, chat_model, agent] run_evaluators = [] for model in models: run_evaluators.append(load_run_evaluators_for_model(evaluator_names, model=model, reference_key="answer")) # Run on LLM, Chat Model, and Agent from langchain.client.runner_utils import run_on_dataset to_test = [llm, chat_model, construct_agent] for model, configured_evaluators in zip(to_test, run_evaluators): run_on_dataset(ds_name, model, run_evaluators=configured_evaluators, verbose=True) ``` </details> --------- Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-07-07 19:57:59 -07:00
Bagatur	1ac347b4e3	update databerry-chaindesk redirect (#7378 )	2023-07-07 19:11:46 -04:00
Joshua Carroll	705d2f5b92	Update the API Reference link in Streamlit integration docs (#7377 ) This page: https://python.langchain.com/docs/modules/callbacks/integrations/streamlit Has a bad API Reference link currently. This PR fixes it to the correct link. Also updates the embedded app link to https://langchain-mrkl.streamlit.app/ (better name) which is hosted in langchain-ai/streamlit-agent repo	2023-07-07 17:35:57 -04:00
Georges Petrov	ec033ae277	Rename Databerry to Chaindesk (#7022 ) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-07 17:28:04 -04:00
Philip Meier	da5b0723d2	update MosaicML inputs and outputs (#7348 ) As of today (July 7, 2023), the [MosaicML API](https://docs.mosaicml.com/en/latest/inference.html#text-completion-requests) uses `"inputs"` for the prompt This PR adds support for this new format. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-07 17:23:11 -04:00
Bearnardd	184ede4e48	Fix buggy output from GraphQAChain (#7372 ) fixes https://github.com/hwchase17/langchain/issues/7289 A simple fix of the buggy output of `graph_qa`. If we have several entities with triplets then the last entry of `triplets` for a given entity merges with the first entry of the `triplets` of the next entity.	2023-07-07 17:19:53 -04:00
Harrison Chase	7cdf97ba9b	Harrison/add to imports (#7370 ) pgvector cleanup	2023-07-07 16:27:44 -04:00
Bagatur	4d427b2397	Base language model docstrings (#7104 )	2023-07-07 16:09:10 -04:00
ॐ shivam mamgain	2179d4eef8	Fix for KeyError in MlflowCallbackHandler (#7051 ) - Description: `MlflowCallbackHandler` fails with `KeyError: "['name'] not in index"`. See https://github.com/hwchase17/langchain/issues/5770 for more details. Root cause is that LangChain does not pass "name" as a part of `serialized` argument to `on_llm_start()` callback method. The commit where this change was made is probably this: `18af149e91`. My bug fix derives "name" from "id" field. - Issue: https://github.com/hwchase17/langchain/issues/5770 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-07 16:08:06 -04:00
Alex Gamble	df746ad821	Add a callback handler for Context (https://getcontext.ai ) (#7151 ) ### Description Adding a callback handler for Context. Context is a product analytics platform for AI chat experiences to help you understand how users are interacting with your product. I've added the callback library + an example notebook showing its use. ### Dependencies Requires the user to install the `context-python` library. The library is lazily-loaded when the callback is instantiated. ### Announcing the feature We spoke with Harrison a few weeks ago about also doing a blog post announcing our integration, so will coordinate this with him. Our Twitter handle for the company is @getcontextai, and the founders are @_agamble and @HenrySG. Thanks in advance!	2023-07-07 15:33:29 -04:00
Austin	c9a0f24646	Add verbose parameter for llamacpp (#7253 ) Title: Add verbose parameter for llamacpp Description: This pull request adds a 'verbose' parameter to the llamacpp module. The 'verbose' parameter, when set to True, will enable the output of detailed logs during the execution of the Llama model. This added parameter can aid in debugging and understanding the internal processes of the module. The verbose parameter is a boolean that prints verbose output to stderr when set to True. By default, the verbose parameter is set to True but can be toggled off if less output is desired. This new parameter has been added to the `validate_environment` method of the `LlamaCpp` class which initializes the `llama_cpp.Llama` API: ```python class LlamaCpp(LLM): ... @root_validator() def validate_environment(cls, values: Dict) -> Dict: ... model_param_names = [ ... "verbose", # New verbose parameter added ] ... values["client"] = Llama(model_path, **model_params) ... ``` --------- Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>	2023-07-07 15:08:25 -04:00
Kenny	34a2755a54	Allow passing api key into OpenAIWhisperParser (#7281 ) This just allows the user to pass in an api_key directly into OpenAIWhisperParser. Very simple addition.	2023-07-07 15:07:45 -04:00
mrkhalil6	4e7d0c115b	Add support for filters and namespaces in similarity search in Pinecone similarity_score_threshold (#7301 ) At the moment, pinecone vectorStore does not support filters and namespaces when using similarity_score_threshold search type. In this PR, I've implemented that. It passes all the kwargs except "score_threshold" as that is not a supported argument for method "similarity_search_with_score". --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-07 15:03:59 -04:00
Manuel Saelices	01dca1e438	Add context to an output parsing error on Pydantic schema to improve exception handling (#7344 ) ## Changes - [X] Fill the `llm_output` param when there is an output parsing error in a Pydantic schema so that we can get the original text that failed to parse when handling the exception ## Background With this change, we could do something like this: ``` output_parser = PydanticOutputParser(pydantic_object=pydantic_obj) chain = ConversationChain(..., output_parser=output_parser) try: response: PydanticSchema = chain.predict(input=input) except OutputParserException as exc: logger.error( 'OutputParserException while parsing chatbot response: %s', exc.llm_output, ) ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-07 14:49:37 -04:00
Raouf Chebri	1ac6deda89	update extension name (#7359 ) hi @rlancemartin , We had a new deployment and the `pg_extension` creation command was updated from `CREATE EXTENSION pg_embedding` to `CREATE EXTENSION embedding`. https://github.com/neondatabase/neon/pull/4646 The extension not made public yet. No users will be affected by this. Will be public next week. Please let me know if you have any questions. Thank you in advance 🙏	2023-07-07 11:35:51 -07:00
William FH	4e180dc54e	Unset Cache in Tests (#7362 ) This is impacting other unit tests that use callbacks since the cache is still set (just empty)	2023-07-07 11:05:09 -07:00
German Martin	3ce4e46c8c	The Fellowship of the Vectors: New Embeddings Filter using clustering. (#7015 ) Continuing with Tolkien inspired series of langchain tools. I bring to you: The Fellowship of the Vectors, AKA EmbeddingsClusteringFilter. This document filter uses embeddings to group vectors together into clusters, then allows you to pick an arbitrary number of documents vector based on proximity to the cluster centers. That's a representative sample of the cluster. The original idea is from [Greg Kamradt](https://github.com/gkamradt) from this video (Level4): https://www.youtube.com/watch?v=qaPMdcCqtWk&t=365s I added few tricks to make it a bit more versatile, so you can parametrize what to do with duplicate documents in case of cluster overlap: replace the duplicates with the next closest document or remove it. This allow you to use it as an special kind of redundant filter too. Additionally you can choose 2 diff orders: grouped by cluster or respecting the original retriever scores. In my use case I was using the docs grouped by cluster to run refine chains per cluster to generate summarization over a large corpus of documents. Let me know if you want to change anything! @rlancemartin, @eyurtsev, @hwchase17, --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-07-07 10:28:17 -07:00
Leonid Ganeline	b489466488	docs: `dependents` update 4 (#7360 ) Updated links and counters of the `dependents` page.	2023-07-07 13:22:30 -04:00
William FH	38ca5c84cb	Explicitly list requires_reference in function (#7357 )	2023-07-07 10:04:03 -07:00
Harrison Chase	49b2b0e3c0	change embedding to None (#7355 )	2023-07-07 12:33:03 -04:00
imaprogrammer	a2830e3056	Update chroma.py: Persist directory from client_settings if provided there (#7087 ) Change details: - Description: When calling db.persist(), a check prevents from it proceeding as the constructor only sets member `_persist_directory` from parameters. But the ChromaDB client settings also has this parameter, and if the client_settings parameter is used without passing the persist_directory (which is optional), the `persist` method raises `ValueError` for not setting `_persist_directory`. This change fixes it by setting the member `_persist_directory` variable from client_settings if it is set, else uses the constructor parameter. - Issue: I didn't find any github issue of this, but I discovered it after calling the persist method - Dependencies: None - Tag maintainer: vectorstore related change - @rlancemartin, @eyurtsev - Twitter handle: Don't have one :( Additional discussion: We may need to discuss the way I implemented the fallback using `or`. --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-07-07 09:20:27 -07:00
Bagatur	cb4e88e4fb	bump 227 (#7354 )	2023-07-07 11:52:35 -04:00
Bagatur	d1c7237034	openai fn update nb (#7352 )	2023-07-07 11:52:21 -04:00
Bagatur	0ed2da7020	bump 226 (#7335 )	2023-07-07 05:59:13 -04:00
Bagatur	1c8cff32f1	Generic OpenAI fn chain (#7270 ) Add loading functions for openai function chains and add docs page	2023-07-07 05:44:53 -04:00
Bagatur	fd7145970f	Output parser redirect (#7330 ) Related to ##7311	2023-07-07 04:26:34 -04:00
OwenElliott	3074306ae1	Marqo Vector Store Examples & Type Hints (#7326 ) This PR improves the example notebook for the Marqo vectorstore implementation by adding a new RetrievalQAWithSourcesChain example. The `embedding` parameter in `from_documents` has its type updated to `Union[Embeddings, None]` and a default parameter of None because this is ignored in Marqo. This PR also upgrades the Marqo version to 0.11.0 to remove the device parameter after a breaking change to the API. Related to #7068 @tomhamer @hwchase17 --------- Co-authored-by: Tom Hamer <tom@marqo.ai>	2023-07-07 04:11:20 -04:00
Nayjest	5809c3d29d	Pack of small fixes and refactorings that don't affect functionality (#6990 ) Description: Pack of small fixes and refactorings that don't affect functionality, just making code prettier & fixing some misspelling (hand-filtered improvements proposed by SeniorAi.online, prototype of code improving tool based on gpt4), agents and callbacks folders was covered. Dependencies: Nothing changed Twitter: https://twitter.com/nayjest Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-07 03:40:49 -04:00
Bagatur	87f75cb322	Add base Chain docstrings (#7114 )	2023-07-07 03:06:33 -04:00
Leonid Ganeline	284d40b7af	docstrings top level update (#7173 ) Updated docstrings so, that [API Reference](https://api.python.langchain.com/en/latest/api_reference.html) page has text in the second column (class/function/... description.	2023-07-07 02:42:28 -04:00
Stav Sapir	8d961b9e33	add preset ability to textgen llm (#7196 ) add an ability for textgen llm to work with preset provided by text gen webui API. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-07 02:41:24 -04:00
Bagatur	a9c5b4bcea	Bagatur/clarifai update (#7324 ) This PR improves upon the Clarifai LangChain integration with improved docs, errors, args and the addition of embedding model support in LancChain for Clarifai's embedding models and an overview of the various ways you can integrate with Clarifai added to the docs. --------- Co-authored-by: Matthew Zeiler <zeiler@clarifai.com>	2023-07-07 02:23:20 -04:00
Oleg Zabluda	9954eff8fd	Rename prompt_template => _DEFAULT_GRAPH_QA_TEMPLATE and PROMPT => GRAPH_QA_PROMPT to make consistent with the rest of the files (#7250 ) Rename prompt_template => _DEFAULT_GRAPH_QA_TEMPLATE to make consistent with the rest of the file.	2023-07-07 02:17:40 -04:00
Nikhil Kumar Gupta	6095a0a310	Added number_of_head_rows to pandas agent parameters (#7271 ) Description: Added number_of_head_rows as a parameter to pandas agent. number_of_head_rows allows the user to select the number of rows to pass with the prompt when include_df_in_prompt is True. This gives the ability to control the token length and can be helpful in dealing with large dataframe. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-07 02:17:26 -04:00
John Landahl	e047541b5f	Corrected a typo in elasticsearch.ipynb (#7318 ) Simple typo fix	2023-07-07 01:35:32 -04:00
Subsegment	152dc59060	docs : add cnosdb to Ecosystem Integrations (#7316 ) - Implement a `from_cnosdb` method for the `SQLDatabase` class - Write CnosDB documentation and add it to Ecosystem Integrations	2023-07-07 01:35:22 -04:00
Bagatur	927c8eb91a	Refac package version check (#7312 )	2023-07-07 01:21:53 -04:00
Sparsh Jain	bac56618b4	Solving anthropic packaging version issue (#7306 ) - Description: Solving, anthropic packaging version issue by clearing the mixup from package.version that is being confused with version from - importlib.metadata.version. - Issue: it fixes the issue #7283 - Maintainer: @hwchase17 The following change has been explained in the comment - https://github.com/hwchase17/langchain/issues/7283#issuecomment-1624328978	2023-07-06 19:35:42 -04:00
Jason B. Koh	d642609a23	Fix: Recognize `List` at `from_function` (#7178 ) - Description: pydantic's `ModelField.type_` only exposes the native data type but not complex type hints like `List`. Thus, generating a Tool with `from_function` through function signature produces incorrect argument schemas (e.g., `str` instead of `List[str]`) - Issue: N/A - Dependencies: N/A - Tag maintainer: @hinthornw - Twitter handle: `mapped` All the unittest (with an additional one in this PR) passed, though I didn't try integration tests...	2023-07-06 17:22:09 -04:00
Chathura Rathnayake	ec10787bc7	Fixed the confluence loader ".csv" files loading issue (#7195 ) - Description: Sometimes there are csv attachments with the media type "application/vnd.ms-excel". These files failed to be loaded via the xlrd library. It throws a corrupted file error. I fixed it by separately processing excel files using pandas. Excel files will be processed just like before. - Dependencies: pandas, os, io --------- Co-authored-by: Chathura <chathurar@yaalalabs.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-06 17:21:43 -04:00
Andre Elizondo	b21c2f8704	Update docs for whylabs (langkit) callback handler (#7293 ) - Description: Update docs for whylabs callback handler - Issue: none - Dependencies: none - Tag maintainer: @agola11 - Twitter handle: @useautomation @whylabs --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Jamie Broomall <jamie@whylabs.ai>	2023-07-06 17:21:28 -04:00
William FH	e736d60516	Load Evaluator (#6942 ) Create a `load_evaluators()` function so you don't have to import all the individual evaluator classes	2023-07-06 13:58:58 -07:00
David Duong	12d14f8947	Fix secrets serialisation for ChatAnthropic (#7300 )	2023-07-06 21:57:12 +01:00
William FH	cb9ff6efb8	Add function call params to invocation params (#7240 )	2023-07-06 13:56:07 -07:00
William FH	1f4a51cb9c	Add Agent Trajectory Interface (#7122 )	2023-07-06 13:33:33 -07:00
Bagatur	a6b39afe0e	rm side nav (#7297 )	2023-07-06 15:19:29 -04:00
Bruno Bornsztein	1a4ca3eff9	handle missing finish_reason (#7296 ) In some cases, the OpenAI response is missing the `finish_reason` attribute. It seems to happen when using Ada or Babbage and `stream=true`, but I can't always reproduce it. This change just gracefully handles the missing key.	2023-07-06 15:13:51 -04:00
Leonid Ganeline	6ff9e9b34a	updated `huggingface_hub` examples (#7292 ) Added examples for models: - Google `Flan` - TII `Falcon` - Salesforce `XGen`	2023-07-06 15:04:37 -04:00
Avinash Raj	09acbb8410	Modified PromptLayerChatOpenAI class to support function call (#6366 ) Introduction of newest function calling feature doesn't work properly with PromptLayerChatOpenAI model since on the `_generate` method, functions argument are not even getting passed to the `ChatOpenAI` base class which results in empty `ai_message.additional_kwargs` Fixes #6365	2023-07-06 13:16:04 -04:00
Dídac Sabatés	e0cb3ea90c	Fix sql_database.ipynb link (#6525 ) Looks like the [SQLDatabaseChain](https://langchain.readthedocs.io/en/latest/modules/chains/examples/sqlite.html) in the SQL Database Agent page was broken I've change it to the SQL Chain page	2023-07-06 13:07:37 -04:00
Leonid Ganeline	4450791edd	docs: tutorials update (#7230 ) updated `tutorials.mdx`: - added a link to new `Deeplearning AI` course on LangChain - added links to other tutorial videos - fixed format @baskaryan, @hwchase17	2023-07-06 12:44:23 -04:00
Diego Machado	a7ae35fe4e	Fix duplicated sentence in documentation's introduction (#6351 ) Fix duplicated sentence in documentation's introduction	2023-07-06 12:12:18 -04:00
Bagatur	681f2678a3	add elasticknn to init (#7284 )	2023-07-06 11:58:24 -04:00
hayao-k	c23e16c459	docs: Fixed typos in Amazon Kendra Retriever documentation (#7261 ) ## Description Fixed to the official service name Amazon Kendra. ## Tag maintainer @baskaryan	2023-07-06 11:56:52 -04:00
zhujiangwei	8c371e12eb	refactor BedrockEmbeddings class (#7266 ) #### Description refactor BedrockEmbeddings class to clean code as below: 1. inline content type and accept 2. rewrite input_body as a dictionary literal 3. no need to declare embeddings variable, so remove it	2023-07-06 11:56:30 -04:00
Chui	c7cf11b8ab	Remove whitespace in filename (#7264 )	2023-07-06 11:55:42 -04:00
Jan Kubica	fed64ae060	Chroma: add vector search with scores (#6864 ) - Description: Adding to Chroma integration the option to run a similarity search by a vector with relevance scores. Fixing two minor typos. - Issue: The "lambda_mult" typo is related to #4861 - Maintainer: @rlancemartin, @eyurtsev	2023-07-06 10:01:55 -04:00
William FH	576880abc5	Re-use Trajectory Evaluator (#7248 ) Use the trajectory eval chain in the run evaluation implementation and update the prepare inputs method to apply to both asynca nd sync	2023-07-06 07:00:24 -07:00
zhaoshengbo	e8f24164f0	Improve the alibaba cloud opensearch vector store documentation (#6964 ) Based on user feedback, we have improved the Alibaba Cloud OpenSearch vector store documentation. Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>	2023-07-06 09:47:49 -04:00
Eduard van Valkenburg	ae5aa496ee	PowerBI updates (#7143 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> Several updates for the PowerBI tools: - Handle 0 records returned by requesting redo with different filtering - Handle too large results by optionally tokenizing the result and comparing against a max (change in signature, non-breaking) - Implemented LLMChain with Chat for chat models for the tools. - Updates to the main prompt including tables - Update to Tool prompt with TOPN function - Split the tool prompt to allow the LLMChain with ChatPromptTemplate Smaller fixes for stability. For visibility: @hinthornw	2023-07-06 09:39:23 -04:00
emarco177	b9d6d4cd4c	added template repo for CI/CD deployment on Google Cloud Run (#7218 ) Replace this comment with: - Description: added documentation for a template repo that helps dockerizing and deploying a LangChain using a Cloud Build CI/CD pipeline to Google Cloud build serverless - Issue: None, - Dependencies: None, - Tag maintainer: @baskaryan, - Twitter handle: EdenEmarco177 If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use.	2023-07-06 09:38:38 -04:00
Leonid Kuligin	8b19f6a0da	Added retries for Vertex LLM (#7219 ) #7217 --------- Co-authored-by: Leonid Kuligin <kuligin@google.com>	2023-07-06 09:38:01 -04:00
William FH	ec66d5188c	Add Better Errors for Comparison Chain (#7033 ) + change to ABC - this lets us add things like the evaluation name for loading	2023-07-06 06:37:04 -07:00
Stefano Lottini	e61cfb6e99	FLARE Example notebook: switch to named arg to pass pydantic validation (#7267 ) Adding the name of the parameter to comply with latest requirements by Pydantic usage for BaseModels.	2023-07-06 09:32:00 -04:00
Sasmitha Manathunga	0c7a5cb206	Fix inconsistent behavior of `CharacterTextSplitter` when changing `keep_separator` (#7263 ) - Description: - When `keep_separator` is `True` the `_split_text_with_regex()` method in `text_splitter` uses regex to split, but when `keep_separator` is `False` it uses `str.split()`. This causes problems when the separator is a special regex character like `.` or `*`. This PR fixes that by using `re.split()` in both cases. - Issue: #7262 - Tag maintainer: @baskaryan	2023-07-06 09:30:03 -04:00
os1ma	b151d4257a	docs: Update documentation for Wikipedia tool to use WikipediaQueryRun (#7258 ) Description In the following page, "Wikipedia" tool is explained. https://python.langchain.com/docs/modules/agents/tools/integrations/wikipedia However, the WikipediaAPIWrapper being used is not a tool. This PR updated the documentation to use a tool WikipediaQueryRun. Issue None Tag maintainer Agents / Tools / Toolkits: @hinthornw	2023-07-06 09:29:38 -04:00
Jeroen Van Goey	887bb12287	Use correct Language for html_splitter (#7274 ) `html_splitter` was using `Language.MARKDOWN`.	2023-07-06 09:24:25 -04:00
Shantanu Nair	f773c21723	Update supabase match_docs ddl and notebook to use expected id type (#7257 ) - Description: Switch supabase match function DDL to use expected uuid type instead of bigint - Issue: https://github.com/hwchase17/langchain/issues/6743, https://github.com/hwchase17/langchain/issues/7179 - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: https://twitter.com/ShantanuNair	2023-07-06 09:22:41 -04:00
Myeongseop Kim	0e878ccc2d	Add HumanInputChatModel (#7256 ) - Description: This is a chat model equivalent of HumanInputLLM. An example notebook is also added. - Tag maintainer: @hwchase17, @baskaryan - Twitter handle: N/A	2023-07-06 09:21:03 -04:00
Myeongseop Kim	57d8a3d1e8	Make tqdm for OpenAIEmbeddings optional (#7247 ) - Description: I have added a `show_progress_bar` parameter (defaults.to `False`) to the `OpenAIEmbeddings`. If the user sets `show_progress_bar` to `True`, a progress bar will be displayed. - Issue: #7246 - Dependencies: N/A - Tag maintainer: @hwchase17, @baskaryan - Twitter handle: N/A	2023-07-05 23:36:01 -04:00
Harrison Chase	c36f852846	fix conversational retrieval docs (#7245 )	2023-07-05 21:51:33 -04:00
Harrison Chase	035ad33a5b	bump ver to 225 (#7244 )	2023-07-05 21:22:18 -04:00
Shantanu Nair	cabd358c3a	Add missing token_max in reduce.py acombine_docs (#7241 ) Replace this comment with: - Description: reduce.py reduce chain implementation's acombine_docs call does not propagate token_max. Without this, the async call will end up using 3000 tokens, the default, for the collapse chain. - Tag maintainer: @hwchase17 @agola11 @baskaryan - Twitter handle: https://twitter.com/ShantanuNair Related PR: https://github.com/hwchase17/langchain/pull/7201 and https://github.com/hwchase17/langchain/pull/7204 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 21:02:45 -04:00
Harrison Chase	52b016920c	Harrison/update anthropic (#7237 ) Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>	2023-07-05 21:02:35 -04:00
Harrison Chase	695e7027e6	Harrison/parameter (#7081 ) add parameter to use original question or not --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-05 20:51:25 -04:00
Yevgnen	930e319ca7	Add concurrency to GitbookLoader (#7069 ) - Description: Fetch all pages concurrently. - Dependencies: `scrape_all` -> `fetch_all` -> `_fetch_with_rate_limit` -> `_fetch` (might be broken currently: https://github.com/hwchase17/langchain/pull/6519) - Tag maintainer: @rlancemartin, @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 20:51:10 -04:00
Hashem Alsaket	6aa66fd2b0	Update Hugging Face Hub notebook (#7236 ) Description: `flan-t5-xl` hangs, updated to `flan-t5-xxl`. Tested all stabilityai LLMs- all hang so removed from tutorial. Temperature > 0 to prevent unintended determinism. Issue: #3275 Tag maintainer: @baskaryan	2023-07-05 20:45:02 -04:00
Mykola Zomchak	8afc8e6f5d	Fix web_base.py (#6519 ) Fix for bug in SitemapLoader `aiohttp` `get` does not accept `verify` argument, and currently throws error, so SitemapLoader is not working This PR fixes it by removing `verify` param for `get` function call Fixes #6107 #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev --------- Co-authored-by: techcenary <127699216+techcenary@users.noreply.github.com>	2023-07-05 16:53:57 -07:00
William FH	f891f7d69f	Skip evaluation of unfinished runs (#7235 ) Cut down on errors logged Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>	2023-07-05 16:35:20 -07:00
William FH	83cf01683e	Add 'eval' tag (#7209 ) Add an "eval" tag to traced evaluation runs Most of this PR is actually https://github.com/hwchase17/langchain/pull/7207 but I can't diff off two separate PRs --------- Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>	2023-07-05 16:28:34 -07:00
William FH	607708a411	Add tags support for langchaintracer (#7207 )	2023-07-05 16:19:04 -07:00
William FH	75aa408f10	Send evaluator logs to new session (#7206 ) Also stop specifying "eval" mode since explicit project modes are deprecated	2023-07-05 16:15:29 -07:00
Harrison Chase	0dc700eebf	Harrison/scene xplain (#7228 ) Co-authored-by: Kevin Pham <37129444+deoxykev@users.noreply.github.com>	2023-07-05 18:34:50 -04:00
Harrison Chase	d6541da161	remove arize nb (#7238 ) was causing some issues with docs build	2023-07-05 18:34:20 -04:00
Mike Nitsenko	d669b9ece9	Document loader for Cube Semantic Layer (#6882 ) ### Description This pull request introduces the "Cube Semantic Layer" document loader, which demonstrates the retrieval of Cube's data model metadata in a format suitable for passing to LLMs as embeddings. This enhancement aims to provide contextual information and improve the understanding of data. Twitter handle: @the_cube_dev --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-07-05 15:18:12 -07:00
Tom	e533da8bf2	Adding Marqo to vectorstore ecosystem (#7068 ) This PR brings in a vectorstore interface for [Marqo](https://www.marqo.ai/). The Marqo vectorstore exposes some of Marqo's functionality in addition the the VectorStore base class. The Marqo vectorstore also makes the embedding parameter optional because inference for embeddings is an inherent part of Marqo. Docs, notebook examples and integration tests included. Related PR: https://github.com/hwchase17/langchain/pull/2807 --------- Co-authored-by: Tom Hamer <tom@marqo.ai> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 14:44:12 -07:00
Filip Haltmayer	836d2009cb	Update milvus and zilliz docstring (#7216 ) Description: Updating the docstrings for Milvus and Zilliz so that they appear correctly on https://integrations.langchain.com/vectorstores. No changes done to code. Maintainer: @baskaryan Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>	2023-07-05 17:03:51 -04:00
Matt Robinson	d65b1951bd	docs: update docs strings for base unstructured loaders (#7222 ) ### Summary Updates the docstrings for the unstructured base loaders so more useful information appears on the integrations page. If these look good, will add similar docstrings to the other loaders. ### Reviewers - @rlancemartin - @eyurtsev - @hwchase17	2023-07-05 17:02:26 -04:00
Mike Salvatore	265f05b10e	Enable InMemoryDocstore to be constructed without providing a dict (#6976 ) - Description: Allow `InMemoryDocstore` to be created without passing a dict to the constructor; the constructor can create a dict at runtime if one isn't provided. - Tag maintainer: @dev2049	2023-07-05 16:56:31 -04:00
Harrison Chase	47e7d09dff	fix arize nb (#7227 )	2023-07-05 16:55:48 -04:00
Feras Almannaa	79b59a8e06	optimize pgvector `add_texts` (#7185 ) - Description: At the moment, inserting new embeddings to pgvector is querying all embeddings every time as the defined `embeddings` relationship is using the default params, which sets `lazy="select"`. This change drastically improves the performance and adds a few additional cleanups: * remove `collection.embeddings.append` as it was querying all embeddings on insert, replace with `collection_id` param * centralize storing logic in add_embeddings function to reduce duplication * remove boilerplate - Issue: No issue was opened. - Dependencies: None. - Tag maintainer: this is a vectorstore update, so I think @rlancemartin, @eyurtsev - Twitter handle: @falmannaa	2023-07-05 13:19:42 -07:00
Harrison Chase	6711854e30	Harrison/dataforseo (#7214 ) Co-authored-by: Alexander <sune357@gmail.com>	2023-07-05 16:02:02 -04:00
Richy Wang	cab7d86f23	Implement delete interface of vector store on AnalyticDB (#7170 ) Hi, there This pull request contains two commit: 1. Implement delete interface with optional ids parameter on AnalyticDB. 2. Allow customization of database connection behavior by exposing engine_args parameter in interfaces. - This commit adds the `engine_args` parameter to the interfaces, allowing users to customize the behavior of the database connection. The `engine_args` parameter accepts a dictionary of additional arguments that will be passed to the create_engine function. Users can now modify various aspects of the database connection, such as connection pool size and recycle time. This enhancement provides more flexibility and control to users when interacting with the database through the exposed interfaces. This commit is related to VectorStores @rlancemartin @eyurtsev Thank you for your attention and consideration.	2023-07-05 13:01:00 -07:00
Mike Salvatore	3ae11b7582	Handle kwargs in FAISS.load_local() (#6987 ) - Description: This allows parameters such as `relevance_score_fn` to be passed to the `FAISS` constructor via the `load_local()` class method. - Tag maintainer: @rlancemartin @eyurtsev	2023-07-05 15:56:40 -04:00
Jamal	a2f191a322	Replace JIRA Arbitrary Code Execution vulnerability with finer grain API wrapper (#6992 ) This fixes #4833 and the critical vulnerability https://nvd.nist.gov/vuln/detail/CVE-2023-34540 Previously, the JIRA API Wrapper had a mode that simply pipelined user input into an `exec()` function. [The intended use of the 'other' mode is to cover any of Atlassian's API that don't have an existing interface](`cc33bde74f/langchain/tools/jira/prompt.py (L24)`) Fortunately all of the [Atlassian JIRA API methods are subfunctions of their `Jira` class](https://atlassian-python-api.readthedocs.io/jira.html), so this implementation calls these subfunctions directly. As well as passing a string representation of the function to call, the implementation flexibly allows for optionally passing args and/or keyword-args. These are given as part of the dictionary input. Example: ``` { "function": "update_issue_field", #function to execute "args": [ #list of ordered args similar to other examples in this JiraAPIWrapper "key", {"summary": "New summary"} ], "kwargs": {} #dict of key value keyword-args pairs } ``` the above is equivalent to `self.jira.update_issue_field("key", {"summary": "New summary"})` Alternate query schema designs are welcome to make querying easier without passing and evaluating arbitrary python code. I considered parsing (without evaluating) input python code and extracting the function, args, and kwargs from there and then pipelining them into the callable function via `f(args, *kwargs)` - but this seemed more direct. @vowelparrot @dev2049 --------- Co-authored-by: Jamal Rahman <jamal.rahman@builder.ai>	2023-07-05 15:56:01 -04:00
Hakan Tekgul	61938a02a1	Create arize_llm_observability.ipynb (#7000 ) Adding documentation and notebook for Arize callback handler. - @dev2049 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11	2023-07-05 15:55:47 -04:00
Leonid Ganeline	ecee4d6e92	docs: update `youtube` videos and tutorials (#6515 ) added tutorials.mdx; updated youtube.mdx Rationale: the Tutorials section in the documentation is top-priority. (for example, https://pytorch.org/docs/stable/index.html) Not every project has resources to make tutorials. We have such a privilege. Community experts created several tutorials on YouTube. But the tutorial links are now hidden on the YouTube page and not easily discovered by first-time visitors. - Added new videos and tutorials that were created since the last update. - Made some reprioritization between videos on the base of the view numbers. #### Who can review? - @hwchase17 - @dev2049	2023-07-05 12:50:31 -07:00
Santiago Delgado	fa55c5a16b	Fixed Office365 tool __init__.py files, tests, and get_tools() function (#7046 ) ## Description Added Office365 tool modules to `__init__.py` files ## Issue As described in Issue https://github.com/hwchase17/langchain/issues/6936, the Office365 toolkit can't be loaded easily because it is not included in the `__init__.py` files. ## Reviewer @dev2049	2023-07-05 15:46:21 -04:00
wewebber-merlin	8a7c95e555	Retryable exception for empty OpenAI embedding. (#7070 ) Description: The OpenAI "embeddings" API intermittently falls into a failure state where an embedding is returned as [ Nan ], rather than the expected 1536 floats. This patch checks for that state (specifically, for an embedding of length 1) and if it occurs, throws an ApiError, which will cause the chunk to be retried. Issue: I have been unable to find an official langchain issue for this problem, but it is discussed (by another user) at https://stackoverflow.com/questions/76469415/getting-embeddings-of-length-1-from-langchain-openaiembeddings Maintainer: @dev2049 Testing: Since this is an intermittent OpenAI issue, I have not provided a unit or integration test. The provided code has, though, been run successfully over several million tokens. --------- Co-authored-by: William Webber <william@williamwebber.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 15:23:45 -04:00
Nuno Campos	e4459e423b	Mark some output parsers as serializable (cross-checked w/ JS) (#7083 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-05 14:53:56 -04:00
Ankush Gola	4c1c05c2c7	support adding custom metadata to runs (#7120 ) - [x] wire up tools - [x] wire up retrievers - [x] add integration test <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-05 11:11:38 -07:00
Josh Reini	30d8d1d3d0	add trulens integration (#7096 ) Description: Add TruLens integration. Twitter: @trulensml For review: - Tracing: @agola11 - Tools: @hinthornw	2023-07-05 14:04:55 -04:00
Hyoseung Kim	9abf1847f4	Fix steamship import error (#7133 ) Description: Fix steamship import error When running multi_modal_output_agent: field "steamship" not yet prepared so type is still a ForwardRef, you might need to call SteamshipImageGenerationTool.update_forward_refs(). Tag maintainer: @hinthornw --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 14:04:38 -04:00
Mohammad Mohtashim	7d92e9407b	Jinja2 validation changed to issue warnings rather than issuing exceptions. (#7161 ) - Description: If their are missing or extra variables when validating Jinja 2 template then a warning is issued rather than raising an exception. This allows for better flexibility for the developer as described in #7044. Also changed the relevant test so pytest is checking for raised warnings rather than exceptions. - Issue: #7044 - Tag maintainer: @hwchase17, @baskaryan --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 14:04:29 -04:00
whying	e288410e72	fix: Chroma filter symbols not supporting LIKE and CONTAIN (#7169 ) Fixing issue with SelfQueryRetriever due to unsupported LIKE and CONTAIN comparators in Chroma's WHERE filter statements. This pull request introduces a redefined set of comparators in Chroma to address the problem and make it compatible with SelfQueryRetriever. For information on the comparators supported by Chroma's filter, please refer to https://docs.trychroma.com/usage-guide#using-where-filters. <img width="495" alt="image" src="https://github.com/hwchase17/langchain/assets/22267652/34789191-0293-4f63-9bdf-ad1e1f2567c4"> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 14:04:18 -04:00
Nuno Campos	26409b01bd	Remove extra base model (#7213 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-05 14:02:27 -04:00
Samhita Alla	6f358bb04a	make textstat optional in the flyte callback handler (#7186 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> This PR makes the `textstat` library optional in the Flyte callback handler. @hinthornw, would you mind reviewing this PR since you merged the flyte callback handler code previously? --------- Signed-off-by: Samhita Alla <aallasamhita@gmail.com>	2023-07-05 13:15:56 -04:00
Conrad Fernandez	6eff0fa2ca	Added documentation for add_texts function for Pinecone integration (#7134 ) - Description: added some documentation to the Pinecone vector store docs page. - Issue: #7126 - Dependencies: None - Tag maintainer: @baskaryan I can add more documentation on the Pinecone integration functions as I am going to go in great depth into this area. Just wanted to check with the maintainers is if this is all good.	2023-07-05 13:11:37 -04:00
Nuno Campos	81e5b1ad36	Add serialized object to retriever start callback (#7074 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-05 18:04:43 +01:00
Efkan S. Goktepe	baf48d3583	Replace stop clause with shorter, pythonic alternative (#7159 ) Replace this comment with: - Description: Replace `if var is not None:` with `if var:`, a concise and pythonic alternative - Issue: N/A - Dependencies: None - Tag maintainer: Unsure - Twitter handle: N/A Signed-off-by: serhatgktp <efkan@ibm.com>	2023-07-05 13:03:22 -04:00
Shuqian	8045870a0f	fix: prevent adding an empty string to the result queue in AsyncIteratorCallbackHandler (#7180 ) - Description: Modify the code for AsyncIteratorCallbackHandler.on_llm_new_token to ensure that it does not add an empty string to the result queue. - Tag maintainer: @agola11 When using AsyncIteratorCallbackHandler with OpenAIFunctionsAgent, if the LLM response function_call instead of direct answer, the AsyncIteratorCallbackHandler.on_llm_new_token would be called with empty string. see also: langchain.chat_models.openai.ChatOpenAI._generate An alternative solution is to modify the langchain.chat_models.openai.ChatOpenAI._generate and do not call the run_manager.on_llm_new_token when the token is empty string. I am not sure which solution is better. @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 13:00:35 -04:00
felixocker	db98c44f8f	Support for SPARQL (#7165 ) # [SPARQL](https://www.w3.org/TR/rdf-sparql-query/) for [LangChain](https://github.com/hwchase17/langchain) ## Description LangChain support for knowledge graphs relying on W3C standards using RDFlib: SPARQL/ RDF(S)/ OWL with special focus on RDF \ * Works with local files, files from the web, and SPARQL endpoints * Supports both SELECT and UPDATE queries * Includes both a Jupyter notebook with an example and integration tests ## Contribution compared to related PRs and discussions * [Wikibase agent](https://github.com/hwchase17/langchain/pull/2690) - uses SPARQL, but specifically for wikibase querying * [Cypher qa](https://github.com/hwchase17/langchain/pull/5078) - graph DB question answering for Neo4J via Cypher * [PR 6050](https://github.com/hwchase17/langchain/pull/6050) - tries something similar, but does not cover UPDATE queries and supports only RDF * Discussions on [w3c mailing list](mailto:semantic-web@w3.org) related to the combination of LLMs (specifically ChatGPT) and knowledge graphs ## Dependencies * [RDFlib](https://github.com/RDFLib/rdflib) ## Tag maintainer Graph database related to memory -> @hwchase17	2023-07-05 13:00:16 -04:00
Paul Cook	7cd0936b1c	Update in_memory.py to fix "TypeError: keywords must be strings" (#7202 ) Update in_memory.py to fix "TypeError: keywords must be strings" on certain dictionaries Simple fix to prevent a "TypeError: keywords must be strings" error I encountered in my use case. @baskaryan Thanks! Hope useful! --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 12:48:38 -04:00
Prakul Agarwal	38f853dfa3	Fixed typos in MongoDB Atlas Vector Search documentation (#7174 ) Fix for typos in MongoDB Atlas Vector Search documentation <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-05 12:48:00 -04:00
Shuqian	ee1d488c03	fix: rename the invalid function name of GoogleSerperResults Tool for OpenAIFunctionCall (#7176 ) - Description: rename the invalid function name of GoogleSerperResults Tool for OpenAIFunctionCall - Tag maintainer: @hinthornw When I use the GoogleSerperResults in OpenAIFunctionCall agent, the following error occurs: ```shell openai.error.InvalidRequestError: 'Google Serrper Results JSON' does not match '^[a-zA-Z0-9_-]{1,64}$' - 'functions.0.name' ``` So I rename the GoogleSerperResults's property "name" from "Google Serrper Results JSON" to "google_serrper_results_json" just like GoogleSerperRun's name: "google_serper", and it works. I guess this should be reasonable.	2023-07-05 12:47:50 -04:00
Nir Gazit	6666e422c6	fix: missing parameter in POST/PUT/PATCH HTTP requests (#7194 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> @hinthornw --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-05 12:47:30 -04:00
Harrison Chase	8410c6a747	add token max parameter (#7204 )	2023-07-05 12:09:25 -04:00
Harrison Chase	7b585c7585	add tqdm to embeddings (#7205 ) for longer running embeddings, can be helpful to visualize	2023-07-05 12:04:22 -04:00
Raouf Chebri	6fc24743b7	Add pg_hnsw vectorstore integration (#6893 ) Hi @rlancemartin, @eyurtsev! - Description: Adding HNSW extension support for Postgres. Similar to pgvector vectorstore, with 3 differences 1. it uses HNSW extension for exact and ANN searches, 2. Vectors are of type array of real 3. Only supports L2 - Dependencies: [HNSW](https://github.com/knizhnik/hnsw) extension for Postgres - Example: ```python db = HNSWVectoreStore.from_documents( embedding=embeddings, documents=docs, collection_name=collection_name, connection_string=connection_string ) query = "What did the president say about Ketanji Brown Jackson" docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query) ``` The example notebook is in the PR too.	2023-07-05 08:10:10 -07:00
Harrison Chase	79fb90aafd	bump version to 224 (#7203 )	2023-07-05 10:41:26 -04:00
Harrison Chase	1415966d64	propogate token max (#7201 )	2023-07-05 10:25:48 -04:00
Harrison Chase	a94c4cca68	more formatting (#7200 )	2023-07-05 10:03:02 -04:00
Harrison Chase	e18e838aae	fix weird bold issues in docs (#7198 )	2023-07-05 09:52:49 -04:00
Baichuan Sun	e27ba9d92b	fix AmazonAPIGateway _identifying_params (#7167 ) - correct `endpoint_name` to `api_url` - add `headers` <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-04 23:14:51 -04:00
Harrison Chase	39e685b80f	Harrison/conv retrieval docs (#7080 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-04 20:17:43 -04:00
Shuqian	bf9e4ef35f	feat: implement python repl tool arun (#7125 ) Description: implement python repl tool arun Tag maintainer: @agola11	2023-07-04 20:15:49 -04:00
Alex Iribarren	9cfb311ecb	Remove duplicate lines (#7138 ) I believe these two lines are unnecessary, the variable `function_call` is already defined.	2023-07-04 20:13:27 -04:00
volodymyr-memsql	405865c91a	feat(SingleStoreVectorStore): change connection attributes in the database connection (#7142 ) Minor change to the SingleStoreVectorStore: Updated connection attributes names according to the SingleStoreDB recommendations @rlancemartin, @eyurtsev --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>	2023-07-04 20:12:56 -04:00
Hashem Alsaket	c9f696f063	LlamaCppEmbeddings not under langchain.llms (#7164 ) Description: doc string suggests `from langchain.llms import LlamaCppEmbeddings` under `LlamaCpp()` class example but `LlamaCppEmbeddings` is not in `langchain.llms` Issue: None open Tag maintainer: @baskaryan	2023-07-04 19:32:40 -04:00
Harrison Chase	e8531769f7	improve docstring of doc formatting (#7162 ) so it shows up nice	2023-07-04 19:31:29 -04:00
Max Cembalest	2984803597	cleaned Arthur tracking demo notebook (#7147 ) Cleaned title and reduced clutter for integration demo notebook for the Arthur callback handler	2023-07-04 18:15:25 -04:00
Deepankar Mahapatro	da69a6771f	docs: update Jina ecosystem (#7149 ) Documentation update for [Jina ecosystem](https://python.langchain.com/docs/ecosystem/integrations/jina) and `langchain-serve` in the deployments section to latest features. @hwchase17 <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @baskaryan - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @baskaryan - Memory: @hwchase17 - Agents / Tools / Toolkits: @hinthornw - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-04 18:07:50 -04:00
Harrison Chase	b39017dc11	add docstring for in memory class (#7160 )	2023-07-04 14:59:17 -07:00
Bagatur	898087d02c	bump 223 (#7155 )	2023-07-04 14:13:41 -06:00
Harrison Chase	0ad984fa27	Docs combine document chain (#6994 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-04 12:51:04 -06:00
Simon Cheung	81eebc4070	Add HugeGraphQAChain to support gremlin generating chain (#7132 ) [Apache HugeGraph](https://github.com/apache/incubator-hugegraph) is a convenient, efficient, and adaptable graph database, compatible with the Apache TinkerPop3 framework and the Gremlin query language. In this PR, the HugeGraph and HugeGraphQAChain provide the same functionality as the existing integration with Neo4j and enables query generation and question answering over HugeGraph database. The difference is that the graph query language supported by HugeGraph is not cypher but another very popular graph query language [Gremlin](https://tinkerpop.apache.org/gremlin.html). A notebook example and a simple test case have also been added. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-04 10:21:21 -06:00
Saverio Proto	5585607654	Improve Bing Search example (#7128 ) # Description Improve Bing Search example:	2023-07-04 09:58:03 -06:00
Lance Martin	265c285057	Fix GPT4All bug w/ "n_ctx" param (#7093 ) Running `GPT4All` per the [docs](https://python.langchain.com/docs/modules/model_io/models/llms/integrations/gpt4all), I see: ``` $ from langchain.llms import GPT4All $ model = GPT4All(model=local_path) $ model("The capital of France is ", max_tokens=10) TypeError: generate() got an unexpected keyword argument 'n_ctx' ``` It appears `n_ctx` is [no longer a supported param](https://docs.gpt4all.io/gpt4all_python.html#gpt4all.gpt4all.GPT4All.generate) in the GPT4All API from https://github.com/nomic-ai/gpt4all/pull/1090. It now uses `max_tokens`, so I set this. And I also set other defaults used in GPT4All client [here](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-bindings/python/gpt4all/gpt4all.py). Confirm it now works: ``` $ from langchain.llms import GPT4All $ model = GPT4All(model=local_path) $ model("The capital of France is ", max_tokens=10) < Model logging > "....Paris." ``` --------- Co-authored-by: R. Lance Martin <rlm@Rs-MacBook-Pro.local>	2023-07-04 08:53:52 -07:00
Stefano Lottini	6631fd5168	Align cassio versions between examples for Cassandra integration (#7099 ) Just reducing confusion by requiring cassio>=0.0.7 consistently across examples.	2023-07-04 04:21:48 -06:00
Nuno Campos	696886f397	Use serialized format for messages in tracer (#6827 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-04 10:19:08 +01:00
Ruixi Fan	0b69a7e9ab	[Document fix] Fix an expired link qa_benchmarking_pg.ipynb (#7110 ) ## Change description - Description: Fix an expired link that points to the readthedocs site. - Dependencies: No	2023-07-03 19:03:16 -06:00
Lance Martin	9ca4c54428	Minor updates to notebook for MultiQueryRetriever (#7102 ) * Add an easier-to-run example. * Add logging per https://github.com/hwchase17/langchain/pull/6891. * Updated params per https://github.com/hwchase17/langchain/pull/5962. --------- Co-authored-by: R. Lance Martin <rlm@Rs-MacBook-Pro.local> Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-03 17:32:50 -07:00
William FH	dfa48dc3b5	Update sdk version (#7109 )	2023-07-03 16:42:08 -07:00
William FH	04001ff077	Log errors (#7105 ) Re-add change that was inadvertently undone in #6995	2023-07-03 14:47:32 -07:00
William FH	3f9744c9f4	Accept no 'reasoning' response in qa evaluator (#7107 ) Re add since #6995 inadvertently undid #7031	2023-07-03 14:47:17 -07:00
Bagatur	fd3f8efec7	fix retriever signatures (#7097 )	2023-07-03 14:21:36 -06:00
Nicolas	490fcf9d98	docs: New experimental UI for Mendable Search (#6558 ) This PR introduces a new Mendable UI tailored to a better search experience. We're more closely integrating our traditional search with our AI generation. With this change, you won't have to tab back and forth between the mendable bot and the keyword search. Both types of search are handled in the same bar. This should make the docs easier to navigate. while still letting users get code generations or AI-summarized answers if they so wish. Also, it should reduce the cost. Would love to hear your feedback :) Cc: @dev2049 @hwchase17	2023-07-03 20:52:13 +01:00
Nuno Campos	c8f8b1b327	Add events to tracer runs (#7090 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-03 12:43:43 -07:00
genewoo	e49abd1277	Add Metal support to llama.cpp doc (#7092 ) - Description: Add Metal support to llama.cpp doc - Issue: #7091 - Dependencies: N/A - Twitter handle: gene_wu	2023-07-03 13:35:39 -06:00
Bagatur	fad2c7e5e0	update pr tmpl (#7095 )	2023-07-03 13:34:03 -06:00
Nuno Campos	98dbea6310	Add tags to all callback handler methods (#7073 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-07-03 10:39:46 -07:00
Mike Salvatore	d0c7f7c317	Remove `None` default value for FAISS relevance_score_fn (#7085 ) ## Description The type hint for `FAISS.__init__()`'s `relevance_score_fn` parameter allowed the parameter to be set to `None`. However, a default function is provided by the constructor. This led to an unnecessary check in the code, as well as a test to verify this check. ASSUMPTION: There's no reason to ever set `relevance_score_fn` to `None`. This PR changes the type hint and removes the unnecessary code.	2023-07-03 10:11:49 -06:00
Bagatur	719316e84c	bump 222 (#7086 )	2023-07-03 10:03:55 -06:00
rjarun8	e2d61ab85a	Add SpacyEmbeddings class (#6967 ) - Description: Added a new SpacyEmbeddings class for generating embeddings using the Spacy library. - Issue: Sentencebert/Bert/Spacy/Doc2vec embedding support #6952 - Dependencies: This change requires the Spacy library and the 'en_core_web_sm' Spacy model. - Tag maintainer: @dev2049 - Twitter handle: N/A This change includes a new SpacyEmbeddings class, but does not include a test or an example notebook. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-03 09:38:31 -06:00
Leonid Ganeline	16fbd528c5	docs: commented out `editUrl` option (#6440 )	2023-07-03 07:59:11 -07:00
adam91holt	80e86b602e	Remove duplicate mongodb integration doc (#7006 )	2023-07-03 02:23:33 -06:00
joaomsimoes	c669d98693	Update get_started.mdx (#7005 ) typo in chat = ChatOpenAI(open_api_key="...") should be openai_api_key	2023-07-03 02:23:12 -06:00
Bagatur	1cdb33a090	openapi chain nit (#7012 )	2023-07-03 02:22:53 -06:00
Johnny Lim	a081e419a0	Fix sample in FAISS section (#7050 ) This PR fixes a sample in the FAISS section in the reference docs.	2023-07-03 02:18:32 -06:00
Ikko Eltociear Ashimine	be93775ebc	Fix typo in google_places_api.py (#7055 )	2023-07-03 02:14:18 -06:00
Harrison Chase	60b05511d3	move base prompt to schema (#6995 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-02 22:38:59 -04:00
Leonid Ganeline	200be43da6	added `Brave Search` document_loader (#6989 ) - Added `Brave Search` document loader. - Refactored BraveSearch wrapper - Added a Jupyter Notebook example - Added `Ecosystem/Integrations` BraveSearch page Please review: - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev	2023-07-02 19:01:24 -07:00
Sergey Kozlov	6d15854cda	Add JSON Lines support to JSONLoader (#6913 ) Description: The JSON Lines format is used by some services such as OpenAI and HuggingFace. It's also a convenient alternative to CSV. This PR adds JSON Lines support to `JSONLoader` and also updates related tests. Tag maintainer: @rlancemartin, @eyurtsev. PS I was not able to build docs locally so didn't update related section.	2023-07-02 12:32:41 -07:00
Ofer Mendelevitch	153b56d19b	Vectara upd2 (#6506 ) Update to Vectara integration - By user request added "add_files" to take advantage of Vectara capabilities to process files on the backend, without the need for separate loading of documents and chunking in the chain. - Updated vectara.ipynb example notebook to be broader and added testing of add_file() @hwchase17 - project lead --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-07-02 12:15:50 -07:00
Leonid Ganeline	1feac83323	docstrings `document_loaders` 2 (#6890 ) updated docstring for the `document_loaders` Maintainer responsibilities: - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev	2023-07-02 12:14:22 -07:00
Leonid Ganeline	77ae8084a0	docstrings `document_loaders` 1 (#6847 ) - Updated docstrings in `document_loaders` - several code fixes. - added `docs/extras/ecosystem/integrations/airtable.md` @rlancemartin, @eyurtsev	2023-07-02 12:13:04 -07:00
0xcha05	e41b382e1c	Added filter and delete all option to delete function in Pinecone integration, updated base VectorStore's delete function (#6876 ) ### Description: Updated the delete function in the Pinecone integration to allow for deletion of vectors by specifying a filter condition, and to delete all vectors in a namespace. Made the ids parameter optional in the delete function in the base VectorStore class and allowed for additional keyword arguments. Updated the delete function in several classes (Redis, Chroma, Supabase, Deeplake, Elastic, Weaviate, and Cassandra) to match the changes made in the base VectorStore class. This involved making the ids parameter optional and allowing for additional keyword arguments.	2023-07-02 11:46:19 -07:00
Bagatur	5a45363954	bump 221 (#7047 )	2023-07-02 08:32:15 -06:00
Bagatur	7acd524210	Rm retriever kwargs (#7013 ) Doesn't actually limit the Retriever interface but hopefully in practice it does	2023-07-02 08:22:24 -06:00
Johnny Lim	9dc77614e3	Polish reference docs (#7045 ) This PR fixes broken links in the reference docs.	2023-07-02 08:08:51 -06:00
skspark	e5f6f0ffc4	Support params on GoogleSearchApiWrapper (#6810 ) (#7014 ) ## Description Support search params in GoogleSearchApiWrapper's result call, for the extra filtering on search, to support extra query parameters that google cse provides: https://developers.google.com/custom-search/v1/reference/rest/v1/cse/list?hl=ko ## Issue #6810	2023-07-02 01:18:38 -06:00
Johnny Lim	052c797429	Fix typo (#7023 ) This PR fixes a typo.	2023-07-02 01:17:30 -06:00
Alex Iribarren	dc2264619a	Fix openai multi functions agent docs (#7028 )	2023-07-02 01:16:40 -06:00
William FH	6a64870ea0	Accept no 'reasoning' response in qa evaluator (#7030 )	2023-07-01 12:46:19 -07:00
William FH	7ebb76a5fa	Log Errors in Evaluator Callback (#7031 )	2023-07-01 12:10:00 -07:00
Stefano Lottini	8d2281a8ca	Second Attempt - Add concurrent insertion of vector rows in the Cassandra Vector Store (#7017 ) Retrying with the same improvements as in #6772, this time trying not to mess up with branches. @rlancemartin doing a fresh new PR from a branch with a new name. This should do. Thank you for your help! --------- Co-authored-by: Jonathan Ellis <jbellis@datastax.com> Co-authored-by: rlm <pexpresss31@gmail.com>	2023-07-01 11:09:52 -07:00
Harrison Chase	3bfe7cf467	Harrison/split schema dir (#7025 ) should be no functional changes also keep __init__ exposing a lot for backwards compat --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-01 13:39:19 -04:00
Davis Chase	556c425042	Improve docstrings for langchain.schema.py (#6802 ) Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-01 09:46:52 -07:00
Matt Robinson	0498dad562	feat: enable `UnstructuredEmailLoader` to process attachments (#6977 ) ### Summary Updates `UnstructuredEmailLoader` so that it can process attachments in addition to the e-mail content. The loader will process attachments if the `process_attachments` kwarg is passed when the loader is instantiated. ### Testing ```python file_path = "fake-email-attachment.eml" loader = UnstructuredEmailLoader( file_path, mode="elements", process_attachments=True ) docs = loader.load() docs[-1] ``` ### Reviewers - @rlancemartin - @eyurtsev - @hwchase17	2023-07-01 06:09:26 -07:00
Matthew Foster Walsh	59697b406d	Fix typo in quickstart.mdx (#6985 ) Removed an extra "to" from a sentence. @dev2049 very minor documentation fix.	2023-07-01 02:53:52 -06:00
Paul Grillenberger	aa37b10b28	Fix: Correct typo (#6988 ) Description: Correct a minor typo in the docs. @dev2049	2023-07-01 02:53:34 -06:00
Zander Chase	b0859c9b18	Add New Retriever Interface with Callbacks (#5962 ) Handle the new retriever events in a way that (I think) is entirely backwards compatible? Needs more testing for some of the chain changes and all. This creates an entire new run type, however. We could also just treat this as an event within a chain run presumably (same with memory) Adds a subclass initializer that upgrades old retriever implementations to the new schema, along with tests to ensure they work. First commit doesn't upgrade any of our retriever implementations (to show that we can pass the tests along with additional ones testing the upgrade logic). Second commit upgrades the known universe of retrievers in langchain. - [X] Add callback handling methods for retriever start/end/error (open to renaming to 'retrieval' if you want that) - [X] Update BaseRetriever schema to support callbacks - [X] Tests for upgrading old "v1" retrievers for backwards compatibility - [X] Update existing retriever implementations to implement the new interface - [X] Update calls within chains to .{a]get_relevant_documents to pass the child callback manager - [X] Update the notebooks/docs to reflect the new interface - [X] Test notebooks thoroughly Not handled: - Memory pass throughs: retrieval memory doesn't have a parent callback manager passed through the method --------- Co-authored-by: Nuno Campos <nuno@boringbits.io> Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>	2023-06-30 14:44:03 -07:00
William FH	a5b206caf3	Remove Promptlayer Notebook (#6996 ) It's breaking our docs build	2023-06-30 14:30:24 -07:00
Daniel Chalef	b26cca8008	Zep Authentication (#6728 ) ## Description: Add Zep API Key argument to ZepChatMessageHistory and ZepRetriever - correct docs site links - add zep api_key auth to constructors ZepChatMessageHistory: @hwchase17, ZepRetriever: @rlancemartin, @eyurtsev	2023-06-30 14:24:26 -07:00
William FH	e4625846e5	Add Flyte Callback Handler (#6139 ) (#6986 ) Signed-off-by: Samhita Alla <aallasamhita@gmail.com> Co-authored-by: Samhita Alla <aallasamhita@gmail.com>	2023-06-30 12:25:22 -07:00
Bagatur	e3b7effc8f	Beef up import test (#6979 )	2023-06-30 09:26:05 -07:00
Bagatur	1ce9ef3828	Rm pytz dep (#6978 )	2023-06-30 09:24:01 -07:00
Davis Chase	eb180e321f	Page per class-style api reference (#6560 ) can make it prettier, but what do we think of overall structure? https://api.python.langchain.com/en/dev2049-page_per_class/api_ref.html --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-06-30 09:23:32 -07:00
William FH	64039b9f11	Promptlayer Callback (#6975 ) Co-authored-by: Saleh Hindi <saleh.hindi.one@gmail.com> Co-authored-by: jped <jonathanped@gmail.com>	2023-06-30 08:32:42 -07:00
William FH	13c62cf6b1	Arthur Callback (#6972 ) Co-authored-by: Max Cembalest <115359769+arthuractivemodeling@users.noreply.github.com>	2023-06-30 07:48:02 -07:00
William FH	8c73037dff	Simplify eval arg names (#6944 ) It'll be easier to switch between these if the names of predictions are consistent	2023-06-30 07:47:53 -07:00
Bagatur	8f5eca236f	release v220 (#6962 )	2023-06-30 06:52:09 -07:00
Bagatur	60b0d6ea35	Bagatur/openllm ensure available (#6960 ) Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> Co-authored-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-30 00:54:23 -07:00
Siraj Aizlewood	521c6f0233	Provided default values for tags and inheritable_tags args in BaseRun… (#6858 ) when running AsyncCallbackManagerForChainRun (from langchain.callbacks.manager import AsyncCallbackManagerForChainRun), provided default values for tags and inheritable_tages of empty lists in manager.py BaseRunManager. - Description: In manager.py, `BaseRunManager`, default values were provided for the `__init__` args `tags` and `inheritable_tags`. They default to empty lists (`[]`). - Issue: When trying to use Nvidia NeMo Guardrails with LangChain, the following exception was raised:	2023-06-29 22:01:08 -07:00
Davis Chase	bd6a0ee9e9	Redirect vecstores (#6948 )	2023-06-29 19:22:21 -07:00
Davis Chase	f780678910	Add back in clickhouse mongo vecstore notebooks (#6949 )	2023-06-29 19:21:47 -07:00
Jacob Lee	73831ef3d8	Change code block color scheme (#6945 ) Adds contrast, makes code blocks more readable.	2023-06-29 19:21:11 -07:00
Tahjyei Thompson	7d8830f707	Add `OpenAIMultiFunctionsAgent` to import list in agents directory (#6824 ) - Added OpenAIMultiFunctionsAgent to the import list of the Agents directory --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-29 18:34:26 -07:00
Matt Florence	0f6737735d	Order messages in PostgresChatMessageHistory (#6830 ) Fixes issue: https://github.com/hwchase17/langchain/issues/6829 This guarantees message history is in the correct order. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-29 18:10:28 -07:00
lucasiscovici	e9950392dd	Add password to PyPDR loader and parser (#6908 ) Add password to PyPDR loader and parser --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-29 17:35:50 -07:00
Zander Chase	429f4dbe4d	Add Input Mapper in run_on_dataset (#6894 ) If you create a dataset from runs and run the same chain or llm on it later, it usually works great. If you have an agent dataset and want to run a different agent on it, or have more complex schema, it's hard for us to automatically map these values every time. This PR lets you pass in an input_mapper function that converts the example inputs to whatever format your model expects	2023-06-29 16:53:49 -07:00
Lei Pan	76d03f398d	support max_chunk_bytes in OpensearchVectorSearch to pass down to bulk (#6855 ) Support `max_chunk_bytes` kwargs to pass down to `buik` helper, in order to support the request limits in Opensearch locally and in AWS. @rlancemartin, @eyurtsev	2023-06-29 15:50:08 -07:00
Hashem Alsaket	5861770a53	Updated QA notebook (#6801 ) Description: `all_metadatas` was not defined, `OpenAIEmbeddings` was not imported, Issue: #6723 the issue # it fixes (if applicable), Dependencies: lark, Tag maintainer: @vowelparrot , @dev2049 --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-29 15:41:53 -07:00
Kacper Łukawski	140ba682f1	Support named vectors in Qdrant (#6871 ) # Description This PR makes it possible to use named vectors from Qdrant in Langchain. That was requested multiple times, as people want to reuse externally created collections in Langchain. It doesn't change anything for the existing applications. The changes were covered with some integration tests and included in the docs. ## Example ```python Qdrant.from_documents( docs, embeddings, location=":memory:", collection_name="my_documents", vector_name="custom_vector", ) ``` ### Issue: #2594 Tagging @rlancemartin & @eyurtsev. I'd appreciate your review.	2023-06-29 15:14:22 -07:00
bradcrossen	9ca1cf003c	Re-add Support for SQLAlchemy <1.4 (#6895 ) Support for SQLAlchemy 1.3 was removed in version 0.0.203 by change #6086. Re-adding support. - Description: Imports SQLAlchemy Row at class creation time instead of at init to support SQLAlchemy <1.4. This is the only breaking change and was introduced in version 0.0.203 #6086. A similar change was merged before: https://github.com/hwchase17/langchain/pull/4647 - Dependencies: Reduces SQLAlchemy dependency to > 1.3 - Tag maintainer: @rlancemartin, @eyurtsev, @hwchase17, @wangxuqi --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-29 14:49:35 -07:00
corranmac	20c6ade2fc	Grobid parser for Scientific Articles from PDF (#6729 ) ### Scientific Article PDF Parsing via Grobid `Description:` This change adds the GrobidParser class, which uses the Grobid library to parse scientific articles into a universal XML format containing the article title, references, sections, section text etc. The GrobidParser uses a local Grobid server to return PDFs document as XML and parses the XML to optionally produce documents of individual sentences or of whole paragraphs. Metadata includes the text, paragraph number, pdf relative bboxes, pages (text may overlap over two pages), section title (Introduction, Methodology etc), section_number (i.e 1.1, 2.3), the title of the paper and finally the file path. Grobid parsing is useful beyond standard pdf parsing as it accurately outputs sections and paragraphs within them. This allows for post-fitering of results for specific sections i.e. limiting results to the methodology section or results. While sections are split via headings, ideally they could be classified specifically into introduction, methodology, results, discussion, conclusion. I'm currently experimenting with chatgpt-3.5 for this function, which could later be implemented as a textsplitter. `Dependencies:` For use, the grobid repo must be cloned and Java must be installed, for colab this is: ``` !apt-get install -y openjdk-11-jdk -q !update-alternatives --set java /usr/lib/jvm/java-11-openjdk-amd64/bin/java !git clone https://github.com/kermitt2/grobid.git os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-11-openjdk-amd64" os.chdir('grobid') !./gradlew clean install ``` Once installed the server is ran on localhost:8070 via ``` get_ipython().system_raw('nohup ./gradlew run > grobid.log 2>&1 &') ``` @rlancemartin, @eyurtsev Twitter Handle: @Corranmac Grobid Demo Notebook is [here](https://colab.research.google.com/drive/1X-St_mQRmmm8YWtct_tcJNtoktbdGBmd?usp=sharing). --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-29 14:29:29 -07:00
Baichuan Sun	6157bdf9d9	Add API Header for Amazon API Gateway Authentication (#6902 ) Add API Headers support for Amazon API Gateway to enable Authentication using DynamoDB. <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-06-29 12:58:07 -07:00
Wey Gu	1c66aa6d56	chore: NebulaGraph prompt optmization (#6904 ) Was preparing for a demo project of NebulaGraphQAChain to find out the prompt needed to be optimized a little bit. Please @hwchase17 kindly help review. Thanks!	2023-06-29 12:57:39 -07:00
Harrison Chase	0ba175e13f	move octo notebook (#6901 )	2023-06-29 12:20:55 -07:00
Stefano Lottini	75fb9d2fdc	Cassandra support for chat history using CassIO library (#6771 ) ### Overview This PR aims at building on #4378, expanding the capabilities and building on top of the `cassIO` library to interface with the database (as opposed to using the core drivers directly). Usage of `cassIO` (a library abstracting Cassandra access for ML/GenAI-specific purposes) is already established since #6426 was merged, so no new dependencies are introduced. In the same spirit, we try to uniform the interface for using Cassandra instances throughout LangChain: all our appreciation of the work by @jj701 notwithstanding, who paved the way for this incremental work (thank you!), we identified a few reasons for changing the way a `CassandraChatMessageHistory` is instantiated. Advocating a syntax change is something we don't take lighthearted way, so we add some explanations about this below. Additionally, this PR expands on integration testing, enables use of Cassandra's native Time-to-Live (TTL) features and improves the phrasing around the notebook example and the short "integrations" documentation paragraph. We would kindly request @hwchase to review (since this is an elaboration and proposed improvement of #4378 who had the same reviewer). ### About the __init__ breaking changes There are [many](https://docs.datastax.com/en/developer/python-driver/3.28/api/cassandra/cluster/) options when creating the `Cluster` object, and new ones might be added at any time. Choosing some of them and exposing them as `__init__` parameters `CassandraChatMessageHistory` will prove to be insufficient for at least some users. On the other hand, working through `kwargs` or adding a long, long list of arguments to `__init__` is not a desirable option either. For this reason, (as done in #6426), we propose that whoever instantiates the Chat Message History class provide a Cassandra `Session` object, ready to use. This also enables easier injection of mocks and usage of Cassandra-compatible connections (such as those to the cloud database DataStax Astra DB, obtained with a different set of init parameters than `contact_points` and `port`). We feel that a breaking change might still be acceptable since LangChain is at `0.*`. However, while maintaining that the approach we propose will be more flexible in the future, room could be made for a "compatibility layer" that respects the current init method. Honestly, we would to that only if there are strong reasons for it, as that would entail an additional maintenance burden. ### Other changes We propose to remove the keyspace creation from the class code for two reasons: first, production Cassandra instances often employ RBAC so that the database user reading/writing from tables does not necessarily (and generally shouldn't) have permission to create keyspaces, and second that programmatic keyspace creation is not a best practice (it should be done more or less manually, with extra care about schema mismatched among nodes, etc). Removing this (usually unnecessary) operation from the `__init__` path would also improve initialization performance (shorter time). We suggest, likewise, to remove the `__del__` method (which would close the database connection), for the following reason: it is the recommended best practice to create a single Cassandra `Session` object throughout an application (it is a resource-heavy object capable to handle concurrency internally), so in case Cassandra is used in other ways by the app there is the risk of truncating the connection for all usages when the history instance is destroyed. Moreover, the `Session` object, in typical applications, is best left to garbage-collect itself automatically. As mentioned above, we defer the actual database I/O to the `cassIO` library, which is designed to encode practices optimized for LLM applications (among other) without the need to expose LangChain developers to the internals of CQL (Cassandra Query Language). CassIO is already employed by the LangChain's Vector Store support for Cassandra. We added a few more connection options in the companion notebook example (most notably, Astra DB) to encourage usage by anyone who cannot run their own Cassandra cluster. We surface the `ttl_seconds` option for automatic handling of an expiration time to chat history messages, a likely useful feature given that very old messages generally may lose their importance. We elaborated a bit more on the integration testing (Time-to-live, separation of "session ids", ...). ### Remarks from linter & co. We reinstated `cassio` as a dependency both in the "optional" group and in the "integration testing" group of `pyproject.toml`. This might not be the right thing do to, in which case the author of this PR offer his apologies (lack of confidence with Poetry - happy to be pointed in the right direction, though!). During linter tests, we were hit by some errors which appear unrelated to the code in the PR. We left them here and report on them here for awareness: ``` langchain/vectorstores/mongodb_atlas.py:137: error: Argument 1 to "insert_many" of "Collection" has incompatible type "List[Dict[str, Sequence[object]]]"; expected "Iterable[Union[MongoDBDocumentType, RawBSONDocument]]" [arg-type] langchain/vectorstores/mongodb_atlas.py:186: error: Argument 1 to "aggregate" of "Collection" has incompatible type "List[object]"; expected "Sequence[Mapping[str, Any]]" [arg-type] langchain/vectorstores/qdrant.py:16: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:19: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:20: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:22: error: Name "grpc" is not defined [name-defined] langchain/vectorstores/qdrant.py:23: error: Name "grpc" is not defined [name-defined] ``` In the same spirit, we observe that to even get `import langchain` run, it seems that a `pip install bs4` is missing from the minimal package installation path. Thank you!	2023-06-29 10:50:34 -07:00
Zander Chase	f5663603cf	Throw error if evaluation key not present (#6874 )	2023-06-29 10:30:39 -07:00
Zander Chase	be164b20d8	Accept any single input (#6888 ) If I upload a dataset with a single input and output column, we should be able to let the chain prepare the input without having to maintain a strict dataset format.	2023-06-29 10:29:16 -07:00
Harrison Chase	8502117f62	bump version to 219 (#6899 )	2023-06-28 23:48:42 -07:00
Pablo	6370808d41	Adding support for async (_acall) for VertexAICommon LLM (#5588 ) # Adding support for async (_acall) for VertexAICommon LLM This PR implements the `_acall` method under `_VertexAICommon`. Because VertexAI itself does not provide an async interface, I implemented it via a ThreadPoolExecutor that can delegate execution of VertexAI calls to other threads. Twitter handle: @polecitoem : ) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: fyi - @agola11 for async functionality fyi - @Ark-kun from VertexAI	2023-06-28 23:07:41 -07:00
Mike Salvatore	cbd759aaeb	Fix inconsistent logging_and_data_dir parameter in AwaDB (#6775 ) ## Description Tag maintainer: @rlancemartin, @eyurtsev ### log_and_data_dir `AwaDB.__init__()` accepts a parameter named `log_and_data_dir`. But `AwaDB.from_texts()` and `AwaDB.from_documents()` accept a parameter named `logging_and_data_dir`. This inconsistency in this parameter name can lead to confusion on the part of the caller. This PR renames `logging_and_data_dir` to `log_and_data_dir` to make all functions consistent with the constructor. ### embedding `AwaDB.__init__()` accepts a parameter named `embedding_model`. But `AwaDB.from_texts()` and `AwaDB.from_documents()` accept a parameter named `embeddings`. This inconsistency in this parameter name can lead to confusion on the part of the caller. This PR renames `embedding_model` to `embeddings` to make AwaDB's constructor consistent with the classmethod "constructors" as specified by `VectorStore` abstract base class.	2023-06-28 23:06:52 -07:00
Harrison Chase	3ac08c3de4	Harrison/octo ml (#6897 ) Co-authored-by: Bassem Yacoube <125713079+AI-Bassem@users.noreply.github.com> Co-authored-by: Shotaro Kohama <khmshtr28@gmail.com> Co-authored-by: Rian Dolphin <34861538+rian-dolphin@users.noreply.github.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Shashank Deshpande <shashankdeshpande18@gmail.com>	2023-06-28 23:04:11 -07:00
Jiří Moravčík	a6b40b73e5	Add `call_actor_task` to the Apify integration (#6862 ) A user has been testing the Apify integration inside langchain and he was not able to run saved Actor tasks. This PR adds support for calling saved Actor tasks on the Apify platform to the existing integration. The structure of very similar to the one of calling Actors.	2023-06-28 22:13:47 -07:00
Shashank Deshpande	99cfe192da	added example notebook - use custom functions with openai agent (#6865 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-06-28 22:07:33 -07:00
Rian Dolphin	2e39ede848	add with score option for max marginal relevance (#6867 ) ### Adding the functionality to return the scores with retrieved documents when using the max marginal relevance - Description: Add the method `max_marginal_relevance_search_with_score_by_vector` to the FAISS wrapper. Functionality operates the same as `similarity_search_with_score_by_vector` except for using the max marginal relevance retrieval framework like is used in the `max_marginal_relevance_search_by_vector` method. - Dependencies: None - Tag maintainer: @rlancemartin @eyurtsev - Twitter handle: @RianDolphin --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-28 22:00:34 -07:00
Shotaro Kohama	398e4cd2dc	Update `langchain.chains.create_extraction_chain_pydantic` to parse results successfully (#6887 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> - Description: - The current code uses `PydanticSchema.schema()` and `_get_extraction_function` at the same time. As a result, a response from OpenAI has two nested `info`, and `PydanticAttrOutputFunctionsParser` fails to parse it. This PR will use the pydantic class given as an arg instead. - Issue: no related issue yet - Dependencies: no dependency change - Tag maintainer: @dev2049 - Twitter handle: @shotarok28	2023-06-28 21:57:41 -07:00
Eduard van Valkenburg	57f370cde9	PowerBI Toolkit additional logs (#6881 ) Added some additional logs to better be able to troubleshoot and understand the performance of the call to PBI vs the rest of the work.	2023-06-28 18:16:41 -07:00
Robert Lewis	c9c8d2599e	Update Zapier Jupyter notebook to include brief OAuth example (#6892 ) Description: Adds a brief example of using an OAuth access token with the Zapier wrapper. Also links to the Zapier documentation to learn more about OAuth flows. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-28 18:06:22 -07:00
Zhicheng Geng	16b11bda83	Use `getLogger` instead of `basicConfig` in `multi_query.py` (#6891 ) Remove `logging.basicConfig`, which turns on logging. Use `getLogger` instead	2023-06-28 18:06:10 -07:00
Davis Chase	f07dd02b50	Docs /redirects (#6790 ) Auto-generated a bunch of redirects from initial docs refactor commit	2023-06-28 17:07:53 -07:00
Harrison Chase	e5611565b7	bump version to 218 (#6857 )	2023-06-27 23:36:37 -07:00
Yaohui Wang	9d1bd18596	feat (documents): add LarkSuite document loader (#6420 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> ### Summary This PR adds a LarkSuite (FeiShu) document loader. > [LarkSuite](https://www.larksuite.com/) is an enterprise collaboration platform developed by ByteDance. ### Tests - an integration test case is added - an example notebook showing usage is added. [Notebook preview](https://github.com/yaohui-wyh/langchain/blob/master/docs/extras/modules/data_connection/document_loaders/integrations/larksuite.ipynb) <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ### Who can review? - PTAL @eyurtsev @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Yaohui Wang <wangyaohui.01@bytedance.com>	2023-06-27 23:08:05 -07:00
Jingsong Gao	a435a436c1	feat(document_loaders): add tencent cos directory and file loader (#6401 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> - add tencent cos directory and file support for document-loader #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? @eyurtsev	2023-06-27 23:07:20 -07:00
Ninely	d6cd0deaef	feat: Add streaming only final aiter of agent (#6274 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> #### Add streaming only final async iterator of agent This callback returns an async iterator and only streams the final output of an agent. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-27 23:06:25 -07:00
Shashank Deshpande	1db266b20d	Update link in apis.mdx (#6812 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md -->	2023-06-27 23:00:26 -07:00
Lance Martin	3f9900a864	Create MultiQueryRetriever (#6833 ) Distance-based vector database retrieval embeds (represents) queries in high-dimensional space and finds similar embedded documents based on "distance". But, retrieval may produce difference results with subtle changes in query wording or if the embeddings do not capture the semantics of the data well. Prompt engineering / tuning is sometimes done to manually address these problems, but can be tedious. The `MultiQueryRetriever` automates the process of prompt tuning by using an LLM to generate multiple queries from different perspectives for a given user input query. For each query, it retrieves a set of relevant documents and takes the unique union across all queries to get a larger set of potentially relevant documents. By generating multiple perspectives on the same question, the `MultiQueryRetriever` might be able to overcome some of the limitations of the distance-based retrieval and get a richer set of results. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-27 22:59:40 -07:00
Tim Asp	3ca1a387c2	Web Loader: Add proxy support (#6792 ) Proxies are helpful, especially when you start querying against more anti-bot websites. [Proxy services](https://developers.oxylabs.io/advanced-proxy-solutions/web-unblocker/making-requests) (of which there are many) and `requests` make it easy to rotate IPs to prevent banning by just passing along a simple dict to `requests`. CC @rlancemartin, @eyurtsev	2023-06-27 22:27:49 -07:00
Ayan Bandyopadhyay	f92ccf70fd	Update to the latest Psychic python library version (#6804 ) Update the Psychic document loader to use the latest `psychicapi` python library version: `0.8.0`	2023-06-27 22:26:38 -07:00
Hun-soo Jung	f3d178f600	Specify utilities package in SerpAPIWrapper docstring (#6821 ) - Description: Specify utilities package in SerpAPIWrapper docstring - Issue: Not an issue - Dependencies: (n/a) - Tag maintainer: @dev2049 - Twitter handle: (n/a)	2023-06-27 22:26:20 -07:00
Matt Robinson	dd2a151543	Docs/unstructured api key (#6781 ) ### Summary The Unstructured API will soon begin requiring API keys. This PR updates the Unstructured integrations docs with instructions on how to generate Unstructured API keys. ### Reviewers @rlancemartin @eyurtsev @hwchase17	2023-06-27 16:54:15 -07:00
Matthew Plachter	d6664af0ee	add async to zapier nla tools (#6791 ) Replace this comment with: - Description: Add Async functionality to Zapier NLA Tools - Issue: n/a - Dependencies: n/a - Tag maintainer: Maintainer responsibilities: - Agents / Tools / Toolkits: @vowelparrot - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md	2023-06-27 16:53:35 -07:00
Neil Neuwirth	efe0d39c6a	Adjusted OpenAI cost calculation (#6798 ) Added parentheses to ensure the division operation is performed before multiplication. This now correctly calculates the cost by dividing the number of tokens by 1000 first (to get the cost per token), and then multiplies it with the model's cost per 1k tokens @agola11	2023-06-27 16:53:06 -07:00
Ian	b4c196f785	fix pinecone delete bug (#6816 ) The implementation of delete in pinecone vector omits the namespace, which will cause delete failed	2023-06-27 16:50:17 -07:00
Janos Tolgyesi	f1070de038	WebBaseLoader: optionally raise exception in the case of http error (#6823 ) - Description: this PR adds the possibility to raise an exception in the case the http request did not return a 2xx status code. This is particularly useful in the situation when the url points to a non-existent web page, the server returns a http status of 404 NOT FOUND, but WebBaseLoader anyway parses and returns the http body of the error message. - Dependencies: none, - Tag maintainer: @rlancemartin, @eyurtsev, - Twitter handle: jtolgyesi	2023-06-27 16:43:59 -07:00
rafael	ef72a7cf26	rail_parser: Allow creation from pydantic (#6832 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> Adds a way to create the guardrails output parser from a pydantic model.	2023-06-27 16:40:52 -07:00
Augustine Theodore	a980095efc	Enhancement : Ignore deleted messages and media in WhatsAppChatLoader (#6839 ) - Description: Ignore deleted messages and media - Issue: #6838 - Dependencies: No new dependencies - Tag maintainer: @rlancemartin, @eyurtsev	2023-06-27 16:36:55 -07:00
Robert Lewis	74848aafea	Zapier - Add better error messaging for 401 responses (#6840 ) Description: When a 401 response is given back by Zapier, hint to the end user why that may have occurred - If an API Key was initialized with the wrapper, ask them to check their API Key value - if an access token was initialized with the wrapper, ask them to check their access token or verify that it doesn't need to be refreshed. Tag maintainer: @dev2049	2023-06-27 16:35:42 -07:00
Matt Robinson	b24472eae3	feat: Add `UnstructuredOrgModeLoader` (#6842 ) ### Summary Adds `UnstructuredOrgModeLoader` for processing [Org-mode](https://en.wikipedia.org/wiki/Org-mode) documents. ### Testing ```python from langchain.document_loaders import UnstructuredOrgModeLoader loader = UnstructuredOrgModeLoader( file_path="example_data/README.org", mode="elements" ) docs = loader.load() print(docs[0]) ``` ### Reviewers - @rlancemartin - @eyurtsev - @hwchase17	2023-06-27 16:34:17 -07:00
Piyush Jain	e53995836a	Added missing attribute value object (#6849 ) ## Description Adds a missing type class for [AdditionalResultAttributeValue](https://docs.aws.amazon.com/kendra/latest/APIReference/API_AdditionalResultAttributeValue.html). Fixes validation failure for the query API that have `AdditionalAttributes` in the response. cc @dev2049 cc @zhichenggeng	2023-06-27 16:30:11 -07:00
Cristóbal Carnero Liñán	e494b0a09f	feat (documents): add a source code loader based on AST manipulation (#6486 ) #### Summary A new approach to loading source code is implemented: Each top-level function and class in the code is loaded into separate documents. Then, an additional document is created with the top-level code, but without the already loaded functions and classes. This could improve the accuracy of QA chains over source code. For instance, having this script: ``` class MyClass: def __init__(self, name): self.name = name def greet(self): print(f"Hello, {self.name}!") def main(): name = input("Enter your name: ") obj = MyClass(name) obj.greet() if __name__ == '__main__': main() ``` The loader will create three documents with this content: First document: ``` class MyClass: def __init__(self, name): self.name = name def greet(self): print(f"Hello, {self.name}!") ``` Second document: ``` def main(): name = input("Enter your name: ") obj = MyClass(name) obj.greet() ``` Third document: ``` # Code for: class MyClass: # Code for: def main(): if __name__ == '__main__': main() ``` A threshold parameter is added to control whether small scripts are split in this way or not. At this moment, only Python and JavaScript are supported. The appropriate parser is determined by examining the file extension. #### Tests This PR adds: - Unit tests - Integration tests #### Dependencies Only one dependency was added as optional (needed for the JavaScript parser). #### Documentation A notebook is added showing how the loader can be used. #### Who can review? @eyurtsev @hwchase17 --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-27 15:58:47 -07:00
Robert Lewis	da462d9dd4	Zapier update oauth support (#6780 ) Description: Update documentation to 1) point to updated documentation links at Zapier.com (we've revamped our help docs and paths), and 2) To provide clarity how to use the wrapper with an access token for OAuth support Demo: Initializing the Zapier Wrapper with an OAuth Access Token `ZapierNLAWrapper(zapier_nla_oauth_access_token="<redacted>")` Using LangChain to resolve the current weather in Vancouver BC leveraging Zapier NLA to lookup weather by coords. ``` > Entering new chain... I need to use a tool to get the current weather. Action: The Weather: Get Current Weather Action Input: Get the current weather for Vancouver BC Observation: {"coord__lon": -123.1207, "coord__lat": 49.2827, "weather": [{"id": 802, "main": "Clouds", "description": "scattered clouds", "icon": "03d", "icon_url": "http://openweathermap.org/img/wn/03d@2x.png"}], "weather[]icon_url": ["http://openweathermap.org/img/wn/03d@2x.png"], "weather[]icon": ["03d"], "weather[]id": [802], "weather[]description": ["scattered clouds"], "weather[]main": ["Clouds"], "base": "stations", "main__temp": 71.69, "main__feels_like": 71.56, "main__temp_min": 67.64, "main__temp_max": 76.39, "main__pressure": 1015, "main__humidity": 64, "visibility": 10000, "wind__speed": 3, "wind__deg": 155, "wind__gust": 11.01, "clouds__all": 41, "dt": 1687806607, "sys__type": 2, "sys__id": 2011597, "sys__country": "CA", "sys__sunrise": 1687781297, "sys__sunset": 1687839730, "timezone": -25200, "id": 6173331, "name": "Vancouver", "cod": 200, "summary": "scattered clouds", "_zap_search_was_found_status": true} Thought: I now know the current weather in Vancouver BC. Final Answer: The current weather in Vancouver BC is scattered clouds with a temperature of 71.69 and wind speed of 3 ```	2023-06-27 11:46:32 -07:00
Joshua Carroll	24e4ae95ba	Initial Streamlit callback integration doc (md) (#6788 ) Description: Add a documentation page for the Streamlit Callback Handler integration (#6315) Notes: - Implemented as a markdown file instead of a notebook since example code runs in a Streamlit app (happy to discuss / consider alternatives now or later) - Contains an embedded Streamlit app -> https://mrkl-minimal.streamlit.app/ Currently this app is hosted out of a Streamlit repo but we're working to migrate the code to a LangChain owned repo ![streamlit_docs](https://github.com/hwchase17/langchain/assets/116604821/0b7a6239-361f-470c-8539-f22c40098d1a) cc @dev2049 @tconkling	2023-06-27 11:43:49 -07:00
Harrison Chase	8392ca602c	bump version to 217 (#6831 )	2023-06-27 09:39:56 -07:00
Ismail Pelaseyed	fcb3a64799	Add support for passing headers and search params to openai openapi chain (#6782 ) - Description: add support for passing headers and search params to OpenAI OpenAPI chains. - Issue: n/a - Dependencies: n/a - Tag maintainer: @hwchase17 - Twitter handle: @pelaseyed --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-27 09:09:03 -07:00
Zander Chase	e1fdb67440	Update description in Evals notebook (#6808 )	2023-06-27 00:26:49 -07:00
Zander Chase	ad028bbb80	Permit Constitutional Principles (#6807 ) In the criteria evaluator.	2023-06-27 00:23:54 -07:00
Zander Chase	6ca383ecf6	Update to RunOnDataset helper functions to accept evaluator callbacks (#6629 ) Also improve docstrings and update the tracing datasets notebook to focus on "debug, evaluate, monitor"	2023-06-26 23:58:13 -07:00
WaseemH	7ac9b22886	`RecusiveUrlLoader` to `RecursiveUrlLoader` (#6787 )	2023-06-26 23:12:14 -07:00
Mshoven	4535b0b41e	🎯Bug: format the url and path_params (#6755 ) - Description: format the url and path_params correctly, - Issue: #6753, - Dependencies: None, - Tag maintainer: @vowelparrot, - Twitter handle: @0xbluesecurity	2023-06-26 23:03:57 -07:00
Zander Chase	07d802d088	Don't raise error if parent not found (#6538 ) Done so that you can pass in a run from the low level api	2023-06-26 22:57:52 -07:00
Leonid Ganeline	49c864fa18	docs: vectorstore upgrades 2 (#6796 ) updated vectorstores/ notebooks; added new integrations into ecosystem/integrations/ @dev2049 @rlancemartin, @eyurtsev	2023-06-26 22:55:04 -07:00
Zander Chase	d7dbf4aefe	Clean up agent trajectory interface (#6799 ) - Enable reference - Enable not specifying tools at the start - Add methods with keywords	2023-06-26 22:54:04 -07:00
Zander Chase	cc60fed3be	Add a Pairwise Comparison Chain (#6703 ) Notebook shows preference scoring between two chains and reports wilson score interval + p value I think I'll add the option to insert ground truth labels but doesn't have to be in this PR	2023-06-26 20:47:41 -07:00
Hakan Tekgul	2928b080f6	Update arize_callback.py - bug fix (#6784 ) - Description: Bug Fix - Added a step variable to keep track of prompts - Issue: Bug from internal Arize testing - The prompts and responses that are ingested were not mapped correctly - Dependencies: N/A	2023-06-26 16:49:46 -07:00
Zander Chase	c460b04c64	Update String Evaluator (#6615 ) - Add protocol for `evaluate_strings` - Move the criteria evaluator out so it's not restricted to being applied on traced runs	2023-06-26 14:16:14 -07:00
AaaCabbage	b3f8324de9	feat: fix the Chinese characters in the solution content will be conv… (#6734 ) fix the Chinese characters in the solution content will be converted to ascii encoding, resulting in an abnormally long number of tokens Co-authored-by: qixin <qixin@fintec.ai>	2023-06-26 13:14:48 -07:00
Chris Pappalardo	70f7c2bb2e	align chroma vectorstore get with chromadb to enable where filtering (#6686 ) allows for where filtering on collection via get - Description: aligns langchain chroma vectorstore get with underlying [chromadb collection get](https://github.com/chroma-core/chroma/blob/main/chromadb/api/models/Collection.py#L103) allowing for where filtering, etc. - Issue: NA - Dependencies: none - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: @pappanaka	2023-06-26 10:51:20 -07:00
Zander Chase	9ca3b4645e	Add support for tags in chain group context manager (#6668 ) Lets you specify local and inheritable tags in the group manager. Also, add more verbose docstrings for our reference docs.	2023-06-26 10:37:33 -07:00
Harrison Chase	d1bcc58beb	bump version to 216 (#6770 )	2023-06-26 09:46:19 -07:00
Zander Chase	6d30acffcb	Fix breaking tags (#6765 ) Fix tags change that broke old way of initializing agent Closes #6756	2023-06-26 09:28:11 -07:00
James Croft	ba622764cb	Improve performance when retrieving Notion DB pages (#6710 )	2023-06-26 05:46:09 -07:00
Richy Wang	ec8247ec59	Fixed bug in AnalyticDB Vector Store caused by upgrade SQLAlchemy version (#6736 )	2023-06-26 05:35:25 -07:00
Santiago Delgado	d84a3bcf7a	Office365 Tool (#6306 ) #### Background With the development of [structured tools](https://blog.langchain.dev/structured-tools/), the LangChain team expanded the platform's functionality to meet the needs of new applications. The GMail tool, empowered by structured tools, now supports multiple arguments and powerful search capabilities, demonstrating LangChain's ability to interact with dynamic data sources like email servers. #### Challenge The current GMail tool only supports GMail, while users often utilize other email services like Outlook in Office365. Additionally, the proposed calendar tool in PR https://github.com/hwchase17/langchain/pull/652 only works with Google Calendar, not Outlook. #### Changes This PR implements an Office365 integration for LangChain, enabling seamless email and calendar functionality with a single authentication process. #### Future Work With the core Office365 integration complete, future work could include integrating other Office365 tools such as Tasks and Address Book. #### Who can review? @hwchase17 or @vowelparrot can review this PR #### Appendix @janscas, I utilized your [O365](https://github.com/O365/python-o365) library extensively. Given the rising popularity of LangChain and similar AI frameworks, the convergence of libraries like O365 and tools like this one is likely. So, I wanted to keep you updated on our progress. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-26 02:59:09 -07:00
Xiaochao Dong	a15afc102c	Relax the action input check for actions that require no input (#6357 ) When the tool requires no input, the LLM often gives something like this: ```json { "action": "just_do_it" } ``` I have attempted to enhance the prompt, but it doesn't appear to be functioning effectively. Therefore, I believe we should consider easing the check a little bit. Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>	2023-06-26 02:30:17 -07:00
Ethan Bowen	cc33bde74f	Confluence added (#6432 ) Adding Confluence to Jira tool. Can create a page in Confluence with this PR. If accepted, will extend functionality to Bitbucket and additional Confluence features. --------- Co-authored-by: Ethan Bowen <ethan.bowen@slalom.com>	2023-06-26 02:28:04 -07:00
Surya Nudurupati	2aeb8e7dbc	Improved Documentation: Eliminating Redundancy in the Introduction.mdx (#6360 ) When the documentation was originally written there was a redundant typing of the word "using the"	2023-06-26 02:27:36 -07:00
rajib	0f6ef048d2	The openai_info.py does not have gpt-35-turbo which is the underlying Azure Open AI model name (#6321 ) Since this model name is not there in the list MODEL_COST_PER_1K_TOKENS, when we use get_openai_callback(), for gpt 3.5 model in Azure AI, we do not get the cost of the tokens. This will fix this issue #### Who can review? @hwchase17 @agola11 Co-authored-by: rajib76 <rajib76@yahoo.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-26 02:16:39 -07:00
ArchimedesFTW	fe941cb54a	Change tags(str) to tags(dict) in mlflow_callback.py docs (#6473 ) Fixes #6472 #### Who can review? @agola11	2023-06-26 02:12:23 -07:00
0xcrusher	9187d2f3a9	Fixed caching bug for Multiple Caching types by correctly checking types (#6746 ) - Fixed an issue where some caching types check the wrong types, hence not allowing caching to work Maintainer responsibilities: - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev	2023-06-26 01:14:32 -07:00
Harrison Chase	e9877ea8b1	Tiktoken override (#6697 )	2023-06-26 00:49:32 -07:00
Gabriel Altay	f9771700e4	prevent DuckDuckGoSearchAPIWrapper from consuming top result (#6727 ) remove the `next` call that checks for None on the results generator	2023-06-25 19:54:15 -07:00
Pau Ramon Revilla	87802c86d9	Added a MHTML document loader (#6311 ) MHTML is a very interesting format since it's used both for emails but also for archived webpages. Some scraping projects want to store pages in disk to process them later, mhtml is perfect for that use case. This is heavily inspired from the beautifulsoup html loader, but extracting the html part from the mhtml file. --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-25 13:12:08 -07:00
Janos Tolgyesi	05eec99269	beautifulsoup get_text kwargs in WebBaseLoader (#6591 ) # beautifulsoup get_text kwargs in WebBaseLoader - Description: this PR introduces an optional `bs_get_text_kwargs` parameter to `WebBaseLoader` constructor. It can be used to pass kwargs to the downstream BeautifulSoup.get_text call. The most common usage might be to pass a custom text separator, as seen also in `BSHTMLLoader`. - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: jtolgyesi	2023-06-25 12:42:27 -07:00
Matt Robinson	be68f6f8ce	feat: Add `UnstructuredRSTLoader` (#6594 ) ### Summary Adds an `UnstructuredRSTLoader` for loading [reStructuredText](https://en.wikipedia.org/wiki/ReStructuredText) file. ### Testing ```python from langchain.document_loaders import UnstructuredRSTLoader loader = UnstructuredRSTLoader( file_path="example_data/README.rst", mode="elements" ) docs = loader.load() print(docs[0]) ``` ### Reviewers - @hwchase17 - @rlancemartin - @eyurtsev	2023-06-25 12:41:57 -07:00
Chip Davis	b32cc01c9f	feat: added tqdm progress bar to UnstructuredURLLoader (#6600 ) - Description: Adds a simple progress bar with tqdm when using UnstructuredURLLoader. Exposes new paramater `show_progress_bar`. Very simple PR. - Issue: N/A - Dependencies: N/A - Tag maintainer: @rlancemartin @eyurtsev --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-25 12:41:25 -07:00
Augustine Theodore	afc292e58d	Fix WhatsAppChatLoader : Enable parsing additional formats (#6663 ) - Description: Updated regex to support a new format that was observed when whatsapp chat was exported. - Issue: #6654 - Dependencies: No new dependencies - Tag maintainer: @rlancemartin, @eyurtsev	2023-06-25 12:08:43 -07:00
Sumanth Donthula	3e30a5d967	updated sql_database.py for returning sorted table names. (#6692 ) Added code to get the tables info in sorted order in methods get_usable_table_names and get_table_info. Linked to Issue: #6640	2023-06-25 12:04:24 -07:00
刘方瑞	9d1b3bab76	Fix Typo in LangChain MyScale Integration Doc (#6705 ) <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> - Description: Fix Typo in LangChain MyScale Integration Doc @hwchase17	2023-06-25 11:54:00 -07:00
sudolong	408c8d0178	fix chroma _similarity_search_with_relevance_scores missing `kwargs` … (#6708 ) Issue: https://github.com/hwchase17/langchain/issues/6707	2023-06-25 11:53:42 -07:00
Zander Chase	d89e10d361	Fix Multi Functions Agent Tracing (#6702 ) Confirmed it works now: https://dev.langchain.plus/public/0dc32ce0-55af-432e-b09e-5a1a220842f5/r	2023-06-25 10:39:04 -07:00
Harrison Chase	1742db0c30	bump version to 215 (#6719 )	2023-06-25 08:52:51 -07:00
Ankush Gola	e1b801be36	split up batch llm calls into separate runs (#5804 )	2023-06-24 21:03:31 -07:00
Davis Chase	1da99ce013	bump v214 (#6694 )	2023-06-24 14:23:11 -07:00
Lance Martin	dd36adc0f4	Make bs4 a local import in recursive_url_loader.py (#6693 ) Resolve https://github.com/hwchase17/langchain/issues/6679	2023-06-24 13:54:10 -07:00
Harrison Chase	ef4c7b54ef	bump to version 213 (#6688 )	2023-06-24 11:56:37 -07:00
UmerHA	068142fce2	Add caching to BaseChatModel (issue #1644 ) (#5089 ) # Add caching to BaseChatModel Fixes #1644 (Sidenote: While testing, I noticed we have multiple implementations of Fake LLMs, used for testing. I consolidated them.) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: Models - @hwchase17 - @agola11 Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) \| Discord: RicChilligerDude#7589 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-24 11:45:09 -07:00
Harrison Chase	c289cc891a	Harrison/optional ids opensearch (#6684 ) Co-authored-by: taekimsmar <66041442+taekimsmar@users.noreply.github.com>	2023-06-24 09:19:57 -07:00
Hrag Balian	2518e6c95b	Session deletion method in motorhead memory (#6609 ) Motorhead Memory module didn't support deletion of a session. Added a method to enable deletion. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-23 21:27:42 -07:00
Baichuan Sun	9fbe346860	Amazon API Gateway hosted LLM (#6673 ) This PR adds a new LLM class for the Amazon API Gateway hosted LLM. The PR also includes example notebooks for using the LLM class in an Agent chain. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-23 21:27:25 -07:00
Davis Chase	fa1bb873e2	Fix openapi parameter parsing (#6676 ) Ensure parameters are json serializable, related to #6671	2023-06-23 21:19:12 -07:00
Akash	b7e1c54947	Just corrected a small inconsistency on a doc page (#6603 ) ### Just corrected a small inconsistency on a doc page (not exactly a typo, per se) - Description: There was inconsistency due to the use of single quotes at one place on the [Squential Chains](https://python.langchain.com/docs/modules/chains/foundational/sequential_chains) page of the docs, - Issue: NA, - Dependencies: NA, - Tag maintainer: @dev2049, - Twitter handle: kambleakash0	2023-06-23 16:09:29 -07:00
Davis Chase	2da1aab50b	Wiki loader lint (#6670 )	2023-06-23 16:05:42 -07:00
Leonid Ganeline	1c81883d42	added docstrings where they missed (#6626 ) This PR targets the `API Reference` documentation. - Several classes and functions missed `docstrings`. These docstrings were created. - In several places this ``` except ImportError: raise ValueError( ``` was replaced to ``` except ImportError: raise ImportError( ```	2023-06-23 15:49:44 -07:00
Shashank	3364e5818b	Changed generate_prompt.py (#6644 ) Modified regex for Fix: ValueError: Could not parse output	2023-06-23 15:48:33 -07:00
Davis Chase	f1e1ac2a01	chroma nb close img tag (#6669 )	2023-06-23 15:41:54 -07:00
eLafo	db8b13df4c	adds doc_content_chars_max argument to WikipediaLoader (#6645 ) # Description It adds a new initialization param in `WikipediaLoader` so we can override the `doc_content_chars_max` param used in `WikipediaAPIWrapper` under the hood, e.g: ```python from langchain.document_loaders import WikipediaLoader # doc_content_chars_max is the new init param loader = WikipediaLoader(query="python", doc_content_chars_max=90000) ``` ## Decisions `doc_content_chars_max` default value will be 4000, because it's the current value I have added pycode comments # Issue #6639 # Dependencies None # Twitter handle [@elafo](https://twitter.com/elafo)	2023-06-23 15:22:09 -07:00
Davis Chase	5e5b30b74f	openapi -> openai nit (#6667 )	2023-06-23 15:09:02 -07:00
Jeff Huber	2acf109c4b	update chroma notebook (#6664 ) @rlancemartin I updated the notebook for Chroma to hopefully be a lot easier for users.	2023-06-23 15:03:06 -07:00
Eduard van Valkenburg	48381f1f78	PowerBI: catch outdated token (#6634 ) This adds just a small tweak to catch the error that says the token is expired rather then retrying.	2023-06-23 15:01:08 -07:00
Piyush Jain	b1de927f1b	Kendra retriever api (#6616 ) ## Description Replaces [Kendra Retriever](https://github.com/hwchase17/langchain/blob/master/langchain/retrievers/aws_kendra_index_retriever.py) with an updated version that uses the new [retriever API](https://docs.aws.amazon.com/kendra/latest/dg/searching-retrieve.html) which is better suited for retrieval augmented generation (RAG) systems. Note: This change requires the latest version (1.26.159) of boto3 to work. `pip install -U boto3` to upgrade the boto3 version. cc @hupe1980 cc @dev2049	2023-06-23 14:59:35 -07:00
ChrisLovejoy	4e5d78579b	fix minor typo in vector_db_qa.mdx (#6604 ) - Description: minor typo fixed - doesn't instead of does. No other changes.	2023-06-23 14:57:37 -07:00
Ikko Eltociear Ashimine	73da193a4b	Fix typo in myscale_self_query.ipynb (#6601 )	2023-06-23 14:57:12 -07:00
Saarthak Maini	ba256b23f2	Fix Typo (#6595 ) Resolves #6582	2023-06-23 14:56:54 -07:00
kourosh hakhamaneshi	f6fdabd20b	Fix ray-project/Aviary integration (#6607 ) - Description: The aviary integration has changed url link. This PR provide fix for those changes and also it makes providing the input URL optional to the API (since they can be set via env variables). - Issue: N/A - Dependencies: N/A - Twitter handle: N/A --------- Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2023-06-23 14:49:53 -07:00
northern-64bit	dbe1d029ec	Fix grammar mistake in base.py in planners (#6611 ) Fix a typo in `langchain/experimental/plan_and_execute/planners/base.py`, by changing "Given input, decided what to do." to "Given input, decide what to do." This is in the docstring for functions running LLM chains which shall create a plan, "decided" does not make any sense in this context.	2023-06-23 14:47:10 -07:00
Aaron Pham	082976d8d0	fix(docs): broken link for OpenLLM (#6622 ) This link for the notebook of OpenLLM is not migrated to the new format Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> <!-- Thank you for contributing to LangChain! Replace this comment with: - Description: a description of the change, - Issue: the issue # it fixes (if applicable), - Dependencies: any dependencies required for this change, - Tag maintainer: for a quicker response, tag the relevant maintainer (see below), - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out! If you're adding a new integration, please include: 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. Maintainer responsibilities: - General / Misc / if you don't know who to tag: @dev2049 - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev - Models / Prompts: @hwchase17, @dev2049 - Memory: @hwchase17 - Agents / Tools / Toolkits: @vowelparrot - Tracing / Callbacks: @agola11 - Async: @agola11 If no one reviews your PR within a few days, feel free to @-mention the same people again. See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com>	2023-06-23 13:59:17 -07:00
Davis Chase	fe828185ed	Dev2049/bump 212 (#6665 )	2023-06-23 13:48:02 -07:00
Hassan Ouda	9e52134d30	ChatVertexAI broken - Fix error with sending context in params (#6652 ) vertex Ai chat is broken right now. That is because context is in params and chat.send_message doesn't accept that as a params. - Closes issue [ChatVertexAI Error: _ChatSessionBase.send_message() got an unexpected keyword argument 'context' #6610](https://github.com/hwchase17/langchain/issues/6610)	2023-06-23 13:38:21 -07:00
Lance Martin	c2b25c17c5	Recursive URL loader (#6455 ) We may want to process load all URLs under a root directory. For example, let's look at the [LangChain JS documentation](https://js.langchain.com/docs/). This has many interesting child pages that we may want to read in bulk. Of course, the `WebBaseLoader` can load a list of pages. But, the challenge is traversing the tree of child pages and actually assembling that list! We do this using the `RecusiveUrlLoader`. This also gives us the flexibility to exclude some children (e.g., the `api` directory with > 800 child pages).	2023-06-23 13:09:00 -07:00
Lance Martin	be02572d58	Add delete and ensure add_texts performs upsert (w/ ID optional) (#6126 ) ## Goal We want to ensure consistency across vectordbs: 1/ add `delete` by ID method to the base vectorstore class 2/ ensure `add_texts` performs `upsert` with ID optionally passed ## Testing - [x] Pinecone: notebook test w/ `langchain_test` vectorstore. - [x] Chroma: Review by @jeffchuber, notebook test w/ in memory vectorstore. - [x] Supabase: Review by @copple, notebook test w/ `langchain_test` table. - [x] Weaviate: Notebook test w/ `langchain_test` index. - [x] Elastic: Revied by @vestal. Notebook test w/ `langchain_test` table. - [ ] Redis: Asked for review from owner of recent `delete` method https://github.com/hwchase17/langchain/pull/6222	2023-06-23 13:03:10 -07:00
Lance Martin	393f469eb3	Create merge loader that combines documents from a set of loaders (#6659 ) Simple utility loader that combines documents from a set of specified loaders.	2023-06-23 13:02:48 -07:00
Davis Chase	6988039975	openapi_openai docstring (#6661 )	2023-06-23 11:38:33 -07:00
Davis Chase	b25933b607	bump 211 (#6660 )	2023-06-23 11:10:48 -07:00
Davis Chase	e013459b18	Openapi to openai (#6658 )	2023-06-23 11:00:34 -07:00
Davis Chase	b062a3f938	bump 210 (#6656 )	2023-06-23 09:37:58 -07:00
Alejandra De Luna	980c865174	fix: remove callbacks arg from Tool and StructuredTool inferred schema (#6483 ) Fixes #5456 This PR removes the `callbacks` argument from a tool's schema when creating a `Tool` or `StructuredTool` with the `from_function` method and `infer_schema` is set to `True`. The `callbacks` argument is now removed in the `create_schema_from_function` and `_get_filtered_args` methods. As suggested by @vowelparrot, this fix provides a straightforward solution that minimally affects the existing implementation. A test was added to verify that this change enables the expected use of `Tool` and `StructuredTool` when using a `CallbackManager` and inferring the tool's schema. - @hwchase17	2023-06-23 01:48:27 -07:00
Zander Chase	b4fe7f3a09	Session to project (#6249 ) Sessions are being renamed to projects in the tracer	2023-06-23 01:11:01 -07:00
Zander Chase	9c09861946	Add tags in agent initialization (#6559 ) Add better docstrings for agent executor as well Inspo: https://github.com/hwchase17/langchainjs/pull/1722 ![image](https://github.com/hwchase17/langchain/assets/130414180/d11662bc-0c0e-4166-9ff3-354d41a9144a)	2023-06-22 22:35:00 -07:00
Lance Martin	6e69bfbb28	Loader for OpenCityData and minor cleanups to Pandas, Airtable loaders (#6301 ) Many cities have open data portals for events like crime, traffic, etc. Socrata provides an API for many, including SF (e.g., see [here](https://dev.socrata.com/foundry/data.sfgov.org/tmnf-yvry)). This is a new data loader for city data that uses Socrata API.	2023-06-22 22:20:42 -07:00
Christoph Kahl	9d42621fa4	added redis method to delete entries by keys (#6222 ) In addition to my last pr (return keys of added entries), we also need a method to delete the entries by keys. @dev2049	2023-06-22 13:26:47 -07:00
Tim Conkling	c28990d871	StreamlitCallbackHandler (#6315 ) A new implementation of `StreamlitCallbackHandler`. It formats Agent thoughts into Streamlit expanders. You can see the handler in action here: https://langchain-mrkl.streamlit.app/ Per a discussion with Harrison, we'll be adding a `StreamlitCallbackHandler` implementation to an upcoming [Streamlit](https://github.com/streamlit/streamlit) release as well, and will be updating it as we add new LLM- and LangChain-specific features to Streamlit. The idea with this PR is that the LangChain `StreamlitCallbackHandler` will "auto-update" in a way that keeps it forward- (and backward-) compatible with Streamlit. If the user has an older Streamlit version installed, the LangChain `StreamlitCallbackHandler` will be used; if they have a newer Streamlit version that has an updated `StreamlitCallbackHandler`, that implementation will be used instead. (I'm opening this as a draft to get the conversation going and make sure we're on the same page. We're really excited to land this into LangChain!) #### Who can review? @agola11, @hwchase17	2023-06-22 13:14:28 -07:00
Nuno Campos	74ac6fb6b9	Allow callback handlers to opt into being run inline (#6424 ) This is useful eg for callback handlers that use context vars (like open telemetry) See https://github.com/hwchase17/langchain/pull/6095	2023-06-22 11:36:19 -07:00
Harrison Chase	a9108c1809	add mongo (HOLD) (#6437 ) do not merge in	2023-06-22 11:08:12 -07:00
Lance Martin	30f7288082	MD header text splitter returns Documents (#6571 ) Return `Documents` from MD header text splitter to simplify UX. Updates the test as well as example notebooks.	2023-06-22 09:25:38 -07:00
Rogério Chaves	3436da65a4	Fix callback forwarding in async plan method for OpenAI function agent (#6584 ) The callback argument was missing, preventing me to get callbacks to work properly when using it async	2023-06-22 08:18:31 -07:00
Davis Chase	b909bc8b58	bump 209 (#6593 )	2023-06-22 08:18:19 -07:00
minhajul-clarifai	6e57306a13	Clarifai integration (#5954 ) # Changes This PR adds [Clarifai](https://www.clarifai.com/) integration to Langchain. Clarifai is an end-to-end AI Platform. Clarifai offers user the ability to use many types of LLM (OpenAI, cohere, ect and other open source models). As well, a clarifai app can be treated as a vector database to upload and retrieve data. The integrations includes: - Clarifai LLM integration: Clarifai supports many types of language model that users can utilize for their application - Clarifai VectorDB: A Clarifai application can hold data and embeddings. You can run semantic search with the embeddings #### Before submitting - [x] Added integration test for LLM - [x] Added integration test for VectorDB - [x] Added notebook for LLM - [x] Added notebook for VectorDB Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-22 08:00:15 -07:00
Jeroen Van Goey	7f6f5c2a6a	Add missing word in comment (#6587 ) Changed ``` # Do this so we can exactly what's going on under the hood ``` to ``` # Do this so we can see exactly what's going on under the hood ```	2023-06-22 07:54:28 -07:00
Davis Chase	d50de2728f	Add AzureML endpoint LLM wrapper (#6580 ) ### Description We have added a new LLM integration `azureml_endpoint` that allows users to leverage models from the AzureML platform. Microsoft recently announced the release of [Azure Foundation Models](https://learn.microsoft.com/en-us/azure/machine-learning/concept-foundation-models?view=azureml-api-2) which users can find in the AzureML Model Catalog. The Model Catalog contains a variety of open source and Hugging Face models that users can deploy on AzureML. The `azureml_endpoint` allows LangChain users to use the deployed Azure Foundation Models. ### Dependencies No added dependencies were required for the change. ### Tests Integration tests were added in `tests/integration_tests/llms/test_azureml_endpoint.py`. ### Notebook A Jupyter notebook demonstrating how to use `azureml_endpoint` was added to `docs/modules/llms/integrations/azureml_endpoint_example.ipynb`. ### Twitters [Prakhar Gupta](https://twitter.com/prakhar_in) [Matthew DeGuzman](https://twitter.com/matthew_d13) --------- Co-authored-by: Matthew DeGuzman <91019033+matthewdeguzman@users.noreply.github.com> Co-authored-by: prakharg-msft <75808410+prakharg-msft@users.noreply.github.com>	2023-06-22 01:46:01 -07:00
Davis Chase	4fabd02d25	Add OpenLLM wrapper(#6578 ) LLM wrapper for models served with OpenLLM --------- Signed-off-by: Aaron <29749331+aarnphm@users.noreply.github.com> Authored-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com> Co-authored-by: Chaoyu <paranoyang@gmail.com>	2023-06-22 01:18:14 -07:00
Brendan Graham	d718f3b6d0	feat: interfaces for async embeddings, implement async openai (#6563 ) Since it seems like #6111 will be blocked for a bit, I've forked @tyree731's fork and implemented the requested changes. This change adds support to the base Embeddings class for two methods, aembed_query and aembed_documents, those two methods supporting async equivalents of embed_query and embed_documents respectively. This ever so slightly rounds out async support within langchain, with an initial implementation of this functionality being implemented for openai. Implements https://github.com/hwchase17/langchain/issues/6109 --------- Co-authored-by: Stephen Tyree <tyree731@gmail.com>	2023-06-21 23:16:33 -07:00
ljeagle	ca24dc2d5f	Upgrade the version of AwaDB and add some new interfaces (#6565 ) 1. upgrade the version of AwaDB 2. add some new interfaces 3. fix bug of packing page content error @dev2049 please review, thanks! --------- Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-06-21 23:15:18 -07:00
Harrison Chase	937a7e93f2	add motherduck docs (#6572 )	2023-06-21 23:13:45 -07:00
Muhammad Vaid	ae81b96b60	Detailed using the Twilio tool to send messages with 3rd party apps incl. WhatsApp (#6562 ) Everything needed to support sending messages over WhatsApp Business Platform (GA), Facebook Messenger (Public Beta) and Google Business Messages (Private Beta) was present. Just added some details on leveraging it.	2023-06-21 19:26:50 -07:00
Kenzie Mihardja	b8d78424ab	Change Data Loader Namespace (#6568 ) Description: Update the artifact name of the xml file and the namespaces. Co-authored with @tjaffri Co-authored-by: Kenzie Mihardja <kenzie@docugami.com>	2023-06-21 19:24:04 -07:00
Gengliang Wang	0673245d0c	Remove duplicate databricks entries in ecosystem integrations (#6569 ) Currently, there are two Databricks entries in https://python.langchain.com/docs/ecosystem/integrations/ <img width="277" alt="image" src="https://github.com/hwchase17/langchain/assets/1097932/86ab4ad2-6bce-4459-9d56-1ab2fbb69f6d"> The reason is that there are duplicated notebooks for Databricks integration: * https://github.com/hwchase17/langchain/blob/master/docs/extras/ecosystem/integrations/databricks.ipynb * https://github.com/hwchase17/langchain/blob/master/docs/extras/ecosystem/integrations/databricks/databricks.ipynb This PR is to remove the second one for simplicity.	2023-06-21 19:14:33 -07:00
Suri Chen	14b9418cc5	Fix whatsappchatloader - enable parsing new datetime format on WhatsApp chat (#6555 ) - Description: observed new format on WhatsApp exported chat - example: `[2023/5/4, 16:17:13] ~ Carolina: 🥺` - Dependencies: no additional dependencies required - Tag maintainer: @rlancemartin, @eyurtsev	2023-06-21 19:11:49 -07:00
Zander Chase	5322bac5fc	Wait for all futures (#6554 ) - Expose method to wait for all futures - Wait for submissions in the run_on_dataset functions to ensure runs are fully submitted before cleaning up	2023-06-21 18:20:17 -07:00
HenriZuber	e0605b464b	feat: faiss filter from list (#6537 ) ### Feature Using FAISS on a retrievalQA task, I found myself wanting to allow in multiple sources. From what I understood, the filter feature takes in a dict of form {key: value} which then will check in the metadata for the exact value linked to that key. I added some logic to be able to pass a list which will be checked against instead of an exact value. Passing an exact value will also work. Here's an example of how I could then use it in my own project: ``` pdfs_to_filter_in = ["file_A", "file_B"] filter_dict = { "source": [f"source_pdfs/{pdf_name}.pdf" for pdf_name in pdfs_to_filter_in] } retriever = db.as_retriever() retriever.search_kwargs = {"filter": filter_dict} ``` I added an integration test based on the other ones I found in `tests/integration_tests/vectorstores/test_faiss.py` under `test_faiss_with_metadatas_and_list_filter()`. It doesn't feel like this is worthy of its own notebook or doc, but I'm open to suggestions if needed. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 10:49:01 -07:00
Davis Chase	00a7403236	update pr tmpl (#6552 )	2023-06-21 10:03:52 -07:00
Jeroen Van Goey	57b5f42847	Remove unintended double negation in docstring (#6541 ) Small typo fix. `ImportError: If importing vertexai SDK didn't not succeed.` -> `ImportError: If importing vertexai SDK did not succeed.`.	2023-06-21 10:01:28 -07:00
Andrey E. Vedishchev	a2a0715bd4	Minor Grammar Fixes in Docs and Comments (#6536 ) Just some grammar fixes: I found "retriver" instead of "retriever" in several comments across the documentation and in the comments. I fixed it. Co-authored-by: andrey.vedishchev <andrey.vedishchev@rgigroup.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 09:53:31 -07:00
dirtysalt	57cc3d1d3d	[Feature][VectorStore] Support StarRocks as vector db (#6119 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> Here are some examples to use StarRocks as vectordb ``` from langchain.vectorstores import StarRocks from langchain.vectorstores.starrocks import StarRocksSettings embeddings = OpenAIEmbeddings() # conifgure starrocks settings settings = StarRocksSettings() settings.port = 41003 settings.host = '127.0.0.1' settings.username = 'root' settings.password = '' settings.database = 'zya' # to fill new embeddings docsearch = StarRocks.from_documents(split_docs, embeddings, config = settings) # or to use already-built embeddings in database. docsearch = StarRocks(embeddings, settings) ``` #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 09:02:33 -07:00
Zander Chase	7a4ff424fc	Relax string input mapper check (#6544 ) for run evaluator. It could be that an evalutor doesn't need the output	2023-06-21 08:01:42 -07:00
Harrison Chase	ace442b992	bump to ver 208 (#6540 )	2023-06-21 07:32:36 -07:00
Harrison Chase	53c1f120a8	Harrison/multi tool (#6518 )	2023-06-21 07:19:52 -07:00
Naman Modi	37a89918e0	Infino integration for simplified logs, metrics & search across LLM data & token usage (#6218 ) ### Integration of Infino with LangChain for Enhanced Observability This PR aims to integrate [Infino](https://github.com/infinohq/infino), an open source observability platform written in rust for storing metrics and logs at scale, with LangChain, providing users with a streamlined and efficient method of tracking and recording LangChain experiments. By incorporating Infino into LangChain, users will be able to gain valuable insights and easily analyze the behavior of their language models. #### Please refer to the following files related to integration: - `InfinoCallbackHandler`: A [callback handler](https://github.com/naman-modi/langchain/blob/feature/infino-integration/langchain/callbacks/infino_callback.py) specifically designed for storing chain responses within Infino. - Example `infino.ipynb` file: A comprehensive notebook named [infino.ipynb](https://github.com/naman-modi/langchain/blob/feature/infino-integration/docs/extras/modules/callbacks/integrations/infino.ipynb) has been included to guide users on effectively leveraging Infino for tracking LangChain requests. - [Integration Doc](https://github.com/naman-modi/langchain/blob/feature/infino-integration/docs/extras/ecosystem/integrations/infino.mdx) for Infino integration. By integrating Infino, LangChain users will gain access to powerful visualization and debugging capabilities. Infino enables easy tracking of inputs, outputs, token usage, execution time of LLMs. This comprehensive observability ensures a deeper understanding of individual executions and facilitates effective debugging. Co-authors: @vinaykakade @savannahar68 --------- Co-authored-by: Vinay Kakade <vinaykakade@gmail.com>	2023-06-21 01:38:20 -07:00
Elijah Tarr	e0f468f6c1	Update model token mappings/cost to include 0613 models (#6122 ) Add `gpt-3.5-turbo-16k` to model token mappings, as per the following new OpenAI blog post: https://openai.com/blog/function-calling-and-other-api-updates Fixes #6118 Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 01:37:16 -07:00
Jakub Misiło	5d149e4d50	Fix issue with non-list `To` header in GmailSendMessage Tool (#6242 ) Fixing the problem of feeding `str` instead of `List[str]` to the email tool. Fixes #6234 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 01:25:49 -07:00
Anubhav Bindlish	94c7899257	Integrate Rockset as Vectorstore (#6216 ) This PR adds Rockset as a vectorstore for langchain. [Rockset](https://rockset.com/blog/introducing-vector-search-on-rockset/) is a real time OLAP database which provides a fast and efficient vector search functionality. Further since it is entirely schemaless, it can store metadata in separate columns thereby allowing fast metadata filters during vector similarity search (as opposed to storing the entire metadata in a single JSON column). It currently supports three distance functions: `COSINE_SIMILARITY`, `EUCLIDEAN_DISTANCE`, and `DOT_PRODUCT`. This PR adds `rockset` client as an optional dependency. We would love a twitter shoutout, our handle is https://twitter.com/RocksetCloud --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 01:22:27 -07:00
ElReyZero	ab7ecc9c30	Feat: Add a prompt template parameter to qa with structure chains (#6495 ) This pull request introduces a new feature to the LangChain QA Retrieval Chains with Structures. The change involves adding a prompt template as an optional parameter for the RetrievalQA chains that utilize the recently implemented OpenAI Functions. The main purpose of this enhancement is to provide users with the ability to input a more customizable prompt to the chain. By introducing a prompt template as an optional parameter, users can tailor the prompt to their specific needs and context, thereby improving the flexibility and effectiveness of the RetrievalQA chains. ## Changes Made - Created a new optional parameter, "prompt", for the RetrievalQA with structure chains. - Added an example to the RetrievalQA with sources notebook. My twitter handle is @El_Rey_Zero --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-21 00:23:36 -07:00
Mircea Pasoi	2e024823d2	Add async support for HuggingFaceTextGenInference (#6507 ) Adding support for async calls in `HuggingFaceTextGenInference` Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-20 23:12:24 -07:00
Hassan Ouda	456ca3d587	Be able to use Codey models on Vertex AI (#6354 ) Added the functionality to leverage 3 new Codey models from Vertex AI: - code-bison - Code generation using the existing LLM integration - code-gecko - Code completion using the existing LLM integration - codechat-bison - Code chat using the existing chat_model integration --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-20 23:11:54 -07:00
囧囧	0fce8ef178	Add KuzuQAChain (#6454 ) This PR adds `KuzuGraph` and `KuzuQAChain` for interacting with [Kùzu database](https://github.com/kuzudb/kuzu). Kùzu is an in-process property graph database management system (GDBMS) built for query speed and scalability. The `KuzuGraph` and `KuzuQAChain` provide the same functionality as the existing integration with NebulaGraph and Neo4j and enables query generation and question answering over Kùzu database. A notebook example and a simple test case have also been added. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-20 22:07:00 -07:00
Chanin Nantasenamat	6e07283dd5	Update index.mdx (#6326 ) #### Fix Added the mention of "store" amongst the tasks that the data connection module can perform aside from the existing 3 (load, transform and query). Particularly, this implies the generation of embeddings vectors and the creation of vector stores.	2023-06-20 21:40:20 -07:00
Zander Chase	ffa4ff1a2e	Export trajectory eval fn (#6509 ) from the run_evaluators dir	2023-06-20 21:18:28 -07:00
TheOnlyWayUp	bb437646fc	typo(llamacpp.ipynb): 'condiser' -> 'consider' (#6474 )	2023-06-20 18:48:25 -07:00
northern-64bit	7492060525	Fix typo in docstring of format_tool_to_openai_function (#6479 ) Fixes typo "open AI" to "OpenAI" in docstring of `format_tool_to_openai_function` in `langchain/tools/convert_to_openai.py`.	2023-06-20 18:42:30 -07:00
Davis Chase	b3c49e94a0	Make streamlit import optional (#6510 )	2023-06-20 18:41:59 -07:00
Daniel McDonald	cece8c8bf0	Fixed: 'readible' -> readable (#6492 ) Hello there👋 I have made a pull request to fix a small typo.	2023-06-20 18:39:59 -07:00
hsparmar	834c3378af	Documentation Fix: Correct the example code output in the prompt templates doc (#6496 ) Documentation is showing the wrong example output for the prompt templates code snippet. This PR fixes that issue.	2023-06-20 17:21:09 -07:00
Davis Chase	c91cf68754	Fix link (#6501 )	2023-06-20 14:44:22 -07:00
Davis Chase	3298bf4f00	docs/fix links (#6498 )	2023-06-20 14:06:50 -07:00
Lance Martin	ae6196507d	Update notebook for MD header splitter and create new cookbook (#6399 ) Move MD header text splitter example to its own cookbook.	2023-06-20 13:53:41 -07:00
Stefano Lottini	22af93d851	Vector store support for Cassandra (#6426 ) This addresses #6291 adding support for using Cassandra (and compatible databases, such as DataStax Astra DB) as a [Vector Store](https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-30%3A+Approximate+Nearest+Neighbor(ANN)+Vector+Search+via+Storage-Attached+Indexes). A new class `Cassandra` is introduced, which complies with the contract and interface for a vector store, along with the corresponding integration test, a sample notebook and modified dependency toml. Dependencies: the implementation relies on the library `cassio`, which simplifies interacting with Cassandra for ML- and LLM-oriented workloads. CassIO, in turn, uses the `cassandra-driver` low-lever drivers to communicate with the database. The former is added as optional dependency (+ in `extended_testing`), the latter was already in the project. Integration testing relies on a locally-running instance of Cassandra. [Here](https://cassio.org/more_info/#use-a-local-vector-capable-cassandra) a detailed description can be found on how to compile and run it (at the time of writing the feature has not made it yet to a release). During development of the integration tests, I added a new "fake embedding" class for what I consider a more controlled way of testing the MMR search method. Likewise, I had to amend what looked like a glitch in the behaviour of `ConsistentFakeEmbeddings` whereby an `embed_query` call would have bypassed storage of the requested text in the class cache for use in later repeated invocations. @dev2049 might be the right person to tag here for a review. Thank you! --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-20 10:46:20 -07:00
Harrison Chase	cac6e45a67	improve documentation on base chain (#6468 ) Co-authored-by: Nuno Campos <nuno@boringbits.io>	2023-06-20 10:34:57 -07:00
Zeeland	ad7089a6d0	fix: change ddg to DDGS (#6480 ) This commit updates the duckduckgo search utility by using a more accurate name in the import statement.	2023-06-20 10:15:05 -07:00
Davis Chase	8cd5f65a6f	release 207 (#6488 )	2023-06-20 10:14:29 -07:00
zhaoshengbo	ab44c24333	Add Alibaba Cloud OpenSearch as a new vector store (#6154 ) Hello Folks, Thanks for creating and maintaining this great project. I'm excited to submit this PR to add Alibaba Cloud OpenSearch as a new vector store. OpenSearch is a one-stop platform to develop intelligent search services. OpenSearch was built based on the large-scale distributed search engine developed by Alibaba. OpenSearch serves more than 500 business cases in Alibaba Group and thousands of Alibaba Cloud customers. OpenSearch helps develop search services in different search scenarios, including e-commerce, O2O, multimedia, the content industry, communities and forums, and big data query in enterprises. OpenSearch provides the vector search feature. In specific scenarios, especially test question search and image search scenarios, you can use the vector search feature together with the multimodal search feature to improve the accuracy of search results. This PR includes: A AlibabaCloudOpenSearch class that can connect to the Alibaba Cloud OpenSearch instance. add embedings and metadata into a opensearch datasource. querying by squared euclidean and metadata. integration tests. ipython notebook and docs. I have read your contributing guidelines. And I have passed the tests below - [x] make format - [x] make lint - [x] make coverage - [x] make test --------- Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>	2023-06-20 10:07:40 -07:00
Davis Chase	b7ad4c4c30	fix openai qa chain (#6487 )	2023-06-20 10:01:13 -07:00
thehunmonkgroup	10adec5f1b	add FunctionMessage support to `_convert_dict_to_message()` in OpenAI chat model (#6382 ) Already supported in the reverse operation in `_convert_message_to_dict()`, this just provides parity. @hwchase17 @agola11 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-20 08:25:55 -07:00
Harrison Chase	7414e9d196	bump version to 206 (#6465 )	2023-06-19 23:05:09 -07:00
Hubert	22601b0b63	fix neo4j schema query (#6381 ) Fix issue #6380 <!-- Remove if not applicable --> Fixes #6380 (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: HubertKl <HubertKl>	2023-06-19 22:48:35 -07:00
Gavin	b0d80c4b3e	Update serpapi.py Support baidu list type answer_box (#6386 ) Support baidu list type answer_box From [this document](https://serpapi.com/baidu-answer-box), we can know that the answer_box attribute returned by the Baidu interface is a list, and the list contains only one Object, but an error will occur when the current code is executed. So when answer_box is a list, we reset res["answer_box"] so that the code can execute successfully.	2023-06-19 22:48:18 -07:00
Bryce Drennan	384fa43fc3	fix: llm caching for replicate (#6396 ) Caching wasn't accounting for which model was used so a result for the first executed model would return for the same prompt on a different model. This was because `Replicate._identifying_params` did not include the `model` parameter. FYI - @cbh123 - @hwchase17 - @agola11	2023-06-19 22:47:59 -07:00
Zeeland	8a604b93ab	feat: use latest duckduckgo_search API to call (#6409 ) # Provider the latest duckduckgo_search API The Git commit contents involve two files related to some DuckDuckGo query operations, and an upgrade of the DuckDuckGo module to version 3.8.3. A suitable commit message could be "Upgrade DuckDuckGo module to version 3.8.3, including query operations". Specifically, in the duckduckgo_search.py file, a DDGS() class instance is newly added to replace the previous ddg() function, and the time parameter name in the get_snippets() and results() methods is changed from "time" to "timelimit" to accommodate recent changes. In the pyproject.toml file, the duckduckgo-search module is upgraded to version 3.8.3. [duckduckgo_search readme attention](https://github.com/deedy5/duckduckgo_search): Versions before v2.9.4 no longer work as of May 12, 2023 ## Who can review? @vowelparrot --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:47:39 -07:00
Harrison Chase	9eec7c3206	Harrison/unstructured page number (#6464 ) Co-authored-by: Reza Sanaie <reza@sanaie.ca>	2023-06-19 22:31:43 -07:00
Alonso Silva Allende	b82ddf9cfb	Improve error message (#6275 ) Trying to use OpenAI models like 'text-davinci-002' or 'text-davinci-003' the agent doesn't work and the message is 'Only supported with OpenAI models.' The error message should be 'Only supported with ChatOpenAI models.' My Twitter handle is @alonsosilva <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> Co-authored-by: SILVA Alonso <alonso.silva@nokia-bell-labs.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:21:01 -07:00
zengbo	7e5f5ebf86	Fix the issue where ANTHROPIC_API_URL set in environment is not takin… (#6400 ) I apologize for the error: the 'ANTHROPIC_API_URL' environment variable doesn't take effect if the 'anthropic_api_url' parameter has a default value. #### Who can review? Models - @hwchase17 - @agola11	2023-06-19 22:20:36 -07:00
Grayson Adkins	9f5f747dc3	Fix broken links in autonomous agents docs (#6398 ) Fixes broken links here: https://python.langchain.com/docs/use_cases/autonomous_agents.html #### Who can review? Tag maintainers/contributors who might be interested: Agents / Tools / Toolkits - @hwchase17	2023-06-19 22:20:00 -07:00
volodymyr-memsql	d2e9b621ab	Update SinglStoreDB vectorstore (#6423 ) 1. Introduced new distance strategies support: DOT_PRODUCT and EUCLIDEAN_DISTANCE for enhanced flexibility. 2. Implemented a feature to filter results based on metadata fields. 3. Incorporated connection attributes specifying "langchain python sdk" usage for enhanced traceability and debugging. 4. Expanded the suite of integration tests for improved code reliability. 5. Updated the existing notebook with the usage example @dev2049 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:08:58 -07:00
Avinash Raj	6efd5fa2b9	Fix for #6431 - chatprompt template with partial variables giing validation error (#6456 ) W.r.t recent changes, ChatPromptTemplate does not accepting partial variables. This PR should fix that issue. Fixes #6431 #### Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:08:15 -07:00
Harrison Chase	02c0a1e77e	Harrison/functions in retrieval (#6463 )	2023-06-19 22:07:58 -07:00
Swapnil Sharma	dc4ffa8d9b	Incorrect argument count handling (#5543 ) Throwing ToolException when incorrect arguments are passed to tools so that that agent can course correct them. # Incorrect argument count handling I was facing an error where the agent passed incorrect arguments to tools. As per the discussions going around, I started throwing ToolException to allow the model to course correct. ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:06:20 -07:00
kYLe	3a58c4c3a0	Fixed a link typo /-/route -> /-/routes. and change endpoint format (#6186 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes a link typo from `/-/route` to `/-/routes`. and change endpoint format from `f"{self.anyscale_service_url}/{self.anyscale_service_route}"` to `f"{self.anyscale_service_url}{self.anyscale_service_route}"` Also adding documentation about the format of the endpoint #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:05:54 -07:00
Leonid Ganeline	03b16ed2b1	docs `retrievers` fixes (#6299 ) Fixed several inconsistencies: - file names and notebook titles should be similar otherwise ToC on the [retrievers page](https://python.langchain.com/en/latest/modules/indexes/retrievers.html) and on the left ToC tab are different. For example, now, `Self-querying with Chroma` is not correctly alphabetically sorted because its file named `chroma_self_query.ipynb` - `Stringing compressors and document transformers...` demoted from `#` to `##`. Otherwise, it appears in Toc. - several formatting problems #### Who can review? @hwchase17 @dev2049 Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 22:04:35 -07:00
M. Tolga Cangöz	bccee85c8f	Update introduction.mdx (#6425 ) Fix typo	2023-06-19 22:04:09 -07:00
Nir Gazit	95b77a5215	Fix Custom LLM Agent example (#6429 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> The `CustomOutputParser` needs to throw `OutputParserException` when it fails to parse the response from the agent, so that the executor can [catch it and retry](`be9371ca8f/langchain/agents/agent.py (L767)`) when `handle_parsing_errors=True`. <!-- Remove if not applicable --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-19 22:03:58 -07:00
ykerus	b697bbb5b5	Remove backticks without clear purpose from docs (#6442 ) #### Description - Removed two backticks surrounding the phrase "chat messages as" - This phrase stood out among other formatted words/phrases such as `prompt`, `role`, `PromptTemplate`, etc., which all seem to have a clear function. - `chat messages as`, formatted as such, confused me while reading, leading me to believe the backticks were misplaced. #### Who can review? @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-19 22:03:38 -07:00
Dhruvil Shah	9494623869	Update web_base.ipynb (#6430 ) Minor new line character in the markdown. Also, this option is not yet in the latest version of LangChain (0.0.190) from Conda. Maybe in the next update. @eyurtsev @hwchase17	2023-06-19 21:43:35 -07:00
Wenchen Li	76ae9da9db	Add `_similarity_search_with_relevance_scores` in `Pinecone` (#6446 ) Just so it is consistent with other `VectorStore` classes. This is a follow-up of #6056 which also discussed the potential of adding `similarity_search_by_vector_returning_embeddings` that we will continue the discussion here. potentially related: #6286 #### Who can review? Tag maintainers/contributors who might be interested: @rlancemartin <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-19 21:36:40 -07:00
Ismail Pelaseyed	d4e8e0f5ab	Add example for question answering over documents with OpenAI Function Agent (#6448 ) This PR adds an example of doing question answering over documents using OpenAI Function Agents. #### Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-19 21:35:45 -07:00
Andrey Avtomonov	68a675cc68	Remove extra word in the introduction documentation (#6450 ) Removed an extra word in the introduction documentation, a simple typo	2023-06-19 21:31:17 -07:00
Ankush Gola	a9246333fd	fix anthropic chat model mutating input list (#6457 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes: ChatAnthropic was mutating the input message list during formatting which isn't ideal bc you could be changing the behavior for other chat models when using the same input #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested:	2023-06-19 21:30:52 -07:00
Zander Chase	bc0af67aaf	Add Trajectory Eval RunEvaluator (#6449 )	2023-06-19 21:11:50 -07:00
Hakan Tekgul	6a157cf8bb	Update arize_callback.py (#6433 ) Arize released a new Generative LLM Model Type, adjusting the callback function to new logging. Added arize imports, please delete if not necessary. Specifically, this change makes sure that the prompt and response pairs from LangChain agents are logged into Arize as a Generative LLM model, instead of our previous categorical model. In order to do this, the callback functions collects the necessary data and passes the data into Arize using Python Pandas SDK. Arize library, specifically pandas.logger is an additional dependency. Notebook For Test: https://docs.arize.com/arize/resources/integrations/langchain Who can review? Tag maintainers/contributors who might be interested: @hwchase17 - project lead Tracing / Callbacks @agola11	2023-06-19 18:33:49 -07:00
Zander Chase	00f276d23f	Run eval in eval mode (#6447 ) For the `run_on_dataset` sessions	2023-06-19 18:31:38 -07:00
Harrison Chase	1300a4bc8c	expose docs chains (#6453 )	2023-06-19 17:18:54 -07:00
Harrison Chase	286452c7f0	remove mongo	2023-06-19 10:04:14 -07:00
David Duong	be9371ca8f	Include placeholder value for all secrets, not just kwargs (#6421 ) Mirror PR for https://github.com/hwchase17/langchainjs/pull/1696 Secrets passed via environment variables should be present in the serialised chain	2023-06-19 15:41:45 +01:00
Harrison Chase	df40cd233f	bump version to 205 (#6410 )	2023-06-18 23:21:26 -07:00
Harrison Chase	e9c2b280db	Harrison/refactor functions (#6408 )	2023-06-18 23:13:42 -07:00
Harrison Chase	6a4a950a3c	changes to llm chain (#6328 ) - return raw and full output (but keep run shortcut method functional) - change output parser to take in generations (good for working with messages) - add output parser to base class, always run (default to same as current) --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-06-18 22:49:47 -07:00
Davis Chase	d3c2eab0b3	Docs nit (#6350 )	2023-06-18 20:58:12 -07:00
Davis Chase	af96de6552	fix prod docs build (#6402 )	2023-06-18 20:56:12 -07:00
Fei Wang	50556f3b35	support memory for functions (#6165 ) #### Before submitting Add memory support for `OpenAIFunctionsAgent` like `StructuredChatAgent`. #### Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 19:00:40 -07:00
Dhruvil Shah	b2b9ded12f	Update web_base.py _fetch() method For SiteMapLoader (#6256 ) A must-include for SiteMap Loader to avoid the SSL verification error. Setting the 'verify' to False by ``` sitemap_loader.requests_kwargs = {"verify": False}``` does not bypass the SSL verification in some websites. There are websites (https:// researchadmin.asu.edu/ sitemap.xml) where setting "verify" to False as shown below would not work: sitemap_loader.requests_kwargs = {"verify": False} We need this merge to tell the Session to use a connector with a specific argument about SSL: \# For SiteMap SSL verification if not self.request_kwargs['verify']: connector = aiohttp.TCPConnector(ssl=False) else: connector = None <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes #5483 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 18:34:18 -07:00
Harrison Chase	10bff4ecc4	Harrison/chroma fix (#6390 ) Co-authored-by: Junu Moon(Fran) <francomoon7@gmail.com>	2023-06-18 18:33:26 -07:00
Harrison Chase	5c1fa3e70e	Harrison/typesense fix (#6391 ) Co-authored-by: Gaurav Chauhan <2796gaurav@gmail.com> Co-authored-by: gaurav <gaurav.chauhan1@rksv.in>	2023-06-18 18:33:15 -07:00
Harrison Chase	5ccebce777	rm pandas from arize (#6392 )	2023-06-18 18:33:04 -07:00
matias-biatoz	3b7c4c51d5	Added gpt-3.5-turbo 0613 16k and 16k-0613 pricing (#6287 ) @agola11 Issue #6193 I added the new pricing for the new models. Also, now gpt-3.5-turbo got split into "input" and "output" pricing. It currently does not support that.	2023-06-18 18:32:20 -07:00
Ly Nguyen	1e0af59f69	- Fix pass system_message argument in new feature openai_functions_agent (#6297 ) can't pass system_message argument, the prompt always show default message "System: You are a helpful AI assistant." ``` system_message = SystemMessage( content="You are an AI that provides information to Human regarding documentation." ) agent = initialize_agent( tools, llm=openai_llm_chat, agent=AgentType.OPENAI_FUNCTIONS, system_message=system_message, agent_kwargs={ "system_message": system_message, }, verbose=False, ) ``` #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:54:00 -07:00
georgian	e64bafed3a	Fixes typo in Vectara.similarity_search (#6277 ) Fixes a simple typo. @hwchase17 @dev2049 Co-authored-by: Georgian Sarghi <georgian.sarghi@gmail.com>	2023-06-18 17:48:54 -07:00
Ted	112695e4da	Iterate through filtered file types instead of all listed files (#6258 ) # Iterate through filtered file types instead of all listed files Fixes https://github.com/hwchase17/langchain/issues/6257 https://github.com/hwchase17/langchain/pull/4926 originally added the functionality to filter by file type, storing the filtered files in `_files` https://github.com/hwchase17/langchain/pull/5220 removed the functionality when adding code to filter trashed files by using the `files` variables instead of the `_files` variable. This PR simply adds the functionality back by using `_files` again. #### Who can review? @hwchase17 - project lead @eyurtsev	2023-06-18 17:47:58 -07:00
Dhruvil Shah	ba90e3c990	Update web_base.ipynb for guiding purposes (#6248 ) To bypass SSL verification errors during fetching, you can include the `verify=False` parameter. This markdown proves useful, especially for beginners in the field of web scraping. <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes #6079 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:47:10 -07:00
Dhruvil Shah	92f05a67a4	Add markdown to specify important arguments (#6246 ) To bypass SSL verification errors during web scraping, you can include the ssl_verify=False parameter along with the headers parameter. This combination of arguments proves useful, especially for beginners in the field of web scraping. <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes #1829 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:47:00 -07:00
ikebo	ca7a44d024	add max_context_size property in BaseOpenAI (#6239 ) Hi, I make a small improvement for BaseOpenAI. I added a max_context_size attribute to BaseOpenAI so that we can get the max context size directly instead of only getting the maximum token size of the prompt through the max_tokens_for_prompt method. Who can review? @hwchase17 @agola11 I followed the [Common Tasks](`c7db9febb0/.github/CONTRIBUTING.md`), the test is all passed. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:46:35 -07:00
Jan Pawellek	3e3ed8c5c9	Fix LLM types so that they can be loaded from config dicts (#6235 ) LLM configurations can be loaded from a Python dict (or JSON file deserialized as dict) using the [load_llm_from_config](`8e1a7a8646/langchain/llms/loading.py (L12)`) function. However, the type string in the `type_to_cls_dict` lookup dict differs from the type string defined in some LLM classes. This means that the LLM object can be saved, but not loaded again, because the type strings differ.	2023-06-18 17:46:22 -07:00
Shu	46782ad79b	Fixed an unhandled error that was raised when DynamoDB did not have any chat history. (#6141 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> The current version of chat history with DynamoDB doesn't handle the case correctly when a table has no chat history. This change solves this error handling. <!-- Remove if not applicable --> Fixes https://github.com/hwchase17/langchain/issues/6088 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:39:19 -07:00
Cameron Vetter	2286204354	Correct AzureSearch Vector Store not applying search_kwargs when searching (#6132 ) Fixes #6131 Simply passes kwargs forward from similarity_search to helper functions so that search_kwargs are applied to search as originally intended. See bug for repro steps. #### Who can review? @hwchase17 @dev2049 Twitter: poshporcupine	2023-06-18 17:39:06 -07:00
Pierre Dulac	395a2a3724	Fix typo in the CAI critique prompt (#6123 ) Very small typo in the Constitutional AI critique default prompt. The negation "If there is no material critique of ..." is used two times, should be used only on the first one. Cheers, Pierre	2023-06-18 17:38:56 -07:00
Hao Chen	38057f0d2e	Fix latest clickhouse vector schema change (#6385 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes https://github.com/hwchase17/langchain/issues/6208 <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> VectorStores / Retrievers / Memory - @dev2049	2023-06-18 17:34:53 -07:00
Davit Buniatyan	1ab9dc8293	[hotfix] Deep Lake fails on newer version due to hardcode (#6383 ) Hot Fixes for Deep Lake [would highly appreciate expedited review] * deeplake version was hardcoded and since deeplake upgraded the integration fails with confusing error * an additional integration test fixed due to embedding function * Additionally fixed docs for code understanding links after docs upgraded * notebook removal of public parameter to make sure code understanding notebook works #### Who can review? @hwchase17 @dev2049 --------- Co-authored-by: Davit Buniatyan <d@activeloop.ai>	2023-06-18 17:33:49 -07:00
hp0404	6aa7b04f79	Fix integration tests for Faiss vector store (#6281 ) Fixes #5807 (issue) #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:25:49 -07:00
Chakib Benziane	ddd518a161	searx_search: updated tools and doc (#6276 ) - Allows using the same wrapper to create multiple tools ```python wrapper = SearxSearchWrapper(searx_host="**") github_tool = SearxSearchResults(name="Github", wrapper=wrapper, kwargs = { "engines": ["github"], }) arxiv_tool = SearxSearchResults(name="Arxiv", wrapper=wrapper, kwargs = { "engines": ["arxiv"] }) ``` - Updated link to searx documentation Agents / Tools / Toolkits - @hwchase17	2023-06-18 17:23:12 -07:00
ju-bezdek	e2f36ee608	OpenAI functions dont work with async streaming... #6225 (#6226 ) Related to this https://github.com/hwchase17/langchain/issues/6225 Just copied the implementation from `generate` function to `agenerate` and tested it. Didn't run any official tests thought <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes #6225 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17, @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:05:16 -07:00
Jan Pawellek	ea6a5b03e0	Fix output final text for HuggingFaceTextGenInference when streaming (#6211 ) The LLM integration [HuggingFaceTextGenInference](https://github.com/hwchase17/langchain/blob/master/langchain/llms/huggingface_text_gen_inference.py) already has streaming support. However, when streaming is enabled, it always returns an empty string as the final output text when the LLM is finished. This is because `text` is instantiated with an empty string and never updated. This PR fixes the collection of the final output text by concatenating new tokens.	2023-06-18 17:01:15 -07:00
Tomaz Bratanic	b3bccabc66	Add option to save/load graph cypher QA (#6219 ) Similar as https://github.com/hwchase17/langchain/pull/5818 Added the functionality to save/load Graph Cypher QA Chain due to a user reporting the following error > raise NotImplementedError("Saving not supported for this chain type.")\nNotImplementedError: Saving not supported for this chain type.\n'	2023-06-18 17:00:27 -07:00
Harrison Chase	495128ba95	Harrison/functions docs improvements (#6389 ) Co-authored-by: Sumanth Donthula <46747610+sumanthdonthula@users.noreply.github.com>	2023-06-18 16:57:33 -07:00
Leonid Ganeline	c7ca350cd3	Fix class promotion (#6187 ) In LangChain, all module classes are enumerated in the `__init__.py` file of the correspondent module. But some classes were missed and were not included in the module `__init__.py` This PR: - added the missed classes to the module `__init__.py` files - `__init__.py:__all_` variable value (a list of the class names) was sorted - `langchain.tools.sql_database.tool.QueryCheckerTool` was renamed into the `QuerySQLCheckerTool` because it conflicted with `langchain.tools.spark_sql.tool.QueryCheckerTool` - changes to `pyproject.toml`: - added `pgvector` to `pyproject.toml:extended_testing` - added `pandas` to `pyproject.toml:[tool.poetry.group.test.dependencies]` - commented out the `streamlit` from `collbacks/__init__.py`, It is because now the `streamlit` requires Python >=3.7, !=3.9.7 - fixed duplicate names in `tools` - fixed correspondent ut-s #### Who can review? @hwchase17 @dev2049	2023-06-18 16:55:18 -07:00
Harrison Chase	c0c2fd0782	Harrison/zep mem (#6388 ) Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2023-06-18 16:53:35 -07:00
Harrison Chase	b7159c15cc	Harrison/metaphor search fix (#6387 ) Co-authored-by: jeffzwang <jeffreyzhiyuanwang@gmail.com>	2023-06-18 16:53:24 -07:00
Harrison Chase	9bf5b0defa	Harrison/myscale self query (#6376 ) Co-authored-by: Fangrui Liu <fangruil@moqi.ai> Co-authored-by: 刘方瑞 <fangrui.liu@outlook.com> Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>	2023-06-18 16:53:10 -07:00
Harrison Chase	bd8d418a95	Merge branch 'master' of github.com:hwchase17/langchain	2023-06-18 16:45:49 -07:00
Harrison Chase	3a75d59c3d	searx - docs	2023-06-18 16:45:42 -07:00
MIDORIBIN	5be465bd86	Fixed PermissionError on windows (#6170 ) Fixed PermissionError that occurred when downloading PDF files via http in BasePDFLoader on windows. When downloading PDF files via http in BasePDFLoader, NamedTemporaryFile is used. This function cannot open the file again on Windows.[Python Doc](https://docs.python.org/3.9/library/tempfile.html#tempfile.NamedTemporaryFile) So, we created a temporary directory with TemporaryDirectory and placed the downloaded file there. temporary directory is deleted in the deconstruct. Fixes #2698 #### Who can review? Tag maintainers/contributors who might be interested: - @eyurtsev - @hwchase17	2023-06-18 16:39:57 -07:00
xleven	4fc7939848	fix link of callbacks on modules page (#6323 ) Since [Callbacks](https://python.langchain.com/docs/modules/callbacks/getting_started/) on [Modules](https://python.langchain.com/docs/modules/) went to a "Page Not Found".	2023-06-18 15:08:12 -07:00
Vijay	2b3b4e0f60	Add the ability to run the map_reduce chains process results step as async (#6181 ) This will add the ability to add an AsyncCallbackManager (handler) for the reducer chain, which would be able to stream the tokens via the `async def on_llm_new_token` callback method Fixes # (issue) [5532](https://github.com/hwchase17/langchain/issues/5532) @hwchase17 @agola11 The following code snippet explains how this change would be used to enable `reduce_llm` with streaming support in a `map_reduce` chain I have tested this change and it works for the streaming use-case of reducer responses. I am happy to share more information if this makes solution sense. ``` AsyncHandler .......................... class StreamingLLMCallbackHandler(AsyncCallbackHandler): """Callback handler for streaming LLM responses.""" def __init__(self, websocket): self.websocket = websocket # This callback method is to be executed in async async def on_llm_new_token(self, token: str, **kwargs: Any) -> None: resp = ChatResponse(sender="bot", message=token, type="stream") await self.websocket.send_json(resp.dict()) Chain .......... stream_handler = StreamingLLMCallbackHandler(websocket) stream_manager = AsyncCallbackManager([stream_handler]) streaming_llm = ChatOpenAI( streaming=True, callback_manager=stream_manager, verbose=False, temperature=0, ) main_llm = OpenAI( temperature=0, verbose=False, ) doc_chain = load_qa_chain( llm=main_llm, reduce_llm=streaming_llm, chain_type="map_reduce", callback_manager=manager ) qa_chain = ConversationalRetrievalChain( retriever=vectorstore.as_retriever(), combine_docs_chain=doc_chain, question_generator=question_generator, callback_manager=manager, ) # Here `acall` will trigger `acombine_docs` on `map_reduce` which should then call `_aprocess_result` which in turn will call `self.combine_document_chain.arun` hence async callback will be awaited result = await qa_chain.acall( {"question": question, "chat_history": chat_history} ) ```	2023-06-18 13:19:56 -07:00
Alvaro Bartolome	e0dea577ee	Extend `ArgillaCallbackHandler` support (#6153 ) Hi again @agola11! 🤗 ## What's in this PR? After playing around with different chains we noticed that some chains were using different `output_key`s and we were just handling some, so we've extended the support to any output, either if it's a Python list or a string. Kudos to @dvsrepo for spotting this! --------- Co-authored-by: Daniel Vila Suero <daniel@argilla.io>	2023-06-18 11:18:33 -07:00
Harrison Chase	a8cb9ee013	Harrison/gdrive enhancements (#6375 ) Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>	2023-06-18 11:07:23 -07:00
rafael	ebfffaa38f	Guardrails output parser: Pass LLM api for reasking (#6089 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes https://github.com/ShreyaR/guardrails/issues/155 Enables guardrails reasking by specifying an LLM api in the output parser.	2023-06-18 10:50:20 -07:00
Davis Chase	ec850e607f	bump 203 (#6372 )	2023-06-18 09:20:47 -07:00
Lance Martin	370becdfc2	Add self query retriever example with MD header splitting (#6359 ) Flesh out the notebook example for `MarkdownHeaderTextSplitter`	2023-06-17 21:40:20 -07:00
Lance Martin	2c97fbabbd	Update MD header text splitter notebook (#6339 ) Highlight use case for maintaining header groups when splitting.	2023-06-17 13:19:27 -07:00
Harrison Chase	a2bbe3dda4	Harrison/mmr support for opensearch (#6349 ) Co-authored-by: Mehmet Öner Yalçın <oneryalcin@gmail.com>	2023-06-17 12:22:37 -07:00
Davis Chase	2eea5d4cb4	Add ignore vercel preview script (#6320 ) skip building preview of docs for anything branch that doesn't start with `__docs__`. will eventually update to look at code diff directories but patching for now	2023-06-17 11:17:08 -07:00
Harrison Chase	7a48d9ee82	Merge branch 'master' of github.com:hwchase17/langchain	2023-06-17 11:16:19 -07:00
Kenny	e30fdffd1e	Add new openai 0613 model costs (#6110 ) Added costs for gpt-4-32k-0613, gpt-4-0613, gpt-3.5-turbo-16k, gpt-3.5-turbo-0613, and gpt-3.5-turbo-16k-0613 to openai_info callback based on this [OpenAI post](https://openai.com/blog/function-calling-and-other-api-updates) @agola11	2023-06-17 11:11:47 -07:00
Dhruvil Shah	2eec687474	update web_base.py to have verify option (#6107 ) We propose an enhancement to the web-based loader initialize method by introducing a "verify" option. This enhancement addresses the issue of SSL verification errors encountered on certain web pages. By providing users with the option to set the verify parameter to False, we offer greater flexibility and control. <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> ### Fixes #6079 #### Who can review? @eyurtsev @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-17 11:10:48 -07:00
Harrison Chase	680d6bbbf8	fix titles in documentation	2023-06-17 11:09:11 -07:00
Nuno Campos	e194dc5306	Make lckwargs private (#6344 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-17 19:08:25 +01:00
Harrison Chase	8cfb52ddbb	fix spelling	2023-06-17 11:06:54 -07:00
zengbo	5d5298087f	Custom Anthropic API URL (#6221 ) [Feature] User can custom the Anthropic API URL #### Who can review? Tag maintainers/contributors who might be interested: Models - @hwchase17 - @agola11	2023-06-17 11:01:29 -07:00
Harrison Chase	61e4a1adf9	Harrison/faiss score (#6341 ) Co-authored-by: Frank Stein <16441059+simonfromla@users.noreply.github.com> Co-authored-by: Sims Juju <sims@Ju.lan>	2023-06-17 11:00:47 -07:00
Harrison Chase	42a28ac1ba	Harrison/error zero tools (#6340 ) Co-authored-by: Juhee Kim <46583939+juppytt@users.noreply.github.com>	2023-06-17 11:00:35 -07:00
Slawomir Gonet	eef62bf4e9	qdrant: search by vector (#6043 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Added support to `search_by_vector` to Qdrant Vector store. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ### Who can review VectorStores / Retrievers / Memory - @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 -->	2023-06-17 09:44:28 -07:00
Mark	b7ba7e8a7b	Allow GoogleDrive to authenticate via application default credentials on Cloud Run/GCE etc without service key (#6035 ) @eyurtsev The existing GoogleDrive implementation always needs a service account to be available at the credentials location. When running on GCP services such as Cloud Run, a service account already exists in the metadata of the service, so no physical key is necessary. This change adds a check to see if it is running in such an environment, and uses that authentication instead. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-17 09:44:17 -07:00
lonestriker	6f36f0f930	Add oobabooga/text-generation-webui support as a llm (#5997 ) Add oobabooga/text-generation-webui support as an LLM. Currently, supports using text-generation-webui's non-streaming API interface. Allows users who already have text-gen running to use the same models with langchain. #### Before submitting Simple usage, similar to existing LLM supported: ``` from langchain.llms import TextGen llm = TextGen(model_url = "http://localhost:5000") ``` #### Who can review? @hwchase17 - project lead --------- Co-authored-by: Hien Ngo <Hien.Ngo@adia.ae>	2023-06-17 09:42:15 -07:00
Richy Wang	444ca3f669	Improve AnalyticDB Vector Store implementation without affecting user (#6086 ) Hi there: As I implement the AnalyticDB VectorStore use two table to store the document before. It seems just use one table is a better way. So this commit is try to improve AnalyticDB VectorStore implementation without affecting user behavior: 1. Streamline the `post_init `behavior by creating a single table with vector indexing. 2. Update the `add_texts` API for document insertion. 3. Optimize `similarity_search_with_score_by_vector` to retrieve results directly from the table. 4. Implement `_similarity_search_with_relevance_scores`. 5. Add `embedding_dimension` parameter to support different dimension embedding functions. Users can continue using the API as before. Test cases added before is enough to meet this commit.	2023-06-17 09:36:31 -07:00
Ja-sonYun	cdd1d78bf2	make modelname_to_contextsize as a staticmethod (#6040 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes ##6039 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17　@agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-17 09:13:08 -07:00
Saba Sturua	427551eabf	DocArray as a Retriever (#6031 ) ## DocArray as a Retriever [DocArray](https://github.com/docarray/docarray) is an open-source tool for managing your multi-modal data. It offers flexibility to store and search through your data using various document index backends. This PR introduces `DocArrayRetriever` - which works with any available backend and serves as a retriever for Langchain apps. Also, I added 2 notebooks: DocArray Backends - intro to all 5 currently supported backends, how to initialize, index, and use them as a retriever DocArray Usage - showcasing what additional search parameters you can pass to create versatile retrievers Example: ```python from docarray.index import InMemoryExactNNIndex from docarray import BaseDoc, DocList from docarray.typing import NdArray from langchain.embeddings.openai import OpenAIEmbeddings from langchain.retrievers import DocArrayRetriever # define document schema class MyDoc(BaseDoc): description: str description_embedding: NdArray[1536] embeddings = OpenAIEmbeddings() # create documents descriptions = ["description 1", "description 2"] desc_embeddings = embeddings.embed_documents(texts=descriptions) docs = DocList[MyDoc]( [ MyDoc(description=desc, description_embedding=embedding) for desc, embedding in zip(descriptions, desc_embeddings) ] ) # initialize document index with data db = InMemoryExactNNIndex[MyDoc](docs) # create a retriever retriever = DocArrayRetriever( index=db, embeddings=embeddings, search_field="description_embedding", content_field="description", ) # find the relevant document doc = retriever.get_relevant_documents("action movies") print(doc) ``` #### Who can review? @dev2049 --------- Signed-off-by: jupyterjazz <saba.sturua@jina.ai>	2023-06-17 09:09:33 -07:00
Masafumi Mori	7bb437146d	fix links to prompt templates and example selectors (#6332 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # links to prompt templates and example selectors on the [Prompts](https://python.langchain.com/docs/modules/model_io/prompts/) page are invalid. #### Before submitting Just a small note that I tried to run `make docs_clean` and other related commands before PR written [here](https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md#build-documentation-locally), it gives me an error: ```bash langchain % make docs_clean Traceback (most recent call last): File "/Users/masafumi/Downloads/langchain/.venv/bin/make", line 5, in <module> from scripts.proto import main ModuleNotFoundError: No module named 'scripts' make: *** [docs_clean] Error 1 # Poetry (version 1.5.1) # Python 3.9.13 ``` I couldn't figure out how to fix this, so I didn't run those command. But links should work. #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 Similar issue #6323 Co-authored-by: masafumimori <m.masafumimori@outlook.com>	2023-06-17 09:07:14 -07:00
Francisco Ingham	83eea230f3	changed height in the nb example (#6327 ) changed height in the example to a more reasonable number (from 9 feet to 6 feet)	2023-06-17 00:05:48 -07:00
James O'Dwyer	0475d015fe	Handle Managed Motorhead Data Key (#6169 ) # Handle Managed Motorhead Data Key Managed motorhead will return a payload with a `data` key. we need to handle this to properly access messages from the server.	2023-06-16 20:36:18 -07:00
Luke Stanley	364f8e7b5d	Better Entity Memory code documentation (#6318 ) Just adds some comments and docstring improvements. There was some behaviour that was quite unclear to me at first like: - "when do things get updated?" - "why are there only entity names and no summaries?" - "why do the entity names disappear?" Now it can be much more obvious to many. I am lukestanley on Twitter.	2023-06-16 18:08:44 -07:00
Harrison Chase	af18413d97	Harrison/deeplake new features (#6263 ) Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-16 17:53:55 -07:00
Davis Chase	6640293087	fix eval guide links (#6319 )	2023-06-16 17:53:46 -07:00
ljeagle	ad324a39ae	Improve the performance of add_texts interface and upgrade the AwaDB from 0.3.2 to 0.3.3 (#6316 ) 1. Changed the implementation of add_texts interface for the AwaDB vector store in order to improve the performance 2. Upgrade the AwaDB from 0.3.2 to 0.3.3 --------- Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-06-16 16:50:01 -07:00
Davis Chase	24b2af5218	nit (#6305 )	2023-06-16 16:21:27 -07:00
Pierre Alexandre SCHEMBRI	9ca11c06b7	Fixes #6282 (#6283 ) Fixes #6282 1 liner to fix default http headers not passed by `LLMRequestsChain`	2023-06-16 16:21:01 -07:00
Davis Chase	23cdebddc4	Del linkcheck readme (#6317 )	2023-06-16 16:18:45 -07:00
Brigit Murtaugh	ccd916babe	Update dev container (#6189 ) Fixes https://github.com/hwchase17/langchain/issues/6172 As described in https://github.com/hwchase17/langchain/issues/6172, I'd love to help update the dev container in this project. Summary of changes: - Dev container now builds (the current container in this repo won't build for me) - Dockerfile updates - Update image to our [currently-maintained Python image](https://github.com/devcontainers/images/tree/main/src/python/.devcontainer) (`mcr.microsoft.com/devcontainers/python`) rather than the deprecated image from vscode-dev-containers - Move Dockerfile to root of repo - in order for `COPY` to work properly, it needs the files (in this case, `pyproject.toml` and `poetry.toml`) in the same directory - devcontainer.json updates - Removed `customizations` and `remoteUser` since they should be covered by the updated image in the Dockerfile - Update comments - Update docker-compose.yaml to properly point to updated Dockerfile - Add a .gitattributes to avoid line ending conversions, which can result in hundreds of pending changes ([info](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files)) - Add a README in the .devcontainer folder and info on the dev container in the contributing.md Outstanding questions: - Is it expected for `poetry install` to take some time? It takes about 30 minutes for this dev container to finish building in a Codespace, but a user should only have to experience this once. Through some online investigation, this doesn't seem unusual - Versions of poetry newer than 1.3.2 failed every time - based on some of the guidance in contributing.md and other online resources, it seemed changing poetry versions might be a good solution. 1.3.2 is from Jan 2023 --------- Co-authored-by: bamurtaugh <brmurtau@microsoft.com> Co-authored-by: Samruddhi Khandale <samruddhikhandale@github.com>	2023-06-16 15:42:14 -07:00
Davis Chase	03b5891cf7	more redirect (#6314 )	2023-06-16 14:43:59 -07:00
Davis Chase	eaee492dbc	basic redirect (#6309 )	2023-06-16 13:39:58 -07:00
Davis Chase	d2243757a3	update readme (#6304 )	2023-06-16 12:27:16 -07:00
Davis Chase	2f47e5c766	update api link (#6303 )	2023-06-16 12:18:17 -07:00
Davis Chase	d558bcfad8	rm ignore_vercel (#6302 )	2023-06-16 12:06:58 -07:00
Davis Chase	87e502c6bc	Doc refactor (#6300 ) Co-authored-by: jacoblee93 <jacoblee93@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-16 11:52:56 -07:00
Harrison Chase	94c82a189d	bump to 202 (#6262 )	2023-06-16 06:52:36 -07:00
hp0404	b01cf0dd54	ArxivAPIWrapper - doc_content_chars_max (#6063 ) This PR refactors the ArxivAPIWrapper class making `doc_content_chars_max` parameter optional. Additionally, tests have been added to ensure the functionality of the doc_content_chars_max parameter. Fixes #6027 (issue)	2023-06-15 22:16:42 -07:00
Daniel King	a9b97aa6f4	Update output format of MosaicML endpoint to be more flexible (#6060 ) There will likely be another change or two coming over the next couple weeks as we stabilize the API, but putting this one in now which just makes the integration a bit more flexible with the response output format. ``` (langchain) danielking@MML-1B940F4333E2 langchain % pytest tests/integration_tests/llms/test_mosaicml.py tests/integration_tests/embeddings/test_mosaicml.py =================================================================================== test session starts =================================================================================== platform darwin -- Python 3.10.11, pytest-7.3.1, pluggy-1.0.0 rootdir: /Users/danielking/github/langchain configfile: pyproject.toml plugins: asyncio-0.20.3, mock-3.10.0, dotenv-0.5.2, cov-4.0.0, anyio-3.6.2 asyncio: mode=strict collected 12 items tests/integration_tests/llms/test_mosaicml.py ...... [ 50%] tests/integration_tests/embeddings/test_mosaicml.py ...... [100%] =================================================================================== slowest 5 durations =================================================================================== 4.76s call tests/integration_tests/llms/test_mosaicml.py::test_retry_logic 4.74s call tests/integration_tests/llms/test_mosaicml.py::test_mosaicml_llm_call 4.13s call tests/integration_tests/llms/test_mosaicml.py::test_instruct_prompt 0.91s call tests/integration_tests/llms/test_mosaicml.py::test_short_retry_does_not_loop 0.66s call tests/integration_tests/llms/test_mosaicml.py::test_mosaicml_extra_kwargs =================================================================================== 12 passed in 19.70s =================================================================================== ``` #### Who can review? @hwchase17 @dev2049	2023-06-15 22:15:39 -07:00
JaysonAlbert	50d9c7d5a4	Fix: change the chatgpt plugin retriever metadata format (#5920 ) the current implement put the doc itself as the metadata, but the document chatgpt plugin retriever returned already has a `metadata` field, it's better to use that instead. the original code will throw the following exception when using `RetrievalQAWithSourcesChain`, becuse it can not find the field `metadata`: ```python Exception has occurred: ValueError (note: full exception trace is shown but execution is paused at: _run_module_as_main) Document prompt requires documents to have metadata variables: ['source']. Received document with missing metadata: ['source']. File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/base.py", line 27, in format_document raise ValueError( File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 65, in <listcomp> doc_strings = [format_document(doc, self.document_prompt) for doc in docs] File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 65, in _get_inputs doc_strings = [format_document(doc, self.document_prompt) for doc in docs] File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/stuff.py", line 85, in combine_docs inputs = self._get_inputs(docs, **kwargs) File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/combine_documents/base.py", line 84, in _call output, extra_return_dict = self.combine_docs( File "/home/wangjie/anaconda3/envs/chatglm/lib/python3.10/site-packages/langchain/chains/base.py", line 140, in __call__ raise e ``` Additionally, the `metadata` filed in the `chatgpt plugin retriever` have these fileds by default: ```json { "source": "file", //email, file or chat "source_id": "filename.docx", // the filename "url": "", ... } ``` so, we should set `source_id` to `source` in the langchain metadata. ```python metadata = d.pop("metadata", d) if(metadata.get("source_id")): metadata["source"] = metadata.pop("source_id") ``` #### Who can review? @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: wangjie <wangjie@htffund.com>	2023-06-15 22:04:45 -07:00
Harrison Chase	e67b26eee9	Harrison/openai functions (#6261 ) Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>	2023-06-15 21:54:39 -07:00
Harrison Chase	6aafb46807	Harrison/openai functions (#6223 ) Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>	2023-06-15 21:43:33 -07:00
Zander Chase	bc9b8c8239	Improve Error Message for failed callback (#6247 ) Include the handler class name in the warning	2023-06-15 19:18:37 -07:00
Alon Roth	0013256e81	Support chat history persistence in AutoGPT (#5716 ) Short Description Added a new argument to AutoGPT class which allows to persist the chat history to a file. Changes 1. Removed the `self.full_message_history: List[BaseMessage] = []` 2. Replaced it with `chat_history_memory` which can take any subclasses of `BaseChatMessageHistory` --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-15 17:49:03 -07:00
Martin Antos	1913320cbe	Feature/add acreom loader (#5780 ) adding new loader for [acreom](https://acreom.com) vaults. It's based on the Obsidian loader with some additional text processing for acreom specific markdown elements. @eyurtsev please take a look! --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-15 11:53:00 -07:00
Zander Chase	ae76e473e1	Add Tags for LLMs (#6229 ) - [x] Add tracing tags to LLMs + Chat Models (both inheritable and local) - [x] Add tags for the run_on_dataset helper function(s)	2023-06-15 11:24:11 -07:00
Harrison Chase	8e1a7a8646	bump version to 201 (#6233 )	2023-06-15 08:28:47 -07:00
Harrison Chase	e82687ddf4	Harrison/use functions agent (#6185 ) Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>	2023-06-15 08:18:50 -07:00
Ryo Kanazawa	7d2b946d0b	Fix typo `pandocs` to `pandoc` (#6203 ) Fixes https://github.com/hwchase17/langchain/issues/6204 ### Context An typo issue with `pandoc`. #### Who can review? @hwchase17	2023-06-15 08:18:27 -07:00
Kyle Roth	c7db9febb0	count tokens for new OpenAI model versions (#6195 ) Trying to call `ChatOpenAI.get_num_tokens_from_messages` returns the following error for the newly announced models `gpt-3.5-turbo-0613` and `gpt-4-0613`: ``` NotImplementedError: get_num_tokens_from_messages() is not presently implemented for model gpt-3.5-turbo-0613.See https://github.com/openai/openai-python/blob/main/chatml.md for information on how messages are converted to tokens. ``` This adds support for counting tokens for those models, by counting tokens the same way they're counted for the previous versions of `gpt-3.5-turbo` and `gpt-4`. #### reviewers - @hwchase17 - @agola11	2023-06-15 06:16:03 -07:00
xu0o0	7ad13cdbdb	feat: add content_format param to ConfluenceLoader.load() (#5922 ) Confluence API supports difference format of page content. The storage format is the raw XML representation for storage. The view format is the HTML representation for viewing with macros rendered as though it is viewed by users. Add the `content_format` parameter to `ConfluenceLoader.load()` to specify the content format, this is set to `ContentFormat.STORAGE` by default. #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-14 16:56:28 -07:00
0xJordan	c5a46e7435	feat: Add support for the Solidity language (#6054 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> ## Add Solidity programming language support for code splitter. Twitter: @0xjord4n_ <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-14 14:25:02 -07:00
Nuno Campos	17c4ec4812	Add docs for tags (#6155 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-14 14:01:58 -07:00
thiswillbeyourgithub	4a649e3b14	typo: 'following following' to 'following' (#6163 ) Co-authored-by: thiswillbeyourgithub <github@32mail.33mail.com>	2023-06-14 10:58:47 -07:00
Maciej Bryński	8a44c879c6	Update readthedocs_documentation.ipynb (#6148 ) Minor fix in documentation. Change URL in wget call to proper one.	2023-06-14 07:21:48 -07:00
Zander Chase	e0e3ef1c57	Update Name (#6136 )	2023-06-13 22:25:36 -07:00
Zander Chase	4555ad5d1f	Add Run Collector Callback (#6133 ) Add a callback handler that can collect nested run objects. Useful for evaluation.	2023-06-13 22:17:37 -07:00
Harrison Chase	6ac120f299	bump ver to 200 (#6130 )	2023-06-13 19:33:51 -07:00
Harrison Chase	e41f0b341c	add functions agent (#6113 )	2023-06-13 18:51:01 -07:00
Zander Chase	b3b155d488	Return session name in runner response (#6112 ) Makes it easier to then run evals w/o thinking about specifying a session	2023-06-13 16:59:43 -07:00
Harrison Chase	e74733ab9e	support streaming for functions (#6115 )	2023-06-13 15:26:26 -07:00
Nuno Campos	11ab0be11a	Add support for tags (#5898 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-13 12:30:59 -07:00
Harrison Chase	1281fdf0f2	Harrison/notebook functions (#6103 )	2023-06-13 10:52:54 -07:00
Harrison Chase	34ebb29726	bump version to 199 (#6102 )	2023-06-13 10:50:33 -07:00
Wenchen Li	f9edf76e7c	Implement `max_marginal_relevance_search` in `VectorStore` of Pinecone (#6056 ) This adds implementation of MMR search in pinecone; and I have two semi-related observations about this vector store class: - Maybe we should also have a `similarity_search_by_vector_returning_embeddings` like in supabase, but it's not in the base `VectorStore` class so I didn't implement - Talking about the base class, there's `similarity_search_with_relevance_scores`, but in pinecone it is called `similarity_search_with_score`; maybe we should consider renaming it to align with other `VectorStore` base and sub classes (or add that as an alias for backward compatibility) #### Who can review? Tag maintainers/contributors who might be interested: - VectorStores / Retrievers / Memory - @dev2049	2023-06-13 10:46:45 -07:00
Harrison Chase	970b2f9d38	convert tools to openai (#6100 )	2023-06-13 10:40:49 -07:00
Harrison Chase	292accde2b	support functions (#6099 )	2023-06-13 10:32:58 -07:00
Lance Martin	ee3d0513ad	Add tests and update notebook for MarkdownHeaderTextSplitter (#6069 ) Add test and update notebook for `MarkdownHeaderTextSplitter`.	2023-06-13 09:07:52 -07:00
Keshav Kumar	8fdf88b8e3	Fix for ModuleNotFoundError while running langchain-server. Issue #5833 (#6077 ) This PR fixes the error `ModuleNotFoundError: No module named 'langchain.cli'` Fixes https://github.com/hwchase17/langchain/issues/5833 (issue)	2023-06-13 08:37:07 -07:00
Zander Chase	0c52275bdb	Use Run object from SDK (#6067 ) Update the Run object in the tracer to extend that in the SDK to include the parameters necessary for tracking/tracing	2023-06-13 07:14:11 -07:00
Harrison Chase	cde1e8739a	turn off repr (#6078 )	2023-06-12 22:45:24 -07:00
Nuno Campos	a9b3b2e327	Enable serialization for anthropic (#6049 )	2023-06-12 22:39:10 -07:00
Harrison Chase	6ac5d80286	propogate kwargs fully (#6076 )	2023-06-12 22:37:55 -07:00
Harrison Chase	ec1a2adf9c	improve tools (#6062 )	2023-06-12 22:19:03 -07:00
Julius Lipp	5b6bbf4ab2	Add embaas document extraction api endpoints (#6048 ) # Introduces embaas document extraction api endpoints In this PR, we add support for embaas document extraction endpoints to Text Embedding Models (with LLMs, in different PRs coming). We currently offer the MTEB leaderboard top performers, will continue to add top embedding models and soon add support for customers to deploy thier own models. Additional Documentation + Infomation can be found [here](https://embaas.io). While developing this integration, I closely followed the patterns established by other langchain integrations. Nonetheless, if there are any aspects that require adjustments or if there's a better way to present a new integration, let me know! :) Additionally, I fixed some docs in the embeddings integration. Related PR: #5976 #### Who can review? DataLoaders - @eyurtsev	2023-06-12 19:13:52 -07:00
Zander Chase	2f0088039d	Log tracer errors (#6066 ) Example (would log several times if not for the helper fn. Would emit no logs due to mulithreading previously) ![image](https://github.com/hwchase17/langchain/assets/130414180/070d25ae-1f06-4487-9617-0a6f66f3f01e)	2023-06-12 17:13:49 -07:00
Lance Martin	b023f0c0f2	Text splitter for Markdown files by header (#5860 ) This creates a new kind of text splitter for markdown files. The user can supply a set of headers that they want to split the file on. We define a new text splitter class, `MarkdownHeaderTextSplitter`, that does a few things: (1) For each line, it determines the associated set of user-specified headers (2) It groups lines with common headers into splits See notebook for example usage and test cases.	2023-06-12 15:46:42 -07:00
Jens Madsen	2c91f0d750	chore: spedd up integration test by using smaller model (#6044 ) Adds a new parameter `relative_chunk_overlap` for the `SentenceTransformersTokenTextSplitter` constructor. The parameter sets the chunk overlap using a relative factor, e.g. for a model where the token limit is 100, a `relative_chunk_overlap=0.5` implies that `chunk_overlap=50` Tag maintainers/contributors who might be interested: @hwchase17, @dev2049	2023-06-12 13:27:10 -07:00
Harrison Chase	5922742d56	comment out	2023-06-12 10:57:31 -07:00
Harrison Chase	681ba6d520	embaas title	2023-06-12 08:00:14 -07:00
Ben Flast	7a5e36f3f5	Mongo db doc fix (#6042 ) I missed a few errors in my initial fix @hwchase1. Thanks!	2023-06-12 07:29:27 -07:00
Harrison Chase	289e9aeb9d	bump ver to 198 (#6026 )	2023-06-11 21:32:45 -07:00
Harrison Chase	d1561b74eb	Harrison/cognitive search (#6011 ) Co-authored-by: Fabrizio Ruocco <ruoccofabrizio@gmail.com>	2023-06-11 21:15:42 -07:00
wenmeng zhou	bb7ac9edb5	add dashscope text embedding (#5929 ) #### What I do Adding embedding api for [DashScope](https://help.aliyun.com/product/610100.html), which is the DAMO Academy's multilingual text unified vector model based on the LLM base. It caters to multiple mainstream languages worldwide and offers high-quality vector services, helping developers quickly transform text data into high-quality vector data. Currently supported languages include Chinese, English, Spanish, French, Portuguese, Indonesian, and more. #### Who can review? Models - @hwchase17 - @agola11 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-11 21:14:20 -07:00
Ben Flast	010d0bfeea	Update MongoDB Atlas support docs (#6022 ) Updating MongoDB Atlas support docs @hwchase17 let me know if you have any questions	2023-06-11 20:57:15 -07:00
Harrison Chase	e05997c25e	Harrison/hologres (#6012 ) Co-authored-by: Changgeng Zhao <changgeng@nyu.edu> Co-authored-by: Changgeng Zhao <zhaochanggeng.zcg@alibaba-inc.com>	2023-06-11 20:56:51 -07:00
ljeagle	c5bce4a465	add from_documents interface in awadb vector store (#6023 ) added new interface from_documents in awadb vector store @dev2049 --------- Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-06-11 19:35:03 -07:00
Zander Chase	2c9619bc1d	Remove from PR template (#6018 )	2023-06-11 19:34:26 -07:00
ju-bezdek	18f5c985d9	Langchain decorators (#6017 ) Added description of LangChain Decorators ✨ into the integration section <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-11 19:32:24 -07:00
Zander Chase	a197acfcd3	Update check (#6020 ) We were assigning the name as None in on_chat_model_start then not updating, resulting in a validation error.	2023-06-11 17:59:09 -07:00
Nuno Campos	18af149e91	nc/load (#5733 ) Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-11 15:51:28 -07:00
Zander Chase	614cff89bc	I before E (#6015 )	2023-06-11 15:45:12 -07:00
Harrison Chase	a7227ee01b	Harrison/embaas (#6010 ) Co-authored-by: Julius Lipp <43986145+juliuslipp@users.noreply.github.com>	2023-06-11 13:35:14 -07:00
xu0o0	232faba796	fix: TypeError when loading confluence pages by cql (#5878 ) The Confluence loader uses the wrong API (`Confluence.cql()` provided by `atlassian-python-api`) to load pages by CQL. `Confluence.cql()` is a wrapper of the `/rest/api/search` API which searches for entities in Confluence. To search for pages in Confluence, the loader can use the `/rest/api/content/search` API. #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> #### References ##### Cloud API https://developer.atlassian.com/cloud/confluence/rest/v1/api-group-content/#api-wiki-rest-api-content-search-get https://developer.atlassian.com/cloud/confluence/rest/v1/api-group-search/#api-wiki-rest-api-search-get ##### Server API https://docs.atlassian.com/ConfluenceServer/rest/8.3.1/#api/content-search https://docs.atlassian.com/ConfluenceServer/rest/8.3.1/#api/search	2023-06-11 13:23:22 -07:00
Akhil Vempali	d7d629911b	feat: ✨ Added filtering option to FAISS vectorstore (#5966 ) Inspired by the filtering capability available in ChromaDB, added the same functionality to the FAISS vectorestore as well. Since FAISS does not have an inbuilt method of filtering used the approach suggested in this [thread](https://github.com/facebookresearch/faiss/issues/1079) Langchain Issue inspiration: https://github.com/hwchase17/langchain/issues/4572 - [x] Added filtering capability to semantic similarly and MMR - [x] Added test cases for filtering in `tests/integration_tests/vectorstores/test_faiss.py` #### Who can review? Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049 - @hwchase17	2023-06-11 13:20:03 -07:00
Jiaping(JP) Zhang	6e90406e0f	[APIChain] enhance the robustness or url (#6008 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> I used the APIChain sometimes it failed during the intermediate step when generating the api url and calling the `request` function. After some digging, I found the url sometimes includes the space at the beginning, like `%20https://...api.com` which causes the ` self.requests_wrapper.get` internal function to fail. Including a little string preprocessing `.strip` to remove the space seems to improve the robustness of the APIchain to make sure it can send the request and retrieve the API result more reliably. <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? @vowelparrot Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-11 13:13:57 -07:00
Ikko Eltociear Ashimine	c868a3eef3	Update databricks.md (#6006 ) HuggingFace -> Hugging Face #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review?	2023-06-11 13:13:33 -07:00
Harrison Chase	20e9ce8a62	bump version to 197 (#6007 )	2023-06-11 10:14:57 -07:00
Harrison Chase	704d56e241	support kwargs (#5990 )	2023-06-11 10:09:22 -07:00
Mark Pors	b934677a81	Obey handler.raise_error in _ahandle_event_for_handler (#6001 ) Obey `handler.raise_error` in `_ahandle_event_for_handler` Exceptions for async callbacks were only logged as warnings, also when `raise_error = True` #### Who can review? @hwchase17 @agola11	2023-06-11 09:49:26 -07:00
Harrison Chase	2d038b57b2	Harrison/arxiv fix (#5993 ) Co-authored-by: Juanjo do Olmo <87780148+SimplyJuanjo@users.noreply.github.com>	2023-06-11 09:48:09 -07:00
Vincent	0b740c9baa	add ocr_languages param for ConfluenceLoader.load() (#5823 ) @eyurtsev 当Confluence文档内容中包含附件，且附件内容为非英文时，提取出来的文本是乱码的。 When the content of the document contains attachments, and the content of the attachments is not in English, the extracted text is garbled. 这主要是因为没有为pytesseract传递lang参数，默认情况下只支持英文。 This is mainly because lang parameter is not passed to pytesseract, and only English is supported by default. 所以我给ConfluenceLoader.load()添加了ocr_languages参数，以便支持多种语言。 So I added the ocr_languages parameter to ConfluenceLoader.load () to support multiple languages.	2023-06-10 16:51:04 -07:00
Thomas B	ac3e6e3944	Fix IndexError in RecursiveCharacterTextSplitter (#5902 ) Fixes (not reported) an error that may occur in some cases in the RecursiveCharacterTextSplitter. An empty `new_separators` array ([]) would end up in the else path of the condition below and used in a function where it is expected to be non empty. ```python if new_separators is None: ... else: # _split_text() expects this array to be non-empty! other_info = self._split_text(s, new_separators) ``` resulting in an `IndexError` ```python def _split_text(self, text: str, separators: List[str]) -> List[str]: """Split incoming text and return chunks.""" final_chunks = [] # Get appropriate separator to use > separator = separators[-1] E IndexError: list index out of range langchain/text_splitter.py:425: IndexError ``` #### Who can review? @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 16:48:53 -07:00
Satheesh Valluru	d2270a2261	Fix: Grammer fix in documentation (#5925 ) Fix for grammatical errors in the documentation of `vectorstore`. @vowelparrot	2023-06-10 16:43:36 -07:00
Jens Madsen	1250cd4630	fix: use model token limit not tokenizer ditto (#5939 ) This fixes a token limit bug in the SentenceTransformersTokenTextSplitter. Before the token limit was taken from tokenizer used by the model. However, for some models the token limit of the tokenizer (from `AutoTokenizer.from_pretrained`) does not equal the token limit of the model. This was a false assumption. Therefore, the token limit of the text splitter is now taken from the sentence transformers model token limit. Twitter: @plasmajens #### Before submitting #### Who can review? @hwchase17 and/or @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 16:36:03 -07:00
Ofer Mendelevitch	f8cf09a230	Update to Vectara integration (#5950 ) This PR updates the Vectara integration (@hwchase17 ): * Adds reuse of requests.session to imrpove efficiency and speed. * Utilizes Vectara's low-level API (instead of standard API) to better match user's specific chunking with LangChain * Now add_texts puts all the texts into a single Vectara document so indexing is much faster. * updated variables names from alpha to lambda_val (to be consistent with Vectara docs) and added n_context_sentence so it's available to use if needed. * Updates to documentation and tests --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 16:27:01 -07:00
qued	e4224a396b	feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955 ) # Unstructured XML Loader Adds an `UnstructuredXMLLoader` class for .xml files. Works with unstructured>=0.6.7. A plain text representation of the text with the XML tags will be available under the `page_content` attribute in the doc. ### Testing ```python from langchain.document_loaders import UnstructuredXMLLoader loader = UnstructuredXMLLoader( "example_data/factbook.xml", ) docs = loader.load() ``` ## Who can review? @hwchase17 @eyurtsev	2023-06-10 16:24:42 -07:00
Lance Martin	21bd16bb59	Create Airtable loader (#5958 ) Create document loader for Airtable	2023-06-10 15:43:18 -07:00
Harrison Chase	9218684759	Add a new vector store - AwaDB (#5971 ) (#5992 ) Added AwaDB vector store, which is a wrapper over the AwaDB, that can be used as a vector storage and has an efficient similarity search. Added integration tests for the vector store Added jupyter notebook with the example Delete a unneeded empty file and resolve the conflict(https://github.com/hwchase17/langchain/pull/5886) Please check, Thanks! @dev2049 @hwchase17 --------- <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: ljeagle <vincent_jieli@yeah.net> Co-authored-by: vincent <awadb.vincent@gmail.com>	2023-06-10 15:42:32 -07:00
Tomaz Bratanic	d5819a7ca7	Add additional parameters to Graph Cypher Chain (#5979 ) Based on the inspiration from the SQL chain, the following three parameters are added to Graph Cypher Chain. - top_k: Limited the number of results from the database to be used as context - return_direct: Return database results without transforming them to natural language - return_intermediate_steps: Return intermediate steps	2023-06-10 14:39:55 -07:00
Daniel Grittner	0ca37e613c	Fix handling of missing action & input for async MRKL agent (#5985 ) Hi, This is a fix for https://github.com/hwchase17/langchain/pull/5014. This PR forgot to add the ability to self solve the ValueError(f"Could not parse LLM output: {llm_output}") error for `_atake_next_step`.	2023-06-10 14:38:20 -07:00
Harrison Chase	ca1afa7213	add test for structured tools (#5989 )	2023-06-10 14:37:26 -07:00
constDave	5f356b9993	Fixed typo missing "use" (#5991 ) <!-- Fixed a simple typo on https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/vectorstore.html where the word "use" was missing. #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-10 14:31:58 -07:00
Kaarthik Andavar	d6f5d0c6b1	Fix: SnowflakeLoader returning empty documents (#5967 ) Fix SnowflakeLoader's Behavior of Returning Empty Documents Description: This PR addresses the issue where the SnowflakeLoader was consistently returning empty documents. After investigation, it was found that the query method within the SnowflakeLoader was not properly fetching and processing the data. Changes: 1. Modified the query method in SnowflakeLoader to handle data fetch and processing more accurately. 2. Enhanced error handling within the SnowflakeLoader to catch and log potential issues that may arise during data loading. Impact: This fix will ensure the SnowflakeLoader reliably returns the expected documents instead of empty ones, improving the efficiency and reliability of data processing tasks in the LangChain project. Before Fix: `[ Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}), Document(page_content='', metadata={}) ]` After Fix: `[Document(page_content='CUSTOMER_ID: 1\nFIRST_NAME: John\nLAST_NAME: Doe\nEMAIL: john.doe@example.com\nPHONE: 555-123-4567\nADDRESS: 123 Elm St, San Francisco, CA 94102', metadata={}), Document(page_content='CUSTOMER_ID: 2\nFIRST_NAME: Jane\nLAST_NAME: Doe\nEMAIL: jane.doe@example.com\nPHONE: 555-987-6543\nADDRESS: 456 Oak St, San Francisco, CA 94103', metadata={}), Document(page_content='CUSTOMER_ID: 3\nFIRST_NAME: Michael\nLAST_NAME: Smith\nEMAIL: michael.smith@example.com\nPHONE: 555-234-5678\nADDRESS: 789 Pine St, San Francisco, CA 94104', metadata={}), Document(page_content='CUSTOMER_ID: 4\nFIRST_NAME: Emily\nLAST_NAME: Johnson\nEMAIL: emily.johnson@example.com\nPHONE: 555-345-6789\nADDRESS: 321 Maple St, San Francisco, CA 94105', metadata={}), Document(page_content='CUSTOMER_ID: 5\nFIRST_NAME: David\nLAST_NAME: Williams\nEMAIL: david.williams@example.com\nPHONE: 555-456-7890\nADDRESS: 654 Birch St, San Francisco, CA 94106', metadata={}), Document(page_content='CUSTOMER_ID: 6\nFIRST_NAME: Emma\nLAST_NAME: Jones\nEMAIL: emma.jones@example.com\nPHONE: 555-567-8901\nADDRESS: 987 Cedar St, San Francisco, CA 94107', metadata={}), Document(page_content='CUSTOMER_ID: 7\nFIRST_NAME: Oliver\nLAST_NAME: Brown\nEMAIL: oliver.brown@example.com\nPHONE: 555-678-9012\nADDRESS: 147 Cherry St, San Francisco, CA 94108', metadata={}), Document(page_content='CUSTOMER_ID: 8\nFIRST_NAME: Sophia\nLAST_NAME: Davis\nEMAIL: sophia.davis@example.com\nPHONE: 555-789-0123\nADDRESS: 369 Walnut St, San Francisco, CA 94109', metadata={}), Document(page_content='CUSTOMER_ID: 9\nFIRST_NAME: James\nLAST_NAME: Taylor\nEMAIL: james.taylor@example.com\nPHONE: 555-890-1234\nADDRESS: 258 Hawthorn St, San Francisco, CA 94110', metadata={}), Document(page_content='CUSTOMER_ID: 10\nFIRST_NAME: Isabella\nLAST_NAME: Wilson\nEMAIL: isabella.wilson@example.com\nPHONE: 555-901-2345\nADDRESS: 963 Aspen St, San Francisco, CA 94111', metadata={})] ` Tests: All unit and integration tests have been run and passed successfully. Additional tests were added to validate the new behavior of the SnowflakeLoader. Checklist: - [x] Code changes are covered by tests - [x] Code passes `make format` and `make lint` - [x] This PR does not introduce any breaking changes Please review and let me know if any changes are required.	2023-06-10 13:03:50 -07:00
Harrison Chase	62ec10a7f5	bump version to 196 (#5988 )	2023-06-10 09:06:35 -07:00
German Martin	736a1819aa	LOTR: Lord of the Retrievers. A retriever that merge several retrievers together applying document_formatters to them. (#5798 ) "One Retriever to merge them all, One Retriever to expose them, One Retriever to bring them all and in and process them with Document formatters." Hi @dev2049! Here bothering people again! I'm using this simple idea to deal with merging the output of several retrievers into one. I'm aware of DocumentCompressorPipeline and ContextualCompressionRetriever but I don't think they allow us to do something like this. Also I was getting in trouble to get the pipeline working too. Please correct me if i'm wrong. This allow to do some sort of "retrieval" preprocessing and then using the retrieval with the curated results anywhere you could use a retriever. My use case is to generate diff indexes with diff embeddings and sources for a more colorful results then filtering them with one or many document formatters. I saw some people looking for something like this, here: https://github.com/hwchase17/langchain/issues/3991 and something similar here: https://github.com/hwchase17/langchain/issues/5555 This is just a proposal I know I'm missing tests , etc. If you think this is a worth it idea I can work on tests and anything you want to change. Let me know! --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-10 08:41:02 -07:00
Lance Martin	f3e7ac0a2c	Add load() to snowflake loader (#5956 ) Quick fix for recently added [snowflake data loader](https://github.com/hwchase17/langchain/pull/5825/files).	2023-06-09 11:27:29 -07:00
Harrison Chase	3678cba0be	bump ver to 195 (#5949 )	2023-06-09 09:17:08 -07:00
Harrison Chase	7af186fddf	fixes to docs (#5919 )	2023-06-09 09:15:53 -07:00
Kacper Łukawski	7cc200766e	Expose full params in Qdrant (#5947 ) # Expose full params in Qdrant There were many questions regarding supporting some additional parameters in Qdrant integration. Qdrant supports many vector search optimizations that were impossible to use directly in Qdrant before. That includes: 1. Possibility to manipulate collection params while using `Qdrant.from_texts`. The PR allows setting things such as quantization, HNWS config, optimizers config, etc. That makes it consistent with raw `QdrantClient`. 2. Extended options while searching. It includes HNSW options, exact search, score threshold filtering, and read consistency in distributed mode. After merging that PR, #4858 might also be closed. ## Who can review? VectorStores / Retrievers / Memory @dev2049 @hwchase17	2023-06-09 08:56:32 -07:00
Rubén Martínez	db7ef635c0	Add support for the endpoint URL in DynamoDBChatMesasgeHistory (#5836 ) This PR adds the possibility of specifying the endpoint URL to AWS in the DynamoDBChatMessageHistory, so that it is possible to target not only the AWS cloud services, but also a local installation. Specifying the endpoint URL, which is normally not done when addressing the cloud services, is very helpful when targeting a local instance (like [Localstack](https://localstack.cloud/)) when running local tests. Fixes #5835 #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-08 23:21:11 -07:00
Lior	0eb1bc1a02	Fix the issue where the parameters passed to VertexAI ignored #5889 (#5891 ) Fixes #5889 and fixes the name of the argument in init_vertexai @hwchase17 @agola11 Co-authored-by: Lior Durahly <lior.durahly@superwise.ai>	2023-06-08 23:15:22 -07:00
Fei Wang	63fcf41bea	Fix openai proxy error (#5914 ) Fixes proxy error. Since openai does not parse proxy parameters and uses openai.proxy directly, the proxy method needs to be modified. `7610c5adfa/openai/api_requestor.py (LL90)` #### Who can review? @hwchase17 - project lead Models - @hwchase17 - @agola11 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-08 23:15:06 -07:00
felpigeon	2791a753bf	Add start index to metadata in TextSplitter (#5912 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> #### Add start index to metadata in TextSplitter - Modified method `create_documents` to track start position of each chunk - The `start_index` is included in the metadata if the `add_start_index` parameter in the class constructor is set to `True` This enables referencing back to the original document, particularly useful when a specific chunk is retrieved. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @eyurtsev @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-08 23:09:32 -07:00
Philip Kiely - Baseten	a09a0e3511	Baseten integration (#5862 ) This PR adds a Baseten integration. I've done my best to follow the contributor's guidelines and add docs, an example notebook, and an integration test modeled after similar integrations' test. Please let me know if there is anything I can do to improve the PR. When it is merged, please tag https://twitter.com/basetenco and https://twitter.com/philip_kiely as contributors (the note on the PR template said to include Twitter accounts)	2023-06-08 23:05:57 -07:00
Tamara Lazarevic	0ce8745928	Fix typo (#5894 )	2023-06-08 23:05:22 -07:00
Andrew Grangaard	d8ae925425	arxiv: Correct name of search client attribute to 'arxiv_search' from incorrect 'arxiv_client' (#5917 ) + this private attribute is referenced as `arxiv_search` in internal usage and is set when verifying the environment twitter: @spazm #### Who can review? Any of @hwchase17, @leo-gan, or @bongsang might be interested in reviewing. + Mismatch between `arxiv_client` attribute vs `arxiv_search` in validation and usage is present in the initial commit by @hwchase17. + @leo-gan has made most of the edits. + @bongsang implemented pdf download.	2023-06-08 22:49:11 -07:00
sergiolrinditex	fe8bbc2da7	Create snowflake Loader (#5825 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: rlm <pexpresss31@gmail.com>	2023-06-08 22:03:00 -07:00
Zander Chase	77c286cf02	Use LCP Client in Tracer (#5908 ) Move the LCP calls to the client.	2023-06-08 21:15:14 -07:00
Frank Hübner	3ec6400d70	Feature/add AWS Kendra Index Retriever (#5856 ) adding a new retriever for AWS Kendra @dev2049 please take a look!	2023-06-08 15:44:09 -07:00
Piyush Jain	a6ebffb695	Fixes model arguments for amazon models (#5896 ) Fixes #5713 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @agola11 @aarora79 @rsgrewal-aws	2023-06-08 14:16:01 -07:00
小铭	767fa91eae	Fix the shortcut conflict for document page search (#5874 ) Fix the document page to open both search and Mendable when pressing Ctrl+K. I have changed the shortcut for Mendable to Ctrl+J. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? @hwchase17 Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-08 14:15:19 -07:00
Zander Chase	5f74db4500	Update run eval imports in init (#5858 )	2023-06-08 10:44:36 -07:00
warjiang	511c12dd39	fix: update qa_chain doc for "chai_type" (#5877 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> `load_qa_with_sources_chain` method already support four type of chain, including `map_rerank`. update document to prevent any misunderstandings 😀. ![image](https://github.com/hwchase17/langchain/assets/6478745/325260b2-6121-4900-aef9-001febff811a) <!-- Remove if not applicable --> Fixes # (issue) No, just update document. #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? @hwchase17 Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-08 07:32:51 -07:00
Harrison Chase	893d20f735	bump version to 194 (#5866 )	2023-06-07 22:47:48 -07:00
Harrison Chase	35cfd25db3	Harrison/nebula graph (#5865 ) Co-authored-by: Wey Gu <weyl.gu@gmail.com> Co-authored-by: chenweisomebody <chenweisomebody@gmail.com>	2023-06-07 21:56:43 -07:00
Harrison Chase	658f8bdee7	Harrison/fauna loader (#5864 ) Co-authored-by: Shadid12 <Shadid12@users.noreply.github.com>	2023-06-07 21:32:23 -07:00
Liang Zhang	5518f24ec3	Implement saving and loading of RetrievalQA chain (#5818 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes #3983 Mimicing what we do for saving and loading VectorDBQA chain, I added the logic for RetrievalQA chain. Also added a unit test. I did not find how we test other chains for their saving and loading functionality, so I just added a file with one test case. Let me know if there are recommended ways to test it. #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 21:07:13 -07:00
Liang Zhang	b93638ef1e	Refactor and update databricks integration page (#5575 ) # Your PR Title (What it does) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-07 20:45:47 -07:00
volodymyr-memsql	a1549901ce	Added SingleStoreDB Vector Store (#5619 ) - Added `SingleStoreDB` vector store, which is a wrapper over the SingleStore DB database, that can be used as a vector storage and has an efficient similarity search. - Added integration tests for the vector store - Added jupyter notebook with the example @dev2049 --------- Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 20:45:33 -07:00
jjzhuo	78aa59c68b	Fix serialization issue with W&B (#5693 ) The chain input_documents are not displaying properly in W&B, due to serialization issue: <img width="1164" alt="Screenshot 2023-06-04 at 11 58 26 AM" src="https://github.com/hwchase17/langchain/assets/134809928/f31f14f6-0935-4cca-9913-6760cd40eadf"> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 20:44:59 -07:00
Alec Flett	ec0dd6e34a	propagate callbacks to ConversationalRetrievalChain (#5572 ) # Allow callbacks to monitor ConversationalRetrievalChain <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> I ran into an issue where load_qa_chain was not passing the callbacks down to the child LLM chains, and so made sure that callbacks are propagated. There are probably more improvements to do here but this seemed like a good place to stop. Note that I saw a lot of references to callbacks_manager, which seems to be deprecated. I left that code alone for now. ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-07 20:25:21 -07:00
Jeff Vestal	3294774148	Add knn and query search field options to ElasticKnnSearch (#5641 ) in the `ElasticKnnSearch` class added 2 arguments that were not exposed properly `knn_search` added: - `vector_query_field: Optional[str] = 'vector'` -- vector_query_field: Field name to use in knn search if not default 'vector' `knn_hybrid_search` added: - `vector_query_field: Optional[str] = 'vector'` -- vector_query_field: Field name to use in knn search if not default 'vector' - `query_field: Optional[str] = 'text'` -- query_field: Field name to use in search if not default 'text' Fixes # https://github.com/hwchase17/langchain/issues/5633 cc: @dev2049 @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 20:19:14 -07:00
Mark Marryatt	cef79ca579	Fix exporting GCP Vertex Matching Engine from vectorstores (#5793 ) The Vertex Matching Engine docs include [the line](`b177a29d3f/docs/modules/indexes/vectorstores/examples/matchingengine.ipynb (L32)`) `from langchain.vectorstores import MatchingEngine` which doesn't work as it wasn't added to the vectorestores module exports. - @dev2049	2023-06-07 19:45:33 -07:00
Dave Ingram	106364a45c	Update to Getting Started docs page for Memory (#5855 ) Simply fixing a small typo in the memory page. Also removed an extra code block at the end of the file. Along the way, the current outputs seem to have changed in a few places so left that for posterity, and updated the number of runs which seems harmless, though I can clean that up if preferred.	2023-06-07 19:45:21 -07:00
bnassivet	9355e3f5f5	qdrant vector store - search with relevancy scores (#5781 ) Implementation of similarity_search_with_relevance_scores for quadrant vector store. As implemented the method is also compatible with other capacities such as filtering. Integration tests updated. #### Who can review? Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	2023-06-07 19:26:40 -07:00
Ning Ren	f15763518a	docs: add Shale Protocol integration guide (#5814 ) This PR adds documentation for Shale Protocol's integration with LangChain. [Shale Protocol](https://shaleprotocol.com) provides forever-free production-ready inference APIs to the open-source community. We have global data centers and plan to support all major open LLMs (estimated ~1,000 by 2025). The team consists of software and ML engineers, AI researchers, designers, and operators across North America and Asia. Combined together, the team has 50+ years experience in machine learning, cloud infrastructure, software engineering and product development. Team members have worked at places like Google and Microsoft. #### Who can review? Tag maintainers/contributors who might be interested: - @hwchase17 - @agola11 --------- Co-authored-by: Karen Sheng <46656667+karensheng@users.noreply.github.com>	2023-06-07 19:25:59 -07:00
Duarte OC	137da7e4b6	Update microsoft loader example with docx2txt dependency (#5832 ) @eyurtsev	2023-06-07 19:21:48 -07:00
Aidan Holland	9f4b720a63	Add additional VertexAI Params (#5837 ) ## Changes - Added the `stop` param to the `_VertexAICommon` class so it can be set at llm initialization ## Example Usage ```python VertexAI( # ... temperature=0.15, max_output_tokens=128, top_p=1, top_k=40, stop=["\n```"], ) ``` ## Possible Reviewers - @hwchase17 - @agola11	2023-06-07 19:20:37 -07:00
Eduard van Valkenburg	76fcd96dae	Add logging in PBI tool (#5841 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Add some logging into the powerbi tool so that you can see the queries being sent to PBI and attempts to correct them. <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @vowelparrot <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-07 19:19:21 -07:00
Matt Robinson	11fec7d4d1	feat: Add `UnstructuredCSVLoader` for CSV files (#5844 ) ### Summary Adds an `UnstructuredCSVLoader` for loading CSVs. One advantage of using `UnstructuredCSVLoader` relative to the standard `CSVLoader` is that if you use `UnstructuredCSVLoader` in `"elements"` mode, an HTML representation of the table will be available in the metadata. #### Who can review? @hwchase17 @eyurtsev	2023-06-07 19:18:01 -07:00
Soos3D	0b4a51930c	Add how to use a custom scraping function with the sitemap loader. (#5847 ) Hi! I just added an example of how to use a custom scraping function with the sitemap loader. I recently used this feature and had to dig in the source code to find it. I thought it might be useful to other devs to have an example in the Jupyter Notebook directly. I only added the example to the documentation page. @eyurtsev I was not able to run the lint. Please let me know if I have to do anything else. I know this is a very small contribution, but I hope it will be valuable. My Twitter handle is @web3Dav3. <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-07 19:16:51 -07:00
Yessen Kanapin	c66755b661	Add DeepInfra embeddings integration with tests and examples, better exception handling for Deep Infra LLM (#5854 ) #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 - project lead - @agola11 --------- Co-authored-by: Yessen Kanapin <yessen@deepinfra.com>	2023-06-07 19:14:30 -07:00
ugfly1210	4d8cda1c3b	FIX: backslash escaped (#5815 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> LatexTextSplitter needs to use "\n\\\chapter" when separators are escaped, such as "\n\\\chapter", otherwise it will report an error: (re.error: bad escape \c at position 1 (line 2, column 1)) Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use re.error: bad escape \c at position 1 (line 2, column 1) See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? @hwchase17 @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> Co-authored-by: Pang <ugfly@qq.com>	2023-06-07 16:01:07 -07:00
Zander Chase	3af36943e8	Rm extraneous args to the trace group helper (#5801 ) These are being ignored	2023-06-07 13:09:29 -07:00
whysage	8ef7274ee6	feat: issue-5712 add sleep tool (#5715 ) Fixes # 5712 added sleep tool	2023-06-07 09:39:02 -07:00
Zander Chase	d9fcc45d05	Add in the async methods and link the run id (#5810 )	2023-06-07 08:27:44 -07:00
Harrison Chase	ce7c11625f	bump version to 193 (#5838 )	2023-06-07 07:38:57 -07:00
warjiang	5a207cce8f	fix: fullfill openai params when embedding (#5821 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes #5822 I upgrade my langchain lib by execute `pip install -U langchain`, and the verion is 0.0.192。But i found that openai.api_base not working. I use azure openai service as openai backend, the openai.api_base is very import for me. I hava compared tag/0.0.192 and tag/0.0.191, and figure out that: ![image](https://github.com/hwchase17/langchain/assets/6478745/e183fdb2-8224-45c9-b3b4-26d62823999a) openai params is moved inside `_invocation_params` function，and used in some openai invoke: ![image](https://github.com/hwchase17/langchain/assets/6478745/5a55a048-5fa9-4bf4-aaef-3902226bec5e) ![image](https://github.com/hwchase17/langchain/assets/6478745/85b8cebc-eeb8-4538-a525-814719c8f8df) but still some case not covered like: ![image](https://github.com/hwchase17/langchain/assets/6478745/e0297620-f2b2-4f4f-98bd-d0ed19022dac) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-07 07:32:57 -07:00
Harrison Chase	b3ae6bcd3f	bump ver to 192 (#5812 )	2023-06-06 22:23:11 -07:00
Harrison Chase	5468528748	rm docs mongo (#5811 )	2023-06-06 22:22:44 -07:00
Andrew Switlyk	69f4ffb851	Update adding_memory.ipynb (#5806 ) just change "to" to "too" so it matches the above prompt <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-06 22:10:53 -07:00
Sun bin	2be4fbb835	add doc about reusing MongoDBAtlasVectorSearch (#5805 ) DOC: add doc about reusing MongoDBAtlasVectorSearch #### Who can review? Anyone authorized.	2023-06-06 22:10:36 -07:00
bnassivet	062c3c00a2	fixed faiss integ tests (#5808 ) Fixes # 5807 Realigned tests with implementation. Also reinforced folder unicity for the test_faiss_local_save_load test using date-time suffix #### Before submitting - Integration test updated - formatting and linting ok (locally) #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 - project lead VectorStores / Retrievers / Memory -@dev2049	2023-06-06 22:07:27 -07:00
SvMax	92b87c2fec	added support for different types in ResponseSchema class (#5789 ) I added support for specifing different types with ResponseSchema objects: ## before ` extracted_info = ResponseSchema(name="extracted_info", description="List of extracted information") ` generate the following doc: ```json\n{\n\t\"extracted_info\": string // List of extracted information}``` This brings GPT to create a JSON with only one string in the specified field even if you requested a List in the description. ## now `extracted_info = ResponseSchema(name="extracted_info", type="List[string]", description="List of extracted information") ` generate the following doc: ```json\n{\n\t\"extracted_info\": List[string] // List of extracted information}``` This way the model responds better to the prompt generating an array of strings. Tag maintainers/contributors who might be interested: Agents / Tools / Toolkits @vowelparrot Don't know who can be interested, I suppose this is a tool, so I tagged you vowelparrot, anyway, it's a minor change, and shouldn't impact any other part of the framework.	2023-06-06 22:00:48 -07:00
Harrison Chase	3954bcf396	WIP: openai settings (#5792 ) [] need to test more [] make sure they arent saved when serializing [] do for embeddings	2023-06-06 21:57:58 -07:00
Alex Lee	b7999a9bc1	Add UTF-8 json ouput support while langchain.debug is set to True. (#5802 ) Before: <img width="984" alt="image" src="https://github.com/hwchase17/langchain/assets/4317474/2b0807b4-a1d6-4df2-87cc-92b1c8e10534"> After: <img width="992" alt="image" src="https://github.com/hwchase17/langchain/assets/4317474/128c2c7d-2ed5-4c95-954d-b0964c83526a"> Thanks in advance. @agola11	2023-06-06 21:56:33 -07:00
kourosh hakhamaneshi	a0d847f636	[Docs][Hotfix] Fix broken links (#5800 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Some links were broken from the previous merge. This PR fixes them. Tested locally. #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2023-06-06 17:17:16 -07:00
Zander Chase	217b5cc72d	Base RunEvaluator Chain (#5750 ) Clean up a bit and only implement the QA and reference free implementations from https://github.com/hwchase17/langchain/pull/5618	2023-06-06 16:42:15 -07:00
Lance Martin	4092fd21dc	YoutubeAudioLoader and updates to OpenAIWhisperParser (#5772 ) This introduces the `YoutubeAudioLoader`, which will load blobs from a YouTube url and write them. Blobs are then parsed by `OpenAIWhisperParser()`, as show in this [PR](https://github.com/hwchase17/langchain/pull/5580), but we extend the parser to split audio such that each chuck meets the 25MB OpenAI size limit. As shown in the notebook, this enables a very simple UX: ``` # Transcribe the video to text loader = GenericLoader(YoutubeAudioLoader([url],save_dir),OpenAIWhisperParser()) docs = loader.load() ``` Tested on full set of Karpathy lecture videos: ``` # Karpathy lecture videos urls = ["https://youtu.be/VMj-3S1tku0" "https://youtu.be/PaCmpygFfXo", "https://youtu.be/TCH_1BHY58I", "https://youtu.be/P6sfmUTpUmc", "https://youtu.be/q8SA3rM6ckI", "https://youtu.be/t3YJ5hKiMQ0", "https://youtu.be/kCc8FmEb1nY"] # Directory to save audio files save_dir = "~/Downloads/YouTube" # Transcribe the videos to text loader = GenericLoader(YoutubeAudioLoader(urls,save_dir),OpenAIWhisperParser()) docs = loader.load() ```	2023-06-06 15:15:08 -07:00
Gengliang Wang	2a4b32dee2	Revise DATABRICKS_API_TOKEN as DATABRICKS_TOKEN (#5796 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> In the [Databricks integration](https://python.langchain.com/en/latest/integrations/databricks.html) and [Databricks LLM](https://python.langchain.com/en/latest/modules/models/llms/integrations/databricks.html), we suggestted users to set the ENV variable `DATABRICKS_API_TOKEN`. However, this is inconsistent with the other Databricks library. To make it consistent, this PR changes the variable from `DATABRICKS_API_TOKEN` to `DATABRICKS_TOKEN` After changes, there is no more `DATABRICKS_API_TOKEN` in the doc ``` $ git grep DATABRICKS_API_TOKEN\|wc -l 0 $ git grep DATABRICKS_TOKEN\|wc -l 8 ``` cc @hwchase17 @dev2049 @mengxr since you have reviewed the previous PRs.	2023-06-06 14:22:49 -07:00
Paul-Emile Brotons	daf3e99b96	fixing from_documents method of the MongoDB Atlas vector store (#5794 ) FIxed a bug in from_documents method --> Collection objects do not implement truth value testing or bool(). @dev2049	2023-06-06 14:22:23 -07:00
Ankush Gola	b177a29d3f	support returning run info for llms, chat models and chains (#5666 ) returning the run id is important for accessing the run later on	2023-06-06 10:07:46 -07:00
Yoann Poupart	65111eb2b3	Attribute support for html tags (#5782 ) # What does this PR do? Change the HTML tags so that a tag with attributes can be found. ## Before submitting - [x] Tests added - [x] CI/CD validated ### Who can review? Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.	2023-06-06 09:27:37 -07:00
Zander Chase	0cfaa76e45	Set Falsey (#5783 ) Seems natural to try to disable logging by setting `MY_VAR=false` rather than unsetting (especially once you've already set it in the background)	2023-06-06 09:26:38 -07:00
Harrison Chase	2ae2d6cd1d	fix ver 191 (#5784 )	2023-06-06 09:17:23 -07:00
Zander Chase	204a73c1d9	Use client from LCP-SDK (#5695 ) - Remove the client implementation (this breaks backwards compatibility for existing testers. I could keep the stub in that file if we want, but not many people are using it yet - Add SDK as dependency - Update the 'run_on_dataset' method to be a function that optionally accepts a client as an argument - Remove the langchain plus server implementation (you get it for free with the SDK now) We could make the SDK optional for now, but the plan is to use w/in the tracer so it would likely become a hard dependency at some point.	2023-06-06 06:51:05 -07:00
Harrison Chase	08e2352f7b	bump ver 191 (#5766 )	2023-06-05 20:54:08 -07:00
berkedilekoglu	f907b62526	Scores are explained in vectorestore docs (#5613 ) # Scores in Vectorestores' Docs Are Explained Following vectorestores can return scores with similar documents by using `similarity_search_with_score`: - chroma - docarray_hnsw - docarray_in_memory - faiss - myscale - qdrant - supabase - vectara - weaviate However, in documents, these scores were either not explained at all or explained in a way that could lead to misunderstandings (e.g., FAISS). For instance in FAISS document: if we consider the score returned by the function as a similarity score, we understand that a document returning a higher score is more similar to the source document. However, since the scores returned by the function are distance scores, we should understand that smaller scores correspond to more similar documents. For the libraries other than Vectara, I wrote the scores they use by investigating from the source libraries. Since I couldn't be certain about the score metric used by Vectara, I didn't make any changes in its documentation. The links mentioned in Vectara's documentation became broken due to updates, so I replaced them with working ones. VectorStores / Retrievers / Memory - @dev2049 my twitter: [berkedilekoglu](https://twitter.com/berkedilekoglu) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 20:39:49 -07:00
Adil Ansari	233b52735e	feat: Support for `Tigris` Vector Database for vector search (#5703 ) ### Changes - New vector store integration - [Tigris](https://tigrisdata.com) - Adds [tigrisdb](https://pypi.org/project/tigrisdb/) optional dependency - Example notebook demonstrating usage Fixes #5535 Closes tigrisdata/tigris-client-python#40 #### Twitter handles We'd love a shoutout on our [@TigrisData](https://twitter.com/TigrisData) and [@adilansari](https://twitter.com/adilansari) twitter handles #### Who can review? @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 20:39:16 -07:00
Edrick Da Corte Henriquez	38dabdbb3a	Update tutorials.md (#5761 ) # Added an overview of LangChain modules Aimed at introducing newcomers to LangChain's main modules :) Twitter handle is @edrick_dch ## Who can review? @eyurtsev	2023-06-05 20:37:11 -07:00
Ankush Gola	84a46753ab	Tracing Group (#5326 ) Add context manager to group all runs under a virtual parent --------- Co-authored-by: vowelparrot <130414180+vowelparrot@users.noreply.github.com>	2023-06-05 19:18:43 -07:00
Ilya	d5b1608216	fix markdown text splitter horizontal lines (#5625 ) Fixes #5614 #### Issue The `**` combination produces an exception when used as a seperator in `re.split`. Instead `\\\` should be used for regex exprations. #### Who can review? @eyurtsev	2023-06-05 16:40:26 -07:00
Harrison Chase	25487fa5ee	Harrison/youtube multi language (#5758 ) Co-authored-by: rafly lesmana <raflylesmana111@gmail.com>	2023-06-05 16:38:07 -07:00
Shelby Jenkins	2dcda8a8ac	Strips whitespace and \n from loc before filtering urls from sitemap (#5728 ) Fixes #5699 #### Who can review? Tag maintainers/contributors who might be interested: @woodworker @LeSphax @johannhartmann --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 16:33:55 -07:00
Harrison Chase	98dd6d068a	cohere retries (#5757 ) …719) A minor update to retry Cohore API call in case of errors using tenacity as it is done for OpenAI LLMs. #### Who can review? @hwchase17, @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Sagar Sapkota <22609549+sagar-spkt@users.noreply.github.com>	2023-06-05 16:28:58 -07:00
M Waleed Kadous	5124c1e0d9	Add aviary support (#5661 ) Aviary is an open source toolkit for evaluating and deploying open source LLMs. You can find out more about it on [http://github.com/ray-project/aviary). You can try it out at [http://aviary.anyscale.com](aviary.anyscale.com). This code adds support for Aviary in LangChain. To minimize dependencies, it connects directly to the HTTP endpoint. The current implementation is not accelerated and uses the default implementation of `predict` and `generate`. It includes a test and a simple example. @hwchase17 and @agola11 could you have a look at this? --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-05 16:28:42 -07:00
felpigeon	a47c8618ec	Add class attribute "return_generated_question" to class "BaseConversationalRetrievalChain" (#5749 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Adding a class attribute "return_generated_question" to class "BaseConversationalRetrievalChain". If set to `True`, the chain's output has a key "generated_question" with the question generated by the sub-chain `question_generator` as the value. This way the generated question can be logged. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> @dev2049 @vowelparrot	2023-06-05 16:10:12 -07:00
Leonid Ganeline	87ad4fc4b2	docs: updated `ecosystem/dependents` (#5753 ) updated `ecosystem/dependents` data (it was updated 2+ weeks ago) #### Who can review? @hwchase17 @eyurtsev @dev2049	2023-06-05 16:09:55 -07:00
Leonid Ganeline	92a5f00ffb	docs: `ecosystem/integrations` update 5 (#5752 ) - added missed integration to `docs/ecosystem/integrations/` - updated notebooks to consistent format: changed titles, file names; added descriptions #### Who can review? @hwchase17 @dev2049	2023-06-05 16:08:55 -07:00
Lance Martin	aea090045b	Create OpenAIWhisperParser for generating Documents from audio files (#5580 ) # OpenAIWhisperParser This PR creates a new parser, `OpenAIWhisperParser`, that uses the [OpenAI Whisper model](https://platform.openai.com/docs/guides/speech-to-text/quickstart) to perform transcription of audio files to text (`Documents`). Please see the notebook for usage.	2023-06-05 15:51:13 -07:00
Hao Chen	a4c9053d40	Integrate Clickhouse as Vector Store (#5650 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> #### Description This PR is mainly to integrate open source version of ClickHouse as Vector Store as it is easy for both local development and adoption of LangChain for enterprises who already have large scale clickhouse deployment. ClickHouse is a open source real-time OLAP database with full SQL support and a wide range of functions to assist users in writing analytical queries. Some of these functions and data structures perform distance operations between vectors, [enabling ClickHouse to be used as a vector database](https://clickhouse.com/blog/vector-search-clickhouse-p1). Recently added ClickHouse capabilities like [Approximate Nearest Neighbour (ANN) indices](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/annindexes) support faster approximate matching of vectors and provide a promising development aimed to further enhance the vector matching capabilities of ClickHouse. In LangChain, some ClickHouse based commercial variant vector stores like [Chroma](https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/chroma.py) and [MyScale](https://github.com/hwchase17/langchain/blob/master/langchain/vectorstores/myscale.py), etc are already integrated, but for some enterprises with large scale Clickhouse clusters deployment, it will be more straightforward to upgrade existing clickhouse infra instead of moving to another similar vector store solution, so we believe it's a valid requirement to integrate open source version of ClickHouse as vector store. As `clickhouse-connect` is already included by other integrations, this PR won't include any new dependencies. #### Before submitting <!-- If you're adding a new integration, please include: 1. Added a test for the integration: https://github.com/haoch/langchain/blob/clickhouse/tests/integration_tests/vectorstores/test_clickhouse.py 2. Added an example notebook and document showing its use: * Notebook: https://github.com/haoch/langchain/blob/clickhouse/docs/modules/indexes/vectorstores/examples/clickhouse.ipynb * Doc: https://github.com/haoch/langchain/blob/clickhouse/docs/integrations/clickhouse.md See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> 1. Added a test for the integration: https://github.com/haoch/langchain/blob/clickhouse/tests/integration_tests/vectorstores/test_clickhouse.py 2. Added an example notebook and document showing its use: * Notebook: https://github.com/haoch/langchain/blob/clickhouse/docs/modules/indexes/vectorstores/examples/clickhouse.ipynb * Doc: https://github.com/haoch/langchain/blob/clickhouse/docs/integrations/clickhouse.md #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> @hwchase17 @dev2049 Could you please help review? --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-05 13:32:04 -07:00
Gustavo Brian	2f2d27fd82	Error in documentation: Chroma constructor (#5731 ) Chroma("langchain_store", embeddings.embed_query) must be Chroma("langchain_store", embeddings)	2023-06-05 13:30:58 -07:00
George Geddes	019eb13681	Fix a typo in the documentation for the Slack document loader (#5745 ) Fixes a typo I noticed while reading the docs.	2023-06-05 13:30:24 -07:00
Andrew Grangaard	450eb91fe2	Removes unnecessary backslash escaping for backticks in python (#5751 ) Fixed python deprecation warning: DeprecationWarning: invalid escape sequence '`' backticks (`) do not have special meaning in python strings and should not be escaped. -- @spazm on twitter ### Who can review: @nfcampos ported this change from javascript, @hwchase17 wrote the original STRUCTURED_FORMAT_INSTRUCTIONS,	2023-06-05 13:30:11 -07:00
Daniel Chalef	0551bc90a5	Zep Hybrid Search (#5742 ) Zep now supports persisting custom metadata with messages and hybrid search across both message embeddings and structured metadata. This PR implements custom metadata and enhancements to the `ZepChatMessageHistory` and `ZepRetriever` classes to implement this support. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049 --------- Co-authored-by: Daniel Chalef <daniel.chalef@private.org>	2023-06-05 12:59:28 -07:00
Tomaz Bratanic	a0ea6f6b6b	Cypher search: Check if generated Cypher is provided in backticks (#5541 ) # Check if generated Cypher code is wrapped in backticks Some LLMs like the VertexAI like to explain how they generated the Cypher statement and wrap the actual code in three backticks: ![Screenshot from 2023-06-01 08-08-23](https://github.com/hwchase17/langchain/assets/19948365/1d8eecb3-d26c-4882-8f5b-6a9bc7e93690) I have observed a similar pattern with OpenAI chat models in a conversational settings, where multiple user and assistant message are provided to the LLM to generate Cypher statements, where then the LLM wants to maybe apologize for previous steps or explain its thoughts. Interestingly, both OpenAI and VertexAI wrap the code in three backticks if they are doing any explaining or apologizing. Checking if the generated cypher is wrapped in backticks seems like a low-hanging fruit to expand the cypher search to other LLMs and conversational settings.	2023-06-05 12:48:13 -07:00
Abhijeet Malamkar	1a9ac3b1f9	Adding support to save multiple memories at a time. Cuts save time by … (#5172 ) # Adding support to save multiple memories at a time. Cuts save time by more then half <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 - VectorStores / Retrievers / Memory - @dev2049 --> @dev2049 @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-05 12:47:48 -07:00
kourosh hakhamaneshi	625717daa8	docs: Added Deploying LLMs into production + a new ecosystem (#4047 ) Signed-off-by: Kourosh Hakhamaneshi <kourosh@anyscale.com> Co-authored-by: Kamil Kaczmarek <kaczmarek.poczta@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-05 12:47:27 -07:00
Ralph Schlosser	74f8e603d9	Addresses GPT4All wrapper model_type attribute issues #5720 . (#5743 ) Fixes #5720. A more in-depth discussion is in my comment here: https://github.com/hwchase17/langchain/issues/5720#issuecomment-1577047018 In a nutshell, there has been a subtle change in the latest version of GPT4Alls Python bindings. The change I submitted yesterday is compatible with this version, however, this version is as of yet unreleased and thus the code change breaks Langchain's wrapper under the currently released version of GPT4All. This pull request proposes a backwards-compatible solution.	2023-06-05 12:45:29 -07:00
Harrison Chase	d0d89d39ef	bump version to 190 (#5704 )	2023-06-04 20:04:50 -07:00
mheguy-stingray	b64c39dfe7	top_k and top_p transposed in vertexai (#5673 ) Fix transposed properties in vertexai model Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-04 16:59:53 -07:00
Tobias Herbold	3fb0e4872a	sqlalchemy MovedIn20Warning declarative_base DEPRICATION fix (#5676 ) fix for the sqlalchemy deprecated declarative_base import : ``` MovedIn20Warning: The ``declarative_base()`` function is now available as sqlalchemy.orm.declarative_base(). (deprecated since: 2.0) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9) Base = declarative_base() # type: Any ``` Import is wrapped in an try catch Block to fallback to the old import if needed. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-04 16:52:52 -07:00
Jens Madsen	8d9e9e013c	refactor: extract token text splitter function (#5179 ) # Token text splitter for sentence transformers The current TokenTextSplitter only works with OpenAi models via the `tiktoken` package. This is not clear from the name `TokenTextSplitter`. In this (first PR) a token based text splitter for sentence transformer models is added. In the future I think we should work towards injecting a tokenizer into the TokenTextSplitter to make ti more flexible. Could perhaps be reviewed by @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-04 14:41:44 -07:00
Nathan Azrak	26ec845921	Raise an exception in MKRL and Chat Output Parsers if parsing text which contains both an action and a final answer (#5609 ) Raises exception if OutputParsers receive a response with both a valid action and a final answer Currently, if an OutputParser receives a response which includes both an action and a final answer, they return a FinalAnswer object. This allows the parser to accept responses which propose an action and hallucinate an answer without the action being parsed or taken by the agent. This PR changes the logic to: 1. store a variable checking whether a response contains the `FINAL_ANSWER_ACTION` (this is the easier condition to check). 2. store a variable checking whether the response contains a valid action 3. if both are present, raise a new exception stating that both are present 4. if an action is present, return an AgentAction 5. if an answer is present, return an AgentAnswer 6. if neither is present, raise the relevant exception based around the action format (these have been kept consistent with the prior exception messages) Disclaimer: * Existing mock data included strings which did include an action and an answer. This might indicate that prioritising returning AgentAnswer was always correct, and I am patching out desired behaviour? @hwchase17 to advice. Curious if there are allowed cases where this is not hallucinating, and we do want the LLM to output an action which isn't taken. * I have not passed `send_to_llm` through this new exception Fixes #5601 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 - project lead @vowelparrot	2023-06-04 14:40:49 -07:00
Lucas Rodrigues	c112d7334d	Update MongoDBChatMessageHistory to create an index on SessionId (#5632 ) All the queries to the database are done based on the SessionId property, this will optimize how Mongo retrieves all messages from a session #### Who can review? Tag maintainers/contributors who might be interested: @dev2049	2023-06-04 14:39:56 -07:00
Jason Weill	6c11f94013	Retitles Bedrock doc to appear in correct alphabetical order in site nav (#5639 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes #5638. Retitles "Amazon Bedrock" page to "Bedrock" so that the Integrations section of the left nav is properly sorted in alphabetical order. #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-04 14:39:25 -07:00
Will Smith	6e25e65085	SQL agent : Improved prompt engineering prevents agent guessing database column names. (#5671 ) @vowelparrot: Minor change to the SQL agent: Tells agent to introspect the schema of the most relevant tables, I found this to dramatically decrease the chance that the agent wastes times guessing column names.	2023-06-04 14:39:00 -07:00
Nuhman Pk	8f98592ac9	Added Dependencies Status, Open issues and releases badges in Readme.md (#5681 ) [![Dependency Status](https://img.shields.io/librariesio/github/hwchase17/langchain)](https://libraries.io/github/hwchase17/langchain) [![Open Issues](https://img.shields.io/github/issues-raw/hwchase17/langchain)](https://github.com/hwchase17/langchain/issues) [![Release Notes](https://img.shields.io/github/release/hwchase17/langchain)](https://github.com/hwchase17/langchain/releases)	2023-06-04 14:30:52 -07:00
Harrison Chase	b9040669a0	Harrison/pipeline prompt (#5540 ) idea is to make prompts more composable	2023-06-04 14:29:37 -07:00
George Roberts	647210a4b9	Add args_schema to google_places tool (#5680 ) Tiny change to actually add the args_schema to the tool. @vowelparrot	2023-06-04 14:28:46 -07:00
Ralph Schlosser	8fea0529c1	This fixes issue #5651 - GPT4All wrapper loading issue (#5657 ) Fixes #5651 Small typo in wrapper code. Note the `model_type` parameter is currently unused by GPT4All. https://github.com/hwchase17/langchain/issues/5651 #### Who can review?	2023-06-04 07:21:16 -07:00
Jiayao Yu	6a3ceaa377	Support similarity_score_threshold retrieval with Chroma (#5655 ) Fixes https://github.com/hwchase17/langchain/issues/5067 Verified the following code now works correctly: ``` db = Chroma(persist_directory=index_directory(index_name), embedding_function=embeddings) retriever = db.as_retriever(search_type="similarity_score_threshold", search_kwargs={"score_threshold": 0.4}) docs = retriever.get_relevant_documents(query) ```	2023-06-03 16:57:00 -07:00
Hao Chen	3e45b83065	Improve Error Messaging for APOC Procedure Failure in Neo4jGraph (#5547 ) ## Improve Error Messaging for APOC Procedure Failure in Neo4jGraph This commit revises the error message provided when the 'apoc.meta.data()' procedure fails. Previously, the message simply instructed the user to install the APOC plugin in Neo4j. The new error message is more specific. Also removed an unnecessary newline in the Cypher statement variable: `node_properties_query`. Fixes #5545 ## Who can review? - @vowelparrot - @dev2049	2023-06-03 16:56:39 -07:00
Ricardo Reis	33ea606f45	Update youtube.py - Fix metadata validation error in YoutubeLoader (#5479 ) This commit addresses a ValueError occurring when the YoutubeLoader class tries to add datetime metadata from a YouTube video's publish date. The error was happening because the ChromaDB metadata validation only accepts str, int, or float data types. In the `_get_video_info` method of the `YoutubeLoader` class, the publish date retrieved from the YouTube video was of datetime type. This commit fixes the issue by converting the datetime object to a string before adding it to the metadata dictionary. Additionally, this commit introduces error handling in the `_get_video_info` method to ensure that all metadata fields have valid values. If a metadata field is found to be None, a default value is assigned. This prevents potential errors during metadata validation when metadata fields are None. The file modified in this commit is youtube.py. # Your PR Title (What it does) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-03 16:56:17 -07:00
Shuqian	5af2c51e78	refactor: BaseStringMessagePromptTemplate from_template method (#5332 ) # refactor BaseStringMessagePromptTemplate from_template method Refactor the `from_template` method of the `BaseStringMessagePromptTemplate` class to allow passing keyword arguments to the `from_template` method of `PromptTemplate`. Enable the usage of arguments like `template_format`. In my scenario, I intend to utilize Jinja2 for formatting the human message prompt in the chat template. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Models - @hwchase17 - @agola11 - @jonasalexander --> --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 16:55:58 -07:00
mbchang	d3bdb8ea6d	FileCallbackHandler (#5589 ) # like [StdoutCallbackHandler](https://github.com/hwchase17/langchain/blob/master/langchain/callbacks/stdout.py), but writes to a file When running experiments I have found myself wanting to log the outputs of my chains in a more lightweight way than using WandB tracing. This PR contributes a callback handler that writes to file what `StdoutCallbackHandler` would print. <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> ## Example Notebook <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> See the included `filecallbackhandler.ipynb` notebook for usage. Would it be better to include this notebook under `modules/callbacks` or under `integrations/`? ![image](https://github.com/hwchase17/langchain/assets/6439365/c624de0e-343f-4eab-a55b-8808a887489f) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-03 16:48:48 -07:00
rajib	1c51d3db0f	Created fix for 5475 (#5659 ) Created fix for 5475 Currently in PGvector, we do not have any function that returns the instance of an existing store. The from_documents always adds embeddings and then returns the store. This fix is to add a function that will return the instance of an existing store Also changed the jupyter example for PGVector to show the example of using the function <!-- Remove if not applicable --> Fixes # 5475 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? @dev2049 @hwchase17 Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> --------- Co-authored-by: rajib76 <rajib76@yahoo.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 16:47:52 -07:00
Michael Landis	475007d63a	fix: correct momento chat history notebook typo and title (#5646 ) This PR corrects a minor typo in the Momento chat message history notebook and also expands the title from "Momento" to "Momento Chat History", inline with other chat history storage providers. #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? cc @dev2049 who reviewed the original integration	2023-06-03 16:39:27 -07:00
Paul-Emile Brotons	92f218207b	removing client+namespace in favor of collection (#5610 ) removing client+namespace in favor of collection for an easier instantiation and to be similar to the typescript library @dev2049	2023-06-03 16:27:31 -07:00
Harrison Chase	ad09367a92	Harrison/pubmed integration (#5664 ) Co-authored-by: younis basher <71520361+younis-ba@users.noreply.github.com> Co-authored-by: Younis Bashir <younis@omicmd.com>	2023-06-03 16:25:28 -07:00
Harrison Chase	9921f8cc3a	Harrison/update azure nb (#5665 ) Co-authored-by: NEWTON MALLICK <38786893+N-E-W-T-O-N@users.noreply.github.com>	2023-06-03 16:25:08 -07:00
C.J. Jameson	4e71a1702b	nit: pgvector python example notebook, fix variable reference (#5595 ) # Your PR Title (What it does) Fixes the pgvector python example notebook : one of the variables was not referencing anything ## Before submitting ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	2023-06-03 15:29:34 -07:00
Leonid Ganeline	b201cfaa0f	docs `ecosystem/integrations` update 4 (#5590 ) # docs `ecosystem/integrations` update 4 Added missed integrations. Fixed inconsistencies. ## Who can review? @hwchase17 @dev2049	2023-06-03 15:29:03 -07:00
Davis Chase	ae3611730a	handle single arg to and/or (#5637 ) @ryderwishart @eyurtsev thoughts on handling this in the parser itself? related to #5570	2023-06-03 15:18:46 -07:00
khallbobo	934319fc28	Add parameters to send_message() call for vertexai chat models (PaLM2) (#5566 ) # Ensure parameters are used by vertexai chat models (PaLM2) The current version of the google aiplatform contains a bug where parameters for a chat model are not used as intended. See https://github.com/googleapis/python-aiplatform/issues/2263 Params can be passed both to start_chat() and send_message(); however, the parameters passed to start_chat() will not be used if send_message() is called without the overrides. This is due to the defaults in send_message() being global values rather than None (there is code in send_message() which would use the params from start_chat() if the param passed to send_message() evaluates to False, but that won't happen as the defaults are global values). Fixes # 5531 @hwchase17 @agola11	2023-06-03 15:17:38 -07:00
UmerHA	44ad9628c9	QuickFix for FinalStreamingStdOutCallbackHandler: Ignore new lines & white spaces (#5497 ) # Make FinalStreamingStdOutCallbackHandler more robust by ignoring new lines & white spaces `FinalStreamingStdOutCallbackHandler` doesn't work out of the box with `ChatOpenAI`, as it tokenized slightly differently than `OpenAI`. The response of `OpenAI` contains the tokens `["\nFinal", " Answer", ":"]` while `ChatOpenAI` contains `["Final", " Answer", ":"]`. This PR make `FinalStreamingStdOutCallbackHandler` more robust by ignoring new lines & white spaces when determining if the answer prefix has been reached. Fixes #5433 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: Tracing / Callbacks - @agola11 Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) \| Discord: RicChilligerDude#7589	2023-06-03 15:05:58 -07:00
Nathan Azrak	1f4abb265a	Adds the option to pass the original prompt into the AgentExecutor for PlanAndExecute agents (#5401 ) # Adds the option to pass the original prompt into the AgentExecutor for PlanAndExecute agents This PR allows the user to optionally specify that they wish for the original prompt/objective to be passed into the Executor agent used by the PlanAndExecute agent. This solves a potential problem where the plan is formed referring to some context contained in the original prompt, but which is not included in the current prompt. Currently, the prompt format given to the Executor is: ``` System: Respond to the human as helpfully and accurately as possible. You have access to the following tools: <Tool and Action Description> <Output Format Description> Begin! Reminder to ALWAYS respond with a valid json blob of a single action. Use tools if necessary. Respond directly if appropriate. Format is Action:```$JSON_BLOB```then Observation:. Thought: Human: <Previous steps> <Current step> ``` This PR changes the final part after `Human:` to optionally insert the objective: ``` Human: <objective> <Previous steps> <Current step> ``` I have given a specific example in #5400 where the context of a database path is lost, since the plan refers to the "given path". The PR has been linted and formatted. So that existing behaviour is not changed, I have defaulted the argument to `False` and added it as the last argument in the signature, so it does not cause issues for any users passing args positionally as opposed to using keywords. Happy to take any feedback or make required changes! Fixes #5400 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @vowelparrot --------- Co-authored-by: Nathan Azrak <nathan.azrak@gmail.com>	2023-06-03 14:59:09 -07:00
Felipe Ferreira	ae2cf1f598	Implements support for Personal Access Token Authentication in the ConfluenceLoader (#5385 ) # Implements support for Personal Access Token Authentication in the ConfluenceLoader Fixes #5191 Implements a new optional parameter for the ConfluenceLoader: `token`. This allows the use of personal access authentication when using the on-prem server version of Confluence. ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev @Jflick58 Twitter Handle: felipe_yyc --------- Co-authored-by: Felipe <feferreira@ea.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 14:57:49 -07:00
Gardner Bickford	b81f98b8a6	Update confluence.py to return spaces between elements (#5383 ) # Update confluence.py to return spaces between elements like headers and links. Please see https://stackoverflow.com/questions/48913975/how-to-return-nicely-formatted-text-in-beautifulsoup4-when-html-text-is-across-m Given: ```html <address> 183 Main St<br>East Copper<br>Massachusetts<br>U S A<br> MA 01516-113 </address> ``` The document loader currently returns: ``` '183 Main StEast CopperMassachusettsU S A MA 01516-113' ``` After this change, the document loader will return: ``` 183 Main St East Copper Massachusetts U S A MA 01516-113 ``` @eyurtsev would you prefer this to be an option that can be passed in?	2023-06-03 14:57:25 -07:00
Zeeland	b72401b47b	pref: reduce DB query error rate (#5339 ) # Reduce DB query error rate If you use sql agent of `SQLDatabaseToolkit` to query data, it is prone to errors in query fields and often uses fields that do not exist in database tables for queries. However, the existing prompt does not effectively make the agent aware that there are problems with the fields they query. At this time, we urgently need to improve the prompt so that the agent realizes that they have queried non-existent fields and allows them to use the `schema_sql_db`, that is,` ListSQLDatabaseTool` first queries the corresponding fields in the table in the database, and then uses `QuerySQLDatabaseTool` for querying. There is a demo of my project to show this problem. Original Agent ```python def create_mysql_kit(): db = SQLDatabase.from_uri("mysql+pymysql://xxxxxxx") llm = OpenAI(temperature=0) toolkit = SQLDatabaseToolkit(db=db, llm=llm) agent_executor = create_sql_agent( llm=OpenAI(temperature=0), toolkit=toolkit, verbose=True ) agent_executor.run("Who are the users of sysuser in this system? Tell me the username of all users") if __name__ == '__main__': create_mysql_kit() ``` original output ``` > Entering new AgentExecutor chain... Action: list_tables_sql_db Action Input: "" Observation: app_sysrole_menus, app_bimfacemodel, app_project_users, app_measuringpointdata, auth_user, auth_user_groups, django_apscheduler_djangojobexecution, app_project, app_elementpoint, django_apscheduler_djangojob, django_content_type, app_sysrole, django_admin_log, app_bimfaceaccount, app_measuringpoint_warning_thresholds, app_measuringpoint, app_company, auth_group_permissions, app_sysuser, app_sysuser_companies, app_sysmenu, app_datawarningthreshold, auth_group, auth_permission, app_datawarningrecord, auth_user_user_permissions, app_bimfaceaccount_bimface_models, django_migrations, app_measuringitem, django_session Thought: I should query the app_sysuser table to get the users in the system. Action: query_sql_db Action Input: SELECT username FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT username FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT username FROM app_sysuser LIMIT 10; Observation: SELECT username FROM app_sysuser LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT username FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT username FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT name FROM app_sysuser LIMIT 10; Observation: SELECT name FROM app_sysuser LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT name FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'name' in 'field list'") [SQL: SELECT name FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT id, username FROM app_sysuser LIMIT 10; Observation: SELECT id, username FROM app_sysuser LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT id, username FROM app_sysuser LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'id' in 'field list'") [SQL: SELECT id, username FROM app_sysuser LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT user_id, username FROM app_sysuser LIMIT 10; Observation: SELECT user_id, username FROM app_sysuser WHERE user_id IS NOT NULL AND user_id BETWEEN 0 AND 10 UNION ALL SELECT user_id, username FROM app_sysuser WHERE username IS NOT NULL ORDER BY user_id ASC LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT user_id, username FROM app_sysuser WHERE user_id IS NOT NULL AND user_id BETWEEN 0 AND 10 UNION ALL SELECT user_id, username FROM app_sysuser WHERE username IS NOT NULL ORDER BY user_id ASC LIMIT 10; Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT user_id, username FROM app_sysuser WHERE user_id IS NOT NULL AND user_id BETWEEN 0 AND 10 UNION ALL SELECT user_id, username FROM app_sysuser WHERE username IS NOT NULL ORDER BY user_id ASC LIMIT 10;] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should double check my query before executing it. Action: query_checker_sql_db Action Input: SELECT user_id, user_name FROM app_sysuser LIMIT 10; Observation: SELECT user_id, user_name FROM app_sysuser WHERE user_id NOT IN (NULL) AND user_name BETWEEN 'A' AND 'Z' UNION ALL SELECT user_id, user_name FROM app_sysuser WHERE user_id IS NOT NULL AND user_name LIKE '%a%' LIMIT 10; Thought: The query looks correct, so I should execute it. Action: query_sql_db Action Input: SELECT user_id, user_name FROM app_sysuser WHERE user_id NOT IN (NULL) AND user_name BETWEEN 'A' AND 'Z' UNION ALL SELECT user_id, user_name FROM app_sysuser WHERE user_id IS NOT NULL AND user_name LIKE '%a%' LIMIT 10; Observation: [] Thought: I now know the final answer. Final Answer: There are no users in the sysuser table. > Finished chain. Process finished with exit code 0 ``` As you can see, agent always use incorrect sql to query the data. Actually, the username field name of my `app_sysuser` table is `user_name`, not the `username` believed by the agent. Optimized agent output ``` > Entering new AgentExecutor chain... Action: list_tables_sql_db Action Input: "" Observation: app_project_users, django_content_type, app_measuringpoint_warning_thresholds, app_bimfaceaccount_bimface_models, django_session, app_sysuser, django_migrations, app_bimfacemodel, app_elementpoint, app_measuringpoint, django_apscheduler_djangojobexecution, auth_permission, app_sysuser_companies, app_measuringpointdata, app_measuringitem, app_sysrole_menus, django_admin_log, auth_group, django_apscheduler_djangojob, app_sysmenu, app_project, app_bimfaceaccount, app_datawarningthreshold, app_datawarningrecord, auth_user_groups, auth_user_user_permissions, auth_group_permissions, auth_user, app_company, app_sysrole Thought: I should query the app_sysuser table to get the usernames of all the users. Action: query_sql_db Action Input: SELECT username FROM app_sysuser Observation: Error: (pymysql.err.OperationalError) (1054, "Unknown column 'username' in 'field list'") [SQL: SELECT username FROM app_sysuser] (Background on this error at: https://sqlalche.me/e/20/e3q8) Thought: I should use the schema_sql_db tool to get the correct table fields. Action: schema_sql_db Action Input: app_sysuser Observation: CREATE TABLE app_sysuser ( user_id INTEGER NOT NULL AUTO_INCREMENT, user_account VARCHAR(50) COLLATE utf8mb4_general_ci NOT NULL, user_password VARCHAR(255) COLLATE utf8mb4_general_ci NOT NULL, user_name VARCHAR(50) COLLATE utf8mb4_general_ci NOT NULL, user_avatar VARCHAR(255) COLLATE utf8mb4_general_ci, user_email VARCHAR(254) COLLATE utf8mb4_general_ci, user_phone VARCHAR(20) COLLATE utf8mb4_general_ci, user_create_time DATETIME(6) NOT NULL, user_update_time DATETIME(6) NOT NULL, user_status SMALLINT, role_id_id INTEGER, PRIMARY KEY (user_id), CONSTRAINT app_sysuser_role_id_id_51b50f39_fk_app_sysrole_role_id FOREIGN KEY(role_id_id) REFERENCES app_sysrole (role_id) )COLLATE utf8mb4_general_ci DEFAULT CHARSET=utf8mb4 ENGINE=InnoDB /* 3 rows from app_sysuser table: user_id user_account user_password user_name user_avatar user_email user_phone user_create_time user_update_time user_status role_id_id xxxxxxxxxxxxxx */ Thought: I should query the app_sysuser table to get the usernames of all the users. Action: query_sql_db Action Input: SELECT user_account FROM app_sysuser LIMIT 10 Observation: [('baiyun',), ('eatrice',), ('lisi',), ('pingxiang',), ('wangwu',), ('zeeland',), ('zsj',), ('zzw',)] Thought: I now know the final answer Final Answer: The usernames of the users in the sysuser table are baiyun, eatrice, lisi, pingxiang, wangwu, zeeland, zsj, and zzw. > Finished chain. Process finished with exit code 0 ``` I have tested about 10 related prompts and they all work properly, with a much lower error rate compared to before ## Who can review? @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-03 14:55:56 -07:00
mbchang	ce6dbe41a9	minor refactor GenerativeAgentMemory (#5315 ) # minor refactor of GenerativeAgentMemory <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> - refactor `format_memories_detail` to be more reusable - modified prompts for getting topics for reflection and for generating insights - update `characters.ipynb` to reflect changes ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 --> @vowelparrot @hwchase17 @dev2049	2023-06-03 14:53:14 -07:00
Leonid Ganeline	95c6ed0568	docs: `modules` pages simplified (#5116 ) # docs: modules pages simplified Fixied #5627 issue Merged several repetitive sections in the `modules` pages. Some texts, that were hard to understand, were also simplified. ## Who can review? @hwchase17 @dev2049	2023-06-03 14:44:32 -07:00
Chandan Routray	bc875a9df1	Fixed multi input prompt for MapReduceChain (#4979 ) # Fixed multi input prompt for MapReduceChain Added `kwargs` support for inner chains of `MapReduceChain` via `from_params` method Currently the `from_method` method of intialising `MapReduceChain` chain doesn't work if prompt has multiple inputs. It happens because it uses `StuffDocumentsChain` and `MapReduceDocumentsChain` underneath, both of them require specifying `document_variable_name` if `prompt` of their `llm_chain` has more than one `input`. With this PR, I have added support for passing their respective `kwargs` via the `from_params` method. ## Fixes https://github.com/hwchase17/langchain/issues/4752 ## Who can review? @dev2049 @hwchase17 @agola11 --------- Co-authored-by: imeckr <chandanroutray2012@gmail.com>	2023-06-03 14:41:03 -07:00
Matt Robinson	a97e4252e3	feat: add `UnstructuredExcelLoader` for `.xlsx` and `.xls` files (#5617 ) # Unstructured Excel Loader Adds an `UnstructuredExcelLoader` class for `.xlsx` and `.xls` files. Works with `unstructured>=0.6.7`. A plain text representation of the Excel file will be available under the `page_content` attribute in the doc. If you use the loader in `"elements"` mode, an HTML representation of the Excel file will be available under the `text_as_html` metadata key. Each sheet in the Excel document is its own document. ### Testing ```python from langchain.document_loaders import UnstructuredExcelLoader loader = UnstructuredExcelLoader( "example_data/stanley-cups.xlsx", mode="elements" ) docs = loader.load() ``` ## Who can review? @hwchase17 @eyurtsev	2023-06-03 12:44:12 -07:00
Leonid Ganeline	9a7488a5ce	fix import issue (#5636 ) # fix for the import issue Added document loader classes from [`figma`, `iugu`, `onedrive_file`] to `document_loaders/__inti__.py` imports Also sorted `__all__` Fixed #5623 issue	2023-06-02 14:58:41 -07:00
Zander Chase	20ec1173f4	Update Tracer Auth / Reduce Num Calls (#5517 ) Update the session creation and calls --------- Co-authored-by: Ankush Gola <ankush.gola@gmail.com>	2023-06-02 12:13:56 -07:00
Sean Morgan	949729ff5c	Fix bedrock llm boto3 client instantiation (#5629 ) Same issue as https://github.com/hwchase17/langchain/pull/5574	2023-06-02 12:04:49 -07:00
Caleb Ellington	c5a7a85a4e	fix chroma update_document to embed entire documents, fixes a characer-wise embedding bug (#5584 ) # Chroma update_document full document embeddings bugfix Chroma update_document takes a single document, but treats the page_content sting of that document as a list when getting the new document embedding. This is a two-fold problem, where the resulting embedding for the updated document is incorrect (it's only an embedding of the first character in the new page_content) and it calls the embedding function for every character in the new page_content string, using many tokens in the process. Fixes #5582 Co-authored-by: Caleb Ellington <calebellington@Calebs-MBP.hsd1.ca.comcast.net>	2023-06-02 11:12:48 -07:00
Davis Chase	3c6fa9126a	bump 189 (#5620 )	2023-06-02 09:09:22 -07:00
Davis Chase	d784401215	Dev2049/add argilla callback (#5621 ) Co-authored-by: Alvaro Bartolome <alvarobartt@gmail.com> Co-authored-by: Daniel Vila Suero <daniel@argilla.io> Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com> Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>	2023-06-02 09:05:06 -07:00
Kacper Łukawski	71a7c16ee0	Fix: Qdrant ids (#5515 ) # Fix Qdrant ids creation There has been a bug in how the ids were created in the Qdrant vector store. They were previously calculated based on the texts. However, there are some scenarios in which two documents may have the same piece of text but different metadata, and that's a valid case. Deduplication should be done outside of insertion. It has been fixed and covered with the integration tests. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-02 08:57:34 -07:00
Jeff Vestal	d1f65d8dc1	Es knn index search 5346 (#5569 ) # Create elastic_vector_search.ElasticKnnSearch class This extends `langchain/vectorstores/elastic_vector_search.py` by adding a new class `ElasticKnnSearch` Features: - Allow creating an index with the `dense_vector` mapping compataible with kNN search - Store embeddings in index for use with kNN search (correct mapping creates HNSW data structure) - Perform approximate kNN search - Perform hybrid BM25 (`query{}`) + kNN (`knn{}`) search - perform knn search by either providing a `query_vector` or passing a hosted `model_id` to use query_vector_builder to automatically generate a query_vector at search time Connection options - Using `cloud_id` from Elastic Cloud - Passing elasticsearch client object search options - query - k - query_vector - model_id - size - source - knn_boost (hybrid search) - query_boost (hybrid search) - fields This also adds examples to `docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb` Fixes # [5346](https://github.com/hwchase17/langchain/issues/5346) cc: @dev2049 --> --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-02 08:40:35 -07:00
Davis Chase	8b3df18bcc	human approval callback (#5581 ) ![Screenshot 2023-06-01 at 2 39 40 PM](https://github.com/hwchase17/langchain/assets/130488702/769f1480-7e51-46d9-bcde-698d0b091803)	2023-06-02 06:59:33 -07:00
Zander Chase	6655f43282	Rm Template Title (#5616 ) Remove the redundant title from the PR template #### Before submitting	2023-06-02 06:54:55 -07:00
Bharat Ramanathan	28d6277396	docs(integration): update colab and external links in WandbTracing docs (#5602 ) # Update Wandb Tracking documentation This PR updates the Wandb Tracking documentation for formatting, updated broken links and colab notebook links --------- Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com>	2023-06-02 02:58:42 -07:00
Waldecir Santos	db45970a66	Fix SQLAlchemy truncating text when it is too big (#5206 ) # Fixes SQLAlchemy truncating the result if you have a big/text column with many chars. SQLAlchemy truncates columns if you try to convert a Row or Sequence to a string directly For comparison: - Before: ```[('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ... (2 characters truncated) ... hat is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]``` - After: ```[('Harrison', 'That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio That is my Bio ')]``` ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: I'm not sure who to tag for chains, maybe @vowelparrot ?	2023-06-01 21:33:31 -04:00
Davis Chase	4c572ffe95	nit (#5578 )	2023-06-01 14:21:15 -07:00
sseide	001b147450	Documentation fixes (linting and broken links) (#5563 ) # Lint sphinx documentation and fix broken links This PR lints multiple warnings shown in generation of the project documentation (using "make docs_linkcheck" and "make docs_build"). Additionally documentation internal links to (now?) non-existent files are modified to point to existing documents as it seemed the new correct target. The documentation is not updated content wise. There are no source code changes. Fixes # (issue) - broken documentation links to other files within the project - sphinx formatting (linting) ## Before submitting No source code changes, so no new tests added. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 13:06:17 -07:00
Sean Morgan	8441cff1d7	Fix bedrock auth validation (#5574 ) https://github.com/hwchase17/langchain/pull/5523 has a small bug if client was not passed in constructor	2023-06-01 12:35:06 -07:00
Andrew Lei	6258f72a00	Add missing comma in conv chat agent prompt json (#5573 ) # Add missing comma in conversational chat agent prompt json Inspired by: https://github.com/hwchase17/langchainjs/pull/1498	2023-06-01 12:12:44 -07:00
Ikko Eltociear Ashimine	14a611775c	Fix typo in docugami.ipynb (#5571 ) # Fix typo in docugami.ipynb Fixed typo. infromation -> information	2023-06-01 11:45:56 -07:00
Blithe	80b3fdf2f7	make the elasticsearch api support version which below 8.x (#5495 ) the api which create index or search in the elasticsearch below 8.x is different with 8.x. When use the es which below 8.x , it will throw error. I fix the problem Co-authored-by: gaofeng27692 <gaofeng27692@hundsun.com>	2023-06-01 10:58:20 -07:00
Davis Chase	6632188606	bump 188 (#5568 )	2023-06-01 08:50:54 -07:00
Davis Chase	6afb463e9b	Qdrant self query (#5567 ) Add self query abilities to qdrant vectorstore	2023-06-01 08:40:31 -07:00
Patrick Keane	47c2ec2d0b	Corrects inconsistently misspelled variable name. (#5559 ) Corrects a spelling error (of the word separator) in several variable names. Three cut/paste instances of this were corrected, amidst instances of it also being named properly, which would likely would lead to issues for someone in the future. Here is one such example: ``` seperators = self.get_separators_for_language(Language.PYTHON) super().__init__(separators=seperators, kwargs) ``` becomes ``` separators = self.get_separators_for_language(Language.PYTHON) super().__init__(separators=separators, kwargs) ``` Make test results below: ``` ============================== 708 passed, 52 skipped, 27 warnings in 11.70s ============================== ```	2023-06-01 10:27:58 -04:00
Harrison Chase	342b671d05	add brave search util (#5538 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 01:11:51 -07:00
Davis Chase	983a213bdc	add maxcompute (#5533 ) cc @pengwork (fresh branch, no creds)	2023-06-01 00:54:42 -07:00
Bharat Ramanathan	22603d19e0	feat(integrations): Add WandbTracer (#4521 ) # WandbTracer This PR adds the `WandbTracer` and deprecates the existing `WandbCallbackHandler`. Added an example notebook under the docs section alongside the `LangchainTracer` Here's an example [colab](https://colab.research.google.com/drive/1pY13ym8ENEZ8Fh7nA99ILk2GcdUQu0jR?usp=sharing) with the same notebook and the [trace](https://wandb.ai/parambharat/langchain-tracing/runs/8i45cst6) generated from the colab run Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-06-01 00:01:19 -07:00
Leonid Ganeline	373ad49157	docs `ecosystem/integrations` update 3 (#5470 ) # docs: `ecosystem_integrations` update 3 Next cycle of updating the `ecosystem/integrations` * Added an integration `template` file * Added missed integration files * Fixed several document_loaders/notebooks ## Who can review? Is it possible to assign somebody to review PRs on docs? Thanks.	2023-05-31 17:54:05 -07:00
Aditi Viswanathan	bc66b3fb8d	make BaseEntityStore inherit from BaseModel (#5478 ) # Make BaseEntityStore inherit from BaseModel This enables initializing InMemoryEntityStore by optionally passing in a value for the store field. ## Who can review? It's a small change so I think any of the reviewers can review, but tagging @dev2049 who seems most relevant since the change relates to Memory.	2023-05-31 17:32:19 -07:00
Sheng Han Lim	3bae595182	Add texts with embeddings to PGVector wrapper (#5500 ) Similar to #1813 for faiss, this PR is to extend functionality to pass text and its vector pair to initialize and add embeddings to the PGVector wrapper. Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @dev2049	2023-05-31 17:31:52 -07:00
Tobias van der Werff	8d07ba0d51	Fix wrong class instantiation in docs MMR example (#5501 ) # Fix wrong class instantiation in docs MMR example <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> When looking at the Maximal Marginal Relevance ExampleSelector example at https://python.langchain.com/en/latest/modules/prompts/example_selectors/examples/mmr.html, I noticed that there seems to be an error. Initially, the `MaxMarginalRelevanceExampleSelector` class is used as an `example_selector` argument to the `FewShotPromptTemplate` class. Then, according to the text, a comparison is made to regular similarity search. However, the `FewShotPromptTemplate` still uses the `MaxMarginalRelevanceExampleSelector` class, so the output is the same. To fix it, I added an instantiation of the `SemanticSimilarityExampleSelector` class, because this seems to be what is intended. ## Who can review? @hwchase17	2023-05-31 17:30:59 -07:00
Taras Tsugrii	b61f50665e	[retrievers][knn] Replace loop appends with list comprehension. (#5529 ) # Replace loop appends with list comprehension. It's much faster, more idiomatic and slightly more readable.	2023-05-31 16:57:24 -07:00
Taras Tsugrii	0ad76c3380	Replace loop appends with list comprehension. (#5528 ) # Replace loop appends with list comprehension. It's significantly faster because it avoids repeated method lookup. It's also more idiomatic and readable.	2023-05-31 16:56:13 -07:00
Timothy Ji	bd9e0f3934	Add param requests_kwargs for WebBaseLoader (#5485 ) # Add param `requests_kwargs` for WebBaseLoader Fixes # (issue) #5483 ## Who can review? @eyurtsev	2023-05-31 15:27:38 -07:00
Taras Tsugrii	359fb8fa3a	Replace list comprehension with generator. (#5526 ) # Replace list comprehension with generator. Since these strings can be fairly long, it's best to not construct unnecessary temporary list just to pass it to `join`. Generators produce items one-by-one and even though they are slightly more expensive than lists in terms of CPU they are much more memory-friendly and slightly more readable.	2023-05-31 15:10:43 -07:00
Matt Robinson	4c8aad0d1b	docs: unstructured no longer requires installing detectron2 from source (#5524 ) # Update Unstructured docs to remove the `detectron2` install instructions Removes `detectron2` installation instructions from the Unstructured docs because installing `detectron2` is no longer required for `unstructured>=0.7.0`. The `detectron2` model now runs using the ONNX runtime. ## Who can review? @hwchase17 @eyurtsev	2023-05-31 15:03:21 -07:00
Rithwik Ediga Lakhamsani	d765d77e9b	Add minor fixes for PySpark Document Loader Docs (#5525 ) # Add minor fixes for PySpark Document Loader Docs Renamed "PySpack" to "PySpark" and executed the notebook to show outputs.	2023-05-31 15:02:57 -07:00
Taras Tsugrii	af41cdfc8b	Replace enumerate with zip. (#5527 ) # Replace enumerate with zip. It's more idiomatic and slightly more readable.	2023-05-31 15:02:23 -07:00
James O'Dwyer	226a7521ed	Add Managed Motorhead (#5507 ) # Add Managed Motorhead This change enabled MotorheadMemory to utilize Metal's managed version of Motorhead. We can easily enable this by passing in a `api_key` and `client_id` in order to hit the managed url and access the memory api on Metal. Twitter: [@softboyjimbo](https://twitter.com/softboyjimbo) ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @dev2049 @hwchase17 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 14:55:41 -07:00
Piyush Jain	5ffa924488	Skips creating boto client for Bedrock if passed in constructor (#5523 ) # Skips creating boto client if passed in constructor Current LLM and Embeddings class always creates a new boto client, even if one is passed in a constructor. This blocks certain users from passing in externally created boto clients, for example in SSO authentication. ## Who can review? @hwchase17 @jasondotparse @rsgrewal-aws <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-31 14:54:12 -07:00
Leonid Ganeline	6b47aaab82	added DeepLearing.AI course link (#5518 ) # added DeepLearing.AI course link ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: not @hwchase17 - hehe	2023-05-31 14:53:14 -07:00
Víctor Navarro Aránguiz	f39340ff6b	Add allow_download as attribute for GPT4All (#5512 ) # Added support for download GPT4All model if does not exist I've include the class attribute `allow_download` to the GPT4All class. By default, `allow_download` is set to False. ## Changes Made - Added a new attribute `allow_download` to the GPT4All class. - Updated the `validate_environment` method to pass the `allow_download` parameter to the GPT4All model constructor. ## Context This change provides more control over model downloading in the GPT4All class. Previously, if the model file was not found in the cache directory `~/.cache/gpt4all/`, the package returned error "Failed to retrieve model (type=value_error)". Now, if `allow_download` is set as True then it will use GPT4All package to download it . With the addition of the `allow_download` attribute, users can now choose whether the wrapper is allowed to download the model or not. ## Dependencies There are no new dependencies introduced by this change. It only utilizes existing functionality provided by the GPT4All package. ## Testing Since this is a minor change to the existing behavior, the existing test suite for the GPT4All package should cover this scenario Co-authored-by: Vokturz <victornavarrrokp47@gmail.com>	2023-05-31 13:32:31 -07:00
Zander Chase	ea09c0846f	Add Feedback Methods + Evaluation examples (#5166 ) Add CRUD methods to interact with feedback endpoints + added eval examples to the notebook	2023-05-31 11:14:27 -07:00
Davis Chase	46b7181f13	bump 187 (#5504 )	2023-05-31 07:35:09 -07:00
Harrison Chase	f0ea77b230	add more vars to text splitter (#5503 )	2023-05-31 07:21:20 -07:00
Piyush Jain	562fdfc8f9	Bedrock llm and embeddings (#5464 ) # Bedrock LLM and Embeddings This PR adds a new LLM and an Embeddings class for the [Bedrock](https://aws.amazon.com/bedrock) service. The PR also includes example notebooks for using the LLM class in a conversation chain and embeddings usage in creating an embedding for a query and document. Note: AWS is doing a private release of the Bedrock service on 05/31/2023; users need to request access and added to an allowlist in order to start using the Bedrock models and embeddings. Please use the [Bedrock Home Page](https://aws.amazon.com/bedrock) to request access and to learn more about the models available in Bedrock. <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @vowelparrot VectorStores / Retrievers / Memory - @dev2049 -->	2023-05-31 07:17:01 -07:00
Harrison Chase	5ce74b5958	code splitter docs (#5480 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 07:11:53 -07:00
Harrison Chase	470b2822a3	Add matching engine vectorstore (#3350 ) Co-authored-by: Tom Piaggio <tomaspiaggio@google.com> Co-authored-by: scafati98 <jupyter@matchingengine.us-central1-a.c.scafati-joonix.internal> Co-authored-by: scafati98 <scafatieugenio@gmail.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 02:28:02 -07:00
Kacper Łukawski	8bcaca435a	Feature: Qdrant filters supports (#5446 ) # Support Qdrant filters Qdrant has an [extensive filtering system](https://qdrant.tech/documentation/concepts/filtering/) with rich type support. This PR makes it possible to use the filters in Langchain by passing an additional param to both the `similarity_search_with_score` and `similarity_search` methods. ## Who can review? @dev2049 @hwchase17 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-31 02:26:16 -07:00
Harrison Chase	f72bb966f8	Harrison/html splitter (#5468 ) Co-authored-by: David Revillas <26328973+r3v1@users.noreply.github.com>	2023-05-30 21:06:07 -07:00
Ankush Gola	1671c2afb2	py tracer fixes (#5377 )	2023-05-30 18:47:06 -07:00
Jose Ignacio Hervás Díaz	ce8b7a2a69	SQLite-backed Entity Memory (#5129 ) # SQLite-backed Entity Memory Following the initiative of https://github.com/hwchase17/langchain/pull/2397 I think it would be helpful to be able to persist Entity Memory on disk by default Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 18:39:47 -07:00
Jeff Vestal	46e181aa8b	Allow ElasticsearchEmbeddings to create a connection with ES Client object (#5321 ) This PR adds a new method `from_es_connection` to the `ElasticsearchEmbeddings` class allowing users to use Elasticsearch clusters outside of Elastic Cloud. Users can create an Elasticsearch Client object and pass that to the new function. The returned object is identical to the one returned by calling `from_credentials` ``` # Create Elasticsearch connection es_connection = Elasticsearch( hosts=['https://es_cluster_url:port'], basic_auth=('user', 'password') ) # Instantiate ElasticsearchEmbeddings using es_connection embeddings = ElasticsearchEmbeddings.from_es_connection( model_id, es_connection, ) ``` I also added examples to the elasticsearch jupyter notebook Fixes # https://github.com/hwchase17/langchain/issues/5239 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 17:26:30 -07:00
Mark Pors	0a44bfdca3	Allow for async use of SelfAskWithSearchChain (#5394 ) # Allow for async use of SelfAskWithSearchChain Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 17:02:39 -07:00
Víctor Navarro Aránguiz	8121e04200	added n_threads functionality for gpt4all (#5427 ) # Added support for modifying the number of threads in the GPT4All model I have added the capability to modify the number of threads used by the GPT4All model. This allows users to adjust the model's parallel processing capabilities based on their specific requirements. ## Changes Made - Updated the `validate_environment` method to set the number of threads for the GPT4All model using the `values["n_threads"]` parameter from the `GPT4All` class constructor. ## Context Useful in scenarios where users want to optimize the model's performance by leveraging multi-threading capabilities. Please note that the `n_threads` parameter was included in the `GPT4All` class constructor but was previously unused. This change ensures that the specified number of threads is utilized by the model . ## Dependencies There are no new dependencies introduced by this change. It only utilizes existing functionality provided by the GPT4All package. ## Testing Since this is a minor change testing is not required. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:31:30 -07:00
Blithe	e31705b5ab	convert the parameter 'text' to uppercase in the function 'parse' of the class BooleanOutputParser (#5397 ) when the LLMs output 'yes\|no'，BooleanOutputParser can parse it to 'True\|False', fix the ValueError in parse(). <!-- when use the BooleanOutputParser in the chain_filter.py, the LLMs output 'yes\|no'，the function 'parse' will throw ValueError。 --> Fixes # (issue) #5396 https://github.com/hwchase17/langchain/issues/5396 --------- Co-authored-by: gaofeng27692 <gaofeng27692@hundsun.com>	2023-05-30 16:26:17 -07:00
Natalie	199cc700a3	Ability to specify credentials wihen using Google BigQuery as a data loader (#5466 ) # Adds ability to specify credentials when using Google BigQuery as a data loader Fixes #5465 . Adds ability to set credentials which must be of the `google.auth.credentials.Credentials` type. This argument is optional and will default to `None. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:25:22 -07:00
Harrison Chase	eab4b4ccd7	add simple test for imports (#5461 ) Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:24:27 -07:00
Janos Tolgyesi	1111f18eb4	Add maximal relevance search to SKLearnVectorStore (#5430 ) # Add maximal relevance search to SKLearnVectorStore This PR implements the maximum relevance search in SKLearnVectorStore. Twitter handle: jtolgyesi (I submitted also the original implementation of SKLearnVectorStore) ## Before submitting Unit tests are included. Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 16:13:33 -07:00
Ayan Bandyopadhyay	8181f9e362	Update psychicapi version (#5471 ) Update [psychicapi](https://pypi.org/project/psychicapi/) python package dependency to the latest version 0.5. The newest python package version addresses breaking changes in the Psychic http api.	2023-05-30 15:55:22 -07:00
Kacper Łukawski	f93d256190	Feat: Add batching to Qdrant (#5443 ) # Add batching to Qdrant Several people requested a batching mechanism while uploading data to Qdrant. It is important, as there are some limits for the maximum size of the request payload, and without batching implemented in Langchain, users need to implement it on their own. This PR exposes a new optional `batch_size` parameter, so all the documents/texts are loaded in batches of the expected size (64, by default). The integration tests of Qdrant are extended to cover two cases: 1. Documents are sent in separate batches. 2. All the documents are sent in a single request.	2023-05-30 15:33:54 -07:00
Camille Van Hoffelen	80e133f16d	Added async _acall to FakeListLLM (#5439 ) # Added Async _acall to FakeListLLM FakeListLLM is handy when unit testing apps built with langchain. This allows the use of FakeListLLM inside concurrent code with [asyncio](https://docs.python.org/3/library/asyncio.html). I also changed the pydocstring which was out of date. ## Who can review? @hwchase17 - project lead @agola11 - async	2023-05-30 14:34:36 -07:00
Leonid Ganeline	1f11f80641	docs: cleaning (#5413 ) # docs cleaning Changed docs to consistent format (probably, we need an official doc integration template): - ClearML - added product descriptions; changed title/headers - Rebuff - added product descriptions; changed title/headers - WhyLabs - added product descriptions; changed title/headers - Docugami - changed title/headers/structure - Airbyte - fixed title - Wolfram Alpha - added descriptions, fixed title - OpenWeatherMap - - added product descriptions; changed title/headers - Unstructured - changed description ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 @dev2049	2023-05-30 13:58:16 -07:00
Matt Wells	1d861dc37a	MRKL output parser no longer breaks well formed queries (#5432 ) # Handles the edge scenario in which the action input is a well formed SQL query which ends with a quoted column There may be a cleaner option here (or indeed other edge scenarios) but this seems to robustly determine if the action input is likely to be a well formed SQL query in which we don't want to arbitrarily trim off `"` characters Fixes #5423 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Agents / Tools / Toolkits - @vowelparrot	2023-05-30 15:58:47 -04:00
Yoann Poupart	c1807d8408	`encoding_kwargs` for InstructEmbeddings (#5450 ) # What does this PR do? Bring support of `encode_kwargs` for ` HuggingFaceInstructEmbeddings`, change the docstring example and add a test to illustrate with `normalize_embeddings`. Fixes #3605 (Similar to #3914) Use case: ```python from langchain.embeddings import HuggingFaceInstructEmbeddings model_name = "hkunlp/instructor-large" model_kwargs = {'device': 'cpu'} encode_kwargs = {'normalize_embeddings': True} hf = HuggingFaceInstructEmbeddings( model_name=model_name, model_kwargs=model_kwargs, encode_kwargs=encode_kwargs ) ```	2023-05-30 11:57:04 -07:00
Patrick Keane	e09afb4b44	Removes duplicated call from langchain/client/langchain.py (#5449 ) This removes duplicate code presumably introduced by a cut-and-paste error, spotted while reviewing the code in ```langchain/client/langchain.py```. The original code had back to back occurrences of the following code block: ``` response = self._get( path, params=params, ) raise_for_status_with_text(response) ```	2023-05-30 11:52:46 -07:00
Jan Brinkmann	0d3a9d481f	Fixed docstring in faiss.py for load_local (#5440 ) # Fix for docstring in faiss.py vectorstore (load_local) The doctring should reflect that load_local loads something FROM the disk.	2023-05-30 11:41:00 -07:00
Davis Chase	4379bd4cbb	bump 186 (#5459 )	2023-05-30 10:47:59 -07:00
Davis Chase	2649b638dd	fix (#5457 )	2023-05-30 10:42:20 -07:00
Davis Chase	64b4165c8d	bump 185 (#5442 )	2023-05-30 08:08:11 -07:00
ByronHsu	9d658aaa5a	Add more code splitters (go, rst, js, java, cpp, scala, ruby, php, swift, rust) (#5171 ) As the title says, I added more code splitters. The implementation is trivial, so i don't add separate tests for each splitter. Let me know if any concerns. Fixes # (issue) https://github.com/hwchase17/langchain/issues/5170 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev @hwchase17 --------- Signed-off-by: byhsu <byhsu@linkedin.com> Co-authored-by: byhsu <byhsu@linkedin.com>	2023-05-30 11:04:05 -04:00
Paul-Emile Brotons	a61b7f7e7c	adding MongoDBAtlasVectorSearch (#5338 ) # Add MongoDBAtlasVectorSearch for the python library Fixes #5337 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-30 07:59:01 -07:00
Harrison Chase	c4b502a470	Harrison/condense q llm (#5438 )	2023-05-30 07:15:37 -07:00
Lei Xu	ee57054d05	Rename and fix typo in lancedb (#5425 ) # Fix typo in LanceDB notebook filename	2023-05-30 00:24:17 -07:00
Zander Chase	26ff18575c	Set old LCTracer to default to port 8000 (#5381 ) Issue from: https://discord.com/channels/1038097195422978059/1069478035918688346/1112445980466483222	2023-05-29 22:42:53 -07:00
Harrison Chase	760632b292	Harrison/spark reader (#5405 ) Co-authored-by: Rithwik Ediga Lakhamsani <rithwik.ediga@databricks.com> Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:23:17 -07:00
UmerHA	8259f9b7fa	DocumentLoader for GitHub (#5408 ) # Creates GitHubLoader (#5257) GitHubLoader is a DocumentLoader that loads issues and PRs from GitHub. Fixes #5257 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:11:21 -07:00
German Martin	0b3e0dd1d2	New Trello document loader (#4767 ) # Added New Trello loader class and documentation Simple Loader on top of py-trello wrapper. With a board name you can pull cards and to do some field parameter tweaks on load operation. I included documentation and examples. Included unit test cases using patch and a fixture for py-trello client class. --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 19:47:56 -07:00
Harrison Chase	72f99ff953	Harrison/text splitter (#5417 ) adds support for keeping separators around when using recursive text splitter	2023-05-29 16:56:31 -07:00
小铭	cf5803e44c	Add ToolException that a tool can throw. (#5050 ) # Add ToolException that a tool can throw This is an optional exception that tool throws when execution error occurs. When this exception is thrown, the agent will not stop working,but will handle the exception according to the handle_tool_error variable of the tool,and the processing result will be returned to the agent as observation,and printed in pink on the console.It can be used like this: ```python from langchain.schema import ToolException from langchain import LLMMathChain, SerpAPIWrapper, OpenAI from langchain.agents import AgentType, initialize_agent from langchain.chat_models import ChatOpenAI from langchain.tools import BaseTool, StructuredTool, Tool, tool from langchain.chat_models import ChatOpenAI llm = ChatOpenAI(temperature=0) llm_math_chain = LLMMathChain(llm=llm, verbose=True) class Error_tool: def run(self, s: str): raise ToolException('The current search tool is not available.') def handle_tool_error(error) -> str: return "The following errors occurred during tool execution:"+str(error) search_tool1 = Error_tool() search_tool2 = SerpAPIWrapper() tools = [ Tool.from_function( func=search_tool1.run, name="Search_tool1", description="useful for when you need to answer questions about current events.You should give priority to using it.", handle_tool_error=handle_tool_error, ), Tool.from_function( func=search_tool2.run, name="Search_tool2", description="useful for when you need to answer questions about current events", return_direct=True, ) ] agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True, handle_tool_errors=handle_tool_error) agent.run("Who is Leo DiCaprio's girlfriend? What is her current age raised to the 0.43 power?") ``` ![image](https://github.com/hwchase17/langchain/assets/32786500/51930410-b26e-4f85-a1e1-e6a6fb450ada) ## Who can review? - @vowelparrot --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-29 20:05:58 +00:00
Harrison Chase	cce731c3c2	bump version 184 (#5407 )	2023-05-29 07:53:32 -07:00
Harrison Chase	2da8c48be1	Harrison/datetime parser (#4693 ) Co-authored-by: Jacob Valdez <jacobfv@msn.com> Co-authored-by: Jacob Valdez <jacob.valdez@limboid.ai> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-05-29 07:52:30 -07:00
Leonid Ganeline	1837caa70d	docs: `ecosystem/integrations` update 1 (#5219 ) # docs: ecosystem/integrations update It is the first in a series of `ecosystem/integrations` updates. The ecosystem/integrations list is missing many integrations. I'm adding the missing integrations in a consistent format: 1. description of the integrated system 2. `Installation and Setup` section with 'pip install ...`, Key setup, and other necessary settings 3. Sections like `LLM`, `Text Embedding Models`, `Chat Models`... with links to correspondent examples and imports of the used classes. This PR keeps new docs, that are presented in the `docs/modules/models/text_embedding/examples` but missed in the `ecosystem/integrations`. The next PRs will cover the next example sections. Also updated `integrations.rst`: added the `Dependencies` section with a link to the packages used in LangChain. ## Who can review? @hwchase17 @eyurtsev @dev2049	2023-05-29 07:25:17 -07:00
Leonid Ganeline	a3598193a0	docs: `ecosystem/integrations` update 2 (#5282 ) # docs: ecosystem/integrations update 2 #5219 - part 1 The second part of this update (parts are independent of each other! no overlap): - added diffbot.md - updated confluence.ipynb; added confluence.md - updated college_confidential.md - updated openai.md - added blackboard.md - added bilibili.md - added azure_blob_storage.md - added azlyrics.md - added aws_s3.md ## Who can review? @hwchase17@agola11 @agola11 @vowelparrot @dev2049	2023-05-29 07:19:43 -07:00
Eduard van Valkenburg	ccb6238de1	Implemented appending arbitrary messages (#5293 ) # Implemented appending arbitrary messages to the base chat message history, the in-memory and cosmos ones. <!-- Thank you for contributing to LangChain! Your PR will appear in our next release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> As discussed this is the alternative way instead of #4480, with a add_message method added that takes a BaseMessage as input, so that the user can control what is in the base message like kwargs. <!-- Remove if not applicable --> Fixes # (issue) ## Before submitting <!-- If you're adding a new integration, include an integration test and an example notebook showing its use! --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-29 07:18:59 -07:00
Harrison Chase	d6fb25c439	Harrison/prediction guard update (#5404 ) Co-authored-by: Daniel Whitenack <whitenack.daniel@gmail.com>	2023-05-29 07:14:59 -07:00
Harrison Chase	416c8b1da3	Harrison/deep infra (#5403 ) Co-authored-by: Yessen Kanapin <yessenzhar@gmail.com> Co-authored-by: Yessen Kanapin <yessen@deepinfra.com>	2023-05-29 07:10:50 -07:00
Timothy Ji	100d6655df	Reformat openai proxy setting as code (#5330 ) # Reformat the openai proxy setting as code Only affect the doc for openai Model - @hwchase17 - @agola11	2023-05-29 07:02:47 -07:00
Justin Flick	c09f8e4ddc	Add pagination for Vertex AI embeddings (#5325 ) Fixes #5316 --------- Co-authored-by: Justin Flick <jflick@homesite.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-05-29 06:57:41 -07:00
Harrison Chase	3e16468423	Harrison/llamacpp (#5402 ) Co-authored-by: Gavin S <gavinswanson@gmail.com>	2023-05-29 06:44:58 -07:00
Chandan Routray	642ae83d86	Removed deprecated llm attribute for load_chain (#5343 ) # Removed deprecated llm attribute for load_chain Currently `load_chain` for some chain types expect `llm` attribute to be present but `llm` is deprecated attribute for those chains and might not be persisted during their `chain.save`. Fixes #5224 [(issue)](https://github.com/hwchase17/langchain/issues/5224) ## Who can review? @hwchase17 @dev2049 --------- Co-authored-by: imeckr <chandanroutray2012@gmail.com>	2023-05-29 06:44:47 -07:00
Oleh Kuznetsov	f6615cac41	Update llamacpp demonstration notebook (#5344 ) # Update llamacpp demonstration notebook Add instructions to install with BLAS backend, and update the example of model usage. Fixes #5071. However, it is more like a prevention of similar issues in the future, not a fix, since there was no problem in the framework functionality ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: - @hwchase17 - @agola11	2023-05-29 06:43:26 -07:00
Martin Holecek	44b48d9518	Fix update_document function, add test and documentation. (#5359 ) # Fix for `update_document` Function in Chroma ## Summary This pull request addresses an issue with the `update_document` function in the Chroma class, as described in [#5031](https://github.com/hwchase17/langchain/issues/5031#issuecomment-1562577947). The issue was identified as an `AttributeError` raised when calling `update_document` due to a missing corresponding method in the `Collection` object. This fix refactors the `update_document` method in `Chroma` to correctly interact with the `Collection` object. ## Changes 1. Fixed the `update_document` method in the `Chroma` class to correctly call methods on the `Collection` object. 2. Added the corresponding test `test_chroma_update_document` in `tests/integration_tests/vectorstores/test_chroma.py` to reflect the updated method call. 3. Added an example and explanation of how to use the `update_document` function in the Jupyter notebook tutorial for Chroma. ## Test Plan All existing tests pass after this change. In addition, the `test_chroma_update_document` test case now correctly checks the functionality of `update_document`, ensuring that the function works as expected and updates the content of documents correctly. ## Reviewers @dev2049 This fix will ensure that users are able to use the `update_document` function as expected, without encountering the previous `AttributeError`. This will enhance the usability and reliability of the Chroma class for all users. Thank you for considering this pull request. I look forward to your feedback and suggestions.	2023-05-29 06:39:25 -07:00
Louis Amaudruz	e455ba4ed5	Add async support to routing chains (#5373 ) # Add async support for (LLM) routing chains <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. --> <!-- Remove if not applicable --> Add asynchronous LLM calls support for the routing chains. More specifically: - Add async `aroute` function (i.e. async version of `route`) to the `RouterChain` which calls the routing LLM asynchronously - Implement the async `_acall` for the `LLMRouterChain` - Implement the async `_acall` function for `MultiRouteChain` which first calls asynchronously the routing chain with its new `aroute` function, and then calls asynchronously the relevant destination chain. <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? - @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Async - @agola11 -->	2023-05-29 06:37:26 -07:00
Gael Grosch	8b7721ebbb	fix: Blob.from_data mimetype is lost (#5395 ) # Fix lost mimetype when using Blob.from_data method The mimetype is lost due to a typo in the class attribue name Fixes # - (no issue opened but I can open one if needed) ## Changes * Fixed typo in name * Added unit-tests to validate the output Blob ## Review @eyurtsev	2023-05-29 06:36:50 -07:00
Jacob Lee	f77f27163d	Update PR template with Twitter handle request (#5382 ) # Updates PR template to request Twitter handle for shoutouts! Makes it easier for maintainers to show their appreciation 😄	2023-05-29 06:23:17 -07:00
Zander Chase	14099f1b93	Use Default Factory (#5380 ) We shouldn't be calling a constructor for a default value - should use default_factory instead. This is especially ad in this case since it requires an optional dependency and an API key to be set. Resolves #5361	2023-05-29 06:22:35 -07:00
Harrison Chase	6df90ad9fd	handle json parsing errors (#5371 ) adds tests cases, consolidates a lot of PRs	2023-05-29 06:18:19 -07:00
玄猫	99a1e3f3a3	Fix: Handle empty documents in ContextualCompressionRetriever (Issue #5304 ) (#5306 ) # Fix: Handle empty documents in ContextualCompressionRetriever (Issue #5304) Fixes #5304 Prevent cohere.error.CohereAPIError caused by an empty list of documents by adding a condition to check if the input documents list is empty in the compress_documents method. If the list is empty, return an empty list immediately, avoiding the error and unnecessary processing. @dev2049 --------- Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>	2023-05-28 13:19:34 -07:00
os1ma	1366d070fc	Add path validation to DirectoryLoader (#5327 ) # Add path validation to DirectoryLoader This PR introduces a minor adjustment to the DirectoryLoader by adding validation for the path argument. Previously, if the provided path didn't exist or wasn't a directory, DirectoryLoader would return an empty document list due to the behavior of the `glob` method. This could potentially cause confusion for users, as they might expect a file-loading error instead. So, I've added two validations to the load method of the DirectoryLoader: - Raise a FileNotFoundError if the provided path does not exist - Raise a ValueError if the provided path is not a directory Due to the relatively small scope of these changes, a new issue was not created. ## Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: @eyurtsev	2023-05-28 15:31:23 -04:00
Harrison Chase	ad7f4c0317	bump to 183 (#5372 )	2023-05-28 11:42:58 -07:00
Harrison Chase	b6927970f1	revert bad json (#5370 )	2023-05-28 10:22:02 -07:00
Matt Wells	9a5c9df809	Fixes iter error in FAISS add_embeddings call (#5367 ) # Remove re-use of iter within add_embeddings causing error As reported in https://github.com/hwchase17/langchain/issues/5336 there is an issue currently involving the atempted re-use of an iterator within the FAISS vectorstore adapter Fixes # https://github.com/hwchase17/langchain/issues/5336 ## Who can review? Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested: VectorStores / Retrievers / Memory - @dev2049	2023-05-28 09:59:30 -07:00

3352 changed files with 248589 additions and 123127 deletions

									
										42

.devcontainer/Dockerfile
									
												View File
											
				@@ -1,42 +0,0 @@

				# This is a Dockerfile for Developer Container

				# Use the Python base image

				ARG VARIANT="3.11-bullseye"

				FROM mcr.microsoft.com/vscode/devcontainers/python:0-${VARIANT} AS langchain-dev-base

				USER vscode

				# Define the version of Poetry to install (default is 1.4.2)

				# Define the directory of python virtual environment

				ARG PYTHON_VIRTUALENV_HOME=/home/vscode/langchain-py-env \

				    POETRY_VERSION=1.4.2 

				ENV POETRY_VIRTUALENVS_IN_PROJECT=false \

				    POETRY_NO_INTERACTION=true 

				# Create a Python virtual environment for Poetry and install it

				RUN python3 -m venv ${PYTHON_VIRTUALENV_HOME} && \

				    $PYTHON_VIRTUALENV_HOME/bin/pip install --upgrade pip && \

				    $PYTHON_VIRTUALENV_HOME/bin/pip install poetry==${POETRY_VERSION}

				ENV PATH="$PYTHON_VIRTUALENV_HOME/bin:$PATH" \

				    VIRTUAL_ENV=$PYTHON_VIRTUALENV_HOME

				# Setup for bash

				RUN poetry completions bash >> /home/vscode/.bash_completion && \

				    echo "export PATH=$PYTHON_VIRTUALENV_HOME/bin:$PATH" >> ~/.bashrc

				# Set the working directory for the app

				WORKDIR /workspaces/langchain

				# Use a multi-stage build to install dependencies

				FROM langchain-dev-base AS langchain-dev-dependencies

				ARG PYTHON_VIRTUALENV_HOME

				# Copy only the dependency files for installation

				COPY pyproject.toml poetry.lock poetry.toml ./

				# Install the Poetry dependencies (this layer will be cached as long as the dependencies don't change)

				RUN poetry install --no-interaction --no-ansi --with dev,test,docs

									
										37

.devcontainer/README.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,37 @@

				# Dev container

				This project includes a [dev container](https://containers.dev/), which lets you use a container as a full-featured dev environment.

				You can use the dev container configuration in this folder to build and run the app without needing to install any of its tools locally! You can use it in [GitHub Codespaces](https://github.com/features/codespaces) or the [VS Code Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers).

				## GitHub Codespaces

				[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hwchase17/langchain)

				You may use the button above, or follow these steps to open this repo in a Codespace:

				1. Click the **Code** drop-down menu at the top of https://github.com/hwchase17/langchain.

				1. Click on the **Codespaces** tab.

				1. Click **Create codespace on master** .

				For more info, check out the [GitHub documentation](https://docs.github.com/en/free-pro-team@latest/github/developing-online-with-codespaces/creating-a-codespace#creating-a-codespace).

				## VS Code Dev Containers

				[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)

				If you already have VS Code and Docker installed, you can use the button above to get started. This will cause VS Code to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.

				You can also follow these steps to open this repo in a container using the VS Code Dev Containers extension:

				1. If this is your first time using a development container, please ensure your system meets the pre-reqs (i.e. have Docker installed) in the [getting started steps](https://aka.ms/vscode-remote/containers/getting-started).

				2. Open a locally cloned copy of the code:

				   - Clone this repository to your local filesystem.

				   - Press <kbd>F1</kbd> and select the **Dev Containers: Open Folder in Container...** command.

				   - Select the cloned copy of this folder, wait for the container to start, and try things out!

				You can learn more in the [Dev Containers documentation](https://code.visualstudio.com/docs/devcontainers/containers).

				## Tips and tricks

				* If you are working with the same repository folder in a container and Windows, you'll want consistent line endings (otherwise you may see hundreds of changes in the SCM view). The `.gitattributes` file in the root of this repo will disable line ending conversion and should prevent this. See [tips and tricks](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files) for more info.

				* If you'd like to review the contents of the image used in this dev container, you can check it out in the [devcontainers/images](https://github.com/devcontainers/images/tree/main/src/python) repo.

									
										45

.devcontainer/devcontainer.json
									
												View File
												
				@@ -1,24 +1,26 @@

				// For format details, see https://aka.ms/devcontainer.json. For config options, see the

				// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-dockerfile

				// README at: https://github.com/devcontainers/templates/tree/main/src/docker-existing-docker-compose

				{

					"dockerComposeFile": "./docker-compose.yaml",

					"service": "langchain",

					"workspaceFolder": "/workspaces/langchain",

					// Name for the dev container

					"name": "langchain",

					"customizations": {

						"vscode": {

							"extensions": [   

								"ms-python.python"

							],

							"settings": {

								"python.defaultInterpreterPath": "/home/vscode/langchain-py-env/bin/python3.11"

							}

						}

					},

					// Features to add to the dev container. More info: https://containers.dev/features.

					"features": {},

					// Point to a Docker Compose file

					"dockerComposeFile": "./docker-compose.yaml",

					// Required when using Docker Compose. The name of the service to connect to once running

					"service": "langchain",

					// The optional 'workspaceFolder' property is the path VS Code should open by default when

					// connected. This is typically a file mount in .devcontainer/docker-compose.yml

					"workspaceFolder": "/workspaces/${localWorkspaceFolderBasename}",

					// Prevent the container from shutting down

					"overrideCommand": true

					// Features to add to the dev container. More info: https://containers.dev/features

					// "features": {

					// 	"ghcr.io/devcontainers-contrib/features/poetry:2": {}

					// }

					// Use 'forwardPorts' to make a list of ports inside the container available locally.

					// "forwardPorts": [],

				@@ -26,8 +28,9 @@

					// Uncomment the next line to run commands after the container is created.

					// "postCreateCommand": "cat /etc/os-release",

					// Uncomment to connect as an existing user other than the container default. More info: https://aka.ms/dev-containers-non-root.

					// "remoteUser": "devcontainer"

					"remoteUser": "vscode",

					"overrideCommand": true

					// Configure tool-specific properties.

					// "customizations": {},

					// Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.

					// "remoteUser": "root"

				}

									
										7

.devcontainer/docker-compose.yaml
									
												View File
												
				@@ -2,10 +2,11 @@ version: '3'

				services:

				  langchain:

				    build:

				      dockerfile: .devcontainer/Dockerfile

				      context: ../ 

				      dockerfile: libs/langchain/dev.Dockerfile

				      context: ..

				    volumes:

				      - ../:/workspaces/langchain

				   # Update this to wherever you want VS Code to mount the folder of your project

				      - ..:/workspaces:cached

				    networks:

				      - langchain-network 

				  #   environment:

3

.gitattributes vendored Normal file

View File

@@ -0,0 +1,3 @@
 * text=auto eol=lf
 *.{cmd,[cC][mM][dD]} text eol=crlf
 *.{bat,[bB][aA][tT]} text eol=crlf

									
										79

.github/CONTRIBUTING.md
									
										vendored
									
												View File
												
				@@ -59,6 +59,8 @@ we do not want these to get in the way of getting good code into the codebase.

				## 🚀 Quick Start

				> **Note:** You can run this repository locally (which is described below) or in a [development container](https://containers.dev/) (which is described in the [.devcontainer folder](https://github.com/hwchase17/langchain/tree/master/.devcontainer)).

				This project uses [Poetry](https://python-poetry.org/) as a dependency manager. Check out Poetry's [documentation on how to install it](https://python-poetry.org/docs/#installation) on your system before proceeding.

				❗Note: If you use `Conda` or `Pyenv` as your environment / package manager, avoid dependency conflicts by doing the following first:

				@@ -67,6 +69,14 @@ This project uses [Poetry](https://python-poetry.org/) as a dependency manager.

				3. Tell Poetry to use the virtualenv python environment (`poetry config virtualenvs.prefer-active-python true`)

				4. Continue with the following steps.

				There are two separate projects in this repository:

				- `langchain`: core langchain code, abstractions, and use cases

				- `langchain.experimental`: more experimental code

				Each of these has their OWN development environment. 

				In order to run any of the commands below, please move into their respective directories.

				For example, to contribute to `langchain` run `cd libs/langchain` before getting started with the below.

				To install requirements:

				```bash

				@@ -93,6 +103,14 @@ To run formatting for this project:

				make format

				```

				Additionally, you can run the formatter only on the files that have been modified in your current branch as compared to the master branch using the format_diff command:

				```bash

				make format_diff

				```

				This is especially useful when you have made changes to a subset of the project and want to ensure your changes are properly formatted without affecting the rest of the codebase.

				### Linting

				Linting for this project is done via a combination of [Black](https://black.readthedocs.io/en/stable/), [isort](https://pycqa.github.io/isort/), [flake8](https://flake8.pycqa.org/en/latest/), and [mypy](http://mypy-lang.org/).

				@@ -103,8 +121,42 @@ To run linting for this project:

				make lint

				```

				In addition, you can run the linter only on the files that have been modified in your current branch as compared to the master branch using the lint_diff command:

				```bash

				make lint_diff

				```

				This can be very helpful when you've made changes to only certain parts of the project and want to ensure your changes meet the linting standards without having to check the entire codebase.

				We recognize linting can be annoying - if you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.

				### Spellcheck

				Spellchecking for this project is done via [codespell](https://github.com/codespell-project/codespell).

				Note that `codespell` finds common typos, so could have false-positive (correctly spelled but rarely used) and false-negatives (not finding misspelled) words.

				To check spelling for this project:

				```bash

				make spell_check

				```

				To fix spelling in place:

				```bash

				make spell_fix

				```

				If codespell is incorrectly flagging a word, you can skip spellcheck for that word by adding it to the codespell config in the `pyproject.toml` file.

				```python

				[tool.codespell]

				...

				# Add here:

				ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogyny,unsecure'

				```

				### Coverage

				Code coverage (i.e. the amount of code that is covered by unit tests) helps identify areas of the code that are potentially more or less brittle.

				@@ -204,32 +256,43 @@ When you run `poetry install`, the `langchain` package is installed as editable

				## Documentation

				While the code is split between `langchain` and `langchain.experimental`, the documentation is one holistic thing.

				This covers how to get started contributing to documentation.

				### Contribute Documentation

				Docs are largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code.

				The docs directory contains Documentation and API Reference.

				Documentation is built using [Docusaurus 2](https://docusaurus.io/).

				API Reference are largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code.

				For that reason, we ask that you add good documentation to all classes and methods.

				Similar to linting, we recognize documentation can be annoying. If you do not want to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.

				### Build Documentation Locally

				In the following commands, the prefix `api_` indicates that those are operations for the API Reference.

				Before building the documentation, it is always a good idea to clean the build directory:

				```bash

				make docs_clean

				make api_docs_clean

				```

				Next, you can run the linkchecker to make sure all links are valid:

				```bash

				make docs_linkcheck

				```

				Finally, you can build the documentation as outlined below:

				Next, you can build the documentation as outlined below:

				```bash

				make docs_build

				make api_docs_build

				```

				Finally, you can run the linkchecker to make sure all links are valid:

				```bash

				make docs_linkcheck

				make api_docs_linkcheck

				```

				## 🏭 Release Process

									
										2

.github/ISSUE_TEMPLATE/bug-report.yml
									
										vendored
									
												View File
												
				@@ -46,7 +46,7 @@ body:

				        - @agola11

				        Tools / Toolkits

				        - @vowelparrot

				        - ...

				      placeholder: "@Username ..."

									
										70

.github/PULL_REQUEST_TEMPLATE.md
									
										vendored
									
												View File
												
				@@ -1,56 +1,28 @@

				# Your PR Title (What it does)

				<!-- Thank you for contributing to LangChain!

				<!--

				Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution.

				Replace this comment with:

				  - Description: a description of the change, 

				  - Issue: the issue # it fixes (if applicable),

				  - Dependencies: any dependencies required for this change,

				  - Tag maintainer: for a quicker response, tag the relevant maintainer (see below),

				  - Twitter handle: we announce bigger features on Twitter. If your PR gets announced and you'd like a mention, we'll gladly shout you out!

				Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change.

				Please make sure you're PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally.

				After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost.

				-->

				If you're adding a new integration, please include:

				  1. a test for the integration, preferably unit tests that do not rely on network access,

				  2. an example notebook showing its use.

				<!-- Remove if not applicable -->

				Maintainer responsibilities:

				  - General / Misc / if you don't know who to tag: @baskaryan

				  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev

				  - Models / Prompts: @hwchase17, @baskaryan

				  - Memory: @hwchase17

				  - Agents / Tools / Toolkits: @hinthornw

				  - Tracing / Callbacks: @agola11

				  - Async: @agola11

				Fixes # (issue)

				If no one reviews your PR within a few days, feel free to @-mention the same people again.

				## Before submitting

				<!-- If you're adding a new integration, please include:

				1. a test for the integration - favor unit tests that does not rely on network access.

				2. an example notebook showing its use

				See contribution guidelines for more information on how to write tests, lint

				etc:

				https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

				-->

				## Who can review?

				Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:

				<!-- For a quicker response, figure out the right person to tag with @

				  @hwchase17 - project lead

				  Tracing / Callbacks

				  - @agola11

				  Async

				  - @agola11

				  DataLoaders

				  - @eyurtsev

				  Models

				  - @hwchase17

				  - @agola11

				  Agents / Tools / Toolkits

				  - @vowelparrot

				  VectorStores / Retrievers / Memory

				  - @dev2049

				See contribution guidelines for more information on how to write/run tests, lint, etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

				 -->

									
										2

.github/actions/poetry_setup/action.yml
									
										vendored
									
												View File
												
				@@ -52,11 +52,13 @@ runs:

				    - name: Check Poetry File

				      shell: bash

				      working-directory: ${{ inputs.working-directory }}

				      run: |

				        poetry check

				    - name: Check lock file

				      shell: bash

				      working-directory: ${{ inputs.working-directory }}

				      run: |

				        poetry lock --check

									
										46

.github/workflows/_lint.yml
									
										vendored
									
										Normal file
									
												View File
												
				@@ -0,0 +1,46 @@

				name: lint

				on:

				  workflow_call:

				    inputs:

				      working-directory:

				        required: true

				        type: string

				        description: "From which folder this pipeline executes"

				env:

				  POETRY_VERSION: "1.4.2"

				jobs:

				  build:

				    defaults:

				      run:

				        working-directory: ${{ inputs.working-directory }}

				    runs-on: ubuntu-latest

				    strategy:

				      matrix:

				        python-version:

				          - "3.8"

				          - "3.9"

				          - "3.10"

				          - "3.11"

				    steps:

				      - uses: actions/checkout@v3

				      - name: Install poetry

				        run: |

				          pipx install poetry==$POETRY_VERSION

				      - name: Set up Python ${{ matrix.python-version }}

				        uses: actions/setup-python@v4

				        with:

				          python-version: ${{ matrix.python-version }}

				          cache: poetry

				      - name: Install dependencies

				        run: |

				          poetry install

				      - name: Install langchain editable

				        if: ${{ inputs.working-directory != 'langchain' }}

				        run: |

				          pip install -e ../langchain

				      - name: Analysing the code with our lint

				        run: |

				          make lint

									
										51

.github/workflows/_release.yml
									
										vendored
									
										Normal file
									
												View File
												
				@@ -0,0 +1,51 @@

				name: release

				on:

				  workflow_call:

				    inputs:

				      working-directory:

				        required: true

				        type: string

				        description: "From which folder this pipeline executes"

				env:

				  POETRY_VERSION: "1.4.2"

				jobs:

				  if_release:

				    if: |

				        ${{ github.event.pull_request.merged == true }}

				        && ${{ contains(github.event.pull_request.labels.*.name, 'release') }}

				    runs-on: ubuntu-latest

				    defaults:

				      run:

				        working-directory: ${{ inputs.working-directory }}

				    steps:

				      - uses: actions/checkout@v3

				      - name: Install poetry

				        run: pipx install poetry==$POETRY_VERSION

				      - name: Set up Python 3.10

				        uses: actions/setup-python@v4

				        with:

				          python-version: "3.10"

				          cache: "poetry"

				      - name: Build project for distribution

				        run: poetry build

				      - name: Check Version

				        id: check-version

				        run: |

				          echo version=$(poetry version --short) >> $GITHUB_OUTPUT

				      - name: Create Release

				        uses: ncipollo/release-action@v1

				        with:

				          artifacts: "dist/*"

				          token: ${{ secrets.GITHUB_TOKEN }}

				          draft: false

				          generateReleaseNotes: true

				          tag: v${{ steps.check-version.outputs.version }}

				          commit: master

				      - name: Publish to PyPI

				        env:

				          POETRY_PYPI_TOKEN_PYPI: ${{ secrets.PYPI_API_TOKEN }}

				        run: |

				          poetry publish

									
										61

.github/workflows/_test.yml
									
										vendored
									
										Normal file
									
												View File
												
				@@ -0,0 +1,61 @@

				name: test

				on:

				  workflow_call:

				    inputs:

				      working-directory:

				        required: true

				        type: string

				        description: "From which folder this pipeline executes"

				      test_type:

				        type: string

				        description: "Test types to run"

				        default: '["core", "extended"]'

				env:

				  POETRY_VERSION: "1.4.2"

				jobs:

				  build:

				    defaults:

				      run:

				        working-directory: ${{ inputs.working-directory }}

				    runs-on: ubuntu-latest

				    strategy:

				      matrix:

				        python-version:

				          - "3.8"

				          - "3.9"

				          - "3.10"

				          - "3.11"

				        test_type: ${{ fromJSON(inputs.test_type) }}

				    name: Python ${{ matrix.python-version }} ${{ matrix.test_type }}

				    steps:

				      - uses: actions/checkout@v3

				      - name: Set up Python ${{ matrix.python-version }}

				        uses: "./.github/actions/poetry_setup"

				        with:

				          python-version: ${{ matrix.python-version }}

				          working-directory: ${{ inputs.working-directory }}

				          poetry-version: "1.4.2"

				          cache-key: ${{ matrix.test_type }}

				          install-command: |

				              if [ "${{ matrix.test_type }}" == "core" ]; then

				                echo "Running core tests, installing dependencies with poetry..."

				                poetry install

				              else

				                echo "Running extended tests, installing dependencies with poetry..."

				                poetry install -E extended_testing

				              fi

				      - name: Install langchain editable

				        if: ${{ inputs.working-directory != 'langchain' }}

				        run: |

				          pip install -e ../langchain

				      - name: Run ${{matrix.test_type}} tests

				        run: |

				          if [ "${{ matrix.test_type }}" == "core" ]; then

				            make test

				          else

				            make extended_tests

				          fi

				        shell: bash

									
										22

.github/workflows/codespell.yml
									
										vendored
									
										Normal file
									
												View File
												
				@@ -0,0 +1,22 @@

				---

				name: Codespell

				on:

				  push:

				    branches: [master]

				  pull_request:

				    branches: [master]

				permissions:

				  contents: read

				jobs:

				  codespell:

				    name: Check for spelling errors

				    runs-on: ubuntu-latest

				    steps:

				      - name: Checkout

				        uses: actions/checkout@v3

				      - name: Codespell

				        uses: codespell-project/actions-codespell@v2

									
										27

.github/workflows/langchain_ci.yml
									
										vendored
									
										Normal file
									
												View File
												
				@@ -0,0 +1,27 @@

				---

				name: libs/langchain CI

				on:

				  push:

				    branches: [ master ]

				  pull_request:

				    paths:

				      - '.github/workflows/_lint.yml'

				      - '.github/workflows/_test.yml'

				      - '.github/workflows/langchain_ci.yml'

				      - 'libs/langchain/**'

				  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI

				jobs:

				  lint:

				    uses:

				      ./.github/workflows/_lint.yml

				    with:

				      working-directory: libs/langchain

				    secrets: inherit

				  test:

				    uses:

				      ./.github/workflows/_test.yml

				    with:

				      working-directory: libs/langchain

				    secrets: inherit

									
										29

.github/workflows/langchain_experimental_ci.yml
									
										vendored
									
										Normal file
									
												View File
												
				@@ -0,0 +1,29 @@

				---

				name: libs/langchain-experimental CI

				on:

				  push:

				    branches: [ master ]

				  pull_request:

				    paths:

				      - '.github/workflows/_lint.yml'

				      - '.github/workflows/_test.yml'

				      - '.github/workflows/langchain_experimental_ci.yml'

				      - 'libs/langchain/**'

				      - 'libs/experimental/**'

				  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI

				jobs:

				  lint:

				    uses:

				      ./.github/workflows/_lint.yml

				    with:

				      working-directory: libs/experimental

				    secrets: inherit

				  test:

				    uses:

				      ./.github/workflows/_test.yml

				    with:

				      working-directory: libs/experimental

				      test_type: '["core"]'

				    secrets: inherit

									
										20

.github/workflows/langchain_experimental_release.yml
									
										vendored
									
										Normal file
									
												View File
												
				@@ -0,0 +1,20 @@

				---

				name: libs/langchain-experimental Release

				on:

				  pull_request:

				    types:

				      - closed

				    branches:

				      - master

				    paths:

				      - 'libs/experimental/pyproject.toml'

				  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI

				jobs:

				  release:

				    uses:

				      ./.github/workflows/_release.yml

				    with:

				      working-directory: libs/experimental

				    secrets: inherit

									
										20

.github/workflows/langchain_release.yml
									
										vendored
									
										Normal file
									
												View File
												
				@@ -0,0 +1,20 @@

				---

				name: libs/langchain Release

				on:

				  pull_request:

				    types:

				      - closed

				    branches:

				      - master

				    paths:

				      - 'libs/langchain/pyproject.toml'

				  workflow_dispatch:  # Allows to trigger the workflow manually in GitHub UI

				jobs:

				  release:

				    uses:

				      ./.github/workflows/_release.yml

				    with:

				      working-directory: libs/langchain

				    secrets: inherit

									
										38

.github/workflows/linkcheck.yml
									
										vendored
									
												View File
											
				@@ -1,38 +0,0 @@

				name: linkcheck

				on:

				  push:

				    branches: [master]

				  pull_request:

				    paths:

				      - 'docs/**'

				env:

				  POETRY_VERSION: "1.4.2"

				jobs:

				  build:

				    runs-on: ubuntu-latest

				    strategy:

				      matrix:

				        python-version:

				          - "3.11"

				    steps:

				      - uses: actions/checkout@v3

				      - name: Install poetry

				        run: |

				          pipx install poetry==$POETRY_VERSION

				      - name: Set up Python ${{ matrix.python-version }}

				        uses: actions/setup-python@v4

				        with:

				          python-version: ${{ matrix.python-version }}

				          cache: poetry

				      - name: Install dependencies

				        run: |

				          poetry install --with docs

				      - name: Build the docs

				        run: |

				          make docs_build

				      - name: Analyzing the docs with linkcheck

				        run: |

				          make docs_linkcheck

									
										36

.github/workflows/lint.yml
									
										vendored
									
												View File
											
				@@ -1,36 +0,0 @@

				name: lint

				on:

				  push:

				    branches: [master]

				  pull_request:

				env:

				  POETRY_VERSION: "1.4.2"

				jobs:

				  build:

				    runs-on: ubuntu-latest

				    strategy:

				      matrix:

				        python-version:

				          - "3.8"

				          - "3.9"

				          - "3.10"

				          - "3.11"

				    steps:

				      - uses: actions/checkout@v3

				      - name: Install poetry

				        run: |

				          pipx install poetry==$POETRY_VERSION

				      - name: Set up Python ${{ matrix.python-version }}

				        uses: actions/setup-python@v4

				        with:

				          python-version: ${{ matrix.python-version }}

				          cache: poetry

				      - name: Install dependencies

				        run: |

				          poetry install

				      - name: Analysing the code with our lint

				        run: |

				          make lint

									
										49

.github/workflows/release.yml
									
										vendored
									
												View File
											
				@@ -1,49 +0,0 @@

				name: release

				on:

				  pull_request:

				    types:

				      - closed

				    branches:

				      - master

				    paths:

				      - 'pyproject.toml'

				env:

				  POETRY_VERSION: "1.4.2"

				jobs:

				  if_release:

				    if: |

				        ${{ github.event.pull_request.merged == true }}

				        && ${{ contains(github.event.pull_request.labels.*.name, 'release') }}

				    runs-on: ubuntu-latest

				    steps:

				      - uses: actions/checkout@v3

				      - name: Install poetry

				        run: pipx install poetry==$POETRY_VERSION

				      - name: Set up Python 3.10

				        uses: actions/setup-python@v4

				        with:

				          python-version: "3.10"

				          cache: "poetry"

				      - name: Build project for distribution

				        run: poetry build

				      - name: Check Version

				        id: check-version

				        run: |

				          echo version=$(poetry version --short) >> $GITHUB_OUTPUT

				      - name: Create Release

				        uses: ncipollo/release-action@v1

				        with:

				          artifacts: "dist/*"

				          token: ${{ secrets.GITHUB_TOKEN }}

				          draft: false

				          generateReleaseNotes: true

				          tag: v${{ steps.check-version.outputs.version }}

				          commit: master

				      - name: Publish to PyPI

				        env:

				          POETRY_PYPI_TOKEN_PYPI: ${{ secrets.PYPI_API_TOKEN }}

				        run: |

				          poetry publish

									
										49

.github/workflows/test.yml
									
										vendored
									
												View File
											
				@@ -1,49 +0,0 @@

				name: test

				on:

				  push:

				    branches: [master]

				  pull_request:

				  workflow_dispatch:

				env:

				  POETRY_VERSION: "1.4.2"

				jobs:

				  build:

				    runs-on: ubuntu-latest

				    strategy:

				      matrix:

				        python-version:

				          - "3.8"

				          - "3.9"

				          - "3.10"

				          - "3.11"

				        test_type:

				          - "core"

				          - "extended"

				    name: Python ${{ matrix.python-version }} ${{ matrix.test_type }}

				    steps:

				      - uses: actions/checkout@v3

				      - name: Set up Python ${{ matrix.python-version }}

				        uses: "./.github/actions/poetry_setup"

				        with:

				          python-version: ${{ matrix.python-version }}

				          poetry-version: "1.4.2"

				          cache-key: ${{ matrix.test_type }}

				          install-command: |

				              if [ "${{ matrix.test_type }}" == "core" ]; then

				                echo "Running core tests, installing dependencies with poetry..."

				                poetry install

				              else

				                echo "Running extended tests, installing dependencies with poetry..."

				                poetry install -E extended_testing

				              fi

				      - name: Run ${{matrix.test_type}} tests

				        run: |

				          if [ "${{ matrix.test_type }}" == "core" ]; then

				            make test

				          else

				            make extended_tests

				          fi

				        shell: bash

22

.gitignore vendored

View File

@@ -73,6 +73,7 @@ instance/
 # Sphinx documentation
 docs/_build/
 docs/docs/_build/
 # PyBuilder
 target/
@@ -149,4 +150,23 @@ wandb/
 # integration test artifacts
 data_map*
 \[('_type', 'fake'), ('stop', None)]
 \[('_type', 'fake'), ('stop', None)]
 # Replit files
 *replit*
 node_modules
 docs/.yarn/
 docs/node_modules/
 docs/.docusaurus/
 docs/.cache-loader/
 docs/_dist
 docs/api_reference/api_reference.rst
 docs/api_reference/_build
 docs/api_reference/*/
 !docs/api_reference/_static/
 !docs/api_reference/templates/
 !docs/api_reference/themes/
 docs/docs_skeleton/build
 docs/docs_skeleton/node_modules
 docs/docs_skeleton/yarn.lock

4

.gitmodules vendored Normal file

View File

@@ -0,0 +1,4 @@
 [submodule "docs/_docs_skeleton"]
 	path = docs/_docs_skeleton
 	url = https://github.com/langchain-ai/langchain-shared-docs
 	branch = main

									
										9

.readthedocs.yaml
									
												View File
												
				@@ -9,10 +9,13 @@ build:

				  os: ubuntu-22.04

				  tools:

				    python: "3.11"

				  jobs:

				    pre_build:

				      - python docs/api_reference/create_api_rst.py

				# Build documentation in the docs/ directory with Sphinx

				sphinx:

				   configuration: docs/conf.py

				   configuration: docs/api_reference/conf.py

				# If using Sphinx, optionally build your docs in additional formats such as PDF

				# formats:

				@@ -21,6 +24,6 @@ sphinx:

				# Optionally declare the Python requirements required to build your docs

				python:

				   install:

				   - requirements: docs/requirements.txt

				   - requirements: docs/api_reference/requirements.txt

				   - method: pip

				     path: .

				     path: .

									
										57

MIGRATE.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,57 @@

				# Migrating to `langchain_experimental`

				We are moving any experimental components of LangChain, or components with vulnerability issues, into `langchain_experimental`.

				This guide covers how to migrate.

				## Installation

				Previously:

				`pip install -U langchain`

				Now (only if you want to access things in experimental):

				`pip install -U langchain langchain_experimental`

				## Things in `langchain.experimental`

				Previously:

				`from langchain.experimental import ...`

				Now:

				`from langchain_experimental import ...`

				## PALChain

				Previously:

				`from langchain.chains import PALChain`

				Now:

				`from langchain_experimental.pal_chain import PALChain`

				## SQLDatabaseChain

				Previously:

				`from langchain.chains import SQLDatabaseChain`

				Now:

				`from langchain_experimental.sql import SQLDatabaseChain`

				## `load_prompt` for Python files

				Note: this only applies if you want to load Python files as prompts.

				If you want to load json/yaml files, no change is needed.

				Previously:

				`from langchain.prompts import load_prompt`

				Now:

				`from langchain_experimental.prompts import load_prompt`

									
										73

Makefile
									
												View File
												
				@@ -1,57 +1,45 @@

				.PHONY: all clean format lint test tests test_watch integration_tests docker_tests help extended_tests

				.PHONY: all clean docs_build docs_clean docs_linkcheck api_docs_build api_docs_clean api_docs_linkcheck

				# Default target executed when no arguments are given to make.

				all: help

				coverage:

					poetry run pytest --cov \

						--cov-config=.coveragerc \

						--cov-report xml \

						--cov-report term-missing:skip-covered

				clean: docs_clean

				######################

				# DOCUMENTATION

				######################

				clean: docs_clean api_docs_clean

				docs_build:

					cd docs && poetry run make html

					docs/.local_build.sh

				docs_clean:

					cd docs && poetry run make clean

					rm -r docs/_dist

				docs_linkcheck:

					poetry run linkchecker docs/_build/html/index.html

					poetry run linkchecker docs/_dist/docs_skeleton/ --ignore-url node_modules

				format:

					poetry run black .

					poetry run ruff --select I --fix .

				api_docs_build:

					poetry run python docs/api_reference/create_api_rst.py

					cd docs/api_reference && poetry run make html

				PYTHON_FILES=.

				lint: PYTHON_FILES=.

				lint_diff: PYTHON_FILES=$(shell git diff --name-only --diff-filter=d master | grep -E '\.py$$')

				api_docs_clean:

					rm -f docs/api_reference/api_reference.rst

					cd docs/api_reference && poetry run make clean

				lint lint_diff:

					poetry run mypy $(PYTHON_FILES)

					poetry run black $(PYTHON_FILES) --check

					poetry run ruff .

				api_docs_linkcheck:

					poetry run linkchecker docs/api_reference/_build/html/index.html

				TEST_FILE ?= tests/unit_tests/

				spell_check:

					poetry run codespell --toml pyproject.toml

				test:

					poetry run pytest --disable-socket --allow-unix-socket $(TEST_FILE)

				spell_fix:

					poetry run codespell --toml pyproject.toml -w

				tests: 

					poetry run pytest --disable-socket --allow-unix-socket $(TEST_FILE)

				extended_tests:

					poetry run pytest --disable-socket --allow-unix-socket --only-extended tests/unit_tests

				test_watch:

					poetry run ptw --now . -- tests/unit_tests

				integration_tests:

					poetry run pytest tests/integration_tests

				docker_tests:

					docker build -t my-langchain-image:test .

					docker run --rm my-langchain-image:test

				######################

				# HELP

				######################

				help:

					@echo '----'

				@@ -59,12 +47,3 @@ help:

					@echo 'docs_build                   - build the documentation'

					@echo 'docs_clean                   - clean the documentation build artifacts'

					@echo 'docs_linkcheck               - run linkchecker on the documentation'

					@echo 'format                       - run code formatters'

					@echo 'lint                         - run linters'

					@echo 'test                         - run unit tests'

					@echo 'tests                        - run unit tests'

					@echo 'test TEST_FILE=<test_file>   - run all tests in file'

					@echo 'extended_tests               - run only extended unit tests'

					@echo 'test_watch                   - run unit tests in watch mode'

					@echo 'integration_tests            - run integration tests'

					@echo 'docker_tests                 - run unit tests in docker'

									
										28

README.md
									
												View File
												
				@@ -2,9 +2,9 @@

				⚡ Building applications with LLMs through composability ⚡

				[![lint](https://github.com/hwchase17/langchain/actions/workflows/lint.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/lint.yml)

				[![test](https://github.com/hwchase17/langchain/actions/workflows/test.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/test.yml)

				[![linkcheck](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/linkcheck.yml)

				[![Release Notes](https://img.shields.io/github/release/hwchase17/langchain)](https://github.com/hwchase17/langchain/releases)

				[![CI](https://github.com/hwchase17/langchain/actions/workflows/langchain_ci.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/langchain_ci.yml)

				[![Experimental CI](https://github.com/hwchase17/langchain/actions/workflows/langchain_experimental_ci.yml/badge.svg)](https://github.com/hwchase17/langchain/actions/workflows/langchain_experimental_ci.yml)

				[![Downloads](https://static.pepy.tech/badge/langchain/month)](https://pepy.tech/project/langchain)

				[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

				[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)

				@@ -12,6 +12,8 @@

				[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/hwchase17/langchain)

				[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/hwchase17/langchain)

				[![GitHub star chart](https://img.shields.io/github/stars/hwchase17/langchain?style=social)](https://star-history.com/#hwchase17/langchain)

				[![Dependency Status](https://img.shields.io/librariesio/github/hwchase17/langchain)](https://libraries.io/github/hwchase17/langchain)

				[![Open Issues](https://img.shields.io/github/issues-raw/hwchase17/langchain)](https://github.com/hwchase17/langchain/issues)

				Looking for the JS/TS version? Check out [LangChain.js](https://github.com/hwchase17/langchainjs).

				@@ -19,11 +21,19 @@ Looking for the JS/TS version? Check out [LangChain.js](https://github.com/hwcha

				**Production Support:** As you move your LangChains into production, we'd love to offer more comprehensive support.

				Please fill out [this form](https://forms.gle/57d8AmXBYp8PP8tZA) and we'll set up a dedicated support Slack channel.

				## 🚨Breaking Changes for select chains (SQLDatabase) on 7/28

				In an effort to make `langchain` leaner and safer, we are moving select chains to `langchain_experimental`.

				This migration has already started, but we are remaining backwards compatible until 7/28.

				On that date, we will remove functionality from `langchain`.

				Read more about the motivation and the progress [here](https://github.com/hwchase17/langchain/discussions/8043).

				Read how to migrate your code [here](MIGRATE.md).

				## Quick Install

				`pip install langchain`

				or

				`conda install langchain -c conda-forge`

				`pip install langsmith && conda install langchain -c conda-forge`

				## 🤔 What is this?

				@@ -33,22 +43,22 @@ This library aims to assist in the development of those types of applications. C

				**❓ Question Answering over specific documents**

				- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/question_answering.html)

				- [Documentation](https://python.langchain.com/docs/use_cases/question_answering/)

				- End-to-end Example: [Question Answering over Notion Database](https://github.com/hwchase17/notion-qa)

				**💬 Chatbots**

				- [Documentation](https://langchain.readthedocs.io/en/latest/use_cases/chatbots.html)

				- [Documentation](https://python.langchain.com/docs/use_cases/chatbots/)

				- End-to-end Example: [Chat-LangChain](https://github.com/hwchase17/chat-langchain)

				**🤖 Agents**

				- [Documentation](https://langchain.readthedocs.io/en/latest/modules/agents.html)

				- [Documentation](https://python.langchain.com/docs/modules/agents/)

				- End-to-end Example: [GPT+WolframAlpha](https://huggingface.co/spaces/JavaFXpert/Chat-GPT-LangChain)

				## 📖 Documentation

				Please see [here](https://langchain.readthedocs.io/en/latest/?) for full documentation on:

				Please see [here](https://python.langchain.com) for full documentation on:

				- Getting started (installation, setting up the environment, simple examples)

				- How-To examples (demos, integrations, helper functions)

				@@ -84,7 +94,7 @@ Memory refers to persisting state between calls of a chain/agent. LangChain prov

				[BETA] Generative models are notoriously hard to evaluate with traditional metrics. One new way of evaluating them is using language models themselves to do the evaluation. LangChain provides some prompts/chains for assisting in this.

				For more information on these concepts, please see our [full documentation](https://langchain.readthedocs.io/en/latest/).

				For more information on these concepts, please see our [full documentation](https://python.langchain.com).

				## 💁 Contributing

									
										17

docs/.local_build.sh
									
										Executable file
									
												View File
												
				@@ -0,0 +1,17 @@

				#!/usr/bin/env bash

				set -o errexit

				set -o nounset

				set -o pipefail

				set -o xtrace

				SCRIPT_DIR="$(cd "$(dirname "$0")"; pwd)"

				cd "${SCRIPT_DIR}"

				mkdir -p _dist/docs_skeleton

				cp -r {docs_skeleton,snippets} _dist

				cp -r extras/* _dist/docs_skeleton/docs

				cd _dist/docs_skeleton

				poetry run nbdoc_build

				yarn install

				yarn start

									
										57

docs/additional_resources/tracing.md
									
												View File
											
				@@ -1,57 +0,0 @@

				# Tracing

				By enabling tracing in your LangChain runs, you’ll be able to more effectively visualize, step through, and debug your chains and agents.

				First, you should install tracing and set up your environment properly.

				You can use either a locally hosted version of this (uses Docker) or a cloud hosted version (in closed alpha).

				If you're interested in using the hosted platform, please fill out the form [here](https://forms.gle/tRCEMSeopZf6TE3b6).

				- [Locally Hosted Setup](../tracing/local_installation.md)

				- [Cloud Hosted Setup](../tracing/hosted_installation.md)

				## Tracing Walkthrough

				When you first access the UI, you should see a page with your tracing sessions.

				An initial one "default" should already be created for you.

				A session is just a way to group traces together.

				If you click on a session, it will take you to a page with no recorded traces that says "No Runs."

				You can create a new session with the new session form.

				![](../tracing/homepage.png)

				If we click on the `default` session, we can see that to start we have no traces stored.

				![](../tracing/default_empty.png)

				If we now start running chains and agents with tracing enabled, we will see data show up here.

				To do so, we can run [this notebook](../tracing/agent_with_tracing.ipynb) as an example.

				After running it, we will see an initial trace show up.

				![](../tracing/first_trace.png)

				From here we can explore the trace at a high level by clicking on the arrow to show nested runs.

				We can keep on clicking further and further down to explore deeper and deeper.

				![](../tracing/explore.png)

				We can also click on the "Explore" button of the top level run to dive even deeper.

				Here, we can see the inputs and outputs in full, as well as all the nested traces.

				![](../tracing/explore_trace.png)

				We can keep on exploring each of these nested traces in more detail.

				For example, here is the lowest level trace with the exact inputs/outputs to the LLM.

				![](../tracing/explore_llm.png)

				## Changing Sessions

				1. To initially record traces to a session other than `"default"`, you can set the `LANGCHAIN_SESSION` environment variable to the name of the session you want to record to:

				```python

				import os

				os.environ["LANGCHAIN_TRACING"] = "true"

				os.environ["LANGCHAIN_SESSION"] = "my_session" # Make sure this session actually exists. You can create a new session in the UI.

				```

				2. To switch sessions mid-script or mid-notebook, do NOT set the `LANGCHAIN_SESSION` environment variable. Instead: `langchain.set_tracing_callback_manager(session_name="my_session")`

									
										90

docs/additional_resources/youtube.md
									
												View File
											
				@@ -1,90 +0,0 @@

				# YouTube

				This is a collection of `LangChain` videos on `YouTube`.

				### ⛓️[Official LangChain YouTube channel](https://www.youtube.com/@LangChain)⛓️

				### Introduction to LangChain with Harrison Chase, creator of LangChain

				- [Building the Future with LLMs, `LangChain`, & `Pinecone`](https://youtu.be/nMniwlGyX-c) by [Pinecone](https://www.youtube.com/@pinecone-io)

				- [LangChain and Weaviate with Harrison Chase and Bob van Luijt - Weaviate Podcast #36](https://youtu.be/lhby7Ql7hbk) by [Weaviate • Vector Database](https://www.youtube.com/@Weaviate)

				- [LangChain Demo + Q&A with Harrison Chase](https://youtu.be/zaYTXQFR0_s?t=788) by [Full Stack Deep Learning](https://www.youtube.com/@FullStackDeepLearning)

				- [LangChain Agents: Build Personal Assistants For Your Data (Q&A with Harrison Chase and Mayo Oshin)](https://youtu.be/gVkF8cwfBLI) by [Chat with data](https://www.youtube.com/@chatwithdata)

				- ⛓️ [LangChain "Agents in Production" Webinar](https://youtu.be/k8GNCCs16F4) by [LangChain](https://www.youtube.com/@LangChain)

				## Videos (sorted by views)

				- [Building AI LLM Apps with LangChain (and more?) - LIVE STREAM](https://www.youtube.com/live/M-2Cj_2fzWI?feature=share) by [Nicholas Renotte](https://www.youtube.com/@NicholasRenotte)

				- [First look - `ChatGPT` + `WolframAlpha` (`GPT-3.5` and Wolfram|Alpha via LangChain by James Weaver)](https://youtu.be/wYGbY811oMo) by [Dr Alan D. Thompson](https://www.youtube.com/@DrAlanDThompson) 

				- [LangChain explained - The hottest new Python framework](https://youtu.be/RoR4XJw8wIc) by [AssemblyAI](https://www.youtube.com/@AssemblyAI)

				- [Chatbot with INFINITE MEMORY using `OpenAI` & `Pinecone` - `GPT-3`, `Embeddings`, `ADA`, `Vector DB`, `Semantic`](https://youtu.be/2xNzB7xq8nk) by [David Shapiro ~ AI](https://www.youtube.com/@DavidShapiroAutomator)

				- [LangChain for LLMs is... basically just an Ansible playbook](https://youtu.be/X51N9C-OhlE) by [David Shapiro ~ AI](https://www.youtube.com/@DavidShapiroAutomator)

				- [Build your own LLM Apps with LangChain & `GPT-Index`](https://youtu.be/-75p09zFUJY) by [1littlecoder](https://www.youtube.com/@1littlecoder)

				- [`BabyAGI` - New System of Autonomous AI Agents with LangChain](https://youtu.be/lg3kJvf1kXo) by [1littlecoder](https://www.youtube.com/@1littlecoder)

				- [Run `BabyAGI` with Langchain Agents (with Python Code)](https://youtu.be/WosPGHPObx8) by [1littlecoder](https://www.youtube.com/@1littlecoder)

				- [How to Use Langchain With `Zapier` | Write and Send Email with GPT-3 | OpenAI API Tutorial](https://youtu.be/p9v2-xEa9A0) by [StarMorph AI](https://www.youtube.com/@starmorph)

				- [Use Your Locally Stored Files To Get Response From GPT - `OpenAI` | Langchain | Python](https://youtu.be/NC1Ni9KS-rk) by [Shweta Lodha](https://www.youtube.com/@shweta-lodha)

				- [`Langchain JS` | How to Use GPT-3, GPT-4 to Reference your own Data | `OpenAI Embeddings` Intro](https://youtu.be/veV2I-NEjaM) by [StarMorph AI](https://www.youtube.com/@starmorph)

				- [The easiest way to work with large language models | Learn LangChain in 10min](https://youtu.be/kmbS6FDQh7c) by [Sophia Yang](https://www.youtube.com/@SophiaYangDS)

				- [4 Autonomous AI Agents: “Westworld” simulation `BabyAGI`, `AutoGPT`, `Camel`, `LangChain`](https://youtu.be/yWbnH6inT_U) by [Sophia Yang](https://www.youtube.com/@SophiaYangDS)

				- [AI CAN SEARCH THE INTERNET? Langchain Agents + OpenAI ChatGPT](https://youtu.be/J-GL0htqda8) by [tylerwhatsgood](https://www.youtube.com/@tylerwhatsgood)

				- [Query Your Data with GPT-4 | Embeddings, Vector Databases | Langchain JS Knowledgebase](https://youtu.be/jRnUPUTkZmU) by [StarMorph AI](https://www.youtube.com/@starmorph)

				- [`Weaviate` + LangChain for LLM apps presented by Erika Cardenas](https://youtu.be/7AGj4Td5Lgw) by [`Weaviate` • Vector Database](https://www.youtube.com/@Weaviate)

				- [Langchain Overview — How to Use Langchain & `ChatGPT`](https://youtu.be/oYVYIq0lOtI) by [Python In Office](https://www.youtube.com/@pythoninoffice6568)

				- [Langchain Overview - How to Use Langchain & `ChatGPT`](https://youtu.be/oYVYIq0lOtI) by [Python In Office](https://www.youtube.com/@pythoninoffice6568)

				- [Custom langchain Agent & Tools with memory. Turn any `Python function` into langchain tool with Gpt 3](https://youtu.be/NIG8lXk0ULg) by [echohive](https://www.youtube.com/@echohive)

				- [LangChain: Run Language Models Locally - `Hugging Face Models`](https://youtu.be/Xxxuw4_iCzw) by [Prompt Engineering](https://www.youtube.com/@engineerprompt)

				- [`ChatGPT` with any `YouTube` video using langchain and `chromadb`](https://youtu.be/TQZfB2bzVwU) by [echohive](https://www.youtube.com/@echohive)

				- [How to Talk to a `PDF` using LangChain and `ChatGPT`](https://youtu.be/v2i1YDtrIwk) by [Automata Learning Lab](https://www.youtube.com/@automatalearninglab)

				- [Langchain Document Loaders Part 1: Unstructured Files](https://youtu.be/O5C0wfsen98) by [Merk](https://www.youtube.com/@merksworld) 

				- [LangChain - Prompt Templates (what all the best prompt engineers use)](https://youtu.be/1aRu8b0XNOQ) by [Nick Daigler](https://www.youtube.com/@nick_daigs)

				- [LangChain. Crear aplicaciones Python impulsadas por GPT](https://youtu.be/DkW_rDndts8) by [Jesús Conde](https://www.youtube.com/@0utKast)

				- [Easiest Way to Use GPT In Your Products | LangChain Basics Tutorial](https://youtu.be/fLy0VenZyGc) by [Rachel Woods](https://www.youtube.com/@therachelwoods)

				- [`BabyAGI` + `GPT-4` Langchain Agent with Internet Access](https://youtu.be/wx1z_hs5P6E) by [tylerwhatsgood](https://www.youtube.com/@tylerwhatsgood)

				- [Learning LLM Agents. How does it actually work? LangChain, AutoGPT & OpenAI](https://youtu.be/mb_YAABSplk) by [Arnoldas Kemeklis](https://www.youtube.com/@processusAI)

				- [Get Started with LangChain in `Node.js`](https://youtu.be/Wxx1KUWJFv4) by [Developers Digest](https://www.youtube.com/@DevelopersDigest)

				- [LangChain + `OpenAI` tutorial: Building a Q&A system w/ own text data](https://youtu.be/DYOU_Z0hAwo) by [Samuel Chan](https://www.youtube.com/@SamuelChan)

				- [Langchain + `Zapier` Agent](https://youtu.be/yribLAb-pxA) by [Merk](https://www.youtube.com/@merksworld)

				- [Connecting the Internet with `ChatGPT` (LLMs) using Langchain And Answers Your Questions](https://youtu.be/9Y0TBC63yZg) by [Kamalraj M M](https://www.youtube.com/@insightbuilder)

				- [Build More Powerful LLM Applications for Business’s with LangChain (Beginners Guide)](https://youtu.be/sp3-WLKEcBg) by[ No Code Blackbox](https://www.youtube.com/@nocodeblackbox)

				- ⛓️ [LangFlow LLM Agent Demo for 🦜🔗LangChain](https://youtu.be/zJxDHaWt-6o) by [Cobus Greyling](https://www.youtube.com/@CobusGreylingZA)

				- ⛓️ [Chatbot Factory: Streamline Python Chatbot Creation with LLMs and Langchain](https://youtu.be/eYer3uzrcuM) by [Finxter](https://www.youtube.com/@CobusGreylingZA)

				- ⛓️ [LangChain Tutorial - ChatGPT mit eigenen Daten](https://youtu.be/0XDLyY90E2c) by [Coding Crashkurse](https://www.youtube.com/@codingcrashkurse6429)

				- ⛓️ [Chat with a `CSV` | LangChain Agents Tutorial (Beginners)](https://youtu.be/tjeti5vXWOU) by [GoDataProf](https://www.youtube.com/@godataprof)

				- ⛓️ [Introdução ao Langchain - #Cortes - Live DataHackers](https://youtu.be/fw8y5VRei5Y) by [Prof. João Gabriel Lima](https://www.youtube.com/@profjoaogabriellima)

				- ⛓️ [LangChain: Level up `ChatGPT` !? | LangChain Tutorial Part 1](https://youtu.be/vxUGx8aZpDE) by [Code Affinity](https://www.youtube.com/@codeaffinitydev)

				- ⛓️ [KI schreibt krasses Youtube Skript 😲😳 | LangChain Tutorial Deutsch](https://youtu.be/QpTiXyK1jus) by [SimpleKI](https://www.youtube.com/@simpleki)

				- ⛓️ [Chat with Audio: Langchain, `Chroma DB`, OpenAI, and `Assembly AI`](https://youtu.be/Kjy7cx1r75g) by [AI Anytime](https://www.youtube.com/@AIAnytime)

				- ⛓️ [QA over documents with Auto vector index selection with Langchain router chains](https://youtu.be/9G05qybShv8) by [echohive](https://www.youtube.com/@echohive)

				- ⛓️ [Build your own custom LLM application with `Bubble.io` & Langchain (No Code & Beginner friendly)](https://youtu.be/O7NhQGu1m6c) by [No Code Blackbox](https://www.youtube.com/@nocodeblackbox)

				- ⛓️ [Simple App to Question Your Docs: Leveraging `Streamlit`, `Hugging Face Spaces`, LangChain, and `Claude`!](https://youtu.be/X4YbNECRr7o) by [Chris Alexiuk](https://www.youtube.com/@chrisalexiuk)

				- ⛓️ [LANGCHAIN AI- `ConstitutionalChainAI` + Databutton AI ASSISTANT Web App](https://youtu.be/5zIU6_rdJCU) by [Avra](https://www.youtube.com/@Avra_b)

				- ⛓️ [LANGCHAIN AI AUTONOMOUS AGENT WEB APP - 👶 `BABY AGI` 🤖 with EMAIL AUTOMATION using `DATABUTTON`](https://youtu.be/cvAwOGfeHgw) by [Avra](https://www.youtube.com/@Avra_b)

				- ⛓️ [The Future of Data Analysis: Using A.I. Models in Data Analysis (LangChain)](https://youtu.be/v_LIcVyg5dk) by [Absent Data](https://www.youtube.com/@absentdata)

				- ⛓️ [Memory in LangChain | Deep dive (python)](https://youtu.be/70lqvTFh_Yg) by [Eden Marco](https://www.youtube.com/@EdenMarco)

				- ⛓️ [9 LangChain UseCases | Beginner's Guide | 2023](https://youtu.be/zS8_qosHNMw) by [Data Science Basics](https://www.youtube.com/@datasciencebasics)

				- ⛓️ [Use Large Language Models in Jupyter Notebook | LangChain | Agents & Indexes](https://youtu.be/JSe11L1a_QQ) by [Abhinaw Tiwari](https://www.youtube.com/@AbhinawTiwariAT)

				- ⛓️ [How to Talk to Your Langchain Agent | `11 Labs` + `Whisper`](https://youtu.be/N4k459Zw2PU) by [VRSEN](https://www.youtube.com/@vrsen)

				- ⛓️ [LangChain Deep Dive: 5 FUN AI App Ideas To Build Quickly and Easily](https://youtu.be/mPYEPzLkeks) by [James NoCode](https://www.youtube.com/@jamesnocode)

				- ⛓️ [BEST OPEN Alternative to OPENAI's EMBEDDINGs for Retrieval QA: LangChain](https://youtu.be/ogEalPMUCSY) by [Prompt Engineering](https://www.youtube.com/@engineerprompt)

				- ⛓️ [LangChain 101: Models](https://youtu.be/T6c_XsyaNSQ) by [Mckay Wrigley](https://www.youtube.com/@realmckaywrigley)

				- ⛓️ [LangChain with JavaScript Tutorial #1 | Setup & Using LLMs](https://youtu.be/W3AoeMrg27o) by [Leon van Zyl](https://www.youtube.com/@leonvanzyl)

				- ⛓️ [LangChain Overview & Tutorial for Beginners: Build Powerful AI Apps Quickly & Easily (ZERO CODE)](https://youtu.be/iI84yym473Q) by [James NoCode](https://www.youtube.com/@jamesnocode)

				- ⛓️ [LangChain In Action: Real-World Use Case With Step-by-Step Tutorial](https://youtu.be/UO699Szp82M) by [Rabbitmetrics](https://www.youtube.com/@rabbitmetrics)

				- ⛓️ [Summarizing and Querying Multiple Papers with LangChain](https://youtu.be/p_MQRWH5Y6k) by [Automata Learning Lab](https://www.youtube.com/@automatalearninglab)

				- ⛓️ [Using Langchain (and `Replit`) through `Tana`, ask `Google`/`Wikipedia`/`Wolfram Alpha` to fill out a table](https://youtu.be/Webau9lEzoI) by [Stian Håklev](https://www.youtube.com/@StianHaklev)

				- ⛓️ [Langchain PDF App (GUI) | Create a ChatGPT For Your `PDF` in Python](https://youtu.be/wUAUdEw5oxM) by [Alejandro AO - Software & Ai](https://www.youtube.com/@alejandro_ao)

				- ⛓️ [Auto-GPT with LangChain 🔥 | Create Your Own Personal AI Assistant](https://youtu.be/imDfPmMKEjM) by [Data Science Basics](https://www.youtube.com/@datasciencebasics)

				- ⛓️ [Create Your OWN Slack AI Assistant with Python & LangChain](https://youtu.be/3jFXRNn2Bu8) by [Dave Ebbelaar](https://www.youtube.com/@daveebbelaar)

				- ⛓️ [How to Create LOCAL Chatbots with GPT4All and LangChain [Full Guide]](https://youtu.be/4p1Fojur8Zw) by [Liam Ottley](https://www.youtube.com/@LiamOttley)

				- ⛓️ [Build a `Multilingual PDF` Search App with LangChain, `Cohere` and `Bubble`](https://youtu.be/hOrtuumOrv8) by [Menlo Park Lab](https://www.youtube.com/@menloparklab)

				- ⛓️ [Building a LangChain Agent (code-free!) Using `Bubble` and `Flowise`](https://youtu.be/jDJIIVWTZDE) by [Menlo Park Lab](https://www.youtube.com/@menloparklab)

				- ⛓️ [Build a LangChain-based Semantic PDF Search App with No-Code Tools Bubble and Flowise](https://youtu.be/s33v5cIeqA4) by [Menlo Park Lab](https://www.youtube.com/@menloparklab)

				- ⛓️ [LangChain Memory Tutorial | Building a ChatGPT Clone in Python](https://youtu.be/Cwq91cj2Pnc) by [Alejandro AO - Software & Ai](https://www.youtube.com/@alejandro_ao)

				- ⛓️ [ChatGPT For Your DATA | Chat with Multiple Documents Using LangChain](https://youtu.be/TeDgIDqQmzs) by [Data Science Basics](https://www.youtube.com/@datasciencebasics)

				- ⛓️ [`Llama Index`: Chat with Documentation using URL Loader](https://youtu.be/XJRoDEctAwA) by [Merk](https://www.youtube.com/@merksworld)

				- ⛓️ [Using OpenAI, LangChain, and `Gradio` to Build Custom GenAI Applications](https://youtu.be/1MsmqMg3yUc) by [David Hundley](https://www.youtube.com/@dkhundley)

				---------------------

				⛓ icon marks a new video [last update 2023-05-15]

0

docs/Makefile → docs/api_reference/Makefile

View File

0

docs/_static/css/custom.css → docs/api_reference/_static/css/custom.css

View File

									
										125

docs/api_reference/conf.py
									
										Normal file
									
												View File
												
				@@ -0,0 +1,125 @@

				"""Configuration file for the Sphinx documentation builder."""

				# Configuration file for the Sphinx documentation builder.

				#

				# This file only contains a selection of the most common options. For a full

				# list see the documentation:

				# https://www.sphinx-doc.org/en/master/usage/configuration.html

				# -- Path setup --------------------------------------------------------------

				# If extensions (or modules to document with autodoc) are in another directory,

				# add these directories to sys.path here. If the directory is relative to the

				# documentation root, use os.path.abspath to make it absolute, like shown here.

				#

				import os

				import sys

				import toml

				sys.path.insert(0, os.path.abspath("."))

				sys.path.insert(0, os.path.abspath("../../libs/langchain"))

				with open("../../libs/langchain/pyproject.toml") as f:

				    data = toml.load(f)

				# -- Project information -----------------------------------------------------

				project = "🦜🔗 LangChain"

				copyright = "2023, Harrison Chase"

				author = "Harrison Chase"

				version = data["tool"]["poetry"]["version"]

				release = version

				html_title = project + " " + version

				html_last_updated_fmt = "%b %d, %Y"

				# -- General configuration ---------------------------------------------------

				# Add any Sphinx extension module names here, as strings. They can be

				# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom

				# ones.

				extensions = [

				    "sphinx.ext.autodoc",

				    "sphinx.ext.autodoc.typehints",

				    "sphinx.ext.autosummary",

				    "sphinx.ext.napoleon",

				    "sphinx.ext.viewcode",

				    "sphinxcontrib.autodoc_pydantic",

				    "sphinx_copybutton",

				    "sphinx_panels",

				    "IPython.sphinxext.ipython_console_highlighting",

				]

				source_suffix = [".rst"]

				autodoc_pydantic_model_show_json = False

				autodoc_pydantic_field_list_validators = False

				autodoc_pydantic_config_members = False

				autodoc_pydantic_model_show_config_summary = False

				autodoc_pydantic_model_show_validator_members = False

				autodoc_pydantic_model_show_validator_summary = False

				autodoc_pydantic_model_signature_prefix = "class"

				autodoc_pydantic_field_signature_prefix = "param"

				autodoc_member_order = "groupwise"

				autoclass_content = "both"

				autodoc_typehints_format = "short"

				autodoc_default_options = {

				    "members": True,

				    "show-inheritance": True,

				    "inherited-members": "BaseModel",

				    "undoc-members": True,

				    "special-members": "__call__",

				}

				# autodoc_typehints = "description"

				# Add any paths that contain templates here, relative to this directory.

				templates_path = ["templates"]

				# List of patterns, relative to source directory, that match files and

				# directories to ignore when looking for source files.

				# This pattern also affects html_static_path and html_extra_path.

				exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

				# -- Options for HTML output -------------------------------------------------

				# The theme to use for HTML and HTML Help pages.  See the documentation for

				# a list of builtin themes.

				#

				html_theme = "scikit-learn-modern"

				html_theme_path = ["themes"]

				# redirects dictionary maps from old links to new links

				html_additional_pages = {}

				redirects = {

				    "index": "api_reference",

				}

				for old_link in redirects:

				    html_additional_pages[old_link] = "redirects.html"

				html_context = {

				    "display_github": True,  # Integrate GitHub

				    "github_user": "hwchase17",  # Username

				    "github_repo": "langchain",  # Repo name

				    "github_version": "master",  # Version

				    "conf_py_path": "/docs/api_reference",  # Path in the checkout to the docs root

				    "redirects": redirects,

				}

				# Add any paths that contain custom static files (such as style sheets) here,

				# relative to this directory. They are copied after the builtin static files,

				# so a file named "default.css" will overwrite the builtin "default.css".

				html_static_path = ["_static"]

				# These paths are either relative to html_static_path

				# or fully qualified paths (eg. https://...)

				html_css_files = [

				    "css/custom.css",

				]

				html_use_index = False

				myst_enable_extensions = ["colon_fence"]

				# generate autosummary even if no references

				autosummary_generate = True

									
										96

docs/api_reference/create_api_rst.py
									
										Normal file
									
												View File
												
				@@ -0,0 +1,96 @@

				"""Script for auto-generating api_reference.rst"""

				import glob

				import re

				from pathlib import Path

				ROOT_DIR = Path(__file__).parents[2].absolute()

				PKG_DIR = ROOT_DIR / "libs" / "langchain" / "langchain"

				WRITE_FILE = Path(__file__).parent / "api_reference.rst"

				def load_members() -> dict:

				    members: dict = {}

				    for py in glob.glob(str(PKG_DIR) + "/**/*.py", recursive=True):

				        module = py[len(str(PKG_DIR)) + 1 :].replace(".py", "").replace("/", ".")

				        top_level = module.split(".")[0]

				        if top_level not in members:

				            members[top_level] = {"classes": [], "functions": []}

				        with open(py, "r") as f:

				            for line in f.readlines():

				                cls = re.findall(r"^class ([^_].*)\(", line)

				                members[top_level]["classes"].extend([module + "." + c for c in cls])

				                func = re.findall(r"^def ([^_].*)\(", line)

				                afunc = re.findall(r"^async def ([^_].*)\(", line)

				                func_strings = [module + "." + f for f in func + afunc]

				                members[top_level]["functions"].extend(func_strings)

				    return members

				def construct_doc(members: dict) -> str:

				    full_doc = """\

				.. _api_reference:

				=============

				API Reference

				=============

				"""

				    for module, _members in sorted(members.items(), key=lambda kv: kv[0]):

				        classes = _members["classes"]

				        functions = _members["functions"]

				        if not (classes or functions):

				            continue

				        module_title = module.replace("_", " ").title()

				        if module_title == "Llms":

				            module_title = "LLMs"

				        section = f":mod:`langchain.{module}`: {module_title}"

				        full_doc += f"""\

				{section}

				{'=' * (len(section) + 1)}

				.. automodule:: langchain.{module}

				    :no-members:

				    :no-inherited-members:

				"""

				        if classes:

				            cstring = "\n    ".join(sorted(classes))

				            full_doc += f"""\

				Classes

				--------------

				.. currentmodule:: langchain

				.. autosummary::

				    :toctree: {module}

				    :template: class.rst

				    {cstring}

				"""

				        if functions:

				            fstring = "\n    ".join(sorted(functions))

				            full_doc += f"""\

				Functions

				--------------

				.. currentmodule:: langchain

				.. autosummary::

				    :toctree: {module}

				    {fstring}

				"""

				    return full_doc

				def main() -> None:

				    members = load_members()

				    full_doc = construct_doc(members)

				    with open(WRITE_FILE, "w") as f:

				        f.write(full_doc)

				if __name__ == "__main__":

				    main()

									
										8

docs/api_reference/index.rst
									
										Normal file
									
												View File
												
				@@ -0,0 +1,8 @@

				=============

				LangChain API

				=============

				.. toctree::

				    :maxdepth: 2

				    api_reference.rst

0

docs/make.bat → docs/api_reference/make.bat

View File

									
										9

docs/api_reference/modules/evaluation.rst
									
										Normal file
									
												View File
												
				@@ -0,0 +1,9 @@

				Evaluation

				=======================

				LangChain has a number of convenient evaluation chains you can use off the shelf to grade your models' oupputs.

				.. automodule:: langchain.evaluation

				   :members:

				   :undoc-members:

				   :inherited-members:

4

docs/requirements.txt → docs/api_reference/requirements.txt

View File

@@ -1,13 +1,13 @@
 -e libs/langchain
 autodoc_pydantic==1.8.0
 myst_parser
 nbsphinx==0.8.9
 sphinx==4.5.0
 sphinx-autobuild==2021.3.14
 sphinx_book_theme
 sphinx_rtd_theme==1.0.0
 sphinx-typlog-theme==0.8.0
 sphinx-panels
 toml
 myst_nb
 sphinx_copybutton
 pydata-sphinx-theme==0.13.1
 pydata-sphinx-theme==0.13.1

27

docs/api_reference/templates/COPYRIGHT.txt Normal file

View File

@@ -0,0 +1,27 @@
 Copyright (c) 2007-2023 The scikit-learn developers.
 All rights reserved.
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:
 * Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.
 * Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.
 * Neither the name of the copyright holder nor the names of its
   contributors may be used to endorse or promote products derived from
   this software without specific prior written permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
 FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

									
										28

docs/api_reference/templates/class.rst
									
										Normal file
									
												View File
												
				@@ -0,0 +1,28 @@

				:mod:`{{module}}`.{{objname}}

				{{ underline }}==============

				.. currentmodule:: {{ module }}

				.. autoclass:: {{ objname }}

				   {% block methods %}

				   {% if methods %}

				   .. rubric:: {{ _('Methods') }}

				   .. autosummary::

				   {% for item in methods %}

				      ~{{ name }}.{{ item }}

				   {%- endfor %}

				   {% endif %}

				   {% endblock %}

				   {% block attributes %}

				   {% if attributes %}

				   .. rubric:: {{ _('Attributes') }}

				   .. autosummary::

				   {% for item in attributes %}

				      ~{{ name }}.{{ item }}

				   {%- endfor %}

				   {% endif %}

				   {% endblock %}

									
										15

docs/api_reference/templates/redirects.html
									
										Normal file
									
												View File
												
				@@ -0,0 +1,15 @@

				{% set redirect = pathto(redirects[pagename]) %}

				<!DOCTYPE html>

				<html>

				  <head>

				    <meta charset="utf-8">

				    <meta name="viewport" content="width=device-width, initial-scale=1.0">

				    <meta http-equiv="Refresh" content="0; url={{ redirect }}" />

				    <meta name="Description" content="scikit-learn: machine learning in Python">

				    <link rel="canonical" href="{{ redirect }}" />

				    <title>scikit-learn: machine learning in Python</title>

				  </head>

				  <body>

				    <p>You will be automatically redirected to the <a href="{{ redirect }}">new location of this page</a>.</p>

				  </body>

				</html>

27

docs/api_reference/themes/COPYRIGHT.txt Normal file

View File

@@ -0,0 +1,27 @@
 Copyright (c) 2007-2023 The scikit-learn developers.
 All rights reserved.
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:
 * Redistributions of source code must retain the above copyright notice, this
   list of conditions and the following disclaimer.
 * Redistributions in binary form must reproduce the above copyright notice,
   this list of conditions and the following disclaimer in the documentation
   and/or other materials provided with the distribution.
 * Neither the name of the copyright holder nor the names of its
   contributors may be used to endorse or promote products derived from
   this software without specific prior written permission.
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
 FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

									
										67

docs/api_reference/themes/scikit-learn-modern/javascript.html
									
										Normal file
									
												View File
												
				@@ -0,0 +1,67 @@

				<script>

				$(document).ready(function() {

				    /* Add a [>>>] button on the top-right corner of code samples to hide

				     * the >>> and ... prompts and the output and thus make the code

				     * copyable. */

				    var div = $('.highlight-python .highlight,' +

				                '.highlight-python3 .highlight,' +

				                '.highlight-pycon .highlight,' +

						'.highlight-default .highlight')

				    var pre = div.find('pre');

				    // get the styles from the current theme

				    pre.parent().parent().css('position', 'relative');

				    var hide_text = 'Hide prompts and outputs';

				    var show_text = 'Show prompts and outputs';

				    // create and add the button to all the code blocks that contain >>>

				    div.each(function(index) {

				        var jthis = $(this);

				        if (jthis.find('.gp').length > 0) {

				            var button = $('<span class="copybutton">&gt;&gt;&gt;</span>');

				            button.attr('title', hide_text);

				            button.data('hidden', 'false');

				            jthis.prepend(button);

				        }

				        // tracebacks (.gt) contain bare text elements that need to be

				        // wrapped in a span to work with .nextUntil() (see later)

				        jthis.find('pre:has(.gt)').contents().filter(function() {

				            return ((this.nodeType == 3) && (this.data.trim().length > 0));

				        }).wrap('<span>');

				    });

				    // define the behavior of the button when it's clicked

				    $('.copybutton').click(function(e){

				        e.preventDefault();

				        var button = $(this);

				        if (button.data('hidden') === 'false') {

				            // hide the code output

				            button.parent().find('.go, .gp, .gt').hide();

				            button.next('pre').find('.gt').nextUntil('.gp, .go').css('visibility', 'hidden');

				            button.css('text-decoration', 'line-through');

				            button.attr('title', show_text);

				            button.data('hidden', 'true');

				        } else {

				            // show the code output

				            button.parent().find('.go, .gp, .gt').show();

				            button.next('pre').find('.gt').nextUntil('.gp, .go').css('visibility', 'visible');

				            button.css('text-decoration', 'none');

				            button.attr('title', hide_text);

				            button.data('hidden', 'false');

				        }

				    });

					/*** Add permalink buttons next to glossary terms ***/

					$('dl.glossary > dt[id]').append(function() {

						return ('<a class="headerlink" href="#' +

							    this.getAttribute('id') +

							    '" title="Permalink to this term">¶</a>');

					});

				});

				</script>

				{%- if pagename != 'index' and pagename != 'documentation' %}

				    {% if theme_mathjax_path %}

				<script id="MathJax-script" async src="{{ theme_mathjax_path }}"></script>

				    {% endif %}

				{%- endif %}

									
										142

docs/api_reference/themes/scikit-learn-modern/layout.html
									
										Normal file
									
												View File
												
				@@ -0,0 +1,142 @@

				{# TEMPLATE VAR SETTINGS #}

				{%- set url_root = pathto('', 1) %}

				{%- if url_root == '#' %}{% set url_root = '' %}{% endif %}

				{%- if not embedded and docstitle %}

				  {%- set titlesuffix = " &mdash; "|safe + docstitle|e %}

				{%- else %}

				  {%- set titlesuffix = "" %}

				{%- endif %}

				{%- set lang_attr = 'en' %}

				<!DOCTYPE html>

				<!--[if IE 8]><html class="no-js lt-ie9" lang="{{ lang_attr }}" > <![endif]-->

				<!--[if gt IE 8]><!--> <html class="no-js" lang="{{ lang_attr }}" > <!--<![endif]-->

				<head>

				  <meta charset="utf-8">

				  {{ metatags }}

				  <meta name="viewport" content="width=device-width, initial-scale=1.0">

				  {% block htmltitle %}

				  <title>{{ title|striptags|e }}{{ titlesuffix }}</title>

				  {% endblock %}

				  <link rel="canonical" href="http://scikit-learn.org/stable/{{pagename}}.html" />

				  {% if favicon_url %}

				  <link rel="shortcut icon" href="{{ favicon_url|e }}"/>

				  {% endif %}

				  <link rel="stylesheet" href="{{ pathto('_static/css/vendor/bootstrap.min.css', 1) }}" type="text/css" />

				  {%- for css in css_files %}

				    {%- if css|attr("rel") %}

				  <link rel="{{ css.rel }}" href="{{ pathto(css.filename, 1) }}" type="text/css"{% if css.title is not none %} title="{{ css.title }}"{% endif %} />

				    {%- else %}

				  <link rel="stylesheet" href="{{ pathto(css, 1) }}" type="text/css" />

				    {%- endif %}

				  {%- endfor %}

				  <link rel="stylesheet" href="{{ pathto('_static/' + style, 1) }}" type="text/css" />

				<script id="documentation_options" data-url_root="{{ pathto('', 1) }}" src="{{ pathto('_static/documentation_options.js', 1) }}"></script>

				<script src="{{ pathto('_static/jquery.js', 1) }}"></script>

				{%- block extrahead %} {% endblock %}

				</head>

				<body>

				{% include "nav.html" %}

				{%- block content %}

				<div class="d-flex" id="sk-doc-wrapper">

				    <input type="checkbox" name="sk-toggle-checkbox" id="sk-toggle-checkbox">

				    <label id="sk-sidemenu-toggle" class="sk-btn-toggle-toc btn sk-btn-primary" for="sk-toggle-checkbox">Toggle Menu</label>

				    <div id="sk-sidebar-wrapper" class="border-right">

				      <div class="sk-sidebar-toc-wrapper">

				        <div class="btn-group w-100 mb-2" role="group" aria-label="rellinks">

				          {%- if prev %}

				            <a href="{{ prev.link|e }}" role="button" class="btn sk-btn-rellink py-1" sk-rellink-tooltip="{{ prev.title|striptags }}">Prev</a>

				          {%- else %}

				            <a href="#" role="button" class="btn sk-btn-rellink py-1 disabled"">Prev</a>

				          {%- endif %}

				          {%- if parents -%}

				            <a href="{{ parents[-1].link|e }}" role="button" class="btn sk-btn-rellink py-1" sk-rellink-tooltip="{{ parents[-1].title|striptags }}">Up</a>

				          {%- else %}

				            <a href="#" role="button" class="btn sk-btn-rellink disabled py-1">Up</a>

				          {%- endif %}

				          {%- if next %}

				            <a href="{{ next.link|e }}" role="button" class="btn sk-btn-rellink py-1" sk-rellink-tooltip="{{ next.title|striptags }}">Next</a>

				          {%- else %}

				            <a href="#" role="button" class="btn sk-btn-rellink py-1 disabled"">Next</a>

				          {%- endif %}

				        </div>

				        {%- if pagename != "install" %}

				        <div class="alert alert-warning p-1 mb-2" role="alert">

				          <p class="text-center mb-0">

				          <strong>LangChain {{ release }}</strong><br/>

				          </p>

				        </div>

				        {%- endif %}

				            {%- if meta and meta['parenttoc']|tobool %}

				            <div class="sk-sidebar-toc">

				            {% set nav = get_nav_object(maxdepth=3, collapse=True, numbered=True) %}

				              <ul>

				              {% for main_nav_item in nav %}

				              {% if main_nav_item.active %}

				              <li>

				                <a href="{{ main_nav_item.url }}" class="sk-toc-active">{{ main_nav_item.title }}</a>

				              </li>

				              <ul>

				              {% for nav_item in main_nav_item.children %}

				                <li>

				                  <a href="{{ nav_item.url }}" class="{% if nav_item.active %}sk-toc-active{% endif %}">{{ nav_item.title }}</a>

				                  {% if nav_item.children %}

				                  <ul>

				                    {% for inner_child in nav_item.children %}

				                      <li class="sk-toctree-l3">

				                        <a href="{{ inner_child.url }}">{{ inner_child.title }}</a>

				                      </li>

				                    {% endfor %}

				                  </ul>

				                  {% endif %}

				                </li>

				              {% endfor %}

				              </ul>

				              {% endif %}

				              {% endfor %}

				              </ul>

				            </div>

				            {%- elif meta and meta['globalsidebartoc']|tobool %}

				            <div class="sk-sidebar-toc sk-sidebar-global-toc">

				              {{ toctree(maxdepth=2, titles_only=True) }}

				            </div>

				            {%- else %}

				            <div class="sk-sidebar-toc">

				              {{ toc }}

				            </div>

				            {%- endif %}

				      </div>

				    </div>

				    <div id="sk-page-content-wrapper">

				      <div class="sk-page-content container-fluid body px-md-3" role="main">

				        {% block body %}{% endblock %}

				      </div>

				    <div class="container">

				      <footer class="sk-content-footer">

				        {%- if pagename != 'index' %}

				        {%- if show_copyright %}

				          {%- if hasdoc('copyright') %}

				            {% trans path=pathto('copyright'), copyright=copyright|e %}&copy; {{ copyright }}.{% endtrans %}

				          {%- else %}

				            {% trans copyright=copyright|e %}&copy; {{ copyright }}.{% endtrans %}

				          {%- endif %}

				        {%- endif %}

				        {%- if last_updated %}

				          {% trans last_updated=last_updated|e %}Last updated on {{ last_updated }}.{% endtrans %}

				        {%- endif %}

				        {%- if show_source and has_source and sourcename %}

				          <a href="{{ pathto('_sources/' + sourcename, true)|e }}" rel="nofollow">{{ _('Show this page source') }}</a>

				        {%- endif %}

				        {%- endif %}

				      </footer>

				    </div>

				  </div>

				</div>

				{%- endblock %}

				<script src="{{ pathto('_static/js/vendor/bootstrap.min.js', 1) }}"></script>

				{% include "javascript.html" %}

				</body>

				</html>

									
										69

docs/api_reference/themes/scikit-learn-modern/nav.html
									
										Normal file
									
												View File
												
				@@ -0,0 +1,69 @@

				{%- if pagename != 'index' and pagename != 'documentation' %}

				  {%- set nav_bar_class = "sk-docs-navbar" %}

				  {%- set top_container_cls = "sk-docs-container" %}

				{%- else %}

				  {%- set nav_bar_class = "sk-landing-navbar" %}

				  {%- set top_container_cls = "sk-landing-container" %}

				{%- endif %}

				{% if theme_link_to_live_contributing_page|tobool %}

				{# Link to development page for live builds #}

				  {%- set development_link = "https://scikit-learn.org/dev/developers/index.html" %}

				{# Open on a new development page in new window/tab for live builds #}

				  {%- set development_attrs = 'target="_blank" rel="noopener noreferrer"' %}

				{%- else %}

				  {%- set development_link = pathto('developers/index') %}

				  {%- set development_attrs = '' %}

				{%- endif %}

				<nav id="navbar" class="{{ nav_bar_class }} navbar navbar-expand-md navbar-light bg-light py-0">

				  <div class="container-fluid {{ top_container_cls }} px-0">

				    {%- if logo_url %}

				      <a class="navbar-brand py-0" href="{{ pathto('index') }}">

				        <img

				          class="sk-brand-img"

				          src="{{ logo_url|e }}"

				          alt="logo"/>

				      </a>

				    {%- endif %}

				    <button

				      id="sk-navbar-toggler"

				      class="navbar-toggler"

				      type="button"

				      data-toggle="collapse"

				      data-target="#navbarSupportedContent"

				      aria-controls="navbarSupportedContent"

				      aria-expanded="false"

				      aria-label="Toggle navigation"

				    >

				      <span class="navbar-toggler-icon"></span>

				    </button>

				    <div class="sk-navbar-collapse collapse navbar-collapse" id="navbarSupportedContent">

				      <ul class="navbar-nav mr-auto">

				        <li class="nav-item">

				          <a class="sk-nav-link nav-link" href="{{ pathto('api_reference') }}">API</a>

				        </li>

				        <li class="nav-item">

				          <a class="sk-nav-link nav-link" target="_blank" rel="noopener noreferrer" href="https://python.langchain.com/">Python Docs</a>

				        </li>

				        {%- for title, link, link_attrs in drop_down_navigation %}

				        <li class="nav-item">

				          <a class="sk-nav-link nav-link nav-more-item-mobile-items" href="{{ link }}" {{ link_attrs }}>{{ title }}</a>

				        </li>

				        {%- endfor %}

				      </ul>

				      {%- if pagename != "search"%}

				      <div id="searchbox" role="search">

				          <div class="searchformwrapper">

				          <form class="search" action="{{ pathto('search') }}" method="get">

				            <input class="sk-search-text-input" type="text" name="q" aria-labelledby="searchlabel" />

				            <input class="sk-search-text-btn" type="submit" value="{{ _('Go') }}" />

				          </form>

				          </div>

				      </div>

				      {%- endif %}

				    </div>

				  </div>

				</nav>

									
										16

docs/api_reference/themes/scikit-learn-modern/search.html
									
										Normal file
									
												View File
												
				@@ -0,0 +1,16 @@

				{%- extends "basic/search.html" %}

				{% block extrahead %}

				  <script type="text/javascript" src="{{ pathto('_static/underscore.js', 1) }}"></script>

				  <script type="text/javascript" src="{{ pathto('searchindex.js', 1) }}" defer></script>

				  <script type="text/javascript" src="{{ pathto('_static/doctools.js', 1) }}"></script>

				  <script type="text/javascript" src="{{ pathto('_static/language_data.js', 1) }}"></script>

				  <script type="text/javascript" src="{{ pathto('_static/searchtools.js', 1) }}"></script>

				  <!-- <script type="text/javascript" src="{{ pathto('_static/sphinx_highlight.js', 1) }}"></script> -->

				  <script type="text/javascript">

				    $(document).ready(function() {

				      if (!Search.out) {

				        Search.init();

				      }

				    });

				  </script>

				{% endblock %}

1395

docs/api_reference/themes/scikit-learn-modern/static/css/theme.css Normal file

View File

File diff suppressed because it is too large Load Diff

6

docs/api_reference/themes/scikit-learn-modern/static/css/vendor/bootstrap.min.css vendored Normal file

View File

File diff suppressed because one or more lines are too long

6

docs/api_reference/themes/scikit-learn-modern/static/js/vendor/bootstrap.min.js vendored Normal file

View File

File diff suppressed because one or more lines are too long

2

docs/api_reference/themes/scikit-learn-modern/static/js/vendor/jquery-3.6.3.slim.min.js vendored Normal file

View File

File diff suppressed because one or more lines are too long

8

docs/api_reference/themes/scikit-learn-modern/theme.conf Normal file

View File

@@ -0,0 +1,8 @@
 [theme]
 inherit = basic
 pygments_style = default
 stylesheet = css/theme.css
 [options]
 link_to_live_contributing_page = false
 mathjax_path =

									
										112

docs/conf.py
									
												View File
											
				@@ -1,112 +0,0 @@

				"""Configuration file for the Sphinx documentation builder."""

				# Configuration file for the Sphinx documentation builder.

				#

				# This file only contains a selection of the most common options. For a full

				# list see the documentation:

				# https://www.sphinx-doc.org/en/master/usage/configuration.html

				# -- Path setup --------------------------------------------------------------

				# If extensions (or modules to document with autodoc) are in another directory,

				# add these directories to sys.path here. If the directory is relative to the

				# documentation root, use os.path.abspath to make it absolute, like shown here.

				#

				# import os

				# import sys

				# sys.path.insert(0, os.path.abspath('.'))

				import toml

				with open("../pyproject.toml") as f:

				    data = toml.load(f)

				# -- Project information -----------------------------------------------------

				project = "🦜🔗 LangChain"

				copyright = "2023, Harrison Chase"

				author = "Harrison Chase"

				version = data["tool"]["poetry"]["version"]

				release = version

				html_title = project + " " + version

				html_last_updated_fmt = "%b %d, %Y"

				# -- General configuration ---------------------------------------------------

				# Add any Sphinx extension module names here, as strings. They can be

				# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom

				# ones.

				extensions = [

				    "sphinx.ext.autodoc",

				    "sphinx.ext.autodoc.typehints",

				    "sphinx.ext.autosummary",

				    "sphinx.ext.napoleon",

				    "sphinx.ext.viewcode",

				    "sphinxcontrib.autodoc_pydantic",

				    "myst_nb",

				    "sphinx_copybutton",

				    "sphinx_panels",

				    "IPython.sphinxext.ipython_console_highlighting",

				]

				source_suffix = [".ipynb", ".html", ".md", ".rst"]

				autodoc_pydantic_model_show_json = False

				autodoc_pydantic_field_list_validators = False

				autodoc_pydantic_config_members = False

				autodoc_pydantic_model_show_config_summary = False

				autodoc_pydantic_model_show_validator_members = False

				autodoc_pydantic_model_show_field_summary = False

				autodoc_pydantic_model_members = False

				autodoc_pydantic_model_undoc_members = False

				# autodoc_typehints = "signature"

				# autodoc_typehints = "description"

				# Add any paths that contain templates here, relative to this directory.

				templates_path = ["_templates"]

				# List of patterns, relative to source directory, that match files and

				# directories to ignore when looking for source files.

				# This pattern also affects html_static_path and html_extra_path.

				exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

				# -- Options for HTML output -------------------------------------------------

				# The theme to use for HTML and HTML Help pages.  See the documentation for

				# a list of builtin themes.

				#

				html_theme = "sphinx_book_theme"

				html_theme_options = {

				    "path_to_docs": "docs",

				    "repository_url": "https://github.com/hwchase17/langchain",

				    "use_repository_button": True,

				}

				html_context = {

				    "display_github": True,  # Integrate GitHub

				    "github_user": "hwchase17",  # Username

				    "github_repo": "langchain",  # Repo name

				    "github_version": "master",  # Version

				    "conf_py_path": "/docs/",  # Path in the checkout to the docs root

				}

				# Add any paths that contain custom static files (such as style sheets) here,

				# relative to this directory. They are copied after the builtin static files,

				# so a file named "default.css" will overwrite the builtin "default.css".

				html_static_path = ["_static"]

				# These paths are either relative to html_static_path

				# or fully qualified paths (eg. https://...)

				html_css_files = [

				    "css/custom.css",

				]

				html_js_files = [

				    "js/mendablesearch.js",

				]

				nb_execution_mode = "off"

				myst_enable_extensions = ["colon_fence"]

									
										192

docs/dependents.md
									
												View File
											
				@@ -1,192 +0,0 @@

				# Dependents

				Dependents stats for `hwchase17/langchain`

				[![](https://img.shields.io/static/v1?label=Used%20by&message=5152&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)

				[![](https://img.shields.io/static/v1?label=Used%20by%20(public)&message=172&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)

				[![](https://img.shields.io/static/v1?label=Used%20by%20(private)&message=4980&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)

				[![](https://img.shields.io/static/v1?label=Used%20by%20(stars)&message=17239&color=informational&logo=slickpic)](https://github.com/hwchase17/langchain/network/dependents)

				[update: 2023-05-17; only dependent repositories with Stars > 100]

				| Repository | Stars  |

				| :--------  | -----: |

				|[openai/openai-cookbook](https://github.com/openai/openai-cookbook) | 35401 |

				|[LAION-AI/Open-Assistant](https://github.com/LAION-AI/Open-Assistant) | 32861 |

				|[microsoft/TaskMatrix](https://github.com/microsoft/TaskMatrix) | 32766 |

				|[hpcaitech/ColossalAI](https://github.com/hpcaitech/ColossalAI) | 29560 |

				|[reworkd/AgentGPT](https://github.com/reworkd/AgentGPT) | 22315 |

				|[imartinez/privateGPT](https://github.com/imartinez/privateGPT) | 17474 |

				|[openai/chatgpt-retrieval-plugin](https://github.com/openai/chatgpt-retrieval-plugin) | 16923 |

				|[mindsdb/mindsdb](https://github.com/mindsdb/mindsdb) | 16112 |

				|[jerryjliu/llama_index](https://github.com/jerryjliu/llama_index) | 15407 |

				|[mlflow/mlflow](https://github.com/mlflow/mlflow) | 14345 |

				|[GaiZhenbiao/ChuanhuChatGPT](https://github.com/GaiZhenbiao/ChuanhuChatGPT) | 10372 |

				|[databrickslabs/dolly](https://github.com/databrickslabs/dolly) | 9919 |

				|[AIGC-Audio/AudioGPT](https://github.com/AIGC-Audio/AudioGPT) | 8177 |

				|[logspace-ai/langflow](https://github.com/logspace-ai/langflow) | 6807 |

				|[imClumsyPanda/langchain-ChatGLM](https://github.com/imClumsyPanda/langchain-ChatGLM) | 6087 |

				|[arc53/DocsGPT](https://github.com/arc53/DocsGPT) | 5292 |

				|[e2b-dev/e2b](https://github.com/e2b-dev/e2b) | 4622 |

				|[nsarrazin/serge](https://github.com/nsarrazin/serge) | 4076 |

				|[madawei2699/myGPTReader](https://github.com/madawei2699/myGPTReader) | 3952 |

				|[zauberzeug/nicegui](https://github.com/zauberzeug/nicegui) | 3952 |

				|[go-skynet/LocalAI](https://github.com/go-skynet/LocalAI) | 3762 |

				|[GreyDGL/PentestGPT](https://github.com/GreyDGL/PentestGPT) | 3388 |

				|[mmabrouk/chatgpt-wrapper](https://github.com/mmabrouk/chatgpt-wrapper) | 3243 |

				|[zilliztech/GPTCache](https://github.com/zilliztech/GPTCache) | 3189 |

				|[wenda-LLM/wenda](https://github.com/wenda-LLM/wenda) | 3050 |

				|[marqo-ai/marqo](https://github.com/marqo-ai/marqo) | 2930 |

				|[gkamradt/langchain-tutorials](https://github.com/gkamradt/langchain-tutorials) | 2710 |

				|[PrefectHQ/marvin](https://github.com/PrefectHQ/marvin) | 2545 |

				|[project-baize/baize-chatbot](https://github.com/project-baize/baize-chatbot) | 2479 |

				|[whitead/paper-qa](https://github.com/whitead/paper-qa) | 2399 |

				|[langgenius/dify](https://github.com/langgenius/dify) | 2344 |

				|[GerevAI/gerev](https://github.com/GerevAI/gerev) | 2283 |

				|[hwchase17/chat-langchain](https://github.com/hwchase17/chat-langchain) | 2266 |

				|[guangzhengli/ChatFiles](https://github.com/guangzhengli/ChatFiles) | 1903 |

				|[Azure-Samples/azure-search-openai-demo](https://github.com/Azure-Samples/azure-search-openai-demo) | 1884 |

				|[OpenBMB/BMTools](https://github.com/OpenBMB/BMTools) | 1860 |

				|[Farama-Foundation/PettingZoo](https://github.com/Farama-Foundation/PettingZoo) | 1813 |

				|[OpenGVLab/Ask-Anything](https://github.com/OpenGVLab/Ask-Anything) | 1571 |

				|[IntelligenzaArtificiale/Free-Auto-GPT](https://github.com/IntelligenzaArtificiale/Free-Auto-GPT) | 1480 |

				|[hwchase17/notion-qa](https://github.com/hwchase17/notion-qa) | 1464 |

				|[NVIDIA/NeMo-Guardrails](https://github.com/NVIDIA/NeMo-Guardrails) | 1419 |

				|[Unstructured-IO/unstructured](https://github.com/Unstructured-IO/unstructured) | 1410 |

				|[Kav-K/GPTDiscord](https://github.com/Kav-K/GPTDiscord) | 1363 |

				|[paulpierre/RasaGPT](https://github.com/paulpierre/RasaGPT) | 1344 |

				|[StanGirard/quivr](https://github.com/StanGirard/quivr) | 1330 |

				|[lunasec-io/lunasec](https://github.com/lunasec-io/lunasec) | 1318 |

				|[vocodedev/vocode-python](https://github.com/vocodedev/vocode-python) | 1286 |

				|[agiresearch/OpenAGI](https://github.com/agiresearch/OpenAGI) | 1156 |

				|[h2oai/h2ogpt](https://github.com/h2oai/h2ogpt) | 1141 |

				|[jina-ai/thinkgpt](https://github.com/jina-ai/thinkgpt) | 1106 |

				|[yanqiangmiffy/Chinese-LangChain](https://github.com/yanqiangmiffy/Chinese-LangChain) | 1072 |

				|[ttengwang/Caption-Anything](https://github.com/ttengwang/Caption-Anything) | 1064 |

				|[jina-ai/dev-gpt](https://github.com/jina-ai/dev-gpt) | 1057 |

				|[juncongmoo/chatllama](https://github.com/juncongmoo/chatllama) | 1003 |

				|[greshake/llm-security](https://github.com/greshake/llm-security) | 1002 |

				|[visual-openllm/visual-openllm](https://github.com/visual-openllm/visual-openllm) | 957 |

				|[richardyc/Chrome-GPT](https://github.com/richardyc/Chrome-GPT) | 918 |

				|[irgolic/AutoPR](https://github.com/irgolic/AutoPR) | 886 |

				|[mmz-001/knowledge_gpt](https://github.com/mmz-001/knowledge_gpt) | 867 |

				|[thomas-yanxin/LangChain-ChatGLM-Webui](https://github.com/thomas-yanxin/LangChain-ChatGLM-Webui) | 850 |

				|[microsoft/X-Decoder](https://github.com/microsoft/X-Decoder) | 837 |

				|[peterw/Chat-with-Github-Repo](https://github.com/peterw/Chat-with-Github-Repo) | 826 |

				|[cirediatpl/FigmaChain](https://github.com/cirediatpl/FigmaChain) | 782 |

				|[hashintel/hash](https://github.com/hashintel/hash) | 778 |

				|[seanpixel/Teenage-AGI](https://github.com/seanpixel/Teenage-AGI) | 773 |

				|[jina-ai/langchain-serve](https://github.com/jina-ai/langchain-serve) | 738 |

				|[corca-ai/EVAL](https://github.com/corca-ai/EVAL) | 737 |

				|[ai-sidekick/sidekick](https://github.com/ai-sidekick/sidekick) | 717 |

				|[rlancemartin/auto-evaluator](https://github.com/rlancemartin/auto-evaluator) | 703 |

				|[poe-platform/api-bot-tutorial](https://github.com/poe-platform/api-bot-tutorial) | 689 |

				|[SamurAIGPT/Camel-AutoGPT](https://github.com/SamurAIGPT/Camel-AutoGPT) | 666 |

				|[eyurtsev/kor](https://github.com/eyurtsev/kor) | 608 |

				|[run-llama/llama-lab](https://github.com/run-llama/llama-lab) | 559 |

				|[namuan/dr-doc-search](https://github.com/namuan/dr-doc-search) | 544 |

				|[pieroit/cheshire-cat](https://github.com/pieroit/cheshire-cat) | 520 |

				|[griptape-ai/griptape](https://github.com/griptape-ai/griptape) | 514 |

				|[getmetal/motorhead](https://github.com/getmetal/motorhead) | 481 |

				|[hwchase17/chat-your-data](https://github.com/hwchase17/chat-your-data) | 462 |

				|[langchain-ai/langchain-aiplugin](https://github.com/langchain-ai/langchain-aiplugin) | 452 |

				|[jina-ai/agentchain](https://github.com/jina-ai/agentchain) | 439 |

				|[SamurAIGPT/ChatGPT-Developer-Plugins](https://github.com/SamurAIGPT/ChatGPT-Developer-Plugins) | 437 |

				|[alexanderatallah/window.ai](https://github.com/alexanderatallah/window.ai) | 433 |

				|[michaelthwan/searchGPT](https://github.com/michaelthwan/searchGPT) | 427 |

				|[mpaepper/content-chatbot](https://github.com/mpaepper/content-chatbot) | 425 |

				|[mckaywrigley/repo-chat](https://github.com/mckaywrigley/repo-chat) | 422 |

				|[whyiyhw/chatgpt-wechat](https://github.com/whyiyhw/chatgpt-wechat) | 421 |

				|[freddyaboulton/gradio-tools](https://github.com/freddyaboulton/gradio-tools) | 407 |

				|[jonra1993/fastapi-alembic-sqlmodel-async](https://github.com/jonra1993/fastapi-alembic-sqlmodel-async) | 395 |

				|[yeagerai/yeagerai-agent](https://github.com/yeagerai/yeagerai-agent) | 383 |

				|[akshata29/chatpdf](https://github.com/akshata29/chatpdf) | 374 |

				|[OpenGVLab/InternGPT](https://github.com/OpenGVLab/InternGPT) | 368 |

				|[ruoccofabrizio/azure-open-ai-embeddings-qna](https://github.com/ruoccofabrizio/azure-open-ai-embeddings-qna) | 358 |

				|[101dotxyz/GPTeam](https://github.com/101dotxyz/GPTeam) | 357 |

				|[mtenenholtz/chat-twitter](https://github.com/mtenenholtz/chat-twitter) | 354 |

				|[amosjyng/langchain-visualizer](https://github.com/amosjyng/langchain-visualizer) | 343 |

				|[msoedov/langcorn](https://github.com/msoedov/langcorn) | 334 |

				|[showlab/VLog](https://github.com/showlab/VLog) | 330 |

				|[continuum-llms/chatgpt-memory](https://github.com/continuum-llms/chatgpt-memory) | 324 |

				|[steamship-core/steamship-langchain](https://github.com/steamship-core/steamship-langchain) | 323 |

				|[daodao97/chatdoc](https://github.com/daodao97/chatdoc) | 320 |

				|[xuwenhao/geektime-ai-course](https://github.com/xuwenhao/geektime-ai-course) | 308 |

				|[StevenGrove/GPT4Tools](https://github.com/StevenGrove/GPT4Tools) | 301 |

				|[logan-markewich/llama_index_starter_pack](https://github.com/logan-markewich/llama_index_starter_pack) | 300 |

				|[andylokandy/gpt-4-search](https://github.com/andylokandy/gpt-4-search) | 299 |

				|[Anil-matcha/ChatPDF](https://github.com/Anil-matcha/ChatPDF) | 287 |

				|[itamargol/openai](https://github.com/itamargol/openai) | 273 |

				|[BlackHC/llm-strategy](https://github.com/BlackHC/llm-strategy) | 267 |

				|[momegas/megabots](https://github.com/momegas/megabots) | 259 |

				|[bborn/howdoi.ai](https://github.com/bborn/howdoi.ai) | 238 |

				|[Cheems-Seminar/grounded-segment-any-parts](https://github.com/Cheems-Seminar/grounded-segment-any-parts) | 232 |

				|[ur-whitelab/exmol](https://github.com/ur-whitelab/exmol) | 227 |

				|[sullivan-sean/chat-langchainjs](https://github.com/sullivan-sean/chat-langchainjs) | 227 |

				|[explosion/spacy-llm](https://github.com/explosion/spacy-llm) | 226 |

				|[recalign/RecAlign](https://github.com/recalign/RecAlign) | 218 |

				|[jupyterlab/jupyter-ai](https://github.com/jupyterlab/jupyter-ai) | 218 |

				|[alvarosevilla95/autolang](https://github.com/alvarosevilla95/autolang) | 215 |

				|[conceptofmind/toolformer](https://github.com/conceptofmind/toolformer) | 213 |

				|[MagnivOrg/prompt-layer-library](https://github.com/MagnivOrg/prompt-layer-library) | 209 |

				|[JohnSnowLabs/nlptest](https://github.com/JohnSnowLabs/nlptest) | 208 |

				|[airobotlab/KoChatGPT](https://github.com/airobotlab/KoChatGPT) | 197 |

				|[langchain-ai/auto-evaluator](https://github.com/langchain-ai/auto-evaluator) | 195 |

				|[yvann-hub/Robby-chatbot](https://github.com/yvann-hub/Robby-chatbot) | 195 |

				|[alejandro-ao/langchain-ask-pdf](https://github.com/alejandro-ao/langchain-ask-pdf) | 192 |

				|[daveebbelaar/langchain-experiments](https://github.com/daveebbelaar/langchain-experiments) | 189 |

				|[NimbleBoxAI/ChainFury](https://github.com/NimbleBoxAI/ChainFury) | 187 |

				|[kaleido-lab/dolphin](https://github.com/kaleido-lab/dolphin) | 184 |

				|[Anil-matcha/Website-to-Chatbot](https://github.com/Anil-matcha/Website-to-Chatbot) | 183 |

				|[plchld/InsightFlow](https://github.com/plchld/InsightFlow) | 180 |

				|[OpenBMB/AgentVerse](https://github.com/OpenBMB/AgentVerse) | 166 |

				|[benthecoder/ClassGPT](https://github.com/benthecoder/ClassGPT) | 166 |

				|[jbrukh/gpt-jargon](https://github.com/jbrukh/gpt-jargon) | 161 |

				|[hardbyte/qabot](https://github.com/hardbyte/qabot) | 160 |

				|[shaman-ai/agent-actors](https://github.com/shaman-ai/agent-actors) | 153 |

				|[radi-cho/datasetGPT](https://github.com/radi-cho/datasetGPT) | 153 |

				|[poe-platform/poe-protocol](https://github.com/poe-platform/poe-protocol) | 152 |

				|[paolorechia/learn-langchain](https://github.com/paolorechia/learn-langchain) | 149 |

				|[ajndkr/lanarky](https://github.com/ajndkr/lanarky) | 149 |

				|[fengyuli-dev/multimedia-gpt](https://github.com/fengyuli-dev/multimedia-gpt) | 147 |

				|[yasyf/compress-gpt](https://github.com/yasyf/compress-gpt) | 144 |

				|[homanp/superagent](https://github.com/homanp/superagent) | 143 |

				|[realminchoi/babyagi-ui](https://github.com/realminchoi/babyagi-ui) | 141 |

				|[ethanyanjiali/minChatGPT](https://github.com/ethanyanjiali/minChatGPT) | 141 |

				|[ccurme/yolopandas](https://github.com/ccurme/yolopandas) | 139 |

				|[hwchase17/langchain-streamlit-template](https://github.com/hwchase17/langchain-streamlit-template) | 138 |

				|[Jaseci-Labs/jaseci](https://github.com/Jaseci-Labs/jaseci) | 136 |

				|[hirokidaichi/wanna](https://github.com/hirokidaichi/wanna) | 135 |

				|[Haste171/langchain-chatbot](https://github.com/Haste171/langchain-chatbot) | 134 |

				|[jmpaz/promptlib](https://github.com/jmpaz/promptlib) | 130 |

				|[Klingefjord/chatgpt-telegram](https://github.com/Klingefjord/chatgpt-telegram) | 130 |

				|[filip-michalsky/SalesGPT](https://github.com/filip-michalsky/SalesGPT) | 128 |

				|[handrew/browserpilot](https://github.com/handrew/browserpilot) | 128 |

				|[shauryr/S2QA](https://github.com/shauryr/S2QA) | 127 |

				|[steamship-core/vercel-examples](https://github.com/steamship-core/vercel-examples) | 127 |

				|[yasyf/summ](https://github.com/yasyf/summ) | 127 |

				|[gia-guar/JARVIS-ChatGPT](https://github.com/gia-guar/JARVIS-ChatGPT) | 126 |

				|[jerlendds/osintbuddy](https://github.com/jerlendds/osintbuddy) | 125 |

				|[ibiscp/LLM-IMDB](https://github.com/ibiscp/LLM-IMDB) | 124 |

				|[Teahouse-Studios/akari-bot](https://github.com/Teahouse-Studios/akari-bot) | 124 |

				|[hwchase17/chroma-langchain](https://github.com/hwchase17/chroma-langchain) | 124 |

				|[menloparklab/langchain-cohere-qdrant-doc-retrieval](https://github.com/menloparklab/langchain-cohere-qdrant-doc-retrieval) | 123 |

				|[peterw/StoryStorm](https://github.com/peterw/StoryStorm) | 123 |

				|[chakkaradeep/pyCodeAGI](https://github.com/chakkaradeep/pyCodeAGI) | 123 |

				|[petehunt/langchain-github-bot](https://github.com/petehunt/langchain-github-bot) | 115 |

				|[su77ungr/CASALIOY](https://github.com/su77ungr/CASALIOY) | 113 |

				|[eunomia-bpf/GPTtrace](https://github.com/eunomia-bpf/GPTtrace) | 113 |

				|[zenml-io/zenml-projects](https://github.com/zenml-io/zenml-projects) | 112 |

				|[pablomarin/GPT-Azure-Search-Engine](https://github.com/pablomarin/GPT-Azure-Search-Engine) | 111 |

				|[shamspias/customizable-gpt-chatbot](https://github.com/shamspias/customizable-gpt-chatbot) | 109 |

				|[WongSaang/chatgpt-ui-server](https://github.com/WongSaang/chatgpt-ui-server) | 108 |

				|[davila7/file-gpt](https://github.com/davila7/file-gpt) | 104 |

				|[enhancedocs/enhancedocs](https://github.com/enhancedocs/enhancedocs) | 102 |

				|[aurelio-labs/arxiv-bot](https://github.com/aurelio-labs/arxiv-bot) | 101 |

				_Generated by [github-dependents-info](https://github.com/nvuillam/github-dependents-info)_

				[github-dependents-info --repo hwchase17/langchain --markdownfile dependents.md --minstars 100 --sort stars]

7

docs/docs_skeleton/.gitignore vendored Normal file

View File

@@ -0,0 +1,7 @@
 .yarn/
 node_modules/
 .docusaurus
 .cache-loader
 docs/api

									
										49

docs/docs_skeleton/README.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,49 @@

				# Website

				This website is built using [Docusaurus 2](https://docusaurus.io/), a modern static website generator.

				### Installation

				```

				$ yarn

				```

				### Local Development

				```

				$ yarn start

				```

				This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.

				### Build

				```

				$ yarn build

				```

				This command generates static content into the `build` directory and can be served using any static contents hosting service.

				### Deployment

				Using SSH:

				```

				$ USE_SSH=true yarn deploy

				```

				Not using SSH:

				```

				$ GIT_USER=<Your GitHub username> yarn deploy

				```

				If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.

				### Continuous Integration

				Some common defaults for linting/formatting have been set for you. If you integrate your project with an open source Continuous Integration system (e.g. Travis CI, CircleCI), you may check for issues using the following command.

				```

				$ yarn ci

				```

									
										12

docs/docs_skeleton/babel.config.js
									
										Normal file
									
												View File
												
				@@ -0,0 +1,12 @@

				/**

				 * Copyright (c) Meta Platforms, Inc. and affiliates.

				 *

				 * This source code is licensed under the MIT license found in the

				 * LICENSE file in the root directory of this source tree.

				 *

				 * @format

				 */

				module.exports = {

				  presets: [require.resolve("@docusaurus/core/lib/babel/preset")],

				};

									
										76

docs/docs_skeleton/code-block-loader.js
									
										Normal file
									
												View File
												
				@@ -0,0 +1,76 @@

				/* eslint-disable prefer-template */

				/* eslint-disable no-param-reassign */

				// eslint-disable-next-line import/no-extraneous-dependencies

				const babel = require("@babel/core");

				const path = require("path");

				const fs = require("fs");

				/**

				 *

				 * @param {string|Buffer} content Content of the resource file

				 * @param {object} [map] SourceMap data consumable by https://github.com/mozilla/source-map

				 * @param {any} [meta] Meta data, could be anything

				 */

				async function webpackLoader(content, map, meta) {

				  const cb = this.async();

				  if (!this.resourcePath.endsWith(".ts")) {

				    cb(null, JSON.stringify({ content, imports: [] }), map, meta);

				    return;

				  }

				  try {

				    const result = await babel.parseAsync(content, {

				      sourceType: "module",

				      filename: this.resourcePath,

				    });

				    const imports = [];

				    result.program.body.forEach((node) => {

				      if (node.type === "ImportDeclaration") {

				        const source = node.source.value;

				        if (!source.startsWith("langchain")) {

				          return;

				        }

				        node.specifiers.forEach((specifier) => {

				          if (specifier.type === "ImportSpecifier") {

				            const local = specifier.local.name;

				            const imported = specifier.imported.name;

				            imports.push({ local, imported, source });

				          } else {

				            throw new Error("Unsupported import type");

				          }

				        });

				      }

				    });

				    imports.forEach((imp) => {

				      const { imported, source } = imp;

				      const moduleName = source.split("/").slice(1).join("_");

				      const docsPath = path.resolve(__dirname, "docs", "api", moduleName);

				      const available = fs.readdirSync(docsPath, { withFileTypes: true });

				      const found = available.find(

				        (dirent) =>

				          dirent.isDirectory() &&

				          fs.existsSync(path.resolve(docsPath, dirent.name, imported + ".md"))

				      );

				      if (found) {

				        imp.docs =

				          "/" + path.join("docs", "api", moduleName, found.name, imported);

				      } else {

				        throw new Error(

				          `Could not find docs for ${source}.${imported} in docs/api/`

				        );

				      }

				    });

				    cb(null, JSON.stringify({ content, imports }), map, meta);

				  } catch (err) {

				    cb(err);

				  }

				}

				module.exports = webpackLoader;

0

docs/_static/ApifyActors.png → docs/docs_skeleton/docs/_static/ApifyActors.png vendored

View File

Before

Width: | Height: | Size: 559 KiB

After

Width: | Height: | Size: 559 KiB

0

docs/_static/DataberryDashboard.png → docs/docs_skeleton/docs/_static/ChaindeskDashboard.png vendored

View File

Before

Width: | Height: | Size: 157 KiB

After

Width: | Height: | Size: 157 KiB

0

docs/_static/HeliconeDashboard.png → docs/docs_skeleton/docs/_static/HeliconeDashboard.png vendored

View File

Before

Width: | Height: | Size: 235 KiB

After

Width: | Height: | Size: 235 KiB

0

docs/_static/HeliconeKeys.png → docs/docs_skeleton/docs/_static/HeliconeKeys.png vendored

View File

Before

Width: | Height: | Size: 148 KiB

After

Width: | Height: | Size: 148 KiB

0

docs/_static/MetalDash.png → docs/docs_skeleton/docs/_static/MetalDash.png vendored

View File

Before

Width: | Height: | Size: 3.5 MiB

After

Width: | Height: | Size: 3.5 MiB

BIN
docs/docs_skeleton/docs/_static/android-chrome-192x192.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 18 KiB

BIN
docs/docs_skeleton/docs/_static/android-chrome-512x512.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 85 KiB

BIN
docs/docs_skeleton/docs/_static/apple-touch-icon.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 16 KiB

									
										21

docs/docs_skeleton/docs/_static/css/custom.css
									
										vendored
									
										Normal file
									
												View File
												
				@@ -0,0 +1,21 @@

				pre {

				  white-space: break-spaces;

				}

				@media (min-width: 1200px) {

				  .container,

				  .container-lg,

				  .container-md,

				  .container-sm,

				  .container-xl {

				    max-width: 2560px !important;

				  }

				}

				#my-component-root *, #headlessui-portal-root * {

				  z-index: 10000;

				}

				.content-container p {

				    margin: revert;

				}

BIN
docs/docs_skeleton/docs/_static/favicon-16x16.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 542 B

BIN
docs/docs_skeleton/docs/_static/favicon-32x32.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 1.2 KiB

BIN
docs/docs_skeleton/docs/_static/favicon.ico vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 15 KiB

0

docs/_static/js/mendablesearch.js → docs/docs_skeleton/docs/_static/js/mendablesearch.js vendored

View File

BIN
docs/docs_skeleton/docs/_static/lc_modules.jpg vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 103 KiB

BIN
docs/docs_skeleton/docs/_static/parrot-chainlink-icon.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 136 KiB

BIN
docs/docs_skeleton/docs/_static/parrot-icon.png vendored Normal file

View File

Binary file not shown.

After

Width: | Height: | Size: 34 KiB

5

docs/docs_skeleton/docs/get_started/installation.mdx Normal file

View File

@@ -0,0 +1,5 @@
 # Installation
 import Installation from "@snippets/get_started/installation.mdx"
 <Installation/>

65

docs/docs_skeleton/docs/get_started/introduction.mdx Normal file

View File

@@ -0,0 +1,65 @@
 ---
 sidebar_position: 0
 ---
 # Introduction
 **LangChain** is a framework for developing applications powered by language models. It enables applications that are:
 - **Data-aware**: connect a language model to other sources of data
 - **Agentic**: allow a language model to interact with its environment
 The main value props of LangChain are:
 . **Components**: abstractions for working with language models, along with a collection of implementations for each abstraction. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
 . **Off-the-shelf chains**: a structured assembly of components for accomplishing specific higher-level tasks
 Off-the-shelf chains make it easy to get started. For more complex applications and nuanced use-cases, components make it easy to customize existing chains or build new ones.
 ## Get started
 [Here’s](/docs/get_started/installation.html) how to install LangChain, set up your environment, and start building.
 We recommend following our [Quickstart](/docs/get_started/quickstart.html) guide to familiarize yourself with the framework by building your first LangChain application.
 _**Note**: These docs are for the LangChain [Python package](https://github.com/hwchase17/langchain). For documentation on [LangChain.js](https://github.com/hwchase17/langchainjs), the JS/TS version, [head here](https://js.langchain.com/docs)._
 ## Modules
 LangChain provides standard, extendable interfaces and external integrations for the following modules, listed from least to most complex:
 #### [Model I/O](/docs/modules/model_io/)
 Interface with language models
 #### [Data connection](/docs/modules/data_connection/)
 Interface with application-specific data
 #### [Chains](/docs/modules/chains/)
 Construct sequences of calls
 #### [Agents](/docs/modules/agents/)
 Let chains choose which tools to use given high-level directives
 #### [Memory](/docs/modules/memory/)
 Persist application state between runs of a chain
 #### [Callbacks](/docs/modules/callbacks/)
 Log and stream intermediate steps of any chain
 ## Examples, ecosystem, and resources
 ### [Use cases](/docs/use_cases/)
 Walkthroughs and best-practices for common end-to-end use cases, like:
 - [Chatbots](/docs/use_cases/chatbots/)
 - [Answering questions using sources](/docs/use_cases/question_answering/)
 - [Analyzing structured data](/docs/use_cases/tabular.html)
 - and much more...
 ### [Guides](/docs/guides/)
 Learn best practices for developing with LangChain.
 ### [Ecosystem](/docs/ecosystem/)
 LangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. Check out our growing list of [integrations](/docs/integrations/) and [dependent repos](/docs/ecosystem/dependents).
 ### [Additional resources](/docs/additional_resources/)
 Our community is full of prolific developers, creative builders, and fantastic teachers. Check out [YouTube tutorials](/docs/additional_resources/youtube.html) for great tutorials from folks in the community, and [Gallery](https://github.com/kyrolabs/awesome-langchain) for a list of awesome LangChain projects, compiled by the folks at [KyroLabs](https://kyrolabs.com).
 <h3><span style={{color:"#2e8555"}}> Support </span></h3>
 Join us on [GitHub](https://github.com/hwchase17/langchain) or [Discord](https://discord.gg/6adMQxSpJS) to ask questions, share feedback, meet other developers building with LangChain, and dream about the future of LLM’s.
 ## API reference
 Head to the [reference](https://api.python.langchain.com) section for full documentation of all classes and methods in the LangChain Python package.

162

docs/docs_skeleton/docs/get_started/quickstart.mdx Normal file

View File

@@ -0,0 +1,162 @@
 # Quickstart
 ## Installation
 To install LangChain run:
 import Tabs from '@theme/Tabs';
 import TabItem from '@theme/TabItem';
 import Install from "@snippets/get_started/quickstart/installation.mdx"
 <Install/>
 For more details, see our [Installation guide](/docs/get_started/installation.html).
 ## Environment setup
 Using LangChain will usually require integrations with one or more model providers, data stores, APIs, etc. For this example, we'll use OpenAI's model APIs.
 import OpenAISetup from "@snippets/get_started/quickstart/openai_setup.mdx"
 <OpenAISetup/>
 ## Building an application
 Now we can start building our language model application. LangChain provides many modules that can be used to build language model applications.
 Modules can be used as stand-alones in simple applications and they can be combined for more complex use cases.
 The core building block of LangChain applications is the LLMChain.
 This combines three things:
 - LLM: The language model is the core reasoning engine here. In order to work with LangChain, you need to understand the different types of language models and how to work with them.
 - Prompt Templates: This provides instructions to the language model. This controls what the language model outputs, so understanding how to construct prompts and different prompting strategies is crucial.
 - Output Parsers: These translate the raw response from the LLM to a more workable format, making it easy to use the output downstream.
 In this getting started guide we will cover those three components by themselves, and then cover the LLMChain which combines all of them.
 Understanding these concepts will set you up well for being able to use and customize LangChain applications.
 Most LangChain applications allow you to configure the LLM and/or the prompt used, so knowing how to take advantage of this will be a big enabler.
 ## LLMs
 There are two types of language models, which in LangChain are called:
 - LLMs: this is a language model which takes a string as input and returns a string
 - ChatModels: this is a language model which takes a list of messages as input and returns a message
 The input/output for LLMs is simple and easy to understand - a string.
 But what about ChatModels? The input there is a list of `ChatMessage`s, and the output is a single `ChatMessage`.
 A `ChatMessage` has two required components:
 - `content`: This is the content of the message.
 - `role`: This is the role of the entity from which the `ChatMessage` is coming from.
 LangChain provides several objects to easily distinguish between different roles:
 - `HumanMessage`: A `ChatMessage` coming from a human/user.
 - `AIMessage`: A `ChatMessage` coming from an AI/assistant.
 - `SystemMessage`: A `ChatMessage` coming from the system.
 - `FunctionMessage`: A `ChatMessage` coming from a function call.
 If none of those roles sound right, there is also a `ChatMessage` class where you can specify the role manually.
 For more information on how to use these different messages most effectively, see our prompting guide.
 LangChain exposes a standard interface for both, but it's useful to understand this difference in order to construct prompts for a given language model.
 The standard interface that LangChain exposes has two methods:
 - `predict`: Takes in a string, returns a string
 - `predict_messages`: Takes in a list of messages, returns a message.
 Let's see how to work with these different types of models and these different types of inputs.
 First, let's import an LLM and a ChatModel.
 import ImportLLMs from "@snippets/get_started/quickstart/import_llms.mdx"
 <ImportLLMs/>
 The `OpenAI` and `ChatOpenAI` objects are basically just configuration objects.
 You can initialize them with parameters like `temperature` and others, and pass them around.
 Next, let's use the `predict` method to run over a string input.
 import InputString from "@snippets/get_started/quickstart/input_string.mdx"
 <InputString/>
 Finally, let's use the `predict_messages` method to run over a list of messages.
 import InputMessages from "@snippets/get_started/quickstart/input_messages.mdx"
 <InputMessages/>
 For both these methods, you can also pass in parameters as key word arguments.
 For example, you could pass in `temperature=0` to adjust the temperature that is used from what the object was configured with.
 Whatever values are passed in during run time will always override what the object was configured with.
 ## Prompt templates
 Most LLM applications do not pass user input directly into an LLM. Usually they will add the user input to a larger piece of text, called a prompt template, that provides additional context on the specific task at hand.
 In the previous example, the text we passed to the model contained instructions to generate a company name. For our application, it'd be great if the user only had to provide the description of a company/product, without having to worry about giving the model instructions.
 PromptTemplates help with exactly this!
 They bundle up all the logic for going from user input into a fully formatted prompt.
 This can start off very simple - for example, a prompt to produce the above string would just be:
 import PromptTemplateLLM from "@snippets/get_started/quickstart/prompt_templates_llms.mdx"
 import PromptTemplateChatModel from "@snippets/get_started/quickstart/prompt_templates_chat_models.mdx"
 <PromptTemplateLLM/>
 However, the advantages of using these over raw string formatting are several.
 You can "partial" out variables - eg you can format only some of the variables at a time.
 You can compose them together, easily combining different templates into a single prompt.
 For explanations of these functionalities, see the [section on prompts](/docs/modules/model_io/prompts) for more detail.
 PromptTemplates can also be used to produce a list of messages.
 In this case, the prompt not only contains information about the content, but also each message (its role, its position in the list, etc)
 Here, what happens most often is a ChatPromptTemplate is a list of ChatMessageTemplates.
 Each ChatMessageTemplate contains instructions for how to format that ChatMessage - its role, and then also its content.
 Let's take a look at this below:
 <PromptTemplateChatModel/>
 ChatPromptTemplates can also include other things besides ChatMessageTemplates - see the [section on prompts](/docs/modules/model_io/prompts) for more detail.
 ## Output Parsers
 OutputParsers convert the raw output of an LLM into a format that can be used downstream.
 There are few main type of OutputParsers, including:
 - Convert text from LLM -> structured information (eg JSON)
 - Convert a ChatMessage into just a string
 - Convert the extra information returned from a call besides the message (like OpenAI function invocation) into a string.
 For full information on this, see the [section on output parsers](/docs/modules/model_io/output_parsers)
 In this getting started guide, we will write our own output parser - one that converts a comma separated list into a list.
 import OutputParser from "@snippets/get_started/quickstart/output_parser.mdx"
 <OutputParser/>
 ## LLMChain
 We can now combine all these into one chain.
 This chain will take input variables, pass those to a prompt template to create a prompt, pass the prompt to an LLM, and then pass the output through an (optional) output parser.
 This is a convenient way to bundle up a modular piece of logic.
 Let's see it in action!
 import LLMChain from "@snippets/get_started/quickstart/llm_chain.mdx"
 <LLMChain/>
 ## Next Steps
 This is it!
 We've now gone over how to create the core building block of LangChain applications - the LLMChains.
 There is a lot more nuance in all these components (LLMs, prompts, output parsers) and a lot more different components to learn about as well.
 To continue on your journey:
 - [Dive deeper](/docs/modules/model_io) into LLMs, prompts, and output parsers
 - Learn the other [key components](/docs/modules)
 - Check out our [helpful guides](/docs/guides) for detailed walkthroughs on particular topics
 - Explore [end-to-end use cases](/docs/use_cases)

8

docs/docs_skeleton/docs/guides/evaluation/comparison/index.mdx Normal file

View File

@@ -0,0 +1,8 @@
 ---
 sidebar_position: 3
 ---
 # Comparison Evaluators
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

12

docs/docs_skeleton/docs/guides/evaluation/examples/index.mdx Normal file

View File

@@ -0,0 +1,12 @@
 ---
 sidebar_position: 5
 ---
 # Examples
 🚧 _Docs under construction_ 🚧
 Below are some examples for inspecting and checking different chains.
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

28

docs/docs_skeleton/docs/guides/evaluation/index.mdx Normal file

View File

@@ -0,0 +1,28 @@
 ---
 sidebar_position: 6
 ---
 import DocCardList from "@theme/DocCardList";
 # Evaluation
 Language models can be unpredictable. This makes it challenging to ship reliable applications to production, where repeatable, useful outcomes across diverse inputs are a minimum requirement. Tests help demonstrate each component in an LLM application can produce the required or expected functionality. These tests also safeguard against regressions while you improve interconnected pieces of an integrated system. However, measuring the quality of generated text can be challenging. It can be hard to agree on the right set of metrics for your application, and it can be difficult to translate those into better performance. Furthermore, it's common to lack sufficient evaluation data to adequately test the range of inputs and expected outputs for each component when you're just getting started. The LangChain community is building open source tools and guides to help address these challenges.
 LangChain exposes different types of evaluators for common types of evaluation. Each type has off-the-shelf implementations you can use to get started, as well as an
  extensible API so you can create your own or contribute improvements for everyone to use. The following sections have example notebooks for you to get started.
 - [String Evaluators](/docs/modules/evaluation/string/): Evaluate the predicted string for a given input, usually against a reference string
 - [Trajectory Evaluators](/docs/modules/evaluation/trajectory/): Evaluate the whole trajectory of agent actions
 - [Comparison Evaluators](/docs/modules/evaluation/comparison/): Compare predictions from two runs on a common input
 This section also provides some additional examples of how you could use these evaluators for different scenarios or apply to different chain implementations in the LangChain library. Some examples include:
 - [Preference Scoring Chain Outputs](/docs/modules/evaluation/examples/comparisons): An example using a comparison evaluator on different models or prompts to select statistically significant differences in aggregate preference scores
 ## Reference Docs
 For detailed information of the available evaluators, including how to instantiate, configure, and customize them. Check out the [reference documentation](https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.evaluation) directly.
 <DocCardList />

8

docs/docs_skeleton/docs/guides/evaluation/string/index.mdx Normal file

View File

@@ -0,0 +1,8 @@
 ---
 sidebar_position: 2
 ---
 # String Evaluators
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

8

docs/docs_skeleton/docs/guides/evaluation/trajectory/index.mdx Normal file

View File

@@ -0,0 +1,8 @@
 ---
 sidebar_position: 4
 ---
 # Trajectory Evaluators
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

									
										12

docs/docs_skeleton/docs/guides/langsmith/index.md
									
										Normal file
									
												View File
												
				@@ -0,0 +1,12 @@

				# LangSmith

				import DocCardList from "@theme/DocCardList";

				LangSmith helps you trace and evaluate your language model applications and intelligent agents to help you

				move from prototype to production.

				Check out the [interactive walkthrough](walkthrough) below to get started.

				For more information, please refer to the [LangSmith documentation](https://docs.smith.langchain.com/)

				<DocCardList />

13

docs/docs_skeleton/docs/modules/agents/agent_types/chat_conversation_agent.mdx Normal file

View File

@@ -0,0 +1,13 @@
 # Conversational
 This walkthrough demonstrates how to use an agent optimized for conversation. Other agents are often optimized for using tools to figure out the best response, which is not ideal in a conversational setting where you may want the agent to be able to chat with the user as well.
 import Example from "@snippets/modules/agents/agent_types/conversational_agent.mdx"
 <Example/>
 import ChatExample from "@snippets/modules/agents/agent_types/chat_conversation_agent.mdx"
 ## Using a chat model
 <ChatExample/>

57

docs/docs_skeleton/docs/modules/agents/agent_types/index.mdx Normal file

View File

@@ -0,0 +1,57 @@
 ---
 sidebar_position: 0
 ---
 # Agent types
 ## Action agents
 Agents use an LLM to determine which actions to take and in what order.
 An action can either be using a tool and observing its output, or returning a response to the user.
 Here are the agents available in LangChain.
 ### [Zero-shot ReAct](/docs/modules/agents/agent_types/react.html)
 This agent uses the [ReAct](https://arxiv.org/pdf/2205.00445.pdf) framework to determine which tool to use
 based solely on the tool's description. Any number of tools can be provided.
 This agent requires that a description is provided for each tool.
 **Note**: This is the most general purpose action agent.
 ### [Structured input ReAct](/docs/modules/agents/agent_types/structured_chat.html)
 The structured tool chat agent is capable of using multi-input tools.
 Older agents are configured to specify an action input as a single string, but this agent can use a tools' argument
 schema to create a structured action input. This is useful for more complex tool usage, like precisely
 navigating around a browser.
 ### [OpenAI Functions](/docs/modules/agents/agent_types/openai_functions_agent.html)
 Certain OpenAI models (like gpt-3.5-turbo-0613 and gpt-4-0613) have been explicitly fine-tuned to detect when a
 function should to be called and respond with the inputs that should be passed to the function.
 The OpenAI Functions Agent is designed to work with these models.
 ### [Conversational](/docs/modules/agents/agent_types/chat_conversation_agent.html)
 This agent is designed to be used in conversational settings.
 The prompt is designed to make the agent helpful and conversational.
 It uses the ReAct framework to decide which tool to use, and uses memory to remember the previous conversation interactions.
 ### [Self ask with search](/docs/modules/agents/agent_types/self_ask_with_search.html)
 This agent utilizes a single tool that should be named `Intermediate Answer`.
 This tool should be able to lookup factual answers to questions. This agent
 is equivalent to the original [self ask with search paper](https://ofir.io/self-ask.pdf),
 where a Google search API was provided as the tool.
 ### [ReAct document store](/docs/modules/agents/agent_types/react_docstore.html)
 This agent uses the ReAct framework to interact with a docstore. Two tools must
 be provided: a `Search` tool and a `Lookup` tool (they must be named exactly as so).
 The `Search` tool should search for a document, while the `Lookup` tool should lookup
 a term in the most recently found document.
 This agent is equivalent to the
 original [ReAct paper](https://arxiv.org/pdf/2210.03629.pdf), specifically the Wikipedia example.
 ## [Plan-and-execute agents](/docs/modules/agents/agent_types/plan_and_execute.html)
 Plan and execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).

11

docs/docs_skeleton/docs/modules/agents/agent_types/openai_functions_agent.mdx Normal file

View File

@@ -0,0 +1,11 @@
 # OpenAI functions
 Certain OpenAI models (like gpt-3.5-turbo-0613 and gpt-4-0613) have been fine-tuned to detect when a function should to be called and respond with the inputs that should be passed to the function.
 In an API call, you can describe functions and have the model intelligently choose to output a JSON object containing arguments to call those functions.
 The goal of the OpenAI Function APIs is to more reliably return valid and useful function calls than a generic text completion or chat API.
 The OpenAI Functions Agent is designed to work with these models.
 import Example from "@snippets/modules/agents/agent_types/openai_functions_agent.mdx";
 <Example/>

11

docs/docs_skeleton/docs/modules/agents/agent_types/plan_and_execute.mdx Normal file

View File

@@ -0,0 +1,11 @@
 # Plan and execute
 Plan and execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).
 The planning is almost always done by an LLM.
 The execution is usually done by a separate agent (equipped with tools).
 import Example from "@snippets/modules/agents/agent_types/plan_and_execute.mdx"
 <Example/>

15

docs/docs_skeleton/docs/modules/agents/agent_types/react.mdx Normal file

View File

@@ -0,0 +1,15 @@
 # ReAct
 This walkthrough showcases using an agent to implement the [ReAct](https://react-lm.github.io/) logic.
 import Example from "@snippets/modules/agents/agent_types/react.mdx"
 <Example/>
 ## Using chat models
 You can also create ReAct agents that use chat models instead of LLMs as the agent driver.
 import ChatExample from "@snippets/modules/agents/agent_types/react_chat.mdx"
 <ChatExample/>

10

docs/docs_skeleton/docs/modules/agents/agent_types/structured_chat.mdx Normal file

View File

@@ -0,0 +1,10 @@
 # Structured tool chat
 The structured tool chat agent is capable of using multi-input tools.
 Older agents are configured to specify an action input as a single string, but this agent can use the provided tools' `args_schema` to populate the action input.
 import Example from "@snippets/modules/agents/agent_types/structured_chat.mdx"
 <Example/>

									
										2

docs/docs_skeleton/docs/modules/agents/how_to/_category_.yml
									
										Normal file
									
												View File
												
				@@ -0,0 +1,2 @@

				label: 'How-to'

				position: 1

14

docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_agent.mdx Normal file

View File

@@ -0,0 +1,14 @@
 # Custom LLM Agent
 This notebook goes through how to create your own custom LLM agent.
 An LLM agent consists of three parts:
 - PromptTemplate: This is the prompt template that can be used to instruct the language model on what to do
 - LLM: This is the language model that powers the agent
 - `stop` sequence: Instructs the LLM to stop generating as soon as this string is found
 - OutputParser: This determines how to parse the LLMOutput into an AgentAction or AgentFinish object
 import Example from "@snippets/modules/agents/how_to/custom_llm_agent.mdx"
 <Example/>

14

docs/docs_skeleton/docs/modules/agents/how_to/custom_llm_chat_agent.mdx Normal file

View File

@@ -0,0 +1,14 @@
 # Custom LLM Agent (with a ChatModel)
 This notebook goes through how to create your own custom agent based on a chat model.
 An LLM chat agent consists of three parts:
 - PromptTemplate: This is the prompt template that can be used to instruct the language model on what to do
 - ChatModel: This is the language model that powers the agent
 - `stop` sequence: Instructs the LLM to stop generating as soon as this string is found
 - OutputParser: This determines how to parse the LLMOutput into an AgentAction or AgentFinish object
 import Example from "@snippets/modules/agents/how_to/custom_llm_chat_agent.mdx"
 <Example/>

16

docs/docs_skeleton/docs/modules/agents/how_to/mrkl.mdx Normal file

View File

@@ -0,0 +1,16 @@
 # Replicating MRKL
 This walkthrough demonstrates how to replicate the [MRKL](https://arxiv.org/pdf/2205.00445.pdf) system using agents.
 This uses the example Chinook database.
 To set it up follow the instructions on https://database.guide/2-sample-databases-sqlite/, placing the `.db` file in a notebooks folder at the root of this repository.
 import Example from "@snippets/modules/agents/how_to/mrkl.mdx"
 <Example/>
 ## With a chat model
 import ChatExample from "@snippets/modules/agents/how_to/mrkl_chat.mdx"
 <ChatExample/>

85

docs/docs_skeleton/docs/modules/agents/index.mdx Normal file

View File

@@ -0,0 +1,85 @@
 ---
 sidebar_position: 4
 ---
 # Agents
 The core idea of agents is to use an LLM to choose a sequence of actions to take.
 In chains, a sequence of actions is hardcoded (in code).
 In agents, a language model is used as a reasoning engine to determine which actions to take and in which order.
 There are several key components here:
 ## Agent
 This is the class responsible for deciding what step to take next.
 This is powered by a language model and a prompt.
 This prompt can include things like:
 . The personality of the agent (useful for having it respond in a certain way)
 . Background context for the agent (useful for giving it more context on the types of tasks it's being asked to do)
 . Prompting strategies to invoke better reasoning (the most famous/widely used being [ReAct](https://arxiv.org/abs/2210.03629))
 LangChain provides a few different types of agents to get started.
 Even then, you will likely want to customize those agents with parts (1) and (2).
 For a full list of agent types see [agent types](/docs/modules/agents/agent_types/)
 ## Tools
 Tools are functions that an agent calls.
 There are two important considerations here:
 . Giving the agent access to the right tools
 . Describing the tools in a way that is most helpful to the agent
 Without both, the agent you are trying to build will not work.
 If you don't give the agent access to a correct set of tools, it will never be able to accomplish the objective.
 If you don't describe the tools properly, the agent won't know how to properly use them.
 LangChain provides a wide set of tools to get started, but also makes it easy to define your own (including custom descriptions).
 For a full list of tools, see [here](/docs/modules/agents/tools/)
 ## Toolkits
 Often the set of tools an agent has access to is more important than a single tool.
 For this LangChain provides the concept of toolkits - groups of tools needed to accomplish specific objectives.
 There are generally around 3-5 tools in a toolkit.
 LangChain provides a wide set of toolkits to get started.
 For a full list of toolkits, see [here](/docs/modules/agents/toolkits/)
 ## AgentExecutor
 The agent executor is the runtime for an agent.
 This is what actually calls the agent and executes the actions it chooses.
 Pseudocode for this runtime is below:
 ```python
 next_action = agent.get_action(...)
 while next_action != AgentFinish:
     observation = run(next_action)
     next_action = agent.get_action(..., next_action, observation)
 return next_action
 ```
 While this may seem simple, there are several complexities this runtime handles for you, including:
 . Handling cases where the agent selects a non-existent tool
 . Handling cases where the tool errors
 . Handling cases where the agent produces output that cannot be parsed into a tool invocation
 . Logging and observability at all levels (agent decisions, tool calls) either to stdout or [LangSmith](https://smith.langchain.com).
 ## Other types of agent runtimes
 The `AgentExecutor` class is the main agent runtime supported by LangChain.
 However, there are other, more experimental runtimes we also support.
 These include:
 - [Plan-and-execute Agent](/docs/modules/agents/agent_types/plan_and_execute.html)
 - [Baby AGI](/docs/use_cases/autonomous_agents/baby_agi.html)
 - [Auto GPT](/docs/use_cases/autonomous_agents/autogpt.html)
 ## Get started
 import GetStarted from "@snippets/modules/agents/get_started.mdx"
 <GetStarted/>

10

docs/docs_skeleton/docs/modules/agents/toolkits/index.mdx Normal file

View File

@@ -0,0 +1,10 @@
 ---
 sidebar_position: 3
 ---
 # Toolkits
 :::info
 Head to [Integrations](/docs/integrations/toolkits/) for documentation on built-in toolkit integrations.
 :::
 Toolkits are collections of tools that are designed to be used together for specific tasks and have convenience loading methods.

21

docs/docs_skeleton/docs/modules/agents/tools/index.mdx Normal file

View File

@@ -0,0 +1,21 @@
 ---
 sidebar_position: 2
 ---
 # Tools
 :::info
 Head to [Integrations](/docs/integrations/tools/) for documentation on built-in tool integrations.
 :::
 Tools are interfaces that an agent can use to interact with the world.
 ## Get started
 Tools are functions that agents can use to interact with the world.
 These tools can be generic utilities (e.g. search), other chains, or even other agents.
 Currently, tools can be loaded with the following snippet:
 import GetStarted from "@snippets/modules/agents/tools/get_started.mdx"
 <GetStarted/>

14

docs/docs_skeleton/docs/modules/callbacks/index.mdx Normal file

View File

@@ -0,0 +1,14 @@
 ---
 sidebar_position: 5
 ---
 # Callbacks
 :::info
 Head to [Integrations](/docs/integrations/callbacks/) for documentation on built-in callbacks integrations with 3rd-party tools.
 :::
 LangChain provides a callbacks system that allows you to hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.
 import GetStarted from "@snippets/modules/callbacks/get_started.mdx"
 <GetStarted/>

7

docs/docs_skeleton/docs/modules/chains/additional/analyze_document.mdx Normal file

View File

@@ -0,0 +1,7 @@
 # Analyze Document
 The AnalyzeDocumentChain can be used as an end-to-end to chain. This chain takes in a single document, splits it up, and then runs it through a CombineDocumentsChain.
 import Example from "@snippets/modules/chains/additional/analyze_document.mdx"
 <Example/>

7

docs/docs_skeleton/docs/modules/chains/additional/constitutional_chain.mdx Normal file

View File

@@ -0,0 +1,7 @@
 # Self-critique chain with constitutional AI
 The ConstitutionalChain is a chain that ensures the output of a language model adheres to a predefined set of constitutional principles. By incorporating specific rules and guidelines, the ConstitutionalChain filters and modifies the generated content to align with these principles, thus providing more controlled, ethical, and contextually appropriate responses. This mechanism helps maintain the integrity of the output while minimizing the risk of generating content that may violate guidelines, be offensive, or deviate from the desired context.
 import Example from "@snippets/modules/chains/additional/constitutional_chain.mdx"
 <Example/>

8

docs/docs_skeleton/docs/modules/chains/additional/index.mdx Normal file

View File

@@ -0,0 +1,8 @@
 ---
 sidebar_position: 4
 ---
 # Additional
 import DocCardList from "@theme/DocCardList";
 <DocCardList />

8

docs/docs_skeleton/docs/modules/chains/additional/moderation.mdx Normal file

View File

@@ -0,0 +1,8 @@
 # Moderation
 This notebook walks through examples of how to use a moderation chain, and several common ways for doing so. Moderation chains are useful for detecting text that could be hateful, violent, etc. This can be useful to apply on both user input, but also on the output of a Language Model. Some API providers, like OpenAI, [specifically prohibit](https://beta.openai.com/docs/usage-policies/use-case-policy) you, or your end users, from generating some types of harmful content. To comply with this (and to just generally prevent your application from being harmful) you may often want to append a moderation chain to any LLMChains, in order to make sure any output the LLM generates is not harmful.
 If the content passed into the moderation chain is harmful, there is not one best way to handle it, it probably depends on your application. Sometimes you may want to throw an error in the Chain (and have your application handle that). Other times, you may want to return something to the user explaining that the text was harmful. There could even be other ways to handle it! We will cover all these ways in this walkthrough.
 import Example from "@snippets/modules/chains/additional/moderation.mdx"
 <Example/>

Compare commits

1116 Commits v0.0.182 ... eugene/tes

42 .devcontainer/Dockerfile Unescape Escape View File

37 .devcontainer/README.md Normal file Unescape Escape View File

45 .devcontainer/devcontainer.json Unescape Escape View File

7 .devcontainer/docker-compose.yaml Unescape Escape View File

3 .gitattributes vendored Normal file Unescape Escape View File

79 .github/CONTRIBUTING.md vendored Unescape Escape View File

2 .github/ISSUE_TEMPLATE/bug-report.yml vendored Unescape Escape View File

70 .github/PULL_REQUEST_TEMPLATE.md vendored Unescape Escape View File

2 .github/actions/poetry_setup/action.yml vendored Unescape Escape View File

46 .github/workflows/_lint.yml vendored Normal file Unescape Escape View File

51 .github/workflows/_release.yml vendored Normal file Unescape Escape View File

61 .github/workflows/_test.yml vendored Normal file Unescape Escape View File

22 .github/workflows/codespell.yml vendored Normal file Unescape Escape View File

27 .github/workflows/langchain_ci.yml vendored Normal file Unescape Escape View File

29 .github/workflows/langchain_experimental_ci.yml vendored Normal file Unescape Escape View File

20 .github/workflows/langchain_experimental_release.yml vendored Normal file Unescape Escape View File

20 .github/workflows/langchain_release.yml vendored Normal file Unescape Escape View File

38 .github/workflows/linkcheck.yml vendored Unescape Escape View File

36 .github/workflows/lint.yml vendored Unescape Escape View File

49 .github/workflows/release.yml vendored Unescape Escape View File

49 .github/workflows/test.yml vendored Unescape Escape View File

22 .gitignore vendored Unescape Escape View File

4 .gitmodules vendored Normal file Unescape Escape View File

9 .readthedocs.yaml Unescape Escape View File

57 MIGRATE.md Normal file Unescape Escape View File

73 Makefile Unescape Escape View File

28 README.md Unescape Escape View File

17 docs/.local_build.sh Executable file Unescape Escape View File

57 docs/additional_resources/tracing.md Unescape Escape View File

90 docs/additional_resources/youtube.md Unescape Escape View File

0 docs/Makefile → docs/api_reference/Makefile Unescape Escape View File

0 docs/_static/css/custom.css → docs/api_reference/_static/css/custom.css Unescape Escape View File

125 docs/api_reference/conf.py Normal file Unescape Escape View File

96 docs/api_reference/create_api_rst.py Normal file Unescape Escape View File

8 docs/api_reference/index.rst Normal file Unescape Escape View File

0 docs/make.bat → docs/api_reference/make.bat Unescape Escape View File

9 docs/api_reference/modules/evaluation.rst Normal file Unescape Escape View File

4 docs/requirements.txt → docs/api_reference/requirements.txt Unescape Escape View File

27 docs/api_reference/templates/COPYRIGHT.txt Normal file Unescape Escape View File

28 docs/api_reference/templates/class.rst Normal file Unescape Escape View File

15 docs/api_reference/templates/redirects.html Normal file Unescape Escape View File

27 docs/api_reference/themes/COPYRIGHT.txt Normal file Unescape Escape View File

67 docs/api_reference/themes/scikit-learn-modern/javascript.html Normal file Unescape Escape View File

142 docs/api_reference/themes/scikit-learn-modern/layout.html Normal file Unescape Escape View File

69 docs/api_reference/themes/scikit-learn-modern/nav.html Normal file Unescape Escape View File

16 docs/api_reference/themes/scikit-learn-modern/search.html Normal file Unescape Escape View File

1395 docs/api_reference/themes/scikit-learn-modern/static/css/theme.css Normal file View File

6 docs/api_reference/themes/scikit-learn-modern/static/css/vendor/bootstrap.min.css vendored Normal file View File

6 docs/api_reference/themes/scikit-learn-modern/static/js/vendor/bootstrap.min.js vendored Normal file View File

2 docs/api_reference/themes/scikit-learn-modern/static/js/vendor/jquery-3.6.3.slim.min.js vendored Normal file View File

8 docs/api_reference/themes/scikit-learn-modern/theme.conf Normal file Unescape Escape View File

112 docs/conf.py Unescape Escape View File

192 docs/dependents.md Unescape Escape View File

7 docs/docs_skeleton/.gitignore vendored Normal file Unescape Escape View File

49 docs/docs_skeleton/README.md Normal file Unescape Escape View File

12 docs/docs_skeleton/babel.config.js Normal file Unescape Escape View File

76 docs/docs_skeleton/code-block-loader.js Normal file Unescape Escape View File

0 docs/_static/ApifyActors.png → docs/docs_skeleton/docs/_static/ApifyActors.png vendored Unescape Escape View File

0 docs/_static/DataberryDashboard.png → docs/docs_skeleton/docs/_static/ChaindeskDashboard.png vendored Unescape Escape View File

0 docs/_static/HeliconeDashboard.png → docs/docs_skeleton/docs/_static/HeliconeDashboard.png vendored Unescape Escape View File

0 docs/_static/HeliconeKeys.png → docs/docs_skeleton/docs/_static/HeliconeKeys.png vendored Unescape Escape View File

0 docs/_static/MetalDash.png → docs/docs_skeleton/docs/_static/MetalDash.png vendored Unescape Escape View File

BIN docs/docs_skeleton/docs/_static/android-chrome-192x192.png vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/android-chrome-512x512.png vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/apple-touch-icon.png vendored Normal file View File

21 docs/docs_skeleton/docs/_static/css/custom.css vendored Normal file Unescape Escape View File

BIN docs/docs_skeleton/docs/_static/favicon-16x16.png vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/favicon-32x32.png vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/favicon.ico vendored Normal file View File

0 docs/_static/js/mendablesearch.js → docs/docs_skeleton/docs/_static/js/mendablesearch.js vendored Unescape Escape View File

BIN docs/docs_skeleton/docs/_static/lc_modules.jpg vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/parrot-chainlink-icon.png vendored Normal file View File

BIN docs/docs_skeleton/docs/_static/parrot-icon.png vendored Normal file View File

5 docs/docs_skeleton/docs/get_started/installation.mdx Normal file Unescape Escape View File

65 docs/docs_skeleton/docs/get_started/introduction.mdx Normal file Unescape Escape View File

162 docs/docs_skeleton/docs/get_started/quickstart.mdx Normal file Unescape Escape View File

8 docs/docs_skeleton/docs/guides/evaluation/comparison/index.mdx Normal file Unescape Escape View File

12 docs/docs_skeleton/docs/guides/evaluation/examples/index.mdx Normal file Unescape Escape View File

1116 Commits

v0.0.182 ... eugene/tes

42

.devcontainer/Dockerfile

View File

37

.devcontainer/README.md Normal file

View File

45

.devcontainer/devcontainer.json

View File

7

.devcontainer/docker-compose.yaml

View File

3

.gitattributes vendored Normal file

View File

79

.github/CONTRIBUTING.md vendored

View File

2

.github/ISSUE_TEMPLATE/bug-report.yml vendored

View File

70

.github/PULL_REQUEST_TEMPLATE.md vendored

View File

2

.github/actions/poetry_setup/action.yml vendored

View File

46

.github/workflows/_lint.yml vendored Normal file

View File

51

.github/workflows/_release.yml vendored Normal file

View File

61

.github/workflows/_test.yml vendored Normal file

View File

22

.github/workflows/codespell.yml vendored Normal file

View File

27

.github/workflows/langchain_ci.yml vendored Normal file

View File

29

.github/workflows/langchain_experimental_ci.yml vendored Normal file

View File

20

.github/workflows/langchain_experimental_release.yml vendored Normal file

View File

20

.github/workflows/langchain_release.yml vendored Normal file

View File

38

.github/workflows/linkcheck.yml vendored

View File

36

.github/workflows/lint.yml vendored

View File

49

.github/workflows/release.yml vendored

View File

49

.github/workflows/test.yml vendored

View File

22

.gitignore vendored

View File

4

.gitmodules vendored Normal file

View File

9

.readthedocs.yaml

View File

57

MIGRATE.md Normal file

View File

73

Makefile

View File

28

README.md

View File

17

docs/.local_build.sh Executable file

View File

57

docs/additional_resources/tracing.md

View File

90

docs/additional_resources/youtube.md

View File

0

docs/Makefile → docs/api_reference/Makefile

View File

0

docs/_static/css/custom.css → docs/api_reference/_static/css/custom.css

View File

125

docs/api_reference/conf.py Normal file

View File

96

docs/api_reference/create_api_rst.py Normal file

View File

8

docs/api_reference/index.rst Normal file

View File

0

docs/make.bat → docs/api_reference/make.bat

View File

9

docs/api_reference/modules/evaluation.rst Normal file

View File

4

docs/requirements.txt → docs/api_reference/requirements.txt

View File

27

docs/api_reference/templates/COPYRIGHT.txt Normal file

View File

28

docs/api_reference/templates/class.rst Normal file

View File

15

docs/api_reference/templates/redirects.html Normal file

View File

27

docs/api_reference/themes/COPYRIGHT.txt Normal file

View File

67

docs/api_reference/themes/scikit-learn-modern/javascript.html Normal file

View File

142

docs/api_reference/themes/scikit-learn-modern/layout.html Normal file

View File

69

docs/api_reference/themes/scikit-learn-modern/nav.html Normal file

View File

16

docs/api_reference/themes/scikit-learn-modern/search.html Normal file

View File

1395

docs/api_reference/themes/scikit-learn-modern/static/css/theme.css Normal file

View File

6

docs/api_reference/themes/scikit-learn-modern/static/css/vendor/bootstrap.min.css vendored Normal file

View File

6

docs/api_reference/themes/scikit-learn-modern/static/js/vendor/bootstrap.min.js vendored Normal file

View File

2

docs/api_reference/themes/scikit-learn-modern/static/js/vendor/jquery-3.6.3.slim.min.js vendored Normal file

View File

8

docs/api_reference/themes/scikit-learn-modern/theme.conf Normal file

View File

112

docs/conf.py

View File

192

docs/dependents.md

View File

7

docs/docs_skeleton/.gitignore vendored Normal file

View File

49

docs/docs_skeleton/README.md Normal file

View File

12

docs/docs_skeleton/babel.config.js Normal file

View File

76

docs/docs_skeleton/code-block-loader.js Normal file

View File

0

docs/_static/ApifyActors.png → docs/docs_skeleton/docs/_static/ApifyActors.png vendored

View File

0

docs/_static/DataberryDashboard.png → docs/docs_skeleton/docs/_static/ChaindeskDashboard.png vendored

View File

0

docs/_static/HeliconeDashboard.png → docs/docs_skeleton/docs/_static/HeliconeDashboard.png vendored

View File

0

docs/_static/HeliconeKeys.png → docs/docs_skeleton/docs/_static/HeliconeKeys.png vendored

View File

0

docs/_static/MetalDash.png → docs/docs_skeleton/docs/_static/MetalDash.png vendored

View File

BIN
docs/docs_skeleton/docs/_static/android-chrome-192x192.png vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/android-chrome-512x512.png vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/apple-touch-icon.png vendored Normal file

View File

21

docs/docs_skeleton/docs/_static/css/custom.css vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/favicon-16x16.png vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/favicon-32x32.png vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/favicon.ico vendored Normal file

View File

0

docs/_static/js/mendablesearch.js → docs/docs_skeleton/docs/_static/js/mendablesearch.js vendored

View File

BIN
docs/docs_skeleton/docs/_static/lc_modules.jpg vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/parrot-chainlink-icon.png vendored Normal file

View File

BIN
docs/docs_skeleton/docs/_static/parrot-icon.png vendored Normal file

View File

5

docs/docs_skeleton/docs/get_started/installation.mdx Normal file

View File

65

docs/docs_skeleton/docs/get_started/introduction.mdx Normal file

View File

162

docs/docs_skeleton/docs/get_started/quickstart.mdx Normal file

View File

8

docs/docs_skeleton/docs/guides/evaluation/comparison/index.mdx Normal file

View File

12

docs/docs_skeleton/docs/guides/evaluation/examples/index.mdx Normal file

View File

28

docs/docs_skeleton/docs/guides/evaluation/index.mdx Normal file

View File